09-26-2008
To get an output by combining fields from two different files
Hi guys,
I couldn't find a solution to this problem. If anyone knows one, please help me out.
Your guidance is highly appreciated.
I have two files.
FILE1 has the following 7 columns (the " - " separators were added only to make the columns visible; in the actual file the columns are separated by a single space):
155.34 - leg - 1 - 344 - TC200232 - 292 - 930
152.88 - leg - 1 - 344 - TC215306 - 2 - 743
123.94 - leg - 1 - 344 - TC210135 - 423 - 1148
FILE2
>TC200232.pep
AYNGFNNSNIIRDGVAIINSSGALKLTNRSYNVIGHAFHPNPVPIFNSSTKNVTSFSTYF
VFAIVPLEKTSGGFGFA
>TC210135.pep
GFGDFGKDSNFESQIALYGDAKVVNGGIQMSGSMGFSAGRILNKKPFKLIDGNPRKMVSF
SLHFVFSLSRENGDGFAFVMVPIGYPFDVFDGGSFGLLGNRKMKFLAVEFDTFMDEKYGD
VNDNHVGVDLSS
>TC215306.pep
PRLKQDLTLVGSVIVSDEKKSVQIPDPEREGDDLKHLVGRAIYSSPIR
I want an output file, FILE3, which is the same as FILE2 except that each line starting with '>' should also contain "(region 292 to 930 of SEQ)", where 292 and 930 are the values of columns 6 and 7 of FILE1 for the common ID, e.g. TC200232 (present in both files):
>TC200232.pep (region 292 to 930 of SEQ)
AYNGFNNSNIIRDGVAIINSSGALKLTNRSYNVIGHAFHPNPVPIFNSSTKNVTSFSTYF
VFAIVPLEKTSGGFGFA
>TC210135.pep (region 423 to 1148 of SEQ)
GFGDFGKDSNFESQIALYGDAKVVNGGIQMSGSMGFSAGRILNKKPFKLIDGNPRKMVSF
SLHFVFSLSRENGDGFAFVMVPIGYPFDVFDGGSFGLLGNRKMKFLAVEFDTFMDEKYGD
VNDNHVGVDLSS
>TC215306.pep (region 2 to 743 of SEQ)
PRLKQDLTLVGSVIVSDEKKSVQIPDPEREGDDLKHLVGRAIYSSPIR
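One possible way to do this, sketched in Python rather than shell: read FILE1 into a lookup table keyed on column 5, then walk FILE2 and append the region to each '>' line whose ID is in the table. The function names `load_regions` and `annotate_headers` are made up for this illustration. The sketch assumes that FILE1 is whitespace-separated, that the ID in FILE2 is the text between '>' and the first '.', and that headers with no match in FILE1 should be left unchanged (the post does not say what should happen in that case).

```python
def load_regions(file1_lines):
    """Map the ID in column 5 of FILE1 to the (column 6, column 7) pair."""
    regions = {}
    for line in file1_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or incomplete lines
        regions[fields[4]] = (fields[5], fields[6])
    return regions


def annotate_headers(file2_lines, regions):
    """Yield FILE2 lines, appending the region to each matching '>' line."""
    for line in file2_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # ">TC200232.pep" -> "TC200232" (assumed ID format)
            seq_id = line[1:].split(".", 1)[0].strip()
            if seq_id in regions:
                start, end = regions[seq_id]
                line = f"{line} (region {start} to {end} of SEQ)"
            # Assumption: IDs missing from FILE1 keep their original header.
        yield line


# Sample data copied from the post (FILE1 with real single-space separators).
file1 = [
    "155.34 leg 1 344 TC200232 292 930",
    "152.88 leg 1 344 TC215306 2 743",
    "123.94 leg 1 344 TC210135 423 1148",
]
file2 = [
    ">TC200232.pep",
    "AYNGFNNSNIIRDGVAIINSSGALKLTNRSYNVIGHAFHPNPVPIFNSSTKNVTSFSTYF",
    "VFAIVPLEKTSGGFGFA",
    ">TC210135.pep",
    "GFGDFGKDSNFESQIALYGDAKVVNGGIQMSGSMGFSAGRILNKKPFKLIDGNPRKMVSF",
    "SLHFVFSLSRENGDGFAFVMVPIGYPFDVFDGGSFGLLGNRKMKFLAVEFDTFMDEKYGD",
    "VNDNHVGVDLSS",
    ">TC215306.pep",
    "PRLKQDLTLVGSVIVSDEKKSVQIPDPEREGDDLKHLVGRAIYSSPIR",
]

result = list(annotate_headers(file2, load_regions(file1)))
print("\n".join(result))
```

With real files, the two lists could be replaced by open file objects, e.g. `open("FILE1")` and `open("FILE2")`, and the joined result written to FILE3. Because FILE1 is loaded into a dictionary first, the order of its rows does not need to match the order of the sequences in FILE2.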
Last edited by smriti_shridhar; 09-26-2008 at 08:03 AM..
Reason: formatting