03-27-2013
Based on column in file1, find match in file2 and print matching lines
file1:
Quote:
comp54049_c1_seq33
comp51795_c0_seq4
comp46214_c0_seq1
comp51509_c0_seq2
comp1000362_c0_seq1
file2:
Quote:
>m.149837_g.149837__ORF_g.149837_m.149837_type:internal_len:169_(-)_comp100001_c0_seq1:3-509(-)
FHPPVSDSCKRCDMYKNQIKIAPENEKIQLNADHELHLRKAESARNGMNNDVELCKTDPN
>m.180533_g.180533__ORF_g.180533_m.180533_type:internal_len:99_(-)_comp1000362_c0_seq1:3-299(-)
QSLPFPPNYISLSHAGTLSVNPCTAYRLLKDFVSLSTGDFIIQNGANSGVGRVVIQLCKA
I need to find matches for any lines in file1 that appear in file2. Desired output is '>' plus the file1 term, followed by the line
after the match in file2 (so the title is a little misleading):
Quote:
>comp1000362_c0_seq1
QSLPFPPNYISLSHAGTLSVNPCTAYRLLKDFVSLSTGDFIIQNGANSGVGRVVIQLCKA
This is honestly beyond what I can do without spending the whole night on it, so I'm hoping someone out there is feeling altruistic.
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi All,
as you can see I'm pretty new to this board. :D
I'm struggling around with small script to search a few fields in another file.
Basically I have file1 looking like this:
15:38:28 sz:10001 pr:14.16
15:38:28 sz:10002 pr:18.41
15:38:29 sz:10003 pr:19.28
15:38:30 sz:10004... (1 Reply)
Discussion started by: floripoint
1 Replies
2. UNIX for Advanced & Expert Users
File1 row is same as column 2 in file 2.
Also file 2 will either start with A, B or C.
And 3rd column in file 2 is always F2.
When column 2 of file 2 matches file1 column, print all those rows into a separate file.
Here is an example.
file 1:
100
103
104
108
file 2:
... (6 Replies)
Discussion started by: i.scientist
6 Replies
3. Shell Programming and Scripting
Match column 3 in file1 to column 1 in file 2 and replace with column 2 from file2
file 1 sample
SNDK 80004C101 AT
XLNX 983919101 BB
NETL 64118B100 BS
AMD 007903107 CC
KLAC 482480100 DC
TER 880770102 KATS
ATHR 04743P108 KATS... (7 Replies)
Discussion started by: rydz00
7 Replies
4. Shell Programming and Scripting
Hi,
I have file1 like this
aaa
ggg
ddd
vvv
eeeand file2
aaa 2
aaa 443
xxx 76
aaa 34
ggg 33
wee 99
ggg 33
ddd 1
ddd 10
ddd 98
sds 23 (4 Replies)
Discussion started by: polsum
4 Replies
5. UNIX for Dummies Questions & Answers
I have very limited coding skills but I'm wondering if someone could help me with this. There are many threads about matching strings in two files, but I have no idea how to add a column from one file to another based on a matching string.
I'm looking to match column1 in file1 to the number... (3 Replies)
Discussion started by: pathunkathunk
3 Replies
6. Shell Programming and Scripting
I have a file containing texts and indexes. I need the text between (and including ) INDEX and number "1" alone in line. I have managed this:
awk '/INDEX/,/1$/{if (!/1$/)print}' file1.txt
It works for all indexes.
And then I have second file with years and indexes per year, one per line... (3 Replies)
Discussion started by: phoebus
3 Replies
7. Shell Programming and Scripting
Hello,
I have two files file 1 and file 2 each having result of a query on certain database tables and need to compare for Col1 in file1 with Col3 in file2, compare Col2 with Col4 and output the value of Col1 from File1 which is a) not present in Col3 of File2 b) value of Col2 is different from... (2 Replies)
Discussion started by: RasB15
2 Replies
8. Shell Programming and Scripting
I have two files.
File 1 is a two-column index file, e.g.
comp11084_c0_seq6:130-468(-) comp12746_c0_seq3:140-478(+)
comp11084_c0_seq3:201-539(-) comp12746_c0_seq2:191-529(+)
File 2 is a sequence file with headers named with the same terms that populate file 1. ... (1 Reply)
Discussion started by: pathunkathunk
1 Replies
9. Shell Programming and Scripting
I have a list of IDs in file1 and a list of sequences in file2. I can print sequences from file2, but I'm asking for help in printing the sequences in the same order as the IDs appear in file1.
file1:
EN_comp12952_c0_seq3:367-1668
ES_comp17168_c1_seq6:1-864
EN_comp13395_c3_seq14:231-1088... (5 Replies)
Discussion started by: pathunkathunk
5 Replies
10. UNIX for Dummies Questions & Answers
I want to print only the lines in file2 that match file1, in the same order as they appear in file 1
file1
file2
desired output:
I'm getting the lines to match
awk 'FNR==NR {a++}; FNR!=NR && a' file1 file2
but they are in sorted order, which is not what I want:
Can anyone... (4 Replies)
Discussion started by: pathunkathunk
4 Replies
COMM(1) BSD General Commands Manual COMM(1)
NAME
comm -- select or reject lines common to two files
SYNOPSIS
comm [-123i] file1 file2
DESCRIPTION
The comm utility reads file1 and file2, which should be sorted lexically, and produces three text columns as output: lines only in file1;
lines only in file2; and lines in both files.
The filename ``-'' means the standard input.
The following options are available:
-1 Suppress printing of column 1, lines only in file1.
-2 Suppress printing of column 2, lines only in file2.
-3 Suppress printing of column 3, lines common to both.
-i Case insensitive comparison of lines.
Each column will have a number of tab characters prepended to it equal to the number of lower numbered columns that are being printed. For
example, if column number two is being suppressed, lines printed in column number one will not have any tabs preceding them, and lines
printed in column number three will have one.
The comm utility assumes that the files are lexically sorted; all characters participate in line comparisons.
ENVIRONMENT
The LANG, LC_ALL, LC_COLLATE, and LC_CTYPE environment variables affect the execution of comm as described in environ(7).
EXIT STATUS
The comm utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
cmp(1), diff(1), sort(1), uniq(1)
STANDARDS
The comm utility conforms to IEEE Std 1003.2-1992 (``POSIX.2'').
The -i option is an extension to the POSIX standard.
HISTORY
A comm command appeared in Version 4 AT&T UNIX.
BSD
December 12, 2009 BSD