Based on column in file1, find match in file2 and print matching lines

# 1  
03-27-2013

file1:
Quote:
comp54049_c1_seq33
comp51795_c0_seq4
comp46214_c0_seq1
comp51509_c0_seq2
comp1000362_c0_seq1
file2:
Quote:
>m.149837_g.149837__ORF_g.149837_m.149837_type:internal_len:169_(-)_comp100001_c0_seq1:3-509(-)
FHPPVSDSCKRCDMYKNQIKIAPENEKIQLNADHELHLRKAESARNGMNNDVELCKTDPN
>m.180533_g.180533__ORF_g.180533_m.180533_type:internal_len:99_(-)_comp1000362_c0_seq1:3-299(-)
QSLPFPPNYISLSHAGTLSVNPCTAYRLLKDFVSLSTGDFIIQNGANSGVGRVVIQLCKA
I need to find matches for any lines in file1 that appear in file2. Desired output is '>' plus the file1 term, followed by the line after the match in file2 (so the title is a little misleading):
Quote:
>comp1000362_c0_seq1
QSLPFPPNYISLSHAGTLSVNPCTAYRLLKDFVSLSTGDFIIQNGANSGVGRVVIQLCKA
This is honestly beyond what I can do without spending the whole night on it, so I'm hoping someone out there is feeling altruistic.
# 2  
03-27-2013
If "169_(-)_comp100001_c0_seq1" is always the third colon-delimited field in file2, and the ID is always preceded by two underscore-terminated pieces ("169_(-)_" here), try this. Otherwise, adjust the command to fit your input:

Code:
$ awk -F: 'NR == FNR { arr[$0] = 1; next } { sub("[^_]+_[^_]+_", "", $3); if ($3 in arr) { print ">" $3; getline; print } }' file1 file2
>comp1000362_c0_seq1
QSLPFPPNYISLSHAGTLSVNPCTAYRLLKDFVSLSTGDFIIQNGANSGVGRVVIQLCKA
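
If the field position in file2 is not reliable, a variation on the same idea is to pull the ID out of each header with a regex instead of counting fields. This is only a sketch: the function name `extract_by_id` is made up for illustration, and it assumes every ID has the shape comp<digits>_c<digits>_seq<digits>, as in the samples above.

```shell
# extract_by_id IDS FASTA: print ">ID" plus the sequence line(s) for each listed ID.
# Assumption: IDs look like comp<N>_c<N>_seq<N>; adjust the regex otherwise.
extract_by_id() {
  awk '
    NR == FNR { ids[$0]; next }          # first file: remember each ID
    /^>/ {                               # header line of a record in the second file
      want = 0
      if (match($0, /comp[0-9]+_c[0-9]+_seq[0-9]+/)) {
        id = substr($0, RSTART, RLENGTH) # the ID as it appears in the header
        if (id in ids) { print ">" id; want = 1 }
      }
      next
    }
    want                                 # sequence line(s) of a wanted record
  ' "$1" "$2"
}
```

Called as `extract_by_id file1 file2`, it prints every line up to the next header rather than just one, so it also copes with sequences wrapped over several lines.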

# 3  
03-27-2013
Code:
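$ cat temp.sh
#!/bin/sh
# For each ID in file1, print ">ID" and the line after its first match in file2.
while IFS= read -r pattern; do
  # -F: treat the ID as a fixed string, not a regex; -m 1: stop at the first match
  line_number=$(grep -F -m 1 -n -- "$pattern" file2 | cut -f 1 -d :)
  [ -n "$line_number" ] || continue
  echo ">$pattern"
  sed -n "$line_number { n; p; q }" file2
done < file1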

Code:
$ ./temp.sh
>comp1000362_c0_seq1
QSLPFPPNYISLSHAGTLSVNPCTAYRLLKDFVSLSTGDFIIQNGANSGVGRVVIQLCKA
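
One caveat with a plain grep lookup: an ID such as comp1000362_c0_seq1 is also a substring of comp1000362_c0_seq10, so the wrong header could match first. A possible guard, sketched below, is to search for the ID together with its neighbouring delimiters. The helper name `next_line_for_id` is hypothetical, and it assumes the ID in a header is always preceded by "_" and followed by ":", as in the sample file2.

```shell
# next_line_for_id ID FASTA: print ">ID" and the line after its header, if found.
# Assumption: in headers the ID is preceded by "_" and followed by ":".
next_line_for_id() {
  # Line number of the first header containing "_ID:" as a fixed string
  n=$(grep -F -n -- "_$1:" "$2" | head -n 1 | cut -d : -f 1)
  [ -n "$n" ] || return 0              # no match: print nothing
  printf '>%s\n' "$1"
  sed -n "$((n + 1)){p;q;}" "$2"       # print only the line after the header
}
```

It can be driven from the ID list with `while IFS= read -r id; do next_line_for_id "$id" file2; done < file1`.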

I'm hoping you also feel altruistic.