Finding out the common lines in two files using 4 fields with the help of awk and UNIX


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Finding out the common lines in two files using 4 fields with the help of awk and UNIX
# 1  
Old 08-27-2013
Finding out the intersection in two files using 4 fields with the help of awk and UNIX

Dear All,
I have 2 files. If field 1, 2, 4 and 5 matches in both file1 and file2, I want to print the whole line of file1 and file2 one after another in my output file.

File1:
Code:
sc2/80         20      .        A       T         86       F=5;U=4
sc2/60         55      .        G       T         76       F=5;U=4 
sc2/68         20      .        T       C         71       F=5;U=4
sc2/24         24      .        T       G         31       F=5;U=4

File2:
Code:
sc2/99         84      .        C       G         61       F=5;U=4
sc2/80         20      .        A       T         30       F=5;U=4
sc2/60         40      .        G       T         76       F=5;U=4 
sc2/30         20      .        T       C         71       F=5;U=4
sc2/24         24      .        T       G         91       F=5;U=4

Expected Output:

Code:
sc2/80         20      .        A       T         86       F=5;U=4
sc2/80         20      .        A       T         30       F=5;U=4
sc2/24         24      .        T       G         31       F=5;U=4
sc2/24         24      .        T       G         91       F=5;U=4

I am new in the field and I appreciate your help.

Last edited by NamS; 08-27-2013 at 07:20 AM..
# 2  
Old 08-27-2013
Try this (Assuming it's tab separated):
Code:
awk -F\t 'NR==FNR{a[$1,$2,$4,$5]=$0;next}($1,$2,$4,$5) in a{print $0;print a[$1,$2,$4,$5]}' file1 file2

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Is there a UNIX command that can compare fields of files with differing number of fields?

Hi, Below are the sample files. x.txt is from an Excel file that is a list of users from Windows and y.txt is a list of database account. $ head -500 x.txt y.txt ==> x.txt <== TEST01 APP_USER_PROFILE USER03 APP_USER_PROFILE TEST02 APP_USER_EXP_PROFILE TEST04 APP_USER_PROFILE USER01 ... (3 Replies)
Discussion started by: newbie_01
3 Replies

2. UNIX for Beginners Questions & Answers

Awk: output lines with common field to separate files

Hi, A beginner one. my input.tab (tab-separated): h1 h2 h3 h4 h5 item1 grpA 2 3 customer1 item2 grpB 4 6 customer1 item3 grpA 5 9 customer1 item4 grpA 0 0 customer2 item5 grpA 9 1 customer2 objective: output a file for each customer ($5) with the item number ($1) only if $2 matches... (2 Replies)
Discussion started by: beca123456
2 Replies

3. UNIX for Dummies Questions & Answers

Filter lines common in two files

Thanks everyone. I got that problem solved. I require one more help here. (Yes, UNIX definitely seems to be fun and useful, and I WILL eventually learn it for myself. But I am now on a different project and don't really have time to go through all the basics. So, I will really appreciate some... (6 Replies)
Discussion started by: latsyrc
6 Replies

4. UNIX for Dummies Questions & Answers

keeping last record among group of records with common fields (awk)

input: ref.1;rack.1;1 #group1 ref.1;rack.1;2 #group1 ref.1;rack.2;1 #group2 ref.2;rack.3;1 #group3 ref.2;rack.3;2 #group3 ref.2;rack.3;3 #group3 Among records from same group (i.e. with same 1st and 2nd field - separated by ";"), I would need to keep the last record... (5 Replies)
Discussion started by: beca123456
5 Replies

5. Shell Programming and Scripting

finding common numbers (contents) across 2 or 3 files

I have 3 files which are tab delimited and have numbers in it. file 1 1 2 3 4 5 6 7 File 2 3 5 7 8 File 3 1 (4 Replies)
Discussion started by: Lucky Ali
4 Replies

6. Shell Programming and Scripting

awk- comparing fields from the same column, finding discontinuities.

Hello, I have a file with two fields. The first field repeats itself for quite a while but the second field changes. What I want to do is to go through the first column until its value changes (and while it doesn't, verify that the second field is in a sequence from 0-15). Example input: ... (13 Replies)
Discussion started by: acsg
13 Replies

7. Shell Programming and Scripting

Get common lines from multiple files

FileA chr1 31237964 NP_001018494.1 PUM1 M340L chr1 31237964 NP_055491.1 PUM1 M340L chr1 33251518 NP_037543.1 AK2 H191D chr1 33251518 NP_001616.1 AK2 H191D chr1 57027345 NP_001004303.2 C1orf168 P270S FileB chr1 ... (9 Replies)
Discussion started by: genehunter
9 Replies

8. Shell Programming and Scripting

Common lines from files

Hello guys, I need a script to get the common lines from two files with a criteria that if the first two columns match then I keep the maximum value of the 5th column.(tab separated columns) . 3rd and 4th columns corresponds to the row which has highest value for the 5th column. Sample... (2 Replies)
Discussion started by: jaysean
2 Replies

9. Shell Programming and Scripting

Common lines from files

Hello guys, I need a script to get the common lines from two files with a criteria that if the first two columns match then I keep the maximum value of the 3rd column.(tab separated columns) Sample input: file1: 111 222 0.1 333 444 0.5 555 666 0.4 file 2: 111 222 0.7 555 666... (5 Replies)
Discussion started by: jaysean
5 Replies

10. Shell Programming and Scripting

To find all common lines from 'n' no. of files

Hi, I have one situation. I have some 6-7 no. of files in one directory & I have to extract all the lines which exist in all these files. means I need to extract all common lines from all these files & put them in a separate file. Please help. I know it could be done with the help of... (11 Replies)
Discussion started by: The Observer
11 Replies
Login or Register to Ask a Question