Compare multiple files with multiple number of columns


 
# 1  
Old 06-27-2012

Hi,

input file1

Code:
abcd 123 198 xyz1:0909090-0909091
ghij 234 999 xyz2:987654:987655
kilo 7890 7990 xyz3:12345-12357
prem 9 112 xyz5:97-1134

input file2

Code:
abcd 123 198 xyz1:0909090-0909091 -9.122 0
abed 88 98 xyz1:98989-090808 -1.234 1.345
ghij 234 999 xyz2:987654:987655 -10.87090909 5
chas 765 897 xyz3:777777-777778 0 -10.87654
kilo 7890 7990 xyz3:12345-12357 -8.7666 0
hello 4123 4321 xyz1:5656-5756 -7.132 0.01

I want to match the first four columns of file1 against the first four columns of file2 and, where there is a match, get the records from file2. So my output would be:

output

Code:
abcd 123 198 xyz1:0909090-0909091 -9.122 0
abed 88 98 xyz1:98989-090808 -1.234 1.345
ghij 234 999 xyz2:987654:987655 -10.87090909 5
chas 765 897 xyz3:777777-777778 0 -10.87654
kilo 7890 7990 xyz3:12345-12357 -8.7666 0
hello 4123 4321 xyz1:5656-5756 -7.132 0.01


Last edited by jacobs.smith; 06-27-2012 at 04:42 PM..
# 2  
Old 06-27-2012
Code:
~/unix.com$ awk -F':' 'NR==FNR{A[$1]=$0;next}$1 in A' file1 file2

Another, shorter one (it relies on each key appearing only once within a file):
Code:
~/unix.com$ awk -F':' 'A[$1]++' file1 file2

BTW, did you mean this as your output file?
Code:
abcd 123 198 xyz1:0909090-0909091 -9.122 0
ghij 234 999 xyz2:987654:987655 -10.87090909 5
kilo 7890 7990 xyz3:12345-12357 -8.7666 0
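
For reference, the first one-liner can be spelled out as below. This is only a sketch run against the sample data from post #1; the array name seen and the scratch file matched are made up for illustration. Note that with FS set to ":" the key is only the text before the first colon, so the rest of the fourth column is not compared.

```shell
# Recreate the sample inputs from post #1 in the current directory.
cat > file1 <<'EOF'
abcd 123 198 xyz1:0909090-0909091
ghij 234 999 xyz2:987654:987655
kilo 7890 7990 xyz3:12345-12357
prem 9 112 xyz5:97-1134
EOF
cat > file2 <<'EOF'
abcd 123 198 xyz1:0909090-0909091 -9.122 0
abed 88 98 xyz1:98989-090808 -1.234 1.345
ghij 234 999 xyz2:987654:987655 -10.87090909 5
chas 765 897 xyz3:777777-777778 0 -10.87654
kilo 7890 7990 xyz3:12345-12357 -8.7666 0
hello 4123 4321 xyz1:5656-5756 -7.132 0.01
EOF

awk -F':' '
    # First file (NR == FNR only while reading it): remember the key.
    NR == FNR { seen[$1]; next }
    # Second file: a bare pattern prints the record when the key was seen.
    $1 in seen
' file1 file2 > matched

cat matched
```

On the sample data this prints the abcd, ghij and kilo records from file2.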


Last edited by tukuyomi; 06-27-2012 at 04:08 PM..
# 3  
Old 06-27-2012
Hi tukuyomi,

I want to compare the first four columns of each file against those of the other file, and not just the first column.

Thanks for the current solution.

Do you think there is another way to do it, by matching the first four columns?
# 4  
Old 06-27-2012
Quote:
the first four columns of each file
As in abcd 123 198 xyz1?
In other words, are the columns separated by ' ' (space)?
Quote:
and not just the first column
$1 in my script means the first field, i.e. abcd 123 198 xyz1, because I set FS to ':'.

Beyond that, sorry, I don't understand your request: to me, your output == cat file2.
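
A quick way to see the difference is to print the fields under each separator. This is just an illustration on one record from file2; the variable name line is made up for the example.

```shell
line='abcd 123 198 xyz1:0909090-0909091 -9.122 0'

# FS set to ":" : $1 is everything before the first colon.
printf '%s\n' "$line" | awk -F':' '{ print "[" $1 "]" }'
# prints [abcd 123 198 xyz1]

# Default FS (runs of blanks): $1 is the first word, $4 the whole fourth column.
printf '%s\n' "$line" | awk '{ print "[" $1 "] [" $4 "]" }'
# prints [abcd] [xyz1:0909090-0909091]
```

So with -F':' the comparison covers the first three columns plus the part of the fourth column before the colon, while the default separator makes $1 to $4 the four space-separated columns.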

Last edited by tukuyomi; 06-27-2012 at 05:03 PM..
# 5  
Old 06-27-2012
Hi tukuyomi,

Please pardon me if I wasn't clear.

Here you go.

1. I want to match the first four columns of file1 against the first four columns of file2 and keep the records from file2 whose first four columns match a record in file1. I also want the other records from file2, even those with no match in file1. That is why you are seeing output == cat file2.

2. My four columns are

Code:
abcd 123 198 xyz1:0909090-0909091

These are separated by spaces.

3. Also, I would really appreciate it if you could write the unmatched records from file1 to another file, which would contain the following record:

Code:
prem 9 112 xyz5:97-1134

# 6  
Old 06-27-2012
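Here it is, but I'm still not sure about the output file: my awk script below simply copies file2 to output (print>"output").
The notmatched file contains the unmatched records from file1. The key is written as $1,$2,$3,$4 so that neighbouring fields cannot run together, and > is used so that a rerun overwrites both files instead of appending to them.
Code:
~/unix.com$ awk 'NR==FNR{A[$1,$2,$3,$4]=1;print>"output";next}!(($1,$2,$3,$4) in A){print>"notmatched"}' file2 file1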
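
The same two-pass idea, spelled out with comments and run on the sample data from post #1. Treat it as a sketch rather than the definitive answer: the array name seen is made up, the subscripts are comma-separated so that adjacent fields cannot run together, and > is used so that a rerun overwrites output and notmatched.

```shell
# Recreate the sample inputs from post #1 in the current directory.
cat > file1 <<'EOF'
abcd 123 198 xyz1:0909090-0909091
ghij 234 999 xyz2:987654:987655
kilo 7890 7990 xyz3:12345-12357
prem 9 112 xyz5:97-1134
EOF
cat > file2 <<'EOF'
abcd 123 198 xyz1:0909090-0909091 -9.122 0
abed 88 98 xyz1:98989-090808 -1.234 1.345
ghij 234 999 xyz2:987654:987655 -10.87090909 5
chas 765 897 xyz3:777777-777778 0 -10.87654
kilo 7890 7990 xyz3:12345-12357 -8.7666 0
hello 4123 4321 xyz1:5656-5756 -7.132 0.01
EOF

awk '
    # Pass 1 (file2): remember each four-column key, copy the record to "output".
    NR == FNR { seen[$1, $2, $3, $4]; print > "output"; next }
    # Pass 2 (file1): keep records whose key never appeared in file2.
    !(($1, $2, $3, $4) in seen) { print > "notmatched" }
' file2 file1

cat notmatched
```

On the sample data, notmatched ends up holding the single prem record and output is a copy of file2. If every record of file1 has a match, notmatched is not created at all.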
