Compare two files based on column


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Compare two files based on column
# 1  
Old 06-22-2014
Compare two files based on column

Hi, I have two files roughly 1200 fields in length for each row, sorted on the 2nd field. I need to compare based on that 2nd column between file1 and file2 and print lines that exist in both files into separate files (I can't guarantee that every line in file1 is in file2).

Example:

Code:
File1: 
1,12345,1,1,1...
1,12344,1,1,0...
1,12343,1,1,0...
1,12342,1,1,0...
...

Code:
File2:
1,12346,1,1,1...
1,12343,1,1,1...
1,12342,1,1,1...
...

Desired Output:

Code:
_File1:
1,12343,1,1,0...
1,12342,1,1,0...

Code:
_File2:
1,12343,1,1,1...
1,12342,1,1,1...

I need to run actual value comparisons on the other fields later, so I need separate files containing their matching lines. Only the 2nd field should be compared on, differences in other fields are perfectly fine. That 2nd field is unique, won't appear more than once per file. Alternatively, if it's simpler to remove lines from both files that don't match each other (to get the same desired output), that works too. Thanks in advance.

Edit: If it helps, that 2nd field is always 15 digits in length (never letters) and no other 15-digit field exists in either file besides that 2nd field

Last edited by origon; 06-22-2014 at 12:55 PM..
# 2  
Old 06-22-2014
Here is one way using 2 nawk statements to match the input files, just the order of the input files are different:

Code:
nawk -F"," 'FNR==NR{a[$2]++;next}a[$2]' file1 file2
1,12343,1,1,1...
1,12342,1,1,1...

Code:
nawk -F"," 'FNR==NR{a[$2]++;next}a[$2]' file2 file1
1,12343,1,1,0...
1,12342,1,1,0...

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Need awk or Shell script to compare Column-1 of two different CSV files and print if column-1 matche

Example: I have files in below format file 1: zxc,133,joe@example.com cst,222,xyz@example1.com File 2 Contains: hxd hcd jws zxc cst File 1 has 50000 lines and file 2 has around 30000 lines : Expected Output has to be : hxd hcd jws (5 Replies)
Discussion started by: TestPractice
5 Replies

2. Shell Programming and Scripting

Compare two csv's with column based

Hi, I am having below two CSV's col_1,col_2,col_3 1,2,4 1,3,6 col_1,col_3,col2,col_5,col_6 1,2,3,4,5 1,6,3,,, I need to compare based on the columns where the mismatch is expected output col_1,col_2,col_3 1,2,4 (3 Replies)
Discussion started by: rohit_shinez
3 Replies

3. Shell Programming and Scripting

Compare based on column value

Hi Experts, I want to compare 2 text files based on their column values text1 is like prd-1234 yes no yes yes prd-2345 no no no yes prd-6475 yes yes yes no and test 2 is prd-1234 no no no yes prd-2345 yes no no no desired out put as follows prd-1234 1 3 prd-235 1 4 basically it shows... (5 Replies)
Discussion started by: tijomonmathew
5 Replies

4. Shell Programming and Scripting

How to compare 2 files column's more than 5?

Hi All I am just trying to compare 2 file using column information using following code awk ' NR==FNR {A=$9; next} {B=A; print $0,B""?B:" Not -In file" } ' OFS="\t" file1 file2if file1 matches with file2 then print $9 content in file 1 along with file2 $0 suppose if I keyed on only $1 in... (17 Replies)
Discussion started by: Akshay Hegde
17 Replies

5. Shell Programming and Scripting

Compare three files based on two fields

Guys, I tried searching on the internet and I couldn't get the answer for this problem. I have 3 files. First 2 fields of all of them are of same type, say they come from various databases but first two fields in the 3 files means the same. I need to verify the entries that are not present... (4 Replies)
Discussion started by: PikK45
4 Replies

6. Shell Programming and Scripting

Nawk script to compare records of a file based on a particular column.

Hi Gurus, I am struggling with nawk command where i am processing a file based on columns. Here is the sample data file. UM113570248|24-AUG-11|4|man1|RR211 Alert: Master Process failure |24-AUG-11 UM113570624|24-AUG-11|4|man1| Alert: Pattern 'E_DCLeDAOException' found |24-AUG-11... (7 Replies)
Discussion started by: usha rao
7 Replies

7. Shell Programming and Scripting

Compare Two Files(Column By Column) In Perl or shell

Hi, I am writing a comparator script, which comapre two txt files(column by column) below are the precondition of this comparator 1)columns of file are not seperated Ex. file1.txt 8888812341181892 1243548895685687 8945896789897789 1111111111111111 file2.txt 9578956789567897... (2 Replies)
Discussion started by: kumar96877
2 Replies

8. Shell Programming and Scripting

compare 2 files based on columns

Hi Experts, Is there a way to compare 2 files by columns and print matching cases. I have 2 files as below, I want cases where col1 and col2 in f1 matches col1 and col2 in f2 to be printed as output. The separator is space. I want the output to have col1 col2 col 3 from both files printed... (7 Replies)
Discussion started by: novice_man
7 Replies

9. Shell Programming and Scripting

Compare files column to column based on keys

Here is my situation. I need to compare two tab separated files (diff is not useful since there could be known difference between files). I have found similar posts , but not fully matching.I was thinking of writing a shell script using cut and grep and while loop but after going thru posts it... (2 Replies)
Discussion started by: blackjack101
2 Replies

10. Shell Programming and Scripting

How to compare 2 files & get only few columns based on a condition related to both files?

Hiiiii friends I have 2 files which contains huge data & few lines of it are as shown below File1: b.dat(which has 21 columns) SSR 1976 8 12 13 10 44.00 39.0700 70.7800 7.0 0 0.00 0 2.78 0.00 0.00 0 0.00 2.78 0 NULL ISC 1976 8 12 22 32 37.39 36.2942 70.7338... (6 Replies)
Discussion started by: reva
6 Replies
Login or Register to Ask a Question