Compare multiple columns between 2 files


 
# 1  
Old 09-18-2006

hello

I need to compare 2 text files. File 1 has 2 columns per line and file 2 has anywhere from 1 to many columns per line.
Sample:

File 1:

111 555
222 666
333 777
444 755

File 2:

000 110 113 114
844 111 555 999
202 777 865 098 023
222 313 499
065 655 333 011 890 777
433

Results should be from file1:
222 666
444 755

I need to find all the lines from file 1 whose column 1 and column 2 values do not also appear together in a row anywhere in file 2. The files are a few thousand lines each.

I've looked at awk and nawk but to be honest I'm not sure where to begin with this.

Thanks very much
# 2  
Old 09-18-2006
Try using associative arrays:
Code:
#!/bin/ksh
# this looks at all the columns in searchfile 
# prints a line from outputfile when no column in 
#    the outputfile line matches any column in 
#     searchfile
awk '
	FILENAME=="searchfile" {
	 for(i=1;i<=NF;i++ ) {Keys[$i]++}
	}
	FILENAME=="outputfile" {
		found=0
		for(i=1;i<=NF;i++) { if(Keys[$i]>0) {found=1; break;}}
		if (found == 0) {
			print $0
		}
	}
' searchfile outputfile

# 3  
Old 09-18-2006
The result ends up in the file aux.
Code:
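#!/bin/bash
# Writes to aux every line of file1 whose two columns are not
# both present somewhere in file2.

: > aux                      # start with an empty result file
while read -r a b            # read file1 line by line, not word by word
do
  founda=0
  foundb=0
  for j in $(cat file2)      # every whitespace-separated value in file2
  do
    [ "$a" = "$j" ] && founda=1
    [ "$b" = "$j" ] && foundb=1
  done
  if [ "$founda" -eq 0 ] || [ "$foundb" -eq 0 ]
  then
    echo "$a $b" >> aux
  fi
done < file1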

Ask if you have questions, see you
# 4  
Old 09-18-2006
Bug

Awesome.

A couple of little changes I needed to make for it to work exactly as I needed:

Code:
awk '
	FILENAME=="searchfile" {
		for(i=1;i<=NF;i++) {Keys[$i]++}
	}
	FILENAME=="outputfile" {
		found=0
		for(i=1;i<=NF;i++) { if(Keys[$i]>0) {found++} }
		if (found != 2) {
			print $0
		}
	}
' searchfile outputfile

Many thanks for your help on this!!!
# 5  
Old 09-18-2006
Can you guys tell me what the search file and the output file are here? My guess is that the output file is empty before the process and the search file is file2. Am I wrong in my assumption? Let me know.
# 6  
Old 09-18-2006
Compare

The search file is the example "file 2" from my original post; the output file would be "file 1". File 1 has 2 columns to find in any row of file 2. The output of the script goes to standard output and is all rows from "file 1" whose two columns do not both have matching values somewhere in file 2. Take a good look at the example in my original post to understand what the script does.
Cheers
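
To make the file roles concrete, here is a minimal sketch that recreates the sample data from post #1 under the names the script expects and runs the post #4 logic against it. The scratch directory and the result variable are only there for illustration; they are not part of the original script.

```shell
#!/bin/sh
# Recreate the sample data from post #1 in a scratch directory.
# "searchfile" is file 2, "outputfile" is file 1.
cd "$(mktemp -d)" || exit 1

cat > outputfile <<'EOF'
111 555
222 666
333 777
444 755
EOF

cat > searchfile <<'EOF'
000 110 113 114
844 111 555 999
202 777 865 098 023
222 313 499
065 655 333 011 890 777
433
EOF

# Same logic as the post #4 script: print a file 1 line unless
# both of its columns occur somewhere in file 2.
result=$(awk '
FILENAME=="searchfile" { for(i=1;i<=NF;i++) Keys[$i]++ }
FILENAME=="outputfile" {
	found=0
	for(i=1;i<=NF;i++) if(Keys[$i]>0) found++
	if (found != 2) print $0
}
' searchfile outputfile)

echo "$result"
```

With this data the two lines printed are "222 666" and "444 755", which is the result asked for in post #1.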
# 7  
Old 09-20-2006
Comparing multi columns over 2 files

Greetings folks

I've found a problem with the code: it does everything described in the original thread, except it doesn't do it row by row.
For example, if I have the following files:

File1
111 222
333 444
555 666

File2
000 999 211 333
111 020 222
444 990 433

The output should be:
333 444
555 666

Actual result is:
555 666

That happens because it finds the values 333 and 444 from file1 in 2 different rows of file2. I need a row from file1 to count as matched only when both of its columns appear in the same row of file2. I've been playing with this for 2 days and am once again stumped.

Thanks in advance!

Code:
#!/bin/ksh
# this looks at all the columns in file2
# prints a line from file1 unless both of its columns
# appear somewhere in file2 (in any row, which is the problem)
awk '
	FILENAME=="file2" {
		for(i=1;i<=NF;i++) {Keys[$i]++}
	}
	FILENAME=="file1" {
		found=0
		for(i=1;i<=NF;i++) { if(Keys[$i]>0) {found++} }
		if (found != 2) {
			print $0
		}
	}
' file2 file1
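
One possible way to get the row-by-row behaviour described above, offered as a sketch and not as an answer taken from the thread: load file1 first, then check each row of file2 on its own and mark every file1 pair whose two values both occur in that single row. Note the assumptions: the files are passed in the order file1 file2 (the reverse of the scripts above), file1 is not empty (otherwise the NR==FNR test would misfire), and the array names line, a, b, row and matched are made up for this example.

```shell
#!/bin/sh
# Sketch: print file1 lines whose two columns never occur together
# in a single row of file2. Sample data is taken from post #7.
cd "$(mktemp -d)" || exit 1
printf '%s\n' '111 222' '333 444' '555 666' > file1
printf '%s\n' '000 999 211 333' '111 020 222' '444 990 433' > file2

result=$(awk '
NR==FNR {                       # first file on the command line: file1
	line[FNR]=$0; a[FNR]=$1; b[FNR]=$2; n=FNR
	next
}
{                               # every row of file2
	split("", row)              # clear the per-row lookup table
	for (i=1; i<=NF; i++) row[$i]=1
	for (k=1; k<=n; k++)        # mark pairs fully contained in this row
		if ((a[k] in row) && (b[k] in row)) matched[k]=1
}
END {                           # print unmatched pairs in file1 order
	for (k=1; k<=n; k++) if (!(k in matched)) print line[k]
}
' file1 file2)

echo "$result"
```

With the sample files from this post it prints "333 444" and "555 666". Every file1 pair is checked against every file2 row, which should be fine for files of a few thousand lines each; for much larger inputs an index keyed on the first column would be worth considering.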