Compare columns of multiple files and print the strings unique to File1 to an output file


 
# 1  
Old 10-15-2013

Hi,

I have multiple files that each contain one column of strings:

File1:
Code:
123abc
456def
789ghi

File2:
Code:
123abc
456def
891jkl

File3:
Code:
234mno
123abc
456def

In total I have 25 files of this type.

I want to compare the strings in File1 with the 24 other files and print to my output file ONLY those strings in File1 that do not appear in ANY other file.

Can anyone help me?
Thanks!
Moderator's Comments:
Mod Comment Since you didn't include CODE tags I can't be sure whether or not you intended for there to be an empty line at the start of File1. When I added CODE tags for you, I assumed that you did not want an empty line at the start of File1.
Please use CODE tags so we don't have to guess.

Last edited by Don Cragun; 10-15-2013 at 08:15 AM.. Reason: Add CODE tags.
# 2  
Old 10-15-2013
Could this help?

Code:
awk 'NR==FNR{a[$1]++;next}{ if ($1 in a){a[$1]="Y"}} END{ for (i in a){if (a[i] != "Y"){print i}}} ' file1 file2 file3
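
Note that awk's for (i in a) loop visits array elements in an unspecified order, so the output need not follow the order of file1. If that order matters, one possible variation is sketched below. It is not part of the original post: the array names seen and order, the output file name output.txt, and the extra sample string 555xyz are made up for illustration.

```shell
# Sample data (file1 deliberately unsorted, with one extra unique string).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 789ghi 123abc 456def 555xyz > file1
printf '%s\n' 123abc 456def 891jkl > file2
printf '%s\n' 234mno 123abc 456def > file3

awk '
NR == FNR {                    # first file: remember each new string and its position
    if (!($1 in seen)) { seen[$1] = 0; order[++n] = $1 }
    next
}
$1 in seen { seen[$1] = 1 }    # later files: flag strings that reappear
END {
    for (k = 1; k <= n; k++)   # report unflagged strings in file1 order
        if (!seen[order[k]]) print order[k]
}
' file1 file2 file3 > output.txt

cat output.txt
```

With this sample data, output.txt should contain 789ghi followed by 555xyz, in the order they appear in file1.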

# 3  
Old 10-15-2013
How about this:
Code:
comm -23 <(sort file1) <(sort file2 file3 file4)
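
The <( ... ) process substitution needs bash, ksh or zsh. For a plain POSIX sh, a rough equivalent with temporary files might look like the sketch below. The names first.sorted, others.sorted and output.txt are hypothetical, the files are assumed to be called File1, File2, and so on, and sort -u is used so that repeated lines within a file do not affect the comparison.

```shell
# Sample data matching the files in the question.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 123abc 456def 789ghi > File1
printf '%s\n' 123abc 456def 891jkl > File2
printf '%s\n' 234mno 123abc 456def > File3

# comm expects sorted input; sort -u also drops duplicate lines.
sort -u File1 > first.sorted

# Concatenate every File* except File1, then sort the combined list.
for f in File*; do
    [ "$f" = File1 ] || cat "$f"
done | sort -u > others.sorted

# -2 drops lines only in others.sorted, -3 drops lines common to both,
# leaving the lines that occur only in File1.
comm -23 first.sorted others.sorted > output.txt

cat output.txt
```

For the sample data this leaves only 789ghi in output.txt.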

# 4  
Old 10-15-2013
A slight simplification of pravin27's script:
Code:
awk '
FNR == NR { l[$0]; next }
$0 in l { delete l[$0] }
END { for(i in l) print i } 
' File*

As always, if you want to try this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of just awk.
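
If the same script has to run on both Solaris and other systems, the choice of awk could be made at run time. This is only a sketch: the variable name AWK and the output file name output.txt are assumptions, and the sample files are created just to make it runnable.

```shell
# Sample data matching the files in the question.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 123abc 456def 789ghi > File1
printf '%s\n' 123abc 456def 891jkl > File2
printf '%s\n' 234mno 123abc 456def > File3

AWK=awk                                   # default on most systems
for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk; do
    if [ -x "$candidate" ]; then          # present on Solaris/SunOS
        AWK=$candidate
        break
    fi
done

# File* expands in sorted order, so File1 is read first.
"$AWK" '
FNR == NR { l[$0]; next }
$0 in l { delete l[$0] }
END { for(i in l) print i }
' File* > output.txt

cat output.txt
```

The redirection writes the result to an output file, as asked in the original post; for the sample data it contains 789ghi.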
# 5  
Old 10-15-2013
Thank you for all of the responses. Don Cragun, your simplification of Pravin's command line works great. Simple and effective. Thanks again!
# 6  
Old 10-15-2013
Hi.

Using grep, assuming that the contents of the 24 files fit into available memory:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate inverse, "-v", match, grep with auxiliary file.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C grep

pl " Input data file primary data*:"
head primary data*

pl " Results:"
grep -v -x -F -f <( cat data* ) primary

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
grep GNU grep 2.5.3

-----
 Input data file primary data*:
==> primary <==
123abc
456def
789ghi

==> data1 <==
123abc
456def
891jkl

==> data2 <==
234mno
123abc
456def

-----
 Results:
789ghi

See man pages for details.
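
One point worth knowing about grep -f: each line of the pattern file is treated as a regular expression and may match anywhere in a line, unless -F (fixed strings) and -x (whole-line match) are given. A small sketch of the difference, using made-up sample data in which 123abcdef merely contains 123abc:

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 123abc 123abcdef 789ghi > primary
printf '%s\n' 123abc 891jkl > data1

# Pattern lines used as regular expressions, matched anywhere in the line:
loose=$(grep -v -f data1 primary)

# Fixed strings (-F) that must match the whole line (-x):
strict=$(grep -v -x -F -f data1 primary)

printf 'without -x -F:\n%s\n' "$loose"
printf 'with -x -F:\n%s\n' "$strict"
```

Without the two options, 123abcdef is dropped because it contains 123abc; with them, it is kept alongside 789ghi.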

Best wishes ... cheers, drl