Compare columns of multiple files and print those unique string from File1 in an output file. | Unix Linux Forums | Shell Programming and Scripting

Tags
awk, compare

    #1  
10-15-2013
owwow14

Hi,

I have multiple files that each contain one column of strings:

File1:

Code:
123abc
456def
789ghi

File2:

Code:
123abc
456def
891jkl

File3:

Code:
234mno
123abc
456def

In total I have 25 files of this type.

I want to compare the strings in File1 with the 24 other files and print to my output ONLY those strings in File1 that do not appear in ANY other file.

Can anyone help me?
Thanks!
Moderator's Comments:
Since you didn't include CODE tags I can't be sure whether or not you intended for there to be an empty line at the start of File1. When I added CODE tags for you, I assumed that you did not want an empty line at the start of File1.
Please use CODE tags so we don't have to guess.

Last edited by Don Cragun; 10-15-2013 at 07:15 AM. Reason: Add CODE tags.
    #2  
10-15-2013
pravin27 (Forum Advisor)
Could this help?


Code:
awk 'NR==FNR{a[$1]++;next} ($1 in a){a[$1]="Y"} END{for (i in a) if (a[i] != "Y") print i}' file1 file2 file3

    #3  
10-15-2013
Subbeh
How about this:

Code:
comm -23 <(sort -u file1) <(sort -u file2 file3 file4)

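Note that the `<( ... )` process substitution above needs bash, ksh, or zsh. As a rough sketch for a plain POSIX sh, the same idea can be done with temporary files. The scratch directory, the sample files, and the names first.sorted, others.sorted, and output.txt are only assumptions for illustration; with the real data you would list all 24 other files on the second sort.

```shell
#!/bin/sh
# Sketch: comm-based comparison without process substitution.
# The scratch directory and sample files below are illustrative assumptions.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf '%s\n' 123abc 456def 789ghi > File1
printf '%s\n' 123abc 456def 891jkl > File2
printf '%s\n' 234mno 123abc 456def > File3

# comm expects both inputs sorted in the same collation order,
# so the locale is pinned for sort and comm alike.
LC_ALL=C sort -u File1 > first.sorted
LC_ALL=C sort -u File2 File3 > others.sorted

# -2 suppresses lines found only in the other files,
# -3 suppresses lines common to both; what remains is unique to File1.
LC_ALL=C comm -23 first.sorted others.sorted > output.txt
cat output.txt
```

With the three sample files from the first post this leaves only 789ghi in output.txt. The result comes out sorted, not in File1's original order.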
    #4  
10-15-2013
Don Cragun (Forum Staff)
A slight simplification of pravin27's script:

Code:
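awk '
# First file (File1): remember every line as an array key.
FNR == NR { l[$0]; next }
# Remaining files: forget any remembered line that shows up again.
$0 in l { delete l[$0] }
# Whatever is left appears only in File1 (output order is unspecified).
END { for(i in l) print i }
' File*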

As always, if you want to try this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of just awk.
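Since awk's `for (i in l)` loop visits keys in no guaranteed order, here is a possible variant that keeps the surviving strings in File1's original order and writes them to an output file. This is only a sketch: the array names pos and line, the file name output.txt, and the sample data are my own assumptions, and only three files are listed where the real command would name all 25 with File1 first.

```shell
#!/bin/sh
# Sketch: order-preserving variant of the awk approach.
# The scratch directory and sample data are illustrative assumptions.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf '%s\n' zzz 123abc aaa 456def 789ghi > File1
printf '%s\n' 123abc 456def 891jkl > File2
printf '%s\n' 234mno 123abc 456def > File3

awk '
# First file: record each distinct line together with its position.
FNR == NR {
    if (!($0 in pos)) { pos[$0] = ++n; line[n] = $0 }
    next
}
# Other files: drop any recorded line that appears again.
$0 in pos { delete line[pos[$0]]; delete pos[$0] }
# Print the survivors in the order they appeared in the first file.
END { for (k = 1; k <= n; k++) if (k in line) print line[k] }
' File1 File2 File3 > output.txt

cat output.txt
```

Like the script above, this relies on `FNR == NR` to recognise the first file, so it assumes File1 is not empty.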
    #5  
10-15-2013
owwow14
Thank you for all of the responses. Don Cragun, your simplification of Pravin's command line works great. Simple and effective. Thanks again!
    #6  
10-15-2013
drl (Forum Advisor)
Hi.

Using grep, assuming that the contents of the 24 files fit into available memory:

Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate inverse, "-v", match, grep with auxiliary file.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C grep

pl " Input data file primary data*:"
head primary data*

pl " Results:"
# -F: treat patterns as fixed strings; -x: match whole lines only.
grep -v -x -F -f <( cat data* ) primary

exit 0

producing:

Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
grep GNU grep 2.5.3

-----
 Input data file primary data*:
==> primary <==
123abc
456def
789ghi

==> data1 <==
123abc
456def
891jkl

==> data2 <==
234mno
123abc
456def

-----
 Results:
789ghi

See man pages for details.
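One thing worth checking with `grep -f`: each line of the pattern file is treated as a regular expression and may match anywhere in a line, so a string that merely contains a listed string is also filtered out. The standard `-F` (fixed strings) and `-x` (whole line) options avoid that. The sketch below compares the two; the file patterns.txt and the extra sample string 123abcX are assumptions added for the demonstration, and a temporary file is used in place of process substitution so it runs under plain sh.

```shell
#!/bin/sh
# Sketch: pattern matching versus literal whole-line matching with grep -f.
# The scratch directory and sample data are illustrative assumptions.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf '%s\n' 123abc 123abcX 789ghi > primary
printf '%s\n' 123abc 456def > data1
printf '%s\n' 234mno 123abc > data2

# Collect the strings from all the other files into one pattern file.
cat data1 data2 > patterns.txt

# Regex/substring matching: 123abcX is dropped because it contains 123abc.
loose=$(grep -v -f patterns.txt primary)

# Fixed-string, whole-line matching: only exact duplicates are dropped.
strict=$(grep -v -F -x -f patterns.txt primary)

printf 'pattern match:\n%s\n' "$loose"
printf 'literal whole-line match:\n%s\n' "$strict"
```

Here the first form keeps only 789ghi, while the second keeps both 123abcX and 789ghi.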

Best wishes ... cheers, drl