Compare columns of multiple files and print the strings unique to File1 to an output file.

Tags
awk, compare
#1  10-15-2013  owwow14

Hi,

I have multiple files that each contain one column of strings:

File1:

Code:
123abc
456def
789ghi

File2:

Code:
123abc
456def
891jkl

File3:

Code:
234mno
123abc
456def

In total I have 25 files of this type.

I want to compare the strings in File1 with the 24 other files and print to my output file ONLY those strings in File1 that do not appear in ANY other file.

Can anyone help me?
Thanks!
Moderator's Comments:
Since you didn't include CODE tags I can't be sure whether or not you intended for there to be an empty line at the start of File1. When I added CODE tags for you, I assumed that you did not want an empty line at the start of File1.
Please use CODE tags so we don't have to guess.

Last edited by Don Cragun; 10-15-2013 at 07:15 AM. Reason: Add CODE tags.

#2  10-15-2013  pravin27 (Forum Advisor)
Could this help?


Code:
awk 'NR==FNR{a[$1]++;next}{ if ( $1 in a){a[$1]="Y"}} END{ for (i in a){if (a[i] != "Y"){print i}}} ' file1 file2 file3

#3  10-15-2013  Subbeh
How about this:

Code:
comm -23 <(sort file1) <(sort file2 file3 file4)
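
The `<( ... )` process substitution above needs bash, ksh, or zsh. A minimal sketch of the same idea for a plain `sh`, assuming `mktemp -d` is available; the sample data mirrors the thread's example, and the `first.sorted` and `others.sorted` names are hypothetical. `sort -u` also drops duplicate lines, so a string repeated inside file1 is not reported as unique by mistake.

```shell
#!/bin/sh
# Sketch only: hypothetical sample data mirroring the thread's example.
tmp=$(mktemp -d)
printf '%s\n' 123abc 456def 789ghi > "$tmp/file1"
printf '%s\n' 123abc 456def 891jkl > "$tmp/file2"
printf '%s\n' 234mno 123abc 456def > "$tmp/file3"

# Use the same byte-wise collation for sort and comm.
LC_ALL=C
export LC_ALL

# comm needs sorted input; temporary files stand in for <( ... ).
sort -u "$tmp/file1" > "$tmp/first.sorted"
sort -u "$tmp/file2" "$tmp/file3" > "$tmp/others.sorted"

# -2 hides lines found only in the other files, -3 hides lines found in both,
# leaving the lines found only in file1.
result=$(comm -23 "$tmp/first.sorted" "$tmp/others.sorted")
printf '%s\n' "$result"

rm -rf "$tmp"
```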

#4  10-15-2013  Don Cragun (Moderator)
A slight simplification of pravin27's script:

Code:
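awk '
FNR == NR { l[$0]; next }      # first file: remember every line
$0 in l { delete l[$0] }       # later files: drop lines that show up again
END { for(i in l) print i }    # what is left appears only in the first file
' File*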

As always, if you want to try this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of just awk.
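
The `File*` glob works here because the shell expands it in sorted order, so File1 is read first. A minimal sketch, assuming `mktemp -d` is available, that names File1 explicitly instead of relying on that ordering and writes the result to a file as the original question asked; the sample data and the `unique.txt` name are hypothetical. The order of `for (s in seen)` is unspecified in awk, so the output is piped through `sort`.

```shell
#!/bin/sh
# Sketch only: hypothetical sample data mirroring the thread's example.
dir=$(mktemp -d)
printf '%s\n' 123abc 456def 789ghi > "$dir/File1"
printf '%s\n' 123abc 456def 891jkl > "$dir/File2"
printf '%s\n' 234mno 123abc 456def > "$dir/File3"

# Collect every File* except File1 into the positional parameters.
set --
for f in "$dir"/File*; do
    [ "$f" = "$dir/File1" ] || set -- "$@" "$f"
done

# File1 goes first so that FNR == NR is true only while reading it
# (this assumes File1 is not empty).
awk '
FNR == NR  { seen[$0]; next }
$0 in seen { delete seen[$0] }
END        { for (s in seen) print s }
' "$dir/File1" "$@" | sort > "$dir/unique.txt"

result=$(cat "$dir/unique.txt")
printf '%s\n' "$result"

rm -rf "$dir"
```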

#5  10-15-2013  owwow14
Thank you for all of the responses. Don Cragun, your simplification of Pravin's command line works great. Simple and effective. Thanks again!

#6  10-15-2013  drl (Forum Advisor)
Hi.

Using grep, assuming that the 24 files will fit into available memory:

Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate inverse, "-v", match, grep with auxiliary file.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C grep

pl " Input data file primary data*:"
head primary data*

pl " Results:"
grep -v -F -x -f <( cat data* ) primary

exit 0

producing:

Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
grep GNU grep 2.5.3

-----
 Input data file primary data*:
==> primary <==
123abc
456def
789ghi

==> data1 <==
123abc
456def
891jkl

==> data2 <==
234mno
123abc
456def

-----
 Results:
789ghi

See man pages for details.
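
One detail worth checking in the man page: patterns read with `-f` are treated as regular expressions and may match anywhere in a line unless `-F` (fixed strings) and `-x` (whole-line match) are given. A minimal sketch of the difference, assuming `mktemp -d` is available; the data is hypothetical and chosen so that one pattern, `123`, is only a substring of a line in `primary`.

```shell
#!/bin/sh
# Sketch only: hypothetical data where "123" is a substring of "123abc".
dir=$(mktemp -d)
printf '%s\n' 123abc 456def 789ghi > "$dir/primary"
printf '%s\n' 123 456def > "$dir/data1"

# Patterns as regular expressions: "123" matches inside "123abc",
# so that line is wrongly dropped.
loose=$(grep -v -f "$dir/data1" "$dir/primary")

# -F: patterns are fixed strings; -x: a pattern must match the whole line.
strict=$(grep -v -F -x -f "$dir/data1" "$dir/primary")

printf '%s\n' "without -F -x:" "$loose" "with -F -x:" "$strict"

rm -rf "$dir"
```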

Best wishes ... cheers, drl