Simple script to find common strings in two files


 
# 1  
Old 09-22-2010

Hi,
I want to write a simple script.
I have two files:

file1:
Code:
BCSpeciality
Backend
CB
CBAPQualDisp
CBCimsVFTRCK
CBDSNQualDisp
CBDefault
CBDisney
CBFaxMCGen
CBMCGeneral
CBMCQualDisp

file2:
Code:
CSpeciality
Backend
CB
CBAPQualDisp
CBCimsVFTRCK
CBDSNQualDisp
CBDefault
CBDisney
CBFaxMCGen
CBMCGeneral
CBMCQualDisp
CBPLQualDisp
CBQualNonCID
CBRColl
CBRecon
CBRepr2
CBRisk


If a line is present in both files, it should be written to a third file; if it is not present in both files, it should be ignored.

Last edited by Franklin52; 09-22-2010 at 04:10 AM.. Reason: Please use code tags!
# 2  
Old 09-22-2010
Code:
$ ruby -ne 'BEGIN{a=File.read("file1").split(/\n+/)}; print $_ if a.include?($_.chomp)' file2

# 3  
Old 09-22-2010
Code:
grep -f file1 file2

# 4  
Old 09-22-2010
Quote:
Originally Posted by rdcwayx
Code:
grep -f file1 file2

Note the order of the grep arguments: file2 should come first.
# 5  
Old 09-22-2010
Yes, thank you for pointing that out.

Here is code that does not depend on the order of the files:

Code:
awk 'NR==FNR{a[$1]++;next} a[$1] ' file1 file2

# 6  
Old 09-22-2010
And that would have to be:
Code:
grep -Fxf file2 file1

Otherwise BCSpeciality would get matched, for example, because the pattern CSpeciality from file2 is a substring of it.
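
A minimal sketch of that difference, assuming cut-down versions of the two files written to a scratch directory (the shortened contents and the variable names are for illustration only). With plain -f each line of file2 is treated as a regular expression that may match anywhere in a line of file1; with -Fx it has to equal the whole line.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
printf '%s\n' BCSpeciality Backend CB CBDefault > "$tmp/file1"
printf '%s\n' CSpeciality Backend CBDefault > "$tmp/file2"

# Patterns from file2 used as regular expressions: substring hits count.
loose=$(grep -f "$tmp/file2" "$tmp/file1")

# Fixed strings (-F) that must match an entire line (-x).
strict=$(grep -Fxf "$tmp/file2" "$tmp/file1")

printf 'grep -f:\n%s\n\ngrep -Fxf:\n%s\n' "$loose" "$strict"
rm -rf "$tmp"
```

In this sample the first command also reports BCSpeciality, while the second reports only Backend and CBDefault.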

The order does not matter if you use:
Code:
awk 'NR==FNR{A[$0];next}$0 in A' file1 file2
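
For readers new to the NR==FNR idiom, the same one-liner can be spread out with comments, roughly as below. The array name seen, the sample data, and the output name file3 are assumptions for illustration; the OP only asked for "a third file".

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' BCSpeciality Backend CB CBDefault > "$tmp/file1"
printf '%s\n' CSpeciality Backend CB CBDefault CBRisk > "$tmp/file2"

awk '
    # NR counts lines over all input, FNR restarts for each file,
    # so NR == FNR holds only while the first file is being read.
    NR == FNR { seen[$0]; next }   # remember every line of file1
    $0 in seen                     # print file2 lines that were remembered
' "$tmp/file1" "$tmp/file2" > "$tmp/file3"

common=$(cat "$tmp/file3")
printf '%s\n' "$common"
rm -rf "$tmp"
```

The output keeps the line order of the second file named on the command line.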

S.

--
kurumi, I get:
Code:
-e:1: undefined local variable or method `a' for main:Object (NameError)



---------- Post updated at 07:16 ---------- Previous update was at 06:57 ----------

rdcwayx,
By using $1 instead of $0, awk matches on the first word of each line instead of the whole line. It could well be that this is what the OP actually intended (in fact that would seem likely), so your awk would be better suited, and then
Code:
grep -wFf file2 file1

would be needed, and my awk would become:
Code:
awk 'NR==FNR{A[$1];next}$1 in A' file1 file2
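
A small sketch of how the line-based and word-based variants diverge once a line carries more than one word. The sample data is invented for illustration; the OP's files have a single word per line, where all variants agree.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' 'CB old entry' Backend CBDefault > "$tmp/file1"
printf '%s\n' CB Backend CBRisk > "$tmp/file2"

# Whole-line comparison: "CB" is not equal to "CB old entry".
by_line=$(awk 'NR==FNR{A[$0];next}$0 in A' "$tmp/file1" "$tmp/file2")

# First-word comparison: "CB" now matches; lines are printed from file2.
by_word=$(awk 'NR==FNR{A[$1];next}$1 in A' "$tmp/file1" "$tmp/file2")

# Word match with grep: lines are printed from file1, and
# CBDefault is skipped because CB is not a separate word there.
grep_word=$(grep -wFf "$tmp/file2" "$tmp/file1")

printf 'by line:\n%s\n\nby word:\n%s\n\ngrep -w:\n%s\n' \
    "$by_line" "$by_word" "$grep_word"
rm -rf "$tmp"
```

Note that grep -w accepts the word anywhere in the line, whereas the awk version only looks at the first field.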


Last edited by Scrutinizer; 09-22-2010 at 02:24 AM..
# 7  
Old 09-22-2010
Quote:
Originally Posted by kurumi
Quote:
Originally Posted by rdcwayx
Code:
grep -f file1 file2

Note the order of the grep arguments: file2 should come first.
For this task, the order is irrelevant; a successful match must only occur when the line is in both files.

The problem here is the use of regular expressions for what is a fixed-string job. The -f option without -F uses basic regular expressions. If they aren't wrapped with "^" and "$", they allow substring matches to occur. That's incorrect for this case. Matches must be whole lines.

To make matters worse, if a line contains a regular expression special character (such as a "."), it may match a character that is not its literal self. Properly escaping a file to protect against this is error prone.
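
As a rough illustration of the special-character problem, assume a pattern file holding the single line v1.0 (an invented value, not taken from the OP's data). Even with -x forcing a whole-line match, the unescaped "." still stands for any character:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' 'v1.0' > "$tmp/file1"
printf '%s\n' 'v1.0' 'v1x0' > "$tmp/file2"

# Pattern read as a basic regular expression: "." matches any character.
as_regex=$(grep -xf "$tmp/file1" "$tmp/file2")

# Pattern read as a fixed string: "." matches only a literal dot.
as_fixed=$(grep -Fxf "$tmp/file1" "$tmp/file2")

printf 'regex:\n%s\n\nfixed:\n%s\n' "$as_regex" "$as_fixed"
rm -rf "$tmp"
```

Here the regex form reports both v1.0 and v1x0, while the fixed-string form reports only v1.0.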

The correct solution is to avoid regular expressions and instead use fixed strings (-F) that must match an entire line (-x).

Code:
grep -Fxf file1 file2

or
Code:
grep -Fxf file2 file1

They are interchangeable.
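
A quick check of that claim on sample data, together with the redirection to a third file that the OP asked for. The shortened file contents and the output name file3 are assumptions for illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' Backend CB CBDefault BCSpeciality > "$tmp/file1"
printf '%s\n' CBDefault CSpeciality Backend CB CBRisk > "$tmp/file2"

# Either order selects the same set of lines...
a=$(grep -Fxf "$tmp/file1" "$tmp/file2")
b=$(grep -Fxf "$tmp/file2" "$tmp/file1")

# ...but the output order follows whichever file is searched,
# so sort both results before comparing them.
if [ "$(printf '%s\n' "$a" | sort)" = "$(printf '%s\n' "$b" | sort)" ]; then
    echo "same set of common lines"
fi

# Write the common lines to a third file.
grep -Fxf "$tmp/file1" "$tmp/file2" > "$tmp/file3"
third=$(cat "$tmp/file3")
rm -rf "$tmp"
```

One caveat: if either file contains duplicate lines, the two orders can print a different number of copies, since grep prints every matching line of the file it searches.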

Regards,
Alister