Compare two big files for differences using Linux


 
# 1  
Old 08-15-2017

Hello everybody

I'm looking for help comparing two files on Linux. The files are big, about 800 MB each.

Example:

file1 has the following data:
Code:
$ cat file1
5,6,3
2.1.4
1,1,1
8,9,1



file2 has the following data:
Code:
$ cat file2
5,6,3
8,9,8
1,2,1
2,1,4



I need the output below:
Code:
8,9,8
1,2,1
1,1,1
8,9,1

I tried the awk commands below, but their output is not correct:
Code:
$ awk 'NR==FNR{a[$0]++;next} !a[$0]' file2 file1
2.1.4
1,1,1
8,9,1

$ cat vlookup.awk
FNR==NR{
a[$1]=$2
next
}
{ if ($1 in a) {print $1, a[$1]} else {print $1, "NA"} }

$ awk -f vlookup.awk file2 file1 | column -t
5,6,3
2.1.4 NA
1,1,1 NA
8,9,1 NA



I also tried the while loop with grep below, but it takes a lot of time.

Code:
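$ cat scp.sh
# Collect every line of file2 that grep cannot find in file1.
rm -f newfile.txt
while read -r line
do
    # -q: only the exit status matters, so suppress grep's output
    if ! grep -qie "${line}" file1 ; then
        echo "$line" >> newfile.txt
    fi
done <file2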



Code:
./scp.sh
8,9,8
1,2,1

This is correct, but it takes a very long time on big files.

Please suggest a faster way.


Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!

Last edited by RudiC; 08-15-2017 at 01:22 PM.. Reason: Added CODE tags.
# 2  
Old 08-15-2017
How about
Code:
cat file[12] | tr '.' ',' | sort | uniq -c | grep "^ *1 "
      1 1,1,1
      1 1,2,1
      1 8,9,1
      1 8,9,8

EDIT: or even
Code:
cat file[12] | tr '.' ',' | sort | uniq -u
1,1,1
1,2,1
8,9,1
8,9,8

# 3  
Old 08-15-2017
Thanks RudiC.

And how about getting the common lines out of these files?
# 4  
Old 08-15-2017
How about man uniq? Look for the -d option...
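
A minimal sketch of the `uniq -d` hint, using the sample data from the first post. The scratch directory and the `common` variable are illustrative names only, and the `tr` step assumes that `2.1.4` and `2,1,4` are meant to be the same record, as in the earlier reply.

```shell
# Sample data from the first post, written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 5,6,3 2.1.4 1,1,1 8,9,1 > "$dir/file1"
printf '%s\n' 5,6,3 8,9,8 1,2,1 2,1,4 > "$dir/file2"

# Normalize '.' to ',' first, then sort, so equal lines end up adjacent.
# uniq -d prints one copy of every line that occurs more than once.
common=$(cat "$dir/file1" "$dir/file2" | tr '.' ',' | LC_ALL=C sort | uniq -d)
printf '%s\n' "$common"

rm -rf "$dir"
```

Note that a line repeated inside a single file is also reported as common by this approach. If that can happen in the real data, de-duplicating each file first (for example with `sort -u`) avoids it.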
# 5  
Old 08-15-2017
Hi RudiC,

The sort is good for listing differences, but my requirement is to read the content of one file, check each line against the other file, and print the line if it is not present there.
That is exactly what the while loop with grep does, but I need it in a faster manner, since the code below takes so much time.

Code:
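$ cat scp.sh
# Collect every line of file2 that grep cannot find in file1.
rm -f newfile.txt
while read -r line
do
    # -q: only the exit status matters, so suppress grep's output
    if ! grep -qie "${line}" file1 ; then
        echo "$line" >> newfile.txt
    fi
done <file2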
  
 ./scp.sh
8,9,8
1,2,1


Moderator's Comments:
Mod Comment Seriously! Please use CODE tags as required by forum rules!

Last edited by RudiC; 08-15-2017 at 02:24 PM.. Reason: Added CODE tags.
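
The loop in this post starts one `grep` per input line, which is why it is slow on 800 MB files. A single-pass sketch with awk is shown here; it is one possible approach, not something proposed in the thread. It assumes exact whole-line matching is wanted (the loop's `grep -i` is case-insensitive and matches substrings) and that `.` and `,` should be treated as the same separator. The scratch directory and the `missing` variable are illustrative.

```shell
# Sample data from the first post, written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 5,6,3 2.1.4 1,1,1 8,9,1 > "$dir/file1"
printf '%s\n' 5,6,3 8,9,8 1,2,1 2,1,4 > "$dir/file2"

# First file (NR==FNR): remember every normalized line of file1.
# Second file: print each line of file2 that was never seen in file1.
missing=$(awk '
    { gsub(/\./, ",") }            # treat "2.1.4" like "2,1,4"
    NR == FNR { seen[$0]; next }   # still reading file1
    !($0 in seen)                  # reading file2: print unseen lines
' "$dir/file1" "$dir/file2")
printf '%s\n' "$missing"

rm -rf "$dir"
```

This keeps every distinct line of file1 in memory, so for very large inputs the sort-based approaches in this thread may be the safer choice.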
# 6  
Old 08-15-2017
That's NOT what you requested:
Quote:
Originally Posted by shanul karim
.
.
.
Need Output as below
Code:
8,9,8
1,2,1
1,1,1
8,9,1

.
.
.
Try - given you have a recent bash for your shell which you failed to mention -
Code:
comm <(tr '.' ',' <file1 | sort) <(tr '.' ',' <file2 | sort)
1,1,1
	1,2,1
		2,1,4
		5,6,3
8,9,1
	8,9,8

# 7  
Old 08-15-2017
Thanks RudiC for your valuable feedback and resolution.

Yes, true, I need what you shared in your earlier reply. The only issue is that in the output file I am unable to tell which file each differing entry belongs to, file1 or file2.

For example:

8,9,8 >> from file2
1,2,1 >> from file2
1,1,1 >> from file1
8,9,1 >> from file1

I need this because my files are very big, around 800 MB each.

If possible, I would like to get two different files: one listing the lines of file1 that are missing from file2, and the other listing the lines of file2 that are missing from file1.
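
One way to get the two separate lists is to let `comm` suppress the columns that are not wanted. This is a sketch under assumptions, not an answer given in the thread: the output names `only_in_file1.txt` and `only_in_file2.txt` are made up for illustration, and temporary sorted copies are used instead of process substitution so that it also runs in a plain POSIX shell.

```shell
# Sample data from the first post, written to a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 5,6,3 2.1.4 1,1,1 8,9,1 > "$dir/file1"
printf '%s\n' 5,6,3 8,9,8 1,2,1 2,1,4 > "$dir/file2"

# comm needs sorted input; normalize '.' to ',' before sorting.
tr '.' ',' < "$dir/file1" | LC_ALL=C sort > "$dir/file1.sorted"
tr '.' ',' < "$dir/file2" | LC_ALL=C sort > "$dir/file2.sorted"

# -23 hides columns 2 and 3, leaving lines found only in file1.
LC_ALL=C comm -23 "$dir/file1.sorted" "$dir/file2.sorted" > "$dir/only_in_file1.txt"
# -13 hides columns 1 and 3, leaving lines found only in file2.
LC_ALL=C comm -13 "$dir/file1.sorted" "$dir/file2.sorted" > "$dir/only_in_file2.txt"

only1=$(cat "$dir/only_in_file1.txt")
only2=$(cat "$dir/only_in_file2.txt")

# Optional: one combined list with the origin appended to each line.
tagged=$(sed 's/$/ >> from file1/' "$dir/only_in_file1.txt"
         sed 's/$/ >> from file2/' "$dir/only_in_file2.txt")
printf '%s\n' "$tagged"

rm -rf "$dir"
```

Because `sort` spills to temporary files when the data does not fit in memory, this approach should cope with inputs of several hundred megabytes, though the actual run time depends on the machine.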