08-15-2017
Compare two big files for differences using Linux
Hello everybody
I am looking for help comparing two files in Linux (the files are big, about 800 MB each).
Example:
file1 has the data below:
$ cat file1
5,6,3
2.1.4
1,1,1
8,9,1
file2 has the data below:
$ cat file2
5,6,3
8,9,8
1,2,1
2,1,4
I need the output below:
8,9,8
1,2,1
1,1,1
8,9,1
I tried the awk command below, but its output is not correct:
$ awk 'NR==FNR{a[$0]++;next} !a[$0]' file2 file1
2.1.4
1,1,1
8,9,1
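A note on why this prints only half of the wanted result: the one-liner loads file2 into the array and then prints the lines of file1 that are missing from it, so the lines that exist only in file2 never appear. A minimal sketch that runs the same idea in both directions follows. The function name sym_diff is made up for illustration, and the sketch assumes there is enough memory to hold the distinct lines of one file at a time.

```shell
#!/bin/sh
# sym_diff A B: print the lines found only in B (in B's order),
# then the lines found only in A (in A's order).
# The name sym_diff is illustrative; it does not come from the post.
sym_diff() {
    # Pass 1 stores every line of A as an array key; pass 2 prints B lines not stored.
    awk 'NR==FNR { seen[$0]; next } !($0 in seen)' "$1" "$2"
    # Same comparison with the roles of the two files swapped.
    awk 'NR==FNR { seen[$0]; next } !($0 in seen)' "$2" "$1"
}

# Usage with the file names from the post:
# sym_diff file1 file2
```

With the data exactly as posted, file1 contains 2.1.4 (dots) while file2 contains 2,1,4 (commas), so both of those lines would also be listed as differences. The NR==FNR test also misbehaves if the first file is empty, which should not matter for files of this size.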
$ cat vlookup.awk
FNR==NR{
a[$1]=$2
next
}
{ if ($1 in a) {print $1, a[$1]} else {print $1, "NA"} }
$ awk -f vlookup.awk file2 file1 | column -t
5,6,3
2.1.4 NA
1,1,1 NA
8,9,1 NA
I tried the while loop below with grep, but it takes a lot of time.
$ cat scp.sh
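# Start with an empty result file.
rm -f newfile.txt
# For every line of file2, search file1 (case-insensitive, as a pattern).
while read -r line
do
    # grep -q only sets the exit status; a non-zero status means no match.
    if ! grep -qie "${line}" file1 ; then
        # The line was not found in file1, so record it.
        echo "$line" >> newfile.txt
    fi
done < file2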
./scp.sh
8,9,8
1,2,1
This is correct, but it takes a very long time on a big file.
Please suggest a better way that is fast.
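One reason the loop is slow is that it starts grep once per input line, so file1 is scanned again for every line of the other file. A possible alternative, shown here only as a sketch, is to sort both files once and let comm report the lines unique to each. The function name comm_diff and the use of mktemp -d are assumptions of this example (mktemp is common on Linux but not required by POSIX), and the output comes out in sorted order rather than in the original file order.

```shell
#!/bin/sh
# comm_diff A B: print the lines found only in B, then the lines found only in A.
# Each group is in sorted order. The name comm_diff is illustrative.
comm_diff() {
    tmpdir=$(mktemp -d) || return 1
    # comm requires sorted input; LC_ALL=C makes sort and comm agree on byte order.
    LC_ALL=C sort "$1" > "$tmpdir/a"
    LC_ALL=C sort "$2" > "$tmpdir/b"
    # -13 suppresses columns 1 and 3, leaving the lines unique to the second file.
    LC_ALL=C comm -13 "$tmpdir/a" "$tmpdir/b"
    # -23 suppresses columns 2 and 3, leaving the lines unique to the first file.
    LC_ALL=C comm -23 "$tmpdir/a" "$tmpdir/b"
    rm -rf "$tmpdir"
}

# Usage with the file names from the post:
# comm_diff file1 file2 > differences.txt
```

Unlike the loop, this compares whole lines exactly (no case folding and no pattern matching), and sort can spill to temporary files on disk, which may suit 800 MB inputs better than holding everything in memory.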