Diff two files with threshold value


 
# 1  
Old 09-15-2011

I have two big files, each with thousands of lines.
I have to sort them on two key fields and then diff the files.

If the integer value of one of the columns is less than or greater than 1, the diff should ignore it.

For example:
File1
Code:
abc|7000|jhon|2.3
xyz|9000|sam|6.7
pqr|8000|kapi|4.6

File2
Code:
abc|7000|jhon|2.3
xyz|9000|sam|6.7
pqr|8000|kapi|4.5
uvw|1000|abe|5.6

Expected diff output
Code:
uvw|1000|abe|5.6

The commands I tried are below:
Code:
sort file1 > sortfile1; sort file2 > sortfile2; diff sortfile1 sortfile2

Output:

Code:
2c2,3
< pqr|8000|kapi|4.6
---
> pqr|8000|kapi|4.5
> uvw|1000|abe|5.6

# 2  
Old 09-15-2011
Hi,

Your explanation is not clear to me:

Quote:
I have two big files, each with thousands of lines.
Do both files have the same number of lines?

Quote:
I have to sort them on two key fields and then diff the files.
Which fields?

Quote:
If the integer value of one of the columns is less than or greater than 1, the diff should ignore it.
I don't understand this.

Quote:
Expected diff output

Code:
  uvw|1000|abe|5.6

I don't understand that output. Do you mean lines that exist in one file but not in the other?

Regards,
Birei
# 3  
Old 09-16-2011
Hi, thanks for your response.

1) The two files have different numbers of lines:
file1 may have 50 thousand records and file2 may have 51 thousand records.

2) I have to sort the files on the second and third columns.
For example,
I have to sort file1 and file2 on the column containing 7000 and the column containing jhon:
Code:
abc|7000|jhon|2.3
xyz|9000|sam|6.7
pqr|8000|kapi|4.6


3) File 1 has
Code:
pqr|8000|kapi|4.6

File 2 has
Code:
pqr|8000|kapi|4.5


There is a difference of 0.1 in the fourth column (4.6 and 4.5).
The diff should ignore the line if the difference in the fourth column is 0.1.

4) If the difference is more than 0.1, it should be reported in the output.
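
The sort on the second and third columns described in point 2 could be sketched as below. This is only an illustration: sort_by_keys is a hypothetical helper name, and it assumes the fields are separated by | exactly as in the samples. LC_ALL=C is an added assumption to make the ordering reproducible across locales; append n to -k2,2 if the second column should be ordered numerically.

```shell
#!/bin/sh
# sort_by_keys is a hypothetical helper name, not something from the thread.
# Usage: sort_by_keys file1 > sortfile1
sort_by_keys() {
    # -t '|'    use | as the field separator
    # -k2,2     primary key: the second column only
    # -k3,3     secondary key: the third column only
    # LC_ALL=C  byte-wise ordering (an assumption, for reproducible results)
    LC_ALL=C sort -t '|' -k2,2 -k3,3 "$1"
}
```

Running it on both files before diff keeps matching records in the same relative order.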


# 4  
Old 09-16-2011
Therefore, the output should be:

1.- Lines that exist in only one of the two files.
2.- Lines whose first three columns are the same but whose fourth-column numbers differ by something other than 0.1 or -0.1. In that case, which file's line should be written to the output?

Regards,
Birei
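
The first case in the list above (lines present in only one file) might be sketched with awk as follows. The assumptions here are not from the thread: only_in_one is a hypothetical name, a record is identified by its first three columns, and the < and > prefixes follow diff's convention of marking the first and second file.

```shell
#!/bin/sh
# only_in_one is a hypothetical helper name, not something from the thread.
# Usage: only_in_one file1 file2
only_in_one() {
    awk -F'|' '
        # First file: remember each line under its three-column key.
        NR == FNR { first[$1 FS $2 FS $3] = $0; next }
        # Second file: mark keys found in both, print keys that are new.
        {
            key = $1 FS $2 FS $3
            if (key in first) matched[key] = 1
            else print "> " $0
        }
        # Finally, print first-file lines whose key never showed up again.
        END {
            for (k in first)
                if (!(k in matched)) print "< " first[k]
        }
    ' "$1" "$2"
}
```

The lines from the first file come out in no particular order, because awk does not define the traversal order of its arrays.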
# 5  
Old 09-16-2011
Thanks, Birei, for the quick response.
Yes, you are correct.

The fourth column is a real number, and if the difference is 0.1 or -0.1 it is acceptable, so the line does not need to be in the output.

If the difference is more than 0.1, the line should be in the output.
Let me show it with an example:

1) file1
Code:
abc|7000|jhon|2.3
xyz|9000|sam|6.7
pqr|8000|kapi|4.6
lmn|3000|kapi|4.6

2) file 2
Code:
abc|7000|jhon|2.3
xyz|9000|sam|6.7
pqr|8000|kapi|4.5
lmn|3000|kapi|4.1


Expectation:
1) The third row has a difference of 0.1, so it is acceptable and does not need to be in the output.

2) The fourth row has a difference of more than 0.1 (it is 0.5), so it should be in the output.
Output:
Code:
> lmn|3000|kapi|4.1
< lmn|3000|kapi|4.6
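
One possible way to get this output is an awk sketch like the one below; it is not a tested solution for the real data. The names threshold_diff, tol and eps are hypothetical, and records are assumed to be matched on their first three columns. The small eps is an added assumption to absorb floating-point rounding, so that a difference of exactly 0.1 is not reported by accident.

```shell
#!/bin/sh
# threshold_diff, tol and eps are hypothetical names, not from the thread.
# Usage: threshold_diff file1 file2 [tolerance]   (tolerance defaults to 0.1)
threshold_diff() {
    awk -F'|' -v tol="${3:-0.1}" -v eps=1e-9 '
        # First file: remember the fourth column and the whole line per key.
        NR == FNR {
            val[$1 FS $2 FS $3] = $4
            line[$1 FS $2 FS $3] = $0
            next
        }
        # Second file: compare only keys that also exist in the first file.
        {
            key = $1 FS $2 FS $3
            if (!(key in val)) next
            d = $4 - val[key]
            if (d < 0) d = -d              # absolute difference
            if (d > tol + eps) {           # beyond the tolerance: report both
                print "> " $0
                print "< " line[key]
            }
        }
    ' "$1" "$2"
}
```

Because the lookup goes through an array keyed on the first three columns, the input files do not have to be sorted first. Lines that appear in only one file are skipped by this sketch.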


# 6  
Old 09-16-2011
Do you want output if the line occurs in only one file, or only if it appears in both files and has a difference of more than 0.1 in the fourth column?
# 7  
Old 09-16-2011
Yes,
only if it appears in both files.