compare 2 files and return unique lines in each file (based on condition)


 
# 1  
Old 08-25-2012

Hi,
My problem is a little complicated. I have 2 files that look like this:
Code:
file1
abbsss:aa:22:34:as akl abc 1234
mkilll:as:ss:23:qs asc abc 0987
mlopii:cd:wq:24:as asd abc  7866

Code:
file2
lkoaa:as:24:32:sa alk abc 3245
lkmo:as:34:43:qs qsa abc 0987
kloia:ds:45:56:sa acq abc 7805

I have to find the unique lines on the basis of the 4th field (a numeric field), which always comes after abc in both files. For each line, I need to check whether its value matches a line in the other file (the lines may be in a different order) within a +/- 100 range. For example,
Code:
mlopii:cd:wq:24:as asd abc  7866
kloia:ds:45:56:sa acq abc 7805

are not considered unique because their values fall within +/- 100 of each other. So my output should be as follows when checking for unique lines in file 1:
Code:
abbsss:aa:22:34:as akl abc 1234

and when checking for unique lines in file 2:
Code:
lkoaa:as:24:32:sa alk abc 3245

Hope I am clear.
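
The rule described above can be checked for a single pair of values with a small sketch. The helper name `in_range` is hypothetical, and treating a difference of exactly 100 as a match is an assumption, since the post does not say whether the boundary is included.

```shell
# Hypothetical helper: prints "match" when |A - B| <= 100, else "unique".
# Including the boundary (exactly 100) is an assumption.
in_range() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        d = a - b
        if (d < 0) d = -d          # absolute difference
        print (d <= 100 ? "match" : "unique")
    }'
}

in_range 7866 7805   # the pair from the example: differs by 61, prints "match"
in_range 1234 0987   # differs by 247, prints "unique"
```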
# 2  
Old 08-25-2012
Are the input files sorted on col 4, or can they be?
# 3  
Old 08-25-2012
Code:
 awk 'NR==FNR{a[FNR]=$0;} NR!=FNR{b[FNR]=$0;} END{for(x in a) { split(a[x],c_a," ");split(b[x],c_b," "); if(c_b[4]!= c_a[4] && (c_b[4]-c_a[4]>=100 || c_b[4]-c_a[4] <= -100)) {printf("%s\n%s\n",a[x],b[x]);}}}' file1 file2

Forget the above. I got the requirement wrong, I guess.

Last edited by msabhi; 08-25-2012 at 02:17 PM..
# 4  
Old 08-25-2012
Try this:
Code:
awk 'FNR == NR { # Accumulate records from 1st file.
        f1[++n1] = $0
        low1[n1] = $4 - 100
        mid1[n1] = $4
        high1[n1] = $4 + 100
        next
}
        { # Accumulate records from 2nd file
        low2[++n2] = $4 - 100
        high2[n2] = $4 + 100
        # Look for lines in 1st file that are in range of $4 in 2nd file
        for(i = 1; i <= n1; i++)
                if(($4 > low1[i]) && ($4 < high1[i]))
                        next # match found
        # This line is unique.
        print $0 > "UniqueIn2ndFile"
}
END     { # Look for lines in 1st file that are unique versus 2nd file
        for(j = 1; j <= n1; j++) {
                for(i = 1; i <= n2; i++)
                        if((mid1[j] > low2[i]) && (mid1[j] < high2[i]))
                                break # match found
                if(i > n2) print f1[j] > "UniqueIn1stFile"
        }
}' file1 file2


Last edited by Don Cragun; 08-25-2012 at 03:55 PM..
# 5  
Old 08-26-2012
Could this help you?

Code:
awk 'NR==FNR{a[NR]=$0;b[NR]=$4;next}
{if($4-b[FNR] > 100 || $4-b[FNR] < -100){ print a[FNR]; print $0}}' file1 file2

# 6  
Old 08-26-2012
Thank you very much for all the posts. The files are not ordered, and I hope the code provided would work even then.
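
On the ordering question: the code in post #5 compares line N of file1 only with line N of file2, so it depends on the two files lining up, while the nested loops in post #4 compare every line with every line of the other file and do not depend on order. A minimal order-independent sketch along the same lines is shown below. The helper name `unique_in_first` and the temporary directory are hypothetical, and counting a difference of exactly 100 as a match is an assumption (post #4 uses strict comparisons, so it would treat that case as unique).

```shell
# Hypothetical helper: print lines of "$1" whose 4th field has no partner
# in "$2" within +/- 100 (boundary included, which is an assumption).
unique_in_first() {
    awk 'FILENAME == ARGV[1] { v[++n] = $4; next }   # other file: remember every 4th field
    {
        for (i = 1; i <= n; i++) {
            d = $4 - v[i]
            if (d < 0) d = -d                        # absolute difference
            if (d <= 100) next                       # a partner exists, so not unique
        }
        print                                        # no partner found anywhere
    }' "$2" "$1"
}

# Sample data from post #1, written to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
abbsss:aa:22:34:as akl abc 1234
mkilll:as:ss:23:qs asc abc 0987
mlopii:cd:wq:24:as asd abc  7866
EOF
cat > "$dir/file2" <<'EOF'
lkoaa:as:24:32:sa alk abc 3245
lkmo:as:34:43:qs qsa abc 0987
kloia:ds:45:56:sa acq abc 7805
EOF

unique_in_first "$dir/file1" "$dir/file2"   # unique lines in file1
unique_in_first "$dir/file2" "$dir/file1"   # unique lines in file2
```

Each call reads both files once and compares every line of one file against all values of the other, so the cost grows with the product of the two line counts. Sorting on the 4th field, as asked in post #2, would only matter if that becomes too slow.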