Comparing two columns with two columns


 
# 1  
Old 02-18-2013

I have a file in which I need to compare the values in two columns with the values in another two columns. For example:

Code:
    Item  A     B     C    D
    1    201  3101   3101 201
    2   3101   201    202 3101  
    3   3101   201    201 3102 
    4   3101   201   3202 202

What I am trying to do is compare columns A and B with columns C and D, where the values within A&B and within C&D are interchangeable (order does not matter). I would like the output to be 2 if both values match, 1 if only one value matches, and 0 if neither matches. The result would look something like this:


Code:
    Item  A     B     C    D   Output
    1    201  3101   3101 201    2
    2   3101   201    202 3101   1
    3   3101   201    201 3102   1
    4   3101   201   3202 202    0

Thank you in advance!!

Moderator's Comments:
Mod Comment edit by bakunin: which part of "Do you have any code fragments or data samples in your post? If so wrap them in code tags using the code tag button in the editor below", which is written right above the editor window, was not understandable? Please, really, really, definitely use them. Thanks.

Last edited by bakunin; 02-18-2013 at 10:42 AM..
# 2  
Old 02-18-2013
Code:
awk 'NR==1 {print; next} {
    if (($2 == $4 || $2 == $5) && ($3 == $4 || $3 == $5)) {
        print $0, 2
        next
    }
    else if (($2 == $4 || $2 == $5) && ($3 != $4 || $3 != $5)) {
        print $0, 1
        next
    }
    else if (($2 != $4 || $2 != $5) && ($3 == $4 || $3 == $5)) {
        print $0, 1
        next
    }
    else if (($2 != $4 || $2 != $5) && ($3 != $4 || $3 != $5)) {
        print $0, 0
        next
    }
}' file

# 3  
Old 02-18-2013
Code:
$ awk '{for(i=2;i<=NF;i++){if(A[$i,NR]++){n++}};$(NF+1)=n;n=0}1' file

1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3202 202 0

Or, a little bit more robust:

Code:
$ awk '{for(i=2;i<=3;i++){A[$i,NR]++};for(i=4;i<=NF;i++){if(A[$i,NR]){n++}};$(NF+1)=n;n=0}1' file

1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3202 202 0


Last edited by pamu; 02-18-2013 at 05:02 AM..
# 4  
Old 02-18-2013
Try the following...
Code:
#!/usr/bin/perl

while(<DATA>){
    chomp;
    my $count=0;
    @fields=split(/\s+/,$_);
    $count++ if $fields[3]=~/^($fields[1]|$fields[2])$/;
    $count++ if $fields[4]=~/^($fields[1]|$fields[2])$/;
    print "$_\t$count\n";
}
__DATA__
Item A B C D
1 201 3101 3101 201
2 3101 201 202 3101
3 3101 201 201 3102
4 3101 201 3202 202

Or as a (slightly less clear) one liner...

Code:
$ perl -ne 'chomp; $c=0;@f=split;map{$c++ if/^(?:$f[1]|$f[2])$/}@f[3,4];print "$_ $c\n";' tmp/tmp.dat
1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3202 202 0


Last edited by Skrynesaver; 02-18-2013 at 05:00 AM..
# 5  
Old 02-18-2013
A bit easier to read (?):
Code:
$ awk 'NR==1; NR > 1 {$0=$0" "; print $0 gsub(" "$2" | "$3, "&") - 1 }' file
Item A B C D
1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3201 202 0

# 6  
Old 02-18-2013
Quote:
Originally Posted by balajesuri
Code:
awk 'NR==1 {print; next} {
    if (($2 == $4 || $2 == $5) && ($3 == $4 || $3 == $5)) {
        print $0, 2
        next
    }
    else if (($2 == $4 || $2 == $5) && ($3 != $4 || $3 != $5)) {
        print $0, 1
        next
    }
    else if (($2 != $4 || $2 != $5) && ($3 == $4 || $3 == $5)) {
        print $0, 1
        next
    }
    else if (($2 != $4 || $2 != $5) && ($3 != $4 || $3 != $5)) {
        print $0, 0
        next
    }
}' file

The above logic can be reduced to:

Code:
awk 'NR==1 {print; next} {
    if (($2 == $4 || $2 == $5) && ($3 == $4 || $3 == $5)) {
        $(NF+1)=2}
    else if (($2 == $4 || $2 == $5) || ($3 == $4 || $3 == $5)) {
        $(NF+1)=1}
    else {$(NF+1)=0}
    print
    }' file

---------- Post updated at 02:55 PM ---------- Previous update was at 02:50 PM ----------

Quote:
Originally Posted by RudiC
A bit easier to read (?):
Code:
$ awk 'NR==1; NR > 1 {$0=$0" "; print $0 gsub(" "$2" | "$3, "&") - 1 }' file
Item A B C D
1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3201 202 0

This may fail for the input file below, where the last line has the same value in every column:

Code:
$ awk 'NR==1; NR > 1 {$0=$0" "; print $0 gsub(" "$2" | "$3, "&") - 1 }' file

Item A B C D
1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3202 202 0
4 201 201 201 201 1
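
As far as I can tell, the miscount on the last line happens because the gsub matches are non-overlapping and each match uses up the spaces around it, so adjacent equal fields cannot all be counted. One way around it is to compare the pairs explicitly. The following is only a minimal sketch: it assumes a repeated value should be paired off at most once (so 201 201 against 201 999 scores 1), and `score_pairs` is a made-up helper name, not something from this thread.

```shell
# score_pairs is a hypothetical helper name used for illustration only.
# Assumes whitespace-separated input with a header on line 1.
score_pairs() {
    awk '
    NR == 1 { print $0, "Output"; next }   # keep the header and label the new column
    {
        if (($2 == $4 && $3 == $5) || ($2 == $5 && $3 == $4))
            n = 2                          # both values pair off, in either order
        else if ($2 == $4 || $2 == $5 || $3 == $4 || $3 == $5)
            n = 1                          # only one value can be paired
        else
            n = 0                          # nothing in common
        print $0, n
    }' "$@"
}

printf '%s\n' 'Item A B C D' '1 201 3101 3101 201' '4 201 201 201 201' '5 201 201 201 999' | score_pairs
```

With this the all-201 line is scored 2 rather than 1. The function reads standard input when called without arguments, or a file when given one, e.g. `score_pairs file`.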

# 7  
Old 02-18-2013
Quote:
Originally Posted by pamu
The above logic can be reduced to:

Code:
awk 'NR==1 {print; next} {
    if (($2 == $4 || $2 == $5) && ($3 == $4 || $3 == $5)) {
        $(NF+1)=2}
    else if (($2 == $4 || $2 == $5) || ($3 == $4 || $3 == $5)) {
        $(NF+1)=1}
    else {$(NF+1)=0}
    print
    }' file

---------- Post updated at 02:55 PM ---------- Previous update was at 02:50 PM ----------



This may fail for the input file below, where the last line has the same value in every column:

Code:
$ awk 'NR==1; NR > 1 {$0=$0" "; print $0 gsub(" "$2" | "$3, "&") - 1 }' file

Item A B C D
1 201 3101 3101 201 2
2 3101 201 202 3101 1
3 3101 201 201 3102 1
4 3101 201 3202 202 0
4 201 201 201 201 1

Pamu, thanks for pointing that out. Indeed, if two columns have the same value, that command doesn't work. I have used your second suggestion and it works brilliantly.
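
For anyone reading later, here is how I understand the array-based suggestion, written out with comments. It is only a sketch of the same idea: the header handling and the per-line `delete` loop (instead of keying the array on NR) are my own changes, and `count_matches` is a made-up name.

```shell
# count_matches is a hypothetical helper name used for illustration only.
# Assumes whitespace-separated input with a header on line 1.
count_matches() {
    awk '
    NR == 1 { print; next }               # pass the header through unchanged
    {
        for (k in seen) delete seen[k]    # reset the lookup table for this line
        seen[$2] = 1; seen[$3] = 1        # remember the values in columns A and B
        n = 0
        for (i = 4; i <= NF; i++)         # count each later column found in A or B
            if ($i in seen) n++
        print $0, n
    }' "$@"
}

printf '%s\n' 'Item A B C D' '4 3101 201 3202 202' '4 201 201 201 201' | count_matches
```

This gives 0 for the 3202/202 line and 2 for the all-201 line. Note that C and D are each checked independently against A and B, so a line like 201 999 201 201 is also scored 2.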
 