Compare columns of 2 files based on condition defined in a different file
I have a control file that lists the fields I need to compare between two files and says how each field should be reported. If key = Y and output = Y, I need to print the field's exact value. If output = Y/N, I need to print Y when the field matches in both files and N when it does not. If output = N, the field should be skipped. The result should be written to an output file.
For ex:
my control file
Code:
key|compare_field|output
Y|Field_1|Y
N|Filed_2|Y/N
Y|Field_3|Y
N|Field_4|Y/N
N|Field_5|N
N|Field_6|Y/N
file1
field_1|feld_2|field_3|field_4|field_5|field_6
000|adbc|edfr|hjkl|890|jlk|ioy
678|jfjd|djla|uopp|678|jyh|jkl
file2
field_1|feld_2|field_3|field_4|field_5|field_6
000|adbc|edfr|hjkl|890|jlk|ioy
678|juio|djla|uopu|678|jyh|jkl
my output should be
field_1|feld_2|field_3|field_4|field_6
000|Y|edfr|Y|Y
678|N|djla|N|Y
I was trying to do it in two parts and then combine them, but I am lost. I need your help combining the logic.
Code:
# to copy the field names as the header in the report file.
nawk -F\| 'END {print x } $NF =="Y" || $NF == "Y\/N" { printf "%s",$2 FS >> "report_file" }' control_file
To compare the 2 files and print the output as Y or N
Code:
nawk -F'|' '{ getline x <f; split(x,F,"|")}
NR >1 {for(i=2;i<=NF;i++) $i=(F[i]==$i)?"Y":"N"}1' OFS="|" f=file2 file1
I can do them separately, but I am not able to read the control file and compare the files based on it.
Please help me.
Thanks in Advance
newtoawk
Last edited by Scott; 11-07-2010 at 05:57 AM..
Reason: Please use code tags
awk -F'|' '
# Read control_file (the first file) from line 2 on; store key and output ($1$3) in array a.
NR==FNR && NR>1 { a[++i]=$1$3; next }
# Read file1 and file2: array b holds the lines of file1, array c the lines of file2.
FNR>1 { if (b[FNR]) { c[FNR]=$0 } else { b[FNR]=$0 } }
END {
  for (k in c) {   # note: awk does not guarantee the order of a for-in loop
    # Split the records into fields: d from file2, e from file1.
    split(c[k],d,"|"); split(b[k],e,"|")
    for (j=1;j<=6;j++) {
      if (a[j]=="YY") {           # key and output are both Y: print the field as it is
        printf "%s|", d[j]
      }
      else if (a[j]!="NN") {      # key and output are not both N: print Y if the fields match, else N
        printf "%s|", (d[j]==e[j])?"Y":"N"
      }
    }
    printf "\n"
  }
}' control_file file1 file2
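For comparison, here is a minimal sketch of how the header step and the compare step from the question might be combined into one pass. It is not the code posted above. It assumes that file1 and file2 have the same number of lines in the same order, that a field is printed as-is only when key and output are both Y, and that any field with output = N is skipped. The temporary directory and the name report_file are only illustrative.

```shell
#!/bin/sh
# Sketch only: sample data copied from the question. The temporary work
# directory and the output name report_file are illustrative assumptions.
work=$(mktemp -d)
cd "$work" || exit 1

cat > control_file <<'EOF'
key|compare_field|output
Y|Field_1|Y
N|Filed_2|Y/N
Y|Field_3|Y
N|Field_4|Y/N
N|Field_5|N
N|Field_6|Y/N
EOF

cat > file1 <<'EOF'
field_1|feld_2|field_3|field_4|field_5|field_6
000|adbc|edfr|hjkl|890|jlk|ioy
678|jfjd|djla|uopp|678|jyh|jkl
EOF

cat > file2 <<'EOF'
field_1|feld_2|field_3|field_4|field_5|field_6
000|adbc|edfr|hjkl|890|jlk|ioy
678|juio|djla|uopu|678|jyh|jkl
EOF

awk -F'|' '
FNR == 1 { fileno++ }                  # count which input file we are in
fileno == 1 {                          # control_file: remember key and output per field
    if (FNR > 1) { key[++n] = $1; out[n] = $3 }
    next
}
fileno == 2 { left[FNR] = $0; next }   # file1: keep each line by line number
{                                      # file2: build one report line per input line
    split(left[FNR], f1, "|")
    line = ""; sep = ""
    for (j = 1; j <= n; j++) {
        if (out[j] == "N") continue    # output = N: leave the field out
        if (FNR == 1)
            cell = $j                  # header line: copy the field name
        else if (key[j] == "Y" && out[j] == "Y")
            cell = f1[j]               # print the value as it is
        else                           # append "" to force a string comparison
            cell = (($j "") == (f1[j] "")) ? "Y" : "N"
        line = line sep cell
        sep = "|"
    }
    print line
}' control_file file1 file2 > report_file

cat report_file
```

Because the report is written while file2 is being read, the rows come out in input order, and the number of fields is taken from the control file instead of being fixed at 6.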
Thanks Pravin, it worked. Can I pass the field delimiter as a variable? I need to read the output format from a file.
For example, instead of nawk -F'|', can I do something like this:
output_format=| or output_format=\t
nawk -F'$output_format' ... Does this work, or is there any other way to do it?
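A short sketch of two ways the delimiter could be passed in from a shell variable. The variable name output_format comes from the question; the sample input lines are made up for illustration. It is shown with awk, and nawk should accept the same -F and -v options. Inside single quotes the shell does not expand $output_format, so the variable needs double quotes.

```shell
#!/bin/sh
# output_format would normally be read from a file; it is set inline here
# only for illustration. The value must be quoted, because a bare | is a
# pipe to the shell.
output_format='|'

# Double quotes let the shell expand the variable before awk sees it.
first=$(printf 'a|b|c\n' | awk -F"$output_format" '{ print $2 }')
echo "$first"

# Alternative: assign FS (and OFS) with -v. awk interprets escape sequences
# in -v assignments, so the two characters \t become a real tab.
output_format='\t'
second=$(printf 'a\tb\tc\n' |
    awk -v FS="$output_format" -v OFS="$output_format" '{ print $3, $1 }')
echo "$second"
```

The -v form has the advantage that the same variable can also set OFS, so the report is written with the same delimiter it was read with.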