Keeping record of file 2 based on a reference file 1 in awk
I have 2 input files (tab separated):
file1:
file2:
I am trying to append records of file 2 to file 1 if:
1) $1 of file 1 and $1 of file 2 are the same
AND
2) $2 of file 2 ≤ $2 of file 1 ≤ $3 of file 2
AND
3) file 2 contains the value '0' for ref_3 (i.e. 'ref_3:0')
and then to count the number of records in file 2 that matched these criteria,
in order to get:
I tried the following, but it returns a blank output and I don't really understand why:
You aren't far off. Using your indentation style and making a few minor changes:
seems to do what you want. The problems in your code were:
The biggest problem is that, even though you said your input files had tab-separated fields, there are no tabs in either of your input files. The fields in your input files and in the output you said you wanted are separated by three space characters,
brands that did not have any matched lines were not added to the ref[] array,
the reference counts array (ref[]) was treated as a scalar when you incremented its value, and
when you printed the counts, you only printed the count, not the array index and the count.
Changes to fix those minor issues are marked in red in the code above.
Note that your specification wasn't clear as to whether there should only be one output line for each brand if there are multiple input lines meeting your constraints or one output line for each input line meeting your constraints. The code above produces one output line for each input line in file2.txt that meets the constraints.
Note also that the counts at the end of the output are printed in an unspecified order. Additional changes would be required if you need the output order of those lines to match the order in which each brand was first found in file2.txt (as it was in your sample output specification).
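For what it's worth, here is one common way to get first-seen ordering. This is only a minimal sketch and not the code from this post: the brand names are made-up sample data, and the array names count[], order[] and the counter n are hypothetical. The idea is to record each key in a second, numerically indexed array the first time it is seen, and then loop over that array in the END block instead of using for (key in array), whose traversal order awk leaves unspecified.

```shell
# Hypothetical sample data: one brand per line (not the poster's files).
ordered_counts=$(
printf '%s\n' brandB brandA brandB brandC brandA brandB |
awk '
{
    # Remember each brand the first time it appears.
    if (!($1 in count))
        order[++n] = $1
    count[$1]++
}
END {
    # Walk the brands in first-seen order rather than with for (b in count).
    for (i = 1; i <= n; i++)
        print order[i] ": " count[order[i]]
}'
)
printf '%s\n' "$ordered_counts"
```

With this sample input the brands come out as brandB, brandA, brandC, which is the order in which they first appeared.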
You might also want to compare the above with the following:
which produces the same output using a single if statement instead of a call to match(), a call to substr(), a call to split() and two if statements.
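To illustrate the single-if idea, here is a rough sketch under stated assumptions; it is not the code originally posted here. The sample data is invented, the fields are assumed to be genuinely tab-separated, file1 is assumed to hold one position per key, and the names pos[], count[], order[] and n are hypothetical. All three conditions from the question are tested in one if statement, and every key from file1 starts with a count of 0 so that keys without any matching line are still reported.

```shell
# Hypothetical tab-separated sample data (not the poster's files).
tmpdir=$(mktemp -d)
printf 'A\t150\nB\t300\nC\t50\n' > "$tmpdir/file1.txt"
printf 'A\t100\t200\tref_1:4;ref_3:0\nA\t160\t200\tref_3:0\n' > "$tmpdir/file2.txt"
printf 'B\t250\t350\tref_3:7\nB\t290\t310\tref_2:1;ref_3:0\n' >> "$tmpdir/file2.txt"
printf 'B\t300\t300\tref_3:0\n' >> "$tmpdir/file2.txt"

matches=$(awk -F '\t' '
NR == FNR {
    # First file: remember the position for each key (one per key assumed).
    if (!($1 in pos))
        order[++n] = $1
    pos[$1] = $2 + 0
    count[$1] = 0        # keys with no matching line are still reported
    next
}
{
    # Second file: same key, position inside [$2, $3], and ref_3 exactly 0.
    if (($1 in pos) && $2 + 0 <= pos[$1] && pos[$1] <= $3 + 0 &&
        $0 ~ /(^|[^A-Za-z0-9_])ref_3:0([^0-9]|$)/) {
        print
        count[$1]++
    }
}
END {
    # Report the counts in the order the keys appeared in the first file.
    for (i = 1; i <= n; i++)
        print order[i] ": " count[order[i]]
}' "$tmpdir/file1.txt" "$tmpdir/file2.txt")

rm -rf "$tmpdir"
printf '%s\n' "$matches"
```

The regular expression is anchored on both sides so that values such as ref_3:05 or a tag like xref_3:0 do not count as ref_3:0. Adding 0 to the fields forces numeric rather than string comparison.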
This User Gave Thanks to Don Cragun For This Post: