awk help: Match data fields from 2 files & output results from both into 1 file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting awk help: Match data fields from 2 files & output results from both into 1 file
# 1  
Old 12-10-2012
awk help: Match data fields from 2 files & output results from both into 1 file

I need to take 2 input files and create 1 output based on matches from each file. I am looking to match field #1 in both files (Userid) and create an output file that will be a combination of fields from
both file1 and file2 if there are any differences in the fields 2,3,4,5,or 6.

Below is an example of where in file2 the First Name=John, Email=john.doe@yahoo.com, and Phone Number=111-222-3333.
Since these fields are different than the values in file1 I need the output in file3 based on the example below using awk. Any help would be greatly appreciated.

file1:
Code:
jdoe|Doe|Johnny|111-222-9999|jdoe@gmail.com|Main Office|1020-771-AHSDEV100|512
1|Userid
2|Last Name
3|First Name
4|Phone Number
5|Email
6|Office
7|Description
8|Account

file2:
Code:
jdoe|Doe|John|111-222-3333|john.doe@yahoo.com|Main Office|0xF48F9F97AB1E9242A6CCB96BA1DB5C79|0x820A4CE019D51F45B7B911CDCDB5208D|Data1|Data2|Data3|Data4|Data5|Data6|0
1|Userid
2|Last Name
3|First Name
4|Phone Number
5|Email
6|Office
7|Office Location
8|Contact UUID
9|Data1
10|Data2
11|Data3
12|Data4
13|Data5
14|Data6
15|Active


file3:
Code:
{ "820A4CE019D51F45B7B911CDCDB5208D", "Doe", "Johnny", "111-222-9999", "jdoe@gmail.com", "jdoe", "F48F9F97AB1E9242A6CCB96BA1DB5C79", "Data1", "0" }

{ 
"file2 field #8 (Office Location) cut -c3-34"
"file1 field #2 (Last Name)"
"file1 field #3 (First Name)"
"file1 field #4 (Phone Number)"
"file1 field #5 (Email)"
"file1 field #1 (Userid)"
"file2 field #7 (Office Location) cut -c3-34"
"file2 field #9 (Data1)
"file2 field #15 (Active)"

Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by vgersh99; 12-10-2012 at 07:08 PM.. Reason: code tags, please!
# 2  
Old 12-10-2012
try:
Code:
awk -F"|" '
NR==FNR {a[$1]=$0; for (i=2; i<=6; i++) b[$1,$i]=b[$1,i]=$i; next}
a[$1] {
  s=0;
  for (i=2; i<=6; i++) {if (!b[$1,$i]) s=1; continue;};
  if (s==1) {
    printf "{ \"";
    printf substr($8,3) "\", \"";
    printf b[$1,2]      "\", \"";
    printf b[$1,3]      "\", \"";
    printf b[$1,4]      "\", \"";
    printf b[$1,5]      "\", \"";
    printf $1           "\", \"";
    printf substr($7,3) "\", \"";
    printf $9           "\", \"";
    printf $15;
    print  "\" }";
  }
}
' file1 file2 > file3

Using only line 1 in example for file1 and file2.
This User Gave Thanks to rdrtx1 For This Post:
# 3  
Old 12-11-2012
Quote:
Originally Posted by rdrtx1
try:
Code:
awk -F"|" '
NR==FNR {a[$1]=$0; for (i=2; i<=6; i++) b[$1,$i]=b[$1,i]=$i; next}
a[$1] {
  s=0;
  for (i=2; i<=6; i++) {if (!b[$1,$i]) s=1; continue;};
  if (s==1) {
    printf "{ \"";
    printf substr($8,3) "\", \"";
    printf b[$1,2]      "\", \"";
    printf b[$1,3]      "\", \"";
    printf b[$1,4]      "\", \"";
    printf b[$1,5]      "\", \"";
    printf $1           "\", \"";
    printf substr($7,3) "\", \"";
    printf $9           "\", \"";
    printf $15;
    print  "\" }";
  }
}
' file1 file2 > file3

Using only line 1 in example for file1 and file2.
rdrtx1 - This works perfect and is exactly what I was looking for. I did find somethings that I would like to see if they could be added and it's my fault for thinking of them after the fact.

Is there a way to only output into file3 the results where there were differences? If fields 2,3,4,5,or 6 are the same they do not need to be outputted into file3.
# 4  
Old 12-11-2012
try fixed:
Code:
awk -F"|" 'NR==FNR {a[$1]=$0; for (i=2; i<=6; i++) b[$1,$i]=b[$1,i]=$i; next}
a[$1] {  
  s=0; 
  for (i=2; i<=6; i++) {if (!b[$1,$i]) s=1};  
  if (s==1) {    
    printf "{ \"";    
    printf substr($8,3) "\", \"";    
    printf b[$1,2]      "\", \"";    
    printf b[$1,3]      "\", \"";    
    printf b[$1,4]      "\", \"";    
    printf b[$1,5]      "\", \"";    
    printf $1           "\", \"";    
    printf substr($7,3) "\", \"";    
    printf $9           "\", \"";    
    printf $15;    
    print  "\" }";  
  }
}' file1 file2 > file3

This User Gave Thanks to rdrtx1 For This Post:
# 5  
Old 12-11-2012
Quote:
Originally Posted by rdrtx1
try fixed:
Code:
awk -F"|" 'NR==FNR {a[$1]=$0; for (i=2; i<=6; i++) b[$1,$i]=b[$1,i]=$i; next}
a[$1] {  
  s=0; 
  for (i=2; i<=6; i++) {if (!b[$1,$i]) s=1};  
  if (s==1) {    
    printf "{ \"";    
    printf substr($8,3) "\", \"";    
    printf b[$1,2]      "\", \"";    
    printf b[$1,3]      "\", \"";    
    printf b[$1,4]      "\", \"";    
    printf b[$1,5]      "\", \"";    
    printf $1           "\", \"";    
    printf substr($7,3) "\", \"";    
    printf $9           "\", \"";    
    printf $15;    
    print  "\" }";  
  }
}' file1 file2 > file3


WORKS PERFECT! I really appreciate your help. You and awk are amazing! I've been scripting with shell for 15+ years but never got to the advanced level of using awk or sed, I really need to invest more time into learning it. Do you do training classes? Smilie Thanks again.
# 6  
Old 12-11-2012
There's really no substitute for just using it until you're used to it.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

awk match two fields in two files

Hi, I have two TEST files t.xyz and a.xyz which have three columns each. a.xyz have more rows than t.xyz. I will like to output rows at which $1 and $2 of t.xyz match $1 and $2 of a.xyz. Total number of output rows should be equal to that of t.xyz. It works fine, but when I apply it to large... (6 Replies)
Discussion started by: geomarine
6 Replies

2. Shell Programming and Scripting

awk to update file based on match in 3 fields

Trying to use awk to store the value of $5 in file1 in array x. That array x is then used to search $4 of file1 to find aa match (I use x to skip the header in file1). Since $4 can have multiple strings in it seperated by a , (comma), I split them and iterate througn each split looking for a match.... (2 Replies)
Discussion started by: cmccabe
2 Replies

3. Shell Programming and Scripting

awk move select fields to match file prefix in two directories

In the awk below I am trying to use the file1 as a match to file2. In file2 the contents of $5,&6,and $7 (always tab-delimited) and are copied to the output under the header Quality metrics. The below executes but the output is empty. I have added comments to help and show my thinking. Thank you... (0 Replies)
Discussion started by: cmccabe
0 Replies

4. UNIX for Beginners Questions & Answers

Match Fields between two files, print portions of each file together when matched in ([g]awk)'

I've written an awk script to compare two fields in two different files and then print portions of each file on the same line when matched. It works reasonably well, but every now and again, I notice some errors and cannot seem to figure out what the issue may be and am turning to you for help. ... (2 Replies)
Discussion started by: jvoot
2 Replies

5. Shell Programming and Scripting

awk to print match or non-match and select fields/patterns for non-matches

In the awk below I am trying to output those lines that Match between file1 and file2, those Missing in file1, and those missing in file2. Using each $1,$2,$4,$5 value as a key to match on, that is if those 4 fields are found in both files the match, but if those 4 fields are not found then missing... (0 Replies)
Discussion started by: cmccabe
0 Replies

6. Shell Programming and Scripting

awk to print fields that match using conditions and a default value for non-matching in two files

Trying to use awk to match the contents of each line in file1 with $5 in file2. Both files are tab-delimited and there may be a space or special character in the name being matched in file2, for example in file1 the name is BRCA1 but in file2 the name is BRCA 1 or in file1 name is BCR but in file2... (6 Replies)
Discussion started by: cmccabe
6 Replies

7. Shell Programming and Scripting

awk to output match and mismatch with count using specific fields

In the below awk I am trying output to one file those lines that match between $2,$3,$4 of file1 and file2 with the count in (). I am also trying to output those lines that are missing between $2,$3,$4 of file1 and file2 with the count of in () each. Both input files are tab-delimited, but the... (7 Replies)
Discussion started by: cmccabe
7 Replies

8. Shell Programming and Scripting

AWK to match and merge data from 2 files into 1.

Hello, hopefully this is an easy on for the AWK guru's out there. I'm having some trouble figuring out how to match+merge data in 2 files into 1 single report. I've got my 2 files filtered and delimited, just need to MATCH $3 in file1 to $1 in file2, then put $0 from File1 and $2+$3 from File2... (6 Replies)
Discussion started by: right_coaster
6 Replies

9. Shell Programming and Scripting

[Shell/Perl(?)] Prepending timestamps to console output & writing results to a file

I do a lot of TSM work and I embarked on what I thought would be an easy task, and I'd be very happy for any input to save the pounding my keyboard is receiving :] By default, the output of TSM's console has no timestamping, making it hard to sort through accurately. This puts my console into... (5 Replies)
Discussion started by: Vryali
5 Replies

10. Shell Programming and Scripting

AWK Compare files, different fields, output

Hi All, Looking for a quick AWK script to output some differences between two files. FILE1 device1 1.1.1.1 PINGS device1 2.2.2.2 PINGS FILE2 2862 SITE1 device1-prod 1.1.1.1 icmp - 0 ... (4 Replies)
Discussion started by: stacky69
4 Replies
Login or Register to Ask a Question