Compare two files and merge into third


 
# 1  
Old 04-21-2014

Hello:
Newbie with Awk. Trying to compare two files and merge data based on CID. Please see the input file format and desired output. Any help is appreciated.

TIA

Code:
Input File1 

CID1  --- TYP1 --- DCN1 --- INDATE1 --- IN-DATA1
CID2  --- TYP2 --- DCN2 --- INDATE2 --- IN-DATA2
CID3  --- TYP1 --- DCN3 --- INDATE3 --- IN-DATA3
CID4  --- TYP1 --- DCN4 --- INDATE4 --- IN-DATA4
CID5  --- TYP1 --- DCN5 --- INDATE5 --- IN-DATA5

Input File2 

CID2  --- TYP2 --- DCN2 --- OUTDATE2 --- OUT-DATA2
CID3  --- TYP1 --- DCN3 --- OUTDATE3 --- OUT-DATA3
CID1  --- TYP1 --- DCN1 --- OUTDATE1 --- OUT-DATA1
CID4  --- TYP1 --- DCN4 --- OUTDATE4 --- OUT-DATA4
CID5  --- TYP1 --- DCN5 --- OUTDATE5 --- OUT-DATA5

Expected output
----------------
CID: CID1
DCN: DCN1
Type: TYP1
INDATE1: Req data: IN-DATA1
OUTDATE1: resp data: OUT-DATA1
-----------------
CID: CID2
DCN: DCN2
Type: TYP2
INDATE2: Req data: IN-DATA2
OUTDATE2: resp data: OUT-DATA2
-------------------
CID ID: CID3
DCN: DCN3
Type: TYP1
INDATE3: Req data: IN-DATA3
OUTDATE3: resp data: OUT-DATA3

# 2  
Old 04-21-2014
You're going from a record-oriented format to a non-record format, where information is conveyed on more than one line. I suggest you sort both files at the very least.
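
One way to follow the sorting suggestion is sketched below. The helper name sort_by_cid and the .sorted output names are made up for illustration, and the sketch assumes the CID is the first thing on every line, as in the sample data.

```shell
# sort_by_cid FILE  (hypothetical helper, not from the thread)
# The CID starts each line, so a whole-line sort in the C locale
# puts the records in CID order.
sort_by_cid() {
	LC_ALL=C sort "$1"
}

# Usage with the sample file names from post #1:
#   sort_by_cid File1 > File1.sorted
#   sort_by_cid File2 > File2.sorted
```

With both files in the same CID order, matching records sit on the same line number in each file, which makes mismatches easy to spot by eye.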
# 3  
Old 04-21-2014
Code:
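awk -F"--- " '
        # First file (file1): save the date and data fields, keyed by CID
        NR == FNR {
                A[$1] = $(NF-1) FS $NF
                next
        }
        # Second file (file2): print a merged entry for each CID seen in file1
        $1 in A {
                # T[1] is the file1 date, T[2] the file1 data
                split ( A[$1], T )
                print "CID:", $1
                print "DCN:", $3
                print "Type:", $2
                print T[1] ": Req data: " T[2]
                print $(NF-1) ": resp data: " $NF
                print "---------------------------------------"
        }
' file1 file2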

# 4  
Old 04-21-2014

How is your awk program supposed to determine which input records produce output? (Why is there no output for the CID4 and CID5 input lines?)

How is your awk program supposed to know whether to print CID: or CID ID: on the first line of output for a given pair of CID input lines?
# 5  
Old 04-21-2014
Good catch. It was a typo on my part; my requirements are otherwise unchanged.
The output needs to contain CID4 and CID5 as well; I left them off for brevity.
"CID ID" was a typo and should be "CID".
Thanks
# 6  
Old 04-21-2014
It looks like Yoda already provided an awk script that comes close to your requirements. It differs in three ways: it leaves extra spaces at the end of some of the input fields in the output, it puts the lines of hyphens at the end of the entries instead of the start, and it prints records in the order in which they appear in File2 instead of the order in which they appear in File1. If those differences are important, you could try something like:
Code:
awk -F ' *--- *' '
# File2 is read first: save the response line for each CID
FNR == NR {
	od2[$1] = $4 ": resp data: " $5
	next
}
{	print "-----------------"
	printf("CID: %s\nDCN: %s\nType: %s\n%s: Req data: %s\n%s\n",
		$1, $3, $2, $4, $5, od2[$1])
}' File2 File1
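
The effect of the two field separators can be checked on a single sample record. This is only a sketch: the function names first_field_plain and first_field_trimmed are made up for illustration, and the brackets are added so the trailing blanks are visible.

```shell
line='CID1  --- TYP1 --- DCN1 --- INDATE1 --- IN-DATA1'

# Separator "--- ": blanks before each "---" stay attached to the field.
first_field_plain() {
	printf '%s\n' "$1" | awk -F'--- ' '{ print "[" $1 "]" }'
}

# Separator " *--- *": surrounding blanks are part of the separator.
first_field_trimmed() {
	printf '%s\n' "$1" | awk -F' *--- *' '{ print "[" $1 "]" }'
}

first_field_plain "$line"     # prints [CID1  ]
first_field_trimmed "$line"   # prints [CID1]
```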

You should probably also add some code to report cases where a CID appears in only one of the input files, but I'll leave that as an exercise for the reader.

If you want to run this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
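
As a minimal sketch of the exercise mentioned above, the following reports CIDs that appear in only one of the files. The function name report_unmatched, the message wording, and the CID6 record in the demo are illustrative assumptions and not part of the thread; the field separator is the one used in the script above.

```shell
# report_unmatched FILE2 FILE1  (hypothetical helper, not from the thread)
# Lists CIDs that occur in only one of the two files.
report_unmatched() {
	awk -F ' *--- *' '
	FNR == NR {		# first file named: remember every CID
		in2[$1] = 1
		next
	}
	$1 in in2 {		# second file: CID is present in both files
		both[$1] = 1
		next
	}
	{	print "only in File1: " $1 }
	END {
		for (cid in in2)
			if (!(cid in both))
				print "only in File2: " cid
	}' "$1" "$2"
}

# Demo on temporary files: two sample records plus a made-up CID6.
tmp=$(mktemp -d)
printf '%s\n' 'CID1  --- TYP1 --- DCN1 --- INDATE1 --- IN-DATA1' \
	'CID2  --- TYP2 --- DCN2 --- INDATE2 --- IN-DATA2' > "$tmp/File1"
printf '%s\n' 'CID2  --- TYP2 --- DCN2 --- OUTDATE2 --- OUT-DATA2' \
	'CID6  --- TYP1 --- DCN6 --- OUTDATE6 --- OUT-DATA6' > "$tmp/File2"
report_unmatched "$tmp/File2" "$tmp/File1"
rm -r "$tmp"
```

The demo prints "only in File1: CID1" followed by "only in File2: CID6". When several CIDs are missing from File1, the order of the "only in File2" lines is not fixed, because awk does not define the traversal order of for (cid in in2).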
# 7  
Old 04-21-2014
Thanks Yoda, Don Cragun.
It worked.