Help with complex merge of files with common field


 
# 8  
Old 07-15-2008
Sorry, the example was incorrect.

Basically, what I am calling the key is the field <_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>, as the number within it is always unique, e.g. 1215781057901, 1215781057902, 1215781057903 and so on. In each file the data is placed after its key. Each file contains one type of data, so I am trying to report on the data by key.

Originally, I have one file that contains all the data. I egrep <_05_1:MessageIdentifier> and <Error:Exception> into one file, <_05_1:MessageIdentifier> and <06:Detail> into another, and finally <_05_1:MessageIdentifier> and <DataPosted> into a third. The reason I am doing this is that I am going to cut the data down to what we want before I merge the files. If there is a way of egrepping all the fields and cutting each piece of data, that would sort my problem in one go.

<_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>
<Error:Exception> Error was 121238123... </Error:Exception>
<_05_1:MessageIdentifier>ERR:38736086_1215781057903</_05_1:MessageIdentifier>
<Error:Exception> Error was 4554641..... </Error:Exception>
<_05_1:MessageIdentifier>ERR:38736086_1215781057905</_05_1:MessageIdentifier>
<Error:Exception> Error was 1277123.... </Error:Exception>


<_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>
<06:Detail> Code XYZ... </06:Detail>
<_05_1:MessageIdentifier>ERR:38736086_1215781057903</_05_1:MessageIdentifier>
<_05_1:MessageIdentifier>ERR:38736086_1215781057905</_05_1:MessageIdentifier>
<06:Detail> Code ABC... </06:Detail>
<06:Detail> Code AAA... </06:Detail>


<_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<_05_1:MessageIdentifier>ERR:38736086_1215781057903</_05_1:MessageIdentifier>
<DataPosted> Data....... </DataPosted>
<_05_1:MessageIdentifier>ERR:38736086_1215781057905</_05_1:MessageIdentifier>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
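
For reference, the extraction described above might look roughly like the sketch below. It assumes the combined log is called `logfile`; the output file names are placeholders, the sample content is made up for illustration, and the detail tag is written as `<06:Detail>`, which is an assumption about its exact spelling. The last command shows that a single pattern can keep all four tags in one pass, so each record stays together under its key and no merge is needed afterwards.

```shell
# Hypothetical stand-in for the combined log (illustrative content only).
cat > logfile <<'EOF'
<_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>
<Other>noise</Other>
<Error:Exception> Error was 121238123... </Error:Exception>
<06:Detail> Code XYZ... </06:Detail>
<DataPosted> Data....... </DataPosted>
EOF

# One file per data type, each keeping the key lines.
# grep -E is the current spelling of egrep; output names are placeholders.
grep -E '<_05_1:MessageIdentifier>|<Error:Exception>' logfile > errors.txt
grep -E '<_05_1:MessageIdentifier>|<06:Detail>'       logfile > details.txt
grep -E '<_05_1:MessageIdentifier>|<DataPosted>'      logfile > posted.txt

# Alternative: one pass that keeps all four tags in their original order.
grep -E '<(_05_1:MessageIdentifier|Error:Exception|06:Detail|DataPosted)>' logfile > all.txt
```

With the single-pass variant, every data line already sits under the key it belongs to, which is the order wanted in the final report.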
# 10  
Old 07-15-2008
Something like this?
Code:
perl -ne'
  $key = $1 and next if /ERR:\d+_(\d+)/;
  $data{$key} = $data{$key} ?
    $data{$key} . "\n" . $1 :
      $1 if m|DataPosted>(.+?)</Data|;
  END {
    print map { $_ . "\n" . $data{$_} . "\n" } sort keys %data;
  }' logfile

# 11  
Old 07-15-2008
Radoulov,

Thank you for persevering with my query, I really appreciate it. As I am new to shell scripting, could you please give me some idea of what each line is doing? I have some idea, but I do not completely understand the code. Where are the files to process specified?
# 12  
Old 07-15-2008
Well,
try running it first, passing your files as arguments
(just copy and paste it on the command line):

Code:
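perl -ne'
  # Key line: remember the number after the underscore, then move to the next line.
  $key = $1 and next if /ERR:\d+_(\d+)/;
  # DataPosted line: append its contents to the entry for the current key.
  $data{$key} = $data{$key} ?
    $data{$key} . "\n" . $1 :
      $1 if m|DataPosted>(.+?)</Data|;
  # After the last input line: print each key followed by its collected data.
  END {
    print map { $_ . "\n" . $data{$_} . "\n" } sort keys %data;
  }' fileA fileB fileC ...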

I'm guessing here: I'm not sure whether you want only the text inside the <DataPosted> tags, or whether that text can span multiple lines.
# 13  
Old 07-15-2008
Radoulov,

I need the data from all the files. For example, for the first key, I would want:

<_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>
<Error:Exception> Error was 121238123... </Error:Exception>
<06:Detail> Code XYZ... </06:Detail>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>
<DataPosted> Data....... </DataPosted>

and so on
# 14  
Old 07-15-2008
The code works and it does give me the data portion. Can it be expanded to give me the data from the other files too for each key?

1215781057901
Data.......
Data.......
Data.......
1215781057903
Data.......
1215781057905
Data.......
Data.......
Data.......
Data.......
Data.......
Data.......