Using NAWK to merge two files


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Using NAWK to merge two files
# 1  
Old 06-06-2006
Using NAWK to merge two files

Two files need to be joined on a second column.

File 1
PHP Code:
          FRO     FRONTIER
    MWC     MIDWEST
    ORB     ORBITZ
    ATR     AIRTRAN
    BWI     BESTWESTERN 
File 2
PHP Code:
           M       AIRTRAN 8       TRIPS_YTD       A
    M       BESTWESTERN 8       TRIPS_YTD       B       20      TRIPS_TOTAL     A       17
    M       FRONTIER 8
    M       CAESAR 8       TRIPS_YTD       D       10
    M       ORBITZ 8       TRIPS 
Desired Output
PHP Code:
          ATR       AIRTRAN 8       TRIPS_YTD       A
    BWI       BESTWESTERN 8       TRIPS_YTD       B       20      TRIPS_TOTAL     A       17
    FRO       FRONTIER 8
    ORB       ORBITZ 8       TRIPS 
I am not a big expert in nawk... Any inputs are highly appreciated....

Code:
      nawk -f merge.awk f1.txt f2.txt
      
        BEGIN {
          FS=" "
          OFS=" "
        }
        
        FNR==NR {
          arr[FNR] = $1 OFS $2
          next
        }
        {
           print arr[FNR], $2
    }

Output

PHP Code:
    ATR AIRTRAN AIRTRAN
    BWI BESTWESTERN BESTWESTERN
    FRO FRONTIER CAESAR
    MWC MIDWEST FRONTIER
    ORB ORBITZ ORBITZ 
Please advise...
# 2  
Old 06-06-2006
I also tried to do this way

Code:
#!/usr/bin/ksh

awk '
BEGIN { OFS = " " } { print $2 , $1, $2 }
' f1.txt > temp_f1.txt

awk '
BEGIN { OFS = " " } { print $2 , $2, $3 ,$4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20  }
' f2.txt > temp_f2.txt

join -t " " -j1 1 -j2 1 -o 1.2 1.3 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.13 2.14 2.15 2.16 2.17 2.18 2.19 2.20 2.21 2.22 2.23 2.24 2.25 temp_f1.txt te
mp_f2.txt > f3.dat

It was able to produce the required output....but the output file is not tab delimited...

Please help me with the best way to do this...
# 3  
Old 06-06-2006
what if you piped into one last awk:
Code:
....... lastfilename | awk 'BEGIN {OFS="\t"} { print $0 }' > f3.dat 
# or a tr
....... lastfilename | tr -s ' ' '\t' > f3.dat

# 4  
Old 06-06-2006
Thank you Jim...

Somehow the awk one didn't work...But the tr function worked...

Code:
#!/usr/bin/ksh

awk '
BEGIN { OFS = "\t" } { print $2 , $1, $2 }
' f1.txt > temp_f1.txt

awk '
BEGIN { OFS = "\t" } { print $2 , $2, $3 ,$4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27  }
' f2.txt > temp_f2.txt

join  -j1 1 -j2 1 -o 1.2 1.3 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.13 2.14 2.15 2.16 2.17 2.18 2.19 2.20 2.21 2.22 2.23 2.24 2.25 2.26 2.27 temp_f1.tx
t temp_f2.txt | tr -s ' ' '\t' > f3.dat

But is this the right way to merge two files?
# 5  
Old 06-06-2006
Merge means to "intertwine" records - join means to put two records together into one - your choice of join is the best way to do it, because that's what join is for.
# 6  
Old 06-06-2006
Thank you again Jim...

In the above script, I have declared variables till $27 for File 2. File 1 is standard and it has only two columns. However, File 2 will be changing on the number of columns. If I get column $28 in File2, I would not be outputting field 28 with the current code..

Is there any way to enhance the code so that it will take any number of columns from File 2 rather than hard coding in the script...

My final script looks like this..

Code:
#!/usr/bin/ksh

DIR=$1
CODES_FILE=$2
METADATA_FILE=$3

cd ${DIR}

sort -k 2,2 ${CODES_FILE} > f1.dat
sort -k 2,2 ${METADATA_FILE} > f2.dat

awk '
BEGIN { OFS = "\t" } { print $2 , $1, $2 }
' f1.dat > temp_f1.dat

awk '
BEGIN { OFS = "\t" } { print $2 , $2, $3 ,$4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27  }
' f2.dat > temp_f2.dat

join  -j1 1 -j2 1 -o 1.2 1.3 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.13 2.14 2.15 2.16 2.17 2.18 2.19 2.20 2.21 2.22 2.23 2.24 2.25 2.26 2.27 temp_f1.da
t temp_f2.dat | tr -s ' ' '\t' > f3.dat

Please advise
# 7  
Old 06-07-2006
Try..
Code:
$ cat f1.txt
FRO     FRONTIER
    MWC     MIDWEST
    ORB     ORBITZ
    ATR     AIRTRAN
    BWI     BESTWESTERN
$ cat f2.txt
M       AIRTRAN 8       TRIPS_YTD       A
    M       BESTWESTERN 8       TRIPS_YTD       B       20      TRIPS_TOTAL     A       17
    M       FRONTIER 8
    M       CAESAR 8       TRIPS_YTD       D       10
    M       ORBITZ 8       TRIPS
$ awk 'NR==FNR{a[$2]=$1;next}a[$2]&&sub($1,a[$2])' f1.txt f2.txt > f3.txt
$ cat f3.txt
ATR       AIRTRAN 8       TRIPS_YTD       A
    BWI       BESTWESTERN 8       TRIPS_YTD       B       20      TRIPS_TOTAL     A       17
    FRO       FRONTIER 8
    ORB       ORBITZ 8       TRIPS
$

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Nawk Problem - nawk out of space in tostring on

Hi.. i am running nawk scripts on solaris system to get records of file1 not in file2 and find duplicate records in a while with the following scripts -compare nawk 'NR==FNR{a++;next;} !a {print"line"FNR $0}' file1 file2duplicate - nawk '{a++}END{for(i in a){if(a-1)print i,a}}' file1in the middle... (12 Replies)
Discussion started by: Abhiraj Singh
12 Replies

2. Shell Programming and Scripting

Merge files and generate a resume in two files

Dear Gents, Please I need your help... I need small script :) to do the following. I have a thousand of files in a folder produced daily. I need first to merge all files called. txt (0009.txt, 0010.txt, 0011.txt) and and to output a resume of all information on 2 separate files in csv... (14 Replies)
Discussion started by: jiam912
14 Replies

3. UNIX for Dummies Questions & Answers

Nawk 2 files

file 1 abcedef|wert|13|03|10|04|23|A1|13|05|01|09|31|9.516666667 fsdasdf|ferg|12|04|25|21|21|A1|13|02|26|20|31|20.51666667 dfsfsad|gerg|12|04|25|21|21|A1|13|02|25|25|31|4.166666667 wqeqwee|wewe|13|02|01|23|52|A1|13|05|01|09|31|9.516666667 file 2 abcedef|fsdsa|3000 fsdasdf|dasfa|3200... (9 Replies)
Discussion started by: radius
9 Replies

4. Shell Programming and Scripting

Checking in a directory how many files are present and basing on that merge all the files

Hi, My requirement is,there is a directory location like: :camp/current/ In this location there can be different flat files that are generated in a single day with same header and the data will be different, differentiated by timestamp, so i need to verify how many files are generated... (10 Replies)
Discussion started by: srikanth_sagi
10 Replies

5. Shell Programming and Scripting

Merging two tab separated files via nawk

I searched a lot considering this theme,but still cant make my code working. I have two tab separated files, I want to do the following thing: File 1: xx1 y yy xx2 y yy xx3 y yy xx4 y yy File 2: xx1 z1 xx2 z2 xx3 z3 xx4 z4 xx5 z5 So I want to merge them ,according to... (9 Replies)
Discussion started by: divergenciya
9 Replies

6. Shell Programming and Scripting

ksh/nawk help with if statement to choose between 2 files

Hi! I am having a bit of a problem with my script. There are two different date formats that may appear in a file, either Jul-12 or Jul--6. I tried to create an if statement that searches for one of the formats and if that doesn't exist searches for the other, however it doesn't seem to be... (3 Replies)
Discussion started by: ther2000
3 Replies

7. Shell Programming and Scripting

match 2 files using nawk command

i am matching two files. the files are in the format file_1.txt 1|_|X|_| 2|_|W|_| 3|_|Y|_| 4|_|Z|_| 5|_|U|_| file_2.txt W|_|A|_| Z|_|C|_| V|_|B|_| X|_|D|_|sdff|_| Y|_| file_3.txt should be in the format (4 Replies)
Discussion started by: centurion_13
4 Replies

8. Shell Programming and Scripting

How can i merge these two files into several...

Given are File A and File B File A has for example 5 lines: AAA BBB CCC DDD EEE File B has 3 lines: 111 222 333 How can i merge A and B into: 111 222 333 AAA (first line from A) then a new file: (4 Replies)
Discussion started by: Y-T
4 Replies

9. Shell Programming and Scripting

Merge files of differrent size with one field common in both files using awk

hi, i am facing a problem in merging two files using awk, the problem is as stated below, file1: A|B|C|D|E|F|G|H|I|1 M|N|O|P|Q|R|S|T|U|2 AA|BB|CC|DD|EE|FF|GG|HH|II|1 .... .... .... file2 : 1|Mn|op|qr (2 Replies)
Discussion started by: shashi1982
2 Replies

10. Shell Programming and Scripting

how to merge these two files?

I have two files, each of them has 12 lines, fileA has 3 columns, fileB has 1 column, like the following FileA a 1 b 2 c 3 ..blabla FileB A B C ..blabla Now I am trying to put the content of fileB as column 3 of fileA, e.g. a 1 A b 2 B c 3 C (3 Replies)
Discussion started by: fedora
3 Replies
Login or Register to Ask a Question