Merging two CSV files by 3 primary keys (columns)


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers Merging two CSV files by 3 primary keys (columns)
# 1  
Old 03-05-2010
Lightbulb Merging two CSV files by 3 primary keys (columns)

Hi there!

I have the following problem:

I have a set of files called rates_op_yyyyddmm with the format below (which corresponds to the file rates_op_20090130)

Code:
30-JAN-2009,ED,FEB09,C,96.375,,,0,,,,,,2.375,,,,,,
30-JAN-2009,ED,FEB09,C,96.5,,,0,,,,,,2.25,,,,,,
30-JAN-2009,I,JUN09,C,95,,,18342,,,,,,3.27,,,,,,
30-JAN-2009,I,JUN09,P,92,,,458,,,,,,0,,,,,,
30-JAN-2009,I,SEP09,P,92.125,,,0,,,,,,0,,,,,,
30-JAN-2009,I,SEP09,P,92.25,,,0,,,,,,0,,,,,,
30-JAN-2009,K,DEC09,C,94.25,,,0,,,,,,3.24,,,,,,
30-JAN-2009,K,DEC09,C,94.375,,,0,,,,,,3.115,,,,,,

and a second set of files called rates_f_yyyyddmm with the format below
(which corresponds to the file rates_f_20090130)

Code:
30-JAN-2009,ED,FEB09,F,,,,0,,,,,,98.735,,,,,,
30-JAN-2009,I,JUN09,F,,,,,98.23,,,98.31,98.22,98.27,,,,98.27,98.26,
30-JAN-2009,I,SEP09,F,,1,,4,98.175,,,98.26,98.16,98.25,,,,98.225,98.215,
30-JAN-2009,K,DEC09,F,,1,,4,98.17,,,98.26,98.16,99.21,,,,98.225,98.215,

While the first file has several lines where the columns 30-JAN-2009,ED,FEB09 can appear, in the second file 30-JAN-2009,ED,FEB09 can appear only once. The same applies for all the other key families 30-JAN-2009,I,JUN09 which are always defined by the first 3 columns.

After having identified a line of the first file with a line of the second file with the first 3 columns (as defined above), I would like to add in the first file, i.e. rates_op_20090130, either all lines after the F or the value appearing in the 14th column of the second file i.e. 98.735, 98.22, 98.25 and so on.

So the output should be:
rates_op_20090130
Code:
30-JAN-2009,ED,FEB09,C,96.375,,,0,,,,,,2.375,,,,,,98.735,
30-JAN-2009,ED,FEB09,C,96.375,,,0,,,,,,2.375,,,,,,98.735,
30-JAN-2009,ED,FEB09,C,96.5,,,0,,,,,,2.25,,,,,,98.735,
30-JAN-2009,I,JUN09,C,95,,,18342,,,,,,3.27,,,,,,98.22,
30-JAN-2009,I,JUN09,P,92,,,458,,,,,,0,,,,,,98.22,
30-JAN-2009,I,SEP09,P,92.125,,,0,,,,,,0,,,,,,98.25,
30-JAN-2009,I,SEP09,P,92.25,,,0,,,,,,0,,,,,,98.25,
30-JAN-2009,K,DEC09,C,94.25,,,0,,,,,,3.24,,,,,,99.21,
30-JAN-2009,K,DEC09,C,94.375,,,0,,,,,,3.115,,,,,,99.21,

This I need to do it for 200 files that have always the format
rates_op_yyyymmdd or rates_f_yyyymmdd.

I am rather new in UNIX so any help with the explanation of the code being used would be highly appreciated.

Thank you very much in advance!!!

Cheers,
Pep

Last edited by Scott; 03-05-2010 at 06:36 PM.. Reason: Please use code tags
# 2  
Old 03-05-2010
something along these lines for a starter:
Code:
#!/bin/ksh

for f in rates_f_[0-9][0-9]*
do
   nawk -F, -v OFS=, '
     NR==1 {
       split(FILENAME,a,"_")
       ARGV[ARGC++]= "rates_op_" a[3]
     }
     FNR==NR {f[$1,$2,$3]=$14; next}
     {
       idx=$1 SUBSEP $2 SUBSEP $3
       if (idx in f)
         $(++NF)=f[idx] OFS
       print > (FILENAME "_new")
     }
   ' "${f}"
done

# 3  
Old 03-08-2010
Thanks a lot! It is really solving the problem! Well done.

However when I open the file I get

9-DEC-2008,ED,DEC08,C,92.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,<cr>,98,

Do you know whether the <cr> should worry me?

In any case I am just going to use a few columns of these files that I know hot to proceed.

I am just a starter so your help was really very much appreciated.

Cheers and have a nice week!
Pep
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Match columns from two csv files and update field in one of the csv file

Hi, I have a file of csv data, which looks like this: file1: 1AA,LGV_PONCEY_LES_ATHEE,1,\N,1,00020460E1,0,\N,\N,\N,\N,2,00.22335321,0.00466628 2BB,LES_POUGES_ASF,\N,200,200,00006298G1,0,\N,\N,\N,\N,1,00.30887539,0.00050312... (10 Replies)
Discussion started by: djoseph
10 Replies

2. Shell Programming and Scripting

Compare 2 csv files by columns, then extract certain columns of matcing rows

Hi all, I'm pretty much a newbie to UNIX. I would appreciate any help with UNIX coding on comparing two large csv files (greater than 10 GB in size), and output a file with matching columns. I want to compare file1 and file2 by 'id' and 'chain' columns, then extract exact matching rows'... (5 Replies)
Discussion started by: bkane3
5 Replies

3. Shell Programming and Scripting

Perl - multiple keys and merging two files

Hi, I'm not a regular coder but some times I write some basic perl script, hence Perl is bit difficult for me :). I'm merging two files a.txt and b.txt into c.txt: a.txt ------ x001;frtb70;xyz;109 x001;frvt65;sec;239 x003;wqax34;jul;659 x004;yhud43;yhn;760 b.txt ------... (8 Replies)
Discussion started by: Lokesha
8 Replies

4. Shell Programming and Scripting

Help with merging two CSV files

Hi, I have following 2 CSV files file1.txt A1,B1,C1,D1,E1 A2,B2,C2,D2,E2 A3,B3,C3,D3,E3 .... file2.txt A1,B1,P1,Q1,R1,S1,T1,U1 A1,B1,P2,Q2,R2,S2,T2,U2 A1,B1,P3,Q3,R3,S3,T3,U3 A2,B2,X1,Y1,Z1,I1,J1,K1 A2,B2,X2,Y2,Z2,I2,J2,K2 A2,B2,X3,Y3,Z3,I3,J3,K3 A2,B2,X4,Y4,Z4,I4,J4,K4... (2 Replies)
Discussion started by: learnoutmore99
2 Replies

5. Shell Programming and Scripting

Merging Very large CSV files in Unix

Hi, I have two very large CSV files, which I want to merge (equi-join) based on a key (column). One of the file (say F1) would have ~30 MM records and 700 columns. The other file (~f2) would have same # of records and lesser columns (say 50). I want to create an output file joining on a... (3 Replies)
Discussion started by: student_007
3 Replies

6. Shell Programming and Scripting

Merging all (48) CSV files from a directory

I have 48 csv files in my directory that all have this form: Storm Speed (mph),43.0410781151 Storm motion direction (degrees),261.580774982 MLCAPE,2450.54098661 MLCIN,-9.85040520279 MLLCL,230 MLLFC,1070.39871 MLEL,207.194689294 MLCT,Not enough data Sbcape,2203.97617778... (3 Replies)
Discussion started by: RissaR
3 Replies

7. Shell Programming and Scripting

Merging files to create CSV file

Hi, I have different files of the same type, as: Time: 100 snr: 88 perf: 10 other: 222 Each of these files are created periodically. What I need to do is to merge all of them into one but having the following form: (2 Replies)
Discussion started by: Ravendark
2 Replies

8. UNIX for Dummies Questions & Answers

Merging Non-Empty Columns within a CSV

I am trying to place all my data in a single row (order doesn't matter). Note I am a Unix novice, so please go easy on me. Here is an example Raw data: row# (1) 45 64 23 (2) 32 1 6 56 (3) 32 45 Needs to be like this: row# (1) 45 (2) 32 (3) 32 ... (2 Replies)
Discussion started by: mmann1123
2 Replies

9. UNIX for Dummies Questions & Answers

Merging 2 .CSV files in Unix

I need a little help as I am a complete novice at scripting in unix. However, i am posed with an issue...:eek: i have two csv files in the following format@ FILE1.CSV: HEADER HEADER Header , , HEADER 001X ,,200 002X ,,300 003X ... (6 Replies)
Discussion started by: chachabronson
6 Replies

10. UNIX for Dummies Questions & Answers

Extracting columns from different files for later merging

Hello! I wan't to extract columns from two files and later combine them for plotting with gnuplot. If the files file1 and file2 look like: fiile1: a, 0.62,x b, 0.61,x file2: a, 0.43,x b, 0,49,x The desired output is a 0.62 0.62 b 0.61 0.49 Thank you in advance! (2 Replies)
Discussion started by: kingkong
2 Replies
Login or Register to Ask a Question