Merging all (48) CSV files from a directory


 
# 1  
Old 06-20-2011

I have 48 csv files in my directory that all have this form:
Code:
Storm Speed (mph),43.0410781151
Storm motion direction (degrees),261.580774982
MLCAPE,2450.54098661
MLCIN,-9.85040520279
MLLCL,230
MLLFC,1070.39871
MLEL,207.194689294
MLCT,Not enough data
Sbcape,2203.97617778
Sbcin,-25.2617829901
SbLCL,320
SbLFC,1120.39871
SbEL,214.374492767
SbCT,Not enough data
0-1km SRH,458.349258678
10m-1km SRH,329.149187093
0-3km SRH,463.618101844
10m-3km SRH,334.418030258
0-6km SRH,587.783727309
10m-6km SRH,458.583655724
SFC-1Km EHI,5.04120983549
SFC-3km EHI,5.1219068113
SFC-6Km EHI,7.02361277587
0-1km bulk shear (m/s),25.6914044845
0-3km bulk shear (m/s),29.6908256645
0-6km bulk shear (m/s),21.7298524954
0-9km bulk shear (m/s),28.1866755643

Every file has the same labels in the first field, in the same order; only the values differ from file to file. I'd like to merge all 48 csv files in a single directory into one file, and I'd also like a line at the top of that file listing the names of the input files in the order in which they were read. So, if I merged the file above with this file:
Code:
Storm Speed (mph),44.7614503892
Storm motion direction (degrees),231.668848047
MLCAPE,2803.79127488
MLCIN,-50.712494335
MLLCL,300
MLLFC,1668.39871
MLEL,199.222951335
MLCT,15.46839871
Sbcape,3159.56809086
Sbcin,-45.0430330816
SbLCL,240
SbLFC,1443.39871
SbEL,196.149400393
SbCT,15.84339871
0-1km SRH,471.772033724
10m-1km SRH,295.907072991
0-3km SRH,581.250777417
10m-3km SRH,405.385816683
0-6km SRH,530.668022483
10m-6km SRH,354.803061749
SFC-1Km EHI,5.18538543392
SFC-3km EHI,7.1038575986
SFC-6Km EHI,6.21746080521
0-1km bulk shear (m/s),18.5250715019
0-3km bulk shear (m/s),24.2810028978
0-6km bulk shear (m/s),20.0096587186
0-9km bulk shear (m/s),30.6511667163

I'd like it to create this file:
Code:
N02022007values.csv,N03122006values.csv
Storm Speed (mph),43.0410781151,44.7614503892
Storm motion direction (degrees),261.580774982,231.668848047
MLCAPE,2450.54098661,2803.79127488
MLCIN,-9.85040520279,-50.712494335
MLLCL,230,300
MLLFC,1070.39871,1668.39871
MLEL,207.194689294,199.222951335
MLCT,Not enough data,15.46839871
Sbcape,2203.97617778,3159.56809086
Sbcin,-25.2617829901,-45.0430330816
SbLCL,320,240
SbLFC,1120.39871,1443.39871
SbEL,214.374492767,196.149400393
SbCT,Not enough data,15.84339871
0-1km SRH,458.349258678,471.772033724
10m-1km SRH,329.149187093,295.907072991
0-3km SRH,463.618101844,581.250777417
10m-3km SRH,334.418030258,405.385816683
0-6km SRH,587.783727309,530.668022483
10m-6km SRH,458.583655724,354.803061749
SFC-1Km EHI,5.04120983549,5.18538543392
SFC-3km EHI,5.1219068113,7.1038575986
SFC-6Km EHI,7.02361277587,6.21746080521
0-1km bulk shear (m/s),25.6914044845,18.5250715019
0-3km bulk shear (m/s),29.6908256645,24.2810028978
0-6km bulk shear (m/s),21.7298524954,20.0096587186
0-9km bulk shear (m/s),28.1866755643,30.6511667163

Thanks for any help in advance!

Last edited by Franklin52; 06-20-2011 at 04:35 PM.. Reason: Please use code tags for code and data samples
# 2  
Old 06-20-2011
Code:
awk -F, 'NR==1{n=FILENAME}NR==FNR{a[NR]=$0;next}FNR==1{n=n","FILENAME}{a[FNR]=a[FNR]","$2}END{print n;for (i=1;i<=FNR;i++){print a[i]}}' *.csv > out.csv
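For readability, here is a spelled-out sketch of the same idea as the one-liner above. The function name merge_by_position is made up for illustration, and the sketch assumes what the original post states: every file has the same labels in the same order, and each line has exactly one comma. It counts files instead of testing NR==FNR, but otherwise follows the same steps.

```shell
#!/bin/sh
# merge_by_position FILE...
# Hypothetical helper (the name is not from the thread). Assumes all files
# list the same labels in the same order, one "label,value" pair per line.
merge_by_position() {
    awk -F, '
        FNR == 1 {
            # First line of each input file: extend the header of file names.
            header = (header == "" ? FILENAME : header "," FILENAME)
            nfiles++
        }
        nfiles == 1 {
            # First file: keep the whole line (label and value).
            row[FNR] = $0
            rows = FNR
            next
        }
        {
            # Later files: append only the value to the matching line number.
            row[FNR] = row[FNR] "," $2
        }
        END {
            print header
            for (i = 1; i <= rows; i++) print row[i]
        }
    ' "$@"
}
```

Run it from the directory holding the data, for example merge_by_position *.csv > /some/other/dir/out.csv, so that the output file is never picked up by the *.csv glob on a later run.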

# 3  
Old 06-20-2011
Not quite sure what you want with the file headers... Do you want all but the last ignored? All but the first? None of them ignored?

This can be adapted for any of those:
Code:
#!/bin/sh
cd /path/to/csvdir || exit 1

(
        # set $1 = first file, $2 = second, etc.
        set -- *.csv

        # echo $1 $2 $3 ... $N changing space to comma with tr
        # (assumes the file names themselves contain no spaces)
        echo "$*" | tr ' ' ','

        # Print the first two lines (title) of first file
        head -n 2 "$1"

        # Loop through all files in $1, $2, ...
        for FILE
        do
                # Print out every file, ignoring first two lines
                tail -n +3 "$FILE"
        done
# Dump the whole mess into the output file.  Don't put it in the
# same directory as the input files!
) > /path/to/output.csv

# 4  
Old 06-20-2011
Quote:
Originally Posted by Corona688
Code:
                # Print out every file, ignoring first two lines
                tail -n +3 "$FILE"

I believe the OP wants to merge corresponding lines across the files, in the manner of paste(1) or join(1) rather than cat(1). My understanding is that no lines need to be skipped in any file. The first comma-delimited field is a key (although, since all files use the same order, a solution may not need to inspect it), and it needs to be removed from every file's lines except the first file's, so that the key is not repeated on each line of the final output.
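If the row order could not be trusted, the key would have to be inspected after all. A minimal sketch of that variant follows. The function name merge_by_key is made up for illustration, and it assumes every label appears exactly once in every file. A label missing from some file would leave that row one column short, which this sketch does not try to pad.

```shell
#!/bin/sh
# merge_by_key FILE...
# Hypothetical helper (the name is not from the thread). Rows are matched on
# the first comma-delimited field instead of on the line number.
merge_by_key() {
    awk -F, '
        FNR == 1 {
            # First line of each file: add its name to the header line.
            header = header sep FILENAME
            sep = ","
        }
        {
            key = $1
            if (!(key in row)) {
                # First time this label is seen: remember its position.
                order[++n] = key
                row[key] = key
            }
            # Append everything after the first comma on this line.
            row[key] = row[key] "," substr($0, length(key) + 2)
        }
        END {
            print header
            for (i = 1; i <= n; i++) print row[order[i]]
        }
    ' "$@"
}
```

The output keeps the label order of the first file in which each label appears, so for the OP's data the result should match the position-based approach.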

Regards,
Alister

---------- Post updated at 03:45 PM ---------- Previous update was at 03:38 PM ----------

Perhaps something along these lines:
Code:
paste *.csv | sed 's/<tab>[^,]*//g'

Here <tab> represents a literal tab character, and I have assumed that there are no tabs in your source files. If there are, choose a delimiter that does not occur in the data, pass it to paste with the -d option, and use it in place of the tab in the sed command.
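The pipeline above does not print the line of file names the OP asked for. One way to add it is sketched below. The function name merge_with_paste is made up for illustration, and the sketch keeps the same assumption that the data contains no tabs. It builds the tab with printf so that nothing has to be typed literally.

```shell
#!/bin/sh
# merge_with_paste FILE...
# Hypothetical helper (the name is not from the thread). Assumes no tabs in
# the data and no newlines in the file names.
merge_with_paste() {
    tab=$(printf '\t')

    # Header line: the file names joined with commas, in argument order.
    printf '%s\n' "$@" | paste -s -d, -

    # Body: paste the files side by side (tab-separated), then delete each
    # tab together with the repeated label that follows it, up to the comma.
    paste "$@" | sed "s/${tab}[^,]*//g"
}
```

Called as merge_with_paste *.csv > /some/other/dir/out.csv, it writes the header first and the merged rows after it. As with the other approaches, keep the output file out of the input directory.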

Regards,
Alister