Combine and complete multiple CSV files based on 1 parameter


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Combine and complete multiple CSV files based on 1 parameter
# 1  
Old 08-19-2018
Combine and complete multiple CSV files based on 1 parameter

I have to create a new CSV file based on the value listed on the 3rd column from different CSV files. This is what I need:

1. I should substitute the first column from each file, excluding the headers, with the file name InputXX.
2. Then, I need to look for rows with 0 on the third column in the first file Input1.csv, and substitute the entire row with the same corresponding row from the second file Input2.csv. This process should be repeated for all rows till the end of the file (96 entries).

3. If the value on the third column from the second file is also 0, then, the value from the third file is the one that is used to complete the entry on the first file. If the value on the third file is also 0,then the value from the forth file is used to complete the entry. So on and so on. If the value for that particular row is 0 in all files, then, the original entry is kept.


4. Once the first file is completed, the process is repeated for the second file using data from the remaining files.

5. The rows from the last file that have been used to complete the first files are kept the same on that last file but the value on the third column is changed to 0 since they have been already used.

6. Finally, all data is combined into the one file keeping the headers at the very top.

The original files look like this:
Input1.csv
Code:
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,5,1,Deadpool
PCR_Plate,2,7,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,3,1,Deadpool
PCR_Plate,5,5,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,0,1,Deadpool
PCR_Plate,10,0,1,Deadpool

Input2.csv
Code:
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,0,1,Deadpool
PCR_Plate,2,0,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,3,1,Deadpool
PCR_Plate,5,5,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,10,1,Deadpool
PCR_Plate,10,0,1,Deadpool

Input3.csv
Code:
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,5,1,Deadpool
PCR_Plate,2,0,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,0,1,Deadpool
PCR_Plate,5,0,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,10,1,Deadpool
PCR_Plate,10,0,1,Deadpool

Input4.csv
Code:
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,5,1,Deadpool
PCR_Plate,2,0,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,5,1,Deadpool
PCR_Plate,5,5,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,10,1,Deadpool
PCR_Plate,10,10,1,Deadpool

The resulting CSV file should look like this:
Code:
Source,Well_Source,Volume,Destination_Well,Destination
Input1,1,5,1,Deadpool
Input1,2,7,1,Deadpool
Input1,3,8,1,Deadpool
Input1,4,3,1,Deadpool
Input1,5,5,1,Deadpool
Input1,6,0,1,Deadpool
Input1,7,4.,1,Deadpool
Input1,8,7.1,1,Deadpool
Input2,9,10,1,Deadpool
Input4,10,10,1,Deadpool
Input3,1,5,1,Deadpool
Input2,2,0,1,Deadpool
Input2,3,8,1,Deadpool
Input2,4,3,1,Deadpool
Input2,5,5,1,Deadpool
Input2,6,0,1,Deadpool
Input2,7,4.,1,Deadpool
Input2,8,7.1,1,Deadpool
Input3,9,10,1,Deadpool
Input2,10,0,1,Deadpool
Input4,1,5,1,Deadpool
Input3,2,0,1,Deadpool
Input3,3,8,1,Deadpool
Input4,4,5,1,Deadpool
Input4,5,5,1,Deadpool
Input3,6,0,1,Deadpool
Input3,7,4.,1,Deadpool
Input3,8,7.1,1,Deadpool
Input4,9,10,1,Deadpool
Input3,10,0,1,Deadpool
Input4,1,0,1,Deadpool
Input4,2,0,1,Deadpool
Input4,3,8,1,Deadpool
Input4,4,0,1,Deadpool
Input4,5,0,1,Deadpool
Input4,6,0,1,Deadpool
Input4,7,4.,1,Deadpool
Input4,8,7.1,1,Deadpool
Input4,9,0,1,Deadpool
Input4,10,0,1,Deadpool

I have been focusing on trying to get it to work using two files with little to no success:
Code:
awk -F, 'FNR==NR{a[$0]=$3 FS $3;next}{ print $0, a[$1]}'

Any help will be greatly appreciated
# 2  
Old 08-19-2018
Why are 0 lines, e.g.
Code:
Input2,2,0,1,Deadpool
Input2,6,0,1,Deadpool
Input2,10,0,1,Deadpool

in the output file?
# 3  
Old 08-19-2018
Only when the other files also have 0 for that particular row. Otherwise, it should be populated with the rows that come from the remaining files
# 4  
Old 08-19-2018
Code:
Input2,10,0,1,Deadpool

?
# 5  
Old 08-20-2018
Quote:
Code:
Input2,10,0,1,Deadpool

?
Only input file 4 shows a value >0 in row number 10. That value should be used to complete the first input file. The goal is try to complete all 10 rows in the first file. Once the first file is completed, the goal changes to try to complete the second input file. However, the data present in the new completed infile 1 cannot be used any longer
# 6  
Old 08-20-2018
OK, now that everthing is more or less clear, let's try
Code:
awk -F, '
FNR == 1        {FLNR++
                 FN[FLNR] = FILENAME
                 next
                }

                {V[FLNR, $2] = $3
                 D[FLNR, $2] = $4
                 if ($2 > MX2) MX2 = $2
                }

END             {print "Source,Well_Source,Volume,Destination_Well,Destination"
                 for (f=1; f<=FLNR; f++)
                   for (i=1; i<=MX2; i++)       {ft = f
                                                 while ((ft <= FLNR) && (V[ft,i] == 0)) ft++
                                                 if (ft > FLNR) ft = f
                                                 print FN[ft], i, V[ft,i], D[f,i], "Deadpool"
                                                 V[ft, i] = 0
                                                }
                }

'  OFS=, Input[1-4]
Source,Well_Source,Volume,Destination_Well,Destination
Input1,1,5,1,Deadpool
Input1,2,7,1,Deadpool
Input1,3,8,1,Deadpool
Input1,4,3,1,Deadpool
Input1,5,5,1,Deadpool
Input1,6,0,1,Deadpool
Input1,7,4.,1,Deadpool
Input1,8,7.1,1,Deadpool
Input2,9,10,1,Deadpool
Input4,10,10,1,Deadpool
Input3,1,5,1,Deadpool
Input2,2,0,1,Deadpool
Input2,3,8,1,Deadpool
Input2,4,3,1,Deadpool
Input2,5,5,1,Deadpool
Input2,6,0,1,Deadpool
Input2,7,4.,1,Deadpool
Input2,8,7.1,1,Deadpool
Input3,9,10,1,Deadpool
Input2,10,0,1,Deadpool
Input4,1,5,1,Deadpool
Input3,2,0,1,Deadpool
Input3,3,8,1,Deadpool
Input4,4,5,1,Deadpool
Input4,5,5,1,Deadpool
Input3,6,0,1,Deadpool
Input3,7,4.,1,Deadpool
Input3,8,7.1,1,Deadpool
Input4,9,10,1,Deadpool
Input3,10,0,1,Deadpool
Input4,1,0,1,Deadpool
Input4,2,0,1,Deadpool
Input4,3,8,1,Deadpool
Input4,4,0,1,Deadpool
Input4,5,0,1,Deadpool
Input4,6,0,1,Deadpool
Input4,7,4.,1,Deadpool
Input4,8,7.1,1,Deadpool
Input4,9,0,1,Deadpool
Input4,10,0,1,Deadpool

This User Gave Thanks to RudiC For This Post:
# 7  
Old 08-20-2018
Rudy
That worked like a charm! Any chance you can explain your code?
Appreciate it very much!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

CSV File with Multiple Search Parameter

Dear Team Members, I have a unique problem. Below is the dataset which I have. I am writing a script which will read through the file and pull the invoice no. (Field 2 of C1 row). "C1",990001,"L1","HERO","MOTORCYCLE","ASIA-PACIFIC","BEIJING" "C2","CLUTCH","HYUNDAI",03032017... (13 Replies)
Discussion started by: chetanojha
13 Replies

2. Shell Programming and Scripting

Combine multiple rows based on selected column keys

Hello I want to collapse a file with multiple rows into consolidated lines of entries based on selected columns as the 'key'. Example: 1 2 3 Abc def ghi 1 2 3 jkl mno p qrts 6 9 0 mno def Abc 7 8 4 Abc mno mno abc 7 8 9 mno mno abc 7 8 9 mno j k So if columns 1, 2 and 3 are... (6 Replies)
Discussion started by: linuxlearner123
6 Replies

3. Shell Programming and Scripting

Awk: Combine multiple lines based on number of fields

If a file has following kind of data, comma delimited 1,2,3,4 1 1 1,2,3,4 1,2 2 2,3,4 My required output must have only 4 columns with comma delimited 1,2,3,4 111,2,3,4 1,222,3,4 I have tried many awk command using ORS="" but couldnt progress (10 Replies)
Discussion started by: mdkm
10 Replies

4. Shell Programming and Scripting

Combine 3 files based on a pattern

HI, I have 3 files that contain the following information (sql output from Oracle database stored in a txt file): File1.txt : alter table "SYS"."INT_COST_PRICE" enable row movement; alter table "SYS"."INT_SOH" enable row movement; alter table "SYSMAN"."XX_ACI_SKURTP" enable row movement;... (6 Replies)
Discussion started by: rparavastu
6 Replies

5. Shell Programming and Scripting

Combine multiple lines in file based on specific field

Hi, I have an issue to combine multiple lines of a file. I have records as below. Fields are delimited by TAB. Each lines are ending with a new line char (\n) Input -------- ABC 123456 abcde 987 890456 7890 xyz ght gtuv ABC 5tyin 1234 789 ghty kuio ABC ghty jind 1234 678 ght ... (8 Replies)
Discussion started by: ratheesh2011
8 Replies

6. Shell Programming and Scripting

combine lines from two files based on an if statement

I'm rather new to programming, and am attempting to combine lines from 2 files in a way that is way beyond my expertise - any help would be appreciated! I need to take a file (file1) and add columns to it from another file (file2). However, a line from file2 should only be added to a given line... (3 Replies)
Discussion started by: Cheri
3 Replies

7. Shell Programming and Scripting

How to combine two files based on fields?

I have two files which are as follows: File 1: 1 abc 250 2 pqr 300 3 xyz 100 File 2: 1 abc 230 2 pqr 700 3 xyz 500 Now I need output File, File 3as: S.No Name Count1 Count2 1 abc 250 230 2 pqr 300 700 3 xyz 100 500 NOTE: (13 Replies)
Discussion started by: karumudi7
13 Replies

8. Shell Programming and Scripting

Combine two files and put it in .csv file

Hi Freinds I have two .txt file gem1.txt and gem2.txt, Sample gem1.txt abstract (1.0.0) actionmailer (2.3.5, 2.2.2) actionpack (2.3.5, 2.2.2) activerecord (2.3.5, 2.2.2) activerecord-oracle_enhanced-adapter (1.1.9) activerecord-sqlserver-adapter (2.3.4) activeresource (2.3.5, 2.2.2)... (3 Replies)
Discussion started by: ankit_view24
3 Replies

9. Shell Programming and Scripting

Combine Multiple text or csv files column-wise

Hi All I am trying to combine columns from multiple text files into a single file using paste command but the record length being unequal in the different files the data is running over to the closest empty cell on the left. Please see below. What can i do to resolve this ? File 1 File... (15 Replies)
Discussion started by: venky_ibm
15 Replies

10. Shell Programming and Scripting

how to combine 2 lines in same files based on any text

hi, I want to combine two lines in same file. If the line ends with '&' it should belongs to previous line only Here i am writing example. Ex1: line 1 : return abcdefgh& line 2 : ijklmnopqr& line 3 : stuvw& line 4 : xyz output should be line 1: return abcdefghijklmnopqrstuvwxyz ... (11 Replies)
Discussion started by: spc432
11 Replies
Login or Register to Ask a Question