CSV to SQL insert: Awk for strings with multiple lines in csv


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting CSV to SQL insert: Awk for strings with multiple lines in csv
# 1  
Old 12-12-2011
CSV to SQL insert: Awk for strings with multiple lines in csv

Hi Fellows,

I have been struggling to fix an issue in csv records to compose sql statements and have been really losing sleep over it. Here is the problem:

I have csv files in the following pipe-delimited format:

Code:
Column1|Column2|Column3|Column4|NEWLINE
Address Type|some descriptive text|"Mailing
Physical
Shipping"|some text here|NEWLINE
Street Number/Name|some description for rec 2|None|some text here|NEWLINE
City|description for rec 3|None|some text here too|NEWLINE
Zip|description for rec 4|text here|sdfla sdflsdhlsdfsk|NEWLINE
Record 5|description|"Main - Main record
Administrative - Administrative phone number
HR - HR phone number
Services - Services line
Fax - Facsimile phone number
Other"|some text here|NEWLINE
Grades Offered|description here|"IT - Infant/toddler
PR - Preschool
PK - Prekindergarten
TK - Transitional Kindergarten
KG - Kindergarten
01 - First grade
02 - Second grade
UG - Ungraded
Other"|some text here|NEWLINE

The output format I am looking for is the following:
Code:
insert into MyTBl values ( Column1,Column2,Column3,Column4,NEWLINE);
insert into MyTBl values ( 'Address Type','some descriptive text','Mailing \nMailing \n Shipping' ,'some text here','NEWLINE');
insert into MyTBl values ( 'Street Number/Name','some description for rec 2','None','some text here','NEWLINE');
insert into MyTBl values ( 'City','description for rec 3','None','some text here too','NEWLINE');
insert into MyTBl values ( 'Zip','description for rec 4','text here','sdfla sdflsdhlsdfsk','NEWLINE');
insert into MyTBl values ( 'Record 5','description','Main - Main record \nAdministrative - Administrative phone number \nHR - HR phone number \nServices - Services line \nFax - Facsimile phone number \nOther','some text here','NEWLINE');
insert into MyTBl values ( 'Grades Offered','description here','IT - Infant/toddler \nPR - Preschool \nPK - Prekindergarten \nTK - Transitional Kindergarten \nKG - Kindergarten \n01 - First grade \n02 - Second grade \nUG - Ungraded \nOther','some text here','NEWLINE');

I am having difficulty preserving the spaces in the fourth column data which is often a multiple line text in double quotes (strings). I need to preserve these new lines in that column in my database table.

I wonder if someone can help in using awk or sed to do this. I use cygwin on windows platform.

thanks a lot!

Moderator's Comments:
Mod Comment How to use code tags

Last edited by Franklin52; 12-13-2011 at 04:12 AM.. Reason: Please use code tags for code and data samples, thank you
# 2  
Old 12-12-2011
Well, for starters, you need to adjust the CSV into the right number of columns...

Does this line
Code:
Column1|Column2|Column3|Column4|NEWLINE

actually exist in your input file?
# 3  
Old 12-12-2011
The do not actually are in each of the files but some files have headers.
Thank you for your message.
# 4  
Old 12-12-2011
How should I tell apart the files with headers from the files without?

Put this in a text file:

Code:
BEGIN { FS="|"; OFS="," }
# Append 'short' lines into one longer line
NF<5 { if(S){ $0="\\n" $0; } S=S $0; $0=S }
# When it fially ends in "NEWLINE", we know we have an entire record
$NF == "NEWLINE" {
        for(N=1; N<=NF; N++)
        # Add single quotes to things lacking double quotes
        if(substr($N,1,1) != "\"")
                $N = "'" $N "'";

        print "insert into MyTBI values("$1,$2,$3,$4,$5");";
        S="";
}

and use it like this:

Code:
$ awk -f myawk.txt < data
insert into MyTBI values('Column1','Column2','Column3','Column4','NEWLINE');
insert into MyTBI values('Address Type','some descriptive text',"Mailing\nPhysical\nShipping",'Unchanged from Version 1','NEWLINE');
insert into MyTBI values('Street Number/Name','some description for rec 2','None','some text here','NEWLINE');
insert into MyTBI values('City','description for rec 3','None','some text here too','NEWLINE');
insert into MyTBI values('Zip','description for rec 4','text here','sdfla sdflsdhlsdfsk','NEWLINE');
insert into MyTBI values('Record 5','description',"Main - Main record\nAdministrative - Administrative phone number\nHR - HR phone number\nServices - Services line\nFax - Facsimile phone number\nOther",'Unchanged from Version 1','NEWLINE');
insert into MyTBI values('Grades Offered','description here',"IT - Infant/toddler\nPR - Preschool\nPK - Prekindergarten\nTK - Transitional Kindergarten\nKG - Kindergarten\n01 - First grade\n02 - Second grade\nUG - Ungraded\nOther",'Unchanged from Version 1','NEWLINE');
$

Does | ever happen inside double-quotes? That won't be handled properly.
# 5  
Old 12-12-2011
Headers can be suppressed in csv generation process.
Pipes never occur in any records. That's why they are used as a delimiter.

Thank you Corona. That works great.
kind regards!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk - CSV file - field with single or multiple spaces

Hi, In a csv file, I want to select records where first column has zero or multiple spaces. Eg: abc.csv ,123,a ,22,b ,11,c a,11,d So output should be: ,123,a ,22,b ,11,c Please advise (5 Replies)
Discussion started by: vegasluxor
5 Replies

2. Shell Programming and Scripting

Multiple carriage returns within quotation marks causing new lines in csv

There is a closed thread called "carriage returns within quotation marks causing new lines in csv" that I am unable to post to, so I am starting a new thread. The awk solution worked perfectly in most cases. We have some cases where there are multiple carriage returns within a single quoted... (9 Replies)
Discussion started by: Mary Roberts
9 Replies

3. UNIX for Dummies Questions & Answers

Need help combining txt files w/ multiple lines into csv single cell - also need data merge

:confused:Hello -- i just joined the forums. I am a complete noob -- only about 1 week into learning how to program anything... and starting with linux. I am working in Linux terminal. I have a folder with a bunch of txt files. Each file has several lines of html code. I want to combine... (2 Replies)
Discussion started by: jetsetter
2 Replies

4. Shell Programming and Scripting

awk read column csv and search in other csv

hi, someone to know how can i read a specific column of csv file and search the value in other csv columns if exist the value in the second csv copy entire row with all field in a new csv file. i suppose that its possible using awk but i m not expertise thanks in advance (8 Replies)
Discussion started by: giankan
8 Replies

5. Shell Programming and Scripting

Perl search csv fileA where two strings exist on another csv fileB

Hi I have two csv files, with the following formats: FileA.log: Application, This occured blah Application, That occured blah Application, Also this AnotherLog, Bob did this AnotherLog, Dave did that FileB.log: Uk, London, Application, datetime, LaterDateTime, Today it had'nt... (8 Replies)
Discussion started by: PerlNewbRP
8 Replies

6. Shell Programming and Scripting

Extract strings from multiple lines into one csv file

Hi all, Please go through my requirement. I have a log file in the location /opt/WebSphere61/AppServer/profiles/EMQbatchprofile/logs/EMQbatch This file contains the follwing pattern data <af type="tenured" id="42" timestamp="May 14 13:44:13 2011" intervalms="955.624"> <minimum... (8 Replies)
Discussion started by: satish.vampire
8 Replies

7. Shell Programming and Scripting

how to give multiple csv files as input in awk

Hi All, I am new to shell scripting..My problem is i want to give multiple csv files as input to awk script and process the data into one file.. My input file is File1 File2 File3 Product Location Period SalesPrice A x 8/11/2010 ... (7 Replies)
Discussion started by: kvth
7 Replies

8. Shell Programming and Scripting

insert data into specific lines of a CSV

So I work in a 1 to 1 laptop deployment and sometimes we need to mass order parts. The vendor will send us a text file and we have to manually input serial numbers. Well I have a full blown web based inventory system which I can pull serial number reports from. I then have to input the part... (4 Replies)
Discussion started by: tlarkin
4 Replies

9. Shell Programming and Scripting

Matching lines across multiple csv files and merging a particular field

I have about 20 CSV's that all look like this: "","","","","","","","","","","","","","","",""What I've been told I need to produce is the exact same thing, but with each file now containing the start_code from every other file where the email matches. It doesn't matter if any of the other... (1 Reply)
Discussion started by: Demosthenes
1 Replies

10. Shell Programming and Scripting

How to (n)awk lines of CSV with certain number of fields?

I have a CSV file with a variable number of fields per record. How do I print lines of a certain number of fields only? Several permutations of the following (including the use of escape characters) have failed to retrieve the line I'm after (1,2,3,4)... $ cat myfile 1,2,3,4 1,2,3 $ # Print... (1 Reply)
Discussion started by: cs03dmj
1 Replies
Login or Register to Ask a Question