How to remove extra double quotes from string in a delimited file??


 
# 1  
Old 07-07-2011
How to remove extra double quotes from string in a delimited file??

Hi Unix Gurus..
I am new to Unix. Please help me. The file I am getting is as follows:
Input File
Code:
"2011-07-06 03:53:23","0","I","NOT SET   ",,,,"123985","SAW CUT CONCRETE SLAB 20"THICK",,"98.57","","EACH","N"
"2011-07-06 03:53:23","0","I","NOT SET   ",,,,"204312","ARMAFLEX-1 3/8 X 3"",,"2.48","","PER FOOT","N"
"2011-07-06 03:53:23","0","I","NOT SET   ",,,,"205745","MISTING HEAD HOLLOW CONE 1/8"",,"4.32","","EACH","Y"

I want to remove the extra " characters inside the fields, so that the output file looks like this:
Output File
Code:
"2011-07-06 03:53:23","0","I","NOT SET   ",,,,"123985","SAW CUT CONCRETE SLAB 20THICK",,"98.57","","EACH","N"
"2011-07-06 03:53:23","0","I","NOT SET   ",,,,"204312","ARMAFLEX-1 3/8 X 3",,"2.48","","PER FOOT","N"
"2011-07-06 03:53:23","0","I","NOT SET   ",,,,"205745","MISTING HEAD HOLLOW CONE 1/8",,"4.32","","EACH","Y"

Please help... and thanks in advance.

Jay

Last edited by Franklin52; 07-07-2011 at 03:31 PM.. Reason: Please use code tags for data and code samples, thank you
# 2  
Old 07-07-2011
How about:

Code:
sed 's/\([^",]\)"\([^",]\)/\1\2/' < infile > outfile

That should only remove double quotes that have neither a comma nor another double quote on either side.

---------- Post updated at 09:45 AM ---------- Previous update was at 09:41 AM ----------

I guess that won't work, since some of the extra quotes do sit next to another quote or a comma... Add clauses to handle those:

Code:
sed 's/\([^",]\)"\([^",]\)/\1\2/g;s/\([^",]\)""/\1"/g;s/""\([^",]\)/"\1/g' < infile > outfile

I also added the g flags I forgot before, so it can handle more than one per line.
# 3  
Old 07-07-2011
Thanks a LOT.. Appreciate your help

Your solution is working perfectly fine...

Quick question.. To maintain data integrity, I would like to leave the inch sign " as is and replace all the enclosing quotes " with | .. would that be possible?

Input File
Code:
"2011-07-06 03:53:23","0","I","NOT SET ",,,,"123985","SAW CUT CONCRETE SLAB 20"THICK",,"98.57","","EACH","N"
"2011-07-06 03:53:23","0","I","NOT SET ",,,,"204312","ARMAFLEX-1 3/8 X 3"",,"2.48","","PER FOOT","N"
"2011-07-06 03:53:23","0","I","NOT SET ",,,,"205745","MISTING HEAD HOLLOW CONE 1/8"",,"4.32","","EACH","Y"

I want to convert the quote " to pipe | and leave the inch " sign as is in description field. The output file will look like this:
Output File
Code:
|2011-07-06 03:53:23|,|0|,|I|,|NOT SET |,,,,|123985|,|SAW CUT CONCRETE SLAB 20"THICK|,,|98.57|,||,|EACH|,|N|
|2011-07-06 03:53:23|,|0|,|I|,|NOT SET |,,,,|204312|,|ARMAFLEX-1 3/8 X 3"|,,|2.48|,||,|PER FOOT|,|N|
|2011-07-06 03:53:23|,|0|,|I|,|NOT SET |,,,,|205745|,|MISTING HEAD HOLLOW CONE 1/8"|,,|4.32|,||,|EACH|,|Y|

Again.. appreciate your help
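
One possible way to do the pipe conversion described above, as a sketch rather than a tested production answer. It assumes that a " is an enclosing quote only when it sits at the start of the line, at the end of the line, or directly next to a comma, and that any other " is data (the inch sign). The function name quotes_to_pipes is made up for illustration.

```shell
# Hypothetical helper: reads delimited lines on stdin, writes them to stdout.
# Assumption: an enclosing quote is always at line start, at line end,
# or directly next to a comma. Every other quote is treated as data.
quotes_to_pipes() {
  sed -e 's/^"/|/' \
      -e 's/"$/|/' \
      -e 's/,"/,|/g' \
      -e 's/",/|,/g'
}

# Usage: quotes_to_pipes < infile > outfile
```

The rule cannot tell an inch sign from an enclosing quote when the inch sign itself touches a comma inside the description (for example 20", thick), so such records would still need manual checking. Files with Windows line endings would also need the carriage returns stripped first, otherwise the end-of-line rule does not match.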

---------- Post updated at 11:50 AM ---------- Previous update was at 10:05 AM ----------

Hi Corona,
Please help!
This command
Code:
sed 's/\([^",]\)"\([^",]\)/\1\2/g;s/\([^",]\)""/\1"/g;s/""\([^",]\)/"\1/g' < infile > outfile

is not formatting the record correctly:

Input:
Code:
"2011-07-06 04:01:58","0","I","NOT SET ",,,,"950849","The leak rate was exceeded for RACK "C" MULTI COMPRESSOR RAC",,"22.18","","EACH","Y"

Output:
Code:
"2011-07-06 04:01:58","0","I","NOT SET ",,,,"950849","The leak rate was exceeded for RACK C" MULTI COMPRESSOR RAC",,"22.18","","EACH","Y"

Desired Output:
Code:
"2011-07-06 04:01:58","0","I","NOT SET ",,,,"950849","The leak rate was exceeded for RACK C MULTI COMPRESSOR RAC",,"22.18","","EACH","Y"


Last edited by Franklin52; 07-07-2011 at 03:33 PM.. Reason: Please use code tags for data and code samples, thank you
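
A note on the RACK "C" record above: with the g flag, sed does not rescan characters it has already consumed. The first match is the space, the quote and the C; once that is replaced, the C is no longer available as the character in front of the second quote, so that quote survives. A possible fix, sketched here under the same assumptions as the original command, is to drop the g from the first clause and repeat it in a loop until nothing more matches. The function name strip_inner_quotes is made up for illustration.

```shell
# Hypothetical helper: reads delimited lines on stdin, writes them to stdout.
# :a / ta repeat the first substitution until it no longer matches, so
# quotes that are only one character apart (as in "C") are both removed.
strip_inner_quotes() {
  sed -e ':a' \
      -e 's/\([^",]\)"\([^",]\)/\1\2/' \
      -e 'ta' \
      -e 's/\([^",]\)""/\1"/g' \
      -e 's/""\([^",]\)/"\1/g'
}

# Usage: strip_inner_quotes < infile > outfile
```

This still relies on the extra quotes never sitting next to a comma inside a field; a description containing something like 20", thick would not be handled.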