Replace Double quotes within double quotes in a column with space while loading a CSV file

    #1  
Old Unix and Linux 05-11-2015   -   Original Discussion by mlavanya
mlavanya mlavanya is offline
Registered User
 
Join Date: May 2015
Last Activity: 22 July 2015, 8:27 AM EDT
Posts: 5
Thanks: 1
Thanked 0 Times in 0 Posts

Hi All,

I'm unable to load data with SQL*Loader when a field that is optionally enclosed by double quotes itself contains a double quote.

Sample Data :


Code:
"221100",138.00,"D","0019/1477","44012075","49938","49938/15043000","Television - 22" Refurbished - Airwave","Supply of Delivery & Collection 1st Unit (1), Delivery & Collection Additions (1), Whitbread Refurb LCD (2)","Airwave Europe Ltd","15/04/2015","2520",""

Desired output :


Code:
"221100",138.00,"D","0019/1477","44012075","49938","49938/15043000","Television  - 22 Refurbished - Airwave","Supply of Delivery & Collection 1st  Unit (1), Delivery & Collection Additions (1), Whitbread Refurb LCD  (2)","Airwave Europe Ltd","15/04/2015","2520",""

I have checked many threads posted on this site and tried:
Code:
sed 's/\([^",]\)"\([^",]\)/\1\2/' < infile > outfile


Code:
perl -anle 'my @fields = ($_ =~ /(?:^|,)(".*?"|[^,]*?)(?=,|$)/g);foreach my $f(@fields){$f=~s/"//g;$f=sprintf("\"%s\"",$f);}my $line=join(",",@fields);print $line' file

But neither worked. If the last column of the data is blank, it gets changed as well, and I get the output below.


Code:
"221100",138.00,"D","0019/1477","44012075","49938","49938/15043000","Television  - 22 Refurbished - Airwave","Supply of Delivery & Collection 1st  Unit (1), Delivery & Collection Additions (1), Whitbread Refurb LCD  (2)","Airwave Europe Ltd","15/04/2015","2520","

Could anyone help me fix this issue?

Regards,
Lavanya.

Last edited by rbatte1; 05-11-2015 at 12:45 PM..
    #2  
Old Unix and Linux 05-11-2015   -   Original Discussion by mlavanya
RudiC RudiC is online now Forum Staff  
Moderator
 
Join Date: Jul 2012
Last Activity: 22 November 2017, 4:19 AM EST
Location: Aachen, Germany
Posts: 11,641
Thanks: 320
Thanked 3,618 Times in 3,323 Posts
Your sed snippet works for me: it removes exactly the " after the 22. Why don't you like that solution?
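For reference, a minimal sketch of what that snippet does on a shortened version of the sample line. The function name `remove_first_inner_quote` is made up for illustration. Because the `s` command has no `g` flag, only the first quote that has neither a comma nor another quote on either side is removed from each line.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed command from post #1.
# It deletes the first double quote per line whose neighbours on both
# sides are neither a double quote nor a comma.
remove_first_inner_quote() {
    sed 's/\([^",]\)"\([^",]\)/\1\2/'
}

# Shortened sample line: only the quote after 22 is embedded.
printf '%s\n' '"49938","Television - 22" Refurbished - Airwave","2520",""' |
    remove_first_inner_quote
# prints: "49938","Television - 22 Refurbished - Airwave","2520",""
```

A line with more than one embedded quote would need the `g` flag on the substitution.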
    #3  
Old Unix and Linux 05-11-2015   -   Original Discussion by mlavanya
cjcox cjcox is offline
Registered User
 
Join Date: May 2005
Last Activity: 27 June 2016, 2:12 PM EDT
Posts: 614
Thanks: 4
Thanked 110 Times in 107 Posts
Change the outer double quotes to an unprintable character (here a literal ^B; note that this is not a caret followed by B, but ctrl-B). Then eliminate the remaining double quotes and replace all ^Bs with double quotes.


Code:
sed -e 's/^"/^B/' -e 's/"$/^B/' -e 's/",/^B,/g' -e 's/,"/,^B/g' -e 's/"//g' -e 's/^B/"/g'
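A sketch of the same six substitutions that avoids typing the control character by hand. It assumes a POSIX shell, where `printf '\002'` emits the single ctrl-B byte. The variable `ph` and the function name `strip_inner_quotes` are introduced here for illustration only.

```shell
#!/bin/sh
# ph holds one ctrl-B byte (octal 002), used as a temporary placeholder.
ph=$(printf '\002')

# Hypothetical helper: reads CSV lines on stdin, writes cleaned lines to stdout.
strip_inner_quotes() {
    sed -e "s/^\"/$ph/" \
        -e "s/\"\$/$ph/" \
        -e "s/\",/$ph,/g" \
        -e "s/,\"/,$ph/g" \
        -e 's/"//g' \
        -e "s/$ph/\"/g"
}

# The four "outer" positions (line start, line end, before a comma,
# after a comma) are protected; every other quote is deleted.
printf '%s\n' '"49938","Television - 22" Refurbished - Airwave",""' |
    strip_inner_quotes
# prints: "49938","Television - 22 Refurbished - Airwave",""
```

One limitation to keep in mind: an embedded quote that sits directly next to a comma inside a field looks like an outer quote to these patterns and is left in place.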

    #4  
Old Unix and Linux 05-11-2015   -   Original Discussion by mlavanya
mlavanya mlavanya is offline
Registered User
 
Join Date: May 2015
Last Activity: 22 July 2015, 8:27 AM EDT
Posts: 5
Thanks: 1
Thanked 0 Times in 0 Posts
Hi RudiC,

It doesn't work for me, as the data should look something like the below:

Input:


Code:
   560003_07.28,292.47,"D","1073/1220","44536370","16520","16520/14103000","Vacuum   - Upright (c) - "Vaclensa","Supply of BS36 Upright (3yr NO QUIBBLE   Guarantee) (1)","Vaclensa PLC","03/10/2014","2510","PINON15N001"

After using

Code:
sed 's/\([^",]\)"\([^",]\)/\1\2/' < input_file > output_file


Output:

Code:
   560003_07.28,292.47,"D","1073/1220","44536370","16520","16520/14103000","Vacuum   - Upright (c) - Vaclensa","Supply of BS36 Upright (3yr NO QUIBBLE   Guarantee) (1)","Vaclensa PLC","03/10/2014","2510","PINON15N001

Expected Output:


Code:
   560003_07.28,292.47,"D","1073/1220","44536370","16520","16520/14103000","Vacuum   - Upright (c) - Vaclensa","Supply of BS36 Upright (3yr NO QUIBBLE   Guarantee) (1)","Vaclensa PLC","03/10/2014","2510","PINON15N001"


Can you please help me out with this?

Regards,
Lavanya.

---------- Post updated at 01:04 AM ---------- Previous update was at 01:01 AM ----------

Hi Cjcox,

Can you please confirm whether I can write all of the code you provided as a single sed command?
I'm new to this technology and trying to learn.

Regards,
Lavanya.
    #5  
Old Unix and Linux 05-11-2015   -   Original Discussion by mlavanya
Don Cragun Don Cragun is offline Forum Staff  
Administrator
 
Join Date: Jul 2012
Last Activity: 22 November 2017, 3:49 AM EST
Location: San Jose, CA, USA
Posts: 10,673
Thanks: 572
Thanked 3,737 Times in 3,189 Posts
Quote:
Originally Posted by mlavanya
... ... ...

Hi Cjcox,

Can you please confirm whether I can write all of the code you provided as a single sed command?
I'm new to this technology and trying to learn.

Regards,
Lavanya.
Yes Lavanya, you can use a single invocation of the sed utility to perform all six sed substitute commands as shown in the suggestion cjcox provided.

Obviously, you will have to provide input for that command to process, and, unless you just want the output to go to standard output, you'll need to redirect the output.
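As a rough illustration of that point, the sketch below runs all six substitutions in one sed invocation, reading from one file and redirecting the result to another. The helper name `clean_csv` and the file names are placeholders, not anything from this thread, and the ctrl-B byte is generated with `printf` rather than typed.

```shell
#!/bin/sh
# clean_csv IN OUT: hypothetical helper that reads IN and writes the
# cleaned lines to OUT using a single sed invocation.
clean_csv() {
    ph=$(printf '\002')   # the single ctrl-B byte used as a placeholder
    sed "s/^\"/$ph/; s/\"\$/$ph/; s/\",/$ph,/g; s/,\"/,$ph/g; s/\"//g; s/$ph/\"/g" \
        < "$1" > "$2"
}

# Example call with placeholder file names:
# clean_csv infile.csv outfile.csv
```

Separating the commands with semicolons in one script string is equivalent to passing six `-e` options. Writing to a separate output file keeps the original data untouched in case the result needs checking.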
    #6  
Old Unix and Linux 05-11-2015   -   Original Discussion by mlavanya
mlavanya mlavanya is offline
Registered User
 
Join Date: May 2015
Last Activity: 22 July 2015, 8:27 AM EDT
Posts: 5
Thanks: 1
Thanked 0 Times in 0 Posts
Hi Don,

I tried using:

Code:
sed -e 's/^"/^B/' -e 's/"$/^B/' -e 's/",/^B,/g' -e 's/,"/,^B/g' -e 's/"//g' -e 's/^B/"/g'

but I could see that all the double quotes are replaced by ^B, and the last double quote in the data file is eliminated as well.

Input:


Code:
   221100,37.20,"C","0073/1454","44019120","16395","16395/14103000","Safety   Workwear - "Screwfix","","Screwfix Direct   Ltd","10/10/2014","2520",""

Output:


Code:
   ^B221100^B,37.20,^BC^B,^B0073/1454^B,^B44019120^B,^B16395^B,^B16395/14103000^B,^BSafety   Workwear - Screwfix^B,^B^B,^BScrewfix Direct Ltd^B,^B10/10/2014^B,^B2520^B,^B

Expected Output:


Code:
   221100,37.20,"C","0073/1454","44019120","16395","16395/14103000","Safety   Workwear - Screwfix","","Screwfix Direct   Ltd","10/10/2014","2520",""

As you can see in the output, the last column, which holds an empty value, has been replaced with only a single double quote.

Please help me resolve this issue.

Regards,
Lavanya.

---------- Post updated at 08:29 AM ---------- Previous update was at 08:15 AM ----------

Actually, I have a problem only with the 8th column of the data file. Can we make the change only for column 8, that is, check whether there are any double quotes between the enclosing double quotes and replace them with a space?
    #7  
Old Unix and Linux 05-12-2015   -   Original Discussion by mlavanya
Don Cragun Don Cragun is offline Forum Staff  
Administrator
 
Join Date: Jul 2012
Last Activity: 22 November 2017, 3:49 AM EST
Location: San Jose, CA, USA
Posts: 10,673
Thanks: 572
Thanked 3,737 Times in 3,189 Posts
Go back and look at message #3 in this thread again. You seem to have used the two characters circumflex (^) and capital letter b (B) instead of the single character that you get by pressing and holding the control key (control, ctl, or cntl on your keyboard depending on your keyboard manufacturer) while you press and release the B key. This key combination would show up on your editing screen as ^B if you were using common UNIX/Linux/POSIX editing tools like vi.

If, for some reason, you are unable to use the ctl-B key combination to create that character, you can replace all occurrences of that character in the sed command line with any other character that CANNOT appear as a legitimate character in your input file. However, you cannot use a character that has a special meaning in a basic regular expression or in a sed s command replacement string.
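A possible way to apply that advice, sketched with a made-up helper `strip_with_placeholder` that takes the placeholder character as its first argument and refuses to run if that character already occurs in the file. The first sed command, which strips a trailing carriage return, rests on an assumption that the thread does not confirm: the missing final quote in post #6 would be consistent with input lines ending in CR LF, because then the last quote is not at the end of the line.

```shell
#!/bin/sh
# strip_with_placeholder PLACEHOLDER FILE   (hypothetical helper name)
# PLACEHOLDER must be one character with no special meaning in a basic
# regular expression or in a sed replacement string, e.g. '@'.
strip_with_placeholder() {
    ph=$1
    file=$2
    cr=$(printf '\r')   # carriage return, in case lines end in CR LF

    # Refuse to run if the placeholder already appears in the data.
    if grep -qF -- "$ph" "$file"; then
        echo "placeholder '$ph' already occurs in $file; choose another" >&2
        return 1
    fi

    sed -e "s/$cr\$//" \
        -e "s/^\"/$ph/" -e "s/\"\$/$ph/" \
        -e "s/\",/$ph,/g" -e "s/,\"/,$ph/g" \
        -e 's/"//g' -e "s/$ph/\"/g" "$file"
}

# Example call with a placeholder file name:
# strip_with_placeholder '@' infile.csv > outfile.csv
```

Note that the output has the carriage returns removed, so it uses plain LF line endings.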