Shell script that should remove unnecessary commas between double quotes in CSV file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Shell script that should remove unnecessary commas between double quotes in CSV file
# 1  
Old 11-13-2014
Shell script that should remove unnecessary commas between double quotes in CSV file

i have data as below

Code:
123,"paul phiri",paul@yahoo.com,"po.box 23, BT","Eco Bank,Blantyre,Malawi"

i need an output to be

Code:
123,"paul phiri",paul@yahoo.com,"po.box 23 BT","Eco Bank Blantyre Malawi"


Last edited by Don Cragun; 11-13-2014 at 04:03 AM.. Reason: Add CODE tags.
# 2  
Old 11-13-2014
And how are we supposed to guess which commas inside double quotes are unnecessary? And for those that you deem unnecessary, how are we supposed to guess which ones are to be removed and which ones are to be replaced by a space?
# 3  
Old 11-13-2014
Code:
#!/usr/bin/perl

use strict;
use warnings;

while (my $record = <>) {
    my @fields = $record =~ /(".*?")/g;
        for my $field (@fields) {
            my $cache = $field;
            if ($cache =~ tr/,/ /) {
                $record =~ s/$field/$cache/;
            }
    }
    print $record;
}

Example:
Save as fix_csv.pl

Run as perl fix_csv.pl file.csv
# 4  
Old 11-13-2014
Try:
Code:
awk '{for(i=2; i<=NF; i+=2) gsub(/, ?/," ",$i)}1' FS=\" OFS=\" file

or
Code:
awk '!(NR%2){gsub(/, ?/," ")}1' RS=\" ORS=\" file


Last edited by Scrutinizer; 11-13-2014 at 01:40 PM..
This User Gave Thanks to Scrutinizer For This Post:
# 5  
Old 11-14-2014
thanks can u explain the 2nd code
awk '!(NR%2){gsub(/, ?/," ")}1' RS=\" ORS=\" file

---------- Post updated at 03:51 AM ---------- Previous update was at 03:48 AM ----------
# 6  
Old 11-14-2014
Quote:
Originally Posted by mathias23
thanks can u explain the 2nd code
awk '!(NR%2){gsub(/, ?/," ")}1' RS=\" ORS=\" file

---------- Post updated at 03:51 AM ---------- Previous update was at 03:48 AM ----------
Set the input record separator (RS) and the output record separator (ORS) to the double quote character: RS=\" ORS=\". After doing this, odd numbered records (1, 3, 5, ...) will be data outside of double quotes and even records (2, 4, 6, ...) will be data between double quotes.

For every field that is not (!) an odd numbered record (record number modulo 2 (NR%2)), globally substitute (gsub()) for every occurrence of a comma followed by 0 or 1 space character (/, ?/), replace it with a single space character (" ").

For every record read (whether or not any changes were made), print the record (1). Each group of awk statements is of the form:
Code:
pattern { actions }

If pattern evaluates to a non-zero, non-empty string, the actions will be performed on that record. If no pattern is given, the specified actions will be performed on every record. If no actions (and surrounding braes) are present, the default action is to print that record.

And, finally, the script is to process the input file named "file" (file).

Last edited by Don Cragun; 11-14-2014 at 01:59 PM.. Reason: Fix typo.
This User Gave Thanks to Don Cragun For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Is there a way to handle commas inside the data when generating a csv file from shell script?

I am extracting data via sql query and some of the data has commas. Output File must be csv and I cannot update the data in the db (as it is used by other application). Example table FavoriteThings Person VARCHAR2(25), Favorite VARCHAR2(100) Sample Data Greta rain drop on... (12 Replies)
Discussion started by: patk625
12 Replies

2. Shell Programming and Scripting

How to remove unwanted commas from a .csv file?

how to remove unwanted commas from a .csv file Input file format "Server1","server-PRI-Windows","PRI-VC01","Microsoft Windows Server 2012, (64-bit)","Powered On","1,696.12","server-GEN-SFCHT2-VMS-R013,server-GEN-SFCHT2-VMS-R031,server-GEN-SFCHT2-VMS-R023"... (5 Replies)
Discussion started by: ranjancom2000
5 Replies

3. Shell Programming and Scripting

Need shell script to append double quotes for each column in a file

Hi Experts, I am beginner to the shell scripting, My requirement is to append double quotes for each column in a file if double quotes does not exist. Example: "abc"|123|"gh-ch"|23.067 Use code tags, thanks. (10 Replies)
Discussion started by: spidy
10 Replies

4. Shell Programming and Scripting

Replace Double quotes within double quotes in a column with space while loading a CSV file

Hi All, I'm unable to load the data using sql loader where there are double quotes within the double quotes As these are optionally enclosed by double quotes. Sample Data : "221100",138.00,"D","0019/1477","44012075","49938","49938/15043000","Television - 22" Refurbished - Airwave","Supply... (6 Replies)
Discussion started by: mlavanya
6 Replies

5. Shell Programming and Scripting

Remove quotes and commas from field

In the attached file I am trying to remove all the "" and , (quotes and commas) from $2 and $3 and the "" (quotes) from $4. I tried the below as a start: awk -F"|" '{gsub(/\,/,X,$2)} 1' OFS="\t" enhancer.txt > comma.txt Thank you :). (6 Replies)
Discussion started by: cmccabe
6 Replies

6. Shell Programming and Scripting

How to delete the commas in a .CSV file that are enclosed in a string with double quotes?

Okay, I would like to delete all the commas in a .CSV file (TEST.CSV) or at least substitute them with empty space, that are enclosed in double quote. Please see the sample file as below: column 1,column 2,column 3,column 4,column 5,column 6,column 7,column 8,column 9,column 10... (8 Replies)
Discussion started by: dhruuv369
8 Replies

7. Shell Programming and Scripting

Preserve commas inside double quotes (perl)

Hi, I have an input file like this $ cat infile hi,i,"am , sam", y hello ,good, morning abcd, " ef, gh " ,ij no, "good,morning", yes, "good , afternoon" from this file I have to split the fields on basis of comma"," however, I the data present inside double qoutes should be treated as... (3 Replies)
Discussion started by: sam05121988
3 Replies

8. Shell Programming and Scripting

HELP with AWK or SED. Need to replace the commas between double quotes in CSV file

Hello experts, I need to validate a csv file which contains data like this: Sample.csv "ABCD","I",23,0,9,,"23/12/2012","OK","Street,State, 91135",0 "ABCD","I",23,0,9,,"23/12/2012","OK","Street,State, 91135",0 I just need to check if all the records contain exactly the number of... (5 Replies)
Discussion started by: shell_boy23
5 Replies

9. Shell Programming and Scripting

shell script to remove extra commas from CSV outp file

Name,,,,,,,,,,,,,,,,,,,,Domain,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Contact,Phone,Email,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Location -----------------------,------------------------------------------------,-------,-----,---------------------------------,------------------------------------ ----... (1 Reply)
Discussion started by: sreenath1037
1 Replies

10. Shell Programming and Scripting

Remove duplicate commas after exporting excel file to csv

Hello everyone I'm new here and this is my first post so first of all I want to say that this is a great forum and I have managed to found most of my answers in these forums : ) So with that I ask you my first question: I have an excel file which I saved as a csv. However the excel file... (3 Replies)
Discussion started by: Spunkerspawn
3 Replies
Login or Register to Ask a Question