Preserve commas inside double quotes (perl)


 
# 1  
Old 04-10-2013

Hi,

I have an input file like this

Code:
$ cat infile
hi,i,"am , sam", y
hello ,good, morning
abcd, " ef, gh " ,ij
no, "good,morning", yes, "good , afternoon"

I need to split each line of this file into fields on commas (","). However, any data inside double quotes should be treated as a single field, even if it contains commas.

For this I have written the code below:

Code:
$ cat ./prog
#!/usr/bin/perl -w
use strict;

while (<>)
{
my @arr=split(',',$_);
my $startpos;
my $endpos;

# determine the starting and ending positions of the quotes
for(my $i=0;$i<@arr;$i++)
{
        $startpos=$i    if ( $arr[$i] =~ /^( *)"/ );
        $endpos=$i      if ( $arr[$i] =~ /"( *)$/ );
}

# merge the elements between the starting and ending quotes into one element and clear the redundant entries
for (my $i=0; $i<@arr; $i++)
{
        if ( defined($startpos) && $i == $startpos )
        {
                for ( my $j=$startpos+1; $j <=$endpos;$j++)
                {
                        $arr[$i] .= ",".$arr[$j];
                        $arr[$j] = undef;
                }
        }
}

#removing undef entities from the array
@arr=grep defined, @arr;

for(my $i=0;$i<@arr;$i++)
{
        print "\$arr[$i] = "."$arr[$i]"."\n";
}

}

This works fine except for the last line of the input file, which contains two quoted fields:
Code:
no, "good,morning", yes, "good , afternoon"

Current output

Code:
$ ./prog infile
$arr[0] = hi
$arr[1] = i
$arr[2] = "am , sam"
$arr[3] =  y

$arr[0] = hello
$arr[1] = good
$arr[2] =  morning

$arr[0] = abcd
$arr[1] =  " ef, gh "
$arr[2] = ij

$arr[0] = no
$arr[1] =  "good
$arr[2] = morning"
$arr[3] =  yes
$arr[4] =  "good , afternoon"

Expected output
Code:
$ ./prog infile
$arr[0] = hi
$arr[1] = i
$arr[2] = "am , sam"
$arr[3] =  y

$arr[0] = hello
$arr[1] = good
$arr[2] =  morning

$arr[0] = abcd
$arr[1] =  " ef, gh "
$arr[2] = ij

$arr[0] = no
$arr[1] =  "good,morning"
$arr[2] =  yes
$arr[3] =  "good , afternoon"

Please suggest.

Thanks
Sam
# 2  
Old 04-10-2013
Code:
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# path to the input file
my $file = 'MyCSV.csv';

# allow_whitespace => 1 tolerates spaces around the separators and quotes
my $csv = Text::CSV->new({
    allow_loose_quotes => 0,
    quote_space        => 0,
    allow_whitespace   => 1,
});

open(my $fh, "<", $file) or die "Cannot open $file: $!";

while (my $line = <$fh>) {
    if ($csv->parse($line)) {
        my @columns = $csv->fields();
        for (my $i = 0; $i < @columns; $i++) {
            print "\$arr[$i] = $columns[$i]\n";
        }
    } else {
        my $err = $csv->error_input;
        print "Failed to parse line: $err";
    }
}
close $fh;

# 3  
Old 04-10-2013
Thanks Pravin, but is there any way to do it without using modules?
I do not have the authority to install modules on the system where I am running this program.
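
One option that may be worth checking first: Text::ParseWords is part of the core Perl distribution, so on a standard installation it should already be available without installing anything. The sketch below is only an illustration under that assumption. The helper name split_line is made up for this example, and the sample lines are copied from the infile in the first post. Note that quotewords also treats single quotes and backslashes as special, which does not matter for this input but could for other data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);   # core module, nothing to install

# split_line is a hypothetical helper name used only for this sketch.
sub split_line {
    my ($line) = @_;
    chomp $line;
    # Second argument 1 = keep the quote characters in the returned fields.
    return quotewords(',', 1, $line);
}

# Sample lines copied from the infile shown in the question.
my @samples = (
    q{hi,i,"am , sam", y},
    q{hello ,good, morning},
    q{abcd, " ef, gh " ,ij},
    q{no, "good,morning", yes, "good , afternoon"},
);

for my $line (@samples) {
    my @arr = split_line($line);
    print "\$arr[$_] = $arr[$_]\n" for 0 .. $#arr;
    print "\n";
}
```

To read a real file instead of the inline samples, the loop could be replaced by while (my $line = <>) { ... } and the script run as ./prog infile.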
# 4  
Old 04-10-2013
Code:
#!/usr/bin/perl -w
use strict;

while (<>)
{
    chomp;
    my @arr = split(',', $_);
    my @out;        # merged fields for the current line
    my $flag = 0;   # 1 while we are inside an unclosed double-quoted field

    for (my $i = 0; $i < @arr; $i++)
    {
        if ($flag) {
            # still inside quotes: put back the comma that split removed
            $out[-1] .= "," . $arr[$i];
        } else {
            push @out, $arr[$i];
        }
        # an odd number of quotes in this piece opens or closes a quoted field
        my $quotes = () = $arr[$i] =~ /"/g;
        $flag = !$flag if $quotes % 2;
    }

    for (my $i = 0; $i < @out; $i++)
    {
        print "\$arr[$i] = $out[$i]\n";
    }
    print "\n";
}
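
The split-and-rejoin idea above can also be written as a single pattern match that never splits inside quotes in the first place. This is only a sketch using core Perl, not a full CSV parser. It assumes the quotes on each line are balanced and that a quote is never escaped or doubled ("") inside a field. The helper name split_quoted is made up for this example, and the sample lines are copied from the infile in the first post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_quoted is a hypothetical helper name used only for this sketch.
# Assumption: quotes are balanced and never escaped or doubled inside a field.
sub split_quoted {
    my ($line) = @_;
    chomp $line;
    # tr/"// with an empty replacement only counts the quote characters.
    warn "unbalanced quotes in: $line\n" if ($line =~ tr/"//) % 2;

    # Append a comma so that every field, including the last, ends with one.
    # A field is any run of "quoted chunks" and characters that are neither
    # a quote nor a comma; \G forces each match to start where the last ended.
    my @fields = "$line," =~ /\G((?:"[^"]*"|[^",])*),/g;
    return @fields;
}

# Sample lines copied from the infile shown in the question.
my @samples = (
    q{hi,i,"am , sam", y},
    q{hello ,good, morning},
    q{abcd, " ef, gh " ,ij},
    q{no, "good,morning", yes, "good , afternoon"},
);

for my $line (@samples) {
    my @arr = split_quoted($line);
    print "\$arr[$_] = $arr[$_]\n" for 0 .. $#arr;
    print "\n";
}
```

Leading and trailing spaces are kept as part of each field, as in the expected output from the first post. If they are not wanted, each element could be trimmed afterwards with s/^\s+|\s+$//g.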
