Post 302792225 by sam05121988 on Wednesday 10th of April 2013 03:08:01 AM
Preserve commas inside double quotes (perl)

Hi,

I have an input file like this

Code:
$ cat infile
hi,i,"am , sam", y
hello ,good, morning
abcd, " ef, gh " ,ij
no, "good,morning", yes, "good , afternoon"

From this file I have to split each line into fields on the comma (","). However, the data inside double quotes should be treated as one field.

For this I have written the code below:

Code:
$ cat ./prog
#!/usr/bin/perl -w
use strict;

while (<>)
{
my @arr=split(',',$_);
my $startpos;
my $endpos;

#determining starting and ending position of the quotes
for(my $i=0;$i<@arr;$i++)
{
        $startpos=$i    if ( $arr[$i] =~ /^( *)"/ );
        $endpos=$i      if ( $arr[$i] =~ /"( *)$/ );
}

#clubbing the elements between starting and ending quotes into one element and removing redundant entries
for (my $i=0; $i<@arr; $i++)
{
        if ( defined($startpos) && $i == $startpos )
        {
                for ( my $j=$startpos+1; $j <=$endpos;$j++)
                {
                        $arr[$i] .= ",".$arr[$j];
                        $arr[$j] = undef;
                }
        }
}

#removing undef entities from the array
@arr=grep defined, @arr;

for(my $i=0;$i<@arr;$i++)
{
        print "\$arr[$i] = "."$arr[$i]"."\n";
}

}

This is working fine except for the last line of the input file:
Code:
no, "good,morning", yes, "good , afternoon"

Current output

Code:
$ ./prog infile
$arr[0] = hi
$arr[1] = i
$arr[2] = "am , sam"
$arr[3] =  y

$arr[0] = hello
$arr[1] = good
$arr[2] =  morning

$arr[0] = abcd
$arr[1] =  " ef, gh "
$arr[2] = ij

$arr[0] = no
$arr[1] =  "good
$arr[2] = morning"
$arr[3] =  yes
$arr[4] =  "good , afternoon"

Expected output
Code:
$ ./prog infile
$arr[0] = hi
$arr[1] = i
$arr[2] = "am , sam"
$arr[3] =  y

$arr[0] = hello
$arr[1] = good
$arr[2] =  morning

$arr[0] = abcd
$arr[1] =  " ef, gh "
$arr[2] = ij

$arr[0] = no
$arr[1] =  "good,morning"
$arr[2] =  yes
$arr[3] =  "good , afternoon"

Please suggest.

Thanks
Sam
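
A note on why the last line fails: $startpos and $endpos are plain scalars that get overwritten on every match, so only the last quoted section of a line is merged. One possible way around this, sketched below, is to skip the split-then-merge step and let a regex pick out whole fields directly. The helper name split_quoted is made up for this example, and the sketch assumes plain " quotes with no backslash escapes and no quoted field that spans several lines. The sample lines are hard-coded so the script runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split one line on the
# commas that lie outside double quotes. Quotes and spaces are kept as-is.
sub split_quoted {
    my ($line) = @_;
    my @fields;
    # One field = any mix of ordinary characters and complete "..." sections.
    # Assumption: an unterminated quote runs to the end of the line.
    while ($line =~ /\G((?:[^",]|"[^"]*(?:"|\z))*)(,|\z)/g) {
        push @fields, $1;
        last if $2 eq '';    # reached the end of the line
    }
    return @fields;
}

# Sample lines copied from infile in the post.
my @lines = (
    'hi,i,"am , sam", y',
    'hello ,good, morning',
    'abcd, " ef, gh " ,ij',
    'no, "good,morning", yes, "good , afternoon"',
);

for my $line (@lines) {
    my @arr = split_quoted($line);
    print "\$arr[$_] = $arr[$_]\n" for 0 .. $#arr;
    print "\n";
}
```

With the last sample line this should give four fields, with "good,morning" kept together as in the expected output above. When reading from a file with while (<>), chomp each line before passing it in. For real CSV data, a dedicated parser such as the Text::CSV module from CPAN is probably the safer choice.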
 
