regular expression for split function in perl


 
# 1  
Old 02-03-2011

Hi,

Below is an example of a record that I want to split with Perl's split function and load into an array. I am having a tough time figuring out the exact regex to perform the split.

Given record:
Code:
"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,

The delimiter between fields is a comma (,). Fields enclosed in quotation marks are strings; unquoted fields are integers.

The problem with this record is that a quoted string can itself contain commas (for example "lmno,pqrR, stv"). Those commas must not be treated as field delimiters, because they sit inside the quotation marks.
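To make the problem concrete, here is a minimal sketch of what a plain split on every comma does to the sample record. The sub name naive_split is made up for illustration. The quoted field is cut at its two embedded commas, so the record comes out as twelve pieces instead of ten.

```perl
use strict;
use warnings;

# Illustration only: split on every comma, ignoring quotation marks.
sub naive_split {
    my ($record) = @_;
    return split /,/, $record;
}

my $rec = q/"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,/;
my @pieces = naive_split($rec);

# The quoted field "lmno,pqrR, stv" is broken into three separate pieces.
print scalar(@pieces), " pieces\n";
print "$_\n" for @pieces;
```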

Therefore, I would like to build a regex for the split function that follows either of the algorithms below:

Algo A:
split (/\,|dont split if there is [:alpha:]\,[:alpha:]|dont split if there is [:alpha:]\,|dont split if there is ,[:alpha:]/, $givenrec)

--------------OR---------------

Algo B:
Treat something as a field only if it both starts and ends with " (for strings) or contains no quotation marks at all (for integers).

I really appreciate the help in advance.

Please let me know if you require any further explanations.

The result should look like this, with the quotation marks retained on the string fields:
Code:
"a"
"xyz"
0
2
48
"abcd"
"lmno,pqrR, stv"
300
"abc"
20


Moderator's Comments:
Mod Comment Please use code tags when posting data and code samples.

Last edited by Franklin52; 02-03-2011 at 03:12 AM..
# 2  
Old 02-03-2011
Hi,

Try the following one-liner:
Code:
$ perl -MText::ParseWords -e '@fields = quotewords( ",", 1, q/"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,/); print "$_\n" foreach( @fields );'
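The same approach written as a small script may be easier to reuse. This is only a sketch, assuming the core Text::ParseWords module; the sub name split_quoted is made up here. Because the sample record ends with a comma, the last element returned can be empty or undefined, so the sketch filters such elements out (which also drops any genuinely empty fields).

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper: split one record on commas and keep the quotes.
# The second argument (1) tells quotewords to leave quotes in place.
sub split_quoted {
    my ($record) = @_;
    chomp $record;
    my @fields = quotewords(',', 1, $record);
    # Drop empty or undefined elements, e.g. the one after a trailing comma.
    return grep { defined($_) && length($_) } @fields;
}

print "$_\n" for split_quoted(
    q/"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,/
);
```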

Regards,
Birei
# 3  
Old 02-04-2011
Hi

This is great. Thank you very much.

One problem is that it takes longer to run. For example, parsing a 1 GB file with 1.4M records takes 10 minutes, versus 2 minutes with the "broken" split function.

Any suggestions for reducing the run time would be really appreciated.

Thanks once again.

---------- Post updated 02-04-11 at 12:36 AM ---------- Previous update was 02-03-11 at 12:33 PM ----------
Code:
split (/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/, $str)

Here $str is the given record. On the same file as above, this takes about 4 minutes.
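For anyone reading the split above: it breaks on a comma only when the rest of the record contains an even number of double quotes, meaning the comma is not inside a quoted string. Here is a commented sketch of an equivalent pattern using the /x modifier. The names $COMMA_OUTSIDE_QUOTES and split_outside_quotes are made up for illustration, and the sketch assumes quotes are always balanced and never escaped. Since the lookahead scans to the end of the record at every comma, it is expected to be slower than a plain split on long records.

```perl
use strict;
use warnings;

# A comma counts as a delimiter only if an even number of double quotes
# follows it, i.e. the comma is outside every quoted string.
my $COMMA_OUTSIDE_QUOTES = qr/
    ,                           # a comma ...
    (?=                         # ... followed by
        (?: [^"]* " [^"]* " )*  # zero or more complete quoted pairs
        [^"]* \z                # and no further quote up to the end
    )
/x;

# Hypothetical helper wrapping the split.
sub split_outside_quotes {
    my ($record) = @_;
    chomp $record;
    # split drops trailing empty fields, so the final comma adds nothing.
    return split $COMMA_OUTSIDE_QUOTES, $record;
}

print "$_\n" for split_outside_quotes(
    q/"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,/
);
```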

Thanks.

Last edited by Scott; 02-04-2011 at 01:35 PM.. Reason: Code tags
# 4  
Old 02-04-2011
Code:
$
$
$ cat f5
"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,
$
$
$ perl -lne 'while(/,*(".*?")|(\d+)/g) {print $1||$2}' f5
"a"
"xyz"
0
2
48
"abcd"
"lmno,pqrR, stv"
300
"abc"
20
$
$
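A variation on the same idea, close to Algo B in the first post, is to collect the fields by matching them instead of splitting. The sketch below is only an illustration: the sub name extract_fields is made up, it assumes quoted strings never contain an escaped double quote, and it skips empty fields. Its second alternative, [^,]+, also accepts unquoted fields that are not plain digits, such as negative numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: a field is either a double-quoted string
# or a run of characters that contains no comma.
sub extract_fields {
    my ($record) = @_;
    $record =~ s/\r?\n\z//;    # drop the line ending, if any
    # In list context, a global match returns every captured field.
    return $record =~ /("[^"]*"|[^,]+)/g;
}

print "$_\n" for extract_fields(
    q/"a","xyz",0,2,48,"abcd","lmno,pqrR, stv",300,"abc",20,/
);
```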

tyler_durden
# 5  
Old 02-08-2011
The shell can handle this easily. Why not call the system function and use the tr command directly?
Code:
tr "," "\n" < infile
