Regular Expression matching in PERL


 
# 1  
Old 02-26-2008

I am trying to read a file and capture particular lines into separate variables:

Code:
LENGTH: Some Content here

TEXT: Some Content Here

COMMENT: Some Content Here

I want to collect each kind of line (LENGTH: ..., and so on) into its own array. I'm trying to use Perl in slurp mode, but I'm having trouble getting it to work. Can someone suggest a better way?
# 2  
Old 02-26-2008
Instead of slurping the whole file into one array and then filtering out the lines, loop over the file and read one line at a time. Inside the loop you can push each line onto the array for its category, with the help of a regex.
Something like this would help you:
Code:
while (<>) {
    if (m/^\s*LENGTH:\s*/) {
        push (@length_array, $_);
    }
    elsif (m/^\s*TEXT:\s*/) {
        push (@text_array, $_);
    }
    elsif (m/^\s*COMMENT:\s*/) {
        push (@comment_array, $_);
    }
}


Last edited by Yogesh Sawant; 02-26-2008 at 03:05 AM.. Reason: added the sample code
# 3  
Old 02-26-2008
Thanks. I have done that in my PHP version of the same code, but I heard that Perl is really strong when it comes to regex, so I wanted to try something new. The actual problem looks like this:

Code:
LENGTH: ......................................................
..................................................................
..................................................................

...................................................................
..................................................................

SUBJECT: .......................................................

COMMENT: .....................................................
....................................................................

As you can see, the data that I want is not limited to one line but spans multiple lines. Do you have any suggestions on how to solve this problem?
# 4  
Old 02-26-2008
The code that I posted above won't work here, since it tests one line at a time; what you want is something like a multi-line regex.
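
To illustrate what "multi-line" means here: by default "." does not match a newline, so a pattern cannot run from one label to the next when they sit on different lines. The /s modifier changes that. A minimal sketch, where the sample string is made up for illustration and is not the poster's data:

```perl
use strict;
use warnings;

# Hypothetical sample data: a LENGTH section that spans two lines.
my $text = "LENGTH: first line\nsecond line\nSUBJECT: hello\n";

# Without /s, "." stops at a newline, so the match cannot reach SUBJECT:.
my $without_s = ($text =~ /LENGTH:(.*?)SUBJECT:/) ? 1 : 0;

# With /s, "." also matches "\n", so the capture can span lines.
my ($captured) = $text =~ /LENGTH:(.*?)SUBJECT:/s;

print "matched without /s: $without_s\n";
print "captured with /s: [$captured]\n";
```

Note that /m is a different modifier: it only changes where ^ and $ can match and does not let "." cross newlines.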
# 5  
Old 02-26-2008
Yes. Incidentally, my PHP version was almost the same as what you posted, but I was hoping there was a multi-line solution to the problem. Do you have any suggestions?
# 6  
Old 02-26-2008
Check if this works for you:
Code:
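{
    local $/;  # undefine the input record separator so one read returns the whole file
    $all_lines = <INPUT_FILE>;  # slurp the whole file into a string (INPUT_FILE must already be open)
}
# /s lets "." match newlines, so a capture can span several lines.
# \z ends the last section in the file, which has no label after it.
while ($all_lines =~ m/LENGTH:(.*?)(SUBJECT|COMMENT|\z)/gs) {
    push (@length_array, $1);
}
while ($all_lines =~ m/SUBJECT:(.*?)(LENGTH|COMMENT|\z)/gs) {
    push (@subject_array, $1);
}
while ($all_lines =~ m/COMMENT:(.*?)(LENGTH|SUBJECT|\z)/gs) {
    push (@comment_array, $1);
}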

The idea is to slurp the file into a string instead of an array, and then extract the required strings from it using regexes.
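
The same idea can also be done in a single pass over the string. This is only a sketch under assumptions: the sample text, the %sections hash and the @labels list are invented for illustration, and it assumes every label starts at the beginning of a line.

```perl
use strict;
use warnings;

# Hypothetical sample input; in practice this would be the slurped file.
my $all_lines = <<'END';
LENGTH: line one
line two

SUBJECT: a subject

COMMENT: first comment
continues here
END

my @labels   = qw(LENGTH SUBJECT COMMENT);
my $label_re = join '|', @labels;    # LENGTH|SUBJECT|COMMENT

my %sections;    # label => array ref of captured section bodies

# /m lets ^ match at the start of every line; /s lets "." cross newlines.
# The lookahead stops at the next label (or end of string) without
# consuming it, so the following iteration can still match that label.
while ($all_lines =~ /^($label_re):\s*(.*?)\s*(?=^(?:$label_re):|\z)/msg) {
    push @{ $sections{$1} }, $2;
}

for my $label (@labels) {
    print "$label: $_\n" for @{ $sections{$label} || [] };
}
```

Because the end delimiter is a lookahead and not part of the match, the order of the sections in the file does not matter, and a label that appears several times ends up with several entries in its array.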
# 7  
Old 02-26-2008
Actually, I was doing something along similar lines:

Code:
$capture[0] = "LENGTH:";
$capture[1] = "COMMENT:";
$capture[2] = "BODY:";
$capture[3] = "AVATAR:";
$capture[4] = "POST:";
$capture[5] = "SUBJECT:";
$capture[6] = "DATE:";
$capture[7] = ""; 

open(DATA, "filename.txt");
$line = <DATA>;


if($line =~ /$capture[0](.*?)$capture[1]/sgm) {
        $solution[0] = $1;
}
if($line =~ /$capture[1](.*?)$capture[2]/sgm) {
        $solution[1] = $1;
}
if($line =~ /$capture[2](.*?)$capture[3]/sgm) {
        $solution[2] = $1;
}
if($line =~ /$capture[3](.*?)$capture[4]/sgm) {
        $solution[3] = $1;
}
if($line =~ /$capture[4](.*?)$capture[5]/sgm) {
        $solution[4] = $1;
}
if($line =~ /$capture[5](.*?)$capture[6]/sgm) {
        $solution[5] = $1;
}
if($line =~ /$capture[6](.*?)$capture[7]/sgm) {
        $solution[6] = $1;
}


print trim($solution[0])."\n";
print trim($solution[1])."\n";
print trim($solution[2])."\n";
print trim($solution[3])."\n";
print trim($solution[4])."\n";
print trim($solution[5])."\n";
print trim($solution[6])."\n";

For some reason, it prints only every other section (the odd-numbered or the even-numbered ones), depending on the ordering. I can see why that is happening, but I'm not sure how to solve it... Anyway, I will try to incorporate your logic now...

EDIT: Well, it works like a charm if I embed your logic into mine :) I just changed the if into a while... Great! Thanks a lot for your help...
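
For anyone who finds this later, a likely explanation of the every-other-section symptom, assuming the whole file really was in $line: with /g, a successful match in scalar context remembers where it ended (see pos), and the next /g match on the same string starts from there. Since each pattern consumes the label that the next pattern needs to start with, every second match fails, and the failure resets the position. A small sketch with made-up data:

```perl
use strict;
use warnings;

# Hypothetical input: the whole file already in one string.
my $line = "LENGTH: 10 COMMENT: hi BODY: text AVATAR: none";

# With /g, a successful scalar match leaves pos($line) just past the
# end delimiter, and the next /g match on $line starts from there.
my @with_g;
push @with_g, ($line =~ /LENGTH:(.*?)COMMENT:/sg ? $1 : undef);
push @with_g, ($line =~ /COMMENT:(.*?)BODY:/sg   ? $1 : undef);  # COMMENT: was already consumed
push @with_g, ($line =~ /BODY:(.*?)AVATAR:/sg    ? $1 : undef);  # the failure above reset pos

pos($line) = undef;    # start the second run from a clean state

# Without /g, every match starts at the beginning of the string.
my @without_g;
push @without_g, ($line =~ /LENGTH:(.*?)COMMENT:/s ? $1 : undef);
push @without_g, ($line =~ /COMMENT:(.*?)BODY:/s   ? $1 : undef);
push @without_g, ($line =~ /BODY:(.*?)AVATAR:/s    ? $1 : undef);

for my $run (\@with_g, \@without_g) {
    print join(", ", map { defined $_ ? "[$_]" : "(no match)" } @$run), "\n";
}
```

Changing each if into a while works too, because the loop keeps matching until it fails, and that failure resets the position before the next pattern runs.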