The UNIX and Linux Forums  
#1 (permalink)
02-26-2008
Legend986
Registered User
Join Date: Sep 2007
Posts: 171
Regular Expression matching in PERL

I am trying to read a file and capture particular lines into different strings:

Code:
LENGTH: Some Content here

TEXT: Some Content Here

COMMENT: Some Content Here
I want to be able to get (LENGTH: .... ) into one array and so on... I'm trying to use Perl in slurp mode but for some reason I'm having trouble. Can someone suggest a better way?

#2 (permalink)
02-26-2008
Yogesh Sawant, Forum Staff
Part Time Moderator and Full Time Dad
Join Date: Sep 2006
Location: Rossem, Tazenda
Posts: 1,086
Instead of slurping the whole file into one array and then filtering out the lines, run a loop over the file and read one line at a time. Here you can put the lines in their respective arrays by category, with the help of a regex.
Something like this would help you:
Code:
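my (@length_array, @text_array, @comment_array);
while (<>) {    # read the input one line at a time into $_
    if (m/^\s*LENGTH:\s*/) {
        push (@length_array, $_);     # keep the whole LENGTH line
    }
    elsif (m/^\s*TEXT:\s*/) {
        push (@text_array, $_);       # keep the whole TEXT line
    }
    elsif (m/^\s*COMMENT:\s*/) {
        push (@comment_array, $_);    # keep the whole COMMENT line
    }
}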

Last edited by Yogesh Sawant; 02-26-2008 at 03:05 AM. Reason: added the sample code

#3 (permalink)
02-26-2008
Legend986
Registered User
Join Date: Sep 2007
Posts: 171
Thanks. Well, I have done that in my PHP version of the same code, but I heard that Perl is really strong when it comes to regex, so I wanted to try out something new. Actually, the problem is something like this:

Code:
LENGTH: ......................................................
..................................................................
..................................................................

...................................................................
..................................................................

SUBJECT: .......................................................

COMMENT: .....................................................
....................................................................
As you can observe, the data that I want is not limited to one line but rather spans multiple lines. Do you have any suggestion on how to solve this problem?

#4 (permalink)
02-26-2008
Yogesh Sawant, Forum Staff
Part Time Moderator and Full Time Dad
Join Date: Sep 2006
Location: Rossem, Tazenda
Posts: 1,086
The code that I posted above won't work as it stands, since it only captures single lines and what you want is something like a multi-line regex.
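For comparison, the line-at-a-time loop can also be adapted to multi-line sections by remembering which keyword was seen last. This is only a minimal sketch: it assumes every keyword starts its own line, and the sample data, `%sections` and `$current` are made up for illustration.

Code:
```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the real file.
my @lines = split /^/, <<'END';
LENGTH: first line
continues here

third line
SUBJECT: a subject
COMMENT: a comment
more comment
END

my %sections;   # keyword => list of captured blocks
my $current;    # keyword of the section currently being read, if any

for my $line (@lines) {
    if ($line =~ /^\s*(LENGTH|SUBJECT|COMMENT):\s*(.*)/s) {
        $current = $1;
        push @{ $sections{$current} }, $2;   # a keyword line starts a new block
    }
    elsif (defined $current) {
        $sections{$current}[-1] .= $line;    # any other line continues the block
    }
}

print "$_ => $sections{$_}[0]" for sort keys %sections;
```

The whole file never has to be held in memory this way, which may matter for large inputs.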

#5 (permalink)
02-26-2008
Legend986
Registered User
Join Date: Sep 2007
Posts: 171
Yes. Incidentally, my PHP version was very similar to what you posted, but I was just hoping there was a multi-line solution to the problem. Do you have any suggestions please?

#6 (permalink)
02-26-2008
Yogesh Sawant, Forum Staff
Part Time Moderator and Full Time Dad
Join Date: Sep 2006
Location: Rossem, Tazenda
Posts: 1,086
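Check if this works for you:
Code:
# INPUT_FILE is assumed to be a filehandle already opened for reading
my ($all_lines, @length_array, @subject_array, @comment_array);
{
    local $/;  # undefine the input record separator (slurp mode)
    $all_lines = <INPUT_FILE>;  # slurp the whole file into one string
}
# /s lets . match newlines, so a section can span several lines.
# The lookahead stops at the next keyword (or at end of input, for the
# last section in the file) without consuming it.
while ($all_lines =~ m/LENGTH:(.*?)(?=SUBJECT|COMMENT|\z)/gs) {
    push (@length_array, $1);
}
while ($all_lines =~ m/SUBJECT:(.*?)(?=LENGTH|COMMENT|\z)/gs) {
    push (@subject_array, $1);
}
while ($all_lines =~ m/COMMENT:(.*?)(?=LENGTH|SUBJECT|\z)/gs) {
    push (@comment_array, $1);
}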
The idea is to slurp the file into a single string instead of an array, and then pull the required strings out of it using a regex.
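The same idea can be folded into a single pass that collects every section into a hash of arrays. This is a rough sketch rather than a drop-in replacement: the sample string stands in for the slurped file, `$keywords` and `%found` are names invented here, and it assumes each keyword begins a line.

Code:
```perl
use strict;
use warnings;

# Hypothetical sample data; in practice this would be the slurped file.
my $all_lines = <<'END';
LENGTH: line one
line two

SUBJECT: a subject

COMMENT: first comment
second line
END

my $keywords = qr/LENGTH|SUBJECT|COMMENT/;
my %found;   # keyword => list of section bodies

# /m makes ^ match at every line start, /s lets . cross newlines, and the
# lookahead ends a section at the next keyword line or at end of input.
while ($all_lines =~ /^($keywords):\s*(.*?)(?=^(?:$keywords):|\z)/msg) {
    my ($name, $body) = ($1, $2);
    $body =~ s/\s+\z//;               # trim trailing whitespace
    push @{ $found{$name} }, $body;
}

print "$_: $found{$_}[0]\n" for sort keys %found;
```

Because the lookahead does not consume the next keyword, one loop can pick up consecutive sections without skipping any.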

#7 (permalink)
02-26-2008
Legend986
Registered User
Join Date: Sep 2007
Posts: 171
Actually I was doing something along similar lines:

Code:
$capture[0] = "LENGTH:";
$capture[1] = "COMMENT:";
$capture[2] = "BODY:";
$capture[3] = "AVATAR:";
$capture[4] = "POST:";
$capture[5] = "SUBJECT:";
$capture[6] = "DATE:";
$capture[7] = ""; 

open(DATA, "filename.txt");
$line = <DATA>;


if($line =~ /$capture[0](.*?)$capture[1]/sgm) {
        $solution[0] = $1;
}
if($line =~ /$capture[1](.*?)$capture[2]/sgm) {
        $solution[1] = $1;
}
if($line =~ /$capture[2](.*?)$capture[3]/sgm) {
        $solution[2] = $1;
}
if($line =~ /$capture[3](.*?)$capture[4]/sgm) {
        $solution[3] = $1;
}
if($line =~ /$capture[4](.*?)$capture[5]/sgm) {
        $solution[4] = $1;
}
if($line =~ /$capture[5](.*?)$capture[6]/sgm) {
        $solution[5] = $1;
}
if($line =~ /$capture[6](.*?)$capture[7]/sgm) {
        $solution[6] = $1;
}


print trim($solution[0])."\n";
print trim($solution[1])."\n";
print trim($solution[2])."\n";
print trim($solution[3])."\n";
print trim($solution[4])."\n";
print trim($solution[5])."\n";
print trim($solution[6])."\n";
For some reason, it prints only the odd-numbered lines or only the even-numbered lines, depending on the ordering. Well, I see why that is happening but I'm not sure how to solve it... Anyways, I will try to incorporate your logic now...

EDIT: Well, it works like a charm if I embed your logic into mine. I just changed the if into a while... Great! Thanks a lot for your help...
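One likely explanation for the every-other-entry behaviour, assuming the whole file really was in `$line`: a `/g` match in scalar context resumes from `pos()` of the previous match on the same string, and each match above also consumes the next keyword, so the following `if` cannot find it and fails (which resets `pos()`). A `while` loop runs until the match fails, so every pattern starts again from the beginning. The sketch below uses a made-up one-line string and made-up markers to show the effect.

Code:
```perl
use strict;
use warnings;

# Hypothetical one-line stand-in for the slurped file.
my $text  = "A: one B: two C: three D: four";
my @pairs = (['A:', 'B:'], ['B:', 'C:'], ['C:', 'D:']);
my (@with_g, @without_g, @lookahead);

for my $pair (@pairs) {
    my ($start, $end) = @$pair;
    # Scalar-context /g resumes at pos($text); the match also consumes $end.
    push @with_g, ($text =~ /\Q$start\E(.*?)\Q$end\E/sg) ? $1 : undef;
}

for my $pair (@pairs) {
    my ($start, $end) = @$pair;
    # Without /g every match starts at the beginning of the string.
    push @without_g, ($text =~ /\Q$start\E(.*?)\Q$end\E/s) ? $1 : undef;
}

pos($text) = undef;   # forget where the earlier /g matches stopped
for my $pair (@pairs) {
    my ($start, $end) = @$pair;
    # A lookahead leaves $end unconsumed, so the next /g match can find it.
    push @lookahead, ($text =~ /\Q$start\E(.*?)(?=\Q$end\E)/sg) ? $1 : undef;
}

for my $row (['/g', \@with_g], ['no /g', \@without_g], ['lookahead', \@lookahead]) {
    my ($label, $list) = @$row;
    print "$label: ", join('|', map { defined $_ ? $_ : '(no match)' } @$list), "\n";
}
```

With `/g` and consuming terminators the middle pair is missed, while dropping `/g` or switching to a lookahead captures all three.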
The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.