Regular Expression matching in PERL Post: 302320300

Posted by Feliix1956 in Programming on Wednesday 27th of May 2009 03:16:25 PM

I know it's quite late to reply, but this is how I would do what is described here:

Code:

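use strict;
use warnings;

# open the input file, read only
open(DATA, "<", "filename.txt") or die "Cannot open filename.txt: $!";

# open one output file per section type
open(SUBJECT, ">", "subject.txt") or die "Cannot open subject.txt: $!";
open(COMMENT, ">", "comment.txt") or die "Cannot open comment.txt: $!";
open(LENGTH,  ">", "length.txt")  or die "Cannot open length.txt: $!";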

my $filetoprint = "";

#start a run through the file
while(<DATA>)
{
 #grab next line
 my $line = $_;
 # trim line breaks from $line and return it to the variable
 chomp($line);

 # Check start of line
 if ($line =~ m/^SUBJECT(.+)/)
 {
  # set variable indicator to Subject
  $filetoprint = "Subject";
  # remove first word from $line by passing the matched portion back into it
  $line = $1;
 }

 # Check start of line
 if ($line =~ m/^LENGTH(.+)/)
 {
  # set variable indicator to Length
  $filetoprint = "Length";
  # remove first word from $line by passing the matched portion back into it
  $line = $1;
 }


 # Check start of line
 if ($line =~ m/^COMMENT(.+)/)
 {
  # set variable indicator to Comment
  $filetoprint = "Comment";
  # remove first word from $line by passing the matched portion back into it
  $line = $1;
 }

 # if a marker has been matched (on this line or an earlier one), print the line to the appropriate file
 if ($filetoprint eq "Subject") {print SUBJECT "$line\n";}
 if ($filetoprint eq "Comment") {print COMMENT "$line\n";}
 if ($filetoprint eq "Length")  {print LENGTH  "$line\n";}

}

close DATA;
close SUBJECT;
close COMMENT;
close LENGTH;

Hope this helps anyone with a similar problem. You can also add a "terminating" string: write a regular expression match for the desired character or string, print anything on the line leading up to the match into the output file so it isn't lost, and then set $filetoprint back to "".
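
Here is a rough sketch of that terminator idea, not a drop-in replacement for the script above. The marker END is purely an assumption for illustration (the original question does not name one), and the sub split_sections is a made-up helper that collects lines into arrays instead of writing files so it can be tried without any input file.

```perl
use strict;
use warnings;

# Hypothetical end-of-section marker; pick whatever your data really uses.
my $terminator = qr/\bEND\b/;

# Sort lines into sections, closing the current section at the terminator.
sub split_sections {
    my @lines = @_;
    my %out = (Subject => [], Comment => [], Length => []);
    my $filetoprint = "";

    for my $line (@lines) {
        if    ($line =~ m/^SUBJECT(.+)/) { $filetoprint = "Subject"; $line = $1; }
        elsif ($line =~ m/^LENGTH(.+)/)  { $filetoprint = "Length";  $line = $1; }
        elsif ($line =~ m/^COMMENT(.+)/) { $filetoprint = "Comment"; $line = $1; }

        if ($filetoprint ne "" && $line =~ m/^(.*?)$terminator/) {
            # keep the text before the terminator so it is not lost
            push @{ $out{$filetoprint} }, $1 if length $1;
            # stop collecting until the next marker line
            $filetoprint = "";
            next;
        }
        push @{ $out{$filetoprint} }, $line if $filetoprint ne "";
    }
    return \%out;
}

my $sections = split_sections(
    "SUBJECT: hello",
    "COMMENT: first line",
    "second line END",
    "ignored line",
);
print "$_\n" for @{ $sections->{Comment} };
```

Note that, as in the script above, the capture (.+) keeps everything after the keyword, so a leading ":" or space stays in the stored line.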

To distinguish one block from another, you could add a counter variable that you increase by 1 each time you match a new chunk indicator (for example, a subject line), and then add that number to the beginning of each line in the output file.
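
A minimal sketch of the counter idea follows. It assumes a SUBJECT line starts a new block, and it returns the numbered lines in one list (with the section name added) rather than printing to separate files. The sub name number_blocks is invented for this example.

```perl
use strict;
use warnings;

# Prefix each kept line with the number of the block it belongs to.
# Assumption: every SUBJECT line marks the start of a new block.
sub number_blocks {
    my @lines = @_;
    my $block = 0;
    my $filetoprint = "";
    my @out;

    for my $line (@lines) {
        if ($line =~ m/^SUBJECT(.+)/) {
            $block++;                 # new chunk indicator seen
            $filetoprint = "Subject";
            $line = $1;
        }
        elsif ($line =~ m/^COMMENT(.+)/) {
            $filetoprint = "Comment";
            $line = $1;
        }
        # block number first, then the section name and the text
        push @out, "$block $filetoprint$line" if $filetoprint ne "";
    }
    return @out;
}

print "$_\n" for number_blocks(
    "SUBJECT: first",
    "COMMENT: a",
    "SUBJECT: second",
    "COMMENT: b",
);
```

In the real script you would put the same "$block " prefix in front of "$line\n" in each of the print statements.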

A more advanced version might store the data in an array of hashes: index the array by a counter that increments while reading the file, and store the data from each line under the hash key corresponding to the data type. E.g., roughly:

Code:

if ($filetoprint eq "detail")
{
 # append the line to the "detail" element of the current hash in the array
 $arrayofhashes[$i]{detail} .= "$line\n";
}
# ...and so on for the other data types

Then you can count the array elements and print them out in whatever format you want for webmail or forum software.
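
To show how the array of hashes could fit together end to end, here is a small sketch. It assumes a SUBJECT line opens a new block and only handles SUBJECT and COMMENT. The sub collect_blocks and the lowercase hash keys are names picked for the example, not anything from the original question.

```perl
use strict;
use warnings;

# Collect each block into one hash; a SUBJECT line starts a new block (assumption).
sub collect_blocks {
    my @lines = @_;
    my @arrayofhashes;
    my $i = -1;               # index of the current block, -1 until the first SUBJECT
    my $filetoprint = "";

    for my $line (@lines) {
        if ($line =~ m/^SUBJECT(.+)/) {
            $i++;
            $filetoprint = "subject";
            $line = $1;
        }
        elsif ($line =~ m/^COMMENT(.+)/) {
            $filetoprint = "comment";
            $line = $1;
        }
        # skip anything that appears before the first SUBJECT line
        next if $i < 0 || $filetoprint eq "";
        # append the line to the matching element of the current hash
        $arrayofhashes[$i]{$filetoprint} .= "$line\n";
    }
    return @arrayofhashes;
}

my @blocks = collect_blocks(
    "SUBJECT: first",
    "COMMENT: a",
    "more of a",
    "SUBJECT: second",
);

# count the array, then print every stored element of every block
print "Found ", scalar(@blocks), " blocks\n";
for my $n (0 .. $#blocks) {
    for my $type (sort keys %{ $blocks[$n] }) {
        print "block ", $n + 1, " $type: $blocks[$n]{$type}";
    }
}
```

The printing loop at the end is where you would swap in whatever layout your webmail or forum software expects.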

Feliix1956
