Top Forums Programming Regular Expression matching in PERL Post 302320300 by Feliix1956 on Wednesday 27th of May 2009 03:16:25 PM
I know it's quite late to reply, but this is how I would do what is described here:
Code:
use strict;
use warnings;

# open the input file read-only and the three output files for writing
open(my $in,         '<', 'filename.txt') or die "Cannot open filename.txt: $!";
open(my $subject_fh, '>', 'subject.txt')  or die "Cannot open subject.txt: $!";
open(my $comment_fh, '>', 'comment.txt')  or die "Cannot open comment.txt: $!";
open(my $length_fh,  '>', 'length.txt')   or die "Cannot open length.txt: $!";

# which output file the current block belongs to ("" = no keyword seen yet)
my $filetoprint = "";

# run through the file one line at a time
while (my $line = <$in>)
{
 # trim the line break from $line
 chomp($line);

 # check the start of the line for a block keyword
 if ($line =~ m/^SUBJECT(.+)/)
 {
  # set variable indicator to Subject
  $filetoprint = "Subject";
  # remove the keyword from $line by assigning the captured remainder back to it
  $line = $1;
 }
 elsif ($line =~ m/^LENGTH(.+)/)
 {
  # set variable indicator to Length
  $filetoprint = "Length";
  # remove the keyword from $line by assigning the captured remainder back to it
  $line = $1;
 }
 elsif ($line =~ m/^COMMENT(.+)/)
 {
  # set variable indicator to Comment
  $filetoprint = "Comment";
  # remove the keyword from $line by assigning the captured remainder back to it
  $line = $1;
 }

 # if a keyword has matched (on this line or an earlier one), print to the appropriate file
 if ($filetoprint eq "Subject") {print $subject_fh "$line\n";}
 if ($filetoprint eq "Comment") {print $comment_fh "$line\n";}
 if ($filetoprint eq "Length")  {print $length_fh  "$line\n";}

}

close $in;
close $subject_fh;
close $comment_fh;
close $length_fh;

Hope this helps anyone with a similar problem. You can also add a "terminating" string: write a regular expression match for the desired character or string, print anything on the line leading up to the match into the output file so it isn't lost, then set $filetoprint back to "".
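A rough sketch of the terminating-string idea follows. The terminator "END" and the sub name collect_until_terminator are made up for illustration, and lines are collected in memory rather than written to files so the example runs without any input file:

```perl
use strict;
use warnings;

# Hypothetical terminator; pick whatever string ends a block in your data.
my $TERMINATOR = qr/END/;

# Collect lines per block type until the terminator is seen.
sub collect_until_terminator {
    my @lines = @_;
    my %out = (Subject => [], Comment => [], Length => []);
    my $filetoprint = "";

    for my $line (@lines) {
        chomp(my $text = $line);

        # Same keyword checks as in the script above.
        if    ($text =~ m/^SUBJECT(.+)/) { $filetoprint = "Subject"; $text = $1; }
        elsif ($text =~ m/^LENGTH(.+)/)  { $filetoprint = "Length";  $text = $1; }
        elsif ($text =~ m/^COMMENT(.+)/) { $filetoprint = "Comment"; $text = $1; }

        # Skip lines that do not belong to any block.
        next if $filetoprint eq "";

        if ($text =~ m/^(.*?)$TERMINATOR/) {
            # Keep whatever preceded the terminator so it is not lost,
            # then stop collecting until the next keyword.
            push @{ $out{$filetoprint} }, $1 if length $1;
            $filetoprint = "";
            next;
        }
        push @{ $out{$filetoprint} }, $text;
    }
    return \%out;
}

my $result = collect_until_terminator(
    "SUBJECT: hello",
    "more subject END",
    "ignored line",
    "COMMENT: first",
);
print "Subject:$_\n" for @{ $result->{Subject} };
print "Comment:$_\n" for @{ $result->{Comment} };
```

Note that the capture keeps everything after the keyword, including any separator such as ": ", exactly as the original regex does.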

To tell one block from another, you could add a counter variable that you increase by 1 each time you match a new chunk indicator (for example, a subject line), then add that number to the beginning of each line in the output file.
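Something like this could do the block numbering. It is only a sketch: it assumes a SUBJECT line always starts a new block, the sub name number_blocks is invented here, and the tab-separated output format is just one possible choice:

```perl
use strict;
use warnings;

# Prefix every collected line with the number of the block it belongs to.
# Assumption: a SUBJECT line marks the start of a new block.
sub number_blocks {
    my @lines = @_;
    my $block = 0;
    my $filetoprint = "";
    my @out;

    for my $line (@lines) {
        chomp(my $text = $line);

        if ($text =~ m/^SUBJECT(.+)/) {
            $block++;                    # new chunk indicator: bump the counter
            $filetoprint = "Subject";
            $text = $1;
        }
        elsif ($text =~ m/^COMMENT(.+)/) {
            $filetoprint = "Comment";
            $text = $1;
        }

        # Nothing to emit before the first keyword.
        next if $filetoprint eq "";
        push @out, "$block\t$filetoprint\t$text";
    }
    return @out;
}

my @numbered = number_blocks(
    "SUBJECT first",
    "COMMENT a note",
    "SUBJECT second",
);
print "$_\n" for @numbered;
```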

An advanced version might store the data in an array of hashes: index the array by the counter that increments while reading the file, and store the data from each line under the hash key corresponding to the data type. E.g., in pseudocode:
Code:
if ($filetoprint eq "detail")
{
 # append the line to the "detail" element of the current hash in the array
 $arrayofhashes[$i]{detail} .= "$line\n";
}
# etc. for the other data types

Then you can count the array elements and print them out in the format you want for webmail or forum software.
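Here is how the array-of-hashes version might look when put together. Treat it as a sketch under assumptions: the sub name parse_records and the lower-case hash keys are invented for the example, a SUBJECT line is assumed to start each new record, and the input is passed in as a list instead of being read from a file:

```perl
use strict;
use warnings;

# Build one hash per record; each hash key holds the text for one data type.
sub parse_records {
    my @lines = @_;
    my @arrayofhashes;
    my $i = -1;                # index of the current record, -1 = none yet
    my $filetoprint = "";

    for my $line (@lines) {
        chomp(my $text = $line);

        if    ($text =~ m/^SUBJECT(.+)/) { $i++; $filetoprint = "subject"; $text = $1; }
        elsif ($text =~ m/^LENGTH(.+)/)  { $filetoprint = "length";  $text = $1; }
        elsif ($text =~ m/^COMMENT(.+)/) { $filetoprint = "comment"; $text = $1; }

        # Ignore anything that appears before the first SUBJECT line.
        next if $i < 0;

        # Append the line to the matching element of the current hash.
        $arrayofhashes[$i]{$filetoprint} .= "$text\n";
    }
    return @arrayofhashes;
}

my @records = parse_records(
    "SUBJECT one",
    "COMMENT hi",
    "still comment",
    "SUBJECT two",
    "LENGTH 42",
);

# Count the records, then print each field of each record.
printf "%d records\n", scalar @records;
for my $n (0 .. $#records) {
    for my $field (sort keys %{ $records[$n] }) {
        print "[$n] $field:$records[$n]{$field}";
    }
}
```

Keeping everything in memory like this makes it easy to reorder or reformat the output afterwards, at the cost of holding the whole file in RAM.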
 
