Sponsored Content
Top Forums Programming Regular Expression matching in PERL Post 302320300 by Feliix1956 on Wednesday 27th of May 2009 03:16:25 PM
Old 05-27-2009
I know its quite late to reply but this is how I would do what is described here:
Code:
#open file, read only
open(DATA, "<filename.txt");

open(SUBJECT, ">subject.txt");
open(COMMENT, ">comment.txt");
open(LENGTH, ">length.txt");

my $filetoprint = "";

#start a run through the file
while(<DATA>)
{
 #grab next line
 my $line = $_;
 # trim line breaks from $line and return it to the variable
 chomp($line);

 # Check start of line
 if ($line =~ m/^SUBJECT(.+)/)
 {
  # set variable indicator to Subject
  $filetoprint = "Subject";
  # remove first word from $line by passing the matched portion back into it
  $line "$1";
 }

 # Check start of line
 if ($line =~ m/^LENGTH(.+)/)
 {
  # set variable indicator to Length
  $filetoprint = "Length";
  # remove first word from $line by passing the matched portion back into it
  $line "$1";
 }


 # Check start of line
 if ($line =~ m/^COMMENT(.+)/)
 {
  # set variable indicator to Comment
  $filetoprint = "Comment";
  # remove first word from $line by passing the matched portion back into it
  $line "$1";
 }

# if there has been a previous match (this line or any following print out to the appropriate file
 if ($filetoprint eq "Subject") {print SUBJECT "$line\n";}
 if ($filetoprint eq "Comment") {print COMMENT "$line\n";}
 if ($filetoprint eq "Length")  {print LENGTH  "$line\n";}

}

close SUBJECT;
close COMMENT;
close LENGTH ;

Hope this helps anyone with a similar problem. you can also add a "terminating" string by writing a regular expression match for the desired character/string then set $filetoprint back to "" and printing anything from the line leading up to the match into the output file so it isnt lost.

to discern between one block and another you could add a variable that you increase by 1 each time you match a new chunk indicator (like for example a subject line) then you could add the number to the beginning of the line in the output file.

An advanced version might be to store the data in an array of hashes, reference the array by the number that iterates while reading the file and store the data from each line in the named part of the hash corresponding to the data type. eg in pseudo code:
Code:
if ($filetoprint eq detail)
{
 #print the detail content to the detail element of the current hash in the array
 $arrayofhashes[$i]->[detail] = "${$arrayofhashes[$i]->[detail]}$line\n";
}
etc

then you can count the array and print out in the format you want for webmail or forum software
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Regular expression matching a new line

I have written a script to test some isdn links in my network and I am trying to format the output to be more readable. Each line of the output has a different number of digits as follows... Sitename , spid1 12345678901234 1234567890 1234567 , spid2 1234567890 1234567890 1234567 Sitename , ... (1 Reply)
Discussion started by: drheams
1 Replies

2. Shell Programming and Scripting

regular expression in perl

hi, i want to extract the sessionID from this line. QnA Session Id : here the output should be-- QnA_SessionID=128589 Thanks NT (3 Replies)
Discussion started by: namishtiwari
3 Replies

3. Shell Programming and Scripting

Help: Regular Expression for Negate Matching String

Hi guys, as per subject I am having problem with regular expressions. Example, if i got a string "javax.servlet.http.HttpServlet.service" that may occurred anywhere within a text file. How can I used the negate pattern matching of regular expression? I tried the below pattern but it... (4 Replies)
Discussion started by: DrivesMeCrazy
4 Replies

4. Shell Programming and Scripting

Regular expression matching in BASH (equivalent of =~ in Perl)

In Perl I can write a condition that evaluates a match expression like this: if ($foo =~ /^bar/) { do blah blah blah } How do I write this in shell? What I need to know is what operator do I use? The '=~' doesn't seem to fit. I've tried different operators, I browsed the man page for... (3 Replies)
Discussion started by: indiana_tas
3 Replies

5. Shell Programming and Scripting

Regular expression matching

Hi, I have a variable in my script that gets its value from a procstack output. It could be a number of any length, or it could just be a '1' with 0 or more white spaces around it. I would like to detect when this variable is just a 1 and not a 1234, for example. This is as far as I got: ... (3 Replies)
Discussion started by: tmf33uk
3 Replies

6. Shell Programming and Scripting

Matching single quote in a regular expression

I trying to match the begining of the following line in a perl script with a regular expression. $ENV{'ORACLE_HOME'} I tried this regluar expession: /\$ENV\{\'ORACLE_HOME\'\}/ Instead of match, I got a blank prompt > It seems to be a problem with the single quote. If I take it... (11 Replies)
Discussion started by: JC9672
11 Replies

7. Shell Programming and Scripting

Hidden Characters in Regular Expression Matching Perl - Perl Newbie

I am completely new to perl programming. My father is helping me learn said programming language. However, I am stuck on one of the assignments he has given me, and I can't find very much help with it via google, either because I have a tiny attention span, or because I can be very very dense. ... (4 Replies)
Discussion started by: kittyluva2
4 Replies

8. Programming

Perl: How to read from a file, do regular expression and then replace the found regular expression

Hi all, How am I read a file, find the match regular expression and overwrite to the same files. open DESTINATION_FILE, "<tmptravl.dat" or die "tmptravl.dat"; open NEW_DESTINATION_FILE, ">new_tmptravl.dat" or die "new_tmptravl.dat"; while (<DESTINATION_FILE>) { # print... (1 Reply)
Discussion started by: jessy83
1 Replies

9. UNIX for Dummies Questions & Answers

delete lines matching a regular expression

I have a very large file (over 700 million lines) that has some lines that I need to delete. An example of 5 lines of the file: HS4_80:8:2303:19153:193032 153 k80:138891 HS4_80:8:2105:5544:43174 89 k88:81949 165 k88:81949 323 0 * = 323 0 ... (6 Replies)
Discussion started by: pathunkathunk
6 Replies

10. Shell Programming and Scripting

regular expression matching whole words

Hi Consider the file this is a good line when running grep '\b(good|great|excellent)\b' file5 I expect it to match the line but it doesn't... what am i doing wrong?? (ultimately this regex will be in a awk script- just using grep to test it) Thanks, Storms (5 Replies)
Discussion started by: Storms
5 Replies
textutil::trim(n)				    Text and string utilities, macro processing 				 textutil::trim(n)

__________________________________________________________________________________________________________________________________________________

NAME
textutil::trim - Procedures to trim strings SYNOPSIS
package require Tcl 8.2 package require textutil::trim ?0.7? ::textutil::trim::trim string ?regexp? ::textutil::trim::trimleft string ?regexp? ::textutil::trim::trimright string ?regexp? ::textutil::trim::trimPrefix string prefix ::textutil::trim::trimEmptyHeading string _________________________________________________________________ DESCRIPTION
The package textutil::trim provides commands that trim strings using arbitrary regular expressions. The complete set of procedures is described below. ::textutil::trim::trim string ?regexp? Remove in string any leading and trailing substring according to the regular expression regexp and return the result as a new string. This is done for all lines in the string, that is any substring between 2 newline chars, or between the beginning of the string and a newline, or between a newline and the end of the string, or, if the string contain no newline, between the beginning and the end of the string. The regular expression regexp defaults to "[ \t]+". ::textutil::trim::trimleft string ?regexp? Remove in string any leading substring according to the regular expression regexp and return the result as a new string. This apply on any line in the string, that is any substring between 2 newline chars, or between the beginning of the string and a newline, or between a newline and the end of the string, or, if the string contain no newline, between the beginning and the end of the string. The regular expression regexp defaults to "[ \t]+". ::textutil::trim::trimright string ?regexp? Remove in string any trailing substring according to the regular expression regexp and return the result as a new string. This apply on any line in the string, that is any substring between 2 newline chars, or between the beginning of the string and a newline, or between a newline and the end of the string, or, if the string contain no newline, between the beginning and the end of the string. The regular expression regexp defaults to "[ \t]+". ::textutil::trim::trimPrefix string prefix Removes the prefix from the beginning of string and returns the result. The string is left unchanged if it doesn't have prefix at its beginning. ::textutil::trim::trimEmptyHeading string Looks for empty lines (including lines consisting of only whitespace) at the beginning of the string and removes it. The modified string is returned as the result of the command. BUGS, IDEAS, FEEDBACK This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category textutil of the Tcllib SF Trackers [http://sourceforge.net/tracker/?group_id=12883]. Please also report any ideas for enhancements you may have for either package and/or documentation. SEE ALSO
regexp(n), split(n), string(n) KEYWORDS
prefix, regular expression, string, trimming CATEGORY
Text processing textutil 0.7 textutil::trim(n)
All times are GMT -4. The time now is 02:51 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy