I know it's quite late to reply, but this is how I would do what is described here:
Code:
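# open the input file read-only and the three output files for writing
open(DATA, "<filename.txt") or die "Cannot open filename.txt: $!";
open(SUBJECT, ">subject.txt") or die "Cannot open subject.txt: $!";
open(COMMENT, ">comment.txt") or die "Cannot open comment.txt: $!";
open(LENGTH, ">length.txt") or die "Cannot open length.txt: $!";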
my $filetoprint = "";
#start a run through the file
while(<DATA>)
{
#grab next line
my $line = $_;
# trim line breaks from $line and return it to the variable
chomp($line);
# Check start of line
if ($line =~ m/^SUBJECT(.+)/)
{
# set variable indicator to Subject
$filetoprint = "Subject";
# remove the keyword from $line by assigning the captured remainder back to it
$line = $1;
}
# Check start of line
if ($line =~ m/^LENGTH(.+)/)
{
# set variable indicator to Length
$filetoprint = "Length";
# remove the keyword from $line by assigning the captured remainder back to it
$line = $1;
}
# Check start of line
if ($line =~ m/^COMMENT(.+)/)
{
# set variable indicator to Comment
$filetoprint = "Comment";
# remove the keyword from $line by assigning the captured remainder back to it
$line = $1;
}
# if there has been a match (on this line or an earlier one), print the line to the appropriate file
if ($filetoprint eq "Subject") {print SUBJECT "$line\n";}
if ($filetoprint eq "Comment") {print COMMENT "$line\n";}
if ($filetoprint eq "Length") {print LENGTH "$line\n";}
}
close SUBJECT;
close COMMENT;
close LENGTH;
close DATA;
Hope this helps anyone with a similar problem. You can also add a "terminating" string: write a regular expression match for the desired character or string, print anything on the line leading up to the match to the output file so it isn't lost, then set $filetoprint back to "".
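Here is a rough sketch of that terminating-string idea. The terminator "END", the sample lines, and the %output hash (which stands in for the three output files so the snippet runs on its own) are all my own assumptions, not part of the original data:

```perl
use strict;
use warnings;

# Hypothetical sample input; "END" is an assumed terminator word.
my @input = (
    "SUBJECT: hello",
    "more subject text END trailing junk",
    "ignored line",
    "COMMENT: first comment",
);

# Stands in for the three output files.
my %output = (Subject => [], Comment => [], Length => []);
my $filetoprint = "";

for my $raw (@input) {
    my $line = $raw;

    # a keyword at the start of a line switches the current destination
    if ($line =~ m/^(SUBJECT|LENGTH|COMMENT)(.+)/) {
        $filetoprint = ucfirst lc $1;
        $line = $2;
    }
    next if $filetoprint eq "";

    # if the terminator appears, keep the text before it and stop collecting
    if ($line =~ m/^(.*?)\bEND\b/) {
        push @{ $output{$filetoprint} }, $1 if length $1;
        $filetoprint = "";
        next;
    }
    push @{ $output{$filetoprint} }, $line;
}

for my $name (sort keys %output) {
    print "$name: $_\n" for @{ $output{$name} };
}
```

Anything after the terminator, and any line before the next keyword, is dropped. Swap the pushes for prints to the real filehandles if you want files.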
To tell one block from another, you could add a variable that you increase by 1 each time you match a new chunk indicator (for example a subject line), then add that number to the beginning of each line in the output file.
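A minimal sketch of the counter, assuming a SUBJECT line is what starts a new block. The sample input and the @numbered array (used here instead of real output files) are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample input with two blocks; each SUBJECT line starts a new one.
my @input = (
    "SUBJECT: first",
    "COMMENT: about the first",
    "SUBJECT: second",
    "COMMENT: about the second",
);

my $block = 0;          # increased by 1 on every SUBJECT line
my $filetoprint = "";
my @numbered;           # stands in for the output files

for my $raw (@input) {
    my $line = $raw;
    if ($line =~ m/^(SUBJECT|LENGTH|COMMENT)(.+)/) {
        $filetoprint = ucfirst lc $1;
        $line = $2;
        $block++ if $filetoprint eq "Subject";
    }
    next if $filetoprint eq "";

    # prefix every stored line with the number of the block it belongs to
    push @numbered, "$block $filetoprint$line";
}

print "$_\n" for @numbered;
```

With the number at the start of each line, you can match up the entries in subject.txt, comment.txt and length.txt afterwards.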
A more advanced version might store the data in an array of hashes: index the array by the number that increases while reading the file, and store the data from each line under the hash key that corresponds to the data type, e.g. in pseudo code:
Code:
if ($filetoprint eq "detail")
{
# append the line to the "detail" element of the current hash in the array
$arrayofhashes[$i]{detail} .= "$line\n";
}
etc.
Then you can count the array and print it out in the format you want for webmail or forum software.
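Putting that together, a small runnable sketch of the array-of-hashes version could look like the following. The sample input, the lowercase hash keys, and the rule that a SUBJECT line starts a new record are assumptions on my part:

```perl
use strict;
use warnings;

# Hypothetical sample input; a SUBJECT line is assumed to start a new record.
my @input = (
    "SUBJECT: first",
    "LENGTH: 10",
    "COMMENT: line one",
    "line two",
    "SUBJECT: second",
    "COMMENT: only line",
);

my @arrayofhashes;
my $i = -1;             # index of the current record, -1 until the first SUBJECT
my $filetoprint = "";

for my $raw (@input) {
    my $line = $raw;
    if ($line =~ m/^(SUBJECT|LENGTH|COMMENT)(.+)/) {
        $filetoprint = lc $1;
        $line = $2;
        $i++ if $filetoprint eq "subject";
    }
    next if $filetoprint eq "" or $i < 0;

    # append the line to the matching element of the current hash
    $arrayofhashes[$i]{$filetoprint} .= "$line\n";
}

# count the records, then print them in whatever format is needed
printf "%d records\n", scalar @arrayofhashes;
for my $n (0 .. $#arrayofhashes) {
    for my $field (qw(subject length comment)) {
        next unless defined $arrayofhashes[$n]{$field};
        print "[$n] $field", $arrayofhashes[$n]{$field};
    }
}
```

Continuation lines (like "line two" above) end up appended to whichever field was matched last, which is the same behaviour as the file-based script.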