Reading the file line by line in Perl


 
# 8  
Old 01-10-2012
I'm not sure what the header and footer look like, but I hope this lets you avoid the 'out of memory' message:
Code:
$ cat script.pl
use warnings;
use strict;

die qq[Usage: perl $0 <input-file> <output-good-file> <output-bad-file>\n] unless @ARGV == 3;

open my $bad_fh, ">", pop @ARGV or die qq[ERROR: $!\n];
open my $good_fh, ">", pop @ARGV or die qq[ERROR: $!\n];
open my $input_fh, "<", pop @ARGV or die qq[ERROR: $!\n];

my ($fields_processed, $flipflop);

while ( my $line = <$input_fh> ) {
        chomp $line;

        ## Header.
        if ( $flipflop = ( $line =~ m/\A(?i)start-of-file/ .. $line =~ m/\A(?i)start-of-fields/ ) ) {
                next if $flipflop == 1 || $flipflop =~ /E0\Z/;
                printf $good_fh qq[%s\n], $line;
                next;
        }

        ## Footer.
        if ( $fields_processed ) {
                if ( $flipflop = ( $line =~ m/\A(?i)end-of-fields/ .. eof ) ) {
                        next if $flipflop == 1;
                        printf $good_fh qq[%s\n], $line;
                }
        }

        my @f = split /\|/, $line, 25;

        if ( @f < 25 ) {
                next;
        }
        else {
                $fields_processed = 1;
        }

        if ( ( $f[21] eq "N.A.")  && ( $f[22] == 0 || $f[22] eq " ") && ( $f[23] == 0  ||  $f[23] eq " ") ) {
                printf $bad_fh qq[%s\n], $line;
        } 
        else {
                printf $good_fh qq[%s\n], $line;
        }
}

Regards,
Birei
# 9  
Old 01-10-2012
Thank you very much for your reply, birei.

The data file has a header (for example, the names of the columns) and a footer containing the number of data records and a file timestamp.

I have run your script, and it avoids the "out of memory" issue. But when extracting the header information, the good file doesn't include the "START-OF-FILE" and "START-OF-DATA" lines:

Code:
       ## Header.
        if ( $flipflop = ( $line =~ m/\A(?i)START-OF-FILE/ .. $line =~ m/\A(?i)START-OF-DATA/ ) ) {
                next if $flipflop == 1 || $flipflop =~ /E0\Z/;
                printf $good_fh qq[%s\n], $line;
                next;
        }

In the footer, the file timestamp can stay as it is, but since the number of records in the good file has changed, I need to count the records (excluding the header information) and replace the original count with the new one, i.e.:

Code:
footer Information:
DATARECORDS=3530288   --> Need to count the number of records in goodfile and put it over here
TIMEFINISHED=Mon Jan  9 19:24:03 EST 2012
END-OF-FILE

When I was using arrays, I used the logic below for this:
Code:
my $footer_len = 4;
my $datarec_line = 1;
do {
 $line = shift @whole;
 push @header, $line;
} while $line !~ /^START-OF-DATA/;

my $n = @goodlines;
$n -= grep {/^# PRODUCT/} @goodlines;
$footer[$datarec_line] =~ s/\d+/$n/;

Could you please advise on similar logic for including the header and footer information in the good file?

I would really appreciate your time on this.
# 10  
Old 01-11-2012
Quote:
the good file doesn't include the string "START-OF-FILE" and "START-OF-DATA" ...
To include those two lines, comment out this instruction:
Code:
#next if $flipflop == 1 || $flipflop =~ /E0\Z/;
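
For anyone wondering why that line drops the two marker lines: in scalar context, `..` is the flip-flop operator. It returns a sequence number starting at 1 while the range is active, and the last number in the range has "E0" appended. The following is only a small sketch to illustrate this; the sample lines in `@lines` are made up, and the marker names follow the ones mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical sample lines; only the marker names come from the thread.
my @lines = ( 'junk', 'START-OF-FILE', 'COL_A|COL_B', 'START-OF-DATA', 'row' );

my @seen;
for my $line (@lines) {
    # Scalar context makes ".." a flip-flop: it yields "" while inactive,
    # then 1, 2, ... while active, with "E0" appended on the closing line.
    my $flipflop = ( $line =~ m/\ASTART-OF-FILE/i .. $line =~ m/\ASTART-OF-DATA/i );
    push @seen, $flipflop;
    printf "%-15s => '%s'\n", $line, $flipflop;
}
```

So `$flipflop == 1` matches the opening marker line and `$flipflop =~ /E0\Z/` matches the closing one, which is why commenting out the `next` keeps both.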

Regards,
Birei
# 11  
Old 01-11-2012
Thanks a lot for your reply and for all your help.

I need to count the number of lines in the good file, excluding the header and footer, and then replace the existing number in the footer with that count.

Example:
Number of records in the good file, without header and footer: 1418125
Code:
Before: 
END-OF-DATA
DATARECORDS=3530288
TIMEFINISHED=Mon Jan  9 19:24:03 EST 2012
END-OF-FILE

After:
END-OF-DATA
DATARECORDS=1418125
TIMEFINISHED=Mon Jan  9 19:24:03 EST 2012
END-OF-FILE

Really appreciate your time and help on this.
# 12  
Old 01-11-2012
In the future, if a file is too large to read into memory, you may want to consider the Tie::File module, which lets you access the lines of a disk file through a Perl array.
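
A minimal sketch of that idea, under stated assumptions: the temporary file, its three footer lines, and the `$new_count` value are made up for illustration, and `File::Temp` is used only so the example is self-contained.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Build a small throwaway file so the sketch runs on its own.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "END-OF-DATA\nDATARECORDS=3530288\nEND-OF-FILE\n";
close $tmp_fh or die "ERROR: $!\n";

# Each array element corresponds to one line of the file, without its newline.
tie my @lines, 'Tie::File', $path or die "ERROR: $!\n";

my $new_count = 1418125;    # hypothetical count of good records
for my $line (@lines) {
    # Modifying an element rewrites the corresponding line on disk.
    $line =~ s/\ADATARECORDS=\d*/DATARECORDS=$new_count/;
}

my @result = @lines;        # copy the lines out before untying
untie @lines;
print "$_\n" for @result;
```

Tie::File reads lines on demand rather than loading the whole file, so the array interface stays usable for large files. For a single pass over millions of records, a plain `while` loop over the filehandle is probably still the simpler choice.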
# 13  
Old 01-12-2012
Script modified:
Code:
$ cat script.pl
use warnings;
use strict;

die qq[Usage: perl $0 <input-file> <output-good-file> <output-bad-file>\n] unless @ARGV == 3;

open my $bad_fh, ">", pop @ARGV or die qq[ERROR: $!\n];
open my $good_fh, ">", pop @ARGV or die qq[ERROR: $!\n];
open my $input_fh, "<", pop @ARGV or die qq[ERROR: $!\n];

my ($fields_processed, $flipflop, $good_lines);

while ( my $line = <$input_fh> ) {
        chomp $line;

        ## Header.
        if ( $flipflop = ( $line =~ m/\A(?i)start-of-file/ .. $line =~ m/\A(?i)start-of-fields/ ) ) {
#                next if $flipflop == 1 || $flipflop =~ /E0\Z/;
                printf $good_fh qq[%s\n], $line;
                next;
        }

        ## Footer.
        if ( $fields_processed ) {
                if ( $flipflop = ( $line =~ m/\A(?i)end-of-fields/ .. eof ) ) {
                        next if $flipflop == 1;
                         $line =~ s/\A(?i)(?<=datarecords=)\d*/$good_lines/;
                        printf $good_fh qq[%s\n], $line;
                }
        }

        my @f = split /\|/, $line, 25;

        if ( @f < 25 ) {
                next;
        }
        else {
                $fields_processed = 1;
        }

        if ( ( $f[21] eq "N.A.")  && ( $f[22] == 0 || $f[22] eq " ") && ( $f[23] == 0  ||  $f[23] eq " ") ) {
                printf $bad_fh qq[%s\n], $line;
        } 
        else {
                ++$good_lines;
                printf $good_fh qq[%s\n], $line;
        }
}

Regards,
Birei
# 14  
Old 01-13-2012
Thanks very much for your help, birei. That is a really nice idea.

But the substitution is not happening, i.e.:

Code:
 if ( $fields_processed ) {
                if ( $flipflop = ( $line =~ m/\A(?i)END-OF-DATA/ .. eof ) ) {
                        $line =~ s/\A(?i)(?<=DATARECORDS=)\d*/$good_lines/;
                        printf $good_fh qq[%s\n], $line;
                }
        }

Footer Information:
Code:
END-OF-DATA
DATARECORDS=3530288
TIMEFINISHED=Mon Jan  9 19:24:03 EST 2012
END-OF-FILE

I think something is wrong with the regular expression used when substituting the value.

I really appreciate your thoughts and time on this.
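
A likely cause, offered as a sketch and not as an answer confirmed in the thread: `\A` pins the match to the very start of the line, while the lookbehind `(?<=DATARECORDS=)` requires "DATARECORDS=" to appear before that same position. Both cannot hold at once, so the pattern never matches. One possible rewrite captures the label and puts it back. The value of `$good_lines` below is hypothetical, taken from the example count earlier in the thread.

```perl
use strict;
use warnings;

my $good_lines = 1418125;    # hypothetical count of good records

# Pattern as posted: \A anchors at position 0, and the lookbehind needs
# "DATARECORDS=" before position 0, so the substitution cannot match.
my $as_posted = 'DATARECORDS=3530288';
my $changed_as_posted = ( $as_posted =~ s/\A(?i)(?<=DATARECORDS=)\d*/$good_lines/ );

# One possible rewrite: capture the label, then replace only the digits.
my $fixed = 'DATARECORDS=3530288';
my $changed_fixed = ( $fixed =~ s/\A(DATARECORDS=)\d*/$1$good_lines/i );

print "as posted: $as_posted\n";
print "rewritten: $fixed\n";
```

Other footer lines such as TIMEFINISHED or END-OF-FILE do not match the rewritten pattern and pass through unchanged.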