Reading the file line by line in Perl


 
# 1  
Old 01-10-2012

Hello Everyone,

I have written a Perl script that loads the entire data file into an array and then checks the value of specific columns in each record. If the record passes the check I write it to a good file; otherwise I write it to a bad file.

The problem is that if the data file is huge, storing it in an array causes a memory utilization issue. So I think I have to read the data file line by line and check the column values as I go.

Code:
open(FILE,$file)|| die ("could not open file $file: $!");

my (@whole, @header, @footer, @goodlines, @badlines, @fields);
my $line;
$line = $_;

@whole = <FILE>;

foreach (@whole) {
    $line = $_;
    @fields = split (/\|/, $line);

    if($fields[57] eq " "  ||  $fields[57] eq " ")
    {
        push @badlines, $line;
    }
    elsif( ($fields[32] eq "N.A."  ||  $fields[32] eq " ")
        && ($fields[33] eq "N.A."  ||  $fields[33] eq " ")
        && ($fields[34] eq "N.A."  ||  $fields[34] eq " ")
        && ($fields[38] eq "N.A."  ||  $fields[38] eq " ")
        && ($fields[62] eq "N.A."  ||  $fields[62] eq " "))
    {
        push @badlines, $line;
    }
    else
    {
        push @goodlines, $line;
    }
}

open my $fh, ">", $goodfile;
print $fh @header, @goodlines, @footer;
close $fh;

open my $fh1, ">", $badfile;
print $fh1 @badlines;
close $fh1;


printf(" The New Feed file is located at --------------> '%s'\n" ,   $goodfile);
printf(" The Ignored records are located --------------> '%s'\n\n" , $badfile);

Instead of storing the entire data file in an array (in memory), could someone please advise how I can read the data file line by line so that it doesn't use much memory?

Really appreciate your thoughts and time. Thanks a lot for looking into this.
# 2  
Old 01-10-2012
Code:
while($LINE=<FILE>)
{
...
}
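
For reference, here is a minimal runnable sketch of the loop above. It assumes a lexical filehandle and three-argument `open`, which is a common idiom rather than anything this thread requires. The helper name `count_lines` and the in-memory sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: walks a filehandle one line at a time
# and returns how many lines it saw.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {   # only the current line is held in memory
        chomp $line;
        $count++;
    }
    return $count;
}

# In-memory sample so the sketch runs without a data file.
my $sample = "a|b|c\nd|e|f\n";
open(my $in, '<', \$sample) or die "could not open sample: $!";
print count_lines($in), " lines read\n";
close($in);
```

The point is that `<$fh>` in scalar context returns one line per iteration, so memory use does not depend on the size of the file as long as the loop body does not accumulate the lines somewhere else.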

# 3  
Old 01-10-2012
Hi Corona688,

Thank you very much for your quick reply...

I have tried the following, as you suggested:

Code:
open(FILE,$file)|| die ("could not open file $file: $!");

my (@whole, @header, @footer, @goodlines, @badlines, @fields);
my $line;
$line = $_;

while($line=<FILE>)
{
    $line = $_;
    @fields = split (/\|/, $line);

    if( ( $fields[20] eq "")  && ( $fields[21] == 0 || $fields[21] eq "") && ( $fields[22] == 0  ||  $fields[22] eq "") )
    {
        push @badlines, $line;
    }
    else
    {
        push @goodlines, $line;
    }
}
open my $fh, ">", $goodfile;
print $fh @header, @goodlines, @footer;
close $fh;

open my $fh1, ">", $badfile;
print $fh1 @badlines;
close $fh1;

When I try to run it:
Code:
[cfgdth987] $ perl create_feedfile_bonds_NAMR_OPTNPX.pl equity_option_namr.px.20120109 diff ignore
Out of memory!

I am still running into the memory issue. Is there any way that I can read the file line by line and then increment a counter?

I really appreciate your time and advice.
# 4  
Old 01-10-2012
Hi filter,

I think Corona's suggestion is OK, but inside the loop you are still saving each input line to an array, and those arrays are what fill up the memory. This is the piece of code responsible:
Code:
if( ( $fields[20] eq "")  && ( $fields[21] == 0 || $fields[21] eq "") && ( $fields[22] == 0  ||  $fields[22] eq "") )
{
    push @badlines, $line;
}
else
{
    push @goodlines, $line;
}

Regards,
Birei
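
One way to act on this, sketched under assumptions: write each line to its output file as soon as it has been classified, instead of pushing it onto `@goodlines` or `@badlines`. The helper `split_stream`, the callback `$is_bad` and the demo rule (an empty second field) are hypothetical and only serve to make the example runnable.

```perl
use strict;
use warnings;

# Hypothetical helper: copies each line from $in to $good or $bad right away,
# so memory use stays flat no matter how many lines the input has.
# $is_bad is a caller-supplied code reference that classifies one line.
sub split_stream {
    my ($in, $good, $bad, $is_bad) = @_;
    while (my $line = <$in>) {
        if ($is_bad->($line)) {
            print {$bad} $line;
        } else {
            print {$good} $line;
        }
    }
    return;
}

# Demo with in-memory handles; the rule used here is illustrative only.
my $input = "1|x|a\n2||b\n3|y|c\n";
my ($good_out, $bad_out) = ('', '');
open(my $in,   '<', \$input)    or die "open input: $!";
open(my $good, '>', \$good_out) or die "open good: $!";
open(my $bad,  '>', \$bad_out)  or die "open bad: $!";
split_stream($in, $good, $bad, sub { (split /\|/, $_[0])[1] eq '' });
close($_) for $in, $good, $bad;
print "good:\n$good_out", "bad:\n$bad_out";
```

With real files, the three `open` calls would take file names instead of scalar references; the loop itself stays the same.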
# 5  
Old 01-10-2012
Hi birei,

yes you are correct...Thanks to you as well.

I am now trying to run the script with the code below:

Code:
open(FILE,$file)|| die ("could not open file $file: $!");
open(OUT, ">$goodfile") or die "Can't open $goodfile";
open(OUT1, ">$badfile") or die "Can't open $badfile";

while($line=<FILE>)
{
    $line = $_;
    @fields = split (/\|/, $line);

    if( ( $fields[21] eq "N.A.")  && ( $fields[22] == 0 || $fields[22] eq " ") && ( $fields[23] == 0  ||  $fields[23] eq " ") )
    {
        print OUT1 $line;
    }
    else
    {
        print OUT $line;
    }
}
close(FILE);
close(OUT1);
close(OUT);

But there is an issue with the above script: the check on the column values does not work.

Could you please help me solve this? I appreciate your thoughts!
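
Judging only from the posted code, one likely cause is the line `$line = $_;` inside the loop. `while($line=<FILE>)` stores the line in `$line` and does not set `$_`, so that assignment overwrites the line just read before `split` sees it. A second thing to watch is that `==` compares numerically, so non-numeric strings such as "N.A." or " " are treated as 0. The sketch below shows one way to write the check. The helper names and the rule that "zero" means a blank or a numeric zero are assumptions; `looks_like_number` comes from the core module Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# True when a field is blank or a numeric zero; "N.A." is neither.
sub zero_or_blank {
    my ($value) = @_;
    return 1 if !defined $value || $value =~ /^\s*$/;
    return looks_like_number($value) && $value == 0 ? 1 : 0;
}

# Mirrors the condition from the post, using 0-based field indexes 21, 22, 23.
sub is_bad_record {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\|/, $line, -1;   # -1 keeps trailing empty fields
    return 0 if @fields < 24;             # too short to hold the checked columns
    return ($fields[21] eq 'N.A.'
        && zero_or_blank($fields[22])
        && zero_or_blank($fields[23])) ? 1 : 0;
}

# Synthetic records: 21 filler fields, then the three fields under test.
my $filler = join '|', ('x') x 21;
for my $tail ('N.A.|0|0', 'N.A.|5|0', 'ok|0|0', 'N.A.| |') {
    printf "%-10s => %s\n", $tail,
        is_bad_record("$filler|$tail\n") ? 'bad' : 'good';
}
```

Inside the read loop this would be used as `if (is_bad_record($line)) { print OUT1 $line } else { print OUT $line }`, with the `$line = $_;` assignment removed.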
# 6  
Old 01-10-2012
Can you describe in detail what you want to achieve, and provide a sample input and the expected output? That would make it easier to give more useful help.

Regards,
Birei.
# 7  
Old 01-10-2012
Sure birei.

I have a data file which contains ~3.5 million records and has a header and a footer. It has many columns, delimited by a pipe ("|").

Example:
Code:
START-OF-FILE
PROGRAMNAME=getdata
DATEFORMAT=yyyymmdd

START-OF-FIELDS
....
... (column list)

END-OF-FIELDS

TIMESTARTED=Mon Dec  9 17:35:23 EST 2011
START-OF-DATA
AAV CN 01/21/12 C10 Equity|0|43|AAV 1 C10|10.000000|Call|January 12 Calls on AAV CN|American|100.0000|20120121|110957|1000|AAV CN|00765F101|CA00765F1018|N.A.|0.040000|0.040000|N.A.|N.A.|N.A.|N.A.|0|0|CN| | |CM|EO110957201201018140000A|CAD|CA|1.073|110957|20120101|AAV CN 01/21/12 C10|N.A.|266.005|266.005|N.A.|4.320000|0.040000|390863923515|AAV| | |BBG001Q89LY8|
....
END-OF-Fields
(footer)

I need to check the values of columns 21, 22 and 23. If they match the condition below, the record is written to one file (the bad file); otherwise it is written to a different file (the good file).

Code:
if( ( $fields[21] eq "N.A.")  && ( $fields[22] == 0 || $fields[22] eq " ") && ( $fields[23] == 0  ||  $fields[23] eq " ") )

Once the files are written, I need to add the header and footer to the good file.

Since the data file is quite large, memory fills up.

Could you please help me solve this?
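
A possible single-pass approach, sketched under assumptions: copy header lines to the good file as they are read, filter only the lines between the two markers, and copy the footer lines afterwards, so nothing has to be added once the files are written. The marker strings are passed in as parameters; the demo uses `START-OF-DATA` and `END-OF-Fields` exactly as they appear in the sample above, and the real file may differ. `filter_feed` and the demo rule are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: streams $in once, copying header and footer lines to
# $good unchanged and routing each data line to $good or $bad via $is_bad.
# $start and $end are the marker lines that bracket the data section.
sub filter_feed {
    my ($in, $good, $bad, $start, $end, $is_bad) = @_;
    my $state = 'header';
    while (my $line = <$in>) {
        (my $text = $line) =~ s/\r?\n\z//;   # compare without the line ending
        if ($state eq 'header') {
            print {$good} $line;
            $state = 'data' if $text eq $start;
        } elsif ($state eq 'data' && $text eq $end) {
            print {$good} $line;
            $state = 'footer';
        } elsif ($state eq 'data') {
            print { $is_bad->($text) ? $bad : $good } $line;
        } else {
            print {$good} $line;             # footer lines
        }
    }
    return;
}

# Demo on an in-memory feed; the rule (second field is N.A.) is illustrative.
my $feed = join "\n", 'START-OF-FILE', 'START-OF-DATA',
    'A|1|x', 'B|N.A.|y', 'END-OF-Fields', 'footer', '';
my ($good_out, $bad_out) = ('', '');
open(my $in,   '<', \$feed)     or die "open feed: $!";
open(my $good, '>', \$good_out) or die "open good: $!";
open(my $bad,  '>', \$bad_out)  or die "open bad: $!";
filter_feed($in, $good, $bad, 'START-OF-DATA', 'END-OF-Fields',
    sub { (split /\|/, $_[0], -1)[1] eq 'N.A.' });
close($_) for $in, $good, $bad;
print "good:\n$good_out", "bad:\n$bad_out";
```

Because every line is written out immediately, only one line is held in memory at a time, which should matter for a file of a few million records.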