Copying the Header & footer Information to the Outfile.


 
# 1  
Old 08-25-2011

Hi

I am writing a Perl script which checks specific column values in a file and writes the lines to an OUT file.

The feed file has header information and footer information.

The header information is around 107 lines, i.e. it starts with:
Code:
START-OF-FILE
....... 
so on ....

TIMESTARTED=Thu Aug 25 01:03:50 BST 2011
START-OF-DATA
# PRODUCT=Corp/Pfd

After the last header line, "# PRODUCT=Corp/Pfd", the actual data starts.

The footer information is 4 lines, i.e.:
Code:
END-OF-DATA
DATARECORDS=1275983
TIMEFINISHED=Thu Aug 25 02:27:02 BST 2011
END-OF-FILE

My Perl script is below:
Code:
#!/usr/bin/perl

my $file = 'file';
open(FILE, '<', $file) || die("could not open file $file: $!");
open(OUT1, '>', 'badfile') || die("could not open badfile: $!");
open(OUT2, '>', 'goodfile') || die("could not open goodfile: $!");
my @fields;
my $line;

while (<FILE>) {

$line = $_;
@fields = split (/\|/, $line);
# 1) Here, before checking the column values, I need to write the
#    HEADER and FOOTER information to the goodfile.

if( $fields[32] eq "N.A."  && $fields[33] eq "N.A." && $fields[34] eq "N.A." && $fields[38] eq "N.A." && ($fields[62] eq "N.A." ||  $fields[62] eq " "))
{
print OUT1 $line;   # badfile
}

else
{
    print OUT2 $line;   # goodfile
}
}
close FILE;
close OUT1;
close OUT2;

1) Before checking the column values, I need to write the HEADER and FOOTER information to the goodfile.

2) I also need to count the number of records in the goodfile and then change the FOOTER information to:
Code:
END-OF-DATA
DATARECORDS=1275983   --> New Rowcount from the Goodfile
TIMEFINISHED=Thu Aug 25 02:27:02 BST 2011
END-OF-FILE

Could anyone please help me solve this? Any help would be really appreciated.
# 2  
Old 08-25-2011
Hi!

The simplest way is to read the whole file into an array, split it into four parts, process them, and write the result to the output file. Because it's really simple and quick, perhaps you should do it that way. There are plenty of other things in the world to do, improve, or learn.

But... There is always a "but", you know. :-) It is definitely not the "Unix way". Why?

Well. From the famous "The UNIX Time-Sharing System": "... there have always been fairly severe size constraints on the system and its software. Given the partially antagonistic desires for reasonable efficiency and expressive power, the size constraint has encouraged not only economy, but also a certain elegance of design."

You wouldn't believe me if I told you what resources the first Unix hosts had. So I won't, but the word "severe" speaks for itself. It was in those times that the famous "Unix philosophy" was born.

Doug McIlroy summarized it in this way: "This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

You can read more here - in the free and good book The Art of Unix Programming.

And what does all this have to do with your question? Just see:

1. You need the header:
Code:
sed -n '/^START-OF-FILE/,/^START-OF-DATA/p' INPUTFILE >/tmp/header.$$

2. The footer:
Code:
sed -n '/^END-OF-DATA/,$p' INPUTFILE >/tmp/footer.$$

3. You can process your file with your Perl script, but print the name of your good file at the end of the script:
Code:
goodfile=$(perl process.pl)

Or you can print both names, the good one and the bad one, and then split them. Or you can pass the name to the script as an argument. You just need to know this name.

4. What is the number of records (lines) in the goodfile?
Code:
goodrecs=$(wc -l < "$goodfile")   # redirect the file so wc prints only the count, not the file name

5. The new footer:
Code:
sed 's/^DATARECORDS=.*$/DATARECORDS='"$goodrecs"'/' /tmp/footer.$$ >/tmp/newfooter.$$

6. Now put the pieces together:
Code:
cat /tmp/header.$$ "$goodfile" /tmp/newfooter.$$ >OUTPUTFILE

7. And don't forget to clean up after yourself:
Code:
rm /tmp/header.$$ /tmp/*footer.$$ # maybe the goodfile too

The beauty of shell programming is that you can do it incrementally, in small pieces. You can test and debug your steps separately. And then, when you get the result, you just combine your steps into a small, elegant, and truly Unix program - a shell script.
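
As a rough illustration, the steps above might be combined into one script like the sketch below. It is not a definitive implementation: the sample input, the temporary file names, and the rule that a record is "good" when it contains a digit are all assumptions made for the example, and grep stands in for the Perl filter of step 3.

```shell
#!/bin/sh
# Sketch only: header + good records + updated footer.
# Assumptions (not from the thread): a record is "good" when it contains
# a digit, and every file name below is a placeholder.

workdir=$(mktemp -d)
input="$workdir/INPUTFILE"
output="$workdir/OUTPUTFILE"

# A tiny stand-in for the real feed file.
cat >"$input" <<'EOF'
START-OF-FILE
TIMESTARTED=Thu Aug 25 01:03:50 BST 2011
START-OF-DATA
a
1
b
3
END-OF-DATA
DATARECORDS=4
TIMEFINISHED=Thu Aug 25 02:27:02 BST 2011
END-OF-FILE
EOF

# Steps 1 and 2: save the header and the footer.
sed -n '/^START-OF-FILE/,/^START-OF-DATA/p' "$input" >"$workdir/header"
sed -n '/^END-OF-DATA/,$p' "$input" >"$workdir/footer"

# Step 3: take the lines strictly between the two markers and keep the good ones.
sed -n '/^START-OF-DATA/,/^END-OF-DATA/p' "$input" | sed '1d;$d' >"$workdir/data"
grep '[0-9]' "$workdir/data" >"$workdir/goodfile"

# Step 4: count the good records (tr removes the padding some wc versions add).
goodrecs=$(wc -l <"$workdir/goodfile" | tr -d ' ')

# Step 5: write the new count into the footer.
sed 's/^DATARECORDS=.*$/DATARECORDS='"$goodrecs"'/' "$workdir/footer" >"$workdir/newfooter"

# Step 6: assemble the output file.
cat "$workdir/header" "$workdir/goodfile" "$workdir/newfooter" >"$output"

# Step 7: remove the intermediate files.
rm "$workdir/header" "$workdir/footer" "$workdir/newfooter" "$workdir/data"

cat "$output"
```

With this sample input the two numeric lines are kept, so the footer in the output reads DATARECORDS=2.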

Regards,
Andrey (yazu)

===

Well. Sorry for my English. This post was really my English exercise. :-)

Last edited by yazu; 08-25-2011 at 11:29 PM..
# 3  
Old 08-25-2011
@Andrey (yazu), thanks for the link to the book. I appreciate it.

- GP
# 4  
Old 08-26-2011
Hi yazu,

I really appreciate your post. Thanks a lot for your answer and thoughts.

Quote:
The simplest way is to read the whole file into an array, split it into four parts, process them, and write the result to the output file. Because it's really simple and quick, perhaps you should do it that way. There are plenty of other things in the world to do, improve, or learn.

Yes, you are correct. I did try saving the entire file into an array and then dividing it into parts.

But I got stuck on the following points inside the script:
1) How to write the footer information into the goodfile inside the Perl script.
2) I thought of using a counter to calculate the number of lines, but how do I then substitute that number into the footer information?

I really appreciate your thoughts on using Unix tools, and I learned a lot from your post.

Is there any way we can do the same within the Perl script itself?

Thanks a lot for your replies.
# 5  
Old 08-26-2011
OK. Let's take an example file like this:
Code:
cat INPUTFILE
START-OF-FILE
....... 
so on ....

TIMESTARTED=Thu Aug 25 01:03:50 BST 2011
START-OF-DATA
# PRODUCT=Corp/Pfd
a
b
1
c
d
3
END-OF-DATA
DATARECORDS=1275983
TIMEFINISHED=Thu Aug 25 02:27:02 BST 2011
END-OF-FILE

Good lines are numbers and all others are bad lines. So here is a sketch:
Code:
perl -e '
use warnings;
use strict;

my $goodfile = "goodfile";
my $footer_len = 4;
my $datarec_line = 1;

my (@whole, @header, @footer, @goodlines, @badlines);
my $line;

@whole = <>;

do {
  $line = shift @whole;
  push @header, $line;
} while $line !~ /^START-OF-DATA/;

@footer = splice @whole, -$footer_len;

for (@whole) {
  if (/\d/) {
    push @goodlines, $_;
  } else {
    push @badlines, $_;
  }
}

$footer[$datarec_line] =~ s/\d+/scalar @goodlines/e;

open my $fh, ">", $goodfile;
print $fh @header, @footer, @goodlines;
close $fh;

print @badlines
' INPUTFILE

Good records go to the goodfile and bad ones to stdout. Note that this sketch writes the footer before the good records.
You can change this sketch (the definition of good lines, the order of output, where the bad lines go) as you want.
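
With a feed of more than a million records, reading everything into an array may use a lot of memory. A streaming pass that keeps only a counter is one alternative. The following is a minimal sketch in awk (run from the shell), not part of the Perl solution above. The sample file, the output names goodfile and badfile, and the digit rule are assumptions carried over from the example, and the header is taken to end at START-OF-DATA.

```shell
#!/bin/sh
# Sketch only: one pass over the feed, keeping just a counter in memory.
# Output order in goodfile: header, good records, footer with the new count.

workdir=$(mktemp -d)

# Stand-in for the real feed (same shape as the example file).
cat >"$workdir/INPUTFILE" <<'EOF'
START-OF-FILE
TIMESTARTED=Thu Aug 25 01:03:50 BST 2011
START-OF-DATA
# PRODUCT=Corp/Pfd
a
b
1
c
d
3
END-OF-DATA
DATARECORDS=1275983
TIMEFINISHED=Thu Aug 25 02:27:02 BST 2011
END-OF-FILE
EOF

awk -v good="$workdir/goodfile" -v bad="$workdir/badfile" '
  BEGIN { state = "header" }

  # Header lines are copied straight through, up to and including START-OF-DATA.
  state == "header" {
      print > good
      if (/^START-OF-DATA/) state = "data"
      next
  }

  # END-OF-DATA switches to the footer; the rule below then prints it.
  /^END-OF-DATA/ { state = "footer" }

  # Footer lines are copied, with the record count replaced by the counter.
  state == "footer" {
      if (/^DATARECORDS=/) $0 = "DATARECORDS=" (count + 0)
      print > good
      next
  }

  # Data lines: assumed "good" when they contain a digit.
  /[0-9]/ { count++; print > good; next }
          { print > bad }
' "$workdir/INPUTFILE"

cat "$workdir/goodfile"
```

Because the footer only arrives after all data lines have been seen, the counter is already final when the DATARECORDS line is rewritten, so nothing has to be buffered.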
# 6  
Old 08-26-2011
Hi Yazu,

The logic in your code is really excellent. Thank you very much for your time and for your thoughts.

I have modified the logic accordingly and below is the code:

Code:
#!/usr/bin/perl

my $file = 'feedfile';
open(FILE, '<', $file) || die("could not open file $file: $!");

my $goodfile = "goodfile";
my $badfile = "badfile";
my $footer_len = 4;     # the footer is the last 4 lines of the feed
my $datarec_line = 1;   # index of the DATARECORDS line within the footer

my (@whole, @header, @footer, @goodlines, @badlines, @fields);
my $line;

# Read the whole feed into memory.
@whole = <FILE>;
close FILE;

# The header runs up to and including the first "# PRODUCT" line.
do {
  $line = shift @whole;
  push @header, $line;
} while $line !~ /^# PRODUCT/;

@footer = splice @whole, -$footer_len;

foreach $line (@whole) {
  @fields = split (/\|/, $line);

  if( $fields[57] eq " ")
  {
    push @badlines, $line;
  }
  elsif( $fields[32] eq "N.A."  && $fields[33] eq "N.A." && $fields[34] eq "N.A." && $fields[38] eq "N.A." && ($fields[62] eq "N.A." ||  $fields[62] eq " "))
  {
    push @badlines, $line;
  }
  else
  {
    push @goodlines, $line;
  }
}

# Put the number of good records into the DATARECORDS line.
$footer[$datarec_line] =~ s/\d+/scalar @goodlines/e;

open my $fh, ">", $goodfile or die "could not open $goodfile: $!";
print $fh @header, @goodlines, @footer;
close $fh;

open my $fh1, ">", $badfile or die "could not open $badfile: $!";
print $fh1 @badlines;
close $fh1;

After running the code, I found that there are 4 lines in between the data records that separate the data into sections.
i.e.
Code:
grep -n "# PRODUCT" feedfile
1206675:# PRODUCT=Convertible 
1261566:# PRODUCT=Nationals
1270395:# PRODUCT= Agencies
1274335:# PRODUCT=Regionals

As we can see above, these 4 lines are not valid records.

Now, while calculating the row count we need to ignore these 4 records, i.e.
Code:
$footer[$datarec_line] =~ s/\d+/scalar @goodlines/e;

Here, while calculating the row count and substituting the new count, we have to ignore the above 4 lines (records).

Maybe by reducing the array by 4, but I am not sure.

How can we reduce the row count by 4 so that we get the actual count?

I really appreciate your time and thoughts.

---------- Post updated at 01:40 PM ---------- Previous update was at 01:24 PM ----------

Finally,

I did the following :

Code:
$footer[$datarec_line] =~ s/\d+/(scalar @goodlines - 4)/e;

Thanks a lot, Yazu. I am really very thankful to you.
# 7  
Old 08-26-2011
Code:
my $n = @goodlines;
$n -= grep {/^# PRODUCT/} @goodlines; # or just $n -= 4 but it's not good
$footer[$datarec_line] =~ s/\d+/$n/;
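
The same idea can be checked from the shell: grep -v -c counts the lines that do not match a pattern. A small sketch, where the sample records and the file name goodlines are made up for illustration, and the file is assumed to hold only the good data lines (no header or footer):

```shell
#!/bin/sh
# Sketch only: count data records while skipping "# PRODUCT" separator lines.
# The records and the file name are hypothetical.

workdir=$(mktemp -d)
printf '%s\n' \
    'rec1|x' \
    '# PRODUCT=Convertible' \
    'rec2|y' \
    '# PRODUCT=Nationals' \
    'rec3|z' >"$workdir/goodlines"

# -v inverts the match, -c prints the number of selected lines.
n=$(grep -vc '^# PRODUCT' "$workdir/goodlines")
echo "DATARECORDS=$n"
```

Counting by pattern rather than subtracting a fixed 4 keeps the result right if a later feed has a different number of product sections.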
