Hi,
I am writing a Perl script for the following task:
I have a pipe-delimited data file with 231 lines of header information and 4 lines of footer information. The file has about 1.2 million lines in total, including the header and footer.
For example:
Header Information:
Quote:
START-OF-FILE
FILENAME=fixedincome_bo_euro.out
DATA=bo
REGION=euro
TYPE=out
PROGRAMNAME=getdata
DATEFORMAT=yyyymmdd
... and so on, for 231 lines
Footer Information:
Quote:
END-OF-DATA
DATARECORDS=1221264
TIMEFINISHED=Fri Aug 12 18:57:09 BST 2011
END-OF-FILE
The data looks like this:
Each line has around 210 columns and is pipe-delimited.
Quote:
TT3069982 Corp|0|198|FSPIN|4.000000| | |FINE SPINNERS|FINE SPIN-CALLED|INDUSTRIAL|Corp|2|FIXED|PERP/CALL|PERPETL PAY,EX-DIV|3|DOMESTIC|EN|GBP|MORTGAGE BACKED|2000000.00|.00|1.0000|1.0000|1.00| |NOT LISTED|100.00000| | |N.A.|N.A.| |100.000000| | | | | | | | | | | | | | | | | | | | | |234953|500000|TT3069982| | | | | | | | | |N.A.| | | | | | | | | | | | |Y|N|N| | | |GB| |Basic Materials|Chemicals|Chemicals-Fibers|N.A.|GB|FSPIN 4 03/29/49|N| |DOMESTIC| |N.A.| | |N| |N|COTT3069982|Fine Spinners|GBP|GBP|N|N|Y|1|N|N|GBP|N|N|Y|19920228|FINE SPINNERS|Anytime| |N.A.| | |N|N|EN|EN|Does Not Apply|20490329|N|42| |Y|N|100.000000|N|20110820|.000000000| |N| | | | |N.A.|N.A.|N.A.|N.A.|N.A.| | | | | | |N|N|N|N| |Grandfathered| |2| | |N.A.|N| | |N| | | | |N| | |20490329| | |N|N|N| | |N|3| | | |N.A.|2| |41|CALENDAR| |N|N|BBG00035Y4Y1|
The output file should contain only specific columns from each line and should be TAB-delimited.
Specific columns:
Quote:
3 4 5-7 10 11 12 13 15 16-19 20-24 25-26 27 28-32 33 36 37 40 55-58 59 60
61 62 63-66 68 69-72 73 74-75 76 77 78-79 80-86 87 88-94 95 96-99 100 101-103 105-107 109-110 112-123 125-128 130-131 133-135 137 111 124 132 136 187 Only.
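A column list like this could be expanded into array indices before any data is processed. The following is only a minimal sketch: `expand_columns` is a hypothetical helper name, and it assumes the column numbers are 1-based (so column 3 is index 2 in Perl) and that the output should keep the columns in the order listed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Expand a spec such as "3 4 5-7 10" into zero-based array indices,
# preserving the order in which the columns are listed.
# Assumption: column numbers in the spec are 1-based.
sub expand_columns {
    my ($spec) = @_;
    my @idx;
    for my $token (split ' ', $spec) {
        if ($token =~ /^(\d+)-(\d+)$/) {
            # A range such as "5-7": add every column in the range.
            push @idx, map { $_ - 1 } ($1 .. $2);
        }
        elsif ($token =~ /^\d+$/) {
            # A single column number.
            push @idx, $token - 1;
        }
        else {
            die "Bad column token '$token'\n";
        }
    }
    return @idx;
}

# Short illustrative spec, not the full list from the post.
my @idx = expand_columns('3 4 5-7 10');
print "@idx\n";    # prints "2 3 4 5 6 9"
```

The spec string passed in must contain only numbers and ranges separated by whitespace, so the two lines above would need to be joined into one string without the trailing "Only.".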
So I have started writing the Perl script:
Quote:
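#!/usr/bin/perl
use strict;
use warnings;

my $file = 'fileA';
# Read the whole input file into memory, one array element per line.
open(my $in, '<', $file) or die "could not open file $file: $!";
my @array = <$in>;
close $in;

# Skip the 231 header lines and the 4 footer lines, then write the rest.
open(my $out, '>', 'outfile') or die "could not open file outfile: $!";
print $out @array[231 .. $#array - 4];
close $out or die "could not close outfile: $!";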
I am using an array slice to eliminate the header and footer information. Please correct me if I am wrong.
Now, once the file is loaded into an array, how do I select the columns listed above and write them out with TAB as the delimiter in Perl?
Would it be easier to use hashes or arrays?
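One approach that might work is to read the file line by line instead of loading all 1.2 million lines into an array, split each line on the pipe character, and pick the wanted fields with an array slice. The following is only a minimal sketch under stated assumptions: `extract_columns` and its parameters are hypothetical names, the sample input is made up for illustration, and the wanted columns are passed in as zero-based indices.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy $in to $out, dropping the first $skip_head and the last $skip_tail
# lines, and keeping only the fields at the zero-based positions in @$idx.
# Output fields are joined with TAB. (Hypothetical helper, for illustration.)
sub extract_columns {
    my ($in, $out, $skip_head, $skip_tail, $idx) = @_;
    my $line_no = 0;
    my @pending;    # newest lines, held back because they may be footer lines
    while (my $line = <$in>) {
        next if ++$line_no <= $skip_head;    # skip header lines
        push @pending, $line;
        next if @pending <= $skip_tail;      # not yet sure this is a data line
        my $row = shift @pending;
        chomp $row;
        # The pipe must be escaped in the pattern; a limit of -1 keeps
        # trailing empty fields instead of silently dropping them.
        my @fields = split /\|/, $row, -1;
        # Array slice selects the wanted fields; missing ones become ''.
        my @wanted = map { defined $_ ? $_ : '' } @fields[@$idx];
        print {$out} join("\t", @wanted), "\n";
    }
    # Whatever is left in @pending is the footer and is discarded.
}

# Tiny made-up input: 2 header lines, 2 data lines, 1 footer line.
my $data = "H1\nH2\na|b|c|d\ne|f|g|h\nEND\n";
open(my $in, '<', \$data) or die "could not open in-memory input: $!";
extract_columns($in, \*STDOUT, 2, 1, [0, 2, 3]);
close $in;
```

For the real file, the same call would take filehandles opened on 'fileA' and 'outfile', with 231 and 4 as the header and footer counts. A plain array of indices seems sufficient here; a hash would only help if the columns had to be looked up by name.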
Could someone please help me out with this? I would really appreciate your thoughts.