Problem running Perl Script with huge data files Post: 302435957

Posted by ad23 on Thursday 8th of July 2010 04:24:36 PM in Shell Programming and Scripting (Post 302435957)

Hello Everyone,

I have a Perl script that reads two types of data files (text and XML). The files are very large and there are many of them. I am using something like this:

Code:

foreach my $t (@text)
{
    # Lexical filehandle with three-argument open, closed once per file
    open my $text_fh, '<', $t or die "Cannot open $t for reading: $!\n";

    while (my $line = <$text_fh>) {
       # ....My code....
    }
    close($text_fh);
}

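foreach my $x (@xml)
{
    # Lexical filehandle with three-argument open, closed once per file
    open my $xml_fh, '<', $x or die "Cannot open $x for reading: $!\n";

    while (my $line = <$xml_fh>) {
       # ....My code....
    }
    close($xml_fh);
}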

When I run it directly as shown below, it fails with an "Out of memory" error:

Code:

Usage: perl runXML.pl

Can anyone suggest how I could run this using "qsub" or something similar? My files are in a directory structure like this:

Code:

/Data/2010_aaa/data.txt
/Data/2010_aaa/data.xml

/Data/2010_bbb/data.txt
/Data/2010_bbb/data.xml

/Data/2010_ccc/data.txt
/Data/2010_ccc/data.xml

/Data/2010_ddd/data.txt
/Data/2010_ddd/data.xml

I need to run the script on all of these files, because my data is spread across them.

Thanks!

ad23
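
For reference, the layout above could be walked one directory at a time with a minimal sketch like the one below. It is not the original runXML.pl. The helper names `data_files_in` and `process_file` are hypothetical, the default `/Data/*` pattern is taken from the listing above, and the line counter only stands in for the elided per-line code. Reading with `while (<$fh>)` holds one line in memory at a time, so memory would be expected to grow only if the per-line code keeps accumulating data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: return the data.txt / data.xml files that exist in one directory.
sub data_files_in {
    my ($dir) = @_;
    return grep { -f $_ } ("$dir/data.txt", "$dir/data.xml");
}

# Hypothetical helper: stream a file line by line and return the number of lines read.
# The real per-line processing would replace the counter.
sub process_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path for reading: $!\n";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;    # placeholder for the real per-line work
    }
    close $fh;
    return $count;
}

# Directories come from the command line; with no arguments, assume the /Data/* layout.
my @dirs = @ARGV ? @ARGV : bsd_glob('/Data/*');

for my $dir (@dirs) {
    for my $file (data_files_in($dir)) {
        my $lines = process_file($file);
        print "$file: $lines lines\n";
    }
}
```

Because the directories are passed as arguments, each one (for example /Data/2010_aaa) could be handled by a separate invocation, which is the shape a batch scheduler such as qsub would typically need. The exact submission command depends on the local scheduler and is not shown here.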
