Problem running Perl Script with huge data files
Post 302436226 by edstertech on Friday 9th of July 2010 05:41:05 PM
Depending on the task, you could always break the script into 3 scripts:

Script 1 (control script):
Code:
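foreach my $t (@text)
{
    # List form of system() avoids the shell; it returns 0 on success.
    # Assumes text_script.pl sits in the current directory.
    system("perl", "text_script.pl", $t) == 0
        or warn "text_script.pl failed on $t (status $?)\n";
}

foreach my $x (@xml)
{
    system("perl", "xml_script.pl", $x) == 0
        or warn "xml_script.pl failed on $x (status $?)\n";
}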
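
If you want the control script to be a bit more defensive, something along these lines might do. This is only a sketch: the run_worker helper, the data/ directory and the *.txt / *.xml patterns are my own assumptions, not anything from your setup, so adjust them to wherever your files actually live.

```perl
use strict;
use warnings;

# Hypothetical helper: run one worker script on one file and report
# whether it succeeded. $^X is the path of the perl running this script.
sub run_worker {
    my ($script, $file) = @_;
    # List form of system() bypasses the shell, so odd filenames are safe.
    my $status = system($^X, $script, $file);
    if ($status == -1) {
        warn "Could not start $script for $file: $!\n";
        return 0;
    }
    if ($status != 0) {
        warn "$script failed on $file (wait status $status)\n";
        return 0;
    }
    return 1;
}

# Assumed layout: inputs live in ./data and are told apart by extension.
my @text = glob("data/*.txt");
my @xml  = glob("data/*.xml");

run_worker("text_script.pl", $_) for @text;
run_worker("xml_script.pl",  $_) for @xml;
```

Because each file is handled in its own child process, whatever memory the worker used is handed back to the OS when it exits, and one bad file only produces a warning instead of killing the whole run.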

Script 2 (text file processing script):

Code:
my $t = $ARGV[0];
# Three-argument open with a lexical filehandle
open(my $text, '<', $t) or die "Cannot open $t for reading: $!\n";

while (my $line = <$text>) {
    # ....My code....
}
close($text);

Script 3 (XML file processing script):

Code:
my $x = $ARGV[0];
# Three-argument open with a lexical filehandle
open(my $xml, '<', $x) or die "Cannot open $x for reading: $!\n";

while (my $line = <$xml>) {
    # ....My code....
}
close($xml);

This *may* be easier to debug and maintain as well.

Of course this approach won't work if you're trying to collect everything from all the files before doing any data processing...
...but surely you're not trying to do that ...?
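
To illustrate the difference: reading with while(<FH>) keeps one line in memory at a time, whereas slurping a file into an array keeps all of it. A minimal sketch, assuming your per-line work can be boiled down to a running summary (the summarise_file name and the line/byte counts are just placeholders for whatever you really compute):

```perl
use strict;
use warnings;

# Hypothetical per-file summary: count lines and bytes without ever
# holding more than one line in memory.
sub summarise_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path for reading: $!\n";
    my ($lines, $bytes) = (0, 0);
    while (my $line = <$fh>) {
        $lines++;
        $bytes += length $line;
    }
    close($fh);
    return ($lines, $bytes);
}

# Usage: perl summarise.pl file1 file2 ...
for my $path (@ARGV) {
    my ($lines, $bytes) = summarise_file($path);
    print "$path: $lines lines, $bytes bytes\n";
}
```

If you really do need data from every file at once, keep only the small per-file results (as returned here) rather than the raw lines.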

Ed
 
