Perl Array / pattern match large CPU usage


 
# 1  
Old 10-05-2009
Hi,

I have one file in this format

Code:
20  value1
33   value2
56   value3

I have another file in this format:

Code:
34,30-SEP-09,57,100237775,33614510126,2,34
34,30-SEP-09,57,100237775,33620766654,2,34
34,30-SEP-09,108,100237775,33628458122,2,34
34,30-SEP-09,130,100237775,33635266741,2,254
34,30-SEP-09,135,100237775,33634650517,2,254
34,30-SEP-09,149,100237775,33660985888,2,34

What I want to do is, for each value in the first column of the first file (let's call it x), print the last column of the 2nd file if the 5th column begins with x.

So for the above, the first value in the first column of the first file is 20. None of the 5th-column values in the 2nd file begin with 20, so I don't want to print anything. For 33 from the first file, I would want to print the value in the last column of the 2nd file for each matching line, e.g. 34,34,34,254,254,34.

I have the script below. It works, but CPU usage for this process is about 30%! Is there a way to make it more efficient?

Code:
#!/usr/local/bin/perl

use lib "/usr/local/include/modules";
use Getopt::Long;
use File::Copy;
use JDFunction;
use JDProcess;
use JDInput;
#use strict;
use FileHandle;
use IO;
use POSIX qw(strftime);
use Time::Local;

$file_name="/export/home/file2";
open(DATA, $file_name) || die ("Could not open file2!");
@my_data=<DATA>;
close(DATA);

$first_file="/export/home/file1";
open $FIRST_DATA, "<", $first_file or die "Could not open first data file";
while ( $line = <$FIRST_DATA> )
{
chomp ($line);
($code, $value) = split(/\t/, $line);

                foreach $line (@my_data)
                {
                chomp($line);
                ($id,$day,$ref,$valuey,$valuez,$evt_type,$subcode)=split(/,/,$line);

                        if ( $valuez =~ /^$code/ )
                        {
                        print "$code, $subcode\n";
                        }
                }

}
close $FIRST_DATA;

If you know a way to make this more efficient I'd be grateful!

Thanks
# 2  
Old 10-05-2009
1. It does not work as posted. I had to make a couple of changes first, such as extracting the first 2 characters of valuez instead of using the whole value, and placing valuez in the regex.

2. Your program does not seem to use that much CPU. Even if it does, what is the issue?

3. If it takes too long, you can ask for optimization, is that not so?

4. I think you have missed something while extracting the required code, or there is some misunderstanding.
# 3  
Old 10-05-2009
Hi,

Thanks for your response.

I can't extract the first 2 characters of valuez as sometimes the value in the first field of the first file can be up to 4 digits long.
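
To illustrate, a prefix test that copes with codes of any length could look like the sketch below. The helper name `starts_with` is made up for this example; it uses `index` so the code is compared as a literal string instead of being compiled into a regex.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $value begins with $code, whatever the code length.
sub starts_with {
    my ( $value, $code ) = @_;
    return index( $value, $code ) == 0;
}

# Try codes of different lengths against one 5th-column value from file2.
for my $code ( '33', '3361', '20' ) {
    my $result = starts_with( '33614510126', $code ) ? 'match' : 'no match';
    print "$code: $result\n";
}
```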

The 2nd file I'm processing is quite big - around 64,000 lines.

I'm not sure what you mean by asking for optimisation - is that a Perl option?

Thanks again for your help
# 4  
Old 10-05-2009
What you describe could easily be achieved with the following code. Try it and see if it's faster.

Code:
#! /usr/bin/env perl

use warnings;
use strict;

my %f1;

die "usage: $0 <file1> <file2>\n" unless @ARGV == 2;

my ( $f1, $f2 ) = @ARGV;

open my $F1, '<', $f1 or die "open $f1: $!\n";

$f1{ (split)[0] } = undef while <$F1>;

close $F1 or warn "close $f1: $!\n";

open my $F2, '<', $f2 or die "open $f2: $!\n";

while (<$F2>) {
    my @fields = split ',';
    grep $fields[4] =~ /^$_/, keys %f1 and print $fields[-1];
}

close $F2 or warn "close $f2: $!\n";

# 5  
Old 10-05-2009
Put this into a file called other.awk

Code:
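# First file (value file): remember each code from column 1.
(NR == FNR) {
        a[$1] = $1
        print
}

# Second file (data file): print column 7 when column 5 starts with a code.
(NR != FNR) {
        split($0, b, ",")
        for (i in a) {
                pat = "^" a[i]
                if (b[5] ~ pat)
                        print b[7]
        }
}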

Then use this command line prototype:

Code:
awk -f other.awk <value-file> <data-file>

# 6  
Old 10-07-2009
Hi,

Radoulov, I'm not quite sure how that code works. I'm trying to modify it so I can also print the first values from File1 (you seem to have entered these into a hash and then use this hash to search the second file).

Can you please tell me how I can modify it to print the fields[-1] value and the keys value?

Many thanks
# 7  
Old 10-07-2009
Like this?

Code:
#! /usr/bin/env perl

use warnings;
use strict;

my %f1;

die "usage: $0 <file1> <file2>\n" unless @ARGV == 2;

my ( $f1, $f2 ) = @ARGV;

open my $F1, '<', $f1 or die "open $f1: $!\n";

$f1{ (split)[0] } = undef while <$F1>;

close $F1 or warn "close $f1: $!\n";

open my $F2, '<', $f2 or die "open $f2: $!\n";

while (<$F2>) {
    my @fields = split ',';
    map { $fields[4] =~ /^$_/ and print $_, " ", $fields[-1] } keys %f1;
}

close $F2 or warn "close $f2: $!\n";
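
If the list of codes in file1 is long, matching every key against every line may still be the expensive part. One possible alternative, sketched here under the assumption that codes are at most 4 characters long (as mentioned in post #3), is to look up each leading substring of the 5th field directly in the hash. The sub name `matching_codes` and the `$MAX_LEN` variable are made up for this example, and it has not been timed against the real data.

```perl
use strict;
use warnings;

my $MAX_LEN = 4;    # assumption: codes in file1 are 1 to 4 characters long

# Hypothetical helper: return every code in %$codes that is a prefix of $value,
# using at most $MAX_LEN hash lookups instead of one regex match per code.
sub matching_codes {
    my ( $codes, $value ) = @_;
    my @found;
    for my $len ( 1 .. $MAX_LEN ) {
        last if $len > length $value;
        my $prefix = substr $value, 0, $len;
        push @found, $prefix if exists $codes->{$prefix};
    }
    return @found;
}

# Sample data taken from the first post.
my %codes = map { $_ => 1 } qw(20 33 56);
my @lines = (
    '34,30-SEP-09,57,100237775,33614510126,2,34',
    '34,30-SEP-09,130,100237775,33635266741,2,254',
);

for my $line (@lines) {
    my @fields = split /,/, $line;
    print "$_ $fields[-1]\n" for matching_codes( \%codes, $fields[4] );
}
```

To use it on the real files, the hash would be filled from file1 and the loop would read file2 line by line, as in the script above.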

