Better way to run this perl command


 
# 1  
Old 05-30-2012

I'm working with huge files, over 3 GB each, and I need to do a lot of pattern matching. I need a way to grep for what I want using a tool that is available on most Unix systems.

I was initially gung-ho about grep, but not all of grep's capabilities are available on every OS.

So I'm focusing on a command like this. How can I make it run faster?

Code:
date ; perl -ne 'print if /core.manager.referral/' $DATAFILE | perl -ne 'print if /ERROR getting Referral/' | wc -l ; date

This command finds all lines that contain both "core.manager.referral" and "ERROR getting Referral", and counts them.

shell = bash
os = linux red hat, sunos
datafile = 4GB
# 2  
Old 05-30-2012
Not sure I understand

Seems all you need is plain-vanilla grep, not any fancy options.
Grep on what you think will have the fewest matches, to make a much smaller file. Then grep on the other option.

Or are you trying to determine whether perl with "print if" is faster than grep?
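
The two-step grep described above could look roughly like the sketch below. The function name count_both_grep is made up for illustration, and which string matches fewer lines is a guess, so swap the order if your data says otherwise. Setting LC_ALL=C is an extra assumption that often helps grep on large files; it is not something the original command relied on. Only the -c option is used, which plain grep generally supports.

```shell
# Hypothetical helper: count lines that contain both strings using two plain greps.
# The first grep should be the pattern expected to match the fewest lines,
# so the second grep has much less data to read.
count_both_grep() {
    # LC_ALL=C skips multibyte locale handling (an assumption that it helps here)
    LC_ALL=C grep 'ERROR getting Referral' "$1" |
        LC_ALL=C grep -c 'core\.manager\.referral'
}

# Usage sketch: "time" reports elapsed time directly, instead of date before and after.
# time count_both_grep "$DATAFILE"
```

The dots are escaped so they match a literal "." rather than any character.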
# 3  
Old 05-30-2012
It would help if you could post a small part of your input file.
# 4  
Old 05-30-2012
Yes, I'm trying to see if perl is faster than grep.

Posting the input file is not an option at all. The strings I picked to search for are necessary; I chose them deliberately.
# 5  
Old 05-30-2012
Perl is a far bigger program than grep, which is a good sign it's not going to be faster. Running it twice is certainly not going to be faster, and piping the result into wc -l is going to be slower yet.

How about awk? Getting rid of pipes and external utilities is a good way to speed up a program, and awk can easily do everything you want here in one program with no pipes or external utilities (besides awk itself of course).
Code:
awk '/core.manager.referral/ && /ERROR getting Referral/ { N++ } END { print N+0 }' filename

# 6  
Old 05-30-2012


Thank you. I tried this, and the command is still slow to finish.

I'm open to any one-liner like this, whichever turns out to be fastest.
# 7  
Old 05-30-2012
You are working with FOUR GIGABYTES of data. Of course it's slow. What's the file size, divided by your maximum disk speed? How much RAM do you have for cache? What's the bus speed of your RAM?

I don't think you're going to improve much on the awk version. You might try tricks with index() instead of regexes.
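
A minimal sketch of the index() idea, assuming both search strings are fixed text and not real regular expressions. The function name count_both_index is made up for illustration. Whether this is actually faster than the regex version depends on the awk implementation, so time both on your own data; on SunOS you may need nawk or the XPG4 awk in place of the default awk.

```shell
# Hypothetical helper: plain substring search with index() instead of regex matching.
count_both_index() {
    awk 'index($0, "ERROR getting Referral") && index($0, "core.manager.referral") { n++ }
         END { print n + 0 }' "$1"   # n + 0 prints 0 when no line matched
}

# Usage sketch:
# time count_both_index "$DATAFILE"
```

index() returns the position of the substring, or 0 if it is absent, so the dots here are matched literally with no escaping needed.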