I have a script that processes a fair amount of data -- say, 25-50 megs per run. I'd like ideas on speeding it up. The code is actually just a preprocessor -- I'm using another language to do the heavy lifting. But as it happens, the preprocessing takes much more time than the final processing so I'm optimizing this rather than that.
Here's the code. The basic idea is that, for each line of input (redirected to stdin), the program checks to see if the sequence number is in $mult and, if so, prints a line asking the other program to validate that sequence:
The file mult.txt is a short file of about a thousand lines, each of which is guaranteed to contain at most (exactly?) one string of the form A\d\d\d\d\d\d; the rest of the line is irrelevant here.
My thought for optimizing this: make an array of the \d\d\d\d\d\d values, sort, and do a binary search rather than a regular expression at the spot marked "Replace this line?". But I'm not sure how to go about that, or even if that's the 'right' optimization. Thoughts?
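For reference, the sorted-array idea described above might look roughly like this. It is only a sketch: in_sorted is a hypothetical helper name, and the numbers are made-up stand-ins for the six-digit values taken from mult.txt.

```perl
use strict;
use warnings;

# Return 1 if $target is in the ascending-sorted numeric array, else 0.
# (in_sorted is a hypothetical name used only for this sketch.)
sub in_sorted {
    my ($sorted, $target) = @_;
    my ($lo, $hi) = (0, $#{$sorted});
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        if    ($sorted->[$mid] < $target) { $lo = $mid + 1 }
        elsif ($sorted->[$mid] > $target) { $hi = $mid - 1 }
        else                              { return 1 }
    }
    return 0;
}

# Made-up values; sort once up front, then search once per input line.
my @numbers = sort { $a <=> $b } (40, 45, 12, 7);
print in_sorted(\@numbers, 45) ? "found\n" : "missing\n";
```

Each lookup costs about log2(N) comparisons, so roughly ten for a thousand entries.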
Also, any suggestions on making better idiomatic use of Perl would be appreciated. I'm not at all accustomed to the language.
Not much of a speed thing, however.
According to 'man perlvar' this is a no-no...
The proper method would be to keep local($/) confined to the smallest block, i.e.:
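A minimal sketch of that scoping, assuming the goal is to slurp a whole file into one string (slurp_file is a hypothetical helper name, not something from the original script):

```perl
use strict;
use warnings;

# Read an entire file into one string while keeping the change to $/
# confined to the smallest possible block.
# (slurp_file is a hypothetical name used only for this sketch.)
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $content = do {
        local $/;    # undef only inside this do-block: read to end of file
        <$fh>;
    };
    close($fh);
    return $content;
}
```

Something like my $mult = slurp_file('mult.txt'); would then read the file in one go, and $/ is back to its usual value as soon as the do-block ends.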
Hash it!
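Roughly what I mean by that, as a sketch only. It assumes each line holds one A followed by six digits; build_lookup and is_wanted are hypothetical names, and the sample lines are made up:

```perl
use strict;
use warnings;

# Build a lookup set from lines that each contain one A###### number.
# (build_lookup and is_wanted are hypothetical names for this sketch.)
sub build_lookup {
    my (@lines) = @_;
    my %seen;
    for my $line (@lines) {
        # Capture the first "A" followed by six digits on the line.
        $seen{$1} = 1 if $line =~ /(A\d{6})/;
    }
    return \%seen;
}

# Return 1 if the line carries a sequence number that is in the set, else 0.
sub is_wanted {
    my ($lookup, $line) = @_;
    return ($line =~ /(A\d{6})/ && exists $lookup->{$1}) ? 1 : 0;
}

# Made-up sample lines standing in for mult.txt and the stdin data.
my $lookup = build_lookup("A000001 first entry", "A000002 second entry");
for my $line ("A000002 some data", "A999999 other data") {
    print "$line\n" if is_wanted($lookup, $line);
}
```

The hash is built once, and each input line then costs one small regex plus one hash lookup, no matter how many entries mult.txt has.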
Also, for better assistance, a snippet of 'mult.txt' and a snippet of the data would be very helpful.
-Enjoy
fh : )_~
---------- Post updated at 06:23 PM ---------- Previous update was at 12:14 AM ----------
Thought I would tweak this a bit for ya!
I am new to Perl; my first line of Perl was just over a week ago (08/26/2009).
Any comments are very welcome!
3 examples depending on what you really want/need!
[edit]
NOTE:
After some thought I felt it better to modify Example 2 for cases of dirty data...
[/edit]
I am ASSUMING your data looks something like:
I am ASSUMING your mult.txt is something like this:
Example 1, As close to your original as possible without waste.
Example 2, A bit cleaner
Example 3, Lean and mean with the need for speed!
NOTE: The regex changes!
Hope this gets things going a bit faster for ya!
-Enjoy
fh : )_~
Last edited by Festus Hagen; 09-05-2009 at 12:12 AM..
Reason: regex change Example 3 / Modify #2 for dirty data