egrep is very slow : How to improve performance


 
# 1  
Old 02-12-2008

We have an egrep search in a while loop.

egrep -w "$key" ${PICKUP_DIR}/new_update >> ${PICKUP_DIR}/update_record_new

${PICKUP_DIR}/new_update is a 210 MB file.

In each iteration, egrep takes around 50-60 seconds on average to search. There's nothing significant in the loop other than egrep, and when we checked the timestamps, egrep is what is slowing it down.

Is it possible to improve egrep's performance? Or do we need to use perl or some other pattern-search tool?

Could you please help?
# 2  
Old 02-12-2008
Do the values of "key" and "PICKUP_DIR" change with each iteration?

Look into the -f flag of grep.
# 3  
Old 02-12-2008
The value of $key changes on each iteration, but ${PICKUP_DIR}/new_update doesn't change.
# 4  
Old 02-12-2008
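So look into the -f flag, which reads the patterns from a file, one per line. Add -w, as in your original command, if you still need whole-word matches.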

Code:
egrep -f <file containing the different values of $key> ${PICKUP_DIR}/new_update
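
A minimal sketch of that idea, assuming the values of $key can be written to a file before the search starts. The file name keys.txt, the temporary directory, and the sample records are made up for illustration; only new_update and update_record_new come from the original post.

```shell
#!/bin/sh
# Sketch: scan the big file once instead of running egrep once per key.
# keys.txt and all sample data below are hypothetical.
PICKUP_DIR=$(mktemp -d)

# Small stand-in for the 210 MB new_update file.
printf '%s\n' 'A100 first record' 'B200 second record' 'C300 third record' \
    > "${PICKUP_DIR}/new_update"

# One pattern per line: the values $key would take across the loop iterations.
printf '%s\n' 'A100' 'C300' > "${PICKUP_DIR}/keys.txt"

# Single pass over the file; -w keeps the whole-word matching of the original.
grep -E -w -f "${PICKUP_DIR}/keys.txt" "${PICKUP_DIR}/new_update" \
    >> "${PICKUP_DIR}/update_record_new"

cat "${PICKUP_DIR}/update_record_new"
```

Note that the output differs slightly from the loop version: matching lines come out in the order they appear in new_update, and a line that matches several keys is written only once.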

# 5  
Old 02-12-2008
Quote:
Originally Posted by hidnana
We have an egrep search in a while loop.

egrep -w "$key" ${PICKUP_DIR}/new_update >> ${PICKUP_DIR}/update_record_new

Could you please help ?
In addition to the above, can you post an example of this $key? Perhaps using a regex optimizer will help. If the $key is assembled externally and its readability does not matter, you could do something like this as well.

Code:
grep -E -w "$(regexopt "$key")" ...

# 6  
Old 02-12-2008
I have uploaded the $key as a screenshot, as I don't have the text version right now. It's a big string of values concatenated with "|".

Can you please tell me which is better than egrep: grep, perl, sed?
And why should egrep take around 50-60 seconds per iteration?
And will splitting the ${PICKUP_DIR}/new_update file into multiple files, and searching each file until a match is found, help in any way?
# 7  
Old 02-12-2008
Are the keys separated by '|'? Or is the whole thing a single key?

If the keys are separated by '|', change the key file so that each key is on its own line. Then run:
Code:
egrep -f key.txt ${PICKUP_DIR}/new_update

I don't know whether splitting up the file will give you any advantage.
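
One way to do that conversion, as a rough sketch: it assumes every '|' in the key is a top-level separator (no parenthesised groups), and the key value, the working directory, and the matches file are invented for the example.

```shell
#!/bin/sh
# Sketch: turn a '|'-joined key string into a pattern file, one key per line.
# The key value and all file contents below are hypothetical.
workdir=$(mktemp -d)
key='A100|C300|Z999'

# tr replaces each '|' with a newline, giving one pattern per line.
printf '%s\n' "$key" | tr '|' '\n' > "${workdir}/key.txt"

# Small stand-in for ${PICKUP_DIR}/new_update.
printf '%s\n' 'A100 first' 'B200 second' 'C300 third' > "${workdir}/new_update"

# -w is kept from the original command. If the keys are plain strings
# rather than regular expressions, -F can be tried in place of -E.
grep -E -w -f "${workdir}/key.txt" "${workdir}/new_update" > "${workdir}/matches"

cat "${workdir}/matches"
```

Each distinct set of keys still needs its own pattern file, but the large file is read once per file of keys rather than once per key.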