The UNIX and Linux Forums  

  #1  
Old 02-12-2008
Registered User
 

Join Date: Feb 2008
Posts: 10
egrep is very slow : How to improve performance

We have an egrep search in a while loop.

egrep -w "$key" ${PICKUP_DIR}/new_update >> ${PICKUP_DIR}/update_record_new

${PICKUP_DIR}/new_update is a 210 MB file.

In each iteration, the egrep takes around 50-60 seconds on average to search. There's nothing significant in the loop other than the egrep, and when we checked the timestamps, egrep is what is slowing it down.

Is it possible to improve egrep's performance? Or do we need to use Perl or some other pattern-search tool?

Could you please help ?
  #2  
Old 02-12-2008
vino
Supporter (in vino veritas)
 

Join Date: Feb 2005
Location: Bangalore, India
Posts: 2,683
Quote:
Originally Posted by hidnana View Post
We have an egrep search in a while loop.

egrep -w "$key" ${PICKUP_DIR}/new_update >> ${PICKUP_DIR}/update_record_new

${PICKUP_DIR}/new_update is a 210 MB file.

In each iteration, the egrep takes around 50-60 seconds on average to search. There's nothing significant in the loop other than the egrep, and when we checked the timestamps, egrep is what is slowing it down.

Is it possible to improve egrep's performance? Or do we need to use Perl or some other pattern-search tool?

Could you please help ?
Do the values of "key" and "PICKUP_DIR" change with each iteration?

Look into the -f flag of grep.
  #3  
Old 02-12-2008
Registered User
 

Join Date: Feb 2008
Posts: 10
The value of $key changes on each iteration, but ${PICKUP_DIR}/new_update doesn't change.
  #4  
Old 02-12-2008
vino
Supporter (in vino veritas)
 

Join Date: Feb 2005
Location: Bangalore, India
Posts: 2,683
Quote:
Originally Posted by hidnana View Post
The value of $key changes on each iteration, but ${PICKUP_DIR}/new_update doesn't change.
So look into the -f flag.

Code:
egrep -f <file containing the different values of $key> ${PICKUP_DIR}/new_update
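As a rough illustration of the -f suggestion, here is a minimal sketch that replaces one egrep per key with a single pass over the data file. The sample records, the keys.txt file name, and the temporary directory are all made up for the example; only new_update, update_record_new, and the -w flag come from the original post.

```shell
#!/bin/sh
# Sketch only: file contents and keys.txt are assumptions for illustration.
workdir=$(mktemp -d)
PICKUP_DIR=$workdir

# Stand-in for the 210 MB new_update file.
printf '%s\n' 'A100 first record' 'B200 second record' 'C300 third record' \
    > "$PICKUP_DIR/new_update"

# One key per line; in the real script these would be the values
# that $key takes on each loop iteration.
printf '%s\n' 'A100' 'C300' > "$PICKUP_DIR/keys.txt"

# A single scan of the data file instead of one scan per key.
grep -E -w -f "$PICKUP_DIR/keys.txt" "$PICKUP_DIR/new_update" \
    >> "$PICKUP_DIR/update_record_new"

result=$(cat "$PICKUP_DIR/update_record_new")
echo "$result"
```

One difference from the loop to keep in mind: the matches come out in the order they appear in new_update rather than grouped per key, and a line that matches several keys is written only once.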
  #5  
Old 02-12-2008
HPAVC
Registered User
 

Join Date: Feb 2008
Posts: 105
Quote:
Originally Posted by hidnana View Post
We have an egrep search in a while loop.

egrep -w "$key" ${PICKUP_DIR}/new_update >> ${PICKUP_DIR}/update_record_new

Could you please help ?
In addition to the above, can you post an example of this $key? Perhaps using a regex optimizer will help. If the $key is assembled elsewhere and you want to keep it readable there, you could optimize it at the point of use instead, something like this:

Code:
grep -E -w "`regexopt $key`" ...
  #6  
Old 02-12-2008
Registered User
 

Join Date: Feb 2008
Posts: 10
I have uploaded the $key as a screenshot, as I don't have the text version right now. It's a big string concatenated with "|".

Can you please tell me which is better than egrep: grep, perl, or sed?
And why should egrep take around 50-60 seconds per iteration?
And will splitting the ${PICKUP_DIR}/new_update file into multiple files, and searching each file until a match is found, help in any way?
Attached Images
File Type: bmp egrep-issue.bmp (614.3 KB, 4 views)
  #7  
Old 02-12-2008
vino
Supporter (in vino veritas)
 

Join Date: Feb 2005
Location: Bangalore, India
Posts: 2,683
Quote:
Originally Posted by hidnana View Post
I have uploaded the $key as a screenshot, as I don't have the text version right now. It's a big string concatenated with "|".

Can you please tell me which is better than egrep: grep, perl, or sed?
And why should egrep take around 50-60 seconds per iteration?
And will splitting the ${PICKUP_DIR}/new_update file into multiple files, and searching each file until a match is found, help in any way?
Are the keys separated by a '|'? Or is the whole thing a single key in itself?

If the keys are separated by '|', then change the file so that each key is on its own line. Then run:
Code:
egrep -f key.txt ${PICKUP_DIR}/new_update
I don't know whether you would gain anything by splitting up the file.
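For the "each key on its own line" step, a minimal sketch might look like the following. The key value and the sample records are hypothetical (the real key was only posted as a screenshot). The -F flag and LC_ALL=C are extra assumptions not made in this thread: -F is only appropriate if the keys are plain strings rather than regular expressions, and the C locale often speeds up grep on large files but should be checked against your data.

```shell
#!/bin/sh
# Sketch only: the key and the sample records below are invented.
workdir=$(mktemp -d)

# Hypothetical '|'-joined key, standing in for the real one.
key='A100|C300|Z999'

# Split on '|' so that each key lands on its own line of key.txt.
printf '%s\n' "$key" | tr '|' '\n' > "$workdir/key.txt"

# Stand-in for ${PICKUP_DIR}/new_update.
printf '%s\n' 'A100 first record' 'B200 second record' \
    'C300X not a whole word' 'x C300 third record' > "$workdir/new_update"

# -F: treat every line of key.txt as a literal string (assumption).
# -w: match whole words only, as in the original egrep call.
matches=$(LC_ALL=C grep -F -w -f "$workdir/key.txt" "$workdir/new_update")
echo "$matches"
```

If the keys do contain regular-expression syntax, dropping -F and keeping grep -E -w -f should behave like the original egrep, one pattern per line.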