Performance improvement in grep


 
# 1  
Old 09-17-2014

The script below is used to search for numeric data in around 400 files in a folder, and I have 300 such folders. I need help improving the script's performance.

It searches 20 such folders (around 300 files in each folder) simultaneously, which pushes CPU utilization up to about 90%. What changes would improve its performance?

Code:
  # Original approach: cat every *.txt file together and scan the combined stream for 99023.
  find . -type f -name "*.txt" -print | xargs cat | egrep 99023 >> myresult.text &

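The post does not show the wrapper that launches the 20 folder searches. As a rough sketch only (assuming the 300 folders are immediate subdirectories of the current directory and an xargs that supports -P, as the GNU and BSD versions do), one way to cap concurrency at 20 instead of backgrounding each find by hand would be:

Code:
  # Sketch only: at most 20 per-folder searches run at once; each uses the
  # same pipeline as above. "_" becomes $0 inside sh -c; the folder path is $1.
  ls -d */ | xargs -n 1 -P 20 sh -c \
      'find "$1" -type f -name "*.txt" -print | xargs cat | egrep 99023' _ >> myresult.text
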
# 2  
Old 09-17-2014
Not sure if it will make much of a difference, but something like this:
Code:
# Let grep read the files directly (no cat); -F matches 99023 as a fixed string.
find . -type f -name '*.txt' -exec grep -F 99023 {} + > myresult.text

You can use grep -Fh if you want to omit the file names...
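For reference, the variant without the file-name prefixes would be the same command with -h added, e.g.:

Code:
  find . -type f -name '*.txt' -exec grep -Fh 99023 {} + > myresult.text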

---
When you say it increases CPU utilization to 90%, why would that be bad performance? I can imagine that if it takes too long you would like to speed it up, but the fact that it temporarily uses a lot of CPU is not necessarily bad.

# 3  
Old 09-17-2014
I used egrep directly in the folder instead of going through find and cat, but the performance issue is still there. Is there another command that would perform better? Please advise.
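(The exact command is not quoted in the post; presumably something along the lines of the sketch below. Since 99023 is a literal string, grep -F should do no worse than egrep, which treats the pattern as a regular expression.)

Code:
  # Presumed direct in-folder search (the actual command is not shown in the post).
  egrep 99023 *.txt >> myresult.text
  # Fixed-string form of the same search; the pattern is not treated as a regex.
  grep -F 99023 *.txt >> myresult.text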

---------- Post updated at 09:12 PM ---------- Previous update was at 09:11 PM ----------

@Scrutinizer
Thanks for your input, will try it out!
# 4  
Old 09-17-2014
Take a leaf out of Google's book: index your data.

They do billions of searches on 30 trillion web pages every month.
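
As a very rough sketch of that idea (assuming a grep with the -o, -H and -E options, and that the targets are whole numeric tokens; numeric.index below is just an illustrative name), the scan cost can be paid once and later lookups answered from a small index file:

Code:
  # One-time pass: record every numeric token together with the file it came from.
  find . -type f -name '*.txt' -exec grep -oHE '[0-9]+' {} + | sort -u > numeric.index

  # Later lookups read only the index, not the 300 folders of data files.
  grep ':99023$' numeric.index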