How to improve grep performance...


 
# 1  
Old 02-13-2008
How to improve grep performance...

Hi All,

I am using the grep command to find the string "abc" in one file.

The content of the file is:
***********
abc = xyz
def= lmn
************

I have given the command below to redirect the output to a tmp file:

grep abc file | sort -u | awk '{print #3}' > out_file

Then I am searching for the contents of out_file in multiple files, using the command below:

grep -f out_file l*view_data_file

But this is very slow. Is there any way I can improve grep performance?
Thanks in advance
# 2  
Old 02-13-2008
Quote:
Originally Posted by pooga17

Code:
grep abc file | sort -u | awk '{print #3}' > out_file

I think you mean $3, not #3.
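For what it's worth, sorting after the awk may also help a little: sort -u on the full lines can still leave duplicate third fields, and duplicate patterns in out_file just make the later grep -f do redundant work. A sketch of the corrected pipeline (assuming the third whitespace-separated field is the value you want):
Code:
grep abc file | awk '{print $3}' | sort -u > out_file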

Quote:
Then I am searching for the contents of out_file in multiple files, using the command below:

grep -f out_file l*view_data_file

What's with the l*? Is that a typo?

Do you need to know which file contains the string? If not, it would be faster to merge all the files together, and then do the grep.

Code:
cat *data_files.dat  | grep -f out_file
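As an aside, you can also skip the extra cat and let grep read the files itself; with most greps, -h suppresses the filename prefix so the output looks the same (a sketch, using the same *data_files.dat glob as above):
Code:
grep -h -f out_file *data_files.dat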

Otherwise, you can do a parallelized search, assuming you can take advantage of a multi-CPU system:
Code:
for f in *data_files.dat ; do
   grep -f out_file $f  >>grep-out.$$  & 
done
wait
cat grep-out.$$

Of course, if there are thousands of dat files, this might bring the system "to its knees". In that case, you can have each grep handle 5 files at a time.

Code:
ls -1 *data_files.dat | {
  while read f1; do
    read f2
    read f3
    read f4
    read f5
    grep -f out_file $f1 $f2 $f3 $f4 $f5 >>grep-out.$$ &
  done
  wait    # the greps run in this subshell, so wait for them here
}
cat grep-out.$$

If any files contain spaces or strange characters, you'll need to enclose each variable in double-quotes.
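As another option, if GNU find and xargs are available, they can handle both the batching and the parallelism in one pipeline, and the -print0/-0 pairing keeps odd filenames safe without manual quoting (a sketch, not tested on your system; -P and -print0 are GNU extensions):
Code:
# up to 4 greps at a time, 5 files per grep
find . -maxdepth 1 -name '*data_files.dat' -print0 |
xargs -0 -n 5 -P 4 grep -f out_file >> grep-out.$$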
# 3  
Old 02-13-2008
Depending on the number and size of the files you are searching, and assuming you are searching for fixed strings rather than regular expressions, you may see better performance with fgrep rather than grep.
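For example, reading the same pattern file (fgrep treats each line of out_file as a literal string rather than a regular expression; on modern systems the preferred spelling is grep -F):
Code:
fgrep -f out_file l*view_data_file
# equivalent, and not deprecated:
grep -F -f out_file l*view_data_file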