Visit The New, Modern Unix Linux Community


Performance Issue for a file search command


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Performance Issue for a file search command
# 1  
Performance Issue for a file search command

Hi All,

This query is regarding performance improvement of a command.

I have a list of IDs in a file (say file1 with single ID column) and file2 has the data rows.

I need to get the IDs from file1 and search in file2, matching rows from file2 should be written to a file3.

For this scenario I have been using below command-

Code:
 for ID in `cat file1`; do grep $ID file2; done > file3

The command I am using is super slow. Can someone please let me know how can I improve the performance here.


Thanks in advance,
Tanu
# 2  
From man grep:
Code:
       -f FILE, --file=FILE
              Obtain patterns  from  FILE,  one  per  line.   The  empty  file
              contains zero patterns, and therefore matches nothing.

So
Code:
grep -f file1 file2 > file3

This User Gave Thanks to Corona688 For This Post:
# 3  
Thank you for your reply @Corona688. But I see the grep command mentioned by you is also running very slow. I am working with files having millions of records.

Let me know if any other way to improve the search.
# 4  
How big is file1? If it's larger than available memory, it will of course be slow.
# 5  
Hi Corona688,

Code:
grep -F -f file1 file2 > file3

The above code (added -F) worked really fast for me. I could complete search for a file with 1.5 million records in 5-6 seconds.

Really appreciate for your response.

Thanks,
Tanu
This User Gave Thanks to Tanu For This Post:

Previous Thread | Next Thread
Thread Tools Search this Thread
Search this Thread:
Advanced Search

Test Your Knowledge in Computers #872
Difficulty: Medium
In computer science, self-modifying code is code that alters its own instructions while it is executing.
True or False?

10 More Discussions You Might Find Interesting

1. AIX

Performance issue

Hi We have an AIX5.3 server with application which is written in C. We are facing server (lpar) hangs intermediately. If we open new telnet window prompts for user and takes hell of a time to authenticate, not only that if we run ps -aef then also it takes lot of time. surprisingly there is no... (2 Replies)
Discussion started by: powerAIX
2 Replies

2. Shell Programming and Scripting

Performance issue while using find command

Hi, I have created a shell script for Server Log Automation Process. I have used find xargs grep command to search the string. for Example, find -name | xargs grep "816995225" > test.txt . Here my problem is, We have lot of records and we want to grep the string... (4 Replies)
Discussion started by: nanthagopal
4 Replies

3. AIX

Performance issue

Hi, We have 2 lpars on p6 blade. One of the lpar is having 3 core cpu with 5gb memory running sybase as database. An EOD process takes 25 min. to complete. Now we have an lpar on P7 server with entitled cpu capacity of 2 with 16 Gb memory and sybase as database. The EOD process which takes... (17 Replies)
Discussion started by: vjm
17 Replies

4. HP-UX

Performance issue with 'grep' command for huge file size

I have 2 files; one file (say, details.txt) contains the details of employees and another file (say, emp.txt) has some selected employee names. I am extracting employee details from details.txt by using emp.txt and the corresponding code is: while read line do emp_name=`echo $line` grep -e... (7 Replies)
Discussion started by: arb_1984
7 Replies

5. UNIX for Dummies Questions & Answers

Performance issue

hi I am having a performance issue with the following requirement i have to create a permutation and combination on a set of three files such that each record in each file is picked and the output is redirected in a specific format but it is taking around 70 odd hours to prepare a combination... (7 Replies)
Discussion started by: mad_man12
7 Replies

6. Shell Programming and Scripting

Performance issue or something else?

Hi All, I have the following script which I use in Nagios to check the health of the applications, the problem with it is that the curl part ($TOTAL) does not return anything after running for 2-3 hrs, even though from command line the script runs fine but not from Nagios. There are 17... (1 Reply)
Discussion started by: jacki
1 Replies

7. Solaris

Performance issue

Hi Gurus, I am beginner in solaris and want to know what are the things we need to check for performance monitoring on our solairs OS. for DISK,CPU and MEMORY. Also how we do ipforwarding in slaris Many thanks for your help Pradeep P (4 Replies)
Discussion started by: ppandey21
4 Replies

8. Shell Programming and Scripting

Performance issue in UNIX while generating .dat file from large text file

Hello Gurus, We are facing some performance issue in UNIX. If someone had faced such kind of issue in past please provide your suggestions on this . Problem Definition: /Few of load processes of our Finance Application are facing issue in UNIX when they uses a shell script having below... (19 Replies)
Discussion started by: KRAMA
19 Replies

9. UNIX for Advanced & Expert Users

Performance issue!

In my C program i am using very large file(approx 400MB) to read parts of it frequently. But due to large file the performance of the program goes down very badly. It shows very high I/O usage and I/O wait time. My question is, What are the ways to optimize or tune I/O on linux or how i can get... (10 Replies)
Discussion started by: mavens
10 Replies

10. Shell Programming and Scripting

performance issue

I want to read a file. is it good to use File I/O or shell script?? which one is the best option? (1 Reply)
Discussion started by: vishwaraj
1 Replies

Featured Tech Videos