grep -f taking long time for big file


 
# 1  
Old 04-05-2010

grep -f is taking a long time to compare big files. Is there a faster alternative?

I am using grep -f file1 file2 to check for duplicate/common rows. But file1 is 5 GB and file2 is 50 MB, and the comparison is taking a very long time.
Is there an alternative for a fast comparison?
# 2  
Old 04-05-2010
try 'fgrep'
Code:
User Commands                                            fgrep(1)



NAME
     fgrep - search a file for a fixed-character string
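
For the row comparison asked about here, whole-line matching is probably what is wanted rather than substring matching. A minimal sketch, assuming a POSIX-style grep that accepts -F (fixed strings, the same kind of matching fgrep does), -x (match whole lines only) and -f (read patterns from a file). The function name common_fixed is made up for illustration:

```shell
# common_fixed PATTERNS TARGET
# Print the lines of TARGET that are exactly equal to some line of PATTERNS.
# Hypothetical helper name; assumes grep supports -F, -x and -f (POSIX grep does).
common_fixed() {
    # -F: treat patterns as fixed strings, not regular expressions
    # -x: a pattern must match the entire line, not just part of it
    # -f: read one pattern per line from the file given as $1
    grep -F -x -f "$1" "$2"
}
```

Usage would look like: common_fixed file1 file2 > file3. Without -x, a pattern such as "1 2 3 4 5" would also match a longer row like "11 2 3 4 55". Whether this is fast enough for a multi-GB pattern file depends on the grep implementation and available memory.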

# 3  
Old 04-05-2010
Even fgrep won't scale for larger files, as the operation is O(n^2). The best approach is to hash one of the files and then do an O(1) (best case) lookup for each row of the other file.
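
A minimal sketch of that hashing idea using awk's associative arrays, assuming rows are compared as whole lines and that the smaller file fits in memory. The function name common_rows is made up for illustration:

```shell
# common_rows SMALL BIG
# Print every line of BIG that also occurs (exactly) in SMALL.
# Hypothetical helper name; SMALL is loaded into an in-memory hash,
# BIG is streamed once, so only SMALL needs to fit in memory.
common_rows() {
    awk '
        NR == FNR { seen[$0] = 1; next }   # first file: remember each line as a hash key
        $0 in seen                         # second file: print the line if it is a known key
    ' "$1" "$2"
}
```

Usage would look like: common_rows file2 file1 > file3, passing the 50 MB file first so that only it is held in memory while the 5 GB file is read once. Output lines appear in the order of the second file, and a row that occurs several times there is printed each time.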
# 4  
Old 04-05-2010
I want to compare the rows of file1 against the rows of file2: if a row of file1 is present in file2, it should be displayed, and this needs to be done for every row in file1. The content of the rows also varies. How can I do this? The current command, grep -f file1 file2, works for my scenario but takes a very long time. Is there an alternative?
# 5  
Old 04-06-2010
Can you please post some sample input and output data?
# 6  
Old 04-06-2010
Or read the man page for comm - it is designed to find rows common to two files (both inputs must be sorted).
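
A rough sketch of the comm route, assuming mktemp is available for temporary files and that there is enough disk space to hold sorted copies of both inputs. The function name common_sorted is made up for illustration:

```shell
# common_sorted A B
# Print the lines that occur in both A and B, using sort + comm.
# Hypothetical helper name. comm needs both inputs sorted with the same
# collation, so LC_ALL=C is forced for sort and comm alike.
common_sorted() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$a_sorted"   # -u also drops duplicate rows within a file
    LC_ALL=C sort -u "$2" > "$b_sorted"
    # -12 suppresses column 1 (only in A) and column 2 (only in B),
    # leaving column 3: the lines common to both files
    LC_ALL=C comm -12 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Usage would look like: common_sorted file1 file2 > file3. The output comes back in sorted order rather than in the original file order, and each common row is printed once.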
# 7  
Old 04-06-2010
file1.txt contains:
1 2 3 4 5
1 1 1 1 1
2 2 2 2 2
1 3 5 7 9
4 4 4 4 4

file2.txt contains:
1 2 3 4 5
1 3 5 7 9
7 7 7 7 7
9 9 9 99

The output in file3.txt should be:
1 2 3 4 5
1 3 5 7 9

Here grep -f file1.txt file2.txt > file3.txt works fine for small files, but for large files (several GB in size) it does not work: the command never comes back.
 