grep -f CPU performances


 
# 1  
Old 11-26-2008

Hi

I would like to thank you all for this excellent forum.
Today I tried to compare two files and ran into a problem.
I have two files, and I want to get all the lines from the second file that match an entry in the first file, like this:

File1 (pattern file)
___________________________
9007
9126
9918
9127
9977
___________________________

File 2
_______________________________
9124 2008-12-11 16:00:00
4963 2007-12-16 17:00:00
9126 2006-11-11 16:00:00
9127 2007-12-10 17:00:00
3912 2008-10-11 18:00:00
______________________________

This is how the output file should be
________________________________
9127 2007-12-10 17:00:00
9126 2006-11-11 16:00:00
________________________________

The first file has more than 50,000 lines and the second file has more than 600,000 lines.
I used "grep -f file1 file2 > output.file",
but this takes too long. I let it run on my Intel 2x1.8 GHz machine (grep kept the processor at 100% load) for 3 hours and did not get any results.

I also tried splitting the first file (the pattern file) into smaller parts, but again there were no results after waiting 3 hours.
This is the script that I used to split the file and run "grep -f":
_________________________________________________
split -l 100 file1 file1.split.
for CHUNK in file1.split.* ; do
grep -f "$CHUNK" file2
done
rm file1.split.*
_________________________________________________

Does anyone know how I can do this faster?

Thanks in advance.
# 2  
Old 11-27-2008
I would suggest using a database.
Create two tables and run:
Code:
select table2.* from table1, table2 where table1.id = table2.id;

# 3  
Old 11-27-2008
Thank you for your answer, but I don't have any experience with databases; I have never used them.
Do you have a guide on how to create a database (Oracle or SQL?), how to import the files into it, and so on?
Anything that I can use to do that would help.

Thanks
# 4  
Old 11-27-2008
I think MySQL should be sufficient. I cannot provide a detailed how-to, but here are the main steps you have to do:

* install and start the MySQL server
* set the root password with mysqladmin, e.g. mysqladmin -u root password 'NEWPASSWORD'
* create one database with two tables
* use the LOAD DATA INFILE statement to import the data
* use a SELECT statement to process the data

Maybe you can find someone in a MySQL forum who can explain this in detail from memory. I would also have to consult the documentation. But you really should use a database.
# 5  
Old 11-27-2008
awk '{ if (NR==FNR) { my_array[$1]=$1; next;} if ( $1 in my_array ) {print $0}}' file1 file2

Try the above one-liner. I'm not sure about the performance.
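For anyone who wants to see why the one-liner works, here is a minimal sketch run against the sample data from the first post. It uses a shorter but equivalent form of the same awk program; the temporary directory and the array name `seen` are only illustrative. The keys from file1 are stored in an array, so each line of file2 needs just one lookup instead of being tested against every pattern.

```shell
# Recreate the sample files from the first post in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 9007 9126 9918 9127 9977 > "$tmpdir/file1"
cat > "$tmpdir/file2" <<'EOF'
9124 2008-12-11 16:00:00
4963 2007-12-16 17:00:00
9126 2006-11-11 16:00:00
9127 2007-12-10 17:00:00
3912 2008-10-11 18:00:00
EOF

# While reading file1 (NR == FNR holds only for the first file),
# remember each key in the array and skip to the next line.
# While reading file2, print lines whose first field is a stored key.
result=$(awk 'NR == FNR { seen[$1]; next } $1 in seen' "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Note that the matching lines come out in the order they appear in file2, and only the first field is compared, so a number that happens to occur inside the date column is not treated as a match.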
# 6  
Old 11-27-2008
Thanks a lot, I will try that today.
# 7  
Old 11-27-2008
Thanks, manikantants.

awk '{ if (NR==FNR) { my_array[$1]=$1; next;} if ( $1 in my_array ) {print $0}}' file1 file2

Unbelievable, it takes only 10 seconds.
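A possible alternative that stays with grep, sketched here under assumptions and not timed on the real files: the -F option makes grep treat the entries of the pattern file as fixed strings instead of regular expressions, which is usually much cheaper with a large pattern list, and -w restricts matches to whole words. Unlike the awk version, this matches a key anywhere on the line, not only in the first field, so it is only equivalent when the keys cannot also appear as a word in the other columns.

```shell
# Recreate the sample files from the first post in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 9007 9126 9918 9127 9977 > "$tmpdir/file1"
cat > "$tmpdir/file2" <<'EOF'
9124 2008-12-11 16:00:00
4963 2007-12-16 17:00:00
9126 2006-11-11 16:00:00
9127 2007-12-10 17:00:00
3912 2008-10-11 18:00:00
EOF

# -F: patterns are fixed strings, not regular expressions
# -w: a pattern must match a whole word
# -f: read the patterns from file1, one per line
result=$(grep -F -w -f "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"

rm -r "$tmpdir"
```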