No more suggestions.
E.g. grep '^FLOW' isn't noticeably faster - a search takes about the same time as finding the next line.
And an RE search of a simple "string" has zero overhead compared to plain search.
An 8GB file should take about 2 minutes - this is fast.
Everything else - sed, awk, perl is slower.
This amount of log data should be written to a database; at the very least, a text log file should be rotated more often!
I disagree -
Solaris 10 M4000, ksh, ^ is faster, files are 50MB ~510000 lines +/- 40 lines between them:
contents of t.shl:
Results ( I ran it twice to show the effect of filesystem and disk controller caching):
Note the user-mode times. I know the OP is on a different box, so this may not be a fair comparison. However, scale the (red) user times up by a factor of 8GB/50MB:
1GB/50MB is about 20, and 20*8 gives 160
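The scaling step can be checked with shell arithmetic. This is only a sketch: the user time below is a made-up placeholder, not one of the measured figures, and should be replaced with the value from your own run.

```shell
#!/bin/sh
# Scale a measured user time from a 50MB sample up to an 8GB file.
file_mb=50
target_mb=$((8 * 1000))          # 8GB in decimal megabytes, matching the rough estimate
factor=$((target_mb / file_mb))  # 8000 / 50
echo "scale factor: $factor"

# HYPOTHETICAL user time (seconds) for one 50MB run; substitute the measured value.
user_secs=0.5
scaled=$(awk -v t="$user_secs" -v f="$factor" 'BEGIN { printf "%.1f", t * f }')
echo "estimated user time for 8GB: ${scaled}s"
```

The estimate assumes user time grows linearly with file size, which is reasonable for a single pass over the file but is still an assumption.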
so:
two points:
If you ran a timing comparison of '^FLOW' vs 'FLOW' (in that order) on the same file, your results were confounded by caching.
The user time is independent of caching and reflects the work a regex does.
Henry Spencer wrote a white paper on this kind of thing; I cannot find it, so I cannot cite it.
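One way to keep caching out of such a comparison is to read the file once before timing anything and then run both patterns in both orders. The sketch below is not the t.shl script from this post; the generated sample file, its layout, and the helper name time_grep are assumptions for illustration. It uses the shell's times builtin, whose second output line is the user and system time of child processes (grep here).

```shell
#!/bin/sh
# Assumed sample data: 200000 lines, one in every 40 starting with FLOW.
sample=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 200000; i++) {
        tag = (i % 40 == 0) ? "FLOW" : "data"
        print tag, i
    }
}' > "$sample"

# Read the file once so every timed run sees a warm cache.
cat "$sample" > /dev/null

# Hypothetical helper: run grep in a subshell, then print that subshell's times.
time_grep() {
    ( grep -c -- "$1" "$sample"; times )
}

# Run both patterns in both orders so the ordering cannot favour either one.
for pattern in '^FLOW' 'FLOW' 'FLOW' '^FLOW'; do
    echo "pattern: $pattern"
    time_grep "$pattern"
done

anchored=$(grep -c '^FLOW' "$sample")
unanchored=$(grep -c 'FLOW' "$sample")
rm -f "$sample"
```

On a file this small the times will be close to zero; point sample at a real log to get meaningful numbers, and compare the user figures rather than elapsed time.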
No more suggestions.
E.g. grep '^FLOW' isn't noticeably faster - a search takes about the same time as finding the next line.
And an RE search of a simple "string" has zero overhead compared to plain search.
An 8GB file should take about 2 minutes - this is fast.
Everything else - sed, awk, perl is slower.
No offense intended, but all of those unqualified statements are worthless. I have done some work with NFA (cached to DFA) regular expression engines, and the nature and quality of implementations varies massively.
While jim's implementation performs better with an anchor, GNU grep 2.5.1 does much worse: it takes more than twice as long. (The tests were repeated multiple times in differing order on obsolete hardware, and there was never a discrepancy.)
As an aside, some implementations will silently optimize depending on the contents of the pattern. A BSD example from OpenBSD :: grep.c:
My point: regular expression performance is highly implementation-dependent, and unqualified statements are seldom valid.
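Whether a particular grep treats a plain string differently from a regex can be checked rather than assumed. The following is a rough sketch, not taken from any implementation's source: it runs the same literal through the default matcher and through grep -F on an assumed, generated sample file and prints the times for each.

```shell
#!/bin/sh
# Assumed sample data: 200000 lines, one in every 40 starting with FLOW.
sample=$(mktemp)
awk 'BEGIN {
    for (i = 1; i <= 200000; i++) {
        tag = (i % 40 == 0) ? "FLOW" : "data"
        print tag, i
    }
}' > "$sample"
cat "$sample" > /dev/null   # warm the cache before timing

echo "default matcher (grep):"
( grep -c 'FLOW' "$sample"; times )

echo "fixed-string matcher (grep -F):"
( grep -F -c 'FLOW' "$sample"; times )

regex_count=$(grep -c 'FLOW' "$sample")
fixed_count=$(grep -F -c 'FLOW' "$sample")
rm -f "$sample"
```

If the two user times are indistinguishable on a large file, that particular grep is probably taking the same path for both; a clear gap suggests it is not. Either way the result only describes the implementation and version it was run on.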
I thought GNU grep was the crème de la crème of speed.
On my system (GNU grep 2.6.3) it performs better with an anchor than without one.
The original author (who does not maintain it any longer) has thoroughly defended it. Not sure if after all these years it has been beaten by something else.