Help in faster and quicker grepping (UNIX for Advanced & Expert Users): Post 302816231 by alister on Monday 3rd of June 2013 02:01:21 PM
Quote:
Originally Posted by MadeInGermany
No more suggestions.
E.g. grep '^FLOW' isn't noticeably faster - a search takes about the same time as finding the next line.
And an RE search of a simple "string" has zero overhead compared to plain search.
An 8GB file should take about 2 minutes - this is fast.
Everything else - sed, awk, perl is slower.
No offense intended, but all of those unqualified statements are worthless. I have done some work with NFA (cached to DFA) regular expression engines, and the nature and quality of implementations vary massively.

While jim's implementation performs better with an anchor, GNU grep 2.5.1 does much worse: the anchored search takes more than twice as long. (The tests were repeated multiple times in differing order on obsolete hardware, and there was never a discrepancy.)
Code:
$ yes 'FLOWWWWWWWW' | head -n1000000 | time -p grep -c 'FLOW'
1000000
real    1.84
user    0.71
sys     0.07
$ yes 'FLOWWWWWWWW' | head -n1000000 | time -p grep -c '^FLOW'
1000000
real    4.83
user    3.70
sys     0.10
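
For anyone who wants to repeat the comparison on their own grep, here is a rough sketch of how the runs could be scripted. The helper names gen_lines and run_case are made up for this example, and writing the input to a temporary file first is my assumption (it keeps the cost of yes and head out of the timing); the numbers above were taken with the plain pipelines shown.

```shell
# Sketch only: gen_lines and run_case are hypothetical helper names.

# Emit N copies of the test line used in the post.
gen_lines() {
    yes 'FLOWWWWWWWW' | head -n "$1"
}

# Time one pattern against a prepared input file.
# grep -c prints the match count on stdout; time -p reports on stderr.
run_case() {
    pattern=$1
    file=$2
    time -p grep -c -- "$pattern" "$file"
}

# Example driver, left commented out so the helpers can be sourced safely.
# The pattern list alternates the order between the two cases.
# tmp=$(mktemp) && gen_lines 1000000 > "$tmp"
# for round in 1 2 3; do
#     for pattern in 'FLOW' '^FLOW' '^FLOW' 'FLOW'; do
#         echo "pattern: $pattern" >&2
#         run_case "$pattern" "$tmp"
#     done
# done
# rm -f "$tmp"
```

Results will differ between grep implementations and versions, which is the whole point of measuring rather than assuming.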

As an aside, some implementations silently optimize depending on the contents of the pattern. A BSD example, from OpenBSD's grep.c:
Code:
	for (i = 0; i < patterns; ++i) {
		/* Check if cheating is allowed (always is for fgrep). */
#ifndef SMALL
		if (Fflag) {
			fgrepcomp(&fg_pattern[i], pattern[i]);
		} else
#endif
		{
			if (fastcomp(&fg_pattern[i], pattern[i])) {
				/* Fall back to full regex library */
				c = regcomp(&r_pattern[i], pattern[i], cflags);
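
The same idea can be imitated from the outside with a wrapper. What follows is a much simplified analogue of my own, not what fastcomp actually does: it only checks whether the pattern contains any basic regular expression metacharacter and, if not, hands it to grep -F. The names is_plain_string and smart_grep are hypothetical.

```shell
# Sketch only: is_plain_string and smart_grep are hypothetical names,
# not part of any real grep.

# Succeed if the pattern contains none of the BRE special characters.
# Conservative: a lone ']' is also treated as special.
is_plain_string() {
    case $1 in
        *'.'*|*'*'*|*'['*|*']'*|*'^'*|*'$'*|*'\'*) return 1 ;;
        *) return 0 ;;
    esac
}

# Use fixed-string matching when it is safe, the regex engine otherwise.
# Usage: smart_grep PATTERN [grep options and files...]
smart_grep() {
    pattern=$1
    shift
    if is_plain_string "$pattern"; then
        grep -F -e "$pattern" "$@"
    else
        grep -e "$pattern" "$@"
    fi
}
```

With this, smart_grep 'FLOW' file goes down the fixed-string path while smart_grep '^FLOW' file goes through the regex engine. Whether that is actually faster depends, once again, on the grep underneath.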

My point: regular expression performance is highly implementation-dependent, and unqualified statements are seldom valid.

Regards,
Alister
 
