The efficiency between GREP and SED???


 
# 1  
Old 09-13-2010

Hello Everyone!

I am a newbie. I'd like to extract key lines from a big text file using a regular expression. The file is nearly 22 MB.

grep or sed? Which would be the better, more efficient choice?

Or is there some other best practice?

Thank you in advance.

EverSmilie
# 2  
Old 09-13-2010
Either will work. There is not much difference in speed if you write an efficient regular expression.
# 3  
Old 09-13-2010
Unless you need to process lots of them or process within a short time limit, 22 megs really isn't that big these days anyway.
# 4  
Old 09-13-2010
Based on force of habit...

I would typically use GREP for your request.
In my mind:
GREP is to select out lines based on criteria
SED is a string editor to change lines
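
A minimal sketch of that split, assuming a small made-up sample file (the file contents and the ERROR/WARN strings are hypothetical, chosen only for illustration). It also shows that sed can select lines the way grep does, using -n together with the p command:

```shell
#!/bin/sh
# Hypothetical sample file; its contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' 'alpha ERROR one' 'beta ok' 'gamma ERROR two' > "$sample"

# grep: select the lines that match and leave them unchanged.
selected=$(grep 'ERROR' "$sample")

# sed: edit the stream; here, replace ERROR with WARN wherever it occurs.
edited=$(sed 's/ERROR/WARN/' "$sample")

# sed can also select, like grep: -n suppresses the default output
# and the p command prints only the matching lines.
selected_sed=$(sed -n '/ERROR/p' "$sample")

printf 'grep:\n%s\n\nsed s///:\n%s\n' "$selected" "$edited"
rm -f "$sample"
```

For plain line selection both tools give the same lines here, so the choice is mostly about which one states the intent more clearly.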
# 5  
Old 09-13-2010
So, just for kicks, I created a file of size 22 MB approx., consisting of random characters laid out in 129 cols X 180224 rows.

Code:
$
$ wc testfile_22mb.txt
  180224   180224 23248896 testfile_22mb.txt
$
$
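
A file of this shape could be generated along the following lines. This is only a sketch and not necessarily how the file above was made: it assumes /dev/urandom is available, and it assumes each row is 128 printable characters plus a newline, which is what the wc output implies (23248896 bytes / 180224 rows = 129 bytes per row). The output directory is also an assumption.

```shell
#!/bin/sh
# Assumed layout: 128 printable ASCII characters plus a newline per row,
# which gives the 129 bytes per row implied by the wc output.
rows=180224
cols=128
# Assumed location; adjust to taste.
testfile="${TMPDIR:-/tmp}/testfile_22mb.txt"

# tr -dc keeps only the bytes in the range '!' to '~' (printable, no space),
# fold cuts the endless stream into rows of $cols characters,
# and head stops the pipeline after $rows rows.
LC_ALL=C tr -dc '!-~' < /dev/urandom | fold -w "$cols" | head -n "$rows" > "$testfile"

wc "$testfile"
```

Because the data is random, the number of lines matching any given pattern will differ from run to run.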

Sample data -

Code:
$
$ head testfile_22mb.txt
v^zEh6p7qQn($_%8UMl@\)z3rZy#b"JJtf;n"S]#}^y^i3X(5U$7xv7OWU"e/):\f>kZGjL5OnMQ5)a9d?T@MSU9.8{Dye\CSJZbB#El_GSM*AY=pbiXX1Wf%Jj>:Zru
eyL(/Y^=5TyG(8d4hHl"T;.c4`(v7[L#4Jjf!-`i80/5cC[T'}D\Q\4Pv<.Xq}&%9@\g%P"i9|8lOS$Ge`sOV9vytDvj'2JoWif2;tBI1O"0isnG0UCb_NUG:n{|`{Wi
9fN.^iYE?}q9ol#b0MUDctWnV>U_JFWhlzug!tglJPq,t!nzw'l_MEYZd:YF|jc:-z`3&8"E^thk!IWE'd.g_-Cf'x]dqhE9S#2`L0bEPiyC3Bjf[^8rsTjeB>v9Q357
i8FHk4>TyTKzB<*=!nJ7\DyXe4;[?:u9yN"aP9Q3AE}b>%@q%}d0?\(uh29NTICsYE>izM*-J;vjieLh}`fx<XTgHY'RAH\5aD="H9$eHy7GCcZrM4kufnvh&XQNl7wt
S_Yf)AJtpw\m\\'XU`pJn3r+L(,+v3Y(u6OC%cX#dQH<}K:pb{PyWm:Kmf#2Uq1L{ox=:[w(FnX+lDm+ge&qy++$c)&RS+>;:&P!_^"Ubci[>uKC!HG^iiWY.C^}U&q<
l4Nx%\[Mkm:3`m).+70lmi|IQY|J6-jjyEh\,`\C?X%4z2kO/4UqgElA\^im%%9WLQw[O1,>;8`+-@j"\*ND/-a`c*P$LiU%V82Ocl;x)iK@&tJmKMNxMoH>D@7)jX.$
q&Io2SODUy;ERj=Stbk0,Yx^^Qf{9$[|?13=9Z<=x?fk61%E`t9dClB]}ZGc#xinEZ`U1U@vq8')fSQ&u{G;H+wg3@=F,FQ'p1d=(0MyBo/rX\ej+YGVDM(|S4/,Plw`
zJo9#Qc`t}Rf1NoC4;,@8*Cp_P=!%p86>i34{}2u?FGCkE/)R9!BxQSJ]?m4OMqj|@Xxk64ytR^F`{.kmT5I;PLAwOBxRu@xD#Q9:06_<YhUy'Kr6}VG}-2`E`:Ge^Bn
BT*%'!tLV`wX]qo2<1Q0cfv+=;UrWSj,2G-1j7z97Sy9aj93|M2X{|}mCNs[MZ^"Qs%Wt]CO/j?CyTlo]&>\6gxR^=S|c9.6G}y3m;32[j3e.0f\(pC9n9FN`LO;N)T$
AXY"pLblBaA8ztKWIXGK#|6N?REvu%F53;G$3n:JJI7kx>Q9<h29}^CnvmO<!?=*9LI?|:L^Fd{=U8^f^`)ej@|D0Ifp`G(R5=Hx6z!T/'>d3pf^vD1zG@BN29d'i&`t
$
$

And I tested the existence of pattern "XYZ" with grep, sed, awk and Perl.
The time taken, in descending order, is -

Code:
$
$
$ time perl -ne 'print if /XYZ/' testfile_22mb.txt 1>/dev/null
real    0m0.982s
user    0m0.015s
sys     0m0.015s
$
$ time sed -n '/XYZ/p' testfile_22mb.txt 1>/dev/null
real    0m0.771s
user    0m0.624s
sys     0m0.062s
$
$ time awk '/XYZ/' testfile_22mb.txt 1>/dev/null
real    0m0.300s
user    0m0.186s
sys     0m0.031s
$
$ time grep "XYZ" testfile_22mb.txt 1>/dev/null
real    0m0.152s
user    0m0.077s
sys     0m0.015s
$
$

That is a few microseconds less than if I had actually printed the results to the terminal; each command matches and prints 31 lines.

So, not much difference if done once. But it will add up if you are doing this a gazillion times inside a loop.
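
One way to keep that cost from adding up, sketched under assumptions: if the loop exists only because there are several patterns to look for, grep can take all of them in a single pass with -f, so the file is read once instead of once per pattern. The data file and pattern list below are hypothetical, and -F assumes the patterns are fixed strings rather than regular expressions.

```shell
#!/bin/sh
# Hypothetical data file and pattern list, made up for illustration.
data=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'aXYZb' 'nothing here' 'ccABCdd' 'XYZ and ABC' 'qq' > "$data"
printf '%s\n' 'XYZ' 'ABC' > "$patterns"

# Slow shape: one grep process, and one full read of the file, per pattern.
# A line matching several patterns is printed several times, hence sort -u.
looped=$(
    while IFS= read -r pat; do
        grep -F -- "$pat" "$data"
    done < "$patterns" | sort -u
)

# Faster shape: a single pass over the file with all patterns at once.
# -F treats each pattern as a fixed string, -f reads one pattern per line.
single=$(grep -F -f "$patterns" "$data" | sort -u)

[ "$looped" = "$single" ] && echo "same lines selected"
rm -f "$data" "$patterns"
```

Both shapes select the same set of lines; the second starts one process and reads the file once, however many patterns there are.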

tyler_durden

Last edited by durden_tyler; 09-13-2010 at 04:45 PM..
# 6  
Old 09-13-2010
Quote:
Originally Posted by joeyg
SED is a string editor to change lines
Just a minor correction: sed is a stream editor.