Performance analysis sed vs awk


 
# 1  
Old 07-05-2012

Hi Guys,

I've wondered for some time about the relative performance of sed and awk. Say I want to print lines from a very large file, for example one with 100,000 records. If I want to print lines 25,000 to 26,000, I can do so with either of the following commands:

Code:
sed -n '25000,26000 p' filename

Code:
awk 'NR==25000,NR==26000' filename

Both will yield the same results, but is one better than the other, or is there no meaningful difference?

Thanks
# 2  
Old 07-05-2012
It depends on your implementations of sed and awk, so there's no certain answer. It can vary quite a lot, and our results may not be relevant to your system.

I'd be curious whether this is faster than your other awk expression: awk '(NR>=25000)&&(NR<=26000)' filename

The surefire way to find out is to try...

And, of course, it may be possible to alter your program's logic such that you don't need to eat 25,000 useless lines before your program can start working... What exactly are you trying to do here?
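One way to at least avoid reading the rest of the file is an explicit exit once the range is done. A sketch on a generated sample file (not benchmarked here, just illustrating the idea):

```shell
# Generate a 100,000-line sample file for the demo
seq 100000 > bigfile
# Print lines 25000-26000, then stop reading entirely;
# without the exit, awk would still scan lines 26001-100000
awk 'NR > 26000 { exit } NR >= 25000' bigfile > range.txt
```

On a file where the range sits near the start, skipping the tail of the file can matter more than any sed-vs-awk difference.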
This User Gave Thanks to Corona688 For This Post:
# 3  
Old 07-05-2012
I think your awk expression would be much faster:

Code:
awk '(NR>=25000)&&(NR<=26000)' filename

Well, I'm not really trying to achieve anything in particular, but I've been asked this question many times with regard to performance, so I thought I'd post it on the forums to find a good explanation.

Thanks
# 4  
Old 07-05-2012
I saw a thread recently where a similar question was asked, with timings shown for four or five different awk and sed implementations. The numbers were quite odd, and there didn't seem to be any clear winner.

So, it really just comes down to what works better for you.
# 5  
Old 07-05-2012
Agreed, mate. Thanks for indulging me though!
# 6  
Old 07-06-2012
You could always try a comparison script to see how these perform on your machine, e.g.:

Code:
#!/bin/sh
time `sed -n '25000,26000 p' filename`
time `awk 'NR==25000,NR==26000' filename`
exit 0

It would be trivial to add more information, or to cron this and take samples several times a day for a week. While sed and awk are among the more mature chunks of code in a modern Unix system, and one would figure that both are about as quick and elegant as they will ever get, there are still plenty of variables that could make a difference. You may find that during certain times of day, or when certain other processes are running, the speed differences vary wildly.
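In a crontab it might look something like this (script and log paths are just placeholders) to take a sample every six hours:

```
# m  h    dom mon dow  command
0    */6  *   *   *    /path/to/timetest.sh >> /path/to/timetest.log 2>&1
```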
This User Gave Thanks to zer0sig For This Post:
# 7  
Old 07-06-2012
When using the shell's "time", the backticks are not required, as "time" is part of the shell syntax. Also, it is good to direct the output to /dev/null while testing. The system cache needs to be taken into account too: either ensure all reads come from an already cached state (for example, perform every test twice and take the latter result), or make provisions so that there is no caching, or the cache is reset before every test.
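Putting those points together, a revised version of the earlier script might look like this (a sketch on a generated sample file; adjust the name and sizes to your case):

```shell
#!/bin/sh
# Generate a reproducible 100,000-line sample file
seq 100000 > bigfile
# Warm the cache with a throwaway read first
cat bigfile > /dev/null
# Time both commands, discarding output so terminal I/O
# does not skew the measurements
time sed -n '25000,26000 p' bigfile > /dev/null
time awk 'NR==25000,NR==26000' bigfile > /dev/null
```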


--
The thread with different awks and greps can be found here. The speed difference between the various awk implementations can vary wildly.
This User Gave Thanks to Scrutinizer For This Post:
 