UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
#1
Performance analysis sed vs awk
Hi guys, I've wondered for some time how sed and awk compare in performance. Say I want to print lines from a very large file, for example a file with 100,000 records, and I want lines 25,000 to 26,000. I can do so with either of the following commands:
Code:
sed -n '25000,26000 p' filename
Code:
awk 'NR==25000,NR==26000' filename
Both yield the same results, but which one is better, or is there such a thing? Thanks
#2
It depends on your implementations of sed and awk, so there is no certain answer. It can vary quite a lot, and our results may not be relevant to your system.
I'd be curious whether this is faster than your other awk expression:
Code:
awk '(NR>=25000)&&(NR<=26000)' filename
The surefire way to find out is to try...
And, of course, it may be possible to alter your program's logic such that you don't need to eat 25,000 useless lines before your program can start working... What exactly are you trying to do here?
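As a rough sketch of "trying it", the script below first checks that the variants really print the same lines before any of them is timed. The generated sample file, the variable names, and the two early-exit variants (which stop reading after line 26,000 instead of scanning the rest of the file) are assumptions added for illustration; they are not from the thread.

```shell
#!/bin/sh
# Hypothetical 100,000-line sample file standing in for "filename".
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "record " i }' > "$sample"

# The three forms discussed in the thread.
sed -n '25000,26000p' "$sample" > "$sample.sed"
awk 'NR==25000,NR==26000' "$sample" > "$sample.awk_range"
awk '(NR>=25000)&&(NR<=26000)' "$sample" > "$sample.awk_cmp"

# Illustrative variants that quit once line 26000 has been handled,
# so the remaining lines of the file are never read.
sed -n -e '25000,26000p' -e '26000q' "$sample" > "$sample.sed_quit"
awk 'NR>26000 { exit } NR>=25000' "$sample" > "$sample.awk_exit"

# Compare every variant against the plain sed output.
mismatches=0
for variant in awk_range awk_cmp sed_quit awk_exit; do
    if cmp -s "$sample.sed" "$sample.$variant"; then
        echo "$variant: same output as sed"
    else
        echo "$variant: output differs"
        mismatches=$((mismatches + 1))
    fi
done

# Count the lines actually printed by the sed form.
lines=$(wc -l < "$sample.sed" | tr -d ' ')
echo "lines printed: $lines"

# Clean up the sample file and the per-variant outputs.
rm -f "$sample" "$sample".*
```

Once the outputs agree, any of the variants can be timed against each other; whether the early exit helps noticeably depends on how much of the file lies beyond the requested range.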
The following user says thank you to Corona688 for this useful post: Irishboy24 (07-05-2012)
#3
I think your awk expression would be much faster:
Code:
awk '(NR>=25000)&&(NR<=26000)' filename
Well, I'm not really trying to achieve anything, but I've been asked this question many times with regard to performance, so I thought I'd post it on the forums to find a good explanation. Thanks
#4
I saw a thread recently where a similar question was asked, and performance shown for something like 4 or 5 different awk and sed implementations. The numbers were quite odd. There didn't seem to be any clear answer.
So, it really just comes down to what works better for you.
#5
Agreed, mate. Thanks for indulging me, though!
#6
You could always try a comparison script to see how these perform on your machine, e.g.
Code:
#!/bin/sh
time `sed -n '25000,26000 p' filename`
time `awk 'NR==25000,NR==26000' filename`
exit 0
It would be trivial to add more information, or to cron this and take samples several times a day for a week. While sed and awk are among the more mature chunks of code in a modern Unix system, and one would figure that both are about as quick and elegant as they will ever get, there are still plenty of variables that could make a difference. You may find that at certain times of day, or when certain other processes are running, the speed differences vary wildly.
The following user says thank you to zer0sig for this useful post: Irishboy24 (07-06-2012)
#7
When using shell "time", the backticks are not required, as "time" is part of the shell syntax. Also, it is good to direct the output to /dev/null while testing, and the system cache needs to be taken into account: either make all reads come from an already cached state (for example, perform all tests twice and take the latter results), or make provisions so that there is no caching or the cache is reset for every test.
-- The thread with different awks and greps can be found here. The speed difference between the various awk implementations can vary wildly.
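A minimal sketch of the points above, assuming a shell in which "time" is a keyword (bash is used here; ksh and zsh behave similarly). The generated sample file and the variable names are hypothetical stand-ins, and the two-pass loop is only one way to make sure the reported figures come from cached reads.

```shell
#!/bin/bash
# Hypothetical 100,000-line sample file standing in for "filename".
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "record " i }' > "$sample"

# Read the file once so later runs are served from the cache.
cat "$sample" > /dev/null

runs=0
for pass in 1 2; do
    echo "pass $pass (keep the figures from pass 2)"
    # No backticks: "time" runs the command directly, and the
    # command's output is discarded so terminal I/O is not measured.
    time sed -n '25000,26000p' "$sample" > /dev/null
    runs=$((runs + 1))
    time awk 'NR==25000,NR==26000' "$sample" > /dev/null
    runs=$((runs + 1))
done

# Remove the temporary sample file.
rm -f "$sample"
```

The timings are written to the shell's standard error. On a file this small the differences may be within measurement noise, so a larger input or repeated runs would give a clearer picture.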
The following user says thank you to Scrutinizer for this useful post: Irishboy24 (07-06-2012)