Why is cut slower than awk?


 
# 1  
Old 02-05-2009

Hi all,

for testing purposes I tried the following two one-liners:

Code:
time awk '{print $4}' T_64xSC_128RW_K500.dat > /dev/null

and

Code:
time cut -d" " -f6 T_64xSC_128RW_K500.dat > /dev/null

The file contains approx. 250k lines. awk does it in 0.15 secs (real), cut in 0.44. The user time has about the same relation, whereas the sys time is almost identical in both cases.

The fact that awk is almost 8 times larger than cut (in kB) seems to make no difference.

Why is cut almost 4 times slower?

Cheers,
BG
# 2  
Old 02-05-2009
Quote:
Originally Posted by BandGap
The fact that awk is almost 8 times larger than cut (in kB) seems to make no difference.
Why would that make a difference?

Quote:
Originally Posted by BandGap
Why is cut almost 4 times slower?
Good question; I guess cut's code is just inefficient. Without seeing the source code, though, we can only guess. What OS is this on? I just tried on HP-UX, and awk was more than 3 times slower than cut. It probably depends on the nature of the input data too.
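
One way the input data can matter is field splitting. The following is a minimal sketch, assuming a made-up sample line with padded columns (the contents of the original .dat file are not known): awk's default separator collapses runs of blanks, while cut -d" " counts every single space, so the same column can be field 4 for awk and field 6 for cut.

```shell
# Hypothetical sample line: columns padded with two spaces (not from the original file).
line='a  b  c d'

# awk's default field separator collapses runs of blanks, so the 4th field is "d".
awk_out=$(printf '%s\n' "$line" | awk '{print $4}')

# cut -d" " treats every single space as a delimiter, so empty fields are counted:
# a, "", b, "", c, d  -> the 6th field is "d".
cut_out=$(printf '%s\n' "$line" | cut -d" " -f6)

echo "awk \$4: $awk_out"
echo "cut -f6: $cut_out"
```

Both commands print "d" for this line, although the field numbers differ.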
# 3  
Old 02-06-2009
Out of curiosity, I also tried some testing myself. On a file with 250 lines there's no difference.

But on a file with 1000 lines, cut was faster by 1 sec.

On a file with 10000 lines, cut was again faster:

Code:
FILE
450000 Feb  6 14:32 test.file

CONTENTS
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
the quick brown fox jumped over the lazy dog
...
...

USING AWK
real    0m0.07s
user    0m0.05s
sys     0m0.10s

USING CUT
real    0m0.02s
user    0m0.01s
sys     0m0.02s

I'm using HP-UX. Maybe awk is better for files far larger than 10000 lines while cut is better for smaller files. Or maybe the difference comes from what each tool is primarily designed for. I'm not sure about this, though; I need to do some more tests.
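
A test like this one could be reproduced along the following lines. This is a minimal sketch, not the actual procedure used in this post: the repeat count `runs`, the helper `bench`, and the choice of field 4 for both tools are assumptions made for illustration. It builds a 10000-line file of the sentence shown above (45 bytes per line, 450000 bytes in total) and repeats each command several times, since a single sub-second run is easily dominated by noise.

```shell
#!/bin/sh
# Build a 10000-line test file like the one shown above (45 bytes per line).
testfile=$(mktemp)
yes 'the quick brown fox jumped over the lazy dog' | head -n 10000 > "$testfile"

runs=20   # assumption: arbitrary repeat count to smooth out noise

# Run the given command $runs times on the test file inside a subshell,
# then print that subshell's CPU times with the POSIX `times` builtin.
bench() {
    (
        i=0
        while [ "$i" -lt "$runs" ]; do
            "$@" "$testfile" > /dev/null
            i=$((i + 1))
        done
        times
    )
}

echo "awk:"; bench awk '{print $4}'
echo "cut:"; bench cut -d" " -f4

# Record the file's dimensions, then clean up.
lines=$(wc -l < "$testfile")
bytes=$(wc -c < "$testfile")
rm -f "$testfile"
```

The second line printed by `times` is the accumulated user and system time of the child processes, i.e. of the awk or cut runs.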

Last edited by angheloko; 02-06-2009 at 03:03 AM..
# 4  
Old 02-06-2009
Well I did some testing as well. The system which I used in the first post was a QuadCore AMD with Lustre file system (used mainly on clusters).

I just did the same thing on a Pentium 4 with ext3, using a file of roughly the same size, and the results were exactly the opposite. The awk time was almost the same on both systems, but cut ran approx. 6 times faster.

Of course, the times are still in the sub-second range, but the test file was one of the smaller ones I need to process...

Thanks for the feedback!

BG