Awk Versus Cut


 
# 1  
Old 12-28-2009

Hello ALL,

I want to compare two commands, one using awk and one using cut, that both produce the same result as the awk command below.

This is purely about speed: the task is checking Apache logs for unique IPs.

Contender #1

Code:
awk '{!a[$1]++}END{for(i in a) if ( a[i] >10 ) print a[i],i }' access_log
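For readers unfamiliar with the idiom, here is a minimal sketch of what the awk one-liner does, expanded and commented. The sample data, the `count_hits` function name, and the 192.0.2.x addresses are made up for illustration. The leading `!` from the original is dropped here because its result is discarded; the increment is what does the work. Note that `> 10` means strictly more than 10 hits.

```shell
# Hypothetical sample log: the first space-separated field is the client IP.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
awk 'BEGIN {
    for (i = 0; i < 12; i++) print "192.0.2.1 - - \"GET / HTTP/1.1\" 200"
    for (i = 0; i < 3; i++)  print "192.0.2.2 - - \"GET / HTTP/1.1\" 200"
}' > "$sample"

# count_hits FILE: print "count ip" for every IP seen more than 10 times.
count_hits() {
    awk '
        { hits[$1]++ }              # one counter per distinct first field
        END {
            for (ip in hits)        # iteration order is unspecified in awk
                if (hits[ip] > 10)  # strictly more than 10 hits
                    print hits[ip], ip
        }
    ' "$1"
}

count_hits "$sample"
```

Because awk keeps one counter per distinct IP and reads the log once, no sorting is needed, which is also why its output order looks arbitrary.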

I need a command similar to the awk command above that performs the same check using cut. It should disregard IPs that have fewer than 10 entries in the Apache access log.

I need to modify the command below to achieve this. My co-worker and I have a bet.

Contender #2

Code:
FILE=/usr/local/apache/access_log
cut -d ' ' -f 1 "$FILE" | sort | uniq -c


Output of the awk command above:
Code:
159070 67.72.16.xxx
14 41.223.30.22
159074 67.72.16.xxx
6586 10.4.20.xxx
6614 67.72.16.xxx


Please let me know,

Jaysunn

Last edited by jaysunn; 12-28-2009 at 08:55 PM..
# 2  
Old 12-28-2009
You mean you need to filter out the entries with fewer than 10 occurrences? For example:
Code:
 | egrep -v ' {6}'

-or-
Code:
 | grep '[0-9][0-9] '
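Both filters work on the count column that `uniq -c` prints in front of each line: the first drops lines whose count is padded with six spaces (a single-digit count with GNU uniq's padding), the second keeps lines where two digits are followed by a space. As a hedged sketch, the variant below anchors the match at the start of the line so it does not depend on how wide the padding is. The `sample_log` and `frequent_ips` names and the sample addresses are hypothetical. Also note that a two-digit test keeps counts of 10 and up, while the awk command's `> 10` starts at 11.

```shell
# Hypothetical sample data: 12 hits from one address, 3 from another.
sample_log() {
    i=0
    while [ "$i" -lt 12 ]; do
        echo '10.0.0.1 - - "GET / HTTP/1.1" 200 512'
        i=$((i + 1))
    done
    i=0
    while [ "$i" -lt 3 ]; do
        echo '10.0.0.2 - - "GET / HTTP/1.1" 200 512'
        i=$((i + 1))
    done
}

# Keep only lines whose count has two or more digits (10 or more hits).
# Anchoring at the line start avoids relying on the padding width of uniq -c.
frequent_ips() {
    cut -d ' ' -f 1 | sort | uniq -c | grep -E '^ *[0-9]{2,} '
}

sample_log | frequent_ips
```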

# 3  
Old 12-28-2009
Yes,
I am trying to see which command is fastest at searching through a large Apache access log.

I have heard differing opinions about whether cut or awk is faster at performing the search and reporting the results. These tests are run from bash on RHEL.

I feel that the awk command is superior. However, I need to confirm that, because the cut command I constructed is missing the part that drops IPs with fewer than 10 entries in the input file. And to be completely honest, I cannot construct a cut command that will achieve this.

I hope I have explained this well enough.

Regards,

Jaysunn
# 4  
Old 12-28-2009
Hi jaysunn,

You can stick either of these filters at the end of your cut-sort-uniq sequence:
Code:
cut -d ' ' -f 1 "$FILE" | sort | uniq -c | grep '[0-9][0-9] '

And that should give you your output.
# 5  
Old 12-28-2009
Quote:
Originally Posted by Scrutinizer
Hi jaysunn,

You can stick either of these filters at the end of your cut-sort-uniq sequence:
Code:
cut -d ' ' -f 1 "$FILE" | sort | uniq -c | grep '[0-9][0-9] '

And that should give you your output.

Just this:
Code:
awk '/[0-9][0-9]/ { print $1 }' $FILE | sort -u

# 6  
Old 12-29-2009
Hi, the OP is looking for an alternative to awk in order to compare it.

Last edited by Scrutinizer; 12-29-2009 at 05:04 AM..
# 7  
Old 12-29-2009
Hello Scrutinizer,

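In case you were wondering: awk destroyed the competition, finishing in under 3 seconds against about 2 minutes for the cut pipeline on a 255M access log.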

Code:
[root@radio10 testing]# ls -lah
total 255M
drwxr-xr-x   2 root root 4.0K Dec 29 09:18 .
drwxr-x---  15 root root 4.0K Dec 29 09:18 ..
-rw-r--r--   1 root root 255M Dec 29 09:15 access_log

Code:
[root@server1 testing]# FILE=/root/testing/access_log

Code:
[root@server1 testing]# time cut -d ' ' -f 1 "$FILE" | sort | uniq -c | grep '[0-9][0-9] '
 598129 10.4.20.236
 179838 67.72.16.134
    215 67.72.16.140
   7470 67.72.16.184
 414332 67.72.16.186
 884701 67.72.16.187
 880528 67.72.16.195
    379 67.86.131.180
    476 68.195.209.195
    166 68.195.209.198
     38 76.19.14.47

real	2m0.744s
user	2m11.299s
sys	0m0.758s

Code:
[root@server1 testing]# time awk '{!a[$1]++}END{for(i in a) if ( a[i] >10 ) print a[i],i }' access_log 
880528 67.72.16.195
414332 67.72.16.186
884701 67.72.16.187
215 67.72.16.140
476 68.195.209.195
379 67.86.131.180
179838 67.72.16.134
166 68.195.209.198
38 76.19.14.47
7470 67.72.16.184
598129 10.4.20.236

real	0m2.756s
user	0m2.489s
sys	0m0.277s
[root@server testing]#
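One hedged observation on the numbers above: awk counts in a hash during a single pass, while the cut pipeline has to sort every line before `uniq -c` can count, so `sort` is the likely bottleneck. On many systems `sort` is noticeably faster with byte-wise collation (`LC_ALL=C`), although that was not measured in this thread. The sketch below wraps both approaches, applies the same `> 10` threshold to each, and checks that they agree before any timing is done. The function names and sample data are hypothetical, and `LC_ALL=C` is an assumption that is safe only because the keys are plain ASCII addresses.

```shell
# Hypothetical sample log: 11, 10 and 250 hits for three addresses.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
awk 'BEGIN {
    for (i = 0; i < 11; i++)  print "10.0.0.1 - - x"
    for (i = 0; i < 10; i++)  print "10.0.0.2 - - x"
    for (i = 0; i < 250; i++) print "10.0.0.3 - - x"
}' > "$log"

# cut pipeline; the loop only sees one line per unique IP, so it is cheap.
count_with_cut() {
    cut -d ' ' -f 1 "$1" | LC_ALL=C sort | uniq -c |
    while read -r count ip; do
        # Same threshold as the awk version: strictly more than 10 hits.
        if [ "$count" -gt 10 ]; then
            echo "$count $ip"
        fi
    done
}

# Single-pass hash count, equivalent to Contender #1.
count_with_awk() {
    awk '{ a[$1]++ } END { for (i in a) if (a[i] > 10) print a[i], i }' "$1"
}

# The two outputs differ only in order, so compare them after sorting.
if [ "$(count_with_cut "$log" | LC_ALL=C sort)" = "$(count_with_awk "$log" | LC_ALL=C sort)" ]; then
    echo "results match"
fi
```

In bash, prefixing either function call with `time` on a real log gives a like-for-like comparison.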

Thanks for making this test happen.

Regards,

Jaysunn