Awk counting lines with field match


 
# 1  
Old 05-08-2009

Hi,

I'm trying to create a script that reads through every line in a file and counts how many lines have a certain field that matches an input. Using another awk, it also has to do the same as above, but then use sort and uniq to get rid of all the duplicate lines based on another field.

Here's an example line that I am going through:

345.345.345.345 Mon June 03 02:33:44 GMT 2007

and here are my awk commands, as I haven't managed to compress them into one line yet:
awk '$2 == "'"$DDD"'" { print $5 }' ~/folder/home.hhh | awk -F: '$1 >= VAR1 && $1 <= VAR2 { print $0 }' VAR1=$BEGIN VAR2=$END | wc -l

awk '$2 == "'"$DDD"'" { print $5 }' ~/folder/home.hhh | awk -F: '$1 >= VAR1 && $1 <= VAR2 { print $0 }' VAR1=$BEGIN VAR2=$END | sort | uniq | wc -l;;

DDD = a user-entered day, e.g. Mon
BEGIN = the earliest hour in the range, e.g. 01
END = the last hour in the range, e.g. 03

The first command works brilliantly, but the second doesn't. I'm trying to use uniq on the 345.345.345.345 field ($1 on the original lines), but the first awk prints only $5, so $1 is never passed forward to the next awk. Is it possible to still check that $2 is equal to the entered day (DDD) and that the hour falls within the time range, and then use uniq to delete any lines where $1 of the original line is the same?

Sorry if this is confusing and please ask questions if needed.

Thanks

Last edited by fredted40x; 05-08-2009 at 12:57 PM..
# 2  
Old 05-08-2009
Code:
#count
awk -v val="$VALUE" '$1 == val { cnt++ } END { print cnt + 0 }' somefile

I do not get your second requirement at all - do you want a unique listing or do you want to change everything so that there is no unique value?

This keeps only the first line seen for each value of the first field:
Code:
awk '!arr[$1]++' somefile > uniq_file
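
For anyone more at home in Python, here is a rough sketch of what that one-liner does: a line is kept only the first time its first field is seen. The function name first_by_field and the sample list are made up for illustration.

```python
def first_by_field(lines, field=0):
    """Keep only the first line seen for each value of the chosen field."""
    seen = set()
    kept = []
    for line in lines:
        parts = line.split()
        # A line with too few fields gets an empty key, as $1 would be in awk
        key = parts[field] if len(parts) > field else ""
        if key not in seen:   # plays the role of !arr[$1]++ being true
            seen.add(key)
            kept.append(line)
    return kept


# Hypothetical sample data, not taken from the thread
sample = [
    "1.1.1.1 Mon",
    "2.2.2.2 Tue",
    "1.1.1.1 Wed",
]
print(first_by_field(sample))
```

The third line is dropped because 1.1.1.1 has already been seen, even though the rest of the line differs.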

# 3  
Old 05-08-2009
Thanks for getting back to me.

I have had to use both bits of code at the moment. The first one counts every line with the correct day and within the time range, and the second should do the same but only count unique IP addresses ($1 of the original line).

The code for the second bit is wrong.

Basically, I want to count how many lines have the day stored in the variable DDD and also have an hour between the numbers BEGIN and END. But I only want to count each IP address (345.345.345.345) once, so if there are three lines that have the right day and hour but all have the same IP address, they would count as 1. The final count then gets printed on the screen in the terminal.

Hope this is a bit better.
# 4  
Old 05-08-2009
why don't you show an example of an input file and show your desired output as well?
# 5  
Old 05-09-2009
Good idea!!

Input file
28.60.227.169 Fri May 03 14:15:44 GMT 2007
28.60.227.169 Fri May 03 15:15:47 GMT 2007
234.234.234.3 Fri May 25 18:11:32 GMT 2007
44.184.167.119 Mon May 07 09:36:18 GMT 2007
44.184.167.119 Mon May 07 09:36:18 GMT 2007
177.136.78.125 Thu May 24 07:10:12 GMT 2007
102.27.136.47 Fri May 25 01:11:32 GMT 2007

Program input/output
Please enter a day e.g. Mon
Fri
Would you like to add a time range?
Y
Please enter starting hour
01
Please enter last hour
16
3 Lines match this
2 Lines match this when removing the lines containing identical ip addresses.
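
As a cross-check of the numbers above, here is a minimal Python sketch that computes both counts in one pass over the sample input. The function name count_matches and its parameters are only illustrative, and it assumes the hour is the part of the fifth field before the first colon.

```python
SAMPLE = """\
28.60.227.169 Fri May 03 14:15:44 GMT 2007
28.60.227.169 Fri May 03 15:15:47 GMT 2007
234.234.234.3 Fri May 25 18:11:32 GMT 2007
44.184.167.119 Mon May 07 09:36:18 GMT 2007
44.184.167.119 Mon May 07 09:36:18 GMT 2007
177.136.78.125 Thu May 24 07:10:12 GMT 2007
102.27.136.47 Fri May 25 01:11:32 GMT 2007
"""


def count_matches(lines, day, first_hour=None, last_hour=None):
    """Return (matching lines, distinct IP addresses among those lines)."""
    total = 0
    ips = set()
    use_range = first_hour is not None and last_hour is not None
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[1] != day:
            continue  # wrong day, or not a full log line
        hour = int(fields[4].split(":")[0])
        if use_range and not (first_hour <= hour <= last_hour):
            continue  # outside the requested hours
        total += 1
        ips.add(fields[0])
    return total, len(ips)


total, unique = count_matches(SAMPLE.splitlines(), "Fri", 1, 16)
print("%d Lines match this" % total)
print("%d Lines match this when removing the lines containing identical ip addresses." % unique)
```

For Fri between 01 and 16 this gives 3 and 2, the same as the desired output. Keeping the IP addresses in a set means no separate sort and uniq step is needed.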
# 6  
Old 05-09-2009
if you have Python, here's an alternative solution
Code:
import sys

while True:
    day = input("Please enter a day(eg Mon): ").lower()
    choice = input("Would you like to add a range(Y|N): ")
    if choice in ['Y', 'y']:
        starthr = int(input("Please enter starting hour eg 01,10: "))
        endhr = int(input("Please enter ending hr eg 02,23: "))
        d = {}       # ip address -> number of matching lines
        total = 0    # every matching line, duplicates included
        with open("file") as f:
            for line in f:
                s = line.split()
                if len(s) < 5:
                    continue    # skip blank or short lines
                if day == s[1].lower():
                    hr = int(s[4].split(":")[0])
                    if starthr <= hr <= endhr:
                        d[s[0]] = d.get(s[0], 0) + 1
                        total += 1
        for ip, count in d.items():
            print(ip, count)
        print("%d lines match this" % total)
        print("%d lines match this when counting each ip address once" % len(d))
    elif choice in ['N', 'n']:
        print("you are not adding a range...so what now?")
    else:
        print("Enter correct choice..exiting")
        sys.exit()

output:
Code:
# ./test.py
Please enter a day(eg Mon): fri
Would you like to add a range(Y|N): y
Please enter starting hour eg 01,10: 01
Please enter ending hr eg 02,23: 23
28.60.227.169 2
234.234.234.3 1
102.27.136.47 1
4 lines match this
3 lines match this when counting each ip address once
Please enter a day(eg Mon): fri
Would you like to add a range(Y|N): y
Please enter starting hour eg 01,10: 02
Please enter ending hr eg 02,23: 23
28.60.227.169 2
234.234.234.3 1
3 lines match this
2 lines match this when counting each ip address once
Please enter a day(eg Mon):

# 7  
Old 05-09-2009
Try this:

awkfile.awk

Code:
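BEGIN {
    # Prompt on the terminal; in gawk, getline < "-" reads the answer from standard input
    printf "Please enter a day e.g. Mon: "; getline < "-"; s = $0
    printf "Would you like to enter a time range? "; getline < "-"
    if ($0 ~ /(y|Y)/) {
        printf "Enter the starting hour: "; getline < "-"; st = $0
        printf "Enter the last hour: "; getline < "-"; lt = $0
    }
}
# No time range given: keep one line per IP address for the chosen day
length(st) == 0 && $2 == s { a[$1] = $0; next }
# Time range given: adding 0 forces a numeric comparison of the hours
length(st) > 0 && $2 == s && substr($5, 1, 2) + 0 >= st + 0 && substr($5, 1, 2) + 0 <= lt + 0 { a[$1] = $0; next }
END {
    for (i in a) print a[i]
}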

run as:
Code:
awk -f awkfile.awk filename
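
One thing to watch with hour ranges typed in by a user: an hour compared as text behaves differently from an hour compared as a number once the zero padding is missing. The Python sketch below only illustrates that pitfall; the two function names are made up, and it assumes the hour is the first two characters of the time field.

```python
def hour_in_range_text(time_field, first, last):
    """Compare the two-character hour as text, character by character."""
    hour = time_field[:2]
    return first <= hour <= last


def hour_in_range_number(time_field, first, last):
    """Convert everything to integers before comparing."""
    hour = int(time_field[:2])
    return int(first) <= hour <= int(last)


# The user types "1" instead of "01" for the starting hour
print(hour_in_range_text("09:36:18", "1", "16"))    # False: "1" sorts after "09"
print(hour_in_range_number("09:36:18", "1", "16"))  # True: 1 <= 9 <= 16
```

With zero-padded input such as 01 both versions agree, so the difference only shows up when someone enters a single digit.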


cheers,
Devaraj Takhellambam
 