Counting entries in a file


 
# 8  
Old 08-08-2011
Small tweaks to the original awk to show the number of new IPs in the current bin compared to the previous one.

Code:
#!/usr/bin/env ksh

awk -v bin_size="${1:-5}" '
    function dump( )
    {
        if( NR == 1 )
            return;

        new_count = 0;
        for( u in unique )              # compute total in this bin that were not in last bin
            if( last_bin[u] == 0 )
                new_count++;

        printf( "%3d %3d %3d\n", bin+1, total, new_count );
        bin++;
    }

    {
        if( $1+0 >= next_bin )
        {
            dump( );
            next_bin = $1 + bin_size;

            delete last_bin;
            for( u in unique )              # copy hits from this bin
                last_bin[u] = 1;
            delete unique;
            total = 0;
        }

        unique[$2]++
        total++;
    }
    END {
        if( total )
            dump( );
    }
'

exit

Have fun!
# 9  
Old 08-08-2011
It works :)

Can you give a brief explanation? I am new to awk.
# 10  
Old 08-09-2011
Glad it worked for you. I've added some comments. I'll watch the thread if you have specific questions.

Code:
awk -v bin_size="${1:-5}" '
    # use a function so we can call as we process input and at the end without duplicating the code
    function dump( )                # dump out the information that we collected about the last bin
    {
        if( NR == 1 )               # we will call this for the first record; 
            return;                 # if this is the first record (NR equals 1) then we skip the print

        new_count = 0;
        for( u in unique )              # look at each unique IP we saved
            if( last_bin[u] == 0 )      # if it was not seen in the previous bin, count it
                new_count++;

        bin++;
        printf( "%3d %3d %3d\n", bin, total, new_count );       # print all of the counts
    }

    {                                   # process each record in the file (implied true condition)
        if( $1+0 >= next_bin )          # if timestamp (col 1) is in the next bin
        {
            dump( );                    # print data from the previous bin
            next_bin = $1 + bin_size;   # mark the start of the next bin

            delete last_bin;            # must delete contents of last bin
            for( u in unique )          # copy hits from this bin 
                last_bin[u] = 1;        # for comparison when we see the start of next bin
            delete unique;              # must delete the list of unique IPs from the current bin before we start
            total = 0;                  # zero number of hits in the bin
        }

        unique[$2]++                    # count the number of times this IP address was seen in the bin
        total++;                        # total number of entries in the bin
    }

    END {               # at the end of the file, one last print if we saw something in the previous bin
        if( total )
            dump( );
    }
'
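
To see the script in action, here is a minimal sketch that wraps the same logic in a shell function and feeds it a few records. The function name bin_report and the sample data are made up for illustration; the input is assumed to be sorted by time, with the timestamp in column 1 and the IP address in column 2.

```shell
#!/bin/sh
# Sketch: the binning logic from the post above, wrapped in a function.
# bin_report and the sample records are hypothetical.

bin_report() {
    awk -v bin_size="${1:-5}" '
        function dump( )
        {
            if( NR == 1 )                   # nothing collected before the first record
                return;
            new_count = 0;
            for( u in unique )              # IPs in this bin that were not in the last bin
                if( last_bin[u] == 0 )
                    new_count++;
            bin++;
            printf( "%3d %3d %3d\n", bin, total, new_count );
        }
        {
            if( $1+0 >= next_bin )          # timestamp starts a new bin
            {
                dump( );
                next_bin = $1 + bin_size;
                delete last_bin;
                for( u in unique )
                    last_bin[u] = 1;
                delete unique;
                total = 0;
            }
            unique[$2]++;
            total++;
        }
        END {
            if( total )
                dump( );
        }
    '
}

# Six sample records, bin size of 5 seconds.
report=$(bin_report 5 <<'EOF'
1 10.0.0.1
2 10.0.0.2
3 10.0.0.1
6 10.0.0.2
7 10.0.0.3
12 10.0.0.1
EOF
)
printf '%s\n' "$report"
```

With this input the three bins cover timestamps 1-3, 6-7 and 12. The columns are bin number, number of entries, and number of IPs not seen in the previous bin, so 10.0.0.1 counts as new again in the third bin because it was absent from the second.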

# 11  
Old 09-30-2011
Can you modify it to compute the new IPs in the current bin compared to the ENTIRE HISTORY up to that interval, instead of just the previous interval?

Cheers,
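
One possible way to do this, sketched under the same input assumptions (sorted by time, timestamp in column 1, IP in column 2): keep a single array of every IP seen so far and never clear it. The names new_vs_history and seen are hypothetical, and this is not agama's code, only a variation on it.

```shell
#!/bin/sh
# Sketch: count IPs that are new relative to ALL earlier input,
# not just the previous bin. new_vs_history and seen are made-up names.

new_vs_history() {
    awk -v bin_size="${1:-5}" '
        function dump( )
        {
            if( total == 0 )            # nothing collected yet
                return;
            bin++;
            printf( "%3d %3d %3d\n", bin, total, new_count );
        }
        {
            if( $1+0 >= next_bin )      # timestamp starts a new bin
            {
                dump( );
                next_bin = $1 + bin_size;
                total = 0;
                new_count = 0;
            }
            if( !($2 in seen) )         # first time this IP appears anywhere in the input
            {
                seen[$2] = 1;           # seen is never cleared, so it holds the whole history
                new_count++;
            }
            total++;
        }
        END { dump( ); }
    '
}

history_report=$(new_vs_history 5 <<'EOF'
1 10.0.0.1
2 10.0.0.2
3 10.0.0.1
6 10.0.0.2
7 10.0.0.3
12 10.0.0.1
EOF
)
printf '%s\n' "$history_report"
```

On this sample the last bin reports 0 new IPs, because 10.0.0.1 already appeared in the first bin. Note that seen grows with the number of distinct IPs in the file, which may matter for very large inputs.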
# 12  
Old 10-05-2011
I have added another column to the input file and would like to sum the entries of that column over the user-specified time interval. For example, if the user specifies 5 seconds, the script should add up all the third-column entries in each 5-second interval and print the sum alongside the information the above script already prints, i.e. timestamp, number of packets, number of unique IPs in that interval, and number of new IPs in that interval compared to the previous interval.

Looking for a solution.

Cheers,
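
A rough sketch of how the per-bin sum could be added to agama's script: accumulate column 3 alongside the entry count and reset it at each bin boundary. The names bin_report_sum and col3_sum are hypothetical, the third column is assumed to hold integers, and the sum is simply appended as a fourth output column.

```shell
#!/bin/sh
# Sketch: bin number, entries, IPs new since the previous bin, and the
# sum of column 3 per bin. bin_report_sum and col3_sum are made-up names.

bin_report_sum() {
    awk -v bin_size="${1:-5}" '
        function dump( )
        {
            if( total == 0 )                # nothing collected yet
                return;
            new_count = 0;
            for( u in unique )              # IPs in this bin that were not in the last bin
                if( !(u in last_bin) )
                    new_count++;
            bin++;
            printf( "%3d %3d %3d %6d\n", bin, total, new_count, col3_sum );
        }
        {
            if( $1+0 >= next_bin )          # timestamp starts a new bin
            {
                dump( );
                next_bin = $1 + bin_size;
                delete last_bin;
                for( u in unique )
                    last_bin[u] = 1;
                delete unique;
                total = 0;
                col3_sum = 0;               # restart the sum for the new bin
            }
            unique[$2]++;
            total++;
            col3_sum += $3;                 # add this record to the running sum
        }
        END { dump( ); }
    '
}

sum_report=$(bin_report_sum 5 <<'EOF'
1 10.0.0.1 100
2 10.0.0.2 200
3 10.0.0.1 50
6 10.0.0.2 40
7 10.0.0.3 60
12 10.0.0.1 500
EOF
)
printf '%s\n' "$sum_report"
```

For the sample above the fourth column comes out as 350, 100 and 500 for the three bins.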
# 13  
Old 10-06-2011
This assumes the source file holds only raw data, with no header line. Otherwise, you need to adjust the red part.

Code:
$ interval=2
$ awk -v s="$interval" 'NR==1{min=$1}
                    {NoP[$1]++;UnIP[$1 FS $2]++;IP[$2];min=min>$1?$1:min;max=max>$1?max:$1}
                   END{for (i=min;i<=max;i=i+s)  
                         { b=i
                           {while (b<i+s) 
                                 {t+=NoP[b]
                                  for (j in IP) if (UnIP[b FS j]) u++
                                  b++
                                 }
                            }
                            print ++e,t,u
                           t=0;u=0}
                       }' infile

1 5 4
2 7 5

# 14  
Old 10-06-2011
The third column it prints is incorrect.

I am looking for a solution that integrates agama's script above with your previous solution:
https://www.unix.com/shell-programmin...alues-awk.html
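
One possible cause of the wrong third column, as far as can be told from the code in post #13: it keys on (timestamp, IP) pairs, so an IP seen in two different seconds of the same interval is counted once per second rather than once per interval. The sketch below keys on (bin, IP) instead. The names uniq_per_bin and in_bin are hypothetical, and the input is assumed to be sorted by time with integer timestamps.

```shell
#!/bin/sh
# Sketch: count each IP at most once per bin by keying on (bin, IP).
# uniq_per_bin and in_bin are made-up names; input is assumed sorted by time.

uniq_per_bin() {
    awk -v s="${1:-2}" '
        NR == 1 { start = $1 }                  # first timestamp anchors bin 1
        {
            b = int( ($1 - start) / s ) + 1;    # bin number of this record
            packets[b]++;
            if( !((b, $2) in in_bin) )          # first time this IP shows up in bin b
            {
                in_bin[b, $2] = 1;
                uniq[b]++;
            }
            if( b > last )
                last = b;
        }
        END {
            for( b = 1; b <= last; b++ )        # empty bins print as zeros
                print b, packets[b] + 0, uniq[b] + 0;
        }
    '
}

uniq_report=$(uniq_per_bin 2 <<'EOF'
1 10.0.0.1
2 10.0.0.1
2 10.0.0.2
3 10.0.0.3
4 10.0.0.3
EOF
)
printf '%s\n' "$uniq_report"
```

Here 10.0.0.1 appears at timestamps 1 and 2, both in the first 2-second bin, and is counted once. Unlike agama's script, this version uses fixed-width bins measured from the first timestamp, so gaps in the data show up as empty bins.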