The UNIX and Linux Forums  

  #4 (permalink)  
Old 05-16-2008
ripat
Registered User
 

Join Date: Oct 2006
Location: Belgium
Posts: 171
You can just use it instead of your code. There is no need for wc, uniq or sort any more.

Here is the code, commented with explanations:
Code:
# define the field separator
BEGIN{FS=":"}

# loop through the file. For each record, add one to
# hits[site.name]; this counts the total number of hits per site
{
        hits[$1]++

        # create an array indexed by "site.name:ip_address";
        # each site.name:ip_address pair can exist only once as an index (unique hits)
        ip_hits[$1 ":" substr($2, 1, 15)]
}

# once the whole file is processed, print the array contents by looping
# through the associative indexes: for (i in array)...
END {
        # count the number of *different* ip_hits indexes per site:
        # strip the ":ip_address" part of each index and tally what is left
        for (i in ip_hits){ sub(/:.+/, "", i); unique[i]++}

        # print title
        pr_format="%-20s %5s %s\n"
        printf pr_format, "Page:", "Hits", "Unique Hits"

        # loop through array hits and array unique
        for (i in hits) printf pr_format, i, hits[i], unique[i]
}
To help you understand, here is a printout of the arrays used above:
  • hits (counts total number of hits per site)
    Code:
    [hits/cds.hits] => 10
    [hits/contact.hits] => 4
    [hits/books.hits] => 4
  • ip_hits (no value here. Just indexes)
    Code:
    [hits/cds.hits:182.210.215.110] => 
    [hits/books.hits:143.217.64.204 ] => 
    [hits/cds.hits:150.205.160.134] => 
    [hits/cds.hits:221.253.17.33  ] => 
    [hits/books.hits:81.140.86.170  ] => 
    [hits/cds.hits:67.231.144.166 ] => 
    [hits/contact.hits:15.111.224.138 ] => 
    [hits/contact.hits:234.33.121.120 ] => 
    [hits/cds.hits:193.140.224.143] => 
    [hits/cds.hits:199.226.220.114] => 
    [hits/cds.hits:94.83.157.230  ] => 
    [hits/books.hits:62.145.39.14   ] => 
    [hits/contact.hits:199.179.222.39 ] => 
    [hits/cds.hits:148.214.45.187 ] =>
  • unique (counts the number of ip_hits indexes per site, i.e. the distinct IP addresses)
    Code:
    [hits/cds.hits] => 8
    [hits/contact.hits] => 3
    [hits/books.hits] => 3
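
If you want to look inside the arrays yourself, a dump like the one above can be produced from the END block. This is only a sketch and not necessarily how the printout above was made: the helper name dump() and the sample records are made up, and the input format (site name, a colon, then the IP address padded to 15 characters) is an assumption based on the printout.

```shell
# Hypothetical sample data: "site:ip" records, IP padded to 15 characters
out=$(printf '%s:%-15s GET /\n' \
        hits/cds.hits   10.0.0.1 \
        hits/cds.hits   10.0.0.1 \
        hits/cds.hits   10.0.0.2 \
        hits/books.hits 10.0.0.3 |
awk '
# dump() is a made-up helper: print every index/value pair of an array
function dump(name, arr,    k) {
    for (k in arr) printf "%s [%s] => %s\n", name, k, arr[k]
}
BEGIN { FS = ":" }
{
    hits[$1]++
    ip_hits[$1 ":" substr($2, 1, 15)]
}
END {
    for (i in ip_hits) { sub(/:.+/, "", i); unique[i]++ }
    dump("hits", hits)
    dump("ip_hits", ip_hits)
    dump("unique", unique)
}' | sort)
echo "$out"
```

The output goes through sort because awk does not guarantee any order for (k in arr). With this sample data, hits reports 3 for hits/cds.hits while unique reports 2, since 10.0.0.1 appears twice.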

To use the code:
Code:
$ awk -f script-name your-input-file
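
For a quick end-to-end test, something like the following should work. It is a sketch under assumptions: the file names hits.awk and access.txt are made up, the sample records are invented, and the input format (site name, colon, IP padded to 15 characters) is guessed from the array printout.

```shell
# Work in a scratch directory (hits.awk and access.txt are made-up names)
dir=$(mktemp -d)

# Save the script from this post, comments stripped
cat > "$dir/hits.awk" <<'EOF'
BEGIN { FS = ":" }
{
    hits[$1]++
    ip_hits[$1 ":" substr($2, 1, 15)]
}
END {
    for (i in ip_hits) { sub(/:.+/, "", i); unique[i]++ }
    pr_format = "%-20s %5s %s\n"
    printf pr_format, "Page:", "Hits", "Unique Hits"
    for (i in hits) printf pr_format, i, hits[i], unique[i]
}
EOF

# Hypothetical input: three hits on cds (two from the same IP), one on books
printf '%s:%-15s GET /\n' \
    hits/cds.hits   10.0.0.1 \
    hits/cds.hits   10.0.0.1 \
    hits/cds.hits   10.0.0.2 \
    hits/books.hits 10.0.0.3 > "$dir/access.txt"

# Run it exactly as described above
out=$(awk -f "$dir/hits.awk" "$dir/access.txt")
echo "$out"

rm -r "$dir"
```

You should get the title line followed by one line per site, with 3 hits and 2 unique hits for hits/cds.hits and 1 and 1 for hits/books.hits. The order of the site lines may differ between awk versions.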