Full Discussion: using uniq and awk??
Top Forums Shell Programming and Scripting using uniq and awk?? Post 302196179 by ripat on Saturday 17th of May 2008 02:27:57 AM
You can just use it instead of your code. No need for wc, uniq or sort any more.

Here is the code, commented with explanations:
Code:
# define the field separator
BEGIN{FS=":"}

# loop through the file. For each record, add one to
# hits[site.name]; this counts the total number of hits per site
{
        hits[$1]++

        # create one array with index: "site.name:ip_address"
        # merely referencing the element creates it, so each site.name:ip_address
        # pair yields exactly one index (unique hits)
        ip_hits[$1 ":" substr($2, 1, 15)]
}

# once the whole file is processed, we print the array contents by looping
# through the associative indexes: for (i in array)....
END {
        # count the number of *different* site.name:ip_address indexes per site;
        # split() extracts the site name (the part before the first ":")
        for (i in ip_hits){ split(i, part, ":"); unique[part[1]]++ }

        # print title
        pr_format="%-20s %5s %s\n"
        printf pr_format, "Page:", "Hits", "Unique Hits"

        # loop through array hits and array unique
        for (i in hits) printf pr_format, i, hits[i], unique[i]
}
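
If you want to try the counting logic without a real log file, here is a minimal sketch. The input format is an assumption on my part (site.name, a colon, the IP address padded to 15 characters, then the rest of the line), and the `line` helper and the sample records are made up for illustration. It uses split() to recover the site name and sorts the result, because awk does not guarantee any order for `for (i in array)`.

```shell
#!/bin/sh
# Hypothetical helper: builds one record as "site.name:ip(padded to 15) rest".
# The padded layout is an assumption; substr($2, 1, 15) relies on it.
line() { printf '%s:%-15s %s\n' "$1" "$2" "$3"; }

result=$(
    {
        line hits/cds.hits   182.210.215.110 'GET /cds'
        line hits/cds.hits   182.210.215.110 'GET /cds/2'
        line hits/cds.hits   150.205.160.134 'GET /cds'
        line hits/books.hits 62.145.39.14    'GET /books'
        line hits/books.hits 62.145.39.14    'GET /books/2'
    } | awk '
    BEGIN { FS = ":" }
    {
        hits[$1]++                           # total hits per site
        ip_hits[$1 ":" substr($2, 1, 15)]    # referencing creates the index once
    }
    END {
        for (key in ip_hits) {
            split(key, part, ":")            # part[1] is the site name
            unique[part[1]]++
        }
        for (site in hits)
            printf "%s %d %d\n", site, hits[site], unique[site]
    }' | sort
)

printf '%s\n' "$result"
```

Each output line is the site name, its total hits and its unique hits.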

To help you understand, here is a printout of the different arrays used above:
  • hits (counts total number of hits per site)
    Code:
    [hits/cds.hits] => 10
    [hits/contact.hits] => 4
    [hits/books.hits] => 4

  • ip_hits (no value here. Just indexes)
    Code:
    [hits/cds.hits:182.210.215.110] => 
    [hits/books.hits:143.217.64.204 ] => 
    [hits/cds.hits:150.205.160.134] => 
    [hits/cds.hits:221.253.17.33  ] => 
    [hits/books.hits:81.140.86.170  ] => 
    [hits/cds.hits:67.231.144.166 ] => 
    [hits/contact.hits:15.111.224.138 ] => 
    [hits/contact.hits:234.33.121.120 ] => 
    [hits/cds.hits:193.140.224.143] => 
    [hits/cds.hits:199.226.220.114] => 
    [hits/cds.hits:94.83.157.230  ] => 
    [hits/books.hits:62.145.39.14   ] => 
    [hits/contact.hits:199.179.222.39 ] => 
    [hits/cds.hits:148.214.45.187 ] =>

  • unique (counts the number of elements in ip_hits for each site, i.e. the unique IP addresses)
    Code:
    [hits/cds.hits] => 8
    [hits/contact.hits] => 3
    [hits/books.hits] => 3
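
A printout like the ones above can be produced with a small loop. This is only a sketch: the array name `seen` and its indexes are made up. It also shows why ip_hits has indexes but no values, since a bare reference to an array element is enough to create it, and a second reference to the same index adds nothing.

```shell
#!/bin/sh
dump=$(awk '
BEGIN {
    seen["a:1.2.3.4"]      # a bare reference creates the element (empty value)
    seen["a:1.2.3.4"]      # same index again: nothing new is added
    seen["b:5.6.7.8"]

    n = 0
    for (key in seen) {    # the order of this loop is not specified
        n++
        printf "[%s] => %s\n", key, seen[key]
    }
    printf "elements: %d\n", n
}')

printf '%s\n' "$dump"
```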


To use the code:
Code:
$ awk -f script-name your-input-file
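
Since awk does not guarantee the order of `for (i in array)`, the rows may come out in any order. If you want them sorted, something along these lines should work. It is a sketch only: the file names `hits.awk` and `hits.log`, the temporary directory and the sample input are assumptions, and the script is a shortened version of the one above.

```shell
#!/bin/sh
# Hypothetical file names; a temporary directory keeps the sketch self-contained.
tmp=$(mktemp -d)
script="$tmp/hits.awk"
input="$tmp/hits.log"

# A shortened version of the script, stored in its own file.
cat > "$script" <<'EOF'
BEGIN { FS = ":" }
{ hits[$1]++; ip_hits[$1 ":" substr($2, 1, 15)] }
END {
    for (key in ip_hits) { split(key, part, ":"); unique[part[1]]++ }
    printf "%-20s %5s %s\n", "Page:", "Hits", "Unique Hits"
    for (site in hits) printf "%-20s %5s %s\n", site, hits[site], unique[site]
}
EOF

# Assumed input format: site.name, a colon, then the IP address.
cat > "$input" <<'EOF'
hits/contact.hits:15.111.224.138
hits/cds.hits:182.210.215.110
hits/cds.hits:182.210.215.110
hits/cds.hits:150.205.160.134
EOF

# Keep the header line first, then sort the data rows by hits, highest first.
report=$(awk -f "$script" "$input" | {
    IFS= read -r header
    printf '%s\n' "$header"
    sort -k2,2nr
})

printf '%s\n' "$report"
rm -rf "$tmp"
```

The `read` takes the title line off the pipe before sort sees the remaining rows, so the title stays on top.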

 
