using uniq and awk??

Post 302196179 by ripat on Saturday 17th of May 2008 02:27:57 AM

You can simply use it instead of your code. There is no need for wc, uniq or sort any more.

Here is the code, commented with explanations:

Code:

# define the field separator
BEGIN { FS = ":" }

# loop through the file. For each record, add one to
# hits[site.name]; this counts the total number of hits per site
{
        hits[$1]++

        # create an array element with index "site.name:ip_address";
        # merely referencing the element creates it, so each
        # site.name:ip_address pair is stored only once (unique hits)
        ip_hits[$1 ":" substr($2, 1, 15)]
}

# once the whole file is processed, print the array contents by looping
# through the associative indexes with for (i in array)
END {
        # strip the ":ip_address" part from each ip_hits index and count the
        # remaining site name: the number of *different* IPs per site
        for (i in ip_hits){ sub(/:.+/, "", i); unique[i]++}

        # print title
        pr_format="%-20s %5s %s\n"
        printf pr_format, "Page:", "Hits", "Unique Hits"

        # loop through array hits and array unique
        for (i in hits) printf pr_format, i, hits[i], unique[i]
}
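As a possible variant (not the script above), the second loop in END can be avoided by counting a site:ip pair only the first time it shows up. This is a minimal sketch: the `seen` array name and the four sample lines are made up for illustration, and the input is assumed to be `site.name:ip_address` with the address padded to 15 columns, as the use of substr($2, 1, 15) suggests.

```shell
# Hypothetical sample input: "site.name:ip_address rest-of-line",
# with the IP padded to 15 columns. printf reuses the format for
# every group of three arguments, producing four lines.
out=$(printf '%s:%-15s %s\n' \
    hits/cds.hits   10.0.0.1 GET \
    hits/cds.hits   10.0.0.1 POST \
    hits/cds.hits   10.0.0.2 GET \
    hits/books.hits 10.0.0.1 GET |
awk -F: '
{
    hits[$1]++                          # total hits per site
    key = $1 ":" substr($2, 1, 15)      # same index as ip_hits
    if (!seen[key]++) unique[$1]++      # true only on first sighting
}
END {
    for (site in hits)
        printf "%-20s %5s %s\n", site, hits[site], unique[site]
}')
printf '%s\n' "$out"
```

The `!seen[key]++` test is true only when the counter was still zero, so each pair adds to `unique` exactly once. The order of the printed lines is not guaranteed, because for (site in hits) visits indexes in an unspecified order.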

To help you understand, here is a printout of the different arrays used above:

hits (counts the total number of hits per site)

Code:

[hits/cds.hits] => 10
[hits/contact.hits] => 4
[hits/books.hits] => 4

ip_hits (no values here, just indexes)

Code:

[hits/cds.hits:182.210.215.110] => 
[hits/books.hits:143.217.64.204 ] => 
[hits/cds.hits:150.205.160.134] => 
[hits/cds.hits:221.253.17.33  ] => 
[hits/books.hits:81.140.86.170  ] => 
[hits/cds.hits:67.231.144.166 ] => 
[hits/contact.hits:15.111.224.138 ] => 
[hits/contact.hits:234.33.121.120 ] => 
[hits/cds.hits:193.140.224.143] => 
[hits/cds.hits:199.226.220.114] => 
[hits/cds.hits:94.83.157.230  ] => 
[hits/books.hits:62.145.39.14   ] => 
[hits/contact.hits:199.179.222.39 ] => 
[hits/cds.hits:148.214.45.187 ] =>

unique (counts the number of elements in ip_hits for each site)

Code:

[hits/cds.hits] => 8
[hits/contact.hits] => 3
[hits/books.hits] => 3

To use the code:

Code:

$ awk -f script-name your-input-file
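For a quick end-to-end check, the script can be saved to a file and run against a few test lines. In this sketch the file names `hits.awk` and `sample.txt` and the sample data are assumptions, not part of the original thread; the input is assumed to look like `site.name:ip_address ...` with the address padded to 15 columns.

```shell
tmp=$(mktemp -d)

# the script from the post, comments left out
cat > "$tmp/hits.awk" <<'EOF'
BEGIN { FS = ":" }
{
    hits[$1]++
    ip_hits[$1 ":" substr($2, 1, 15)]
}
END {
    for (i in ip_hits) { sub(/:.+/, "", i); unique[i]++ }
    pr_format = "%-20s %5s %s\n"
    printf pr_format, "Page:", "Hits", "Unique Hits"
    for (i in hits) printf pr_format, i, hits[i], unique[i]
}
EOF

# hypothetical input: three hits on cds (two distinct IPs), one on books
printf '%s:%-15s %s\n' \
    hits/cds.hits   10.0.0.1 GET \
    hits/cds.hits   10.0.0.1 POST \
    hits/cds.hits   10.0.0.2 GET \
    hits/books.hits 10.0.0.1 GET > "$tmp/sample.txt"

out=$(awk -f "$tmp/hits.awk" "$tmp/sample.txt")
printf '%s\n' "$out"
rm -rf "$tmp"
```

With this input the cds line should report 3 hits and 2 unique hits, and the books line 1 and 1; the order of the two data lines may differ between awk implementations.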

ripat

