awk to count duplicated lines


 
# 1  
Old 09-24-2010

We have an input file as follows:

Code:
2010-09-15-12.41.15
2010-09-15-12.41.15
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.25
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.41

We also have this loop, which works fine to count and print how often each line recurs:
Code:
for i in `cat infile | uniq`
        do
        num=`cat infile | grep $i | wc -l`
        echo $i $num
        done

However, we would like to use awk to perform the same logic. Please assist if possible. Thank you in advance.
# 2  
Old 09-24-2010
Code:
awk '{ arr[$0]++ } END { for (i in arr) { if (arr[i] > 1) { print arr[i], "    ", i } } }' inputfile | sort -n

This produces a list of lines that occur more than once, with a count of the number of times they occur.
# 3  
Old 09-24-2010
Or
Code:
sort file | uniq -c
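
Note that `uniq -c` prints the count before the line. If the output should match the original loop's "line count" layout, one possible follow-up is to swap the two fields with awk. This is only a sketch: the temporary `infile` and its contents are stand-ins for the real input, and it assumes the lines contain no whitespace (true for the timestamps above).

```shell
# Hypothetical sample file holding a few lines in the same format as the input.
infile=$(mktemp)
printf '%s\n' \
  2010-09-15-12.41.15 2010-09-15-12.41.15 \
  2010-09-15-12.41.24 \
  2010-09-15-12.41.25 2010-09-15-12.41.25 2010-09-15-12.41.25 > "$infile"

# uniq -c emits "count line"; awk swaps the fields to give "line count".
swapped=$(sort "$infile" | uniq -c | awk '{ print $2, $1 }')
printf '%s\n' "$swapped"

rm -f "$infile"
```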

# 4  
Old 09-24-2010
Should be something like:
Code:
awk '{a[$0]++}END{for(i in a){print i, a[i]}}' file
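
One caveat: awk does not guarantee any particular order for `for (i in a)`, so the lines may come out shuffled. A minimal sketch of one way to get stable output, assuming plain lexical order is acceptable, is to pipe the result through `sort`. The temporary `infile` and its contents are made up for illustration.

```shell
# Hypothetical sample file holding a few lines in the same format as the input.
infile=$(mktemp)
printf '%s\n' \
  2010-09-15-12.41.15 2010-09-15-12.41.15 \
  2010-09-15-12.41.24 \
  2010-09-15-12.41.25 2010-09-15-12.41.25 2010-09-15-12.41.25 > "$infile"

# Count every distinct line, then sort so the order is predictable.
counts=$(awk '{ a[$0]++ } END { for (i in a) print i, a[i] }' "$infile" | sort)
printf '%s\n' "$counts"

rm -f "$infile"
```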




Or, if you only need the counts for lines that occur more than once:
Code:
awk '{a[$0]++}END{for(i in a){if(a[i]-1)print i,a[i]}}' file
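
Since the sample input is already sorted, another option is to count consecutive runs without an array, which keeps the input order and uses constant memory. This is a sketch under that assumption only: on unsorted input it would report each run separately, much like `uniq -c` does. The variable names `prev` and `n` and the temporary `infile` are illustrative.

```shell
# Hypothetical sample file holding a few lines in the same format as the input.
infile=$(mktemp)
printf '%s\n' \
  2010-09-15-12.41.15 2010-09-15-12.41.15 \
  2010-09-15-12.41.24 \
  2010-09-15-12.41.25 2010-09-15-12.41.25 2010-09-15-12.41.25 > "$infile"

# When the line changes, print the previous line with its run length,
# then start a new run; END flushes the final run (if any input was read).
runs=$(awk '
  NR > 1 && $0 != prev { print prev, n; n = 0 }
  { prev = $0; n++ }
  END { if (NR) print prev, n }
' "$infile")
printf '%s\n' "$runs"

rm -f "$infile"
```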
