Shell Programming and Scripting: Post 302462281 by sajal.bhatia on Wednesday, 13 October 2010, 07:59:59 PM
Counting duplicate entries in a file using awk

Hi,

I have a very large text file (around 1 million entries) of IPv4 addresses in the standard dotted format, i.e. a.b.c.d, one address per line.

The file looks like this:

10.1.1.1
10.1.1.1
10.1.1.1
10.1.2.4
10.1.2.4
12.1.5.6
.
.
.
.

and so on....

Some IP addresses have duplicate entries. Since the file is so big, I want an awk/sed script that counts the number of times each IP address is repeated and writes the result to an output file in the following format:

10.1.1.1 3
10.1.2.4 2
12.1.5.6 1
.
.
.

and so on...

Any help would be highly appreciated.

Thanks!
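
One possible approach, as a minimal sketch rather than a definitive answer: let awk count each distinct line in an associative array and print the totals at the end. The file names ips.txt and counts.txt are assumptions made for illustration, and the sample data is just the six addresses shown in the post. awk does not guarantee any order when looping over an array, so the output is piped through sort to get the addresses in numeric order.

```shell
# Sample input in the format shown in the post (ips.txt is a hypothetical name).
printf '%s\n' 10.1.1.1 10.1.1.1 10.1.1.1 10.1.2.4 10.1.2.4 12.1.5.6 > ips.txt

# count[$0]++ adds one to the counter for the current line (the IP address).
# The END block runs after the last line and prints "address count" pairs.
# sort orders the result numerically on each of the four dotted fields.
awk '{ count[$0]++ } END { for (ip in count) print ip, count[ip] }' ips.txt |
    sort -t . -k1,1n -k2,2n -k3,3n -k4,4n > counts.txt

cat counts.txt
```

awk keeps one array entry per distinct address, so memory use grows with the number of different IPs, not with the total number of lines. If duplicate entries are always adjacent, as in the sample, `uniq -c ips.txt` also counts them, but it prints the count before the address.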
 
