10-13-2010
Counting duplicate entries in a file using awk
Hi,
I have a very large text file (around 1 million entries) of IPv4 addresses in the standard dotted format, i.e. a.b.c.d
The file looks like this:
10.1.1.1
10.1.1.1
10.1.1.1
10.1.2.4
10.1.2.4
12.1.5.6
.
.
.
.
and so on....
Some IP addresses appear more than once. Since the file is so big, I want an awk/sed script that counts the number of times each IP is repeated and writes the result to an output file in the following format:
10.1.1.1 3
10.1.2.4 2
12.1.5.6 1
.
.
.
and so on...
Any help would be highly appreciated.
Thanks!
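One possible approach, as a minimal sketch rather than a definitive answer: let awk keep a counter per address in an associative array and print the totals at the end. The function name count_ips and the file names ips.txt and ip_counts.txt are made up for illustration; the sketch also assumes one address per line, ignoring blank lines and anything after the first field. awk does not guarantee the order of a for-in loop, so the output is piped through sort to order it by octet.

```shell
#!/bin/sh
# count_ips FILE: print "address count" for every distinct address in FILE.
# (count_ips is a hypothetical helper name, not a standard tool.)
count_ips() {
    # NF skips blank lines; $1 is used as the key so trailing spaces are ignored.
    awk 'NF { count[$1]++ } END { for (ip in count) print ip, count[ip] }' "$1" |
        LC_ALL=C sort -t. -k1,1n -k2,2n -k3,3n -k4,4n   # numeric sort on each octet
}

# Small sample built from the addresses in the post (ips.txt is an assumed name).
printf '%s\n' 10.1.1.1 10.1.1.1 10.1.1.1 10.1.2.4 10.1.2.4 12.1.5.6 > ips.txt

# Write the counts to the output file (ip_counts.txt is an assumed name).
count_ips ips.txt > ip_counts.txt
cat ip_counts.txt
```

awk reads the file once and only keeps one counter per distinct address in memory, so a file with around a million lines should not be a problem. If the order of the output does not matter, the sort stage can simply be dropped.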