Top Forums Shell Programming and Scripting Counting duplicate entries in a file using awk Post 302462281 by sajal.bhatia on Wednesday 13th of October 2010 07:59:59 PM
Counting duplicate entries in a file using awk

Hi,

I have a very large text file (around 1 million entries) of IPv4 addresses in the standard dotted format, i.e. a.b.c.d

The file looks like this:

10.1.1.1
10.1.1.1
10.1.1.1
10.1.2.4
10.1.2.4
12.1.5.6
.
.
.
.

and so on....

Some IP addresses appear more than once. I want an awk/sed script (since the file is so big) that counts the number of times each IP occurs and writes the result to an output file in the following format:

10.1.1.1 3
10.1.2.4 2
12.1.5.6 1
.
.
.

and so on...

Any help would be highly appreciated.

Thanks!
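
One possible approach, as a minimal sketch rather than a definitive answer: let awk keep an associative array keyed by the whole line and print the totals at the end. The function name `count_ips` and the sample data are made up for illustration; the sketch assumes one address per line and skips blank lines. The output is piped through `sort` because awk does not guarantee any order for `for (ip in count)`.

```shell
#!/bin/sh
# count_ips is a hypothetical helper name, not part of any standard tool.
# It reads the files given as arguments, or standard input if there are none.
count_ips() {
    # NF skips blank lines; count[$0]++ tallies each whole line (one IP per line).
    # The END block prints "IP count" once every line has been read.
    awk 'NF { count[$0]++ } END { for (ip in count) print ip, count[ip] }' "$@" |
        LC_ALL=C sort   # plain byte-wise sort, only to make the order stable
}

# Example run on the sample data from the post
printf '%s\n' 10.1.1.1 10.1.1.1 10.1.1.1 10.1.2.4 10.1.2.4 12.1.5.6 | count_ips
```

To write the result to a file, redirect the output, e.g. `count_ips input.txt > output.txt` (both file names are placeholders). Memory use grows with the number of distinct addresses rather than the number of lines. If the count may come before the address, `sort input.txt | uniq -c` is a common alternative.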
 
