10-13-2010
Remove copies/duplicates of same entry
Hi,
I have a very large text file (around 1 million entries) of IPv4 addresses in the standard dotted format, i.e. a.b.c.d
The file looks like this:
10.1.1.1
10.1.1.1
10.1.1.1
10.1.2.4
10.1.2.4
12.1.5.6
.
.
.
.
and so on....
Some IP addresses appear more than once. Since the file is so big, I want an awk/sed script that removes the repeated copies, i.e. keeps only the distinct entries (IP addresses in this case), and writes them to an output file. The output file would look something like this:
10.1.1.1
10.1.2.4
12.1.5.6
.
.
.
and so on...
Any help would be highly appreciated.
Thanks!
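One common way to do this with awk is sketched below. It assumes each line holds exactly one address with no stray whitespace, so that identical addresses are identical lines. The file names `ips.txt` and `unique_ips.txt` and the temporary directory are placeholders for illustration, not names from the post.

```shell
# Placeholder paths (assumptions); point these at your real files.
workdir=$(mktemp -d)
input="$workdir/ips.txt"
output="$workdir/unique_ips.txt"

# Small sample mirroring the layout shown in the post.
printf '%s\n' 10.1.1.1 10.1.1.1 10.1.1.1 10.1.2.4 10.1.2.4 12.1.5.6 > "$input"

# seen[$0]++ evaluates to 0 the first time a line is met and to a
# positive count afterwards, so !seen[$0]++ is true only for the first
# occurrence. With no action given, awk prints the lines that match.
awk '!seen[$0]++' "$input" > "$output"

cat "$output"
```

This keeps the first occurrence of each address and preserves the original order. awk holds one array entry per distinct address in memory, which should be manageable for about a million lines. If the order of the output does not matter, `sort -u ips.txt > unique_ips.txt` is an alternative that produces the distinct lines in sorted order.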