Post 302462299 by sajal.bhatia on Wednesday 13th of October 2010 10:17:31 PM
Remove copies/duplicates of same entry

Hi,

I have a very large text file (around 1 million entries) of IPv4 addresses in the standard dotted format, i.e. a.b.c.d

The file looks like this:

10.1.1.1
10.1.1.1
10.1.1.1
10.1.2.4
10.1.2.4
12.1.5.6
.
.
.
.

and so on....

Some IP addresses appear more than once. Because the file is so big, I would like an awk/sed script that removes the repeated copies; in other words, I want to keep only the distinct entries (IP addresses in this case) and write them to an output file. The output file would look something like this:

10.1.1.1
10.1.2.4
12.1.5.6
.
.
.

and so on...

Any help would be highly appreciated.

Thanks !
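
For reference, here is a minimal sketch of two common ways to do what is described above. It assumes one address per line with identical spelling (no trailing spaces). The function names `dedupe_ips` and `dedupe_sorted` and the file names in the usage note are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# dedupe_ips (hypothetical name): print each distinct line once,
# in order of first appearance.
dedupe_ips() {
    # seen[$0]++ evaluates to 0 (false) the first time a line is read,
    # so !seen[$0]++ is true only for first occurrences, and awk's
    # default action prints those lines.
    awk '!seen[$0]++' "$@"
}

# dedupe_sorted (hypothetical name): alternative when output order
# does not matter. sort -u keeps one copy of each line; the result
# is in lexical order, not numeric IP order.
dedupe_sorted() {
    sort -u "$@"
}
```

Usage would look like `dedupe_ips ips.txt > unique_ips.txt`. The awk version keeps every distinct line in memory, which should be manageable for about a million short lines. If duplicates are always adjacent, as in the sample shown, plain `uniq` would also be enough.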
 
