How to find duplicate entries


 
# 1  
Old 04-25-2013

I have a file with contents as below.

Input:
Code:
123456
123456
234567
987654
678905
678905

I have thousands of entries like the above.
I need the output below.
Output:
Code:
123456
678905

I'm using uniq -d filename. It shows results, but it misses a few duplicate entries and I don't know why. Please help me.

Last edited by Franklin52; 04-26-2013 at 03:27 AM.. Reason: Please use code tags
# 2  
Old 04-25-2013
Code:
awk 'A[$0]++==1' file

# 3  
Old 04-25-2013
Quote:
Originally Posted by Yoda
Code:
awk 'A[$0]++==1' file

Better to use $1 instead of $0 to avoid skipping some duplicate numbers due to leading/trailing whitespace.
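
A minimal sketch of the difference, assuming one number per line with stray whitespace on some lines. The sample data and the helper name sample are made up for illustration:

```shell
# Hypothetical sample: the second "123456" has a trailing space,
# and the second "678905" has a leading space.
sample() {
    printf '%s\n' '123456' '123456 ' '234567' '678905' ' 678905'
}

# Keyed on the whole line ($0): the whitespace makes the lines differ,
# so no duplicate is reported.
by_line=$(sample | awk 'A[$0]++ == 1')

# Keyed on the first field ($1): awk's default field splitting ignores
# leading and trailing blanks, so both duplicates are found.
by_field=$(sample | awk 'A[$1]++ == 1 { print $1 }')

printf 'whole line: [%s]\n' "$by_line"
printf 'first field:\n%s\n' "$by_field"
```

Printing $1 rather than the whole line also strips the stray whitespace from the output.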
# 4  
Old 04-25-2013
Hi Buzzme,
Code:
awk '!d[$0]++' file

Correction: this will print only the unique entries.
# 5  
Old 04-25-2013
Quote:
Originally Posted by rveri
Hi Buzzme,
Code:
awk '!d[$0]++' file

Enjoy, Have fun.
That will not work: it prints the first occurrence of every line, so it removes duplicates instead of listing them.
Revisit the OP's requirement.
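
To compare the two awk one-liners on the OP's sample input, here is a small sketch. The helper name input is only for illustration:

```shell
# The OP's sample data.
input() { printf '%s\n' 123456 123456 234567 987654 678905 678905; }

# '!d[$0]++' is true the first time a line is seen, so it de-duplicates.
first_seen=$(input | awk '!d[$0]++')

# 'A[$0]++ == 1' is true only on the second sighting, so it lists each
# duplicated value exactly once.
dups=$(input | awk 'A[$0]++ == 1')

printf 'first occurrences:\n%s\n' "$first_seen"
printf 'duplicates:\n%s\n' "$dups"
```

The second form gives the two values the OP asked for.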
# 6  
Old 04-25-2013
elixir_sinari,
you are correct, that will not work! I did not understand the problem at first. Thanks for correcting me.

Buzzme,
> I'm using uniq -d filename. It shows results, but it misses a few duplicate entries and I don't know why.
You may need to run sort before uniq -d for it to work correctly, because uniq only compares adjacent lines. Have you tried that?
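
A quick sketch of why the order matters, using made-up input in which one duplicate pair is not adjacent. The helper name scattered is hypothetical:

```shell
# "123456" appears twice but not on adjacent lines; "678905" is adjacent.
scattered() { printf '%s\n' 123456 234567 123456 678905 678905; }

# Without sorting, uniq -d only notices the adjacent pair.
unsorted=$(scattered | uniq -d)

# After sorting, equal lines sit next to each other, so both are reported.
sorted=$(scattered | sort | uniq -d)

printf 'uniq -d alone:\n%s\n' "$unsorted"
printf 'sort | uniq -d:\n%s\n' "$sorted"
```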

Please try it with a numeric sort:
Code:
sort -n file|uniq -d



Here is another version using uniq -c that also gives numerically sorted output:

Code:
sort -n file|uniq -c|awk '{if ($1>1) print $2}'
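
uniq -c prefixes each distinct line with its count, so the awk filter keeps only lines seen more than once. A sketch of the same idea that also prints the count, using a plain sort and made-up input:

```shell
# Count each value, then keep those that occur more than once.
# Output columns: value, number of occurrences.
counts=$(printf '%s\n' 678905 123456 678905 234567 123456 678905 |
    sort | uniq -c | awk '$1 > 1 { print $2, $1 }')

printf '%s\n' "$counts"
```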

Enjoy, have fun!
# 7  
Old 04-25-2013
Quote:
Originally Posted by rveri
Please try it with a numeric sort:
Code:
sort -n file|uniq -d

...

Here is another version using uniq -c that also gives numerically sorted output:

Code:
sort -n file|uniq -c|awk '{if ($1>1) print $2}'

You cannot meaningfully use a numeric sort here, since uniq compares adjacent lines as strings, not as numbers.

uniq will not consider "01" to be equal to "1", nor "1.0" to "1.00", nor " 1" to "1". If leading/trailing zeroes or whitespace are a concern, then either the file needs to be preprocessed to normalize the entries, or a more capable tool should be used, e.g. perl or AWK.
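
One possible way to do this in AWK, as a sketch only: it assumes every line holds a single number, and the sample values are made up. Adding 0 forces a numeric interpretation, so different spellings of the same number share one array key:

```shell
# "1", "01", "  1" and "1.0" are the same number; so are "2.50" and "2.5".
numeric_dups=$(printf '%s\n' 1 01 '  1' 2 1.0 2.50 2.5 |
    awk '{ key = $1 + 0 }            # normalize: "01" and "1.0" become 1
         seen[key]++ == 1 { print key }')

printf '%s\n' "$numeric_dups"
```

Note that a non-numeric line evaluates to 0 under this scheme, so such lines would all be counted as the same entry.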

Demonstration:
Code:
$ printf '%s\n' 1 01 001 '  1'
1
01
001
  1
$ printf '%s\n' 1 01 001 '  1' | sort -un
1
$ printf '%s\n' 1 01 001 '  1' | sort -n | uniq
  1
001
01
1

Notice how sort -un knows that it's doing a numeric comparison and considers all 4 terms to be equal. However, uniq considers each value to be distinct.

Regards,
Alister

Last edited by alister; 04-25-2013 at 07:58 PM..