How to find duplicate line in Linux?


 
# 1  
Old 07-30-2013

Hi, Gurus,

I need to find the duplicate records in a Unix file.

What command should I use for this?

Thanks in advance.
# 2  
Old 07-30-2013
try:
Code:
awk 'a[$0]++' infile

# 3  
Old 07-30-2013
Quote:
Originally Posted by rdrtx1
try:
Code:
awk 'a[$0]++' infile

Thanks for your quick reply.

It works perfectly.

Thanks again.
# 4  
Old 07-31-2013
Quote:
Originally Posted by rdrtx1
try:
Code:
awk 'a[$0]++' infile

It works like a charm for me as well. Would you mind very much explaining what 'a[$0]++' means?
# 5  
Old 07-31-2013
a is an associative array indexed by the whole record ($0), and a[$0]++ post-increments the value stored for that record.

For the first occurrence of each record the expression evaluates to zero, because post-increment returns the value from before the increment. For every later occurrence of the same record it evaluates to a non-zero value.

awk treats a non-zero value as true, and since the pattern has no action attached, the default action runs and prints that record.

You can run the program below to see what is going on:
Code:
awk '{ print $0, a[$0]++ }' file
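To make the explanation above concrete, here is a small sketch that runs the commands on a throwaway sample file. The file contents and the variable names (tmp, dups, once) are made up for illustration and are not from the original posts. The last command is a variant I am adding as an assumption about what some readers may want: it prints each duplicated line only once, on its second occurrence.

```shell
# Create a temporary sample file; name and contents are illustrative only.
tmp=$(mktemp)
printf '%s\n' apple banana apple cherry banana apple > "$tmp"

# Show each line next to the counter value returned before the increment.
awk '{ print $0, a[$0]++ }' "$tmp"

# Pattern only, no action: print every line whose counter was already non-zero,
# i.e. the second and any later occurrence of a line.
dups=$(awk 'a[$0]++' "$tmp")
printf '%s\n' "$dups"

# Variant (not from the thread): print a duplicated line once only,
# at the moment its second occurrence is seen.
once=$(awk 'a[$0]++ == 1' "$tmp")
printf '%s\n' "$once"

rm -f "$tmp"
```

With this sample, the first command shows the counter as 0 for the first apple, banana and cherry, and 1 or 2 for the repeats. The second command therefore prints apple, banana, apple, and the variant prints only apple and banana.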
