Find the replicate record using awk


 
# 1  
Old 11-06-2012
Find the replicate record using awk

We usually use the following awk code to delete or find duplicate records.

Code:
awk -F, '{a[$1]++} END {for (i in a) if (a[i]>=2) print i, a[i]}' file

My question is: how can I print the whole record? The following code doesn't work.

Code:
awk -F, '{a[$1]++} END {for (i in a) if (a[i]>=2) print $0}' file

Thank you!
# 2  
Old 11-06-2012
Try this:

Code:
awk '{a[$0]++} END {for (i in a) if (a[i]>=2) print i}' file

# 3  
Old 11-06-2012
And one needn't wait till the file is read completely, to determine/print the duplicate records:
Code:
awk 'a[$0]++==1' file
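
A small sketch of what this one-liner does, using made-up data (the file name `sample.csv` and its contents are assumptions for illustration only). Because `a[$0]++` yields the count before the increment, the condition is true only when a line is seen for the second time, so each duplicated line is printed exactly once:

```shell
# Hypothetical sample data; sample.csv is an assumed file name.
printf '%s\n' 'a,1' 'b,2' 'a,1' 'a,1' 'c,3' 'b,2' > sample.csv

# a[$0]++ evaluates to the number of earlier occurrences of the line,
# so the pattern matches only on the second occurrence.
dups=$(awk 'a[$0]++==1' sample.csv)
printf '%s\n' "$dups"
```

Note that this compares whole lines, not just the first field.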

# 4  
Old 11-06-2012
Quote:
Originally Posted by ripat
Try this:

Code:
awk '{a[$0]++} END {for (i in a) if (a[i]>=2) print i}' file

Sorry, I didn't explain clearly. What I want is to print the whole record for every line whose $1 occurs more than once.

---------- Post updated at 02:18 AM ---------- Previous update was at 02:18 AM ----------

Quote:
Originally Posted by elixir_sinari
And one needn't wait till the file is read completely, to determine/print the duplicate records:
Code:
awk 'a[$0]++==1' file

Sorry, I didn't explain clearly. What I want is to print the whole record for every line whose $1 occurs more than once.
# 5  
Old 11-06-2012
awk

Hi,

Try this one,

Code:
awk -F, '{a[$1]++;if(v[$1]){v[$1]=v[$1] ORS $0;}else{v[$1]=$0;}} END {for (i in a) if (a[i]>=2) print v[i]}' file


If you want to display only the duplicated lines (that is, without the first occurrence of each key),

Code:
awk -F, '{a[$1]++;}a[$1]>1{if(v[$1]){v[$1]=v[$1] ORS $0;}else{v[$1]=$0;}} END {for (i in a) if (a[i]>=2) print v[i]}' file

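One caveat with both commands: `for (i in a)` visits the keys in an unspecified order, so the groups may not come out in input order. If input order matters, a two-pass approach is one option. This is only a sketch and not code from this thread; the file name `sample.csv` and its contents are made up, and the file must be a regular file because it is read twice:

```shell
# Hypothetical sample data; sample.csv is an assumed file name.
printf '%s\n' 'k1,apple' 'k2,pear' 'k1,plum' 'k3,fig' 'k2,kiwi' > sample.csv

# First pass (NR==FNR is true only for the first file argument): count each key.
# Second pass: print, in input order, every record whose key was seen more than once.
same_key=$(awk -F, 'NR==FNR { count[$1]++; next } count[$1] > 1' sample.csv sample.csv)
printf '%s\n' "$same_key"
```
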
Cheers,
Ranga

Last edited by rangarasan; 11-06-2012 at 02:27 AM.. Reason: Added additional information
# 6  
Old 11-06-2012
With some assumptions:
Code:
sort -t, -k1,1 file|awk -F, 'p1==$1{if(p) print p0;p=0;print;next}{p1=$1;p0=$0;p=1}'


Last edited by elixir_sinari; 11-06-2012 at 02:56 AM..
# 7  
Old 11-06-2012
Quote:
Originally Posted by rangarasan
Hi,

Try this one,

Code:
awk -F, '{a[$1]++;if(v[$1]){v[$1]=v[$1] ORS $0;}else{v[$1]=$0;}} END {for (i in a) if (a[i]>=2) print v[i]}' file


If you want to display only the duplicated lines (that is, without the first occurrence of each key),

Code:
awk -F, '{a[$1]++;}a[$1]>1{if(v[$1]){v[$1]=v[$1] ORS $0;}else{v[$1]=$0;}} END {for (i in a) if (a[i]>=2) print v[i]}' file

Cheers,
Ranga
It works, thank you!

---------- Post updated at 02:38 AM ---------- Previous update was at 02:35 AM ----------

Quote:
Originally Posted by elixir_sinari
With some assumptions:
Code:
sort -t, -k1,1 file|awk -F, 'p1==$1{if(p) print p0;p=0;print;next}{p1=$1;p0=$0;p=1}'

Nice! Thank you! Could you explain the awk code? I have never seen that kind of code before.
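
For readers with the same question, here is an annotated sketch of the edited pipeline from post #6, spread over several lines. The logic is unchanged; the file name `sample.csv` and its contents are made up for illustration. An awk program is a list of `pattern { action }` rules: the first rule runs only when the key equals the previous record's key, and the second rule has no pattern, so it runs for every record that reaches it.

```shell
# Hypothetical sample data; sample.csv is an assumed file name.
printf '%s\n' 'k2,pear' 'k1,apple' 'k3,fig' 'k1,plum' 'k2,kiwi' 'k1,date' > sample.csv

# Sorting on field 1 puts records with the same key next to each other,
# so awk only has to compare each record with the previous one.
grouped=$(sort -t, -k1,1 sample.csv | awk -F, '
  # Same key as the previous record: this record belongs to a duplicate group.
  p1 == $1 {
      if (p) print p0   # print the first record of the group, once only
      p = 0             # remember that it has been printed
      print             # print the current record
      next              # skip the rule below; p1 is already this key
  }
  # New key: save the key and the record, and flag the record as not yet printed.
  { p1 = $1; p0 = $0; p = 1 }
')
printf '%s\n' "$grouped"
```

Records with a unique key (here `k3,fig`) are saved in `p0` but never printed, because no later record matches their key. One of the assumptions is presumably that the first key is not empty, since `p1` starts out as an empty string.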