awk remove first duplicates


# 1  
awk remove first duplicates

Hi All,
I have searched many threads for a close solution, but I was unable to find a similar scenario.

I would like to print all duplicates based on the 3rd column except the first occurrence, and also print single entries (non-duplicates).
Code:
I/P file
12  NIL ABD LON
11  NIL ABC SIG    <= first duplicate for the 3rd column; needs to be removed
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR

Code:
O/P desired based on 3rd column
12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR

Many thanks,

Last edited by joeyg; 01-28-2014 at 11:09 AM.. Reason: corrected a spelling error
# 2  
Code:
awk 'NR==FNR{A[$3]++;next}{if(A[$3] > 1 && !B[$3]){B[$3]++;next} }1' file file

12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
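For readability, the same two-pass logic can be written out with comments (a sketch; `sample.txt` is an assumed file name, and it is named twice on the command line so awk reads it once to count and a second time to print):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

# Pass 1 (NR==FNR): tally how often each 3rd-column value occurs.
# Pass 2: suppress only the first occurrence of values seen more than once.
awk '
NR == FNR { count[$3]++; next }
count[$3] > 1 && !skipped[$3]++ { next }
{ print }
' sample.txt sample.txt
```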

# 3  
Quote:
Originally Posted by pamu
Code:
awk 'NR==FNR{A[$3]++;next}{if(A[$3] > 1 && !B[$3]){B[$3]++;next} }1' file file
 
12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR


Works really well, but a bit slow.
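If the double read is the bottleneck, one alternative is a single-pass variant that buffers the file in memory and filters at END (a sketch; `sample.txt` is an assumed file name, and this trades the second read for memory proportional to the file size):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

# One read: remember every line and its key, count the keys,
# then at END drop only the first occurrence of each duplicated key.
awk '
{ line[NR] = $0; key[NR] = $3; count[$3]++ }
END {
    for (i = 1; i <= NR; i++) {
        k = key[i]
        if (count[k] > 1 && !seen[k]++) continue   # first duplicate: skip
        print line[i]
    }
}
' sample.txt
```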
# 4  
Another approach:
Code:
awk 'NR==FNR{a[$3]++;next}a[$3]>1{a[$3]=0; next}1' file file
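The reset is what makes this work: zeroing the count after the first duplicate lets the remaining copies fall through to the default print. The same logic with comments (a sketch, behavior unchanged; `sample.txt` is an assumed file name):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

awk '
NR == FNR { a[$3]++; next }    # first read: count 3rd-column values
a[$3] > 1 { a[$3] = 0; next }  # first duplicate: skip it and zero the count,
                               # so the remaining copies fall through
1                              # default action: print the line
' sample.txt sample.txt
```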

# 5  
Hello,

The following may help.

Code:
awk 'NR==1 {print} f ~ $3 && i == 0 {i++;} f ~ $3 && i > 0 {print $0;i=0;j=1} f !~ $3 && j==1  {print $0} {f=$3;}'  file_name


Output will be as follows.

Code:
12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR


NOTE: It will only work for this particular input.


Thanks,
R. Singh

Last edited by RavinderSingh13; 01-28-2014 at 12:06 PM.. Reason: added a note
# 6  
Quote:
Originally Posted by Franklin52
Another approach:
Code:
awk 'NR==FNR{a[$3]++;next}a[$3]>1{a[$3]=0; next}1' file file

Nice approach, Franklin52!
# 7  
If the file is sorted on col3 so that equal values are adjacent (like your example):
Code:
awk '{first=($3!=p3)} (first==0 || pfirst==0); {p3=$3; pfirst=first}' file

The principle becomes clear with
Code:
awk '{first=($3!=p3)} {print pfirst,first,":",$0} {p3=$3; pfirst=first}' file
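On the sample input, the debug version prints the pfirst/first flag pair in front of each line; the main command keeps a line when either flag is 0. A sketch (`sample.txt` is an assumed file name; the first line's pfirst prints as an empty string because the variable is still uninitialized):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

# first is 1 at the start of each col3 run, 0 on continuation rows;
# pfirst carries the previous row's flag into the current row.
awk '{first=($3!=p3)} {print pfirst,first,":",$0} {p3=$3; pfirst=first}' sample.txt
```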
