awk remove first duplicates


# 1  
awk remove first duplicates

Hi All,
I have searched many threads for a close solution, but I was unable to find a similar scenario.

I would like to print all duplicates based on the 3rd column except the first occurrence, and also print single entries (non-duplicates).
Code:
I/P file
12  NIL ABD LON
11  NIL ABC SIG    <= first duplicate for the 3rd column; needs to be removed
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR

Code:
O/P desired based on 3rd column
12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR

Many thanks,

Last edited by joeyg; 01-28-2014 at 11:09 AM.. Reason: corrected a spelling error
# 2  
Code:
awk 'NR==FNR{A[$3]++;next}{if(A[$3] > 1 && !B[$3]){B[$3]++;next} }1' file file

12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
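For readability, the same two-pass logic can be written out with comments (a sketch; `sample.txt` is an assumed file name, and it is named twice on the command line so awk reads it once to count and a second time to print):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

# Pass 1 (NR==FNR): tally how often each 3rd-column value occurs.
# Pass 2: suppress only the first occurrence of values seen more than once.
awk '
NR == FNR { count[$3]++; next }
count[$3] > 1 && !skipped[$3]++ { next }
{ print }
' sample.txt sample.txt
```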

# 3  
Quote:
Originally Posted by pamu
Code:
awk 'NR==FNR{A[$3]++;next}{if(A[$3] > 1 && !B[$3]){B[$3]++;next} }1' file file
 
12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR


Works really well, but a bit slow.
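If the double read is the bottleneck, one alternative is a single-pass variant that buffers the file in memory and filters at END (a sketch; `sample.txt` is an assumed file name, and this trades the second read for memory proportional to the file size):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

# One read: remember every line and its key, count the keys,
# then at END drop only the first occurrence of each duplicated key.
awk '
{ line[NR] = $0; key[NR] = $3; count[$3]++ }
END {
    for (i = 1; i <= NR; i++) {
        k = key[i]
        if (count[k] > 1 && !seen[k]++) continue   # first duplicate: skip
        print line[i]
    }
}
' sample.txt
```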
# 4  
Another approach:
Code:
awk 'NR==FNR{a[$3]++;next}a[$3]>1{a[$3]=0; next}1' file file
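The reset is what makes this work: zeroing the count after the first duplicate lets the remaining copies fall through to the default print. The same logic with comments (a sketch, behavior unchanged; `sample.txt` is an assumed file name):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

awk '
NR == FNR { a[$3]++; next }    # first read: count 3rd-column values
a[$3] > 1 { a[$3] = 0; next }  # first duplicate: skip it and zero the count,
                               # so the remaining copies fall through
1                              # default action: print the line
' sample.txt sample.txt
```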

# 5  
Hello,

The following may help.

Code:
awk 'NR==1 {print} f ~ $3 && i == 0 {i++;} f ~ $3 && i > 0 {print $0;i=0;j=1} f !~ $3 && j==1  {print $0} {f=$3;}'  file_name


Output will be as follows.

Code:
12  NIL ABD LON
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR


NOTE: It will only work for this particular input.


Thanks,
R. Singh

Last edited by RavinderSingh13; 01-28-2014 at 12:06 PM.. Reason: added a note
# 6  
Quote:
Originally Posted by Franklin52
Another approach:
Code:
awk 'NR==FNR{a[$3]++;next}a[$3]>1{a[$3]=0; next}1' file file

Nice approach, Franklin52!
# 7  
If the file is sorted on col3 so that equal values are adjacent (like your example):
Code:
awk '{first=($3!=p3)} (first==0 || pfirst==0); {p3=$3; pfirst=first}' file

The principle becomes clear with
Code:
awk '{first=($3!=p3)} {print pfirst,first,":",$0} {p3=$3; pfirst=first}' file
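On the sample input, the debug version prints the pfirst/first flag pair in front of each line; the main command keeps a line when either flag is 0. A sketch (`sample.txt` is an assumed file name; the first line's pfirst prints as an empty string because the variable is still uninitialized):

```shell
# Sample input from the thread (assumed name: sample.txt)
cat > sample.txt <<'EOF'
12  NIL ABD LON
11  NIL ABC SIG
12  NIL ABC AMR
13  NIL ABC AMR
11  NIL ABK AMR
EOF

# first is 1 at the start of each col3 run, 0 on continuation rows;
# pfirst carries the previous row's flag into the current row.
awk '{first=($3!=p3)} {print pfirst,first,":",$0} {p3=$3; pfirst=first}' sample.txt
```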
