To select non-duplicate records using awk

02-24-2014

Friends,

I have data sorted on id like this

Code:

id address
1 abc
2 abc
2 abc
2 abc
3 aabc
4 abc
4 abc

I want to pick each id with its address, leaving out the duplicate records. The desired output would be

Code:

id address
1 abc
2 abc
3 aabc
4 abc

I tried this way

Code:

awk '{ if ($1==id) {next} else {print; id=$1} }' inputfile

But the results still contained a few duplicates.

Any help will be appreciated.

Paresh

Moderator's Comments:

Please use CODE tags (not ICODE tags) when tagging multi-line sample code, input, and output.

Last edited by Don Cragun; 02-24-2014 at 05:01 AM.. Reason: Change ICODE tags to CODE tags.

paresh n doshi

02-24-2014

If your data is correctly sorted, then your code will work. Just check if the data is sorted.
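A rough sketch of how one might check this. The three-line sample and the variable names are made up for illustration; `sort -c` only checks the order and returns a non-zero exit status when the input is out of order on the given key.

```shell
# Hypothetical sample: id 1 shows up again after id 2, so the ids are not grouped
unsorted=$(printf '%s\n' '1 abc' '2 abc' '1 abc')

# Comparing each id with the previous one only drops ADJACENT repeats,
# so the second "1 abc" survives here
adjacent=$(printf '%s\n' "$unsorted" | awk 'NR == 1 || $1 != prev { print } { prev = $1 }')

# sort -c checks the order without sorting; -k1,1n compares field 1 numerically
if printf '%s\n' "$unsorted" | sort -c -k1,1n 2>/dev/null; then
    order="sorted"
else
    order="not sorted"
fi

echo "$order"      # prints: not sorted
echo "$adjacent"   # prints all three lines, duplicate included
```

If the check fails on the real file, sorting it first (or using an approach that does not depend on order) should remove the remaining duplicates.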

anbu23

02-24-2014

Hi,
As anbu23 said, if your data is sorted, your code is correct.
This is the usual way to select non-duplicate records:

Code:

awk '!a[$1]++' inputfile
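For anyone unfamiliar with the idiom, here is a sketch of what it does, using the sample data from the question (the variable names and the array name `seen` are only for illustration). `a[$1]++` returns the count seen so far for that id and then increments it, so `!a[$1]++` is true only the first time an id appears, and a true pattern with no action prints the line. It does not need sorted input.

```shell
# Sample data as posted in the question
sample=$(printf '%s\n' 'id address' '1 abc' '2 abc' '2 abc' '2 abc' '3 aabc' '4 abc' '4 abc')

# The one-liner: print a line only the first time its id is seen
short=$(printf '%s\n' "$sample" | awk '!a[$1]++')

# The same logic written out step by step
long=$(printf '%s\n' "$sample" | awk '{
    if (seen[$1] == 0)   # id not seen before
        print            # keep this record
    seen[$1]++           # remember the id for later lines
}')

echo "$short"
# prints:
# id address
# 1 abc
# 2 abc
# 3 aabc
# 4 abc
```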

Lucas_0418

02-24-2014

Try:

Code:

sort -un inputfile
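As a sketch of why this works on the sample from the question: `-n` compares lines by their leading number, and `-u` keeps one line for each distinct value. Spelling the key out with `-k1,1n` should give the same result here. The variable names are only for illustration, and note that a header line with no leading number counts as 0 and so sorts first.

```shell
# Sample data as posted in the question
sample=$(printf '%s\n' 'id address' '1 abc' '2 abc' '2 abc' '2 abc' '3 aabc' '4 abc' '4 abc')

# Whole line compared by its leading number, one line kept per value
implicit=$(printf '%s\n' "$sample" | sort -un)

# Same idea with the key made explicit: field 1 only, numeric
explicit=$(printf '%s\n' "$sample" | sort -u -k1,1n)

echo "$implicit"
```

Unlike the awk approaches, this keys on the id alone, so two records with the same id but different addresses would also be collapsed into one.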

Klashxx

02-24-2014

Code:

awk '! X[$0]++'

SriniShoo

02-24-2014

I would suggest keying on $1 and $2 instead of $0, so that records differing only in blank spaces are still treated as duplicates:

Code:

awk '!A[$1,$2]++' file
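A small sketch of the difference, using made-up records where the first two differ only in spacing (the variable names are for illustration only). Keying on `$0` compares the raw line, while `$1,$2` compares the fields after awk has split on whitespace.

```shell
# Hypothetical records: lines 1 and 2 carry the same id and address, spaced differently
ragged=$(printf '%s\n' '1  abc' '1 abc' '2 abc')

# Whole-line key: the two "1 abc" variants count as different records
by_line=$(printf '%s\n' "$ragged" | awk '!X[$0]++' | wc -l | tr -d ' ')

# Field key: extra spaces between fields no longer matter
by_fields=$(printf '%s\n' "$ragged" | awk '!A[$1,$2]++' | wc -l | tr -d ' ')

echo "$by_line $by_fields"   # prints: 3 2
```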

Yoda
