Delete a row that has a duplicate column

04-16-2009

I'm trying to remove lines of data that contain duplicate data in a specific column.

For example, given this input:

Code:

apple 12345
apple 54321
apple 14234
orange 55656
orange 88989
orange 99898

I only want to see the first line for each value in column 1:

Code:

apple 12345
orange 55656

How would I go about doing this?

spartan22

04-16-2009

Code:

#cat test.log
apple 12345
apple 54321
apple 14234
orange 55656
orange 88989
orange 99898

#awk 'a !~ $1; {a=$1}' test.log
apple 12345
orange 55656
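A note on this one-liner: `a !~ $1` uses `$1` as a regular expression and only compares each line with the one before it. It therefore relies on equal keys being adjacent (as they are in the sample) and on keys that do not partially match each other; for instance, a line keyed `app` directly after `apple` would be dropped, because `apple` matches the pattern `app`. A minimal sketch of the same idea with an equality test follows. The variable name `prev` and the `NR == 1` guard are my own choices, not part of the original post.

```shell
# Recreate the sample file from the post.
printf '%s\n' 'apple 12345' 'apple 54321' 'apple 14234' \
              'orange 55656' 'orange 88989' 'orange 99898' > test.log

# Print a line if it is the first line, or if its first field differs
# from the previous line's first field (an equality test, not a regex
# match). Then remember the current key for the next line.
result=$(awk 'NR == 1 || $1 != prev; { prev = $1 }' test.log)
printf '%s\n' "$result"
```

Like the original, this still assumes that duplicate keys sit on consecutive lines.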

Ikon

04-16-2009

The usual awk idiom:

Code:

nawk '!a[$1]++' myFile

Or, using sort:

Code:

sort -u -k1,1 myFile
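To spell out why `!a[$1]++` works: the array element for a key is 0 (false) the first time that key appears, so the negation is true and the line is printed; the post-increment then marks the key as seen, and later lines with the same key are skipped. Unlike a comparison with the previous line, this does not need equal keys to be adjacent. The sketch below assumes plain `awk` is available in place of `nawk`; the file name `fruit.txt`, the interleaved sample data, and the array name `seen` are made up for illustration.

```shell
# Interleaved sample: equal keys are NOT on consecutive lines.
printf '%s\n' 'apple 12345' 'orange 55656' 'apple 54321' \
              'orange 88989' 'apple 14234' > fruit.txt

# Keep the first line seen for each value of column 1, in file order.
first_seen=$(awk '!seen[$1]++' fruit.txt)
printf '%s\n' "$first_seen"

# sort -u on the same key also leaves one line per key, in key order.
# Only the surviving keys are extracted here.
sorted_keys=$(sort -u -k1,1 fruit.txt | awk '{ print $1 }')
printf '%s\n' "$sorted_keys"
```

The sort variant returns lines in key order rather than file order, and POSIX only promises that one line per key survives, not which one, so the awk form is probably the safer choice when the first occurrence specifically is wanted.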

vgersh99

05-30-2009

Hi friends,

Can anybody change the above script (awk 'a !~ $1; {a=$1}' test.log) so that it keeps the last repeated entry and deletes all the previous duplicates?

For example, if the input file is

1 2 3 4
2 2 4 5

the value in column 2 repeats, so I want the output to be 2 2 4 5 and not 1 2 3 4.

Thanks in advance.

ks_reddy

05-30-2009

Sort it in reverse order on column 2 and use the same command, comparing column 2 instead of column 1.

Code:

sort -r -k2,2 filename | awk 'a !~ $2; {a=$2}'
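One caveat with this pipeline: `sort -r` orders lines that share a key by comparing the whole line, so the line that survives is the one that sorts highest, which is not necessarily the one that came last in the file, and the output is no longer in file order. If position in the file is what matters, a two-pass awk is one possible alternative: the first pass records the line number of the last occurrence of each column-2 value, and the second pass prints only those lines. This is a sketch, not the method from the post; the file name `data.txt`, the array name `last`, and the two extra sample rows are illustrative.

```shell
# The two rows from the question plus two extra rows, so that the last
# occurrence of key "2" (0 2 0 0) is not the line that sorts highest.
printf '%s\n' '1 2 3 4' '2 2 4 5' '9 7 1 1' '0 2 0 0' > data.txt

# Pass 1 (NR == FNR): remember the last line number for each column-2 value.
# Pass 2: print a line only if it is that last occurrence.
kept=$(awk 'NR == FNR { last[$2] = NR; next } last[$2] == FNR' data.txt data.txt)
printf '%s\n' "$kept"
```

The file is named twice on purpose so that awk reads it once per pass; this keeps the surviving lines in their original order.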

-Devaraj Takhellambam

devtakh

05-30-2009

Thanks, Devaraj.

Actually, my application is slightly different, but your idea met my needs after a slight modification to my raw data. Thank you very much.

ks_reddy
