How can I delete duplicates based on one column of a line?

08-04-2009

My data looks something like this:

Code:

(08/03/2009 22:57:42.414)(:) king aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbb
(08/03/2009 22:57:42.416)(:) John cccccccccccc cccccvssssssssss baaaaa
(08/03/2009 22:57:42.417)(:) Michael ddddddd tststststtststts
(08/03/2009 22:57:42.425)(:) Ravi vvvvvvvvvvvvvvvvvvsssssssss bsbbbbs
(08/03/2009 22:57:42.426)(:) John bgbhhhhhhhhhhhhhhhhh dddddddddddddd
(08/03/2009 22:57:42.427)(:) king hhhhhhhhhhhhhssssss rr

Here I need to use the 3rd column as the key for finding duplicate rows. The output should keep only one row per key: one king, one John, and so on.

Expected output:

Code:

(08/03/2009 22:57:42.414)(:) king aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbb
(08/03/2009 22:57:42.416)(:) John cccccccccccc cccccvssssssssss baaaaa
(08/03/2009 22:57:42.417)(:) Michael ddddddd tststststtststts
(08/03/2009 22:57:42.425)(:) Ravi vvvvvvvvvvvvvvvvvvsssssssss bsbbbbs

Can some expert help me with this? It would be very helpful for my script.

rdhanek

08-04-2009

This may not be the most efficient way:

Code:

awk '!arr[$3]++ {print}'  file
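
For anyone wondering how the one-liner works, here is a small sketch run against the sample data from the first post. The temporary file and the variable names (`sample`, `deduped`) are only for illustration, and an awk that accepts this syntax (nawk, gawk, or a POSIX awk) is assumed.

```shell
# Write the sample data from the first post to a scratch file (illustrative path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
(08/03/2009 22:57:42.414)(:) king aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbb
(08/03/2009 22:57:42.416)(:) John cccccccccccc cccccvssssssssss baaaaa
(08/03/2009 22:57:42.417)(:) Michael ddddddd tststststtststts
(08/03/2009 22:57:42.425)(:) Ravi vvvvvvvvvvvvvvvvvvsssssssss bsbbbbs
(08/03/2009 22:57:42.426)(:) John bgbhhhhhhhhhhhhhhhhh dddddddddddddd
(08/03/2009 22:57:42.427)(:) king hhhhhhhhhhhhhssssss rr
EOF

# With whitespace splitting, $1 is "(08/03/2009", $2 is "22:57:42.414)(:)"
# and $3 is the name.
# arr[$3]++ yields the number of times the key was seen before this line
# (0 the first time) and then increments it. !0 is true, so only the first
# line for each key is printed.
deduped=$(awk '!arr[$3]++' "$sample")
printf '%s\n' "$deduped"

rm -f "$sample"
```

The `{print}` in the posted command is optional, because printing the current line is awk's default action when a pattern has no action attached.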

johnbach

08-04-2009

I am getting a syntax error with that command. Could you verify the syntax, please?

rdhanek

08-04-2009

Quote:

Originally Posted by rdhanek

I am getting a syntax error with that command. Could you verify the syntax, please?

On Solaris, use nawk or /usr/xpg4/bin/awk instead of the default awk, which is the old awk and rejects this syntax.

Regards

Franklin52

08-04-2009

not working

I tried using

nawk '!arr[$3]++ {print}' file

It's not removing the duplicates; it just prints all the rows.
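
If every row still comes out, one thing worth checking is what awk actually sees as column 3 in the real file. The sketch below is only a guess at a possible cause, since the real file is not shown in the thread: it uses a made-up stand-in where the names differ in case, prints field 3 in brackets so stray characters become visible, and then deduplicates on a lower-cased key. The file contents and variable names are hypothetical; `tolower()` is assumed to be available, as it is in POSIX awk.

```shell
# Hypothetical stand-in for the real file: same layout, names differing in case.
sample=$(mktemp)
cat > "$sample" <<'EOF'
(08/03/2009 22:57:42.414)(:) king aaaa bbbb
(08/03/2009 22:57:42.416)(:) John cccc dddd
(08/03/2009 22:57:42.426)(:) john bgbh dddd
(08/03/2009 22:57:42.427)(:) King hhhh rr
EOF

# Show what awk treats as field 3 on every line, bracketed so that
# unexpected characters or a different field position are easy to spot.
fields=$(awk '{ print NR ": [" $3 "]" }' "$sample")
printf '%s\n' "$fields"

# Keep the first line per key, comparing keys case-insensitively.
deduped_ci=$(awk '!seen[tolower($3)]++' "$sample")
printf '%s\n' "$deduped_ci"

rm -f "$sample"
```

If the bracketed output shows something other than the name, the key is in a different field and `$3` needs to change accordingly.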

rdhanek

08-04-2009

Very inefficient:

Code:

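awk '{
    x = $3              # key of the current line
    if (x != y) print   # print only when the key differs from the previous line
    y = $3              # remember this key for the next line
}' file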

ilikecows

08-04-2009

This prints all the lines without removing those with a duplicate column 3.
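
That result is expected with the sample data: the previous-line comparison only drops a line when the line directly before it has the same column 3, and no two neighbouring lines in the sample share a name. A minimal sketch of one way around this, assuming it is acceptable for the output to come out in sorted order rather than the original order, is to sort on column 3 first so that equal keys sit next to each other. The scratch file and variable names are illustrative.

```shell
# Sample data from the first post, written to a scratch file (illustrative path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
(08/03/2009 22:57:42.414)(:) king aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbb
(08/03/2009 22:57:42.416)(:) John cccccccccccc cccccvssssssssss baaaaa
(08/03/2009 22:57:42.417)(:) Michael ddddddd tststststtststts
(08/03/2009 22:57:42.425)(:) Ravi vvvvvvvvvvvvvvvvvvsssssssss bsbbbbs
(08/03/2009 22:57:42.426)(:) John bgbhhhhhhhhhhhhhhhhh dddddddddddddd
(08/03/2009 22:57:42.427)(:) king hhhhhhhhhhhhhssssss rr
EOF

# sort -k3,3 places lines with the same third field next to each other.
# awk then prints a line only when its key differs from the previous key;
# prev starts out empty, so the first line is always printed.
grouped=$(sort -k3,3 "$sample" | awk '$3 != prev { print } { prev = $3 }')
printf '%s\n' "$grouped"

rm -f "$sample"
```

The order of the names in the output depends on the locale used by sort. When the original order matters, the array-based one-liner posted earlier in the thread avoids the sort altogether.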

rdhanek
