Delete lines with duplicate strings based on date

01-12-2010

Hey all, a relative bash/scripting newbie here, trying to solve a problem.

I've got a text file with lots of lines that I've been able to clean up and format with awk/sed/cut, but now I'd like to remove the lines with duplicate usernames, keeping one line per user based on the timestamp. Here's what the data looks like:

Code:

2007-11-03 user1 xxxxxxxx
2008-06-30 user1 yyyyyyyy
2008-01-26 user2 bbbbbbbb
2007-11-03 user3 aaaaaaaa
2007-11-02 user4 cccccccc
2008-07-09 user4 dddddddd
2008-06-30 user4 eeeeeeee

What I'm trying to do is keep only the line with the most recent date for each account.

For example, I'd like the output to be:

Code:

2008-06-30 user1 yyyyyyyy
2008-01-26 user2 bbbbbbbb
2007-11-03 user3 aaaaaaaa
2008-07-09 user4 dddddddd

Some accounts are listed only twice, some more often. How can I delete all but the newest line for each username?

Thanks for your help.

mattv

01-12-2010

Code:

awk '!arr[$1 $2]++'  inputfile > newfile

jim mcnamara
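
The `!arr[key]++` idiom prints a line only the first time its key is seen. Keyed on `$1 $2`, it drops only lines that repeat the same date and username together. A minimal sketch of how the same idiom might be pointed at the question, assuming the dates are always in `yyyy-mm-dd` form (so a plain text sort orders them by time) and that `sample.txt` is a hypothetical scratch file holding the data from the first post:

```shell
# Sample data from the question, written to a scratch file (hypothetical name).
cat > sample.txt <<'EOF'
2007-11-03 user1 xxxxxxxx
2008-06-30 user1 yyyyyyyy
2008-01-26 user2 bbbbbbbb
2007-11-03 user3 aaaaaaaa
2007-11-02 user4 cccccccc
2008-07-09 user4 dddddddd
2008-06-30 user4 eeeeeeee
EOF

# 1. sort -r        : newest dates first (assumes yyyy-mm-dd dates)
# 2. !seen[$2]++    : keep only the first line seen for each username
# 3. sort -k2,2     : order the surviving lines by username
latest_first=$(LC_ALL=C sort -r sample.txt | awk '!seen[$2]++' | LC_ALL=C sort -k2,2)
printf '%s\n' "$latest_first"
```

Because the input is sorted newest first, the first line awk sees for each username is also the most recent one.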

01-12-2010

Quote:

sort file | awk '{ arr[$2] = $0 } END { for (i in arr) print arr[i] }'

anbu23
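
The sort-then-overwrite approach works because, after an ascending sort, the last line stored for each username is the newest one. Note that `for (i in arr)` visits keys in an unspecified order, so the output order can vary between awk implementations. A sketch of a single-pass variant that skips the sort and prints users in the order they first appear, assuming `yyyy-mm-dd` dates (compared as plain strings) and a hypothetical scratch file `sample.txt` holding the data from the first post:

```shell
# Sample data from the question, written to a scratch file (hypothetical name).
cat > sample.txt <<'EOF'
2007-11-03 user1 xxxxxxxx
2008-06-30 user1 yyyyyyyy
2008-01-26 user2 bbbbbbbb
2007-11-03 user3 aaaaaaaa
2007-11-02 user4 cccccccc
2008-07-09 user4 dddddddd
2008-06-30 user4 eeeeeeee
EOF

newest_per_user=$(awk '
  # Remember the order in which usernames first appear.
  !($2 in newest) { order[++n] = $2 }
  # Store the line if the user is new or this date is later than the stored one.
  !($2 in newest) || $1 > date[$2] { date[$2] = $1; newest[$2] = $0 }
  # Print one line per user, in first-seen order.
  END { for (i = 1; i <= n; i++) print newest[order[i]] }
' sample.txt)
printf '%s\n' "$newest_per_user"
```

This trades the external sort for two small arrays, which may matter on large files; if two lines for a user share the same date, the first one encountered is kept.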

01-12-2010

anbu23,

Perfect! Thank you.

mattv
