Delete lines with duplicate strings based on date


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers Delete lines with duplicate strings based on date
# 1  
Old 01-12-2010
Delete lines with duplicate strings based on date

Hey all, a relative bash/script newbie trying solve a problem.

I've got a text file with lots of lines that I've been able to clean up and format with awk/sed/cut, but now I'd like to remove the lines with duplicate usernames based on time stamp. Here's what the data looks like

Code:
2007-11-03 user1 xxxxxxxx
2008-06-30 user1 yyyyyyyy
2008-01-26 user2 bbbbbbbb
2007-11-03 user3 aaaaaaaa
2007-11-02 user4 cccccccc
2008-07-09 user4 dddddddd
2008-06-30 user4 eeeeeeee

What I'm trying to do is pull only the lines with the most recent date with unique accounts.

For example, I'd like the output to be:

Code:
2008-06-30 user1 yyyyyyyy
2008-01-26 user2 bbbbbbbb
2007-11-03 user3 aaaaaaaa
2008-07-09 user4 eeeeeeee

Some accounts are listed only 2 times, some more often. How can I delete the 'oldest' lines with that username?

Thanks for your help.
# 2  
Old 01-12-2010
Code:
awk '!arr[$1 $2]++'  inputfile > newfile

# 3  
Old 01-12-2010
Quote:
sort file | awk ' {arr[$2]=$0} END { for(i in arr) { print arr[i] } } '
# 4  
Old 01-12-2010
anbu23,

Perfect! Thank you.
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Delete rows based on array of strings

Hi All, I have a file with around 1000 rows and one of the fields is an account number. I have been provided with a few account numbers, when any of the account number appears in a row then the row has to be deleted. Can we place the account numbers in an array and use awk or sed for this task?... (3 Replies)
Discussion started by: mrcool4
3 Replies

2. Shell Programming and Scripting

Delete duplicate row based on criteria

Hi, I have an input file as shown below: 20140102;13:30;FR-AUD-LIBOR-1W;2.495 20140103;13:30;FR-AUD-LIBOR-1W;2.475 20140106;13:30;FR-AUD-LIBOR-1W;2.495 20140107;13:30;FR-AUD-LIBOR-1W;2.475 20140108;13:30;FR-AUD-LIBOR-1W;2.475 20140109;13:30;FR-AUD-LIBOR-1W;2.475... (2 Replies)
Discussion started by: shash
2 Replies

3. Shell Programming and Scripting

Remove lines containing 2 or more duplicate strings

Within my text file i have several thousand lines of text with some lines containing duplicate strings/words. I would like to entirely remove those lines which contain the duplicate strings. Eg; One and a Two Unix.com is the Best This as a Line Line Example duplicate sentence with the word... (22 Replies)
Discussion started by: martinsmith
22 Replies

4. Shell Programming and Scripting

Delete duplicate strings in a line

Hi, i need help to remove duplicates in my file. The problem is i need to delete one duplicate for each line only. the input file as follows and it is not tab delimited:- The output need to remove 2nd word (in red) that duplicate with 1st word (in blue). Other duplicates should remained... (12 Replies)
Discussion started by: redse171
12 Replies

5. Shell Programming and Scripting

Getting lines between two strings with duplicate set of data

if I have the following lines in a file app.log some lines here <AAAA> abc <id>123456789</id> ddd </AAAA>some lines here too <BBBB> abc <id>123456789</id> ddd </BBBB>some lines here too <AAAA> xyz <id>987654321</id> ssss </AAAA>some lines here again... How do I get the... (5 Replies)
Discussion started by: nariwithu
5 Replies

6. Shell Programming and Scripting

Delete lines in file containing duplicate strings, keeping longer strings

The question is not as simple as the title... I have a file, it looks like this <string name="string1">RZ-LED</string> <string name="string2">2.0</string> <string name="string2">Version 2.0</string> <string name="string3">BP</string> I would like to check for duplicate entries of... (11 Replies)
Discussion started by: raidzero
11 Replies

7. UNIX for Advanced & Expert Users

In a huge file, Delete duplicate lines leaving unique lines

Hi All, I have a very huge file (4GB) which has duplicate lines. I want to delete duplicate lines leaving unique lines. Sort, uniq, awk '!x++' are not working as its running out of buffer space. I dont know if this works : I want to read each line of the File in a For Loop, and want to... (16 Replies)
Discussion started by: krishnix
16 Replies

8. UNIX for Dummies Questions & Answers

Delete strings in file1 based on the list of strings in file2

Hello guys, should be a very easy questn for you: I need to delete strings in file1 based on the list of strings in file2. like file2: word1_word2_ word3_word5_ word3_word4_ word6_word7_ file1: word1_word2_otherwords..,word3_word5_others... (7 Replies)
Discussion started by: roussine
7 Replies

9. Shell Programming and Scripting

how to delete duplicate rows based on last column

hii i have a huge amt of data stored in a file.Here in this file i need to remove duplicates rows in such a way that the last column has different data & i must check for greatest among last colmn data & print the largest data along with other entries but just one of other duplicate entries is... (16 Replies)
Discussion started by: reva
16 Replies

10. Shell Programming and Scripting

How to delete duplicate records based on key

For example suppose I have a file which contains data as: $cat data 800,2 100,9 700,3 100,9 200,8 100,3 Now I want the output as 200,8 700,3 800,2 Key is first three characters, I don't want any reords which are having duplicate keys. Like sort +0.0 -0.3 data can we use... (9 Replies)
Discussion started by: sumitc
9 Replies
Login or Register to Ask a Question