how to delete duplicate rows based on last column


 
Forum: Top Forums > Shell Programming and Scripting
# 15  
Old 09-01-2009
Please post some sample input and the expected output, at least a few lines, so we can test.
# 16  
Old 09-01-2009
The sample input is:
Code:
 SIG  2007  3 24  4 35 45.80   5.2600  94.3100  58   0 5.20   0 0.00 5.00 0.00   0 0.00 5.20   0
 SSS  2007  3 24  9  3 37.40  36.5600  71.4800 152   0 4.70   0 0.00 0.00 0.00   0 0.00 4.70   0
 SIG  2008  3 25 18 29 33.15   1.7700  99.3400 163   0 4.60   0 0.00 0.00 0.00   0 0.00 4.60   0
 SEG  2008  3 25 18 27 35.06   1.7700  99.3400  89   0 5.00   0 0.00 0.00 0.00   0 0.00 5.00   0
PDE-Q 2009  7  2 22 36 45.17  37.4800  71.7400  20   0 4.60   0 0.00 0.00 0.00   0 0.00 4.60   0 
PDE-Q 2009  7  2 23 50 49.20  37.4800  71.7400 108   0 4.70   0 0.00 0.00 0.00   0 0.00 4.70   0 
PDE-Q 2009  7  3  4 42 32.83  34.4600  24.1200  41   0 4.50   0 0.00 0.00 0.00   0 0.00 4.50   0 
PDE-Q 2009  7  5  9 45 48.77  36.4600  71.0700 248   0 4.90   0 0.00 0.00 0.00   0 0.00 4.90   0
PDE-Q 2009  7  5 12 25 37.44   1.3300  99.7800 185   0 4.50   0 0.00 0.00 0.00   0 0.00 4.60   0
PDE-Q 2009  7  5 12 25 37.44   1.3300  99.7800 185   0 4.50   0 0.00 0.00 0.00   0 0.00 4.50   0
PDE-Q 2009  7  6 16  0 38.96   3.0400  93.3500  34   0 4.90   0 0.00 0.00 0.00   0 0.00 4.90   0
PDE-Q 2009  7  7  0 32 47.11  34.1600  25.5100  13   0 0.00   0 0.00 0.00 0.00   0 0.00 0.00   0
PDE-Q 2009  7  7  1  2  0.48  34.1600  25.5100  25   0 4.80   0 0.00 0.00 0.00   0 3.00 4.80   0

The sample output is:
Code:
 SIG  2007  3 24  4 35 45.80   5.2600  94.3100  58   0 5.20   0 0.00 5.00 0.00   0 0.00 5.20   0
 SEG  2008  3 25 18 27 35.06   1.7700  99.3400  89   0 5.00   0 0.00 0.00 0.00   0 0.00 5.00   0
PDE-Q 2009  7  2 23 50 49.20  37.4800  71.7400 108   0 4.70   0 0.00 0.00 0.00   0 0.00 4.70   0 
PDE-Q 2009  7  3  4 42 32.83  34.4600  24.1200  41   0 4.50   0 0.00 0.00 0.00   0 0.00 4.50   0 
PDE-Q 2009  7  5  9 45 48.77  36.4600  71.0700 248   0 4.90   0 0.00 0.00 0.00   0 0.00 4.90   0
PDE-Q 2009  7  5 12 25 37.44   1.3300  99.7800 185   0 4.50   0 0.00 0.00 0.00   0 0.00 4.60   0
PDE-Q 2009  7  6 16  0 38.96   3.0400  93.3500  34   0 4.90   0 0.00 0.00 0.00   0 0.00 4.90   0
PDE-Q 2009  7  7  1  2  0.48  34.1600  25.5100  25   0 4.80   0 0.00 0.00 0.00   0 3.00 4.80   0

# 17  
Old 09-01-2009
Something like this:
Code:
awk '{
    key = $1 " " $2 " " $3 " " $4
    # for each key, keep the row whose next-to-last field is largest
    if (!(key in max) || $(NF-1) > max[key]) {
        max[key] = $(NF-1)
        row[key] = $0
    }
}
END { for (k in row) print row[k] }' file_name.txt | sort -k2n

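If the records always have 20 whitespace-separated fields, as in the sample, then $(NF-1) is field 19, and you can get the same effect with sort plus a first-occurrence filter. This is just a sketch: the four-field key and the field-19 position are assumptions carried over from the sample data.

```shell
# Sort descending on field 19 (the next-to-last field of the 20-field records),
# keep only the first line seen for each $1/$2/$3/$4 key (i.e. the largest value),
# then restore numeric order on field 2 (the year).
sort -k19,19rn file_name.txt \
  | awk '!seen[$1" "$2" "$3" "$4]++' \
  | sort -k2n
```

The `!seen[key]++` idiom prints a line only the first time its key appears; because the input was pre-sorted with the largest field-19 value first, that first line is the one you want to keep.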
