Hi, I have a file with data as shown below. I need to remove duplicate rows, where two rows count as duplicates if columns 2, 3, 4 and 5 match.
When deleting duplicates, the fullest row should be retained, i.e. the one with the most columns holding values. Duplicates should also be resolved by keeping the row with the largest value in a specific column, say column 19 in the data below.
SSR 1901 12 1 0 0 0.00 40.0000 71.2000 14 12 3.00 0 4.60 4.00 0.00 0 0.00 8.60 0
SSR 1901 12 1 0 10 3.00 40.0000 71.0000 30 0 0.00 0 5.80 0.00 5.90 0 5.70 5.90 0
SSR 1902 8 22 3 7 4.40 40.0000 68.5000 35 0 0.00 0 6.00 0.00 6.20 0 5.90 6.20 0 aaaa
BDA 1902 8 22 3 0 0.00 40.0000 77.0000 60 0 8.70 0 0.00 0.00 8.00 0 8.60 8.60 0 cccc
CFR 1903 8 22 3 0 0.00 40.0000 77.0000 25 0 0.00 0 0.00 0.00 0.00 0 8.60 8.60 0 bbbb
RAO 1906 8 16 17 0 0.00 24.4000 72.7000 10 0 0.00 0 4.30 0.00 0.00 0 0.00 4.30 0
RAO 1906 8 16 17 6 0.00 24.4000 72.7000 10 0 0.00 0 4.30 6.00 0.00 0 0.00 4.30 0
LEE 1912 8 22 3 0 0.00 40.0000 76.5000 0 0 0.00 0 0.00 0.00 0.00 0 8.20 8.20 0 ffff
LEE 1912 8 22 3 0 0.00 40.0000 76.5000 0 0 0.00 0 0.00 0.00 0.00 0 8.20 8.20 0 ffff
The output should look like this:
SSR 1901 12 1 0 0 0.00 40.0000 71.2000 14 12 3.00 0 4.60 4.00 0.00 0 0.00 8.60 0
BDA 1902 8 22 3 0 0.00 40.0000 77.0000 60 0 8.70 0 0.00 0.00 8.00 0 8.60 8.60 0 cccc
CFR 1903 8 22 3 0 0.00 40.0000 77.0000 25 0 0.00 0 0.00 0.00 0.00 0 8.60 8.60 0 bbbb
RAO 1906 8 16 17 6 0.00 24.4000 72.7000 10 0 0.00 0 4.30 6.00 0.00 0 0.00 4.30 0
LEE 1912 8 22 3 0 0.00 40.0000 76.5000 0 0 0.00 0 0.00 0.00 0.00 0 8.20 8.20 0 ffff
Here we remove duplicate rows based on two criteria:
1) Check whether columns 2, 3, 4 and 5 are the same; if so, remove one of the duplicate rows.
2) Retain the row that has the largest value in column 19 and the most columns with values.
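For illustration, the two criteria above can be sketched in Python roughly as follows. This is only a sketch under assumptions that the post does not spell out: rows are whitespace-separated, the column 19 value is compared first, a tie is broken by the number of fields that are not zero, and an exact tie keeps the first row seen. The names `count_nonzero` and `dedupe` are made up for this example. With these assumptions the sample data gives the expected output shown above.

```python
def count_nonzero(fields):
    """Count fields that hold something other than a zero value."""
    n = 0
    for f in fields:
        try:
            if float(f) != 0:
                n += 1
        except ValueError:
            # Non-numeric text such as "SSR" or "aaaa" counts as a value.
            n += 1
    return n


def dedupe(lines, key_cols=(2, 3, 4, 5), value_col=19):
    """Keep one row per key (columns are 1-based).

    Assumed ranking: largest value in value_col first, then the
    largest number of non-zero fields; exact ties keep the first row.
    """
    best = {}  # key -> (rank, line); dicts preserve first-seen key order
    for raw in lines:
        line = raw.rstrip("\n")
        fields = line.split()
        if len(fields) < value_col:
            continue  # assumption: skip blank or short rows
        key = tuple(fields[c - 1] for c in key_cols)
        rank = (float(fields[value_col - 1]), count_nonzero(fields))
        if key not in best or rank > best[key][0]:
            best[key] = (rank, line)
    return [line for _, line in best.values()]
```

It could be used as `for row in dedupe(open("data.txt")): print(row)`, where `data.txt` is a placeholder file name. If the row with the most fields should win before column 19 is compared, the two parts of `rank` would need to be swapped or replaced.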
Please help me out if anyone has an idea. I have been trying to solve this for the past week.
Thanks in advance.