how to delete duplicate rows based on last column
Hi, I have a huge amount of data stored in a file. I need to remove duplicate rows, where rows count as duplicates if every column except the last one matches. Among such duplicates I want to keep only the row with the largest value in the last column, printed once along with the other columns. For example, given a file that looks like this:

1902 8 22 3 40.0000 77.0000 8.60
1902 8 22 3 40.0000 76.5000 8.20
1902 8 22 3 40.0000 76.5000 8.30
1902 8 22 3 40.0000 77.0000 8.40
1902 8 22 3 39.8000 76.2000 8.10
1902 9 30 6 38.5000 67.0000 7.70
1902 9 30 6 38.5000 67.0000 6.30
1902 10 6 9 36.5000 70.5000 7.20
1902 12 4 22 37.8000 65.5000 4.90

I want the output to be:

1902 8 22 3 40.0000 77.0000 8.60
1902 8 22 3 40.0000 76.5000 8.30
1902 8 22 3 39.8000 76.2000 8.10
1902 9 30 6 38.5000 67.0000 7.70
1902 10 6 9 36.5000 70.5000 7.20
1902 12 4 22 37.8000 65.5000 4.90

Last edited by reva; 08-25-2009 at 05:17 AM.
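Something like this:
Code:
awk '{ va = $NF; $NF = ""                            # $0 is now the row without its last column
       if (!($0 in a) || va + 0 > a[$0] + 0) a[$0] = va }   # keep the largest last column per key
     END { for (i in a) print i a[i] }' file_name.txt       # the key already ends with a space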
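
A one-liner like this depends on awk comparing the saved column values as numbers rather than as strings. The following is a minimal sketch of the difference, using made-up values that are not from the data file: adding `+ 0` forces a numeric comparison, while concatenating an empty string forces a string comparison.

```shell
# Illustrative values only, not taken from the data file in this thread.
# String comparison: "10.5" sorts before "9.2" because "1" < "9".
str_result=$(awk 'BEGIN { a = "10.5"; b = "9.2"; print (((a "") > (b "")) ? "a" : "b") }')
# Numeric comparison: adding 0 converts both operands to numbers first.
num_result=$(awk 'BEGIN { a = "10.5"; b = "9.2"; print ((a + 0 > b + 0) ? "a" : "b") }')
echo "string: $str_result, numeric: $num_result"
```

Values read from input fields that look like numbers are normally compared numerically anyway, so the explicit `+ 0` is mainly a safeguard.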
Thanks a lot, it's working, but the first few lines are being deleted from my file...
One more thing: for the same data, what if I need the output to be

1902 8 22 3 40.0000 77.0000 8.60
1902 9 30 6 38.5000 67.0000 7.70
1902 10 6 9 36.5000 70.5000 7.20
1902 12 4 22 37.8000 65.5000 4.90

That is, just check whether the first 4 columns are equal, and among those rows keep the one with the largest values in the other columns, as shown above.
Something like this:
Code:
awk '{ key = $1 " " $2 " " $3 " " $4; val = $5 " " $6 " " $7
       # string comparison: assumes fixed-width values, as in the sample data
       if (!(key in a) || val > a[key]) a[key] = val }
     END { for (i in a) print i, a[i] }' file_name.txt
This needs further checking, as the order of the elements in the associative array is not the same as the order of the input.
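
If the original row order matters, one option is to record the order in which each key first appears and print in that order. This is a minimal sketch, assuming whitespace-separated columns, the first four columns as the key, and "largest" meaning the largest last column (the thread does not settle that last point). The helper name `dedup_keep_order` and the array names `row`, `best`, and `order` are illustrative only.

```shell
# Sample rows from the question, held in a variable for a self-contained demo.
sample='1902 8 22 3 40.0000 77.0000 8.60
1902 8 22 3 40.0000 76.5000 8.20
1902 8 22 3 40.0000 76.5000 8.30
1902 8 22 3 40.0000 77.0000 8.40
1902 8 22 3 39.8000 76.2000 8.10
1902 9 30 6 38.5000 67.0000 7.70
1902 9 30 6 38.5000 67.0000 6.30
1902 10 6 9 36.5000 70.5000 7.20
1902 12 4 22 37.8000 65.5000 4.90'

# Hypothetical helper: per key (columns 1-4), keep the row with the largest
# last column, and print the kept rows in the order the keys first appeared.
dedup_keep_order() {
  awk '{
    key = $1 " " $2 " " $3 " " $4
    if (!(key in row)) order[++n] = key              # remember first-seen order
    if (!(key in row) || $NF + 0 > best[key] + 0) {  # numeric compare on last column
      best[key] = $NF
      row[key] = $0
    }
  }
  END { for (i = 1; i <= n; i++) print row[order[i]] }'
}

result=$(printf '%s\n' "$sample" | dedup_keep_order)
printf '%s\n' "$result"
```

Indexing a separate `order` array by a counter avoids relying on `for (i in a)`, whose iteration order awk does not guarantee.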
Another way...
For the 1st one...
Code:
sort -k7,7n infile | awk '{t[$1" "$2" "$3" "$4" "$5" "$6]=$7}END{for (i in t){print i,t[i]}}'
For the 2nd one...
Code:
sort -k5,5n -k6,6n -k7,7n infile | awk '{t[$1" "$2" "$3" "$4]=$5" "$6" "$7}END{for (i in t){print i,t[i]}}'
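
A related variant, offered as a sketch rather than as a replacement for the commands above: sort in descending order on the last column and let awk keep only the first row it sees for each key. It assumes seven whitespace-separated columns with the first six as the key. The array name `seen` is illustrative, and `LC_ALL=C` is set so that the decimal point is parsed the same way regardless of locale.

```shell
# Sample rows from the question, held in a variable for a self-contained demo.
sample='1902 8 22 3 40.0000 77.0000 8.60
1902 8 22 3 40.0000 76.5000 8.20
1902 8 22 3 40.0000 76.5000 8.30
1902 8 22 3 40.0000 77.0000 8.40
1902 8 22 3 39.8000 76.2000 8.10
1902 9 30 6 38.5000 67.0000 7.70
1902 9 30 6 38.5000 67.0000 6.30
1902 10 6 9 36.5000 70.5000 7.20
1902 12 4 22 37.8000 65.5000 4.90'

# Largest last column first, then print a row only the first time its key appears.
deduped=$(printf '%s\n' "$sample" \
  | LC_ALL=C sort -k7,7nr \
  | awk '!seen[$1, $2, $3, $4, $5, $6]++')
printf '%s\n' "$deduped"
```

Because awk prints rows as they arrive, the output follows the sorted order (descending last column) instead of the unspecified order of a `for (i in t)` loop.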