keeping last record among group of records with common fields (awk) | Unix Linux Forums | UNIX for Dummies Questions & Answers

    #1  
10-07-2012, beca123456

input:

Code:
ref.1;rack.1;1     #group1
ref.1;rack.1;2     #group1
ref.1;rack.2;1     #group2
ref.2;rack.3;1     #group3
ref.2;rack.3;2     #group3
ref.2;rack.3;3     #group3

Among records from the same group (i.e., with the same first and second fields, separated by ";"), I need to keep the last record (or the record with the highest number in the last field, which is the same record here).

The expected output is:

Code:
ref.1;rack.1;2
ref.1;rack.2;1
ref.2;rack.3;3

I think I managed to isolate the records into groups with:

Code:
BEGIN{FS=OFS=";"}

{array[$1$2] = $0

for(a in array)
if(<$0 is last record>) print array[a]}

but I don't know how to express "the last record". I tried using NR, but it didn't really help.
    #2  
10-07-2012, elixir_sinari
Quote:
Originally posted by beca123456:
but I don't know how to say "the last record". I tried with NR but it didn't really help...
To know which record is the last one, you have to read the full input stream and then print the records in the END block. Since you are using an associative array keyed on the group, each element always holds the last record seen for its group.
Setting OFS is unnecessary here, because the script prints whole records ($0) and never rebuilds them from fields.

Code:
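BEGIN{FS=";"}
# Key on fields 1 and 2; a later record of the same group overwrites the earlier one.
{array[$1,$2]=$0}
# After all input is read, each element holds the last record of its group.
END{for(a in array) print array[a]}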
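
Note that `for(a in array)` visits the elements in an unspecified order. If the output should follow the order in which the groups first appear, a minimal sketch along these lines may help. It assumes a POSIX awk and records separated by ";"; the function name `last_per_group` and the arrays `rec` and `order` are hypothetical names used only for illustration:

```shell
# last_per_group: hypothetical helper name, not taken from the thread.
# Reads ";"-separated records on stdin and prints the last record of
# each (field 1, field 2) group, in order of first appearance.
last_per_group() {
    awk -F';' '
    # First time this key is seen: remember its position.
    !(($1, $2) in rec) { order[++n] = $1 SUBSEP $2 }
    # Later records of the same group overwrite earlier ones.
    { rec[$1, $2] = $0 }
    END { for (i = 1; i <= n; i++) print rec[order[i]] }
    '
}
```

It could be called as `last_per_group < input`. The `in` test checks for a key without creating it, which is why the order array only grows when a group is really new.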

    #3  
10-07-2012, beca123456
Thanks elixir_sinari. I understand this approach now!

And what if I wanted to keep the record with the highest value in the last field instead of the last line (the two are the same here, but I would like to know)?
    #4  
10-07-2012, elixir_sinari
Assuming only positive values in the third field, you may try:

Code:
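BEGIN{FS=";"}
# An unset highest[] entry compares as 0 against a numeric field,
# hence the positive-values assumption. With ">=", the later record wins on ties.
$3>=highest[$1,$2]{maxrec[$1,$2]=$0;highest[$1,$2]=$3}
END{
for(i in maxrec)
 print maxrec[i]
}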
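
If the third field may be zero or negative, one possible variation is to test explicitly whether the group has been seen before instead of relying on an unset element. This is only a sketch under that assumption; `max_per_group` is a hypothetical name, and `$3 + 0` is there to force a numeric comparison:

```shell
# max_per_group: hypothetical helper name, not taken from the thread.
# Prints, for each (field 1, field 2) group, the record whose third
# field is numerically highest. Output order is unspecified.
max_per_group() {
    awk -F';' '
    # Keep the record when its key is new or its third field is strictly larger.
    !(($1, $2) in highest) || $3 + 0 > highest[$1, $2] {
        highest[$1, $2] = $3 + 0
        maxrec[$1, $2] = $0
    }
    END { for (k in maxrec) print maxrec[k] }
    '
}
```

Because this sketch uses a strict `>`, the first record of a group is kept on ties, whereas the `>=` in the code above keeps the last one.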


    #5  
10-07-2012, Don Cragun
Here are some options for you. The script below shows several ways of processing the input file. The results differ depending on whether you want the highest or the last value of $3, whether all entries with matching field 1 and field 2 values are adjacent or may be spread throughout the input file, and whether you care that the output order matches the input file order:

Code:
#!/bin/ksh
printf "Following assumes all entries with matching field 1 & field 2 are
adjacent and prints the last entry found.\n"
awk 'BEGIN {FS = OFS = ";"}
last != $1 FS $2 {
        if(last != "") print last, hi3
        last = $1 FS $2
        hi3 = $3
        next
}
        {hi3 = $3}
END {   if(last != "") print last, hi3}' input

printf "\nFollowing assumes all entries with matching field 1 & field 2 are
adjacent and prints the entry with highest value in field 3.\n"
awk 'BEGIN {FS = OFS = ";"}
last != $1 FS $2 {
        if(last != "") print last, hi3
        last = $1 FS $2
        hi3 = $3
        next
}
        {if($3 > hi3) hi3 = $3}
END {   if(last != "") print last, hi3}' input

printf "\nFollowing assumes entries with matching field 1 & field 2 might not be
adjacent and prints the last entry found.  Output order is not guaranteed to
match the order of first appearance in the input file.\n"
awk 'BEGIN {FS = OFS = ";"}
{       k3[$1 FS $2] = $3}
END {   for (k in k3) print k, k3[k]}' input

printf "\nFollowing assumes entries with matching field 1 & field 2 might not be
adjacent and prints the highest entry found.  Output order is not guaranteed to
match the order of first appearance in the input file.\n"
awk 'BEGIN {FS = OFS = ";"}
{       if(kc[$1 FS $2]++ == 0) k3[$1 FS $2] = $3
        else if($3 > k3[$1 FS $2]) k3[$1 FS $2] = $3
}
END {   for (k in kc) print k, k3[k]}' input

printf "\nFollowing assumes entries with matching field 1 & field 2 might not be
adjacent and prints the highest entry found.  Output order is guaranteed to
match the order of first appearance in the input file.\n"
awk 'BEGIN {FS = OFS = ";"}
{       if(kc[$1 FS $2]++ == 0) {
                k3[$1 FS $2] = $3
                order[++cnt] = $1 FS $2
        } else if($3 > k3[$1 FS $2]) k3[$1 FS $2] = $3
}
END {   for(i = 1; i <= cnt; i++) print order[i], k3[order[i]]}' input

When the file named input contains:

Code:
split;test;2
ref.1;rack.1;2
ref.1;rack.1;1
ref.1;rack.2;1
split;test;3
ref.2;rack.3;1
ref.2;rack.3;2
ref.2;rack.3;3
split;test;1

the output from the above script is:

Code:
Following assumes all entries with matching field 1 & field 2 are
adjacent and prints the last entry found.
split;test;2
ref.1;rack.1;1
ref.1;rack.2;1
split;test;3
ref.2;rack.3;3
split;test;1

Following assumes all entries with matching field 1 & field 2 are
adjacent and prints the entry with highest value in field 3.
split;test;2
ref.1;rack.1;2
ref.1;rack.2;1
split;test;3
ref.2;rack.3;3
split;test;1

Following assumes entries with matching field 1 & field 2 might not be
adjacent and prints the last entry found.  Output order is not guaranteed to
match the order of first appearance in the input file.
split;test;1
ref.2;rack.3;3
ref.1;rack.1;1
ref.1;rack.2;1

Following assumes entries with matching field 1 & field 2 might not be
adjacent and prints the highest entry found.  Output order is not guaranteed to
match the order of first appearance in the input file.
split;test;3
ref.2;rack.3;3
ref.1;rack.1;2
ref.1;rack.2;1

Following assumes entries with matching field 1 & field 2 might not be
adjacent and prints the highest entry found.  Output order is guaranteed to
match the order of first appearance in the input file.
split;test;3
ref.1;rack.1;2
ref.1;rack.2;1
ref.2;rack.3;3
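
Another possibility, not shown in the script above, is to sort the file first so that matching entries become adjacent and the highest value comes last in each group; the "adjacent" logic then applies. This is a rough sketch assuming a POSIX `sort` and `awk`; the function name `max_per_group_sorted` is hypothetical, and the output comes out in sorted order rather than input order:

```shell
# max_per_group_sorted: hypothetical helper name, not taken from the thread.
# Sorts on fields 1 and 2, then numerically on field 3, and prints the
# last record of each group, i.e. the one with the highest field 3.
max_per_group_sorted() {
    LC_ALL=C sort -t';' -k1,1 -k2,2 -k3,3n |
    awk -F';' '
    { key = $1 FS $2 }
    # Group changed: emit the last record of the previous group.
    NR > 1 && key != prev { print last }
    { prev = key; last = $0 }
    END { if (NR > 0) print last }
    '
}
```

It could be called as `max_per_group_sorted < input`. The awk part only has to remember one record at a time, at the cost of sorting the whole file first.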

    #6  
10-07-2012, beca123456
Wow! This is a very complete answer!
Thanks a lot!