Issue using awk to print min values for unique ids


# 1  

I am using the following script to find and print the minimum value of each field (3-14) for each unique id (field 1). When a field contains "-99.99" (I am ignoring "-99.99") and when the minimum value is on the first line of a new id (field 1), the output does not print field 2; it leaves a blank space instead. Any ideas on how to correct this?

The snippet below looks for the min value in field 12 only.

Code:
awk ' /-99.99/ {next} {if (a[$1] == "") a[$1] = $12; if (a[$1] > $12) {a[$1]=$12" " $2} } END { for (i in a) { print i, a[i]} }'

Input file sample
Code:
01001   012017  10.64   3.96    4.45    3.48    9.58    10.76   4.44    5.03    2.78    1.45    1.26    3.24
01001   012018  3.94    6.18    4.47    4.79    11.07   3.01    4.27    6.37    6.48    4.52    6.16    9.95
01001   012019  6.77    3.79    3.96    -99.99  -99.99  -99.99  -99.99  -99.99  -99.99  -99.99  -99.99  -99.99
01003   011895  6.46    3.53    6.69    4.63    3.54    11.71   7.64    8.02    2.45    3.21    1.35    4.05
01003   011896  3.80    5.88    5.63    2.45    2.83    6.98    5.24    5.60    2.90    5.38    4.09    2.45

Current Output
Code:
01001   1.45 
01003   3.21  011895

Expected Output
Code:
01001   1.45  012017 
01003   3.21  011895


Last edited by ncwxpanther; 4 Weeks Ago at 10:34 AM.. Reason: Provide Clarity
# 2  
A bit verbose, but something along these lines.
Run awk -f ncw.awk myFile, where ncw.awk is:
Code:
{
  key[$1]
  for(i=3;i<=NF;i++) {
     # keep the smallest positive value per (id, field); -99.99 never passes $i > 0
     if ( $i >0 && (!(($1, i) in f1) || f1[$1,i] > $i)) {
       f1[$1,i] = $i
       f2[$1,i] = $2   # field 2 of the row that holds the current minimum
     }
  }
  nf=NF
}
END {
  for( k in key)
    for(i=3; i<= nf; i++) {
      # only the field 2 that belongs to the field-3 minimum is printed
      printf("%s%s%s", (i==3)?k OFS f2[k,i]:"", OFS, (i==nf)?f1[k,i] ORS:f1[k,i])
    }
}

results in (given your sample input file):
Code:
01001 012018 3.94 3.79 3.96 3.48 9.58 3.01 4.27 5.03 2.78 1.45 1.26 3.24
01003 011896 3.80 3.53 5.63 2.45 2.83 6.98 5.24 5.60 2.45 3.21 1.35 2.45


Last edited by vgersh99; 4 Weeks Ago at 12:49 PM..
# 3  
I get different results:
Code:
awk '
!($1 in t)      {t[$1] = $3}
!/-99.99/       {for(i = 3; i <= NF; i++) if (t[$1] > $i) {
                 t[$1] = $i; k[$1] = $2}
                 }
END             { for (i in t) print i, t[i], k[i]
                 }' file

--- Post updated at 19:46 ---

Code:
01001 1.26 012017
01003 1.35 011895

--- Post updated at 19:52 ---

fix
Code:
awk '
!/-99.99/       {if (! t[$1]) t[$1] = $3
                 for(i = 3; i <= NF; i++) if (t[$1] > $i) {
                 t[$1] = $i; k[$1] = $2}
                }
END             { for (i in t) print i, t[i], k[i]
                }' file

--- Post updated at 19:59 ---

I did not understand the task until the end :)
# 4  
Hello ncwxpanther,

Could you please try the following too? It assumes that your Input_file is sorted by the 1st field (as in the samples shown); if it is not, we could add something like sort -k1 Input_file | awk .... in front of the following code.

Code:
awk '
prev!=$1 && NR>1{
  # id changed: print the result for the previous id and start over
  print prev,min,min_sec
  min=""
}
{
  for(i=3;i<=NF;i++){
      if($i!=-99.99 && (min=="" || $i<min)){
          min=$i
          min_sec=$2   # 2nd field of the line that holds the minimum
      }
  }
  prev=$1
}
END{
  if(min!=""){
      print prev,min,min_sec
  }
}'   Input_file

Output will be as follows.
Code:
01001 1.26 012017
01003 1.35 011895

Thanks,
R. Singh
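
If the input is not grouped by the 1st field, the sort step mentioned above can be placed in front of a streaming script. The following is only a sketch under assumptions: the rows are shuffled copies of the sample from post #1, cut down to fields 1, 2 and 12-14 to keep it short, and the names data, result, min and sec are made up for this illustration.

```shell
# Shuffled sample rows (id, field 2, then three value fields)
data='01003 011896 5.38 4.09 2.45
01001 012018 4.52 6.16 9.95
01003 011895 3.21 1.35 4.05
01001 012019 -99.99 -99.99 -99.99
01001 012017 1.45 1.26 3.24'

# Group the rows by id first, then stream through them once
result=$(printf '%s\n' "$data" | sort -k1,1 | awk '
  NR > 1 && $1 != prev {                  # id changed: report the finished group
    print prev, min, sec
    min = ""
  }
  {
    for (i = 3; i <= NF; i++)
      if ($i != -99.99 && (min == "" || $i < min)) {
        min = $i                          # smallest value seen so far for this id
        sec = $2                          # field 2 of the row it came from
      }
    prev = $1
  }
  END { if (NR) print prev, min, sec }    # report the last group
')
printf '%s\n' "$result"
```

Because only one id is held in memory at a time, this style may suit large files, at the cost of the extra sort.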
# 5  
Thanks nezabudka


Is the field of interest determined by counting from the last field in the file?
Code:
 for(i = 3; i <= NF; i++)

So would this be the 3rd field from the end? The output seems correct, but I am putting in variables for each field from the 3rd to the 14th. So in the snippet below I'm considering 10.64 as the 3rd field and 3.24 as the 14th field.

Code:
01001   012017  10.64   3.96    4.45    3.48    9.58    10.76   4.44    5.03    2.78    1.45    1.26    3.24

In other words, what would be the correct syntax to find the min value of the 3rd field?
# 6  
ncwxpanther,
in my post #2, the suggested implementation calculates ALL the min values for ALL the fields starting at 3.
# 7  
Try:
Code:
awk '
  !($1 in M) {
    M[$1]=$3
  }
  {
    for(i=3; i<=NF; i++) 
      if($i>-99.99 && $i<=M[$1]) {
        M[$1]=$i
        V[$1]=$2
      }
  }
  END {
    for(i in M) print i, M[i], V[i]
  }
' file

Code:
01001 1.26 012017
01003 1.35 011895
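
Since post #1 asks for the minimum of each individual field per id, a variation that reports every field with its own field 2 could be sketched as below. This is an illustration under assumptions, not a tested solution for the full data set: the rows are the sample from post #1 cut down to fields 1-6, the output layout (id, field number, minimum, field 2) is invented here, and data, result, min, f2 and part are made-up names.

```shell
# Sample rows from post #1, fields 1-6 only
data='01001 012017 10.64 3.96 4.45 3.48
01001 012018 3.94 6.18 4.47 4.79
01001 012019 6.77 3.79 3.96 -99.99
01003 011895 6.46 3.53 6.69 4.63
01003 011896 3.80 5.88 5.63 2.45'

result=$(printf '%s\n' "$data" | awk '
  {
    for (i = 3; i <= NF; i++) {
      if ($i == -99.99) continue          # ignore the missing-value marker
      if (!(($1, i) in min) || $i < min[$1, i]) {
        min[$1, i] = $i                   # minimum for this id and field
        f2[$1, i]  = $2                   # field 2 of the row holding it
      }
    }
  }
  END {
    for (k in min) {
      split(k, part, SUBSEP)              # part[1] = id, part[2] = field number
      print part[1], part[2], min[k], f2[k]
    }
  }
' | sort -k1,1 -k2,2n)                    # order by id, then by field number
printf '%s\n' "$result"
```

Each output line pairs one minimum with the field 2 of the row it was found on, so different fields of the same id can point at different rows.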
