Issue using awk to print min values for unique ids


# 8  
I see. To be more specific, I need the script to look at each field separately for each unique ID, i.e. the min value for just Field 3 or just Field 4.
# 9  
Quote:
Originally Posted by vgersh99
a bit verbose, but..... something along these lines.
awk -f ncw.awk myFile where ncw.awk is:
Code:
{
  for(i=3;i<=NF;i++) {
     key[$1]
     if ( $i >0 && (!(($1, i) in f1) || f1[key[$1],i] > $i)) {
       f1[$1,i] = $i
       if (!($1,i) in f2) f2[$1,i] = $2
     }
  }
  nf=i
}
END {
  for( k in key)
    for(i=3; i<= nf; i++) {
      printf("%s%s%s", (i==3)?k OFS f2[k,i]:"", OFS, (i==nf)?f1[k,i] ORS:f1[k,i])
    }
}

results in (given your sample input file):
Code:
01001 012017 10.64 3.96 4.45 3.48 9.58 10.76 4.44 5.03 2.78 1.45 1.26 3.24
01003 011895 6.46 3.53 6.69 4.63 3.54 11.71 7.64 8.02 2.45 3.21 1.35 4.05

Thanks for the help, but this prints out the first line for each unique ID, not necessarily the min value.

Input
Code:
01001   011895  7.03    2.96    8.36    3.53    3.96    5.40    3.92    3.36    0.73    2.03    1.44    3.66
01001   011896  5.86    5.42    5.54    3.98    3.77    6.24    4.38    2.57    0.82    1.66    2.89    1.94
01001   011897  3.27    6.63    10.94   4.35    0.81    1.57    3.96    5.02    0.87    0.75    1.84    4.38
01001   011898  2.33    2.07    2.60    4.56    0.54    3.13    5.80    6.02    1.51    3.21    6.66    3.91
01001   011899  5.80    6.94    3.35    2.22    2.93    2.31    6.80    2.90    0.63    3.02    1.98    5.25
.....
01003   011895  6.46    3.53    6.69    4.63    3.54    11.71   7.64    8.02    2.45    3.21    1.35    4.05
01003   011896  3.80    5.88    5.63    2.45    2.83    6.98    5.24    5.60    2.90    5.38    4.09    2.45
01003   011897  3.95    6.62    8.19    4.43    1.09    2.44    7.06    14.69   2.08    1.29    2.45    3.91
01003   011898  2.82    4.25    2.04    3.64    0.73    5.87    6.72    11.96   9.90    1.46    7.88    5.09
01003   011899  5.64    4.47    3.37    0.97    1.16    5.90    7.81    6.53    0.31    2.23    3.75    4.79
.....

Output
Code:
01001 011895 7.03 2.96 8.36 3.53 3.96 5.40 3.92 3.36 0.73 2.03 1.44 3.66
01003 011895 6.46 3.53 6.69 4.63 3.54 11.71 7.64 8.02 2.45 3.21 1.35 4.05

# 10  
Sorry, my bad. Try this version:
Code:
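{
  for(i=3;i<=NF;i++) {
     key[$1]   # remember every ID seen
     # keep the smallest positive value per (ID, field)
     if ( $i >0 && (!(($1, i) in f1) || f1[$1,i] > $i)) {
       f1[$1,i] = $i
       # $2 is stored once, from the first qualifying row
       if (!(($1,i) in f2)) f2[$1,i] = $2
     }
  }
  nf=NF   # last field number, used in END (i is NF+1 here)
}
END {
  for( k in key)
    for(i=3; i<= nf; i++) {
      # ID and $2 lead the line; ORS follows the last field
      printf("%s%s%s", (i==3)?k OFS f2[k,i]:"", OFS, (i==nf)?f1[k,i] ORS:f1[k,i])
    }
}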
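
For anyone wondering why the first version only ever kept the first row: `key[$1]` is created empty, so `f1[key[$1],i]` looks up an element that was never assigned, and an unset value compares as 0 against a number. The sketch below reduces both comparisons to a single field on two made-up rows (the `A x 5` / `A y 3` data is an assumption for illustration, not from the thread's file).

```shell
# Two hypothetical rows for one ID; the minimum of field 3 is 3.
rows='A x 5
A y 3'

# Original comparison: f1[key[$1],3] is never set, so "0 > $3" is false
# and the value from the first row is never replaced.
buggy=$(printf '%s\n' "$rows" | awk '
  { key[$1]
    if (!(($1,3) in f1) || f1[key[$1],3] > $3) f1[$1,3] = $3 }
  END { print f1["A",3] }')

# Corrected comparison: compare against the stored value for this ID/field.
fixed=$(printf '%s\n' "$rows" | awk '
  { key[$1]
    if (!(($1,3) in f1) || f1[$1,3] > $3) f1[$1,3] = $3 }
  END { print f1["A",3] }')

echo "buggy=$buggy fixed=$fixed"
```

This should print `buggy=5 fixed=3`.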

# 12  
Adaptation of post #7 for the new requirements:

Code:
awk -v c=4 '
  !($1 in M) {
    M[$1]=$c
  }
  {
    if($c>-99.99 && $c<=M[$1]) {
      M[$1]=$c
      V[$1]=$2
    }
  }
  END {
    for(i in M) print i, M[i], V[i]
  }
' file

c is the number of the column to minimise; the output is the ID, the minimum, and field 2 of the row holding it.
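
A minimal sketch of running that script for more than one column, assuming a small subset of the sample rows (only fields 1 to 4 are kept here). The wrapper name `min_for_col` is hypothetical and not part of the post; the awk body is the same logic, written with pattern-action rules.

```shell
# Subset of the thread's sample rows: ID, date, fields 3 and 4 only.
data='01001 011895 7.03 2.96
01001 011896 5.86 5.42
01001 011897 3.27 6.63
01001 011898 2.33 2.07
01003 011895 6.46 3.53
01003 011896 3.80 5.88
01003 011898 2.82 4.25'

# min_for_col N: read rows on stdin, print "ID min date" for column N.
min_for_col() {
  awk -v c="$1" '
    !($1 in M) { M[$1] = $c }
    $c > -99.99 && $c <= M[$1] { M[$1] = $c; V[$1] = $2 }
    END { for (i in M) print i, M[i], V[i] }
  ' | sort   # for-in order is unspecified in awk, so sort for stable output
}

col3=$(printf '%s\n' "$data" | min_for_col 3)
col4=$(printf '%s\n' "$data" | min_for_col 4)
printf '%s\n' "$col3" "$col4"
```

On this subset, column 3 gives 2.33 (011898) for 01001 and 2.82 (011898) for 01003; column 4 gives 2.07 (011898) and 3.53 (011895).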
# 14  
Quote:
Originally Posted by ncwxpanther
These appear to be the same. Did I miss something?
It's not the same; the comparison now reads f1[$1,i] > $i instead of f1[key[$1],i] > $i:
Code:
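{
  for(i=3;i<=NF;i++) {
     key[$1]   # remember every ID seen
     # keep the smallest positive value per (ID, field)
     if ( $i >0 && (!(($1, i) in f1) || f1[$1,i] > $i)) {
       f1[$1,i] = $i
       # $2 is stored once, from the first qualifying row
       if (!(($1,i) in f2)) f2[$1,i] = $2
     }
  }
  nf=NF   # last field number, used in END (i is NF+1 here)
}
END {
  for( k in key)
    for(i=3; i<= nf; i++) {
      # ID and $2 lead the line; ORS follows the last field
      printf("%s%s%s", (i==3)?k OFS f2[k,i]:"", OFS, (i==nf)?f1[k,i] ORS:f1[k,i])
    }
}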

given your sample file:
Code:
01001   011895  7.03    2.96    8.36    3.53    3.96    5.40    3.92    3.36    0.73    2.03    1.44    3.66
01001   011896  5.86    5.42    5.54    3.98    3.77    6.24    4.38    2.57    0.82    1.66    2.89    1.94
01001   011897  3.27    6.63    10.94   4.35    0.81    1.57    3.96    5.02    0.87    0.75    1.84    4.38
01001   011898  2.33    2.07    2.60    4.56    0.54    3.13    5.80    6.02    1.51    3.21    6.66    3.91
01001   011899  5.80    6.94    3.35    2.22    2.93    2.31    6.80    2.90    0.63    3.02    1.98    5.25
01003   011895  6.46    3.53    6.69    4.63    3.54    11.71   7.64    8.02    2.45    3.21    1.35    4.05
01003   011896  3.80    5.88    5.63    2.45    2.83    6.98    5.24    5.60    2.90    5.38    4.09    2.45
01003   011897  3.95    6.62    8.19    4.43    1.09    2.44    7.06    14.69   2.08    1.29    2.45    3.91
01003   011898  2.82    4.25    2.04    3.64    0.73    5.87    6.72    11.96   9.90    1.46    7.88    5.09
01003   011899  5.64    4.47    3.37    0.97    1.16    5.90    7.81    6.53    0.31    2.23    3.75    4.79

produces:
Code:
01001 011895 2.33 2.07 2.60 2.22 0.54 1.57 3.92 2.57 0.63 0.75 1.44 1.94
01003 011895 2.82 3.53 2.04 0.97 0.73 2.44 5.24 5.60 0.31 1.29 1.35 2.45
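
One way to spot-check a single cell of that output is to pull one column for one ID and sort it numerically. This is only a sketch, using the first three fields of the 01001 rows from the sample; the variable names are placeholders.

```shell
# First three fields of the 01001 rows from the sample input above.
rows='01001 011895 7.03
01001 011896 5.86
01001 011897 3.27
01001 011898 2.33
01001 011899 5.80'

# Pull field 3 for one ID, sort numerically, keep the smallest value.
# LC_ALL=C keeps "." as the decimal point regardless of the user's locale.
check=$(printf '%s\n' "$rows" |
  awk '$1 == "01001" && $3 > 0 { print $3 }' |
  LC_ALL=C sort -n | head -n 1)
echo "$check"
```

This should print 2.33, matching the third field of the 01001 line above.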
