Issue using awk to print min values for unique ids


 
# 1  
Old 04-23-2019

I am using the following script to find and print the minimum value of each individual field (3-14) for each unique id (field 1), ignoring the placeholder value "-99.99". But when a line contains "-99.99", and when the minimum value is on the first line of a new id (field 1), the output does not print field 2; it leaves a blank space. Any ideas on how to correct this?

The snippet below looks for the min value in field 12 only.

Code:
awk ' /-99.99/ {next} {if (a[$1] == "") a[$1] = $12; if (a[$1] > $12) {a[$1]=$12" " $2} } END { for (i in a) { print i, a[i]} }'

Input file sample
Code:
01001   012017  10.64   3.96    4.45    3.48    9.58    10.76   4.44    5.03    2.78    1.45    1.26    3.24
01001   012018  3.94    6.18    4.47    4.79    11.07   3.01    4.27    6.37    6.48    4.52    6.16    9.95
01001   012019  6.77    3.79    3.96    -99.99  -99.99  -99.99  -99.99  -99.99  -99.99  -99.99  -99.99  -99.99
01003   011895  6.46    3.53    6.69    4.63    3.54    11.71   7.64    8.02    2.45    3.21    1.35    4.05
01003   011896  3.80    5.88    5.63    2.45    2.83    6.98    5.24    5.60    2.90    5.38    4.09    2.45

Current Output
Code:
01001   1.45 
01003   3.21  011895

Expected Output
Code:
01001   1.45  012017 
01003   3.21  011895


Last edited by ncwxpanther; 04-24-2019 at 11:34 AM.. Reason: Provide Clarity
# 2  
Old 04-23-2019
A bit verbose, but... something along these lines.
awk -f ncw.awk myFile where ncw.awk is:
Code:
{
  key[$1]                                  # remember every id seen
  for (i = 3; i <= NF; i++) {
    # ignore non-positive values such as -99.99; keep the smallest value per id and field
    if ($i > 0 && (!(($1, i) in f1) || f1[$1, i] > $i)) {
      f1[$1, i] = $i
      if (!(($1, i) in f2)) f2[$1, i] = $2   # field 2 of the first row with a usable value
    }
  }
  nf = NF
}
END {
  for (k in key)
    for (i = 3; i <= nf; i++) {
      printf("%s%s%s", (i==3) ? k OFS f2[k,i] : "", OFS, (i==nf) ? f1[k,i] ORS : f1[k,i])
    }
}

results in (given your sample input file; the second column is field 2 of the first usable row for each id):
Code:
01001 012017 3.94 3.79 3.96 3.48 9.58 3.01 4.27 5.03 2.78 1.45 1.26 3.24
01003 011895 3.80 3.53 5.63 2.45 2.83 6.98 5.24 5.60 2.45 3.21 1.35 2.45
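
The listing above prints a single field-2 value per id. If the year that belongs to each per-field minimum is wanted as well, one possible variation is sketched below. It is not the code from this post: the helper name min_per_field and the arrays min and year are made up for illustration, and it assumes -99.99 is the only placeholder for missing data.

```shell
# Hypothetical helper: min_per_field FILE
# Prints "id field min year", one line per id and column (3..NF).
min_per_field() {
  awk '
    {
      for (i = 3; i <= NF; i++) {
        if ($i == -99.99) continue                 # placeholder for missing data
        if (!(($1, i) in min) || $i + 0 < min[$1, i] + 0) {
          min[$1, i]  = $i                         # smallest value so far for this id and column
          year[$1, i] = $2                         # field 2 of the row holding that value
        }
      }
    }
    END {
      for (k in min) {
        split(k, part, SUBSEP)                     # k is id SUBSEP column
        print part[1], part[2], min[k], year[k]
      }
    }
  ' "$1"
}
```

Something like min_per_field myFile | sort -k1,1 -k2,2n gives a stable order, since for (k in min) visits the keys in no particular order. With the sample input, the line for id 01001 and field 12 should read 01001 12 1.45 012017.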


Last edited by vgersh99; 04-23-2019 at 01:49 PM..
# 3  
Old 04-23-2019
I get different results:
Code:
awk '
!($1 in t)      {t[$1] = $3}
!/-99.99/       {for(i = 3; i <= NF; i++) if (t[$1] > $i) {
                 t[$1] = $i; k[$1] = $2}
                 }
END             { for (i in t) print i, t[i], k[i]
                 }' file

--- Post updated at 19:46 ---

Code:
01001 1.26 012017
01003 1.35 011895

--- Post updated at 19:52 ---

Fix:
Code:
awk '
# seed the minimum and its field 2 from the first usable row of each id
!/-99.99/       {if (!($1 in t)) {t[$1] = $3; k[$1] = $2}
                 for(i = 3; i <= NF; i++) if (t[$1] > $i) {
                 t[$1] = $i; k[$1] = $2}
                }
END             { for (i in t) print i, t[i], k[i]
                }' file

--- Post updated at 19:59 ---

I did not fully understand the task until the end :)
# 4  
Old 04-23-2019
Hello ncwxpanther,

Could you please try the following as well? It assumes that your Input_file is sorted by the 1st field (as in the samples shown). If it is not, you could sort it first, e.g. sort -k1 Input_file | awk ...., in front of the following code.

Code:
awk '
prev != $1 && NR > 1 {          # id changed: report the finished group
  print prev, min, min_sec
  min = ""
}
{
  for (i = 3; i <= NF; i++) {
    # skip the -99.99 placeholder; remember field 2 of the row holding each new minimum
    if ($i != -99.99 && (min == "" || $i < min)) {
      min = $i
      min_sec = $2
    }
  }
  prev = $1
}
END {
  if (min != "") {
    print prev, min, min_sec
  }
}'   Input_file

Output will be as follows.
Code:
01001 1.26 012017
01003 1.35 011895

Thanks,
R. Singh
# 5  
Old 04-23-2019
Thanks nezabudka


Is the field of interest determined by counting from the last field in the file?
Code:
 for(i = 3; i <= NF; i++)

So would this be the 3rd field from the end? The output seems correct, but I am putting in variables for each field from the 3rd to the 14th. So in the snippet below I'm considering 10.64 as the 3rd field and 3.24 as the 14th field.

Code:
01001   012017  10.64   3.96    4.45    3.48    9.58    10.76   4.44    5.03    2.78    1.45    1.26    3.24

In other words, what would be the correct syntax for finding the min value of the 3rd field?
# 6  
Old 04-23-2019
ncwxpanther,
in my post #2, the suggested implementation calculates the min value of every field from field 3 onward.
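
On the NF question from post #5: NF is awk's built-in count of fields in the current line, and fields are numbered from the left, so for(i = 3; i <= NF; i++) runs from the 3rd field through the last one. A quick check on the sample line (the shell variable name line is only for illustration):

```shell
line='01001   012017  10.64   3.96    4.45    3.48    9.58    10.76   4.44    5.03    2.78    1.45    1.26    3.24'
# NF is the number of fields; $3 is the 3rd field from the left, $NF is the last field
echo "$line" | awk '{ print NF, $3, $NF }'
```

This prints 14 10.64 3.24: there are 14 fields, 10.64 is field 3, and 3.24 is the last field, i.e. field 14.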
# 7  
Old 04-23-2019
Try:
Code:
awk '
  !($1 in M) {
    M[$1]=$3
  }
  {
    for(i=3; i<=NF; i++) 
      if($i>-99.99 && $i<=M[$1]) {
        M[$1]=$i
        V[$1]=$2
      }
  }
  END {
    for(i in M) print i, M[i], V[i]
  }
' file

Code:
01001 1.26 012017
01003 1.35 011895
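
For the question in post #5 (the minimum of one particular field, e.g. field 3 or field 12), a minimal sketch follows. It is not taken from any of the posts above; the helper name min_by_id and the arrays min and year are made up for illustration. It tests for -99.99 only in the chosen column, and it stores field 2 at the same moment as the value, so the year is present even when the minimum sits on the first line of an id. The value and the year live in separate arrays, so later comparisons stay numeric instead of comparing against a string such as "1.45 012017".

```shell
# Hypothetical helper: min_by_id FIELD FILE
# Prints "id min year" for one chosen column, ignoring the -99.99 placeholder.
min_by_id() {
  awk -v f="$1" '
    $f == -99.99 { next }                          # this column is missing on this row
    !($1 in min) || $f + 0 < min[$1] + 0 {         # first usable value for the id, or a new minimum
      min[$1]  = $f
      year[$1] = $2                                # keep the year together with the value
    }
    END { for (id in min) print id, min[id], year[id] }
  ' "$2"
}
```

With the sample from post #1, min_by_id 12 file | sort should give 01001 1.45 012017 and 01003 3.21 011895, which matches the expected output apart from spacing. The sort is there because the order of for (id in min) is unspecified.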
