Awk- Pivot Table Averages


 
# 1  
Old 01-21-2018

Hi everyone,

Has anyone figured out how to do pivot table averages using awk? I didn't see anything about computing averages.

For example, suppose you have the following table with various individuals and their scores in round1 and round2:

Code:
SAMPLE    SCORE1    SCORE2
British    15    19.5
British    7    9.1
British    8    10.4
British    11    14.3
German    6    7.8
German    7    9.1
Italian    10    13
Italian    3    3.9
Italian    19    24.7
Italian    9    11.7
Italian    6    7.8

The objective is to build a pivot table with awk that calculates the average scores by country, as shown in the output below.

Code:
POPULATION    AVG SCORE1    AVG SCORE2
British    10.25    13.33
German    6.50    8.45
Italian    9.40    12.22


Any ideas how to do this using Awk?
# 2  
Old 01-21-2018
Code:
awk '
NR > 1 {aa[$1]; a[$1]++; b[$1,1]+=$2; b[$1,2]+=$3;}
END {
   printf("%-10s\t%-10s\t%-10s\n", "POPULATION", "AVG SCORE1", "AVG SCORE2");
   for (i in aa) printf("%-10s\t%10.2f\t%10.2f\n", i, b[i,1]/a[i], b[i,2]/a[i]);
}' datafile

This User Gave Thanks to rdrtx1 For This Post:
# 3  
Old 01-21-2018
In aa[$1]; a[$1]++; the first array is not needed,
because its information (the index) is also present in the second array.
So the rule can simply use a[$1]++; and the loop changes to for (i in a).
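
A minimal sketch of that simplification, applied to the script from post #2. The file names scores.txt and averages.txt are placeholders chosen for this example, not names from the thread; only the change from two arrays to one comes from the suggestion above.

```shell
# Sample input from the first post, written to a scratch file (name is an assumption).
cat > scores.txt <<'EOF'
SAMPLE    SCORE1    SCORE2
British    15    19.5
British    7    9.1
British    8    10.4
British    11    14.3
German    6    7.8
German    7    9.1
Italian    10    13
Italian    3    3.9
Italian    19    24.7
Italian    9    11.7
Italian    6    7.8
EOF

# a[] counts the rows per group and also serves as the list of group names,
# so the separate aa[] array is dropped.
awk '
NR > 1 {a[$1]++; b[$1,1]+=$2; b[$1,2]+=$3}
END {
   printf("%-10s\t%-10s\t%-10s\n", "POPULATION", "AVG SCORE1", "AVG SCORE2")
   for (i in a) printf("%-10s\t%10.2f\t%10.2f\n", i, b[i,1]/a[i], b[i,2]/a[i])
}' scores.txt > averages.txt
cat averages.txt
```

The row order depends on the awk implementation, because for (i in a) visits the keys in an unspecified order.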
This User Gave Thanks to MadeInGermany For This Post:
# 4  
Old 01-21-2018
Quote:
Originally Posted by rdrtx1
Code:
awk '
NR > 1 {aa[$1]; a[$1]++; b[$1,1]+=$2; b[$1,2]+=$3;}
END {
   printf("%-10s\t%-10s\t%-10s\n", "POPULATION", "AVG SCORE1", "AVG SCORE2");
   for (i in aa) printf("%-10s\t%10.2f\t%10.2f\n", i, b[i,1]/a[i], b[i,2]/a[i]);
}' datafile


Awesome, rdrtx1! Good work. Any idea why there is an extra line in the output with 0 results? Here is the output based on your script:
Code:
POPULATION    AVG SCORE1    AVG SCORE2
                    0.00          0.00
Italian             9.40         12.22
German              6.50          8.45
British            10.25         13.32


Also, MadeInGermany, you are spot on.
# 5  
Old 01-21-2018
Maybe there are blank lines in the input file. Try updating the NR line to:
Code:
NR > 1 && NF
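
To see why the guard helps: on an empty line NF is 0 and $1 is the empty string, so the unguarded rule creates a group with an empty name and zero sums. A small sketch, using a shortened copy of the sample data with a trailing blank line. The file names and the bracketed output format are assumptions made for this illustration.

```shell
# Header, two data rows, and a blank line at the end (hypothetical scratch file).
printf 'SAMPLE SCORE1 SCORE2\nGerman 6 7.8\nGerman 7 9.1\n\n' > scores_blank.txt

# Without the guard: the blank line is counted as a group whose name is "".
awk 'NR > 1 {a[$1]++; s[$1]+=$2}
END {for (i in a) printf("[%s] %.2f\n", i, s[i]/a[i])}' scores_blank.txt > unguarded.txt

# With the guard: NF is 0 on the blank line, so the rule is skipped.
awk 'NR > 1 && NF {a[$1]++; s[$1]+=$2}
END {for (i in a) printf("[%s] %.2f\n", i, s[i]/a[i])}' scores_blank.txt > guarded.txt

cat unguarded.txt guarded.txt
```

The unguarded run reports an extra group with an empty name and an average of 0.00, which matches the stray line seen in post #4.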


Last edited by rdrtx1; 01-21-2018 at 08:22 PM..
This User Gave Thanks to rdrtx1 For This Post:
# 6  
Old 01-21-2018
Quote:
Originally Posted by rdrtx1
Maybe there are blank lines in the input file. Try updating the NR line to:
Code:
NR > 1 && NF

That did the trick.
# 7  
Old 01-22-2018
The following variant adapts to the number of columns. It uses a dynamic field width, %*s, where the * takes the field width from an additional argument.
Code:
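awk '
# Skip the header line and any blank lines.
(NR>1 && NF>0) {
  if (NF>nf) nf=NF          # remember the widest row seen
  CNT[$1]++                 # rows per group
  for (i=2; i<=NF; i++)
    SUM[$1,i]+=$i           # running sum per group and column
}
END {
  colh="  AVG SCORE"
  out="POPULATION"
  lcolh=length(colh)
  lout=length(out)
  # Header: field i holds SCORE(i-1), so number the columns from 1.
  for (i=2; i<=nf; i++)
    out=(out colh (i-1))
  print out
  for (c in CNT) {
    out=sprintf("%-*s", lout, c)
    for (i=2; i<=nf; i++)
      out=sprintf("%s %*.2f", out, lcolh, SUM[c,i]/CNT[c])
    print out
  }
}
' datafile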
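
The * width is documented for gawk, and other awk implementations may or may not accept it. If yours does not, one possible workaround (an alternative technique, not what the script above does) is to build the format string by concatenation. A minimal sketch, where the width 12, the | markers, and the file name width_demo.txt are arbitrary choices for the demo:

```shell
# Build the format strings from a width held in a variable, instead of using "%-*s".
awk 'BEGIN {
  w = 12                        # column width (arbitrary value for the demo)
  lfmt = "%-" w "s|"            # left-justified string, becomes "%-12s|"
  rfmt = "%" w ".2f|\n"         # right-justified number, becomes "%12.2f|\n"
  printf(lfmt, "German")
  printf(rfmt, 6.5)
}' > width_demo.txt
cat width_demo.txt
```

Both fields are padded to 12 characters, the same effect sprintf("%-*s", 12, ...) and sprintf("%*.2f", 12, ...) have where the * form is available.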

This User Gave Thanks to MadeInGermany For This Post: