AWK sample variance


 
# 1  
Old 09-09-2012

I would like to calculate
1/n [sum (x-average)^2]

In awk, I wrote the following line for the sigma summation:
Code:
{ summ+=($1-average)^2 }

Full code:
Code:
BEGIN { Print "This script calculate error estimates"; sum=0 }
{ sum+=$1; n++ }
END { average = sum/n }
BEGIN { summ=0 }
{ summ+=($1-average)^2 }
END { print "error estimate:", "Avg:", average, "Samples:", n, summ, "Co:", summ/n, "Estimator:", summ/(n*n-n), "Error:", sqrt(2/(n-1))*(summ/(n*n-n)), "Variance estimate upper:", summ/(n*n-n)+sqrt(2/(n-1))*(summ/(n*n-n)), "Variance estimate lower:", summ/(n*n-n)-sqrt(2/(n-1))*(summ/(n*n-n)) }

This does not seem to be working.

Last edited by chrisjorg; 09-09-2012 at 05:18 PM.. Reason: mistake
# 2  
Old 09-09-2012
If you want to read the file twice:
Code:
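awk '
FNR == NR {
    # first pass: runs only while the first copy of the file is read
    next
}

{
    # second pass: runs while the second copy of the file is read
}
' file file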
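
Applied to the variance from the first post, the two-pass idea could look roughly like this. It is only a sketch: the function name two_pass_variance is made up for illustration, and it assumes the numbers are in the first column of a regular file that can be read twice.

```shell
# Hypothetical helper: 1/n variance of column 1, reading the same file twice.
two_pass_variance() {
    awk '
    FNR == NR {            # first pass: accumulate the sum and the count
        sum += $1
        n++
        next
    }
    FNR == 1 {             # first record of the second pass: the mean is known now
        average = sum / n
    }
    {                      # second pass: accumulate the squared deviations
        summ += ($1 - average)^2
    }
    END { if (n > 0) print summ / n }
    ' "$1" "$1"
}
```

Call it as two_pass_variance file. Nothing is stored in memory, at the cost of reading the file twice.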

# 3  
Old 09-09-2012
I guess it is working, but perhaps it does not do what you expect. You can have multiple BEGIN and/or multiple END patterns in awk (although maybe not in all implementations), but they are all executed at the beginning or at the end of the entire program. In your script, average is not assigned until END, so while the lines are being read it is still uninitialized and evaluates to 0. So if you need both the single elements and the average, store the elements in an array while you accumulate the sum, and then, in the END action, compute summ by looping through the array before doing the rest of your calculations.
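
To see the execution order for yourself, here is a small sketch. The function name show_block_order and the two input values are made up for the demonstration; the block layout mirrors the script from the first post.

```shell
# Hypothetical demo: what does "average" contain while the records are read?
show_block_order() {
    printf '10\n20\n' | awk '
    BEGIN { print "BEGIN 1" }
    { sum += $1; n++ }
    END   { average = sum / n; print "END 1: average = " average }
    BEGIN { print "BEGIN 2" }
    { print "main: average = [" average "]" }
    END   { print "END 2" }
    '
}
```

Both BEGIN lines are printed first, then the two "main" lines with empty brackets (average is still unset there), and only then "END 1: average = 15" followed by "END 2".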
# 4  
Old 09-09-2012
Yes, it works but doesn't do what I need it to do.

If I set things in an array, then I would want the following:
Code:
arr[($1-average)^2]

Then I would need to sum all the elements in the array.
How might one do this?
# 5  
Old 09-09-2012
For example:

Code:
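awk '
BEGIN { sum = 0; summ = 0 }
{
  arr[NR] = $1    # keep every value for the loop in END
  sum += $1
}
END {
  average = sum / NR

  # sum of squared deviations from the mean
  for (i = 1; i <= NR; i++) {
    summ += (arr[i] - average)^2
  }
  print summ
}
' file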

Please note that in the END block NR holds the number of the last line read, so I can use it both for the average and as the upper bound of the for loop.
# 6  
Old 09-11-2012
Question:

If I want to find the sum of the square value of all elements in an array,
would that be

Code:
summm+=arr[i^2]

The problem: I am trying to find the variance using var(X)=<X^2>-<X>^2,
and I am getting a negative value (wrong, var>=0 ALWAYS). So I suspect
my procedure is wrong.
# 7  
Old 09-11-2012
Quote:
Originally Posted by chrisjorg
Question:

If I want to find the sum of the square value of all elements in an array,
would that be

Code:
summm+=arr[i^2]

arr[i] holds the value from line i. That means that if i=4, arr[i^2] returns the value from the 16th line, not the square of the value from the 4th line. To square the element itself, write arr[i]^2.
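
A possible way to put var(X)=<X^2>-<X>^2 together with arr[i]^2 is sketched below. The function name mean_square_variance is made up for illustration, and the input is assumed to be one number per line in the first column.

```shell
# Hypothetical helper: variance of column 1 via var(X) = <X^2> - <X>^2.
mean_square_variance() {
    awk '
    { arr[NR] = $1 }
    END {
        if (NR == 0) exit
        for (i = 1; i <= NR; i++) {
            sum   += arr[i]
            summm += arr[i]^2     # square the value, not the index
        }
        mean = sum / NR
        print summm / NR - mean^2
    }
    ' "$@"
}
```

Call it as mean_square_variance file, or pipe the numbers into it. Be aware that this formula subtracts two numbers that can be nearly equal, so when the spread is tiny compared to the mean, rounding may still produce a very small negative result; summing (arr[i]-average)^2 as in post #5 avoids that.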
