Post 302628153 by ks_reddy on Monday 23rd of April 2012 03:55:35 AM
Finding standard deviation for all columns in a data file

Hi All,

Could someone modify the script below, taken from this forum, so that it works for all columns in the file (instead of printing the mean and standard deviation for column 3 only)? I don't know how to loop over all the columns.
https://www.unix.com/unix-dummies-que...on-column.html

Code:
awk '{ lines=FNR; arr[lines]=$3; sum+=$3 }   # store column 3 and add it up
     END {
       avg=sum/lines
       sum=0
       # second pass: sum of squared deviations from the mean
       for (i=1; i<=lines; i++) {
         v=arr[i]-avg
         sum+=v*v
       }
       printf("n=%d avg=%f  stddev=%f\n", lines, avg, sqrt(sum/(lines-1)))
     }' filename

Thanks a lot.
Sidda

Last edited by Scrutinizer; 04-23-2012 at 06:12 AM.. Reason: code tags
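
One possible way to extend the script above to every column is to index the stored values by row and column and keep one running sum per column. This is only a sketch, not a tested answer from the thread: it assumes whitespace-separated numeric fields, at least two rows, and the same number of fields on every row. The function name col_stats and the file sample.txt are made up for illustration.

```shell
# col_stats is a hypothetical helper name; pass it one data file.
col_stats() {
    awk '
    {
        # Store every value and keep a running sum per column.
        for (i = 1; i <= NF; i++) {
            arr[NR, i] = $i
            sum[i] += $i
        }
        if (NF > ncols) ncols = NF
    }
    END {
        for (c = 1; c <= ncols; c++) {
            avg = sum[c] / NR
            ss = 0
            # Second pass: squared deviations from the column mean.
            for (r = 1; r <= NR; r++) {
                v = arr[r, c] - avg
                ss += v * v
            }
            # Sample standard deviation (divide by n - 1), as in the original.
            printf("col=%d n=%d avg=%f stddev=%f\n", c, NR, avg, sqrt(ss / (NR - 1)))
        }
    }' "$@"
}

# Made-up sample data, for illustration only.
printf '%s\n' '1 10 100' '2 20 200' '3 30 300' '4 40 400' > sample.txt
col_stats sample.txt
```

Like the original, this keeps the whole file in memory so that the deviations can be taken from the final mean. For very large files, accumulating a per-column sum of squares in a single pass would use less memory, at some cost in numerical accuracy.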
 

TOTAL(1)                    General Commands Manual                    TOTAL(1)

NAME
       total - sum up columns

SYNOPSIS
       total [ -m ][ -sE | -p | -u | -l ][ -i{f|d}[N] ][ -o{f|d} ][ -tC ][ -N [ -r ]] [ file .. ]

DESCRIPTION
       Total sums up columns of real numbers from one or more files and prints out the result on its standard output.

       By default, total computes the straight sum of each input column, but multiplication can be specified instead with the -p option. Likewise, the -u option means find the upper limit (maximum), and -l means find the lower limit (minimum).

       Sums of powers can be computed by giving an exponent with the -s option. (Note that there is no space between the -s and the exponent.) This exponent can be any real number, positive or negative. The absolute value of the input is always taken before the power is computed in order to avoid complex results. Thus, -s1 will produce a sum of absolute values. The default power (zero) is interpreted as a straight sum without taking absolute values.

       The -m option can be used to compute the mean rather than the total. For sums, the arithmetic mean is computed. For products, the geometric mean is computed. (A logarithmic sum of absolute values is used to avoid overflow, and zero values are silently ignored.)

       If the input data is binary, the -id or -if option may be given for 64-bit double or 32-bit float values, respectively. Either option may be followed immediately by an optional count, which defaults to 1, indicating the number of double or float binary values to read per record on the input file. (There can be no space between the option and this count.) Similarly, the -od and -of options specify binary double or float output, respectively. These options do not need a count, as this will be determined by the number of input channels.

       A count can be given as the number of lines to read before computing a result. Normally, total reads each file to its end before producing its result, but this behavior may be overridden by inserting blank lines in the input. For each blank input line, total produces a result as if the end-of-file had been reached. If two blank lines immediately follow each other, total closes the file and proceeds to the next one (after reporting the result).

       The -N option (where N is a decimal integer) tells total to produce a result and reset the calculation after every N input lines. In addition, the -r option can be specified to override reinitialization and thus give a running total every N lines (or every blank line). If the end of file is reached, the current total is printed and the calculation is reset before the next file (with or without the -r option).

       The -tC option can be used to specify the input and output tab character. The default tab character is TAB.

       If no files are given, the standard input is read.

EXAMPLE
       To compute the RMS value of colon-separated columns in a file:

         total -t: -m -s2 input

       To produce a running product of values from a file:

         total -p -1 -r input

BUGS
       If the input files have varying numbers of columns, mean values will certainly be off. Total will ignore missing column entries if the tab separator is a non-white character, but cannot tell where a missing column should have been if the tab character is white.

AUTHOR
       Greg Ward

SEE ALSO
       cnt(1), neaten(1), rcalc(1), rlam(1), tabfunc(1)

RADIANCE                             2/3/95                            TOTAL(1)