You are correct: I copied your algorithm, and it needs checks.
This should prevent domain errors. The fact that there are a lot of zero values means the sum of squares can be a very small number, and floating-point round-off can leave the computed variance slightly negative before it reaches sqrt(). You could also use a function like this, placed at the top of the awk code block:
function abs(n) { return (n < 0) ? -n : n }
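For example, a minimal sketch of how the guard fits into a one-column mean/standard-deviation calculation (population formula; assumes the values are in column 1 of a non-empty file):

awk 'function abs(n) { return (n < 0) ? -n : n }
     { sum += $1; sumsq += $1 * $1 }
     END { mean = sum / NR
           var  = sumsq / NR - mean * mean   # round-off can make this slightly negative
           print mean, sqrt(abs(var)) }' file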
Hi all,
I am new to shell scripting and want to calculate the mean and standard deviation using shell programming.
I have a file with letters that repeat, together with their corresponding durations:
a 0.32
a 0.89
aa 0.34
aa 0.23
au 0.012
au 0.26... (4 Replies)
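A minimal sketch of one way to do this in awk, assuming the goal is the per-letter mean and (population) standard deviation of column 2:

awk '{ n[$1]++; sum[$1] += $2; sumsq[$1] += $2 * $2 }
     END { for (k in n) {
             m = sum[k] / n[k]
             v = sumsq[k] / n[k] - m * m     # clamp tiny negative round-off before sqrt
             print k, m, sqrt(v < 0 ? 0 : v)
           } }' file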
Hi all,
I want to calculate the standard deviation for a column (happens to be column 3).
Does anyone know of a simple awk script to do this?
Thanks (1 Reply)
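Untested against the original data, but a one-pass sketch for the population standard deviation of column 3 would look like this (the sample version is sqrt((sumsq - NR * m * m) / (NR - 1))):

awk '{ sum += $3; sumsq += $3 * $3 }
     END { m = sum / NR; print sqrt(sumsq / NR - m * m) }' file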
Hi, I want to use awk to print the average and standard deviation of column 1 into a file, but I can't get it working.
I can do the average and the number of records, but I cannot get the standard deviation.
awk '{sum+=$1} END { print "Average = ",sum/NR}'
thanks (1 Reply)
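Extending that one-liner with a sum of squares gives both statistics; the redirection target stats.txt is just a placeholder name:

awk '{ sum += $1; sumsq += $1 * $1 }
     END { m = sum / NR
           print "Average =", m
           print "StdDev  =", sqrt(sumsq / NR - m * m) }' file > stats.txt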
Hi all,
I need to find the standard deviation of each column of the dataset below, for each hour. The data is given at 5-second intervals, as shown below:
DATE TIME FRAC_DAYS_SINCE_JAN1 FRAC_HRS_SINCE_JAN1 EPOCH_TIME ... (11 Replies)
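Since the header is truncated here, the following is only a hedged sketch: it guesses that TIME is column 2 in HH:MM:SS form and that the data columns start at column 6, after EPOCH_TIME; adjust both to the real layout:

awk 'NR > 1 {
       hr = substr($2, 1, 2)                  # hour taken from an assumed HH:MM:SS field
       rows[hr]++
       for (i = 6; i <= NF; i++) {            # assumed first data column
         sum[hr, i] += $i; sumsq[hr, i] += $i * $i
       }
       cols = NF
     }
     END { for (h in rows)
             for (i = 6; i <= cols; i++) {
               m = sum[h, i] / rows[h]
               v = sumsq[h, i] / rows[h] - m * m
               print h, i, sqrt(v < 0 ? 0 : v)
             } }' file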
I have a file with say 50 columns, each containing a whole lot of data.
Each column contains data from a separate simulation, but each simulation is related to the data in the last (REFERENCE) column $50
I need to calculate the RMS deviation for each data line, i.e. column 1 relative to... (12 Replies)
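A minimal sketch, assuming the RMS deviation wanted on each line is over columns 1 through 49 relative to the reference value in $50 (written as $NF below):

awk '{ ss = 0
       for (i = 1; i < NF; i++) { d = $i - $NF; ss += d * d }
       print $0, sqrt(ss / (NF - 1)) }' file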
Hi All,
I want someone to modify the below script from this forum so that it can be used for all columns in the file (instead of printing only the mean and standard deviation of column 3). I don't know how to loop over all the columns.
... (3 Replies)
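The script being referred to is elided above, so here is an independent sketch that loops over every column and prints each column's mean and population standard deviation:

awk '{ for (i = 1; i <= NF; i++) { sum[i] += $i; sumsq[i] += $i * $i }
       cols = NF }
     END { for (i = 1; i <= cols; i++) {
             m = sum[i] / NR
             v = sumsq[i] / NR - m * m
             printf "column %d: mean = %g, sd = %g\n", i, m, sqrt(v < 0 ? 0 : v)
           } }' file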
Hi,
I have a file containing 100,000 rows by 120 columns, and I need to compute the standard deviation of each row. Any idea how to calculate a row-wise standard deviation using awk? My sample data looks like this:
input data:
23 35 12 25 16 17 18 19 29 12
12 26 15 14 15 23 12 12... (2 Replies)
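One pass per row is enough, since each row's statistics are independent; this sketch uses the population formula (with 120 columns the sample correction changes little):

awk '{ s = 0; ss = 0
       for (i = 1; i <= NF; i++) { s += $i; ss += $i * $i }
       m = s / NF
       print sqrt(ss / NF - m * m) }' file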
Hello there,
I found an elegant solution for computing average values from multiple text files:
awk '{for (i=1;i<=NF;i++){if ($i!~"n/a"){a+=$i}else{b++}}}END{for (i=1;i<=FNR;i++){for (j=1;j<=NF;j++){printf (a/(3-b))((b>0)?"~"b" ":" ")};printf "\n"}}' file1 file2 file3
I tried to modify... (2 Replies)
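As written, that one-liner accumulates everything into a single scalar, so each cell would get the same value. A hedged array-based sketch of what was probably intended (three equal-length files, cells equal to n/a skipped and flagged with ~count, mirroring the original's output format):

awk '{ for (i = 1; i <= NF; i++)
         if ($i ~ /n\/a/) miss[FNR, i]++
         else             sum[FNR, i] += $i
       rows = FNR; cols = NF }
     END { for (r = 1; r <= rows; r++) {
             for (c = 1; c <= cols; c++) {
               n = 3 - miss[r, c]             # three input files assumed, as in the original
               printf "%s%s", (n ? sum[r, c] / n : "n/a"), (miss[r, c] ? "~" miss[r, c] " " : " ")
             }
             print ""
           } }' file1 file2 file3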
I have a file that looks that this:
820 890 530
1650 1600 1800
1850 1900 2270
1640 2300 1670
2080 2200 2350
1150 1630 2210
I would like to output the mean and standard deviation of each row, so that my final output would look like this:
820 890 530 746.667 155.849
1650 1600 1800... (5 Replies)
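The expected 155.849 for the first row matches the population formula (divide by NF, not NF - 1), so this sketch uses it:

awk '{ s = 0; ss = 0
       for (i = 1; i <= NF; i++) { s += $i; ss += $i * $i }
       m = s / NF
       printf "%s %.3f %.3f\n", $0, m, sqrt(ss / NF - m * m) }' file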
Hello Team,
I am using the following awk script to calculate the SMA (Simple Moving Average) for a specific period, but now I would like to include the standard deviation in the output.
Could you please help me modify this awk script?
awk -F, -v points=5 ' { a = $2; ... (4 Replies)
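The original script is elided above, so the following is an independent sketch of a moving average plus a moving (population) standard deviation over the last points values of column 2; comma-separated input is assumed, matching -F,, and file.csv is a placeholder name:

awk -F, -v points=5 '
     { buf[NR % points] = $2
       if (NR >= points) {
         s = 0; ss = 0
         for (k = 0; k < points; k++) { s += buf[k]; ss += buf[k] * buf[k] }
         m = s / points
         v = ss / points - m * m
         printf "%s,%.4f,%.4f\n", $0, m, sqrt(v < 0 ? 0 : v)
       } }' file.csv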