Calculate 5th percentile based on another column
Post 302971483 by Yoda on Wednesday 20th of April 2016 05:55:01 PM
Try this; please feel free to correct any errors in my calculations:
Code:
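awk '
        # Skip the header line and number the rows of each site in input order.
        # The result is only a percentile if each site is already sorted by value, ascending.
        NR > 1 {
                ++T[$1]                 # T[site] = rows seen so far for this site
                A[$1 FS T[$1]] = $2     # A["site n"] = value on the n-th row of the site
        }
        END {
                print "Site", "Val"
                for ( k in T )
                {
                        # Row index at the 5% mark, rounded to the nearest integer
                        idx = sprintf( "%.0f", T[k] * 0.05 )
                        # Small groups round down to 0, so fall back to the first row
                        idx = ( idx == 0 ? 1 : idx )
                        print k, A[k FS idx]
                }
        }
' OFS='\t' file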
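
The script above picks a value by its row position within each site, so it only yields a percentile when each site's rows are already in ascending order. If the input is not sorted, one possible variant is sketched below. It is not the method from the post: it sorts first and uses the nearest-rank definition (rank = ceil(0.05 * N), computed with integer arithmetic) instead of rounding. The function name `percentile5` and the sample rows are made up for illustration, and the input is assumed to be whitespace-separated with one header line.

```shell
# percentile5 (hypothetical helper): read "Site Val" rows with a header line on
# stdin and print the nearest-rank 5th percentile of Val for each site.
percentile5() {
    # Drop the header, then order rows by site and by numeric value.
    tail -n +2 | sort -k1,1 -k2,2n | awk -v OFS='\t' '
        # Print the finished site; rank = ceil(n / 20), i.e. ceil(0.05 * n).
        function flush_site() {
            if (n) print site, val[int((n + 19) / 20)]
        }
        BEGIN      { print "Site", "Val" }
        $1 != site { flush_site(); site = $1; n = 0 }   # a new site starts
                   { val[++n] = $2 }                    # n-th smallest value so far
        END        { flush_site() }
    '
}

# Usage with made-up, unsorted rows:
printf 'Site Val\nB 7\nA 30\nA 10\nB 2\nA 20\n' | percentile5
# prints (tab-separated):
# Site  Val
# A     10
# B     2
```

With only a few rows per site the 5th percentile is simply the smallest value; the rank moves past 1 once a site has more than 20 rows.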

 

Distributions(3pm)					User Contributed Perl Documentation					Distributions(3pm)

NAME
Statistics::Distributions - Perl module for calculating critical values and upper probabilities of common statistical distributions

SYNOPSIS
    use Statistics::Distributions;

    $chis=Statistics::Distributions::chisqrdistr (2,.05);
    print "Chi-squared-crit (2 degrees of freedom, 95th percentile "
         ."= 0.05 level) = $chis\n";

    $u=Statistics::Distributions::udistr (.05);
    print "u-crit (95th percentile = 0.05 level) = $u\n";

    $t=Statistics::Distributions::tdistr (1,.005);
    print "t-crit (1 degree of freedom, 99.5th percentile = 0.005 level) "
         ."= $t\n";

    $f=Statistics::Distributions::fdistr (1,3,.01);
    print "F-crit (1 degree of freedom in numerator, 3 degrees of freedom "
         ."in denominator, 99th percentile = 0.01 level) = $f\n";

    $uprob=Statistics::Distributions::uprob (-0.85);
    print "upper probability of the u distribution (u = -0.85): Q(u) "
         ."= 1-G(u) = $uprob\n";

    $chisprob=Statistics::Distributions::chisqrprob (3,6.25);
    print "upper probability of the chi-square distribution (3 degrees "
         ."of freedom, chi-squared = 6.25): Q = 1-G = $chisprob\n";

    $tprob=Statistics::Distributions::tprob (3,6.251);
    print "upper probability of the t distribution (3 degrees of "
         ."freedom, t = 6.251): Q = 1-G = $tprob\n";

    $fprob=Statistics::Distributions::fprob (3,5,.625);
    print "upper probability of the F distribution (3 degrees of freedom "
         ."in numerator, 5 degrees of freedom in denominator, F = 6.25): "
         ."Q = 1-G = $fprob\n";

DESCRIPTION
This Perl module calculates percentage points (5 significant digits) of the u (standard normal) distribution, the student's t distribution, the chi-square distribution and the F distribution. It can also calculate the upper probability (5 significant digits) of the u (standard normal), the chi-square, the t and the F distribution. These critical values are needed to perform statistical tests, like the u test, the t test, the F test and the chi-squared test, and to calculate confidence intervals.

If you are interested in more precise algorithms you could look at: StatLib: http://lib.stat.cmu.edu/apstat/ ; Applied Statistics Algorithms by Griffiths, P. and Hill, I.D., Ellis Horwood: Chichester (1985)

BUGS
This final version 1.02 has been released after more than one year without a bug report on the previous version 0.07. Nevertheless, if you find any bugs or oddities, please do inform the author.

INSTALLATION
See perlmodinstall for information and options on installing Perl modules.

AVAILABILITY
The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Please visit http://www.cpan.org/ to find a CPAN site near you or see http://www.cpan.org/authors/id/M/MI/MIKEK/ .

AUTHOR
Michael Kospach <mike.perl@gmx.at>
Nice formatting, simplification and bug repair by Matthias Trautner Kromann <mtk@id.cbs.dk>

COPYRIGHT
Copyright 2003 Michael Kospach. All rights reserved. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO
Statistics::ChiSquare, Statistics::Table::t, Statistics::Table::F, perl(1).

perl v5.12.4					2003-10-01					Distributions(3pm)