UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
#1
Finding data value that contains x% of points
Hi, I need help finding the value in my data that encompasses a certain percentage of my total data points (n). Attached is an example of my data, n=30. What I want to do is, for instance, find the minimum threshold that still encompasses 60% (n=18), 70% (n=21) and 80% (n=24) of the points.
Code:
To find the data value that encompasses 60% of the data points manually, I tried something like:
awk '$1 >= 0.233 {print $0}' data > threshold_0.233.txt
awk '$1 >= 0.234 {print $0}' data > threshold_0.234.txt
awk '$1 >= 0.235 {print $0}' data > threshold_0.235.txt
Then I counted the lines in each output file to see whether they corresponded to 60% of n, and repeated this by trial and error until I had the values I needed at each percentage.
Code:
0.222568470365
0.221756265888
0.219760388204
0.242798143771
0.238352821721
0.241443756619
0.223094316003
0.228262624788
0.216889793498
0.210031152159
0.21097303707
0.207019965666
0.217014341085
0.239244868006
0.240522828032
0.237227034969
0.257647932043
0.248749576572
0.246545881317
0.247231196664
0.234222785343
0.235188699739
0.254819829246
0.250148878221
0.275682631829
0.287082318457
0.252075020326
0.412756783786
0.402542710592
0.227780278349
Any suggestions on how to go about it? Thanks much.
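The manual check described in this post can be shortened by letting awk count the matching lines instead of writing them to a file. This is only a minimal sketch: it assumes the values are stored one per line, and the function name `count_at_least` is made up for illustration.

```shell
# Hypothetical helper: count how many values in a file are >= a threshold.
# Usage: count_at_least THRESHOLD FILE
# Assumes one numeric value per line in FILE.
count_at_least() {
    # c + 0 forces a numeric 0 to be printed when nothing matches
    awk -v t="$1" '$1 >= t { c++ } END { print c + 0 }' "$2"
}
```

For the 30 values posted above, saved in a file named `data`, `count_at_least 0.233 data` and `count_at_least 0.234 data` both print 19, and `count_at_least 0.235 data` prints 18, which is 60% of n.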
#2
There's probably a way to do that statistically. Checking...
#3
Thanks for checking.
#4
Your data doesn't seem to have a normal distribution. There's a much more obvious way anyway; I don't know why it didn't occur to me before. Sort it, then look just past the percentage of lines you want to find the threshold.
Code:
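# Sort numerically so that line k holds the k-th smallest value
sort -n data > sorted
# First pass counts the lines (N); second pass prints the first value past 80% of them
awk 'NR==FNR { N++; next } FNR > (.8*N) { print $1 ; exit }' sorted sorted
rm -f sorted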
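Note that the ascending sort counts from the smallest value up, so with `.8` it prints the first value that has 80% of the points below it. The awk tests in the first post use `>=`, which keeps the largest values instead. A minimal sketch of that variant is shown here, under the assumption that the input has one value per line; the function name `threshold_at_least` and the choice of passing the percentage as a whole number are illustrative, not part of the original answer.

```shell
# Hypothetical helper: print the largest threshold t such that at least
# PCT percent of the values in FILE satisfy value >= t.
# Usage: threshold_at_least PCT FILE   (PCT is a whole number, e.g. 60)
# Assumes one numeric value per line in FILE.
threshold_at_least() {
    # Sort largest first; LC_ALL=C keeps "." as the decimal separator
    LC_ALL=C sort -rn "$2" | awk -v pct="$1" '
        { v[NR] = $1 }                        # remember every value in sorted order
        END {
            k = int((pct * NR + 99) / 100)    # ceil(pct% of N) using whole numbers
            if (k < 1) k = 1                  # always keep at least one point
            print v[k]                        # the k-th largest value
        }'
}
```

With the 30 values from the first post saved in `data`, `threshold_at_least 60 data` prints 0.235188699739, `threshold_at_least 70 data` prints 0.227780278349, and `threshold_at_least 80 data` prints 0.221756265888. Passing the percentage as a whole number avoids floating-point rounding when computing how many points to keep.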
The Following User Says Thank You to Corona688 For This Useful Post: ida1215 (12-13-2012)
#5
Thank you very much, Corona688. The data I posted is only part of the whole dataset; the values are predictions extracted at certain points, which might explain the non-normal distribution (?). Anyway, thanks a bunch.