the smallest number from 90% of highest numbers from all numbers in file


 
# 1  
Old 05-21-2011

Hello All,
I am having trouble finding the smallest number among the highest 90% of all the numbers in a file. The file has thousands of lines and hundreds of columns.
I am mainly familiar with bash, but I am open to any suggestion which will lead to a solution.

To put it differently: I have, for example, 1000 numbers between 0 and 10000. The results could be:

90% of numbers are bigger than 1000
80% of numbers are bigger than 2342
70% of numbers are bigger than 5674
etc.

I am looking for numbers like 1000, 2342, 5674 as in this example.

I am sure that there is some statistical method for this, but I cannot remember it and cannot find what it is called. If I knew which method to use, I could probably find a way to calculate it too.

Thank you for your help.
# 2  
Old 05-21-2011
Hi,

Could you please give us an input file and an example of the desired output?
# 3  
Old 05-21-2011
Hi

The INPUT can look like this, but much bigger (in columns and rows). The numbers are not sorted in any way (although it may look like that here).

Code:
0.35156582 0.36767924 0.40942771 1.15580244 1.20877668

1.21842761 1.27427217 1.41896056 1.16207427 1.21533599

1.41774799 1.22634608 1.28255355 1.42818227 1.19181428

2.08513847 1.78348512 1.86522813 2.07701713 1.78747556

OUTPUT
Here there are 20 numbers and, for example, I would like to have 5 bands. Each band will hold 20% of the numbers, meaning:
Code:
100% of numbers are bigger than 0 or the sought number
80% of numbers are bigger than (sought number)
60% of numbers are bigger than (sought number)
40% of numbers are bigger than (sought number)
20% of numbers are bigger than (sought number)

I hope that it is clearer now.
I am slowly finding my way around it, but my approach is not very elegant and it creates a lot of rubbish along the way. I have to do this for tens of files with 50000 numbers in each file. That is the reason why I am looking for an elegant and quick solution.

Thank you

Last edited by radoulov; 05-22-2011 at 04:49 AM.. Reason: Code tags.
# 4  
Old 05-21-2011
I don't see a quick solution. You need to put the numbers in a list, sort them, count them, then see what is at each 10% of the list.
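
What is being described is usually called a percentile (or quantile). The put-in-a-list, sort, count, look-at-each-10% idea above can be sketched with standard tools. This is only a sketch under assumptions: the function name `decile_cutoffs` is made up for illustration, the input is whitespace-separated numbers, and "the value at p%" is taken to be the first value after skipping the lowest p% of the sorted list.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the value found at each
# 10% step of the sorted data in the file given as the first argument.
decile_cutoffs() {
    # Put every field on its own line (blank lines vanish), then sort numerically.
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$1" |
    LC_ALL=C sort -n |
    awk '
        { v[NR] = $1 }                      # keep the sorted values, NR counts them
        END {
            if (NR == 0) exit               # nothing to report for an empty file
            for (p = 10; p <= 90; p += 10) {
                i = int(NR * p / 100) + 1   # 1-based position after skipping the lowest p%
                printf "%d%% of numbers are >= %s\n", 100 - p, v[i]
            }
        }'
}
```

It would be run as `decile_cutoffs data_file`. Everything is held in memory by awk, which should be fine for files of around 50000 numbers.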
# 5  
Old 05-22-2011
For a real quick solution, I would
(1) Put the data one value per line.
(2) Sort.
(3) Pass it to 'awk' with the required percentile value as a parameter.
(4) Use pattern $1 < parameter.
(5) For each record make it the minimum if needed.
(6) On END, print the value.
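
One possible reading of these steps, as a sketch only. It assumes the parameter is a percentage rather than a value, sorts in descending order, and compares the record number (not `$1`) against the size of the top group, since the cut-off value itself is not known in advance. The function name `top_pct_min` and the temporary file are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the smallest value among
# the top PCT percent of the numbers in FILE.
# Usage: top_pct_min PCT FILE
top_pct_min() (
    pct=$1
    file=$2
    tmp=$(mktemp) || exit 1
    # (1) one value per line, (2) sort, here in descending numeric order
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$file" |
        LC_ALL=C sort -n -r > "$tmp"
    total=$(wc -l < "$tmp")
    # (3) pass the percentage to awk, (4) select the records in the top group,
    # (5) remember the latest one as the minimum, (6) print it on END
    awk -v pct="$pct" -v total="$total" '
        BEGIN { keep = int(total * pct / 100) }   # size of the top pct% group
        NR <= keep { min = $1 }                   # descending input: last kept record is the smallest
        END { if (keep > 0) print min }' "$tmp"
    rm -f "$tmp"
)
```

For example, `top_pct_min 90 data_file` would print the smallest number among the highest 90% of the values. The function body runs in a subshell so that its variables do not leak into the calling script.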
# 6  
Old 05-22-2011
Try this script:
Code:
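#!/usr/bin/perl
use strict;
use warnings;
# Read every whitespace-separated number from the file named on the command line.
open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
my @x;
while (<$in>) {
  push @x, split;   # split on whitespace; blank lines contribute nothing
}
close $in;
@x = sort { $a <=> $b } @x;   # ascending numeric order
my ($n, $bands) = (scalar @x, 5);
for my $k (0 .. $bands - 1) {
  # $x[$i] is the smallest number among the top (100 - 20*$k)% of values
  my $i = int($k * $n / $bands);
  printf "%d%% of numbers are >= %s\n", 100 - 100 * $k / $bands, $x[$i];
}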

Run it like this: ./script.pl data_file
# 7  
Old 05-22-2011
The OP does not know what the limits are... he or she needs to find them. Consider:
Code:
1 2 3 4 7 8 9
1 2 3 6 7 8 9

Now find the midpoint. In the first list 4 is the midpoint, but in the second list it is 6. You don't know 4 or 6 ahead of time. The midpoint is the 50% point. Now imagine a much longer list where you need to find the data element at the 10%, 20%, 30%...90% points in the list.
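
The midpoint example can be checked with a small sketch. The function name `value_at_pct` is made up for illustration, and picking the element by the nearest-rank rule (position = ceiling of count times percent over 100, for a whole-number percent) is an assumption; other percentile definitions interpolate between neighbours.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the element at the PCT%
# point of a list of numbers. Usage: value_at_pct PCT NUMBER...
value_at_pct() {
    pct=$1
    shift
    printf '%s\n' "$@" | LC_ALL=C sort -n |
    awk -v pct="$pct" '
        { v[NR] = $1 }                          # sorted values, NR counts them
        END {
            rank = int((NR * pct + 99) / 100)   # ceiling of NR * pct / 100
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

value_at_pct 50 1 2 3 4 7 8 9   # prints 4
value_at_pct 50 1 2 3 6 7 8 9   # prints 6
```

The same call with 10, 20, 30 and so on as the first argument gives the elements at the other points of a longer list.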