Grouping data numbers in a text file into prescribed intervals and count


 
# 1  
Old 01-14-2010

I have a text file that contains numbers, sorted from smallest to largest.
For example:

34
817
1145
1645
1759
1761
3368
3529
4311
4681
5187
5193
5199
5417
5682
.
.
.
..
1000000

I need to count how many values (data points in the text file) fall into each interval of 10,000 (10k), i.e. the number of values in the interval 0 - 9,999, then 10,000 - 19,999, then 20,000 - 29,999, and so on up to the last 10k interval.

Please let me know the best way to implement it using either shell scripting or awk.

LA
# 2  
Old 01-14-2010
Something like this maybe?
Code:
awk '$1>=t*10000{t++} {A[t]++} END{for (i=1;i<=t;i++) print i*10000"\t"A[i]}' infile

# 3  
Old 01-14-2010
Thanks Scrutinizer,
It worked.

LA
# 4  
Old 01-14-2010
Quote:
Originally Posted by Scrutinizer
Something like this maybe?
Code:
awk '$1>=t*10000{t++} {A[t]++} END{for (i=1;i<=t;i++) print i*10000"\t"A[i]}' infile

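There is a bug here. Whenever there is a gap in the data, so that an interval contains zero values, the accounting will be off: t advances by only one interval per input line, so the values after the gap are credited to the wrong intervals.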

Example using most of the sample data above and an interval size of 1000 (instead of 10000):

Code:
$ cat data
34
817
1145
1645
1759
1761
3368
3529
4311
4681
5187
5193
5199
5417
5682

$ awk '$1>=t*1000{t++} {A[t]++} END{for (i=1;i<=t;i++) print i*1000"\t"A[i]}' data
1000    2
2000    4
3000    1
4000    1
5000    2
6000    5

The output should be:

Code:
1000    2
2000    4
3000    0
4000    2
5000    2
6000    5

Regards,
alister
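
A different way to avoid the gap problem, sketched here and not taken from any post in the thread, is to compute each value's interval directly with int($1/w) instead of stepping a counter. The function name bin_counts and the variables w, b and max are made up for illustration, and the sketch assumes non-negative numbers, one per line. Since it does not rely on ordering, the input need not be sorted.

```shell
# Hypothetical helper (not from the thread):
# count values per interval of width $1 in file $2.
bin_counts() {
    awk -v w="$1" '
        NF {                       # skip blank lines
            b = int($1 / w) + 1    # 1-based interval: 0..w-1 -> 1, w..2w-1 -> 2, ...
            A[b]++
            if (b > max) max = b   # remember the highest interval seen
        }
        END {
            # print each interval upper bound and its count; empty intervals print 0
            for (i = 1; i <= max; i++) print i * w "\t" (A[i] + 0)
        }
    ' "$2"
}
```

Called as `bin_counts 1000 data` on the 15 sample values, this should print the same six lines as the expected output above, including the 0 for the empty 2000 - 2999 interval.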

---------- Post updated at 01:24 PM ---------- Previous update was at 01:15 PM ----------

A bugfix for Scrutinizer's solution:

Code:
$ awk '$1>=t*1000{while($1>=++t*1000);} {A[t]++} END{for (i=1;i<=t;i++) print i*1000"\t"(A[i]+0)}' data

A different solution I had been working on for kicks:

Code:
$ awk 'function ge() { return $1>=1000*(i+1) } function p(){print NR-1; NR=1; i++} ge(){p(); while(ge())p()} END {NR++; p()}' data
2
4
0
2
2
5

Take care,
alister
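
The NR trick in the one-liner above can be hard to follow, because NR is reused as the per-interval counter. A rough equivalent with an explicit counter might look like the sketch below. The names running_counts, n and w are assumptions for illustration, not part of alister's post, and sorted input is assumed just as in the original.

```shell
# Hypothetical rewrite of the NR-based one-liner with an explicit counter.
# $1 = interval width, $2 = sorted input file, one number per line.
running_counts() {
    awk -v w="$1" '
        {
            # close every interval that ends at or below the current value;
            # intervals with no values are closed with a count of 0
            while ($1 >= w * (i + 1)) { print n + 0; n = 0; i++ }
            n++                    # current value belongs to the open interval
        }
        END { print n + 0 }        # flush the last open interval
    ' "$2"
}
```

With a width of 1000 and the sample data, `running_counts 1000 data` should reproduce the 2, 4, 0, 2, 2, 5 column shown above.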

Last edited by alister; 01-14-2010 at 02:16 PM.. Reason: courteousness :)
# 5  
Old 01-14-2010
Yep, a bit of a hasty solution... well observed. :)
# 6  
Old 01-14-2010
Hello. My name is Alister and I have a sickness. Sometimes I can't help but revisit inconsequential, properly functioning commands to make them a tiny bit shorter. :)

Code:
awk '{while ($1>=t*w) t++; A[t]++} END {for (i=1;i<=t;i++) print i*w"\t"(A[i]+0)}' w=10000 data

As a bonus, the interval's width can be easily modified on the command line.
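
To illustrate the point about the width, here is a small sketch that runs the same program twice on the 15 sample values from the thread, changing only the w=... operand. The temp file and the shell variables prog, out_1k and out_10k are just scaffolding for the example.

```shell
# Scratch copy of the 15 sample values from the thread (temp file name is arbitrary).
data=$(mktemp)
printf '%s\n' 34 817 1145 1645 1759 1761 3368 3529 4311 4681 5187 5193 5199 5417 5682 > "$data"

# The awk program is unchanged between runs; only the w=... operand differs.
prog='{while ($1>=t*w) t++; A[t]++} END {for (i=1;i<=t;i++) print i*w"\t"(A[i]+0)}'

out_1k=$(awk "$prog" w=1000 "$data")     # six 1000-wide intervals
out_10k=$(awk "$prog" w=10000 "$data")   # all 15 values land in the first 10000-wide interval

printf '%s\n' "$out_1k" "$out_10k"
rm -f "$data"
```

Because w=... is given as an operand before the file name, awk assigns it just before it starts reading that file, so the main rule and the END block both see the chosen width.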

Cheers,
alister