Sponsored Content
Top Forums Shell Programming and Scripting Grouping data numbers in a text file into prescribed intervals and count Post 302387114 by alister on Thursday 14th of January 2010 01:24:27 PM
Old 01-14-2010
Quote:
Originally Posted by Scrutinizer
Something like this maybe?
Code:
awk '$1>=t*10000{t++} {A[t]++} END{for (i=1;i<=t;i++) print i*10000"\t"A[i]}' infile

There is a bug here. Whenever there is a gap in the data, wherein there are zero values in an interval, the accounting will be off.

Example using most of the sample data above and an interval size of 1000 (instead of 10000):

Code:
$ cat data
34
817
1145
1645
1759
1761
3368
3529
4311
4681
5187
5193
5199
5417
5682

$ awk '$1>=t*1000{t++} {A[t]++} END{for (i=1;i<=t;i++) print i*1000"\t"A[i]}' data
1000    2
2000    4
3000    1
4000    1
5000    2
6000    5

The output should be:

Code:
1000    2
2000    4
3000    0
4000    2
5000    2
6000    5

Regards,
alister

---------- Post updated at 01:24 PM ---------- Previous update was at 01:15 PM ----------

A bugfix for Scrutinizer's solution:

Code:
$ awk '$1>=t*1000{while($1>=++t*1000);} {A[t]++} END{for (i=1;i<=t;i++) print i*1000"\t"(A[i]+0)}' data

A different solution I had been working on for kicks:

Code:
$ awk 'function ge() { return $1>=1000*(i+1) } function p(){print NR-1; NR=1; i++} ge(){p(); while(ge())p()} END {NR++; p()}' data
2
4
0
2
2
5

Take care,
alister

Last edited by alister; 01-14-2010 at 02:16 PM.. Reason: courteousness :)
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

grouping of numbers with script

Suppose u have a file 2 4 6 11 22 13 23 43 12 4 33 31 45 then u want a output like 0-10 4 10-20 3 20-30 2 30-40 2 40-50 2 (2 Replies)
Discussion started by: cdfd123
2 Replies

2. Shell Programming and Scripting

Grouping and summing data through unix

Hi everyone, I need a help on Unix scripting. I have a file is like this Date Amt 20071205 10 20071204 10 20071203 200 20071204 300 20071203 400 20071205 140 20071203 100 20071205 100... (1 Reply)
Discussion started by: pcharanraj
1 Replies

3. UNIX for Dummies Questions & Answers

Help with data grouping

Hi all, I have a set data as shown below, and i would like to eliminate the name that no children - boy and girl. What is the appropriate command can i use(other than grep)? Please assist... My input: name sex marital status children - boy children - girl ... (3 Replies)
Discussion started by: 793589
3 Replies

4. Shell Programming and Scripting

count numbers of matching rows and replace its value in another file

Hello all, can you help me in this problem, assume We have two txt file (file_1 and file_3) one is file_1 contains the data: a 0 b 1 c 3 a 7 b 4 c 5 b 8 d 6 . . . . and I need to count the lines with the matching data (a,b,..) and print in new file called file_2 such as the... (4 Replies)
Discussion started by: GoldenFalcon10
4 Replies

5. Shell Programming and Scripting

Divide numbers into intervals

divide input values into specified number (-100 or -200) according to the key (a1 or a2 ....) For ex: if we give -100 in the command line it would create 100 number intervals (1-100, 100-200, 200-300) untill it covers the value 300 in a1. Note: It should work the same even with huge numbers... (3 Replies)
Discussion started by: ruby_sgp
3 Replies

6. Shell Programming and Scripting

Remove a block of Text at regular intervals

Hello all, I have a text files that consists of blocks of text. Each block of text represents a set of Cartesian coordinates for a molecule. Each block of text starts with a line that has a only a number, which is equal to the total number of atoms in the molecule. After this number is a line... (15 Replies)
Discussion started by: marcozd
15 Replies

7. UNIX for Dummies Questions & Answers

Extracting lines from a text file based on another text file with line numbers

Hi, I am trying to extract lines from a text file given a text file containing line numbers to be extracted from the first file. How do I go about doing this? Thanks! (1 Reply)
Discussion started by: evelibertine
1 Replies

8. Shell Programming and Scripting

Count column data in a text file

I have a text file that has the following column data: 0.007 0.005 0.004 0.007 How do i output the total sum of the data above? (6 Replies)
Discussion started by: alegnagrp
6 Replies

9. Shell Programming and Scripting

Yearly Grouping of Data

I need some logic that would help to group up some records that fall between two dates: Input Data COL_1 COL_2 COL_3 COL_4 COL_5 COL_6 COL_7 COL_8 COL_9 COL_10 COL_11 COL_12 C ABC ABCD 3 ZZ WLOA 2015-12-01 2015-12-15 975.73 ZZZ P 147018.64 C ABC ... (3 Replies)
Discussion started by: Ads89
3 Replies

10. Shell Programming and Scripting

Script Shell: Count The sum of numbers in a file

Hi all; Here is my file: V1.3=4 V1.4=5 V1.1=3 V1.2=6 V1.3=6 Please, can you help me to write a script shell that counts the sum of values in my file (4+5+3+6+6) ? Thank you so much for help. Kind regards. (3 Replies)
Discussion started by: chercheur111
3 Replies
SLAEBZ(l)								 )								 SLAEBZ(l)

NAME
SLAEBZ - contain the iteration loops which compute and use the function N(w), which is the count of eigenvalues of a symmetric tridiagonal matrix T less than or equal to its argument w SYNOPSIS
SUBROUTINE SLAEBZ( IJOB, NITMAX, N, MMAX, MINP, NBMIN, ABSTOL, RELTOL, PIVMIN, D, E, E2, NVAL, AB, C, MOUT, NAB, WORK, IWORK, INFO ) INTEGER IJOB, INFO, MINP, MMAX, MOUT, N, NBMIN, NITMAX REAL ABSTOL, PIVMIN, RELTOL INTEGER IWORK( * ), NAB( MMAX, * ), NVAL( * ) REAL AB( MMAX, * ), C( * ), D( * ), E( * ), E2( * ), WORK( * ) PURPOSE
SLAEBZ contains the iteration loops which compute and use the function N(w), which is the count of eigenvalues of a symmetric tridiagonal matrix T less than or equal to its argument w. It performs a choice of two types of loops: IJOB=1, followed by IJOB=2: It takes as input a list of intervals and returns a list of sufficiently small intervals whose union contains the same eigenvalues as the union of the original intervals. The input intervals are (AB(j,1),AB(j,2)], j=1,...,MINP. The output interval (AB(j,1),AB(j,2)] will contain eigenvalues NAB(j,1)+1,...,NAB(j,2), where 1 <= j <= MOUT. IJOB=3: It performs a binary search in each input interval (AB(j,1),AB(j,2)] for a point w(j) such that N(w(j))=NVAL(j), and uses C(j) as the starting point of the search. If such a w(j) is found, then on output AB(j,1)=AB(j,2)=w. If no such w(j) is found, then on output (AB(j,1),AB(j,2)] will be a small interval containing the point where N(w) jumps through NVAL(j), unless that point lies outside the initial interval. Note that the intervals are in all cases half-open intervals, i.e., of the form (a,b] , which includes b but not a . To avoid underflow, the matrix should be scaled so that its largest element is no greater than overflow**(1/2) * underflow**(1/4) in abso- lute value. To assure the most accurate computation of small eigenvalues, the matrix should be scaled to be not much smaller than that, either. See W. Kahan "Accurate Eigenvalues of a Symmetric Tridiagonal Matrix", Report CS41, Computer Science Dept., Stanford University, July 21, 1966 Note: the arguments are, in general, *not* checked for unreasonable values. ARGUMENTS
IJOB (input) INTEGER Specifies what is to be done: = 1: Compute NAB for the initial intervals. = 2: Perform bisection iteration to find eigenvalues of T. = 3: Perform bisection iteration to invert N(w), i.e., to find a point which has a specified number of eigenvalues of T to its left. Other values will cause SLAEBZ to return with INFO=-1. NITMAX (input) INTEGER The maximum number of "levels" of bisection to be performed, i.e., an interval of width W will not be made smaller than 2^(-NITMAX) * W. If not all intervals have converged after NITMAX iterations, then INFO is set to the number of non-converged intervals. N (input) INTEGER The dimension n of the tridiagonal matrix T. It must be at least 1. MMAX (input) INTEGER The maximum number of intervals. If more than MMAX intervals are generated, then SLAEBZ will quit with INFO=MMAX+1. MINP (input) INTEGER The initial number of intervals. It may not be greater than MMAX. NBMIN (input) INTEGER The smallest number of intervals that should be processed using a vector loop. If zero, then only the scalar loop will be used. ABSTOL (input) REAL The minimum (absolute) width of an interval. When an interval is narrower than ABSTOL, or than RELTOL times the larger (in magni- tude) endpoint, then it is considered to be sufficiently small, i.e., converged. This must be at least zero. RELTOL (input) REAL The minimum relative width of an interval. When an interval is narrower than ABSTOL, or than RELTOL times the larger (in magni- tude) endpoint, then it is considered to be sufficiently small, i.e., converged. Note: this should always be at least radix*machine epsilon. PIVMIN (input) REAL The minimum absolute value of a "pivot" in the Sturm sequence loop. This *must* be at least max |e(j)**2| * safe_min and at least safe_min, where safe_min is at least the smallest number that can divide one without overflow. D (input) REAL array, dimension (N) The diagonal elements of the tridiagonal matrix T. E (input) REAL array, dimension (N) The offdiagonal elements of the tridiagonal matrix T in positions 1 through N-1. E(N) is arbitrary. E2 (input) REAL array, dimension (N) The squares of the offdiagonal elements of the tridiagonal matrix T. E2(N) is ignored. NVAL (input/output) INTEGER array, dimension (MINP) If IJOB=1 or 2, not referenced. If IJOB=3, the desired values of N(w). The elements of NVAL will be reordered to correspond with the intervals in AB. Thus, NVAL(j) on output will not, in general be the same as NVAL(j) on input, but it will correspond with the interval (AB(j,1),AB(j,2)] on output. AB (input/output) REAL array, dimension (MMAX,2) The endpoints of the intervals. AB(j,1) is a(j), the left endpoint of the j-th interval, and AB(j,2) is b(j), the right endpoint of the j-th interval. The input intervals will, in general, be modified, split, and reordered by the calculation. C (input/output) REAL array, dimension (MMAX) If IJOB=1, ignored. If IJOB=2, workspace. If IJOB=3, then on input C(j) should be initialized to the first search point in the binary search. MOUT (output) INTEGER If IJOB=1, the number of eigenvalues in the intervals. If IJOB=2 or 3, the number of intervals output. If IJOB=3, MOUT will equal MINP. NAB (input/output) INTEGER array, dimension (MMAX,2) If IJOB=1, then on output NAB(i,j) will be set to N(AB(i,j)). If IJOB=2, then on input, NAB(i,j) should be set. It must satisfy the condition: N(AB(i,1)) <= NAB(i,1) <= NAB(i,2) <= N(AB(i,2)), which means that in interval i only eigenvalues NAB(i,1)+1,...,NAB(i,2) will be considered. Usually, NAB(i,j)=N(AB(i,j)), from a previous call to SLAEBZ with IJOB=1. On output, NAB(i,j) will contain max(na(k),min(nb(k),N(AB(i,j)))), where k is the index of the input interval that the output interval (AB(j,1),AB(j,2)] came from, and na(k) and nb(k) are the the input values of NAB(k,1) and NAB(k,2). If IJOB=3, then on output, NAB(i,j) contains N(AB(i,j)), unless N(w) > NVAL(i) for all search points w , in which case NAB(i,1) will not be modified, i.e., the output value will be the same as the input value (modulo reorderings -- see NVAL and AB), or unless N(w) < NVAL(i) for all search points w , in which case NAB(i,2) will not be modified. Normally, NAB should be set to some distinctive value(s) before SLAEBZ is called. WORK (workspace) REAL array, dimension (MMAX) Workspace. IWORK (workspace) INTEGER array, dimension (MMAX) Workspace. INFO (output) INTEGER = 0: All intervals converged. = 1--MMAX: The last INFO intervals did not converge. = MMAX+1: More than MMAX intervals were generated. FURTHER DETAILS
This routine is intended to be called only by other LAPACK routines, thus the interface is less user-friendly. It is intended for two purposes: (a) finding eigenvalues. In this case, SLAEBZ should have one or more initial intervals set up in AB, and SLAEBZ should be called with IJOB=1. This sets up NAB, and also counts the eigenvalues. Intervals with no eigenvalues would usually be thrown out at this point. Also, if not all the eigenvalues in an interval i are desired, NAB(i,1) can be increased or NAB(i,2) decreased. For example, set NAB(i,1)=NAB(i,2)-1 to get the largest eigenvalue. SLAEBZ is then called with IJOB=2 and MMAX no smaller than the value of MOUT returned by the call with IJOB=1. After this (IJOB=2) call, eigenvalues NAB(i,1)+1 through NAB(i,2) are approximately AB(i,1) (or AB(i,2)) to the tolerance specified by ABSTOL and RELTOL. (b) finding an interval (a',b'] containing eigenvalues w(f),...,w(l). In this case, start with a Gershgorin interval (a,b). Set up AB to contain 2 search intervals, both initially (a,b). One NVAL element should contain f-1 and the other should contain l , while C should contain a and b, resp. NAB(i,1) should be -1 and NAB(i,2) should be N+1, to flag an error if the desired interval does not lie in (a,b). SLAEBZ is then called with IJOB=3. On exit, if w(f-1) < w(f), then one of the intervals -- j -- will have AB(j,1)=AB(j,2) and NAB(j,1)=NAB(j,2)=f-1, while if, to the specified tolerance, w(f-k)=...=w(f+r), k > 0 and r >= 0, then the interval will have N(AB(j,1))=NAB(j,1)=f-k and N(AB(j,2))=NAB(j,2)=f+r. The cases w(l) < w(l+1) and w(l-r)=...=w(l+k) are handled similarly. LAPACK version 3.0 15 June 2000 SLAEBZ(l)
All times are GMT -4. The time now is 08:33 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy