Word count of values in a column

06-12-2012


Hi friends,

I have an input file in the following format:

Code:

a b c 1.11112
d e f 4.5767
g h i 19.098
k i l 87.9999

I am looking for an awk one-liner that produces the following output:

output.txt

Code:

Range of the column: 1.11112 to 87.9999
Total records between 1 and 10 - 2
Total records between 10 and 20 - 1
Total records between 20 and 30 - 0
Total records between 30 and 40 - 0
Total records between 40 and 50 - 0
Total records between 50 and 60 - 0
Total records between 60 and 70 - 0
Total records between 70 and 80 - 0
Total records between 80 and 90 -1

I want to know the total number of records in the input file that fall in each interval of 10.

Thanks

jacobs.smith


06-12-2012


Your edges are ambiguous: 1-10, then 10-20. Which range would 10 go in?

neutronscott


06-12-2012


Quote:

Originally Posted by neutronscott

Your edges are ambiguous: 1-10, then 10-20. Which range would 10 go in?

Thank you.

That was a very good question.

Here is my revised output.txt:

Code:

Range of the column: 1.11112 to 87.9999
Total records between 1 and 10.99 - 2
Total records between 11 and 20.99 - 1
Total records between 21 and 30.99 - 0
Total records between 31 and 40.99 - 0
Total records between 41 and 50.99 - 0
Total records between 51 and 60.99 - 0
Total records between 61 and 70.99 - 0
Total records between 71 and 80.99 - 0
Total records between 81 and 90.99 -1

jacobs.smith


06-12-2012


Code:

[mute@geek ~/temp/jacobs.smith]$ awk 'NR==1{min=$4}$4<min{min=$4}$4>max{max=$4}{a[int($4/10)]++}END{printf("Range of the column: %f to %f\n",min,max);max=int(max/10);for(i=0;i<=max;i++)printf("Records between [%d, %d): %d\n",i*10,10+i*10,a[i])}' input
Range of the column: 1.111120 to 87.999900
Records between [0, 10): 2
Records between [10, 20): 1
Records between [20, 30): 0
Records between [30, 40): 0
Records between [40, 50): 0
Records between [50, 60): 0
Records between [60, 70): 0
Records between [70, 80): 0
Records between [80, 90): 1
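
For readability, the one-liner above can be spread over several lines. What follows is only a sketch wrapped in a shell function: the name `bin_counts` and the optional bin-width argument are illustrative additions, not part of the original command. It assumes non-negative numbers in column 4, and it prints the range with `%s` instead of `%f` so the two extremes appear as they were written in the input.

```sh
# bin_counts FILE [WIDTH]: count column-4 values in half-open bins [k*W, (k+1)*W).
# The function name and the WIDTH parameter are illustrative, not from the thread.
bin_counts() {
    awk -v w="${2:-10}" '
        NR == 1  { min = $4; max = $4 }   # seed both extremes from the first record
        $4 < min { min = $4 }
        $4 > max { max = $4 }
        { count[int($4 / w)]++ }          # int() truncates, so 10 lands in [10, 20)
        END {
            printf("Range of the column: %s to %s\n", min, max)
            last = int(max / w)           # index of the highest occupied bin
            for (i = 0; i <= last; i++)
                printf("Records between [%d, %d): %d\n", i * w, (i + 1) * w, count[i])
        }
    ' "$1"
}
```

Called as `bin_counts input`, this should print the same nine bin lines as above, with the range shown as `1.11112 to 87.9999`.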

neutronscott


06-12-2012


How will this split the edge?

jacobs.smith


06-12-2012


[ ] means the endpoint is included and ( ) means it is excluded, so [10, 20) holds 10 but not 20. I think you wanted every range shifted up by 1. In that case, subtract one before dividing, a[int(($4-1)/10)]++, and adjust the printf parameters to match.

edit: like this

Code:

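#!/usr/bin/awk -f
# Bins are shifted up by one: [1, 11), [11, 21), ... labelled "1 and 10.99", etc.
NR==1{min=$4;max=$4}$4<min{min=$4}$4>max{max=$4}{a[int(($4-1)/10)]++}
END{
        printf("Range of the column: %f to %f\n", min, max);
        max=int((max-1)/10)     # index of the last bin, shifted the same way as the counts
        for (i=0;i<=max;i++)
                printf("Records between %d and %.2f: %d\n",1+i*10,10.99+i*10,a[i])
}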
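
To see how the shifted index treats values near a boundary, a quick check like the one below may help. It is only a sketch: the helper name `shifted_bin` is made up for illustration, and it simply reuses the `int((x-1)/10)` expression from the script above. It assumes values of at least 1, because `int()` truncates toward zero and anything above -9 and below 1 would also land in the first bin.

```sh
# shifted_bin VALUE...: print the "N and N+9.99" range each value falls into.
# Hypothetical helper for illustration; it mirrors int(($4-1)/10) from the script.
shifted_bin() {
    for v in "$@"; do
        awk -v x="$v" 'BEGIN {
            i = int((x - 1) / 10)         # shifted bin index
            printf("%s -> %d and %.2f\n", x, 1 + i * 10, 10.99 + i * 10)
        }'
    done
}

shifted_bin 10 10.99 11 20.99 21
# 10 -> 1 and 10.99
# 10.99 -> 1 and 10.99
# 11 -> 11 and 20.99
# 20.99 -> 11 and 20.99
# 21 -> 21 and 30.99
```

Note that the ".99" in the labels is only cosmetic. The bins are really [1, 11), [11, 21) and so on, so a value such as 10.995 is still counted in the "1 and 10.99" range.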

neutronscott

