Calculate the average of a column based on the value of another column

# 1
01-27-2013
Hi,

I would like to calculate the average of column 'y' based on the value of column 'pos'.

For example, here is file1:

What I want is the average of 'y' for each range of 'pos': 1-10000, 10001-20000, and so on. The output would look like:

Thanks a lot!

Note: the 'pos' in the output file is the starting value of each range.

Last edited by Scrutinizer; 01-27-2013 at 03:39 PM. Reason: code tags
jackken007
# 2
01-27-2013
PLEASE use code tags as demanded!
Try this as a starting point; I've left the 40000 line as an exercise for you...
or
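The code from this post is not preserved here, so what follows is not RudiC's original answer. It is a minimal awk sketch of the binning the question describes, under some assumptions: the input is whitespace-separated with `pos` in column 1 and `y` in column 2, `pos` is an integer of at least 1, and any line whose first field is not a plain integer (such as a header) can be skipped. The function name `bin_average` and the sample data are made up for illustration.

```shell
#!/bin/sh
# Average column 2 (y) over fixed-width ranges of column 1 (pos).
# Hypothetical helper: reads stdin, takes the range width as $1 (default 10000).
bin_average() {
    awk -v width="${1:-10000}" '
        $1 !~ /^[0-9]+$/ { next }        # skip header or non-numeric lines (assumption)
        {
            # First pos of the range this row falls in: 1, 10001, 20001, ...
            start = int(($1 - 1) / width) * width + 1
            sum[start] += $2
            cnt[start]++
            if (n == 0 || start < min) min = start
            if (n == 0 || start > max) max = start
            n++
        }
        END {
            # Walk the ranges in ascending order; ranges with no rows are skipped.
            for (s = min; s <= max; s += width)
                if (s in cnt) printf "%d %.2f\n", s, sum[s] / cnt[s]
        }
    '
}

# Made-up sample input, not the original file1.
printf 'pos y\n1 2\n5000 4\n10000 6\n10001 10\n20000 20\n35000 7\n' | bin_average 10000
```

With the made-up sample this prints `1 4.00`, `10001 15.00` and `30001 7.00`, one per line. The 20001-30000 range has no rows, so it is left out; printing empty ranges would need an extra branch in the `END` loop.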

Last edited by RudiC; 01-27-2013 at 03:55 PM.
RudiC
# 3
01-27-2013
It works like a charm!

Thank you so much!
jackken007

Calculate 5th percentile based on another column

I would like to have some help in calculating the 5th percentile value of column 2 for each site. The input is like below: site val1 val2 002 10 25.3 002 20 25.3 002 30 25.3 002 40 20 002 50 20 002 60 20 002 70 20 002 80 30 002 90 30 002 100 30 002 120 30 003 20 30.3 003 20 30.3 003 30 20...

Match first two columns and calculate percent of average in third column

I need to match the first two columns and, when they match, calculate the percent of average for the third column. The following awk script does not give me the expected results. awk 'NR==FNR {T=$3; next} $1,$2 in T {P=T/$3*100; printf "%s %s %.0f\n", $1, $2, (P>=0)?P:-P}' diff.file...

Calculate Average time of one column

Hello dears, I have a log file with records like below and want to get an average of one column based on a search for one specific keyword. 2015-02-07 08:15:28 10.102.51.100 10.112.55.101 "kevin.c" POST ...

Check first column - average second column based on a condition

Hi, My input file Gene1 1 Gene1 2 Gene1 3 Gene1 0 Gene2 0 Gene2 0 Gene2 4 Gene2 8 Gene3 9 Gene3 9 Gene4 0 Condition: If the first column matches, then look in the second column. If there is a value of zero in the second column, then don't consider that record while averaging. ...

Find the average based on similar names in the first column

I have a table, say this: name1 num1 num2 num3 num4 name2 num5 num6 num7 num8 name3 num1 num3 num4 num9 name2 num8 num9 num1 num2 name2 num4 num5 num6 num4 name4 num4 num5 num7 num8 name5 num1 num3 num9 num7 name5 num6 num8 num3 num4 I want a code that will sort my data according...

Calculate 2nd Column Based on 1st Column

Dear All, I have input file like this. input.txt CE2_12-15 3950.00 589221.0 9849709.0 768.0 CE2_12_2012 CE2_12-15 3949.00 589199.0 9849721.0 768.0 CE2_12_2012 CE2_12-15 3948.00 589178.0 9849734.0 768.0 CE2_12_2012 CE2_12-52 1157.00 ...

Average values in a column based on range

Hi, I have data with two columns like below. I want to find the average of the column values by range: if the value in column 2 is between 0-250000, the average of column 1 is some xx and the average of column 2 is ww; then if the value is 250001-5000000, the average of column 1 is yy and the average of column 2 is zz. And my...

AWK: how to get average based on certain column

Hi, I'm new to shell programming, can anyone help me on this? I want to do following operations - 1. Average salary for each country 2. Total salary for each city and data that looks like - salary country city 10000 zzz BN 25000 zzz BN 30000 zzz BN 10000 yyy ZN 15000 yyy ZN ...

Use awk to calculate average of column 3

Suppose I have 500 files in a directory and I need to use awk to calculate the average of column 3 for each file. How would I do that?

calculate average of column 2

Hi I have fakebook.csv as following: F1(current date) F2(popularity) F3(name of book) F4(release date of book) 2006-06-21,6860,"Harry Potter",2006-12-31 2006-06-22,,"Harry Potter",2006-12-31 2006-06-23,7120,"Harry Potter",2006-12-31 2006-06-24,,"Harry Potter",2006-12-31...