Find the average based on similar names in the first column

Find the average based on similar names in the first column
# 1
12-04-2012
Find the average based on similar names in the first column

I have a table, say this:
I want a code that will sort my data according to the first column and for ALL columns with the same name, calculate the average of each of the corresponding columns. In this case it would be:

# 2
12-04-2012
What have you tried? Where are you stuck?
 Scott View Public Profile for Scott Find all posts by Scott
# 3
12-04-2012
I started by using sort command, in particular

to find those lines that are repeated but then i am not sure how to distinguish between the lines with the same name after that
# 4
12-04-2012
Once they are sorted you can read them line by line: as they are sorted already you can rely on all the identical key values coming one after the other. The underlying algorithm is a widely used and basic one and called: single group change and it works like this:

You have to remember your last key value. If the key value you read now is identical you are within the same group, so add the other values to sums or whatever you do within your groups.

If the key you read is not identical with the previous one you have to first end your last group - calculate any averages from the sums, etc. - then start with a new group.

Two things to take into account: when you read the first line your group changes (from "" to some value) but you should suppress group end-processing at this point, because otherwise you get a "ghost-group" with an empty key and all values zero/nil. Second, your last line will have to trigger a group change too, because otherwise the last group would not be processed.

I hope this helps.

bakunin
 bakunin View Public Profile for bakunin Find all posts by bakunin
# 5
12-04-2012
Sorry bakunin, I am not sure i understood what you said.
I found from a previous thread

HTML Code:
https://www.unix.com/shell-programming-scripting/121566-averaging-multiple-columns.html
that in order to find the average (after grep for name1) for a number of columns, the correct code would be:

but it only prints the (average) number and not 'name 1' at the beginning of the line.
Can someone tell me how to do this?

I mean now the output is

but i want it to be

Check first column - average second column based on a condition

Hi, My input file Gene1 1 Gene1 2 Gene1 3 Gene1 0 Gene2 0 Gene2 0 Gene2 4 Gene2 8 Gene3 9 Gene3 9 Gene4 0 Condition: If the first column matches, then look in the second column. If there is a value of zero in the second column, then don't consider that record while averaging. ...

To find similar items in a column

HI, I have a long file which looks like "1xxx_0_1" "1xxx" 500 5 "ABC*3-DEF*3-LL" "2yyy_0_1" "2yyy" 600 10 "ABC*2-DEF*2-LL" "3ddd_0_1" "3ddd" 150 52 "ABC*3-DEF*3-LL" "1xxx_0_1" "1xxx" 500 5 "ABC*3-DEF*3-LL" "2yyy_0_1" "2yyy" 600 10 "ABC*2-DEF*2-LL" ...

Calculate the average of a column based on the value of another column

Hi, I would like to calculate the average of column 'y' based on the value of column 'pos'. For example, here is file1 id pos y c 11 1 220 aa 11 4333 207 f 11 5333 112 ee 11 11116 305 e 11 11117 310 r 11 22228 781 gg 11 ...

Average values in a column based on range

Hi i have data with two columns like below. I want to find average of column values like if the value in column 2 is between 0-250000 the average of column 1 is some xx and average of column2 is ww then if value is 250001-5000000 average of column 1 is yy and average of column 2 is zz. And my...

Find the Maximum value and average of a column

Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted! 1. The problem statement, all variables and given/known data: I am trying to complete a script which will allow me to find: a) reads a value from the keyboard. (ask the...

Help with merge two file based on similar column content

Input file 1: A1BG A1BG A1BG A1CF A1CF BCAS BCAS A2LD1 A2M A2M HAT . . Input file 2: A1BG All A1CF TEMP

Joining multiple files based on one column with different and similar values (shell or perl)

Hi, I have nine files looking similar to file1 & file2 below. File1: 1 ABCA1 1 ABCC8 1 ABR:N 1 ACACB 1 ACAP2 1 ACOT1 1 ACSBG 1 ACTR1 1 ACTRT 1 ADAMT 1 AEN:N 1 AKAP1File2: 1 A4GAL 1 ACTBL 1 ACTL7

AWK: how to get average based on certain column

Hi, I'm new to shell programming, can anyone help me on this? I want to do following operations - 1. Average salary for each country 2. Total salary for each city and data that looks like - salary country city 10000 zzz BN 25000 zzz BN 30000 zzz BN 10000 yyy ZN 15000 yyy ZN ...

Script to find the average of a given column and also for specified number of rows?

Hi Friends, In continuation to my earlier post https://www.unix.com/shell-programming-scripting/99166-script-find-average-given-column-also-specified-number-rows.html I am extending my problem as follows. Input: Column1 Column2 MAS 1 MAS 4 ...

Change names in a column based on the symbols in another column

If the 4th column has - sign then the names in 3rd column has to change to some user defined names (as shown in output). Thanx input1 1 a aaaaa + 2 b bbbbb + 3 c ccccc + 4 d ddddd + 5 e eeeee + 6 f xxxxx + 8 h hhhhh +...