Average values of duplicate rows (Shell Programming and Scripting)
Post 302914223 by RavinderSingh13 on Saturday 23rd of August 2014 03:46:49 AM
Hello,

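The following may also help. It reads the file twice: the first pass skips the header line and sums columns 2 to 4 for each value in column 1, and the second pass prints the averages once per value.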

Code:
awk '
# Pass 1: skip the header line, then count rows and sum columns 2-4 per key.
NR==FNR { if (FNR>1) { n[$1]++; s2[$1]+=$2; s3[$1]+=$3; s4[$1]+=$4 }; next }
# Pass 2: print the averages once, at the first occurrence of each key.
($1 in n) && !seen[$1]++ { print $1, s2[$1]/n[$1], s3[$1]/n[$1], s4[$1]/n[$1] }
' OFS="\t" filename filename

Output will be as follows.

Code:
gene_A  4.33333 7       8.33333
gene_B  2.5     3.5     4.5
gene_C  2       3       4
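
If reading the file twice is not wanted, the same averages can probably be had in one pass. This is only a sketch under assumptions: the function name avg_by_key is made up, the input is taken to be tab-separated with one header line, and every column after the first is assumed to be numeric. The sample rows in the usage line are hypothetical, chosen only so that the averages match the output above. They are not the original poster's data.

```shell
# Single-pass sketch: average every numeric column per key in column 1.
# avg_by_key is a hypothetical helper name; it reads the named file(s) or stdin.
avg_by_key() {
  awk -F'\t' -v OFS='\t' '
    FNR == 1 { next }                         # skip the header line
    !($1 in n) { order[++k] = $1 }            # remember keys in first-seen order
    {
      n[$1]++                                 # rows seen for this key
      if (NF > maxnf) maxnf = NF              # widest row decides the column count
      for (i = 2; i <= NF; i++) sum[$1, i] += $i
    }
    END {
      for (j = 1; j <= k; j++) {
        key = order[j]
        line = key
        for (i = 2; i <= maxnf; i++) line = line OFS (sum[key, i] / n[key])
        print line
      }
    }' "$@"
}

# Usage with hypothetical sample rows (header line first):
printf 'gene\ts1\ts2\ts3\ngene_A\t4\t7\t8\ngene_B\t2\t3\t4\ngene_A\t5\t7\t9\ngene_B\t3\t4\t5\ngene_C\t2\t3\t4\ngene_A\t4\t7\t8\n' | avg_by_key
```

The sums are kept in one array indexed by key and column number, so the column count does not have to be hard-coded. Results are printed in the END block, in the order the keys first appeared.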


Thanks,
R. Singh
 
