How to average values that share the same annotation name?
Hi, I have a file like this:
input_file
Each line has an annotation name and its associated value, separated by a space. In the output file, I want to average the values of all lines that share the same annotation name. In this case, the input file has 2 lines with the CR387793 annotation and 3 lines with the AR388755 annotation, so the average value for CR387793 should be -3.15 and the average value for AR388755 should be 2.65. Every other annotation has only one value, so it should be kept unchanged in the output file. I am expecting the output file to look like this:
output_file
My input file has thousands of lines like the example. How can I achieve this with Unix commands? Thank you very much!
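One possible approach is a single awk pass that keeps a running sum and count per annotation name. This is only a minimal sketch: the real input_file was not shown, so the sample values below are made up (chosen so the averages match the ones stated in the question), and BC012345 is a hypothetical name standing in for an annotation that occurs once.

```shell
# Hypothetical sample data; the real input_file was not shown in the post.
cat > input_file <<'EOF'
CR387793 -3.1
AR388755 2.5
BC012345 1.2
CR387793 -3.2
AR388755 2.6
AR388755 2.85
EOF

# Sum and count the values per annotation name, then print each mean
# in the order the names first appeared in the input.
awk '
    NF < 2          { next }                  # skip blank or incomplete lines
    !($1 in count)  { order[++n] = $1 }       # remember first-seen order
                    { sum[$1] += $2; count[$1]++ }
    END {
        for (i = 1; i <= n; i++)
            printf "%s %g\n", order[i], sum[order[i]] / count[order[i]]
    }
' input_file > output_file
```

Because the totals are held in arrays keyed by name, the input does not need to be sorted first. The %g format prints up to six significant digits; change it if more precision is needed.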
If I have a file like this, could anyone please guide me on how to find the average value in each matrix? The file has about 130,000 matrices.
Grid-ref= 142, 235
178 182 203 240 273 295 289 293 283 262 201 176
167 187 187 246 260 282 299 312 293 276 230 191
169 ... (2 Replies)
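A rough sketch of one way to do this with awk, assuming every matrix starts with a "Grid-ref=" line and that all following lines up to the next "Grid-ref=" line contain only numbers. The file names grid.txt and grid_means.txt are placeholders, and only the two complete rows from the excerpt are used as sample data.

```shell
# Sample data: the two complete rows visible in the excerpt.
cat > grid.txt <<'EOF'
Grid-ref= 142, 235
178 182 203 240 273 295 289 293 283 262 201 176
167 187 187 246 260 282 299 312 293 276 230 191
EOF

# Accumulate every number in a block and print the block mean
# whenever the next "Grid-ref=" line (or the end of the file) is reached.
awk '
    function report() { if (cnt) printf "%s %g\n", label, total / cnt }
    /^Grid-ref=/ { report(); label = $0; total = 0; cnt = 0; next }
    { for (i = 1; i <= NF; i++) { total += $i; cnt++ } }
    END { report() }
' grid.txt > grid_means.txt
```

Each output line is the Grid-ref header followed by the mean of all values in that block. User-defined functions need a POSIX awk (on older Solaris systems that may mean nawk or /usr/xpg4/bin/awk).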
Hi,
I have the data like this
$1 $2
1 12
2 13
3 14
4 12
5 12
6 12
7 13
8 14
9 12
10 12
I want to compute the average of $1 and $2 over every 5 lines (lines 1-5 and 6-10).
Please help me with awk
Thank you (4 Replies)
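A minimal awk sketch for block averages, assuming the "$1 $2" line in the post is just a column label and not part of the data. The file names data.txt and block_means.txt and the variable name size are placeholders.

```shell
# The ten data rows from the post, without the "$1 $2" label line.
cat > data.txt <<'EOF'
1 12
2 13
3 14
4 12
5 12
6 12
7 13
8 14
9 12
10 12
EOF

# Keep running sums for both columns and print their means after
# every "size" lines; a shorter final block is handled in END.
awk -v size=5 '
    { s1 += $1; s2 += $2; n++ }
    n == size { printf "%g %g\n", s1 / n, s2 / n; s1 = s2 = n = 0 }
    END { if (n) printf "%g %g\n", s1 / n, s2 / n }
' data.txt > block_means.txt
```

For the data above this writes two lines, one per block of five rows. The same pattern extends to more columns by looping over the fields and keeping the sums in an array.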
Hi
I am looking for an awk script which can compute average of all the fields every 5th line. The file looks:
A B C D E F G H I J K L M
1 18 13 14 12 14 13 11 12 12 15 15 15
2 17 17 13 13 13 12 12 11 12 14 15 14
3 16 16 12 12 12 11 11 12 11 16 14 13
4 15 15 11 11 11 12 11 12 11... (6 Replies)
I have a file which is
2
3
4
5
6
6
So I am writing a program in C to calculate the mean.
#include<stdio.h>
#include<string.h>
#include <math.h>
double CALL mean(int n , double x)
main (int argc, char **argv)
{
char Buf,SEQ;
int i;
double result = 0;
FILE *fp; (3 Replies)
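If a C program is not strictly required, the mean of a single-column file can also be computed with awk. This is an alternative to the C approach in the post, not a completion of it; the file names numbers.txt and mean.txt are placeholders.

```shell
# The six values from the post, one per line.
printf '%s\n' 2 3 4 5 6 6 > numbers.txt

# Add up the first field of every line and divide by the line count (NR).
awk '{ total += $1 } END { if (NR) printf "%.2f\n", total / NR }' numbers.txt > mean.txt
```

The "if (NR)" guard avoids a division by zero when the file is empty.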
Sun Solaris Unix Question
Haven't been able to find any solution for this situation. Let's just say the file names listed below exist in a directory. I want the find command to find all files in this directory but at the same time I want to eliminate certain file names or files with certain... (2 Replies)
Data files arrive under different names, which are listed in a file called process.txt.
1. shipments_yyyymmdd.gz
2. Order_yyyymmdd.gz
3. Invoice_yyyymmdd.gz
4. globalorder_yyyymmdd.gz
The process needs to discard all the below files and only process two of the 4 file names available
... (1 Reply)
I have the following SNP data
CHROM POS ID
chr7 78599583 rs987435
chr15 33395779 rs987436
chr1 189807684 rs987437
chr20 33907909 rs987438
chr12 75664046 rs987439
and the following gene data
genename name chrom strand txstart txend... (8 Replies)
I have a file with 2 columns. I want to calculate the average of column 1 based on the values of column 2. Here's what the file looks like. I want to calculate the sum of the numbers corresponding to 1 and then calculate the average, and do the same for the numbers corresponding to zero. Any help with code would... (1 Reply)
Discussion started by: onerokeyz
uniq
UNIQ(1) BSD General Commands Manual UNIQ(1)
NAME
uniq -- report or filter out repeated lines in a file
SYNOPSIS
uniq [-c | -d | -u] [-i] [-f num] [-s chars] [input_file [output_file]]
DESCRIPTION
The uniq utility reads the specified input_file comparing adjacent lines, and writes a copy of each unique input line to the output_file. If
input_file is a single dash ('-') or absent, the standard input is read. If output_file is absent, standard output is used for output. The
second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are
not adjacent, so it may be necessary to sort the files first.
The following options are available:
-c Precede each output line with the count of the number of times the line occurred in the input, followed by a single space.
-d Only output lines that are repeated in the input.
-f num Ignore the first num fields in each input line when doing comparisons. A field is a string of non-blank characters separated from
adjacent fields by blanks. Field numbers are one based, i.e. the first field is field one.
-s chars
Ignore the first chars characters in each input line when doing comparisons. If specified in conjunction with the -f option, the
first chars characters after the first num fields will be ignored. Character numbers are one based, i.e. the first character is
character one.
-u Only output lines that are not repeated in the input.
-i Case insensitive comparison of lines.
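As an illustration of the options above, the following sketch counts how often each annotation name occurs. The sample names and the file names (names.txt, repeated.txt, singletons.txt, counts.txt) are made up for the example. The input is sorted first because uniq only compares adjacent lines.

```shell
# Hypothetical list of annotation names, one per line, unsorted.
printf '%s\n' CR387793 AR388755 CR387793 BC012345 AR388755 AR388755 > names.txt

# -d: names that occur more than once; -u: names that occur exactly once.
sort names.txt | uniq -d > repeated.txt
sort names.txt | uniq -u > singletons.txt

# -c: prefix each name with its count; awk swaps the columns and
# normalizes the leading padding, which differs between implementations.
sort names.txt | uniq -c | awk '{ print $2, $1 }' > counts.txt
```

Note that uniq can count and filter repeated lines but cannot do arithmetic on them, so averaging values per name still needs a tool such as awk.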
DIAGNOSTICS
The uniq utility exits 0 on success, and >0 if an error occurs.
COMPATIBILITY
The historic +number and -number options have been deprecated but are still supported in this implementation.
SEE ALSO
sort(1)
STANDARDS
The uniq utility is expected to be IEEE Std 1003.2 (``POSIX.2'') compatible.
HISTORY
A uniq command appeared in Version 3 AT&T UNIX.
BSD June 6, 1993 BSD