awk to output id, location, and average


 
# 1  
Old 10-10-2015

I am trying to use awk to output the target in $1 together with the region it maps to in $2 and its average (taken from $4). The command below is close, but I cannot work out how to add the region it maps to, or how to get the count of lines rather than of the text. Thank you :).

Basically,
Code:
$1 occurs 5 times and maps to $2 
with an average of 213.6 reads

input.txt
Code:
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    1    203
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    2    210
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    3    216
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    4    218
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    5    221

Code:
awk '{ N[$1]++ ; T[$1]+=$4 } END { for(X in N) printf("%s occurs %d times with an average depth of %f\n" reads, X, N[X], T[X]/N[X]); }' input.txt > output.txt

Current output
Code:
chr1:24019088-24019259 occurs 10 times with an average of 213.6 reads

Desired output
Code:
chr1:24019088-24019259 occurs 5 times and maps to RPL11:exon.1;RPL11:exon.2 with an average of 213.6 reads


Last edited by cmccabe; 10-10-2015 at 11:55 AM. Reason: fixed format
# 2  
Old 10-10-2015
Try
Code:
awk '
        {N[$1]++
         T[$1]+=$4
         M[$1]=$2
        }
END     {for (X in N) printf ("%s occurs %d times and maps to %s with an average depth"\
                                " of %f reads\n", X, N[X], M[X], T[X]/N[X]);
        }
' file
chr1:24019088-24019259 occurs 5 times and maps to RPL11:exon.1;RPL11:exon.2 with an average depth of 213.600000 reads

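You may want to correct your quoting / newline usage: in your command, reads sits outside the format string (after the \n), so awk treats it as an empty variable and never prints the word.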
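If the output needs to match the wording and precision of the desired output in post #1 exactly, one possible variant is sketched below. It assumes the input is whitespace-separated as posted, that each $1 maps to a single $2 (the last one seen wins otherwise), and that one decimal place (%.1f) is what is wanted; the file names input.txt and output.txt are taken from the original command.

```shell
# Recreate the sample input from post #1 (whitespace-separated).
cat > input.txt <<'EOF'
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    1    203
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    2    210
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    3    216
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    4    218
chr1:24019088-24019259    RPL11:exon.1;RPL11:exon.2    5    221
EOF

awk '
    # N counts lines per target, T sums $4, M remembers the region in $2
    # (assumption: one region per target; a later line would overwrite it).
    { N[$1]++; T[$1] += $4; M[$1] = $2 }
    END {
        for (X in N)
            # %.1f is an assumption, chosen to print 213.6 instead of 213.600000
            printf "%s occurs %d times and maps to %s with an average of %.1f reads\n",
                   X, N[X], M[X], T[X] / N[X]
    }
' input.txt > output.txt

cat output.txt
```

Keeping the whole format string, including the word reads and the trailing \n, inside one pair of double quotes avoids the quoting problem mentioned above. Note that for (X in N) visits targets in an unspecified order, so with more than one target the lines may need to be piped through sort.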
This User Gave Thanks to RudiC For This Post:
# 3  
Old 10-10-2015
Thank you :).