Output calculations


 
# 1  
Old 01-27-2015

Attached are the original output (zipped file) and a custom file, average.txt, produced with the awk code below, which calculates the average reads per bait.

Code:
  awk '{if(len==0){last=$4;total=$6;len=1;getline}if($4!=last){printf("%s\t%f\n", last, total/len);last=$4;total=$6;len=1}else{total+=$6;len+=1}}END{printf("%s\t%f\n", last, total/len)}' output.bam.hist.txt > average.txt

Is it possible to output the length of the bait, the average # of reads, and the calculated coverage as ...x (3 of reads * 150/length)? Thank you.

Bait Length Reads Coverage
chr12:112884064-112884217 153 158.20915 155x (158*150)/153
# 2  
Old 01-27-2015
Unless you give us a small sample of the input and the desired output, and describe in words what is going on, you are less likely to get help.
# 3  
Old 01-27-2015
The input file is attached. If I run the code below:

Code:
 awk '{if(len==0){last=$4;total=$6;len=1;getline}if($4!=last){printf("%s\t%f\n", last, total/len);last=$4;total=$6;len=1}else{total+=$6;len+=1}}END{printf("%s\t%f\n", last, total/len)}' output.bam.hist.txt > average.txt

the result is a file called average.txt that combines all the matching baits (chr...) and calculates the average # of reads for each.

The first three lines of average.txt:
Code:
chr12:112884064-112884217 158.2092
chr12:112888106-112888331 220.5333
chr12:112890983-112891206 228.287

Each line is the bait followed by the average # of reads. In the input file each bait is output over and over again at different positions, and the script combines the matching baits and calculates the average for each.

Is it possible to output the length of the bait, the average # of reads, and the calculated coverage as ...x (3 of reads * 150/length)? Thank you.
Code:
Bait                                Length Reads     Coverage
chr12:112884064-112884217 153 158.20915 155x (158*150)/153

The formula is not needed in the output; I was just trying to show how the coverage is calculated.

Last edited by rbatte1; 01-28-2015 at 08:18 AM.. Reason: Added CODE tags
# 4  
Old 01-27-2015
Your input sample has non-printable CR characters that obscure the output.
Here is a more straightforward awk script that also eliminates the CRs:
Code:
awk '
function pr() {if (len>0) printf "%s\t%d\t%f\n", last, len, total/len}
{gsub("\r","")} # eliminate CRs
($4!=last) {
  pr()
  last=$4
  total=len=0
}
{total+=$6; len+=1}
END {pr()}
' output.bam.hist.txt > average.txt

---------- Post updated at 04:17 PM ---------- Previous update was at 03:51 PM ----------

You said "3 of reads". Looking at my US keyboard, you probably want "# of reads".
Then append another \t%f (a tab character and a floating-point field) to the format string, which is the first argument of the printf, and add another argument with the formula:
Code:
function pr() {if (len>0) printf "%s\t%d\t%f\t%f\n", last, len, total/len, total/len*150/len}

# 5  
Old 06-19-2015
Not sure what is wrong. Thank you very much.

Code:
$ awk '
>  function pr() {if (len>0) printf "%s\t%d\t%f\t%f\n", last, len, total/len}
>  function ar() {if (len>0) printf "%s\t%d\t%f\t%f\n", last, len, total/len, total/len*150/len}
>  {gsub("\r","")} # eliminate CRs
>  ($4!=last) {
>    pr()
>    last=$4
>    total=len=0
>  }
>  {total+=$6; len+=1}
>  END {pr()}
>  ' output.bam.hist.txt > average.txt
awk: cmd. line:2: (FILENAME=output.bam.hist.txt FNR=154) fatal: not enough arguments to satisfy format string
        `%s     %d      %f      %f
'
                  ^ ran out for this one

# 6  
Old 06-19-2015
Quote:
Originally Posted by cmccabe
Not sure what is wrong. Thank you very much.
Code:
printf "%s\t%d\t%f\t%f\n", last, len, total/len}

1 -> %s <- last
2 -> %d <- len
3 -> %f <- total/len
4 -> %f <- undefined
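
The mapping above can be checked in isolation: each conversion in the format string needs its own argument. A minimal sketch, assuming a made-up total of 24206 reads over 153 lines for the first bait (chosen only because it reproduces the 158.20915 average shown earlier):

```shell
# Four conversions (%s %d %f %f) paired with four arguments.
# The total of 24206 is an assumed value for illustration only.
line=$(awk 'BEGIN {
  last = "chr12:112884064-112884217"; len = 153; total = 24206
  printf "%s\t%d\t%f\t%f\n", last, len, total/len, total/len*150/len
}')
printf '%s\n' "$line"
```

Dropping the last argument while keeping the fourth %f is what triggers the "not enough arguments to satisfy format string" error in gawk.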
# 7  
Old 06-19-2015
This command runs but does not produce the desired output. Did I do something wrong? Thank you.

Code:
awk '
 function pr() {if (len>0) printf "%s\t%d\t%f\n", last, len, total/len}
 function ar() {if (len>0) printf "%s\t%d\t%f\t%f\n", last, len, total/len, total/len*150/len}
 {gsub("\r","")} # eliminate CRs
 ($4!=last) {
   pr()
   last=$4
   total=len=0
 }
 {total+=$6; len+=1}
 END {pr()}
 ' output.bam.hist.txt > average2.txt

Desired output
(in line 1 the equation for $4 would be (158*150) / 153 = 155x;
the 150 is a static # that will never change)
Code:
chr12:112884064-112884217 153 158.20915 155x
chr12:112888106-112888331 225 220.533333 147x
chr12:112890983-112891206 223 228.286996 153x

Thank you.
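
For what it is worth, the script in post #7 defines ar() but only ever calls pr(), so the fourth column is never printed. Below is a minimal sketch of one way to get the desired layout. It assumes each bait has one input line per base (so the line count equals the bait length) and that the coverage should be truncated to a whole number, which is consistent with the three desired values above. The file sample.hist.txt and its contents are made up for illustration; only fields 4 (bait) and 6 (reads) matter.

```shell
# Hypothetical sample: two baits, CRLF line endings like the real input.
printf 'chr12\t0\t3\tbaitA\t%s\t%s\r\n' 1 100 2 200 3 300 > sample.hist.txt
printf 'chr12\t5\t7\tbaitB\t%s\t%s\r\n' 1 10 2 20 >> sample.hist.txt

awk '
# Print bait, length, average reads, and coverage truncated to an integer plus "x".
function pr(   avg) {
  if (len > 0) {
    avg = total / len
    printf "%s\t%d\t%f\t%dx\n", last, len, avg, avg * 150 / len
  }
}
{ gsub("\r", "") }                               # eliminate CRs
$4 != last { pr(); last = $4; total = len = 0 }  # new bait: flush the previous one
{ total += $6; len += 1 }                        # accumulate reads and line count
END { pr() }                                     # flush the final bait
' sample.hist.txt > average2.txt

cat average2.txt
```

With the real data, replace sample.hist.txt with output.bam.hist.txt. The 150 is the static read length mentioned above; if rounding is preferred over truncation, %.0fx could be used in place of %dx.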