awk to output the percentage of a field compared to length Post: 302956520

Post 302956520 by cmccabe on Wednesday 30th of September 2015 04:10:17 PM

The awk below, run on the sample input, would output the following. Basically, it is meant to average the values in $7 for each matching ID in $5, counting only lines where $7 < 30.

Code:

awk '{
  if (len == 0) { last = $5; total = $7; len = 1; getline }  # first line
  if ($5 != last) {  # new ID: print the finished group
    printf("%s\t%f\n", last, total / len)
    last = $5; total = $7; len = 1
  } else { total += $7; len += 1 }
}
END { printf("%s\t%f\n", last, total / len) }' Input.txt > output.txt

Sample Input

Code:

chr1   955542  955763  +   AGRN:exon.1 1   0
chr1   955542  955763  +   AGRN:exon.1 2   0
chr1   955542  955763  +   AGRN:exon.1 3   0
chr1   955542  955763  +   AGRN:exon.1 4   1
chr1   955542  955763  +   AGRN:exon.1 5   1
chr1   955542  955763  +   AGRN:exon.1 6   1
....
....
chr1   955542  955763  +   AGRN:exon.1 218 32
chr1   955542  955763  +   AGRN:exon.1 219 32
chr1   955542  955763  +   AGRN:exon.1 220 32
chr1   955542  955763  +   AGRN:exon.1 221 29

Output

Code:

 AGRN:exon.1 4.5714285
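
As a sanity check on that figure (a sketch, not part of the command above): the value matches the mean of the seven visible sample values below 30, assuming the elided `....` lines contain nothing else under 30. The function name `visible_mean` is made up for illustration.

```shell
# Hypothetical helper: mean of the seven visible $7 values that are < 30
# (0 0 0 1 1 1 29), assuming the elided lines hold no other values below 30.
visible_mean() {
    awk 'BEGIN { printf "%.7f\n", (0 + 0 + 0 + 1 + 1 + 1 + 29) / 7 }'
}
visible_mean
```

This prints 4.5714286, which is the figure above apart from rounding in the last digit.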

My question: I cannot seem to find the correct syntax to also output the total # of lines in $6 for each $5, and the % of those lines where $7 < 30. I know my words may not be all that helpful, so hopefully the desired output will help. Thank you :)

Desired output

Code:

  
ID             Average Reads      % of Baits
AGRN:exon.1    4.5714285          3.16742     (# of lines < 30 in $7 / 221, the # of lines in $6, x 100)

The bold is only to show the math and does not need to be included.
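
One way the desired output might be produced is sketched below. It rests on several assumptions: the input is whitespace-separated with the ID in $5, the position in $6 and the reads in $7; the average is taken only over lines where $7 < 30; and the percentage is the number of those lines divided by all lines for that ID, times 100 (which is what 3.16742 works out to for 7 of 221 lines). The function name `summarize_baits` and the `max` variable are made up for illustration.

```shell
# Hypothetical helper: per-ID average of $7 over lines where $7 < max,
# plus the percentage of lines for that ID that fall below max.
# Assumes whitespace-separated input: $5 = ID, $6 = position, $7 = reads.
summarize_baits() {
    awk -v max=30 '
        {
            if (!($5 in lines)) order[++n] = $5   # remember IDs in first-seen order
            lines[$5]++                           # every line counts toward the total
            if ($7 < max) { low[$5]++; sum[$5] += $7 }
        }
        END {
            printf "%s\t%s\t%s\n", "ID", "Average Reads", "% of Baits"
            for (i = 1; i <= n; i++) {
                id = order[i]
                avg = (low[id] > 0) ? sum[id] / low[id] : 0   # avoid dividing by zero
                printf "%s\t%.7f\t%.5f\n", id, avg, 100 * low[id] / lines[id]
            }
        }
    ' "$1"
}
```

It would be called as `summarize_baits Input.txt > output.txt`. Because the counts are kept in arrays keyed on $5, the input does not need to be grouped by ID, at the cost of holding one entry per ID in memory.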

Last edited by cmccabe; 09-30-2015 at 05:57 PM.. Reason: fixed formatting; [mod] extra code tags, added details
