Add specific string to last field of each line in perl based on value


 
# 8  
Old 06-29-2016
Hi, cmccabe.

The following one-liner can be broken into three major parts:
Code:
perl -ple '   # -p will print every line in a loop.
              # -l will remove the end of line and add it after each print.
              # -e will tell the binary perl to interpret the following as code.
BEGIN{%h=qw(0/0 hom 0/1 het 1/1 hom 1/2 het 2/2 hom)}  # Before any line is processed, create a lookup hash table
 /([0-2]\/[0-2])/ and $_ .=" $h{$1}"   # capture any combination of digit/digit from 0 to 2 inclusive
                                       # and use it as the index to the lookup hash table {$1} to append to the current line
' input > results.txt

However, I believe you are a bit confused: there is no calculation of STB in this one-liner. You used another command previously to calculate that. Please see here

In fact, the latest posted input will not produce the output you showed earlier. You ran that command over the input shown in post #1 of this thread.

If you were to run the following modified version with the input posted in #1, you might get what you want.
Code:
perl -plae 'BEGIN{%h=qw(0/0 hom 0/1 het 1/1 hom 1/2 het 2/2 hom)} /([0-2]\/[0-2])/ and $_ .=join ("\t", "", $h{$1}, int($F[5]/33+0.5), "score")' inputs > results.txt

I am going to leave it up to you to combine both commands, if you want to apply them to the input shown in the following thread.
However, I will also explain the additions: the -a switch and the join expression.
Code:
-a  # splits each line on whitespace into the array @F.
$_ .= join("\t", "", $h{$1}, int($F[5]/33+0.5), "score")  # join the listed elements with tabs (the leading "" yields a leading tab) and append the result to the current line.
int($F[5]/33+0.5)  # divide the value in $F[5] (the sixth column, since indexing starts at zero) by 33 and round to the nearest whole number, which is the calculation you asked for.


Last edited by Aia; 06-29-2016 at 12:51 AM..
# 9  
Old 06-29-2016
My attempt at combining the commands from the two threads into one perl command:

Code:
perl -plae 'BEGIN{%h=qw(0/0 hom 0/1 het 1/1 hom 1/2 het 2/2 hom)} /([0-2]\/[0-2])/ and $_ .=join ("\t", "", $h{$1}, int($F[5]/33+0.5)) and $_ .=join ("\t", "", /^[^#].*FDP=(\d+);.*STB=(\d+\.\d+);/ and $_.=($2 >= 0.8?" STRAND BIAS ":" GOOD ")).$1."' input > result
Can't find string terminator '"' anywhere before EOF at -e line 1.

I appreciate all your help and explanations.

Last edited by cmccabe; 06-29-2016 at 06:53 PM.. Reason: added edit
# 10  
Old 06-29-2016
Code:
perl -plae '
    BEGIN{ %h = qw(0/0 hom 0/1 het 1/1 hom 1/2 het 2/2 hom) }
    /^[^#].*FDP=(\d+);.*STB=(\d+\.\d+);.*([0-2]\/[0-2])/ and
    $_ .= join "\t", ("", ($2 >= 0.8 ? "STRAND BIAS" : "GOOD"), $1, "reads", $h{$3}, int($F[5]/33+0.5), "score")
' input > result

# 11  
Old 06-30-2016
Why does the text in bold (GOOD 282 reads, GOOD 127 reads) repeat? I have tried different combinations with no luck in figuring it out. The tab-separated fields in italics at the end of each line are correct, but the text in bold comes before them and does not need to be there. Thank you very much.

Code:
perl -plae '
    BEGIN{ %h = qw(0/0 hom 0/1 het 1/1 hom 1/2 het 2/2 hom) }
    /^[^#].*FDP=(\d+);.*STB=(\d+\.\d+);.*([0-2]\/[0-2])/ and
    $_ .= join "\t", ("", ($2 >= 0.8 ? "STRAND BIAS" : "GOOD"), $1, $h{$3}, int($F[5]/33+0.5))
' input > result

result
Code:
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality, the Phred-scaled marginal (or unconditional) probability of the called genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    SAMPLE
chr1    9324670    .    A    G    672.016    PASS    AF=0.528369;AO=148;DP=281;FAO=149;FDP=282;FR=.;FRO=133;FSAF=59;FSAR=90;FSRF=60;FSRR=73;FWDB=0.00343606;FXX=0;HRUN=1;LEN=1;MLLD=155.207;OALT=G;OID=.;OMAPALT=G;OPOS=9324670;OREF=A;PB=0.5;PBP=1;QD=9.53214;RBI=0.00594431;REFB=-0.0181827;REVB=0.00485061;RO=130;SAF=59;SAR=89;SRF=57;SRR=73;SSEN=0;SSEP=0;SSSB=-0.0352973;STB=0.526882;STBP=0.323;TYPE=snp;VARB=0.0184938;ANN=H6PD    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR    0/1:527:281:282:130:133:148:149:0.528369:89:59:57:73:90:59:60:73 GOOD 282 reads    GOOD    282    het    20
chr1    10318652    .    C    G    360.217    PASS    AF=0.566929;AO=72;DP=129;FAO=72;FDP=127;FR=.;FRO=55;FSAF=36;FSAR=36;FSRF=31;FSRR=24;FWDB=0.00760676;FXX=0.0155027;HRUN=2;LEN=1;MLLD=115.62;OALT=G;OID=.;OMAPALT=G;OPOS=10318652;OREF=C;PB=0.5;PBP=1;QD=11.3454;RBI=0.0125905;REFB=-0.0312889;REVB=-0.0100329;RO=55;SAF=36;SAR=36;SRF=31;SRR=24;SSEN=0;SSEP=0;SSSB=-0.0505108;STB=0.527551;STBP=0.492;TYPE=snp;VARB=0.0181889;ANN=KIF1B    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR    1/1:203:129:127:55:55:72:72:0.566929:36:36:31:24:36:36:31:24 GOOD 127 reads    GOOD    127    hom    11
chr1    10355834    .    C    T    504.995    PASS

# 12  
Old 06-30-2016
Please verify that the lines of your input do not already end with the string "GOOD <number> reads". The Perl command shown is not responsible for the bold string that you posted.
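One way to check, sketched here on two made-up lines: count the lines that already end with such a string. The pattern (including the STRAND BIAS alternative) is an assumption about what an earlier run would have appended; on the real data you would run the grep against your input file instead of the printf:

```shell
# Count lines that already end in "GOOD <number> reads" or "STRAND BIAS <number> reads".
count=$(printf 'chr1\t100\t0/1:527 GOOD 282 reads\nchr1\t200\t1/1:203\n' |
  grep -cE '(GOOD|STRAND BIAS) [0-9]+ reads$')
printf '%s\n' "$count"
```

A count above zero means the file is the output of an earlier run rather than the original input.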
# 13  
Old 07-01-2016
I didn't even notice that... thank you very much.