awk to change value of field using multiple conditions


 
# 1  
Old 08-02-2016

In the awk below, the first step defaults Classification (the last field, NF) to VUS. Next, I am trying to change the value of Classification (NF) to whatever CLINSIG (NF-1) is. If I test only one condition everything works, but with two conditions it does not. Is the syntax incorrect, and how do I fix the awk? Thank you.

input
Code:
Chr    Start    End    Ref    Alt    Func.refGene    PopFreqMax    CLINSIG    Classification
chr1    43395635    43395635    C    T    exonic    0.12    Benign    VUS
chr1    43396414    43396414    G    A    exonic    0.14    Benign    VUS
chr1    172410967    172410967    G    A    exonic    0.66    VUS
chr1    172411496    172411496    A    G    exonic    1    VUS
chr2    51254901    51254901    G    A    exonic    0.48    Likely Benign    VUS
chr2    51254914    51254914    C    T    exonic    0.0023    VUS

awk for step 1
Code:
awk 'BEGIN{OFS="\t"} NR>1{$(NF+1)="VUS"} 1' input > out

awk for step 2
Code:
awk -v OFS='\t' '$(NF-1)=="Benign" || $(NF-1)=="Likely Benign" {$(NF)=$(NF-1)} {print $0 }' out > final

desired output
Code:
Chr    Start    End    Ref    Alt    Func.refGene    PopFreqMax    CLINSIG    Classification
chr1    43395635    43395635    C    T    exonic    0.12    Benign    VUS     Benign
chr1    43396414    43396414    G    A    exonic    0.14    Benign    VUS     Benign
chr1    172410967    172410967    G    A    exonic    0.66    VUS
chr1    172411496    172411496    A    G    exonic    1    VUS
chr2    51254901    51254901    G    A    exonic    0.48    Likely Benign    VUS     Likely Benign
chr2    51254914    51254914    C    T    exonic    0.0023    VUS


Last edited by cmccabe; 08-02-2016 at 02:50 PM.. Reason: add desired output
# 2  
Old 08-02-2016
Hello cmccabe,

Could you please try the following and let me know if it helps.
Code:
awk -v OFS='\t' '($(NF-2) " " $(NF-1))=="Likely Benign" {$(NF+1)="Likely Benign"; print; next} $(NF-1)=="Benign" {$(NF+1)="Benign"} {print $0 }'  Input_file

The problem in your code is that $(NF-1) will never be Likely Benign: awk's default field separator is whitespace (runs of spaces or tabs), so Likely and Benign are split into two separate fields and that condition is never true. The value Likely Benign has to be compared against $(NF-2) " " $(NF-1) instead, and that test has to come before the plain Benign test, because $(NF-1) is Benign in both cases. Please try the above and let me know if this helps you.
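
The field splitting described above can be checked directly. The following is a minimal sketch, assuming the data are tab-delimited; the sample record is built with printf from the chr2 row in the question, and the variable names are only for illustration.

```shell
# One sample record, tab-delimited, with a CLINSIG value that contains a space.
line=$(printf 'chr2\t51254901\t51254901\tG\tA\texonic\t0.48\tLikely Benign\tVUS')

# Default FS: splits on any run of spaces or tabs, so "Likely Benign" becomes two fields.
nf_default=$(printf '%s\n' "$line" | awk '{print NF}')

# FS set to a tab: "Likely Benign" stays a single field.
nf_tab=$(printf '%s\n' "$line" | awk -F'\t' '{print NF}')
clinsig=$(printf '%s\n' "$line" | awk -F'\t' '{print $(NF-1)}')

echo "default FS: NF=$nf_default"
echo "tab FS:     NF=$nf_tab, \$(NF-1)=$clinsig"
```

With the default separator the record has one more field than with -F'\t', which is why a comparison of a single field against a value containing a space cannot succeed unless the separator is set to a tab.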

Thanks,
R. Singh
This User Gave Thanks to RavinderSingh13 For This Post:
# 3  
Old 08-02-2016
Code:
awk -F"\t" 'NR>1{$0=$0 FS "VUS" (($NF=="Benign" || $NF=="Likely Benign") ? (FS $NF) : "")} 1' file

This User Gave Thanks to rdrtx1 For This Post:
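
The one-liner in this post can be unpacked into a longer form that may be easier to follow. This is only a sketch under two assumptions: the input is tab-delimited, and CLINSIG is still the last field (possibly empty) because VUS has not been appended yet. The three-column sample is hypothetical and much narrower than the real file.

```shell
# Hypothetical sample (tab-delimited); CLINSIG is the last field and may be empty.
sample=$(printf 'Chr\tCLINSIG\tClassification\nchr1\tBenign\nchr1\t\nchr2\tLikely Benign\n')

result=$(printf '%s\n' "$sample" | awk -F'\t' '
NR > 1 {
    clinsig = $NF                      # last field before anything is appended
    $0 = $0 FS "VUS"                   # append the default Classification
    if (clinsig == "Benign" || clinsig == "Likely Benign")
        $0 = $0 FS clinsig             # append the CLINSIG value after VUS
}
{ print }')

printf '%s\n' "$result"
```

Because the record is extended by string concatenation with FS rather than by assigning to a field, the existing tabs and empty fields are left exactly as they were.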
# 4  
Old 08-03-2016
Thank you both.

---------- Post updated 08-03-16 at 10:06 AM ---------- Previous update was 08-02-16 at 03:22 PM ----------
# 5  
Old 08-03-2016
I guess I do not understand how NF works. My actual data file is attached and is much larger (file.txt).

awk to get NF
Code:
awk 'NR==1{for(i=1;i<=NF;i++){print "Number of field in terms of NF is--> NF-" NF-i", value is-->" $i}}' file.txt

Number of field in terms of NF is--> NF-55, value is-->Chr
Number of field in terms of NF is--> NF-54, value is-->Start
Number of field in terms of NF is--> NF-53, value is-->End
Number of field in terms of NF is--> NF-52, value is-->Ref
Number of field in terms of NF is--> NF-51, value is-->Alt
Number of field in terms of NF is--> NF-50, value is-->Func.refGene
Number of field in terms of NF is--> NF-49, value is-->Gene.refGene
Number of field in terms of NF is--> NF-48, value is-->GeneDetail.refGene
Number of field in terms of NF is--> NF-47, value is-->ExonicFunc.refGene
Number of field in terms of NF is--> NF-46, value is-->AAChange.refGene
Number of field in terms of NF is--> NF-45, value is-->avsnp147
Number of field in terms of NF is--> NF-44, value is-->PopFreqMax
Number of field in terms of NF is--> NF-43, value is-->1000G_ALL
Number of field in terms of NF is--> NF-42, value is-->1000G_AFR
Number of field in terms of NF is--> NF-41, value is-->1000G_AMR
Number of field in terms of NF is--> NF-40, value is-->1000G_EAS
Number of field in terms of NF is--> NF-39, value is-->1000G_EUR
Number of field in terms of NF is--> NF-38, value is-->1000G_SAS
Number of field in terms of NF is--> NF-37, value is-->ExAC_ALL
Number of field in terms of NF is--> NF-36, value is-->ExAC_AFR
Number of field in terms of NF is--> NF-35, value is-->ExAC_AMR
Number of field in terms of NF is--> NF-34, value is-->ExAC_EAS
Number of field in terms of NF is--> NF-33, value is-->ExAC_FIN
Number of field in terms of NF is--> NF-32, value is-->ExAC_NFE
Number of field in terms of NF is--> NF-31, value is-->ExAC_OTH
Number of field in terms of NF is--> NF-30, value is-->ExAC_SAS
Number of field in terms of NF is--> NF-29, value is-->ESP6500siv2_ALL
Number of field in terms of NF is--> NF-28, value is-->ESP6500siv2_AA
Number of field in terms of NF is--> NF-27, value is-->ESP6500siv2_EA
Number of field in terms of NF is--> NF-26, value is-->CG46
Number of field in terms of NF is--> NF-25, value is-->dpsi_max_tissue
Number of field in terms of NF is--> NF-24, value is-->dpsi_zscore
Number of field in terms of NF is--> NF-23, value is-->SIFT_score
Number of field in terms of NF is--> NF-22, value is-->SIFT_pred
Number of field in terms of NF is--> NF-21, value is-->Polyphen2_HDIV_score
Number of field in terms of NF is--> NF-20, value is-->Polyphen2_HDIV_pred
Number of field in terms of NF is--> NF-19, value is-->Polyphen2_HVAR_score
Number of field in terms of NF is--> NF-18, value is-->Polyphen2_HVAR_pred
Number of field in terms of NF is--> NF-17, value is-->LRT_score
Number of field in terms of NF is--> NF-16, value is-->LRT_pred
Number of field in terms of NF is--> NF-15, value is-->MutationTaster_score
Number of field in terms of NF is--> NF-14, value is-->MutationTaster_pred
Number of field in terms of NF is--> NF-13, value is-->MutationAssessor_score
Number of field in terms of NF is--> NF-12, value is-->MutationAssessor_pred
Number of field in terms of NF is--> NF-11, value is-->CLINSIG
Number of field in terms of NF is--> NF-10, value is-->CLNDBN
Number of field in terms of NF is--> NF-9, value is-->CLNACC
Number of field in terms of NF is--> NF-8, value is-->CLNDSDB
Number of field in terms of NF is--> NF-7, value is-->CLNDSDBID
Number of field in terms of NF is--> NF-6, value is-->Quality
Number of field in terms of NF is--> NF-5, value is-->Reads
Number of field in terms of NF is--> NF-4, value is-->Zygosity
Number of field in terms of NF is--> NF-3, value is-->Phred
Number of field in terms of NF is--> NF-2, value is-->Classification
Number of field in terms of NF is--> NF-1, value is-->HGMD
Number of field in terms of NF is--> NF-0, value is-->Sanger

I tried the awk below to produce the attached desired output, which is just "VUS" in the Classification (NF-2) field. Currently I get a result with the data all out of order (attached current.txt). Thank you.

Code:
awk 'BEGIN{OFS="\t"} NR>1{$(NF-2)="VUS"} 1' file.txt > VUS


Last edited by cmccabe; 08-03-2016 at 12:48 PM.. Reason: added current results
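
The thread as captured here has no reply to this last post. One possible cause, assuming file.txt is tab-delimited, is that the command sets only OFS: awk then splits each line on any whitespace, so rows with empty fields or with values containing spaces have a different NF, and $(NF-2) points at a different column from row to row. The sketch below sets both FS and OFS to a tab and finds the Classification column from the header instead of counting back from NF. The sample data and the CLNDBN value in it are hypothetical stand-ins for the attachment.

```shell
# Hypothetical stand-in for file.txt: tab-delimited, with an empty HGMD field
# and a CLNDBN value that contains a space.
data=$(printf 'Chr\tCLNDBN\tClassification\tHGMD\tSanger\nchr1\tnot provided\tx\t\tyes\nchr2\t\tx\tDM\tno\n')

vus=$(printf '%s\n' "$data" | awk '
BEGIN { FS = OFS = "\t" }              # split and rejoin on tabs only
NR == 1 {                              # locate the column by its header name
    for (i = 1; i <= NF; i++) if ($i == "Classification") col = i
    print; next
}
col { $col = "VUS" }                   # overwrite that column on data rows
{ print }')

printf '%s\n' "$vus"
```

Looking the column up by name also keeps the command working if columns are later added after Classification, which a fixed offset such as NF-2 would not.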