awk to add lines with symbol to output file


 
# 1  
Old 06-20-2017

The awk below executes and produces output that is close, except that all the lines that start with a # are removed. Some of those lines start with one #, others with two or three. After the script adds the ID= prefixes to the fields on the lines below the pattern in the awk, I cannot seem to add the # lines back to the output. Thank you :).

awk
Code:
awk -v OFS="\t" '/^#CHROM/{00-0000-xx r=NR }r && NR>r{ $11="ID1="$11; $12="ID2="$12; $13="ID3="$13; $14="ID4="$14; print }' file

file
Code:
##INFO=<ID=ANN,Number=1,Type=Integer,Description="My custom annotation">
##source_20170530.1=vcf-annotate(r953) -a /home/cmccabe/Desktop/NGS/bed/vcf/annotations.bed.gz -d key=INFO,ID=ANN,Number=1,Type=Integer,Description=My custom annotation -c CHROM,FROM,TO,INFO/ANN
##INFO=<ID=,Number=A,Type=Float,Description="Variant quality">
###INFO=<ID=,Number=A,Type=Float,Description="Variant quality">
###INFO=<ID=ID1,Type=Integer,Description="Variant quality">
###INFO=<ID=ID2,Type=String,Description="Reads">
###INFO=<ID=ID3,Type=String,Description="Zygosity">
###INFO=<ID=ID4,Type=Integer,Description="Score">
#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    00-0000-xx
chr1    948846    .    T    TA    529.927    PASS    AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    GOOD    103    hom    16
chr1    948870    .    C    G    279.296    PASS    AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1    GOOD    139    het    8

current output
Code:
chr1     948846    .    T    TA    529.927    PASS     AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395     GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT     1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    ID1=GOOD    ID2=103    ID3=hom    ID4=16
chr1     948870    .    C    G    279.296    PASS     AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678     GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT     0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1    ID1=GOOD    ID2=139    ID3=het    ID4=8

desired output
Code:
##INFO=<ID=ANN,Number=1,Type=Integer,Description="My custom annotation">
##source_20170530.1=vcf-annotate(r953) -a /home/cmccabe/Desktop/NGS/bed/vcf/annotations.bed.gz -d key=INFO,ID=ANN,Number=1,Type=Integer,Description=My custom annotation -c CHROM,FROM,TO,INFO/ANN
##INFO=<ID=,Number=A,Type=Float,Description="Variant quality">
###INFO=<ID=,Number=A,Type=Float,Description="Variant quality">
###INFO=<ID=ID1,Type=Integer,Description="Variant quality">
###INFO=<ID=ID2,Type=String,Description="Reads">
###INFO=<ID=ID3,Type=String,Description="Zygosity">
###INFO=<ID=ID4,Type=Integer,Description="Score">
#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    00-0000-xx
chr1    948846    .    T    TA    529.927    PASS    AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    ID1=GOOD    ID2=103    ID3=hom    ID4=16
chr1    948870    .    C    G    279.296    PASS    AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678    GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1    ID1=GOOD    ID2=139    ID3=het    ID4=8


Last edited by cmccabe; 06-20-2017 at 11:36 AM.. Reason: fixed format
# 2  
Old 06-20-2017
Try this adaptation:
Code:
awk -v OFS="\t" '!/^#/{for(i=11; i<=14; i++) $i="ID" i "=" $i}1'  file
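A minimal sketch of why this keeps the # lines, using a made-up three-line sample (the shortened field values and the SAMPLE column name are hypothetical, not taken from the real file). Lines that start with # do not match `!/^#/`, so the loop is skipped for them. The trailing `1` is an always-true pattern with the default action `print`, so every line is written out whether or not it was modified.

```shell
# Build a small tab-separated sample: two header lines and one 14-field data line.
tmp=$(mktemp)
{
  printf '##INFO=<ID=ID1,Type=Integer,Description="Variant quality">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n'
  printf 'chr1\t948846\t.\tT\tTA\t529.927\tPASS\tAF=0.97\tGT:GQ\t1/1:90\tGOOD\t103\thom\t16\n'
} > "$tmp"

# Same command as above: only non-# lines enter the loop,
# and the trailing 1 prints every line, headers included.
result=$(awk -v OFS="\t" '!/^#/{for(i=11; i<=14; i++) $i="ID" i "=" $i}1' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Because the header lines are never assigned to, awk does not rebuild them and they come out byte for byte as they went in, spaces included.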

This User Gave Thanks to Scrutinizer For This Post:
# 3  
Old 06-23-2017
Thank you very much :).
# 4  
Old 06-24-2017
Hello cmccabe,

Same code as Scrutinizer's; the only difference is that the following code adds the strings ID1=, ID2=, ID3= and ID4= to the output, where the code in post #2 produces ID11= through ID14=.
Code:
awk -v OFS="\t" '!/^#/{for(i=11; i<=14; i++) $i="ID" ++q "=" $i;q=""}1'   Input_file

Thanks,
R. Singh
These 2 Users Gave Thanks to RavinderSingh13 For This Post:
# 5  
Old 06-25-2017
Ravinder is correct. It can also be corrected like so:
Code:
awk -v OFS="\t" '!/^#/{for(i=11; i<=14; i++) $i="ID" i-10 "=" $i}1' file
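To compare the three loops from posts #2, #4 and #5 side by side, here is a small sketch run on one made-up data line (the shortened field values are hypothetical, and only fields 11 and 14 are printed to keep the output short):

```shell
# One hypothetical tab-separated data line with 14 fields.
line=$(printf 'chr1\t948846\t.\tT\tTA\t529.927\tPASS\tAF=0.97\tGT:GQ\t1/1:90\tGOOD\t103\thom\t16')

# Post #2: the label is the field number itself (11..14).
by_field=$(printf '%s\n' "$line" |
  awk -v OFS="\t" '{for(i=11; i<=14; i++) $i="ID" i "=" $i; print $11, $14}')

# Post #4: a separate counter q supplies the label; q="" runs after the loop
# (it is not part of the loop body) and resets the counter for the next line.
by_counter=$(printf '%s\n' "$line" |
  awk -v OFS="\t" '{for(i=11; i<=14; i++) $i="ID" ++q "=" $i; q=""; print $11, $14}')

# Post #5: the label is derived from the field number (i-10 gives 1..4).
by_offset=$(printf '%s\n' "$line" |
  awk -v OFS="\t" '{for(i=11; i<=14; i++) $i="ID" i-10 "=" $i; print $11, $14}')

printf 'field number: %s\ncounter:      %s\noffset:       %s\n' \
  "$by_field" "$by_counter" "$by_offset"
```

The counter and offset variants give the same labels. The offset form needs no extra variable, since subtraction binds tighter than string concatenation in awk and `"ID" i-10` is read as `"ID" (i-10)`.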

This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 06-26-2017
Thank you both :).