Sponsored Content
Top Forums Shell Programming and Scripting awk to add value and text to specific lines Post 303008287 by Don Cragun on Wednesday 29th of November 2017 10:29:24 PM
Old 11-29-2017
Assuming that your sample data is representative, the following seems to do what you want:
Code:
awk '
BEGIN {	FS = OFS = "\t"
}
NF == 10 {
	split($8, a, /[=;]/)
	$11 = $12 = $13 = $14 = $15 = $18 = "."
	$16 = (a[1] == "DP") ? a[2] : "DP=num_Missing"
	$17 = "homref"
}
1' file

Note that this assumes that lines that are to be modified only have 10 fields (i.e., 9 <tab>s) as in your sample. It verifies that field #8 does start with DP= and assigns a diagnostic message to field #16 if it isn't. This seemed safer to me than blindly assuming that the input file is correctly formatted. (And, I added a line to my sample input file to be sure that it worked as I expected. It didn't the first two times I tried it.)

When the contents of file is:
Code:
##bcftools_normVersion=1.3.1+htslib-1.3.1
##bcftools_normCommand=norm -m-both -o genome_split.vcf genome.vcf.gz
##bcftools_normCommand=norm -f /home/cmccabe/Desktop/NGS/picard-tools-1.140/resources/ucsc.hg19.fasta -o genome_annovar.vcf genome_split.vcf
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT
chr1	948797	.	C	.	0	PASS	DP=159;END=948845;MAX_DP=224;MIN_DP=95	GT:DP:MIN_DP:MAX_DP	0/0:159:95:224
chr1	948846	.	T	TA	1185.38	PASS	AF=0.986111;AO=200;DP=227;FAO=213;FDP=216;FDVR=0;FR=.;FRO=3;FSAF=132;FSAR=81;FSRF=3;FSRR=0;FWDB=-0.038073;FXX=0.0226234;HRUN=1;LEN=1;MLLD=24.748;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=21.9515;RBI=0.0905281;REFB=0.0706997;REVB=0.0821327;RO=17;SAF=122;SAR=78;SRF=15;SRR=2;SSEN=0;SSEP=0;SSSB=-0.0430195;STB=0.505617;STBP=0.201;TYPE=ins;VARB=0.000148797	GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR	1/1:219:227:216:17:3:200:213:0.986111:78:122:15:2:81:132:3:0	0.986	132	81	1	GOOD	216	homalt	36
ALT1	948798	.	D	.	1	PAST	END=948845;MAX_DP=224;MIN_DP=95;DP=159	GT:DP:MIN_DP:MAX_DP	1/1:160:96:225

then the output produced is:
Code:
##bcftools_normVersion=1.3.1+htslib-1.3.1
##bcftools_normCommand=norm -m-both -o genome_split.vcf genome.vcf.gz
##bcftools_normCommand=norm -f /home/cmccabe/Desktop/NGS/picard-tools-1.140/resources/ucsc.hg19.fasta -o genome_annovar.vcf genome_split.vcf
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT
chr1	948797	.	C	.	0	PASS	DP=159;END=948845;MAX_DP=224;MIN_DP=95	GT:DP:MIN_DP:MAX_DP	0/0:159:95:224	.	.	.	159	homref	.
chr1	948846	.	T	TA	1185.38	PASS	AF=0.986111;AO=200;DP=227;FAO=213;FDP=216;FDVR=0;FR=.;FRO=3;FSAF=132;FSAR=81;FSRF=3;FSRR=0;FWDB=-0.038073;FXX=0.0226234;HRUN=1;LEN=1;MLLD=24.748;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=21.9515;RBI=0.0905281;REFB=0.0706997;REVB=0.0821327;RO=17;SAF=122;SAR=78;SRF=15;SRR=2;SSEN=0;SSEP=0;SSSB=-0.0430195;STB=0.505617;STBP=0.201;TYPE=ins;VARB=0.000148797	GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR	1/1:219:227:216:17:3:200:213:0.986111:78:122:15:2:81:132:3:0	0.986	132	81	1	GOOD	216	homalt	36
ALT1	948798	.	D	.	1	PAST	END=948845;MAX_DP=224;MIN_DP=95;DP=159	GT:DP:MIN_DP:MAX_DP	1/1:160:96:225	.	.	.	DP=num_Missing	homref	.

Hopefully, this will come close to what you want.
This User Gave Thanks to Don Cragun For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extracting text out of specific lines

Hi, I have a file like LAHORE 2009-04-16 16:04:19 THU S5830 FAULT MESSAGE SUPPRESS STATUS LOC : ASP00 STS : SUPPRESSING CONTINUE INF : F6201 TRUNK. DATA FAULT REPORT COMPLETED LAHORE 2009-04-16 16:04:20 THU S8400 ISUP SIGNALLING TRACE -... (3 Replies)
Discussion started by: krabu
3 Replies

2. Shell Programming and Scripting

Insert a text from a specific row into a specific column using SED or AWK

Hi, I am having trouble converting a text file. I have been working for this whole day now, still i couldn't make it. Here is how the text file looks: _______________________________________________________ DEVICE STATUS INFORMATION FOR LOCATION 1: OPER STATES: Disabled E:Enabled ... (5 Replies)
Discussion started by: Issemael
5 Replies

3. Shell Programming and Scripting

[bash help]Adding multiple lines of text into a specific spot into a text file

I am attempting to insert multiple lines of text into a specific place in a text file based on the lines above or below it. For example, Here is a portion of a zone file. IN NS ns1.domain.tld. IN NS ns2.domain.tld. IN ... (2 Replies)
Discussion started by: cdn_humbucker
2 Replies

4. Shell Programming and Scripting

Assigning a specific format to a specific column in a text file using awk and printf

Hi, I have the following text file: 8 T1mapping_flip02 ok 128 108 30 1 665000-000008-000001.dcm 9 T1mapping_flip05 ok 128 108 30 1 665000-000009-000001.dcm 10 T1mapping_flip10 ok 128 108 30 1 665000-000010-000001.dcm 11 T1mapping_flip15 ok 128 108 30... (2 Replies)
Discussion started by: goodbenito
2 Replies

5. Shell Programming and Scripting

extracting specific text from lines

Hello, i've got this output text: and i need it to look something like this: which means that there won't be absolute path of each directory, just it's size and the last word after last '/' in each line, and i also don't need last line '1.7M /tmp' Looks like there is a simple... (5 Replies)
Discussion started by: krater559
5 Replies

6. Shell Programming and Scripting

Summing over specific lines and replacing the lines with the sum using sed, awk

Hi friends, This is sed & awk type question. I have a text file which has numbers spread all over the file. I want to sum the series of numbers whenever i find it and produce an output file with the sum. For example ###start of input text file #### abc def ghi 1 2 3 4 kjld random... (3 Replies)
Discussion started by: kaaliakahn
3 Replies

7. UNIX for Dummies Questions & Answers

Add strings from one file at the end of specific lines in text file

Hello All, this is my first post so I don't know if I am doing this right. I would like to append entries from a series of strings (contained in a text file) consecutively at the end of specifically labeled lines in another file. As an example: - the file that contains the values to be... (3 Replies)
Discussion started by: gus74
3 Replies

8. Shell Programming and Scripting

Delete lines above and below specific line of text

I'm trying to remove a specific number of lines, above and below a specific line of text, highlighted in red: <STMTTRN> <TRNTYPE>CREDIT <DTPOSTED>20151205000001 <TRNAMT>10 <FITID>667800001 <CHECKNUM>667800001 <MEMO>BALANCE </STMTTRN> <STMTTRN> <TRNTYPE>DEBIT <DTPOSTED>20151207000001... (8 Replies)
Discussion started by: bomsom
8 Replies

9. Shell Programming and Scripting

awk to skip lines find text and add text based on number

I am trying to use awk skip each line with a ## or # and check each line after for STB= and if that value in greater than or = to 0.8, then at the end of line the text "STRAND BIAS" is written in else "GOOD". So in the file of 4 entries attached. awk tried: awk NR > "##"' "#" -F"STB="... (6 Replies)
Discussion started by: cmccabe
6 Replies

10. Shell Programming and Scripting

awk to update specific value in file with match and add +1 to specific digit

I am trying to use awk to match the NM_ in file with $1 of id which is tab-delimited. The NM_ will always be in the line of file that starts with > and be after the second _. When there is a match between each NM_ and id, then the value of $2 in id is substituted or used to update the NM_. Each NM_... (3 Replies)
Discussion started by: cmccabe
3 Replies
All times are GMT -4. The time now is 07:04 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy