Modify text file using awk


 
# 8  
Old 05-23-2013
Please use code tags in lieu of quote tags!

Try this:
Code:
awk     '       {n  = split ($8,TMP,";")
                 $8 = ""
                 for (i=1; i<=n; i++)
                    if (match (TMP[i], /^DP=|^MQ=|^resource.EFF/))
                        {sub  (/^.*=/, "", TMP[i])
                         gsub (/[()\|]+/, " ", TMP[i])
                         $8=$8 ($8?"\t":"") TMP[i]
                        }
                }
         1
        '  FS="\t" file
chr1 412573 . A C 2758.77 . 71    58.36    INTERGENIC MODIFIER  GT:ADP:GQ:PL 1/1:0,71:71:99:2787,214,0 GATKSAM
       
chr1 602567 rs21953190 A G 5481.77 . 152    59.09    SYNONYMOUS_CODING LOW SILENT gaT/gaC D1034 ADNP2 protein_coding CODING ENSCAFT00000000008 5  GT:ADP:GQ:PL 1/1:0,151:151:99:5510,430,0 GATKSAM

# 9  
Old 05-23-2013
Ubuntu

Thanks, I have a similar fix that does what I need. Here is the code:

Code:
awk 'BEGIN{OFS="\t"}{n=split ($8,TMP,";"); $8=""; for (i=1; i<=n; i++) if (match (TMP[i], /^DP=|^MQ=|^resource.EFF=/)) {sub (/^.*=/, "" ,TMP[i]); $12=$12 ($12?"\t":"") TMP[i]} {gsub("[\\|()]", "\t")}}1' file

But I have a new problem. This is the output:

Code:
chr1	    901534	rs21932296	   T	G	34.77	0/1:3,2:5:63:63,0,64	GATKSAM	5	55.21	INTRON	MODIFIER				CTDP1	protein_coding	CODING	ENSCAFT00000000012	11

Note that after MODIFIER there is a series of empty tab-separated fields. When I pipe this output into another awk command to perform some other action, namely
Code:
awk 'BEGIN{OFS="\t"}{split ($7,TMP,":"); $7= TMP[1]}1'

it collapses the multiple empty tabs into a single tab and gives output like this:

Code:
chr1	    901534	rs21932296	  T	G	34.77	 0/1	GATKSAM	5	55.21	INTRON	MODIFIER	CTDP1	protein_coding	CODING	ENSCAFT00000000012	11


I don't want the multiple tabs to be replaced by a single tab. Could you help me see where I'm going wrong?

Last edited by mehar; 05-23-2013 at 03:48 PM..
# 10  
Old 05-23-2013
Assigning to or modifying any field makes awk rebuild $0, eliminating empty fields. I don't know how to circumvent that, except by running through all the fields and assigning a space to all the empty ones; I'm not sure that will work, by the way.
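One way to see where the empty fields go is to compare field counts. With the default FS, a run of tabs or blanks counts as a single separator, so the empty columns are never seen as fields in the first place and cannot be written back when $0 is rebuilt. The sketch below is only an illustration; the sample line and the helper names count_default and count_tab are made up for it.

```shell
# Hypothetical sample: INTRON, MODIFIER, three empty fields, CTDP1.
line=$(printf 'INTRON\tMODIFIER\t\t\t\tCTDP1')

# Default FS: any run of blanks or tabs is one separator.
count_default() { printf '%s\n' "$1" | awk '{ print NF }'; }

# FS set to a single tab: every tab delimits a field, empty or not.
count_tab() { printf '%s\n' "$1" | awk -F'\t' '{ print NF }'; }

count_default "$line"   # prints 3
count_tab "$line"       # prints 6
```

The three empty fields only exist for awk in the second case, so only there can they survive a field assignment.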
# 11  
Old 05-23-2013
You're only setting the output field separator to tab; you also need to set the input field separator. Change BEGIN{OFS="\t"} to BEGIN{FS=OFS="\t"}.
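A minimal sketch of that change applied to the second command, assuming the input really is delimited by single tabs (leading blanks or mixed tab/space separators would behave differently). The function name trim_genotype and the shortened 13-column record are hypothetical.

```shell
# Keep only the text before the first ":" in column 7.
# FS and OFS are both a tab, so empty columns are read and written back as-is.
trim_genotype() {
    awk 'BEGIN { FS = OFS = "\t" }
         { split($7, TMP, ":"); $7 = TMP[1] }
         1'
}

# Hypothetical shortened record with three empty columns after MODIFIER.
printf 'chr1\t901534\trs21932296\tT\tG\t34.77\t0/1:3,2:5:63:63,0,64\tGATKSAM\tMODIFIER\t\t\t\tCTDP1\n' |
    trim_genotype
```

On this input, column 7 becomes 0/1 and the four tabs between MODIFIER and CTDP1 are left in place.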
# 12  
Old 05-23-2013
Well, I have tried that too, but it still gives the same output.

---------- Post updated at 02:05 PM ---------- Previous update was at 01:59 PM ----------

Okay. Then could you help me do it in the following way? Given this input line:

Code:
chr1	758630	.	T	TC	2221.73	.	AC=2;AF=1.00;AN=2;DP=61;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=51.14;MQ0=0;QD=36.42;RPA=1,2;RU=C;STR;resource.EFF=INTRON(MODIFIER||||PQLC1|protein_coding|CODING|ENSCAFT00000000011|2)	GT:AD:DP:GQ:PL	1/1:0,55:61:99:2259,165,0	GATKSAM

Could you modify the given code so that it transforms the line as follows:

Code:
chr1	758630	.	T	TC	2221.73	.	61 51.14 INTRON MODIFIER NA  NA   NA PQLC1 protein_coding CODING ENSCAFT00000000011 2	GT:AD:DP:GQ:PL	1/1:0,55:61:99:2259,165,0	GATKSAM

i.e., replacing the empty fields between the |||| with NA?
# 13  
Old 05-23-2013
Why don't you extend your similar fix in post #9 to do what you require above? It should be easily doable.
# 14  
Old 05-23-2013
Sorry if I have asked some stupid questions. I have tried doing that but couldn't work out the logic, as I'm a novice at awk.
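For reference, one possible way to extend the earlier code is sketched below. It is not a reply from the thread, only an illustration under some assumptions: the input is tab-delimited, the INFO string is in column 8, the extracted values are joined with tabs (the requested output shows blanks), and the function name fill_na is made up. The idea is to split the EFF value on "|" so that each empty piece can be replaced by NA.

```shell
# Replace column 8 with the DP, MQ and resource.EFF values; empty EFF
# sub-fields become "NA". Assumes tab-delimited input (an assumption).
fill_na() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        n = split($8, TMP, ";")
        out = ""
        for (i = 1; i <= n; i++) {
            if (TMP[i] !~ /^(DP|MQ|resource\.EFF)=/) continue
            sub(/^[^=]*=/, "", TMP[i])   # drop the "KEY=" prefix
            sub(/\)$/, "", TMP[i])       # drop the closing parenthesis
            sub(/\(/, "|", TMP[i])       # treat "(" as one more delimiter
            m = split(TMP[i], P, "[|]")  # empty pieces are kept by split
            for (j = 1; j <= m; j++)
                out = out (out == "" ? "" : OFS) (P[j] == "" ? "NA" : P[j])
        }
        $8 = out
    }
    1'
}

# The sample line from post #12.
printf 'chr1\t758630\t.\tT\tTC\t2221.73\t.\tAC=2;AF=1.00;AN=2;DP=61;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=51.14;MQ0=0;QD=36.42;RPA=1,2;RU=C;STR;resource.EFF=INTRON(MODIFIER||||PQLC1|protein_coding|CODING|ENSCAFT00000000011|2)\tGT:AD:DP:GQ:PL\t1/1:0,55:61:99:2259,165,0\tGATKSAM\n' |
    fill_na
```

Because the new values are joined with OFS inside column 8, the output line has more tab-separated columns than the input; later commands that address columns by number would need to account for that.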