awk to add text to matching pattern in field
Post 303011797 by cmccabe on Tuesday 23rd of January 2018 08:14:35 PM
Thank you very much rdrtx1, that works perfectly.

Don Cragun, you are correct in that:

Quote:
The code you showed us seems like it would produce the desired number of additions to your field #9, but would omit the desired question marks and just append :p= to each subfield. The code to which you appended the comment:
I forgot that I changed the line
sub(/$/, ":p=", NM[i]) # add :p= to the end of each NM[i] before the ;
to
sub(/$/, ":p=?", NM[i]) # add :p=? to the end of each NM[i] before the ;

However, the :p=? seemed to be appended repeatedly, once more for each split. "Iterating" may be the wrong terminology, but I couldn't work out why it happened, no matter what I tried. Thank you for the correction on the array being a string; I was confused.

awk
Code:
awk '
   BEGIN { FS=OFS="\t" }  # define FS and OFS as tab and start processing
   $9 ~ /NM/ {            # look for pattern NM in $9
      # split $9 by ";" and cycle through them
      out=""
      i=split($9,NM,/;/)
      for (n=1; n<=i; n++) {
         sub(/$/, ":p=", NM[i])   # add :p= to the end of each NM[i] before the ;
         out = (out=="" ? "" : out";") NM[i]
      }
      $9 = out
   }1' file

R_Index	Chr	Start	End	Ref	Alt	Func.refGene	Gene.refGene	GeneDetail.refGene	Inheritance	ExonicFunc.refGene	AAChange.refGene
1	chr1	155870416	155870416	G	A	splicing	RIT1	NM_006912:exon6:c.430-7C>T:p=;NM_006912:exon6:c.430-7C>T:p=:p=;NM_006912:exon6:c.430-7C>T:p=:p=:p=
9	chr10	112760138	112760138	A	-	splicing	SHOC2	NM_001269039:exon2:c.704-35A>-:p=;NM_001269039:exon2:c.704-35A>-:p=:p=
11	chr18	53070914	53070914	G	A	exonic	TCF4	.	AD	nonsynonymous SNV	TCF4:NM_001243232:exon1:c.32C>T:p.A11V;TCF4:NM_001306208:exon1:c.32C>T:p.A11V
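
The repeated suffixes in rows 1 and 9 above are consistent with the loop indexing NM[i] rather than NM[n]: i holds the number of pieces returned by split(), so every pass modifies and re-appends the last piece, which gains one more :p= each time. A minimal sketch of the same loop indexed by n follows. The function name append_suffix, the variable cnt, and the sample input line are hypothetical, chosen only for illustration, and the suffix is passed in as an argument rather than hard-coded.

```shell
# Hypothetical helper: append a suffix to every ";"-separated piece of
# tab-separated field 9, on lines whose field 9 contains NM.
append_suffix() {
  awk -v sfx="$1" '
    BEGIN { FS = OFS = "\t" }
    $9 ~ /NM/ {
      out = ""
      cnt = split($9, NM, /;/)       # cnt = number of pieces
      for (n = 1; n <= cnt; n++) {
        NM[n] = NM[n] sfx            # index with n, the loop variable
        out = (out == "" ? "" : out ";") NM[n]
      }
      $9 = out
    }
    1'
}

# Made-up sample line with nine tab-separated fields.
printf 'a\tb\tc\td\te\tf\tg\th\tNM_1:x;NM_2:y;NM_3:z\n' | append_suffix ':p=?'
```

With this sample, field 9 comes out as NM_1:x:p=?;NM_2:y:p=?;NM_3:z:p=?, one suffix per piece.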

In rdrtx1's awk, are my comments below close?

Code:
awk '
  BEGIN { FS=OFS="\t" }  # define FS and OFS as tab and start processing
  $9 ~ /NM/ { # look for pattern NM in $9
       gsub(";", ":p=?;", $9);  # split by ; in $9
       sub("$", ":p=?", $9);  # add :p=? to end of each split by ;
  } 1' file  # update input

Thank you very much.
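
On the annotated script above: gsub() and sub() do not split anything. As a rough illustration on a plain string (the function name show_steps and the sample value are made up for this sketch), gsub() rewrites every ; as :p=?; and returns the number of replacements, and sub() with the $ anchor then appends one final :p=? to the end of the string. The trailing 1 in the script is an always-true pattern whose default action prints each record, modified or not.

```shell
# Hypothetical demo: run the two substitutions on a plain string
# and print the intermediate results.
show_steps() {
  awk 'BEGIN {
    s = "NM_1:x;NM_2:y;NM_3:z"      # made-up sample value
    n = gsub(/;/, ":p=?;", s)       # replace every ";" and count the replacements
    print n " replacements: " s
    sub(/$/, ":p=?", s)             # append once, at the end of the string
    print "after sub: " s
  }'
}

show_steps
```

The first line printed shows two replacements with the last piece still bare; the second shows the last piece tagged as well.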
 
