chr12:12006495-chr15:88483984 Fusion Gain-of-Function ETV6NTRK3-E4N15 1868
chr12:12022903-chr15:88483984 Fusion Gain-of-Function ETV6NTRK3-E5N15 414833
chr17 entire line.... (this line does not have the keyword in it so is printed in the output)
chr10 entire line... (this line does not have the keyword in it so is printed in the output)
Would you be able to comment each line so I may better understand? Thank you very much.
Code:
awk '/SVTYPE=Fusion/{ # regex to look in line for SVTYPE=Fusion
match($5,/].*]/); # grab the $5 value after the ] but before the outer ] (chr15:88483984)
sub(/.COS.*/,"",$3); # split on the . before COS and grab the value in $3 (ETV6-NTRK3.E4N15)
sub(/-/,"",$3); # split on the - and grab the text to the left (ETV6)
sub(/\./,"-",$3); # split on the - and grab the text to the right and add a hyphen before (-NTRK3)
VAL=substr($5,RSTART+1,RLENGTH-2); # not sure haven't got the concept of RSTART and RLENGTH
num=split($8, array,";"); # read the $8 value into array up to the ; (SVTYPE=Fusion)
for(i=1;i<=num;i++){ # create loop to iterate next line
if(array[i] ~ /SVTYPE/){ # check array
sub(/.*=/,"",array[i]); # matches in next line
svtype=array[i] # define match criteria
};
if(array[i] ~ /READ_COUNT/){ # store value of READ_COUNT in array[i]
sub(/.*=/,"",array[i]); # value of READ_COUNT in array[i] (1868)
read_count=array[i] # define match and print to line
}
};
match($0,/oncomineGeneClass.*,/); # grab (Gain-of-Function)
print $1":"$2 "-" VAL OFS svtype OFS substr($0,RSTART+20,RLENGTH-22) OFS $3 OFS read_count; # print desired output
}
' Input_file
I do not fully understand RSTART and RLENGTH, but I believe they define the index of a field.
Last edited by cmccabe; 05-12-2017 at 08:29 PM..
Reason: added comments to code as what I think I understand
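On the RSTART and RLENGTH question: they are not field indexes. After a call to match(string, regex), awk sets RSTART to the 1-based character position where the match begins in that string and RLENGTH to the length of the match in characters (0 and -1 when nothing matches). So substr($5,RSTART+1,RLENGTH-2) returns the matched text without its first and last character, here the two ] brackets. Likewise, sub() does not split anything: it replaces the first match of the regex with the given string, editing the variable in place. A minimal sketch follows; the values of alt and id are hypothetical stand-ins for $5 and $3, since the input file is not shown.

```shell
alt='G]chr15:88483984]'          # hypothetical stand-in for $5
id='ETV6-NTRK3.E4N15.COSF571'    # hypothetical stand-in for $3

result=$(awk -v s="$alt" -v id="$id" 'BEGIN {
    # match() sets RSTART (start position) and RLENGTH (match length)
    if (match(s, /].*]/))
        val = substr(s, RSTART + 1, RLENGTH - 2)   # drop the two brackets
    print RSTART, RLENGTH, val

    sub(/.COS.*/, "", id)   # delete from the character before COS to the end
    sub(/-/, "", id)        # delete the first hyphen
    sub(/\./, "-", id)      # replace the first dot with a hyphen
    print id
}')
printf '%s\n' "$result"
```

With these sample values the first line printed is "2 16 chr15:88483984" and the second is "ETV6NTRK3-E4N15". The later substr($0,RSTART+20,RLENGTH-22) works the same way: it skips the first 20 characters of the oncomineGeneClass match and drops the last two.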
hi..im new to UNIX...
ok i have this information in the normal shell...
there are 2 lines display like this:
h@hotmail.com
k@hotmail.com
i want it to display like this with a space between them
h@hotmail.com k@hotmail.com
the information is stored in a text file....
anyone... (10 Replies)
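One possible way to join the lines, sketched with the standard paste utility; the temporary file here is a hypothetical stand-in for the poster's text file.

```shell
emails=$(mktemp)                       # hypothetical stand-in for the text file
printf '%s\n' 'h@hotmail.com' 'k@hotmail.com' > "$emails"

# -s joins all lines of the file into one; -d ' ' uses a space as the separator
joined=$(paste -s -d ' ' "$emails")
printf '%s\n' "$joined"

rm -f "$emails"
```

For the two sample lines this prints "h@hotmail.com k@hotmail.com".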
Hi All,
I have 1 "keyword" file like this:
00-1F-FB-00-04-18
00-19-CB-8E-66-DF
00-1F-FB-00-48-9C
00-1F-FB-00-AA-4F
....
and the 2nd "details" file like this:
Wed Feb 11 00:00:02 2009
NAS-IP-Address = xxxxxxxxxxxxxxxxxx
Class = "P1-SHT-AAA01;1233704662;4886720"
... (6 Replies)
Hello, can someone help me how to find a word and 2 lines after it and then send the output to another file.
For example, here is myfile1.txt. I want to search for "Error" and 2 lines below it and send it to myfile2.txt
I tried with grep -A but it's not supported on my system.
I tried with awk,... (4 Replies)
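Where grep -A is unavailable, a counter in awk can do the same job. This is a minimal sketch: the sample contents of the input file are made up for illustration, and the temporary files stand in for myfile1.txt and myfile2.txt.

```shell
src=$(mktemp)   # stand-in for myfile1.txt (contents below are hypothetical)
dst=$(mktemp)   # stand-in for myfile2.txt
printf '%s\n' 'ok 1' 'Error: disk full' 'detail a' 'detail b' 'ok 2' > "$src"

# On a matching line set the counter to 3 (the line itself plus 2 after it);
# print and count down while the counter is positive.
awk '/Error/ { n = 3 } n > 0 { print; n-- }' "$src" > "$dst"

result=$(cat "$dst")
printf '%s\n' "$result"
rm -f "$src" "$dst"
```

If a second "Error" appears inside the two trailing lines, the counter is reset, so overlapping matches are not printed twice.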
Hello everyone,
Maybe somebody could help me with an awk script.
I have this input (field separator is comma ","):
547894982,M|N|J,U|Q|P,98,101,0,1,1
234900027,M|N|J,U|Q|P,98,101,0,1,1
234900023,M|N|J,U|Q|P,98,54,3,1,1
234900028,M|H|J,S|Q|P,98,101,0,1,1
234900030,M|N|J,U|F|P,98,101,0,1,1... (2 Replies)
URGENT HELP IS NEEDED!!
I am looking to move matching lines (01 - 07) from File1 and 77 tab the matching string from File2, to File3.txt. I am almost done but
- Currently, script is not printing lines to File3.txt in order.
- Also the matching lines are not moving out of File1.txt
... (1 Reply)
If a file has following kind of data, comma delimited
1,2,3,4
1
1
1,2,3,4
1,2
2
2,3,4
My required output must have only 4 columns with comma delimited
1,2,3,4
111,2,3,4
1,222,3,4
I have tried many awk commands using ORS="" but couldn't progress (10 Replies)
I am trying to combine lines with these conditions:
1. First line starts with text of "libname VALUE db2 datasrc" where VALUE can be any text.
2. If condition1 is met then continue to combine lines through a line that ends with a semicolon.
3. Ignore case when matching patterns and remove any... (5 Replies)
I'm trying to use awk to count the occurrences of two matching fields of a CSV file.
For instance, for data that looks like this...
Joe,Blue,Yes,No,High
Mike,Blue,Yes,Yes,Low
Joe,Red,No,No,Low
Joe,Red,Yes,Yes,Low
I've been trying to use code like this...
countvar=`awk ' $2~/$color/... (4 Replies)
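A likely cause of trouble in code of this shape is that the shell does not expand $color inside a single-quoted awk program. Passing the values in with -v avoids that. The sketch below assumes the two fields to match are the name (field 1) and the color (field 2); that choice is a guess, since the post is cut off.

```shell
data=$(mktemp)   # temporary file holding the sample rows from the post
printf '%s\n' 'Joe,Blue,Yes,No,High' 'Mike,Blue,Yes,Yes,Low' \
              'Joe,Red,No,No,Low' 'Joe,Red,Yes,Yes,Low' > "$data"

name='Joe'       # assumed search values
color='Red'

# -v copies the shell variables into awk variables; c + 0 prints 0 when nothing matches
countvar=$(awk -F, -v name="$name" -v color="$color" \
    '$1 == name && $2 == color { c++ } END { print c + 0 }' "$data")
printf '%s\n' "$countvar"

rm -f "$data"
```

For the sample rows this prints 2. Using == compares whole fields; swap in ~ if a regex match is wanted instead.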
I am trying to combine all matching lines in the tab-delimited file using awk. The below runs but produces no output. Thank you :).
input
chrX 110925349 110925532 ALG13
chrX 110925349 110925532 ALG13
chrX 110925349 110925532 ALG13
chrX 47433390 47433999 SYN1... (3 Replies)
I have been searching and trying to come up with an awk that will perform the following on a
converted text file (original is a pdf).
1. Since the first two lines are (begin with) text they are removed
2. if $1 is a number then all text is merged (combined) into one line until the next... (3 Replies)