I am trying to use awk to update the tab-delimited file below based on 5 different rules/conditions. The final output is also tab-delimited, and each line in the file will meet one of the conditions. My attempt is below as well, though I am not very confident in it. Thank you.
Condition 1: The field Classification has a default value of "VUS" for all lines in the file
Condition 2: The CLINSIG field updates Classification with the value in it if it has a length < 12, else "Conflicting" is the result
- since it is possible for this field to have multiple strings in it, separated by the | symbol, I used the longest single value, "Likely Benign", and if the value in the field exceeds 12 characters
then "Conflicting" is the result
Condition 3: If Func.IDP.refGene = UTR then the value of Classification is Likely Benign,
unless CLINSIG had a value already
Condition 4: If PopFreqMax > .01 then the Classification is Likely Benign, else it is VUS,
unless CLINSIG had a value already
Condition 5: If Func.IDP.refGene = splicing AND GeneDetail.IDP.refGene has a +/- value > 10
then the value of Classification is Likely Benign, unless CLINSIG had a value already
Thank you.
file
Description of fields
desired output
Last edited by cmccabe; 01-21-2017 at 03:28 PM..
Reason: fixed format and added details
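One way the five conditions above could be sketched in awk is shown here. This is not the poster's attempt, only an illustration under several assumptions: line 1 of the file is a header containing the column names Func.IDP.refGene, GeneDetail.IDP.refGene, PopFreqMax, CLINSIG and Classification; an empty CLINSIG is written as "." or left blank; UTR values look like UTR3/UTR5; and the splicing offset appears in GeneDetail.IDP.refGene as a signed number such as +15. The function name classify and the variable max are made up for the example. Note that "Likely Benign" is 13 characters long, so with the stated "length < 12" rule it would be reported as Conflicting; pass a larger limit as the second argument if that value should pass through.

```shell
# classify FILE [MAXLEN] - hypothetical helper, prints the updated file to stdout
classify() {
  awk -F'\t' -v OFS='\t' -v max="${2:-12}" '
    NR == 1 {
      # map header names to column numbers (assumes a header line exists)
      for (i = 1; i <= NF; i++) col[$i] = i
      n = split("Func.IDP.refGene GeneDetail.IDP.refGene PopFreqMax CLINSIG Classification", need, " ")
      for (i = 1; i <= n; i++)
        if (!(need[i] in col)) { print "missing column: " need[i] > "/dev/stderr"; exit 1 }
      print; next
    }
    {
      fn      = $(col["Func.IDP.refGene"])
      detail  = $(col["GeneDetail.IDP.refGene"])
      freq    = $(col["PopFreqMax"])
      clinsig = $(col["CLINSIG"])

      cls = "VUS"                                   # condition 1: default
      if (clinsig != "" && clinsig != ".")          # condition 2: CLINSIG wins
        cls = (length(clinsig) < max) ? clinsig : "Conflicting"
      else if (fn ~ /UTR/)                          # condition 3
        cls = "Likely Benign"
      else if (freq + 0 > 0.01)                     # condition 4
        cls = "Likely Benign"
      else if (fn == "splicing" && match(detail, /[+-][0-9]+/) &&
               substr(detail, RSTART + 1, RLENGTH - 1) + 0 > 10)
        cls = "Likely Benign"                       # condition 5

      $(col["Classification"]) = cls
      print
    }
  ' "$1"
}
```

It could be run as `classify file > output`, or `classify file 14 > output` to use a different length limit. Looking the columns up by header name avoids hard-coding field numbers, which are not known here.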
I have a file in Unix in which the records look like this:
aaa 123 233
aaa 234 222
aaa 242 222
bbb 122 111
bbb 122 123
ccc 124 222
In the output I want only the records below:
aaa
ccc
The validation logic: the 1st and 2nd columns need to be considered
if both columns values are... (8 Replies)
Hi
I have files with date and time stamps as the folder names, like 200906051400, 200906051500, 200906051600, ... hence 24 files will be generated every day.
I need to do certain things on these 24 files daily.
The file contains data like
200906050016370 0 1244141195225298lessrv3 ... (13 Replies)
1. if the 1st row IDs of input1 (ID1/ID2.....) is equal to any IDNames of input2
print all relevant values together as defined in the output.
2. A bit tricky part is IDno in the output. All we need to do is numbering same kind of
letters as 1 (aa of ID1) and different letters as 2 (ab... (4 Replies)
I need to split the file
Conditions:
Ignore any record that either starts with 1 or 9
Split the file at position 404: if position 404 is abc or def then write all the records to a file > File 1, the remaining records should go into a file > File 2
Further I want to split the... (7 Replies)
If $1 in file1 matches $2 in file2. Then the value in $2 of file2 is updated to $1"."$2 of file2. The awk seems to only match the two files but not update. Thank you :).
awk
awk 'NR==FNR{A ; next} $1 in A { $2 = a }1' file1 file2
file1
name version
NM_000593 5
NM_001257406... (3 Replies)
I am trying to use awk to match two files that are tab-delimited. When a match is found between file1 $1 and file2 $4, $4 in file2 is updated using the $2 value in file1. If no match is found then the next line is processed. Thank you :).
file1
uc001bwr.3 ADC
uc001bws.3 ADC... (4 Replies)
The awk below will filter a list of 30,000 lines in the tab-delimited file. What I am having trouble with is adding a condition to SVTYPE=CNV
that will only print that line if CI= is > .05.
The other condition to add is if SVTYPE=Fusion, then in order to print that line
READ_COUNT must... (3 Replies)
In the awk below (thank you @RavinderSingh13 for the help), which is hopefully close, I am trying to update the value in $12 of the tab-delimited file2 with the matching value in $1 of the space-delimited file1. I have added comments for each line as well. Thank you :).
awk
awk '$12 ==... (10 Replies)
I have been reading old posts and trying to come up with a solution for the below: Use a tab-delimited input file to assign
points to variables that are used to update a specific field, Rank. I really couldn't find too much in the way of assigning points
to variables, but made an attempt at an awk... (4 Replies)
Trying to use awk to store the value of $5 in file1 in array x. That array x is then used to search $4 of file1 to find a match (I use x to skip the header in file1). Since $4 can have multiple strings in it separated by a , (comma), I split them and iterate through each split looking for a match.... (2 Replies)