awk to update field using matching value in file1 and substring in field in file2
In the awk below I am trying to set/update the value of $14 in file2 (shown in
bold) by matching the NM_ in $12 or $9 of file2
with the NM_ in $2 of file1.
The lengths of $9 and $12 can vary, but what is consistent is that the start pattern
is always NM_ and the end pattern is always ; (semicolon) or a break (if it is the last value).
What is extracted into $14 is all the text from the start to the end (the string from the NM_ up to the ; or
break).
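The extraction described above could be sketched roughly like this. It is only a minimal illustration, not the awk from the post: the function name extract_nm and the sample strings are made up, and it assumes the accession appears in the field exactly as passed in (a plain index() lookup would also hit a longer accession that merely starts with the same characters).

```shell
# Hypothetical helper: print the text from the given NM_ accession up to
# the next ";" or, if there is none, up to the end of the string.
extract_nm() {
  # $1 = field text (e.g. the content of $9 or $12), $2 = accession to look for
  awk -v text="$1" -v id="$2" 'BEGIN {
    start = index(text, id)            # position of the accession, 0 if absent
    if (!start) exit 1                 # nothing to extract
    rest = substr(text, start)         # drop everything before the NM_
    end  = index(rest, ";")            # first ";" after the NM_, 0 if it is the last value
    print (end ? substr(rest, 1, end - 1) : rest)
  }'
}
```

For example, extract_nm 'NM_000000:c.1A>G;NM_003036:exon1:c.G1A' NM_003036 would take the last value, where there is no trailing ; to stop at.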
The value in $7 determines the field to use: if $7 is exonic,
then $12 is used to extract from; if $7 is not exonic, then $9 is used. There will always be a value in $7, and it is exonic the majority of the time, but not always.
I added comments to each line of my attempt describing what I think is happening. I hope it is close, or at least a start. Thank you.
awk
file1 space delimited
file2 tab-delimited
desired output tab-delimited
Last edited by cmccabe; 06-17-2017 at 04:27 PM..
Reason: fixed format
The NM_ value of $2 in file1, after splitting on the ., will match an NM_ substring in $12 (the majority of the time) or in $9 (in some cases).
The matching substring is extracted starting from the NM_ up to the ; or break (if it is the last value, as in case 1 in the example).
The text in $7 of file2 determines the field to extract from: if $7 is exonic, use $12, but if $7 is not exonic, use $9.
The extracted value is used to update $14 from a . to the extracted value. Thank you very much.
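A rough sketch of how these rules might be put together in one awk call follows. It is not the original attempt, and several things are assumptions: the function name update_field14 is made up, any NM_ from file1 is allowed to match any line of file2 (not only the line in the same position), and the accession in $9/$12 is taken to be NM_ followed by digits.

```shell
# Hypothetical wrapper: $1 = file1 (space-delimited), $2 = file2 (tab-delimited)
update_field14() {
  awk '
    # file1: store each NM_ accession from $2 with the ".version" part removed
    NR == FNR { split($2, acc, /\./); ids[acc[1]]; next }

    # file2: choose the source field from $7, then scan its ";"-separated values
    {
      src = ($7 == "exonic") ? $12 : $9
      n = split(src, part, ";")
      for (i = 1; i <= n; i++) {
        if (match(part[i], /NM_[0-9]+/)) {
          id = substr(part[i], RSTART, RLENGTH)
          if (id in ids) {              # accession is known from file1
            $14 = substr(part[i], RSTART)   # text from NM_ to the ; or end
            break
          }
        }
      }
      print
    }
  ' "$1" FS='\t' OFS='\t' "$2"   # FS/OFS switch to tab only once file2 is read
}
```

It would be called as update_field14 file1 file2 > output. Lines of file2 with no match are printed unchanged, so $14 stays as the . it already had.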
Last edited by cmccabe; 06-18-2017 at 08:39 AM..
Reason: added details
I have two files which are the output of a multiple choice vocab test (60 separate questions) from 104 people (there are some missing responses) and the question list. I have the item list in one file (File1)
Item,Stimulus,Choice1,Choice2,Choice3,Choice4,Correct... (5 Replies)
Trying to use awk to:
update $2 in file2 with the $2 value in file1, if $1 in file1 matches $13 in file2, which is tab-delimited. The $2 values may already be the same, so in that case nothing happens and the next line is processed.
There are exactly 4,605 unique $13 values. Thank you :).
... (4 Replies)
I am trying to use awk to find all the $2 values in file2 which is ~30MB and tab-delimited, that are between $2 and $3 in file1 which is ~2GB and tab-delimited.
I have just found out that I need to use $1 and $2 and $3 from file1 and $1 and $2 of file2 must match $1 of file1 and be in the range... (6 Replies)
Dear All,
Need your help..:D
I am not regular on shell scripts..:(
I have 2 files..
Content of file1
cellRef 4};"4038_2_MTNL_KALAMBOLI"
cellRef 1020};"4112_3_RAINBOW_BLDG"
cellRef 134};"4049_2_TATA_HOSPITAL"
cellRef 1003};"4242_3_HITESH_CONSTRUCTION"
cellRef... (6 Replies)
Hi Friends,
I have a file1 as below
file1
1|ndmf|fdd|d3484|34874
2|jdehf|wru7|478|w489
3|dfkj|wej|484|49894
file2 contains lakhs of records and is not in sorted order
I want to retrieve only the records from file2 by searching on the first field of file1
I used
grep ^1 file2... (4 Replies)
I have very limited coding skills but I'm wondering if someone could help me with this. There are many threads about matching strings in two files, but I have no idea how to add a column from one file to another based on a matching string.
I'm looking to match column1 in file1 to the number... (3 Replies)
Hello,
I was hoping someone could help me with this work related problem...
basically what I want to do is the following:
file2:
1 o
2 t
4 f
5 v
7 n
8 e
10 a
file1:
1 : (8 Replies)
First, thanks for the help in previous posts... couldn't have gotten where I am now without it!
So here is what I have, I use AWK to match $1 and $2 as 1 string in file1 to $1 and $2 as 1 string in file2. Now I'm wondering if I can extend this AWK command to incorporate the following:
If $1... (4 Replies)
Hi All,
I have file1 line below:
$myName$|xxx
Now I need to read file1, find $myName$ in file2, and replace it with xxx
file1:
$myName$|xxx
file2:
My name is $myName$
expected output in file2 after executing the script is below:
my name is xxx
Thanks, (8 Replies)
File1 row is same as column 2 in file 2.
Also file 2 will either start with A, B or C.
And 3rd column in file 2 is always F2.
When column 2 of file 2 matches file1 column, print all those rows into a separate file.
Here is an example.
file 1:
100
103
104
108
file 2:
... (6 Replies)