awk to update unknown value in file using range of another
I am trying to use awk to update all the unknown values in $6 of file2 if the $4 value in file2 is within the range given in $1 of file1. If $6 already holds a value other than unknown, that line is skipped and the next line is processed. In my awk attempt below, the final output is 6 tab-delimited fields. Thank you.
file1 (space-delimited)
file2 (tab-delimited)
desired output
--- the second and fourth unknown values are updated based on the $4 value and the range in $1 of file1 that it falls in
awk with current output
awk number 2, with output all on one line ---- sub(/unknown/,value[$1],$6)}1' hg19.txt input | column -t
I think I need a split($1,a,/[:-]/) but the key is not unique. Is there a better way?
The only way I can think of to make the key unique is below, though I am not sure how to implement it:
----- matching $2 values in file1 are combined, with the first line's rstart[a[1]]=a[2] being the start and the last line's rend[a[1]]=a[3] being the end
Last edited by cmccabe; 11-18-2016 at 12:42 PM..
Reason: added awk attempt 2
Yes, clearly you need to use the split() call to define the array a[] that you are using, but (as you noted) you can't use the elements of that array as subscripts in another array because the values are not unique. Instead, use an array of minimum values and an array of maximum values, both indexed by the line number in your first file.
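A minimal sketch of that idea, not a tested solution for your real data. The file names ranges.txt and input.txt, the sample rows, and the layout of file1 ($1 as chr:start-end, $2 as the name to write into $6) are all assumptions made for illustration. The chromosome part of $1 is ignored here because it is not clear which field of file2 it would be compared against.

```shell
# Hypothetical sample data: the real layouts were not shown, so these are assumptions.
# ranges.txt (space-delimited): $1 = chr:start-end, $2 = name
cat > ranges.txt <<'EOF'
chr1:100-200 GENE1
chr1:300-400 GENE2
EOF

# input.txt (tab-delimited): $4 = position, $6 = name or "unknown"
printf 'x\ty\tz\t150\tw\tunknown\n'  > input.txt
printf 'x\ty\tz\t250\tw\tunknown\n' >> input.txt
printf 'x\ty\tz\t350\tw\tGENE9\n'   >> input.txt
printf 'x\ty\tz\t400\tw\tunknown\n' >> input.txt

awk '
NR == FNR {                       # first file: remember every range by line number
    split($1, a, /[:-]/)          # a[1] = chr, a[2] = start, a[3] = end
    min[NR]  = a[2] + 0           # + 0 forces a numeric value
    max[NR]  = a[3] + 0
    name[NR] = $2
    n = NR                        # number of ranges stored
    next
}
$6 == "unknown" {                 # second file: only touch unknown values
    for (i = 1; i <= n; i++)
        if ($4 + 0 >= min[i] && $4 + 0 <= max[i]) {
            $6 = name[i]          # first matching range wins
            break
        }
}
{ print }                         # lines with no match are printed unchanged
' ranges.txt FS='\t' OFS='\t' input.txt > updated.txt
```

Setting FS and OFS between the two file names lets the first file be read with the default whitespace splitting and the second one as tab-delimited. The loop is linear in the number of ranges for every unknown line, which should be fine for a few thousand ranges but may need a smarter lookup for much larger files.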
But, I don't understand the output that you say should be produced. Why do you want the output to be (with all occurrences of four spaces in your output replaced by <tab> characters):
instead of:
?
Thank you very much for your help. So each line is indexed by the minimum and maximum of all matching values in file1 $2? I added what I hope is close to what you described. Thank you.
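For what it is worth, one possible reading of "minimum and maximum of all matching values in file1 $2" is to collapse the lines that share a $2 value into a single range. The sketch below only illustrates that reading. The file names ranges.txt and merged.txt and the sample rows are hypothetical, and it assumes $1 looks like chr:start-end.

```shell
# Hypothetical sample data (assumed layout: $1 = chr:start-end, $2 = name)
cat > ranges.txt <<'EOF'
chr1:100-200 GENE1
chr1:250-300 GENE1
chr2:50-80 GENE2
EOF

awk '
{
    split($1, a, /[:-]/)                  # a[2] = start, a[3] = end
    if (!($2 in lo)) {                    # first time this name is seen
        order[++n] = $2                   # remember input order for printing
        lo[$2] = a[2] + 0
        hi[$2] = a[3] + 0
    } else {                              # widen the stored range if needed
        if (a[2] + 0 < lo[$2]) lo[$2] = a[2] + 0
        if (a[3] + 0 > hi[$2]) hi[$2] = a[3] + 0
    }
}
END {
    for (i = 1; i <= n; i++)              # one line per name: name, min, max
        print order[i], lo[order[i]], hi[order[i]]
}
' ranges.txt > merged.txt
```

With the sample rows this writes one line per name holding the smallest start and the largest end seen for it. Note that any gap between two ranges with the same name (200 to 250 above) ends up inside the merged range, which may or may not be what is wanted.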
Last edited by cmccabe; 11-19-2016 at 06:54 PM..
Reason: added details