awk to print text in field if match and range is met
In the awk below I am trying to match the value in $4 of file1 with the split value from $4 in file2. I store the value of $4 in file1 in A and the split value (split on the _) in array. I then store the value in $2 as min, the value in $3 as max, and the value in $1 as chr.
If A is equal to array, then I use the values stored in min, max, and chr to check whether there is overlap with the $2, $3, and $1 values in file2. If there is, overlap is printed; if there is not, missing is printed. I am trying to ensure that the lines match and that the coordinates are covered between file1 and file2. My actual data is several thousand lines, all in the format below, and a match should result for each line in file2. I commented the awk as well and hope it helps, as I am getting multiple syntax errors. Maybe there is a better way, but I wanted to try and see. Thank you.
is ensuring, or is supposed to ensure, that $4 in file1 matches the array split from file2. So using the first value RPS19 as an example, only those lines in file2 with RPS19 are used. Thank you.
okie dokie - interesting "construct" [A] - see my previous comments.
you probably meant: ($1 in A) as $1 and array[5] are the same in file2.
$7 = print "overlap" else "missing"
what is that supposed to mean/do?
I think you also are missing a } in your last block with the for
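For what it's worth, here is a minimal sketch of the matching described above. It is not the poster's script, which is not shown here. The sample rows, the tab-delimited layout (chr, start, end, name), and the idea that the gene name is the piece of $4 before the first _ in file2 are all assumptions, and `check_overlap` is just a hypothetical wrapper name.

```shell
# Hypothetical sample data; the real file layouts are not shown in the thread.
printf 'chr19\t42363900\t42364500\tRPS19\n' > file1
printf 'chr19\t42364000\t42364100\tRPS19_exon1\n' > file2
printf 'chr19\t50000000\t50000100\tRPS19_exon2\n' >> file2

# check_overlap FILE1 FILE2: append "overlap" or "missing" to each file2 line.
check_overlap() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR {                  # first file: remember chr/min/max per gene
            chr[$4] = $1
            min[$4] = $2 + 0         # +0 forces a numeric value
            max[$4] = $3 + 0
            next
        }
        {
            split($4, array, "_")    # assumed: gene name comes before the first "_"
            gene = array[1]
            # same gene, same chromosome, and the two intervals intersect
            if ((gene in chr) && $1 == chr[gene] &&
                ($2 + 0) <= max[gene] && ($3 + 0) >= min[gene])
                status = "overlap"
            else
                status = "missing"
            print $0, status
        }
    ' "$1" "$2"
}

check_overlap file1 file2
```

This assumes one interval per gene in file1; if a gene appears on several lines, only the last one is kept. If "covered" should mean the file2 interval lies entirely inside the file1 interval, the test would become `$2 >= min[gene] && $3 <= max[gene]` instead.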
In the awk below I am trying to print the entire line, along with the header row, if $2 is SNV or MNV or INDEL. If that condition is met or is true, and $3 is less than or equal to 0.05, then in $7 the sub pattern :GMAF= is found and the value after the = sign is checked. If that value is less than... (0 Replies)
I have a file.txt containing the following:
Query= HWI-ST863:386:C5Y8UACXX:3:2302:16454:89688 1:N:0:ACACGAAT
Length=100
Score E
Sequences producing significant alignments: (Bits) Value
... (2 Replies)
I am trying to use awk to find all the $2 values in file2 which is ~30MB and tab-delimited, that are between $2 and $3 in file1 which is ~2GB and tab-delimited.
I have just found out that I need to use $1, $2, and $3 from file1, and that $1 and $2 of file2 must match $1 of file1 and be in the range... (6 Replies)
In file1 field $18 is removed.... column header is "Otherinfo", then each line in file1 is used to search file2 for a match. When a match is found the last four strings in file2 are copied to file1.
Maybe:
cut -f1-17 file1 and then match each line to file2
file1
Chr Start End ... (6 Replies)
Trying to print the unique values in $2 before the -, currently the count is displayed. Hopefully, the below is close. Thank you :).
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 195.8... (3 Replies)
I am trying to use awk to print the unique entries in $2
So in the example below there are 3 lines but 2 of the lines match in $2 so only one is used in the output.
File.txt
chr17:29667512-29667673 NF1:exon.1;NF1:exon.2;NF1:exon.38;NF1:exon.4;NF1:exon.46;NF1:exon.47 703.807... (5 Replies)
Hi All,
Seeking your assistance to print all of the fields when the condition is met.
Ex:
file1.txt
1|203|3|31243|5341|6452|623|22|00|01
3|45345|123214|6534|3423|6565|643|343|232|10
if field 1 = 1 and field 3 = 3 and field 5 = 5341 and field 6 = 6452
it will print from $1 to $10.... (2 Replies)
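A minimal sketch of the condition stated in this excerpt, assuming the file is pipe-delimited as in the two sample rows; `print_matching` is a hypothetical wrapper name, and the rest of that thread is not shown here.

```shell
# The two sample rows quoted in the excerpt.
printf '%s\n' '1|203|3|31243|5341|6452|623|22|00|01' \
              '3|45345|123214|6534|3423|6565|643|343|232|10' > file1.txt

# print_matching FILE: print the whole line ($1 through $10) when all four tests hold.
print_matching() {
    awk -F'|' '$1 == 1 && $3 == 3 && $5 == 5341 && $6 == 6452' "$1"
}

print_matching file1.txt
```

With no action block, awk prints the full record for every line where the pattern is true, so there is no need to list the ten fields one by one.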
In the files attached, I am trying to:
if Files.txt $1 is in the range of Exons.txt $1, then in Files.txt $4 the value from Exons.txt $3 is copied else if no match is found Exons.txt $3 = "Intron"
For example, the first value in File.txt $1 is chr1:14895-14944 and is not found in any range... (4 Replies)
Hi Gurus,
I am trying to grep a range of line numbers (based on a match) and then look for another match which starts with the special character '$' and print the line number. I have the below code but it is actually printing the line number counting starting from the first line of the range I am... (15 Replies)