Post 302995062 by cmccabe on Friday 31st of March 2017 09:36:50 AM
awk to update file with numerical difference if condition is met

In file1 below, if $9 and $12 are both . (dot), the value in $8 of file1 is used as a key (exact match) to look up $2 in each line of file2. When a match is found, the value of $4
in file1 is compared, within +/- 50, against the coordinate ranges stored in $4 and the following fields of file2. The number of range fields varies, but they always start at $4.
For example, ISG15 has 2 ranges, starting at $4 and ending at $5. CR2 has 19 ranges, starting at $4 and ending at $22. The value in $1 of file2
tells you how many ranges there are, and the first one is always in $4.
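
As a rough illustration of the lookup step only, the sketch below prints the start and end of every range stored for one gene in file2. The function name list_ranges is hypothetical, and the loop runs to NF rather than trusting the count in $1 (in the sample data $1 + 3 equals NF, so either should work).

```shell
# Hypothetical helper: print "start end" for each range of one gene in file2.
# usage: list_ranges GENE FILE2
list_ranges() {
    awk -v gene="$1" '
        $2 == gene {                      # exact match on the gene name in $2
            for (i = 4; i <= NF; i++) {   # range fields always begin at $4
                split($i, r, "-")         # r[1] = value before the hyphen, r[2] = value after
                print r[1], r[2]
            }
        }
    ' "$2"
}
```

For example, `list_ranges ISG15 file2` would list the two ISG15 ranges, one per line.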

There will only be one range match. If the number is closer to the first value of the range (before the - hyphen), $9 of file1 is updated from . (dot) to the numerical difference between the two numbers with a - (minus) in front. If the number is closer to the second value (after the - hyphen), $9 of
file1 is updated from . (dot) to the numerical difference between the two numbers with a + (plus) in front. An exact match gives 0. However, if the calculated difference is greater than 50, then >50 is printed in $9 of file1.
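
The sign rule for a single range could be sketched as below. The function name signed_diff is hypothetical, and two choices are assumptions rather than part of the requirement: a tie between the two ends goes to the first value, and a position that falls inside the range is not treated specially.

```shell
# Hypothetical helper: compare one position against one START-END range.
# usage: signed_diff POSITION START END
signed_diff() {
    awk -v p="$1" -v s="$2" -v e="$3" 'BEGIN {
        ds = (p > s) ? p - s : s - p      # distance to the value before the hyphen
        de = (p > e) ? p - e : e - p      # distance to the value after the hyphen
        if (ds <= de) { d = ds; sign = "-" }   # assumption: ties go to the first value
        else          { d = de; sign = "+" }
        if (d > 50)      print ">50"
        else if (d == 0) print 0          # exact match is printed without a sign
        else             print sign d
    }'
}
```

With the sample data, `signed_diff 948840 948846 948956` corresponds to line 8 of file1 and `signed_diff 949925 949363 949919` to line 4.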

If $9 or $12 of file1 has a value other than . (dot), that line is skipped (nothing happens) and the next line is processed. In file1, lines 2 and 3 are skipped. The awk below identifies the lines where both $9 and $12 are . (dot) and prints them, but I am not sure how to do the rest and need some expert help. Thank you :).

file1 tab-delimited

Code:
R_Index	Chr	Start	End	Ref	Alt	Func.IDP.refGene	Gene.IDP.refGene	GeneDetail.IDP.refGene	Inheritence	ExonicFunc.IDP.refGene	AAChange.IDP.refGene
1	chr1	948846	948846	-	A	upstream	ISG15	.	.	.	.
2	chr1	948870	948870	C	G	UTR5	ISG15	NM_005101.3:c.-84C>G	.	.	.
3	chr1	949608	949608	G	A	exonic	ISG15	.	.	nonsynonymous SNV	ISG15:NM_005101.3:exon2:c.248G>A:p.S83N
4	chr1	949925	949925	C	T	downstream	ISG15	.	.	.	.
5	chr1	207646923	207646923	G	A	intronic	CR2	.	.	.	.
6	chr2	3653844	3653844	T	C	intronic	COLEC11	.	.	.	.
7	chr1	154562623	154562625	CCG	-	intronic	ADAR	.	.	.	.
8	chr1	948840	948840	-	C	upstream	ISG15	.	.	.	.

file2 space-delimited

Code:
2 ISG15 NM_005101.3 948846-948956 949363-949919
19 CR2 NM_001006658.2 207627644-207627821 207639870-207640257 207641871-207642060 207642144-207642244 207642494-207642577 207643039-207643447 207644084-207644261 207644341-207644432 207644767-207644844 207646116-207646524 207647145-207647230 207647585-207647668 207648168-207648561 207649578-207649764 207651229-207651415 207652601-207652625 207653322-207653398 207658808-207658917 207662486-207663240
6 COLEC11 NM_024027.4 3642421-3642758 3651904-3652060 3660900-3660972 3687867-3687921 3691033-3691129 3691316-3692234
15 ADAR NM_001111.4 154554533-154557519 154557692-154557820 154558228-154558341 154558656-154558839 154560600-154560734 154561026-154561149 154561844-154561938 154562232-154562404 154562737-154562885 154569280-154569414 154569598-154569743 154570303-154570452 154570877-154571061 154573516-154575102 154580467-154580724

desired updated file1

Code:
R_Index	Chr	Start	End	Ref	Alt	Func.IDP.refGene	Gene.IDP.refGene	GeneDetail.IDP.refGene	Inheritence	ExonicFunc.IDP.refGene	AAChange.IDP.refGene
1	chr1	948846	948846	-	A	upstream	ISG15	0	.	.	.
2	chr1	948870	948870	C	G	UTR5	ISG15	NM_005101.3:c.-84C>G	.	.	.
3	chr1	949608	949608	G	A	exonic	ISG15	.	.	nonsynonymous SNV	ISG15:NM_005101.3:exon2:c.248G>A:p.S83N
4	chr1	949925	949925	C	T	downstream	ISG15	+6	.	.	.
5	chr1	207646923	207646923	G	A	intronic	CR2	>50	.	.	.
6	chr2	3653844	3653844	T	C	intronic	COLEC11	>50	.	.	.
7	chr1	154562623	154562625	CCG	-	intronic	ADAR	>50	.	.	.
8	chr1	948840	948840	-	C	upstream	ISG15	-6	.	.	.

Description of updated file1

Code:
line1:file1 $9 updated to 0 because ISG15 is matched to line 1, $2 of file2 and the value in $4 of file1, 948846, is an exact match to the first coordinate in $4 before the -
line2:not updated, skipped because $9 or $12 in file1 has a value other than . in it
line3:not updated, skipped because $9 or $12 in file1 has a value other than . in it
line4:file1 $9 updated to +6 because ISG15 is matched to line 1, $2 of file2 and the value in $4 of file1, 949925, is a range match to the second coordinate in $5 after the -
line5:file1 $9 updated to >50 because CR2 is matched to line 2, $2 of file2 and the value in $4 of file1, 207646923, is a range match to the first coordinate in $14 before the - but the difference of 222 is > 50
line6:file1 $9 updated to >50 because COLEC11 is matched to line 3, $2 of file2 and the value in $4 of file1, 3653844, is a range match to the second coordinate in $5 after the - but the difference of 1784 is > 50
line7:file1 $9 updated to >50 because ADAR is matched to line 4, $2 of file2 and the value in $4 of file1, 154562625, is a range match to the first coordinate in $12 before the - but the difference of 112 is > 50
line8:file1 $9 updated to -6 because ISG15 is matched to line 1, $2 of file2 and the value in $4 of file1, 948840, is a range match to the first coordinate in $4 before the -

awk

Code:
awk -F'\t' -v OFS='\t' '$9 == "." && $12 == "."' file1
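
Building on the filter above, one possible way to do the rest is sketched here; it is not a definitive solution. The function name update_file1 is hypothetical. Assumptions: the closest value is searched across all ranges of the matching gene, a gene missing from file2 leaves the line unchanged, a tie goes to the value before the hyphen, and a position inside a range is not treated specially. The two FS assignments on the command line let awk read the space-delimited file2 and the tab-delimited file1 in one run.

```shell
# Hypothetical helper: print file1 with $9 updated from the ranges in file2.
# usage: update_file1 FILE1 FILE2
update_file1() {
    awk '
        function abs(x) { return x < 0 ? -x : x }

        # First file (file2): store every range field under its gene name.
        FNR == NR {
            for (i = 4; i <= NF; i++) rng[$2, ++n[$2]] = $i
            next
        }

        # Second file (file1): only lines with "." in $9 and $12 and a known gene.
        $9 == "." && $12 == "." && ($8 in n) {
            best = -1
            for (i = 1; i <= n[$8]; i++) {
                split(rng[$8, i], r, "-")
                ds = abs($4 - r[1])       # distance to the value before the hyphen
                de = abs($4 - r[2])       # distance to the value after the hyphen
                if (best < 0 || ds < best) { best = ds; sign = "-" }
                if (de < best)             { best = de; sign = "+" }
            }
            $9 = (best > 50) ? ">50" : (best == 0 ? 0 : (sign best))
        }

        { print }                         # skipped lines pass through unchanged
    ' FS=' ' "$2" FS='\t' OFS='\t' "$1"
}
```

It could be run as `update_file1 file1 file2 > file1.updated`, leaving the original file1 untouched.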

 
