awk to adjust text and count based on value in field

Tags
awk, shell scripts, solved

# 1  
Old 04-17-2018

The awk below runs as is and produces the current output. It is very close, but what I cannot seem to do is add the _exon... suffix. The ... portion comes from $1, and the _exon part is static and will never change. If there is a + sign in $4, then the ... is in ascending (sequential) order. If there is a - in $4, then the order is descending (reversed). I think I need an if statement, but I am not sure how to increment or decrement the value correctly. Thank you.

example of ordering based on $4
Code:
+ = exon 1,2,3
- = exon 3,2,1

file tab-delimited
Code:
208	NR_120664.1	chr5	+	141704857	141843619	141843619	141843619	4	141704857,141724980,141732790,141843534,	141704935,141725050,141733148,141843619,	0	SPRY4-AS1	unk	unk	-1,-1,-1,-1,
1161	NM_021615.4	chr16	-	75507021	75528926	75512538	75513726	3	75507021,75515714,75528837,	75513742,75515789,75528926,	0	CHST6	cmpl	cmpl	0,-1,-1,
1799	NM_002036.3	chr1	+	159173802	159176290	159174749	159176240	2	159173802,159175250,	159174770,159176290,	0	ACKR1	cmpl	cmpl	0,0,

current output tab-delimited
Code:
4	+	SPRY4-AS1	NR_120664.1	chr5:141704857-141704935     chr5:141724980-141725050     chr5:141732790-141733148     chr5:141843534-141843619     
3	-	CHST6	NM_021615.4	chr16:75507021-75513742     chr16:75515714-75515789     chr16:75528837-75528926     
2	+	ACKR1	NM_002036.3	chr1:159173802-159174770     chr1:159175250-159176290

desired output tab-delimited
Code:
4	+	SPRY4-AS1	NR_120664.1	chr5:141704857-141704935_exon1,chr5:141724980-141725050_exon2,chr5:141732790-141733148_exon3,chr5:141843534-141843619_exon4
3	-	CHST6	NM_021615.4	chr16:75507021-75513742_exon3,chr16:75515714-75515789_exon2,chr16:75528837-75528926_exon1
2	+	ACKR1	NM_002036.3	chr1:159173802-159174770_exon1	chr1:159175250-159176290_exon2

awk
Code:
awk -F '\t' '{sf="";len1=split($10,s1,",");split($11,s2,","); for (i=1;i<len1;i++){sf=sf $3":"s1[i]"-"s2[i]"     "}print $9,$4,$13,$2,sf}' OFS='\t' file > out


Last edited by cmccabe; 04-17-2018 at 06:09 PM.. Reason: fixed format
# 2  
Old 04-17-2018
Code:
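BEGIN {
  FS=OFS="\t"
  suf="_exon"
}
{
   sf=""
   # $10 and $11 end with a comma, so split() returns one extra empty element
   len1=split($10,s1,",")
   split($11,s2,",")
   # count up (1,2,3) when $4 is +, count down (3,2,1) otherwise;
   # the comma goes before every entry except the first, so none trails
   for (i=1;i<len1;i++)
     sf=sf (i>1?",":"") $3 ":" s1[i] "-" s2[i] suf (($4=="+")?i:len1-i)
   print $9,$4,$13,$2,sf
}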
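
For anyone who wants to try this quickly, here is a minimal sketch that runs the same idea against two of the sample rows from post #1 (the - row and one + row). The function name label_exons and the variable names out, n and num are my own assumptions for illustration and are not part of the original script. This version puts the comma only between entries, so the last field has no trailing comma.

```shell
#!/bin/sh
# label_exons is a hypothetical wrapper name; it reads tab-delimited rows
# from stdin (or from files given as arguments).
label_exons() {
  awk '
    BEGIN { FS = OFS = "\t" }
    {
      # $10 and $11 end with a comma, so split() yields one empty extra piece
      n = split($10, s1, ",") - 1
      split($11, s2, ",")
      sf = ""
      for (i = 1; i <= n; i++) {
        # count up when $4 is +, count down otherwise
        num = ($4 == "+") ? i : n - i + 1
        sf = sf (i > 1 ? "," : "") $3 ":" s1[i] "-" s2[i] "_exon" num
      }
      print $9, $4, $13, $2, sf
    }' "$@"
}

# Two rows copied from the sample input, written with explicit \t escapes.
out=$(
  {
    printf '1161\tNM_021615.4\tchr16\t-\t75507021\t75528926\t75512538\t75513726\t3\t75507021,75515714,75528837,\t75513742,75515789,75528926,\t0\tCHST6\tcmpl\tcmpl\t0,-1,-1,\n'
    printf '1799\tNM_002036.3\tchr1\t+\t159173802\t159176290\t159174749\t159176240\t2\t159173802,159175250,\t159174770,159176290,\t0\tACKR1\tcmpl\tcmpl\t0,0,\n'
  } | label_exons
)
printf '%s\n' "$out"
```

With a real file the call would be something like label_exons file > out. The CHST6 row should come back numbered exon3, exon2, exon1 and the ACKR1 row exon1, exon2.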

The Following User Says Thank You to vgersh99 For This Useful Post:
cmccabe (04-18-2018)
# 3  
Old 04-18-2018
Thank you very much.
All times are GMT -4. The time now is 08:04 AM.

Unix & Linux Forums Content Copyright©1993-2018. All Rights Reserved.