awk to adjust text and count based on value in field

Tags
awk, solved

    #1  
Old Unix and Linux 04-17-2018   -   Original Discussion by cmccabe
cmccabe

The awk below runs as is and produces the current output. It is very close, but what I cannot seem to do is add the _exon... suffix to each block. The _exon part is static and will never change; the ... part is a number that comes from $1 (the count in the first column of the output). If there is a + sign in $4, the numbers are in ascending (sequential) order. If there is a - in $4, the order is descending (reversed). I think I need an if statement, but I am not sure how to increment or decrement the value correctly. Thank you.

example of ordering based on $4


Code:
+ = exon 1,2,3
- = exon 3,2,1
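
The ordering rule above can be sketched in awk as follows. This is only an illustration under stated assumptions: the helper name exon_no, the shell variable numbering, and the fixed count of 3 are made up for the example and do not come from the thread.

```shell
# exon_no(strand, i, n) is a hypothetical helper:
#   i = position of the block on the line (1..n), n = number of exons.
# "+" keeps the position as the exon number; "-" counts down from n.
numbering=$(awk '
function exon_no(strand, i, n) { return (strand == "+") ? i : n - i + 1 }
BEGIN {
  n = 3
  for (i = 1; i <= n; i++) plus  = plus  (i > 1 ? "," : "") exon_no("+", i, n)
  for (i = 1; i <= n; i++) minus = minus (i > 1 ? "," : "") exon_no("-", i, n)
  print "+ = exon " plus     # + = exon 1,2,3
  print "- = exon " minus    # - = exon 3,2,1
}')
printf '%s\n' "$numbering"
```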

file tab-delimited


Code:
208	NR_120664.1	chr5	+	141704857	141843619	141843619	141843619	4	141704857,141724980,141732790,141843534,	141704935,141725050,141733148,141843619,	0	SPRY4-AS1	unk	unk	-1,-1,-1,-1,
1161	NM_021615.4	chr16	-	75507021	75528926	75512538	75513726	3	75507021,75515714,75528837,	75513742,75515789,75528926,	0	CHST6	cmpl	cmpl	0,-1,-1,
1799	NM_002036.3	chr1	+	159173802	159176290	159174749	159176240	2	159173802,159175250,	159174770,159176290,	0	ACKR1	cmpl	cmpl	0,0,

current output tab-delimited


Code:
4	+	SPRY4-AS1	NR_120664.1	chr5:141704857-141704935     chr5:141724980-141725050     chr5:141732790-141733148     chr5:141843534-141843619     
3	-	CHST6	NM_021615.4	chr16:75507021-75513742     chr16:75515714-75515789     chr16:75528837-75528926     
2	+	ACKR1	NM_002036.3	chr1:159173802-159174770     chr1:159175250-159176290

desired output tab-delimited


Code:
4	+	SPRY4-AS1	NR_120664.1	chr5:141704857-141704935_exon1,chr5:141724980-141725050_exon2,chr5:141732790-141733148_exon3,chr5:141843534-141843619_exon4
3	-	CHST6	NM_021615.4	chr16:75507021-75513742_exon3,chr16:75515714-75515789_exon2,chr16:75528837-75528926_exon1
2	+	ACKR1	NM_002036.3	chr1:159173802-159174770_exon1,chr1:159175250-159176290_exon2

awk


Code:
awk -F '\t' '{sf="";len1=split($10,s1,",");split($11,s2,","); for (i=1;i<len1;i++){sf=sf $3":"s1[i]"-"s2[i]"     "}print $9,$4,$13,$2,sf}' OFS='\t' file > out
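
The loop in the one-liner stops at i<len1 rather than i<=len1 because $10 and $11 end with a comma, so split() returns one extra, empty element. A quick check of that behaviour, using the exon-start list from the second sample row (the shell variable names starts, count, and last are chosen only for this sketch):

```shell
# Exon starts from the CHST6 row; note the trailing comma.
starts='75507021,75515714,75528837,'

# split() reports 4 pieces for 3 exons...
count=$(printf '%s\n' "$starts" | awk '{ n = split($0, a, ","); print n }')
# ...and the last piece is the empty string after the trailing comma.
last=$(printf '%s\n' "$starts" | awk '{ n = split($0, a, ","); print "[" a[n] "]" }')

echo "$count"   # 4
echo "$last"    # []
```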


Last edited by cmccabe; 04-17-2018 at 06:09 PM.. Reason: fixed format
    #2  
Old Unix and Linux 04-17-2018   -   Original Discussion by cmccabe
vgersh99 (Forum Advisor)


Code:
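BEGIN {
  FS=OFS="\t"
  suf="_exon"
}
{
   sf=""
   # $10 and $11 end with a comma, so split() returns one more than the exon count
   len1=split($10,s1,",")
   split($11,s2,",")
   # "+" strand: number the exons 1..n; "-" strand: number them n..1
   # the comma goes before every block except the first, so there is no trailing comma
   for (i=1;i<len1;i++)
     sf=sf (i>1?",":"") $3 ":" s1[i] "-" s2[i] suf (($4=="+")?i:len1-i)
   print $9,$4,$13,$2,sf
}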
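
A sketch of how a script like this could be run against the three sample rows from the question. The temporary directory, the file names, and the variable n are assumptions made for this example. This variant also prepends the comma to every block after the first, so the list carries no trailing comma.

```shell
# Rebuild the tab-delimited sample from the question.
# The data contains no spaces, so every space can be turned into a tab.
tmp=$(mktemp -d)
tr ' ' '\t' > "$tmp/file" <<'EOF'
208 NR_120664.1 chr5 + 141704857 141843619 141843619 141843619 4 141704857,141724980,141732790,141843534, 141704935,141725050,141733148,141843619, 0 SPRY4-AS1 unk unk -1,-1,-1,-1,
1161 NM_021615.4 chr16 - 75507021 75528926 75512538 75513726 3 75507021,75515714,75528837, 75513742,75515789,75528926, 0 CHST6 cmpl cmpl 0,-1,-1,
1799 NM_002036.3 chr1 + 159173802 159176290 159174749 159176240 2 159173802,159175250, 159174770,159176290, 0 ACKR1 cmpl cmpl 0,0,
EOF

awk '
BEGIN { FS = OFS = "\t"; suf = "_exon" }
{
  sf = ""
  n = split($10, s1, ",") - 1   # drop the empty piece after the trailing comma
  split($11, s2, ",")
  # "+" strand numbers the blocks 1..n, "-" strand numbers them n..1
  for (i = 1; i <= n; i++)
    sf = sf (i > 1 ? "," : "") $3 ":" s1[i] "-" s2[i] suf (($4 == "+") ? i : n - i + 1)
  print $9, $4, $13, $2, sf
}' "$tmp/file" > "$tmp/out"

cat "$tmp/out"
```

For the minus-strand CHST6 row, the fifth column comes out as chr16:75507021-75513742_exon3,chr16:75515714-75515789_exon2,chr16:75528837-75528926_exon1, which matches the desired output in the question.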

    #3  
Old Unix and Linux 04-18-2018   -   Original Discussion by cmccabe
cmccabe
Thank you very much.