Awk script


 
# 1  
Old 02-20-2011

I have the following text:
Code:
scaffold_1      phytozome6      gene    12632   13612   .       +       .       ID=PT_0001s00200;Name=PT_0001s00200
scaffold_1      phytozome6      mRNA    12632   13612   .     +       .     ID=PAC:18235173;Name=PT_0001s00200.1;PACid=18235173;Parent=PT_0001s00200
scaffold_1      phytozome6      5'-UTR  12632   12638   .       +       .       Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     12639   12650   .       +       0      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     12768   12891   .       +       0      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     13117   13226   .       +       2      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     13310   13384   .       +       0      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      3'-UTR  13385   13612   .       +       .       Parent=PAC:18235173;PACid=18235173

I want to convert it into:
Code:
scaffold_1      phytozome6      gene    12632   13612   .       +       .       ID=PT_0001s00200;Name=PT_0001s00200
scaffold_1      phytozome6      mRNA    12632   13612   .     +       .     ID=PT_0001s00200.1;Name=PT_0001s00200.1;PACid=18235173;Parent=PT_0001s00200
scaffold_1      phytozome6      5'-UTR  12632   12638   .       +       .       Parent=PT_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     12639   12650   .       +       0      Parent=PT_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     12768   12891   .       +       0      Parent=PT_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     13117   13226   .       +       2      Parent=PT_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     13310   13384   .       +       0      Parent=PT_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      3'-UTR  13385   13612   .       +       .       Parent=PT_0001s00200.1;PACid=18235173

I tried the following script:
Code:
awk '{if(substr($9,17,10)=="Name=PT") print gensub("Parent=PAC:"substr($9,47,9),"Parent="substr($9,22,16)".1",
  substr($0,0)) substr($0,0) }' inputfile > outputfile

but still no luck.

Can you please help me?

Last edited by Scott; 02-20-2011 at 05:54 AM.. Reason: Please use code tags
# 2  
Old 02-20-2011
Why not use sed?
Code:
sed 's/Parent=PAC:18235173;PACid=18235173/Parent=PT_0001s00200.1;PACid=18235173/g'

You can edit the file in place with the -i switch, or redirect the output to another file as your example shows.
# 3  
Old 02-20-2011
I have used the following script, but I still can't replace the Parent=PAC part:
Code:
awk '{if(substr($9,17,10)=="Name=PT") n=substr($9,22,16) gsub("Parent=PAC:"substr($9,47,9),"Parent="n".1");
  gsub("ID=PAC:"substr($9,8,19),"ID="n".1");  }' input>output

Any help is appreciated.

---------- Post updated at 04:50 AM ---------- Previous update was at 04:41 AM ----------

Thank you for your reply, but this is only one record. There are many records, each with its own unique PAC and PT IDs, so I would have to run your solution 300000 times, changing the PAC and PT IDs each time. Can't we use
Code:
if(substr($9,17,10)=="Name=PT")

with your solution?

Last edited by Scott; 02-20-2011 at 05:55 AM.. Reason: Code tags
# 4  
Old 02-20-2011
Code:
awk -F"[=;]" 'NR==1{s=$2 ".1";print;FS=OFS=";";next}{sub(/=.*/,"="s,$1)}1' infile

# 5  
Old 02-20-2011
Thank you for your reply, but I get the following:
Code:
scaffold_1      phytozome6      gene    2330052 2335284 .       -       .       ID=.1;Name=PT_0001s02940
scaffold_1      phytozome6      mRNA    2330052 2335284 .       -       .       ID=.1;Name=PT_0001s02940.1;PACid=18235154;Parent=PT_0001s02940
scaffold_1      phytozome6      CDS     2334981 2335230 .       -       0       Parent=.1;PACid=18235154
scaffold_1      phytozome6      5'-UTR  2335231 2335284 .       -       .       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2334079 2334206 .       -       2       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2333907 2333978 .       -       0       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2333635 2333780 .       -       0       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2333448 2333562 .       -       1       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2333285 2333365 .       -       0       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2332541 2332678 .       -       0       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2331826 2331913 .       -       0       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2330651 2330764 .       -       2       Parent=.1;PACid=18235154
scaffold_1      phytozome6      3'-UTR  2330052 2330460 .       -       .       Parent=.1;PACid=18235154
scaffold_1      phytozome6      CDS     2330461 2330483 .       -       2       Parent=.1;PACid=18235154

Still no luck.
I used the following command and it gives the correct output. Did I do it correctly?
Code:
awk '{if(substr($9,1,5)=="ID=PT_") n=substr($9,4,16)".1"; gsub("ID=PAC:"substr($9,8,8),"ID="n);
  gsub("Parent=PAC:"substr($9,12,8),"Parent="n); print; }' infile > outfile


Last edited by Scott; 02-20-2011 at 12:46 PM.. Reason: Code tags
# 6  
Old 02-20-2011
Is the first line of your document different from the sample? I take the ID from the first line.

The first line should be:

Code:
scaffold_1      phytozome6      gene    12632   13612   .       +       .       ID=PT_0001s00200;Name=PT_0001s00200
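
For reference, the one-liner from post #4 can be spread out with comments, which makes its dependence on line 1 visible. This is only a sketch of the same logic; the function name `first_line_id` is made up for illustration. Note that every later line, including later gene lines, receives the ID taken from line 1, and if line 1 contains no "=" the saved value is just ".1", which would be consistent with the `ID=.1` output in post #5.

```shell
# Same logic as the one-liner in post #4, expanded.
# first_line_id is a hypothetical name; it reads stdin and writes stdout.
first_line_id() {
  awk -F'[=;]' '
  NR == 1 {                   # line 1 only: fields are split on "=" or ";"
      s = $2 ".1"             # $2 is the text between the first "=" and the first ";"
      print                   # line 1 itself is printed unchanged
      FS = OFS = ";"          # from line 2 on, fields are split on ";" only
      next
  }
  { sub(/=.*/, "=" s, $1) }   # in field 1, replace everything after the first "="
  1                           # print the (possibly modified) line
  '
}
```

It would be called as `first_line_id < infile`, and it only gives the wanted result when the file holds a single gene record that starts on line 1.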

# 7  
Old 02-20-2011
Yes, it is different:
Code:
scaffold_1      phytozome6      gene    12632   13612   .       +       .       ID=POPTR_0001s00200;Name=POPTR_0001s00200
scaffold_1      phytozome6      mRNA    12632   13612   .     +       .     ID=PAC:18235173;Name=POPTR_0001s00200.1;PACid=18235173;Parent=POPTR_0001s00200
scaffold_1      phytozome6      5'-UTR  12632   12638   .       +       .       Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     12639   12650   .       +       0      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     12768   12891   .       +       0      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     13117   13226   .       +       2      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      CDS     13310   13384   .       +       0      Parent=PAC:18235173;PACid=18235173
scaffold_1      phytozome6      3'-UTR  13385   13612   .       +       .       Parent=PAC:18235173;PACid=18235173

I want to change it to:
Code:
scaffold_1      phytozome6      gene    12632   13612   .       +       .       ID=POPTR_0001s00200;Name=POPTR_0001s00200
scaffold_1      phytozome6      mRNA    12632   13612   .     +       .     ID=POPTR_0001s00200.1;Name=POPTR_0001s00200.1;PACid=18235173;Parent=POPTR_0001s00200
scaffold_1      phytozome6      5'-UTR  12632   12638   .       +       .       Parent=POPTR_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     12639   12650   .       +       0      Parent=POPTR_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     12768   12891   .       +       0      Parent=POPTR_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     13117   13226   .       +       2      Parent=POPTR_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      CDS     13310   13384   .       +       0      Parent=POPTR_0001s00200.1;PACid=18235173
scaffold_1      phytozome6      3'-UTR  13385   13612   .       +       .       Parent=POPTR_0001s00200.1;PACid=18235173

Do you think your function will help? Sorry for changing the format.

Last edited by Scott; 02-20-2011 at 12:46 PM..
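
One way to handle all records in a single pass, without relying on fixed character positions or on the PT_/POPTR_ prefix, is to read the transcript name from the `Name=` attribute of each mRNA line and reuse it for the child lines that point at the same `PAC:` id. The following is a minimal sketch under these assumptions: each mRNA line carries both `ID=PAC:<digits>` and `Name=<transcript>`, and it appears before the UTR/CDS lines that refer to it, as in the samples above. The function name `fix_pac_ids` and the array name `tid` are made up for illustration.

```shell
# fix_pac_ids is a hypothetical helper; it reads GFF lines on stdin
# and writes the rewritten lines to stdout.
fix_pac_ids() {
  awk '
  # mRNA line: remember which transcript name belongs to this PAC id,
  # then replace ID=PAC:<digits> with ID=<transcript name>.
  $3 == "mRNA" && match($0, /ID=PAC:[0-9]+/) {
      pac = substr($0, RSTART + 3, RLENGTH - 3)         # e.g. PAC:18235173
      if (match($0, /Name=[^;]+/)) {
          tid[pac] = substr($0, RSTART + 5, RLENGTH - 5) # e.g. PT_0001s00200.1
          sub(/ID=PAC:[0-9]+/, "ID=" tid[pac])
      }
  }
  # Child line (UTR, CDS): replace Parent=PAC:<digits> if that id was seen.
  match($0, /Parent=PAC:[0-9]+/) {
      pac = substr($0, RSTART + 7, RLENGTH - 7)
      if (pac in tid) sub(/Parent=PAC:[0-9]+/, "Parent=" tid[pac])
  }
  { print }   # gene lines and anything else pass through unchanged
  '
}
```

It would be used as `fix_pac_ids < inputfile > outputfile`. Because the substitutions are applied to the whole line rather than to `$9`, the original column spacing is left untouched.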