Using key words or symbols to determine parser


 
# 1  
Old 03-17-2015

The user can input data in one of several forms, each marked by an indicator:

1. >
2. del
3. ins
4. delins

The user input is saved in a specific directory using the code below:

Code:
 
cd 'C:'
    C:/Users/cmccabe/Desktop/Python27/python.exe C:/Users/cmccabe/Desktop/Python27/run_batch_job.py C:/Users/cmccabe/Desktop/Python27/${id}.txt C:/Users/cmccabe/Desktop/annovar/${id}_position.txt PositionConverter

Is there a way to use that saved file and one of the indicators (1-4) to determine which parser is used?

For example, if the saved file has a > in it, then awk 'NR==2 {split($2,a,"[_.>]");b=substr(a[4],1,length(a[4])-1);print a[2]+0,b,b,substr(a[4],length(a[4])),a[5]}' OFS="\t" ${id}_position.txt > ${id}_parse.txt is used.
However, if it has a "del" in it, then awk 'NR==2 {split($2,a,"[_.'del']");b=substr(a[4],1,length(a[4]-1));print a[2]+0,b,b,substr(a[4],length(a[4])),a[5]}' OFS="\t" ${id}_position.txt >${id}_parse.txt is used.

The files attached have examples of each in $3
SNP.txt has the >
del.txt has the "del"
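
One possible way to pick the parser is to test the saved file for the indicator first and branch on the result. The following is only a sketch: the function name variant_type is hypothetical, it assumes the variant is on line 2 of the file (as in the NR==2 commands above), and it looks at the whole line rather than a single field. "delins" is tested before "del" and "ins" because it contains both.

```shell
# Hypothetical helper: print which kind of variant a position file holds.
# Assumption: line 1 is a header and line 2 carries the variant.
variant_type() {
    awk 'NR == 2 {
        if      ($0 ~ />/)      print "snp"
        else if ($0 ~ /delins/) print "delins"   # must come before del and ins
        else if ($0 ~ /del/)    print "del"
        else if ($0 ~ /ins/)    print "ins"
        else                    print "unknown"
        exit
    }' "$1"
}
```

The printed word could then drive a case statement, for example running the SNP awk when variant_type "${id}_position.txt" prints snp and the deletion awk when it prints del.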

Thank you :).
# 2  
Old 03-18-2015
I highly doubt that creating several threads on the same or similar subjects will gain any tangible traction with the audience of these forums.
I believe you've already run out of folks willing to help. You can keep trying with the "thank you" smilies going forward, but I'd suggest you rethink your approach.

Last edited by vgersh99; 03-18-2015 at 11:50 AM..
# 3  
Old 03-18-2015
These are not the same post. They are similar in nature, but rather than have one large thread I thought it would be better to ask one question at a time. Maybe I need a diagram/flowchart and that will help. Please close all threads. I apologize; I was just doing what I thought made things easier. Thank you :).
# 4  
Old 03-18-2015
@vgersh99: Absolutely yes. I've lost track of which field is taken from where and needs to be parsed how. And, I've lost interest.

@cmccabe: Start over. Attaching sample files is fine, even more so if they are large, but you need to connect your requests to the data, be very careful (and detailed) with your descriptions, and not leave people guessing.
# 5  
Old 03-18-2015
I hope this makes things a bit clearer. Attached are a diagram, some steps that explain the diagram, the parse rules, and a bash menu. Step 4 (parse by condition) is what I am struggling with, and I cannot find a solution as of yet. I am trying to use awk to parse a file based on the format or text in that file: if the file has a ">" in it, use the SNP parser, but if the file has a "del" in it, use the deletion parser. I am not a programmer by trade, but I am learning slowly. Thank you :).

Code:
menu +> gjb2 -> -> -> -> gjb2name -> -> position -> -> parse -> add2text -> end program         
  |  |    ^                 ^              ^             ^         ^           ^
  |  |    |                 |              |             |         |           |         
  |    input variant/id   python         python         parse      |     +--Y/N select---+
  |    add NM_           ${id}_name   ${id}_position    input      |     menu |      exit
  |    merge input                                                 |          |            
  |    add 2 text                                                  |          |                                     
  |        |                      	                               |          |            
  |        |                                                       |          |            
  |	       |                                                       |          |            
  |-------+<-------------------------------------------------------<+----------                        

  1. user inputs id and variant, the specific NM_ is added, the two inputs are combined, and the input is added to list
  2. python script to verify name and create ${id}_name.txt and save it in specific directory (C:/Users/cmccabe/Desktop/annovar/${id}_name.txt)
  3. python script to convert and create ${id}_position.txt and save it in specific directory (C:/Users/cmccabe/Desktop/annovar/${id}_position.txt)
  4. conditional parse based on saved input in (C:/Users/cmccabe/Desktop/annovar/${id}_position.txt)
  5. function that adds input to a list file used in a later command
  6. user asked if there is more to do (If "Y" then return to menu, if "N" then Goodbye message and exit program)

Parse input based on condition:
If ${id}_position.txt has ">" in $1, then the SNP awk is used. $1 in SNP_position.txt is an example (NM_004004.5:c.79G>A). The SNP awk is run on $3 (NC_000013.10:g.20763642C>T) and the parsed output is saved as ${id}_parse.txt (13 20763642 20763642 C T).
Parse rules:
1. the digits after the NC_ and its leading zeros (4 zeros here, but not always the case) and before the . (13)
2. the digits after the g., printed twice (20763642 20763642)
3. letter before the > (C)
4. letter after the > (T)

If ${id}_position.txt has "del" in $1, then the deletion awk is used. $1 in DEL_position.txt is an example (NM_004004.5:c.35delG). The deletion awk is run on $3 (NC_000013.10:g.20763686delC) and the parsed output is saved as ${id}_parse.txt (13 20763686 20763686 C -).
Parse rules:
1. the digits after the NC_ and its leading zeros (4 zeros here, but not always the case) and before the . (13)
2. the digits after the g., printed twice (20763686 20763686)
3. letter after the del (C)
4. "-" is used in the spot after the "C", i.e. in $5
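
The two rule sets above could be sketched in a single awk program that branches on the content of $3. This is an illustration only, not a tested replacement for the existing parsers: the function name parse_position is hypothetical, the variant is assumed to be on line 2 with the genomic description in $3, and only a single deleted base is handled.

```shell
# Hypothetical sketch of the SNP and deletion parse rules.
# Assumption: line 2 holds the genomic description in field 3,
# e.g. NC_000013.10:g.20763642C>T or NC_000013.10:g.20763686delC.
parse_position() {
    awk -v OFS='\t' 'NR == 2 {
        g = $3
        chr = g; sub(/^NC_0*/, "", chr); sub(/\..*/, "", chr)      # rule 1: NC_000013.10 gives 13
        pos = g; sub(/.*g\./, "", pos);  sub(/[^0-9].*/, "", pos)  # rule 2: digits after g.
        if (g ~ />/) {                        # SNP: letters before and after ">"
            ref = g; sub(/>.*/, "", ref); ref = substr(ref, length(ref))
            alt = g; sub(/.*>/, "", alt)
            print chr, pos, pos, ref, alt
        } else if (g ~ /del[ACGT]$/) {        # deletion: letter after "del", then "-"
            print chr, pos, pos, substr(g, length(g)), "-"
        }
    }' "$1"
}
```

Called as parse_position "${id}_position.txt" > "${id}_parse.txt", this writes one tab-separated line for either kind of input and nothing for a variant it does not recognize.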

Last edited by cmccabe; 03-18-2015 at 01:37 PM..
# 6  
Old 03-18-2015
You can use something like
Code:
awk 'FNR==1 {next} $1 ~ /\.[0-9]+del[ACGT]+/ {print "del: ", FILENAME} /\.[0-9]+[ACGT]+>[ACGT]+/ {print ">: ",FILENAME}' file[34]
>:  file3
del:  file4

to identify your patterns, and something along these lines
Code:
awk 'FNR==1 {next} match($1,/\.[0-9]+del[ACGT]+/) {print "del: ", FILENAME, substr($1, RSTART+RLENGTH-1)} /\.[0-9]+[ACGT]+>[ACGT]+/ {print ">: ",FILENAME}' file[34]
>:  file3
del:  file4 G

to get at a character, e.g. the one after the "del"...
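
For reference, awk's match() sets RSTART to the index where the match begins and RLENGTH to its length, so RSTART+RLENGTH-1 is the index of the last matched character. A small sketch on the example string from post #5 (the function name last_deleted_base is hypothetical):

```shell
# match() records where the regex matched: RSTART (start index) and RLENGTH (length).
last_deleted_base() {
    printf '%s\n' "$1" |
    awk 'match($1, /\.[0-9]+del[ACGT]+/) {
        print substr($1, RSTART + RLENGTH - 1)   # from the last matched character onward
    }'
}

last_deleted_base 'NM_004004.5:c.35delG'   # prints G
```

Here the match is ".35delG", which starts at index 14 and is 7 characters long, so substr() starts at index 20, the final "G".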
# 7  
Old 03-18-2015
This sounds like it is what I am looking for:

Code:
 awk 'FNR==1 {next} match($1,/\.[0-9]+del[ACGT]+/) {print "del: ", FILENAME, substr($1, RSTART+RLENGTH-1)} /\.[0-9]+[ACGT]+>[ACGT]+/ {print ">: ",FILENAME}' file[34]
>:  file3
del:  file4 G

However, I do not know what to do with it, as it is a new command for me.

Code:
 $ awk 'FNR==1 {next} match($1,/\.[0-9]+del[ACGT]+/) {print "del: ", DEL_Position.txt, substr($1, RSTART+RLENGTH-1)} /\.[0-9]+[ACGT]+>[ACGT]+/ {print ">: ",DEL_Posistion.txt}' file[34] > output.txt

awk: cmd. line:1: FNR==1 {next} match($1,/\.[0-9]+del[ACGT]+/) {print "del: ", DEL_Position.txt, substr($1, RSTART+RLENGTH-1)} /\.[0-9]+[ACGT]+>[ACGT]+/ {print ">: ",DEL_Posistion.txt}
awk: cmd. line:1:  ^ syntax error
awk: cmd. line:1: FNR==1 {next} match($1,/\.[0-9]+del[ACGT]+/) {print "del: ", DEL_Position.txt, substr($1, RSTART+RLENGTH-1)} /\.[0-9]+[ACGT]+>[ACGT]+/ {print ">: ",DEL_Posistion.txt}
awk: cmd. line:1:  ^ syntax error

Thank you :).
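
The syntax error most likely comes from replacing FILENAME inside the awk program. FILENAME is a built-in awk variable that holds the name of the file currently being read, so it stays as written; the real file names go at the end of the command, in place of file[34]. A sketch, assuming the files are named SNP_position.txt and DEL_position.txt as in post #5 (the wrapper name classify_files is hypothetical):

```shell
# Hypothetical wrapper: report which pattern each given file contains.
# FILENAME is left untouched; the file names are passed as arguments.
classify_files() {
    awk 'FNR == 1 { next }    # skip the header line of each file
         match($1, /\.[0-9]+del[ACGT]+/) { print "del:", FILENAME, substr($1, RSTART + RLENGTH - 1) }
         /\.[0-9]+[ACGT]+>[ACGT]+/       { print ">:", FILENAME }' "$@"
}

# Example call (file names are assumptions):
# classify_files SNP_position.txt DEL_position.txt > output.txt
```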