Please suggest a script or solution? Post 28463 by Optimus_P on Wednesday 18th of September 2002 06:23:51 PM
Re: Please suggest a script or solution?

Quote:
Originally posted by nmsinghe
I have to solve a programming problem for my wife who is engaged in Research in Breast Cancer.

1. She has frequently to search a long single line of alphabetic characters (lower case) for an exact match of a string.

e.g. mwaaagglwrsraglralfrsrdaalfpgcerglhcsavscknwlkkfasktkkkvwyespslgshstykpskleflmrstskktrkedharlralngllykaltdllctpevsqelydlnvelskvsltpdfsacraywkttlsaeqnahmeavlqrsaahmslisywqsqtldpgmkettlykmisgtlmphnpaapqsrpqapvcvgsimrrstsrlwstkggkikgsgawcgrgrwls

2. The ONLY two strings to be searched for are -


r-r--s
r-r--t

The - can be any of the following characters

acdefghiklmnpqrstvyz

3. Once an exact match has been made it is essential to know the number of characters from the start of the line inclusive of the 6 character string.

Can anyone suggest a program or script.

It is urgent.

Thanks

Nev
Well, something like the following pattern match can be used.

/r[acdefghiklmnpqrstvyz]r[acdefghiklmnpqrstvyz][acdefghiklmnpqrstvyz][st]/
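To show how that pattern could report positions, here is a minimal Perl sketch. It is only one way to do it, not a tested solution for the real data. The name find_motifs and the short test string are made up for illustration. The last position is written as [st], because inside a character class a | is matched as a literal character. The pattern is wrapped in a lookahead so that overlapping matches are reported too.

```perl
use strict;
use warnings;

# Characters allowed in each "-" position, as listed in the question.
my $ANY = '[acdefghiklmnpqrstvyz]';

# r-r--s or r-r--t, built by joining the pieces so nothing is ambiguous.
my $MOTIF = join '', 'r', $ANY, 'r', $ANY, $ANY, '[st]';

# Hypothetical helper: returns [matched text, start, end] for every hit.
# Positions are 1-based; "end" counts the characters from the start of
# the line up to and including the 6-character match.
sub find_motifs {
    my ($seq) = @_;
    my @hits;
    # The lookahead consumes nothing, so overlapping hits are found as well.
    while ($seq =~ /(?=($MOTIF))/g) {
        my $start = $-[0] + 1;    # $-[0] is the 0-based offset of the match
        push @hits, [ $1, $start, $start + 5 ];
    }
    return @hits;
}

# Made-up test string with two overlapping hits.
print "$_->[0] ends at $_->[2]\n" for find_motifs('rararsat');
# prints:
# rarars ends at 6
# rarsat ends at 8
```

For the real sequence you would read the line from a file, chomp it, and pass it to find_motifs in the same way.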

Using Perl's index() or substr() would probably be the best way to go, I think. I will work on this tomorrow just to know for myself how to do it, but I will be excited to see what others come up with before I can post again.

This gives me something to think about tonight. Heh.

Mmm, some of the logic would go like this if index() is used.

Pass the string to index().
index() returns only the first (leftmost) occurrence at or after a given start position, and it searches for a literal substring rather than a pattern, so on its own it finds one candidate at a time. To handle two or more matches, take the return value, add one, pass that back in as the start position of the next search, and keep looping until index() returns -1 at the end of the string.

The return value of index() is the zero-based offset of the match, i.e. the number of characters before it. Adding 6 (the length of the matched string) gives the number of characters from the start of the line inclusive of the match, so that should fulfill your request.
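The index() loop could look something like the sketch below. Since index() only finds literal text, this version uses it to locate each "r", pulls the 6-character window out with substr(), and checks that window against the pattern. The name motif_ends and the test string are made up for illustration, so treat it as a rough outline rather than a finished program.

```perl
use strict;
use warnings;

# Characters allowed in each "-" position, as listed in the question.
my $ANY = '[acdefghiklmnpqrstvyz]';
my $PAT = join '', 'r', $ANY, 'r', $ANY, $ANY, '[st]';

# Hypothetical helper: returns, for every hit, the number of characters
# from the start of the line up to and including the 6-character match.
sub motif_ends {
    my ($seq) = @_;
    my @ends;
    my $from = 0;
    # index() gives the 0-based offset of the next literal "r", or -1.
    while ((my $at = index($seq, 'r', $from)) != -1) {
        my $window = substr($seq, $at, 6);    # shorter near the end of line
        push @ends, $at + 6 if $window =~ /\A$PAT\z/;
        $from = $at + 1;    # restart one character later to keep overlaps
    }
    return @ends;
}

# Made-up test string: hits are "rkrdes" and "rgrcct".
print join(', ', motif_ends('aarkrdesrgrcct')), "\n";
# prints: 8, 14
```

Restarting the search at $at + 1 instead of after the whole match is what lets overlapping hits through.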

What do you guys think?

Last edited by Optimus_P; 09-18-2002 at 07:41 PM..
 
