Please suggest a script or solution?


 
# 1  
Old 09-18-2002
Please suggest a script or solution?

I have to solve a programming problem for my wife who is engaged in Research in Breast Cancer.

1. She frequently has to search a long single line of lower-case alphabetic characters for an exact match of a string.

e.g. mwaaagglwrsraglralfrsrdaalfpgcerglhcsavscknwlkkfasktkkkvwyespslgshstykpskleflmrstskktrkedharlralngll ykaltdllctpevsqelydlnvelskvsltpdfsacraywkttlsaeqnahmeavlqrsaahmslisywqsqtldpgmkettlykmisgtlmphnpaapq srpqapvcvgsimrrstsrlwstkggkikgsgawcgrgrwls

2. The ONLY two strings to be searched for are -


r-r--s
r-r--t

The - can be any of the following characters

acdefghiklmnpqrstvyz

3. Once an exact match has been made, it is essential to know the number of characters from the start of the line up to and including the 6-character string.

Can anyone suggest a program or script?

It is urgent.

Thanks

Nev
# 2  
Old 09-18-2002
Re: Please suggest a script or solution?

Quote:
Originally posted by nmsinghe
I have to solve a programming problem for my wife who is engaged in Research in Breast Cancer. [...]

well something like the following pattern match can be used.

/r[acdefghiklmnpqrstvyz]r[acdefghiklmnpqrstvyz][acdefghiklmnpqrstvyz][st]/
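
For anyone wanting to try the pattern from a shell prompt first, here is a minimal sketch using grep -E. The function name has_motif and the sample strings are made up for illustration. One detail worth knowing: inside a bracket expression the | character is taken literally, so the last position should be written [st]; a class written as [s|t] would also accept a literal |.

```shell
#!/bin/sh
# Characters allowed in the "-" positions, as listed in the question.
aa='[acdefghiklmnpqrstvyz]'
motif="r${aa}r${aa}${aa}[st]"

# has_motif STRING (hypothetical helper): succeeds if STRING contains
# r-r--s or r-r--t.
has_motif() {
    printf '%s\n' "$1" | grep -Eq "$motif"
}

has_motif 'xxrarcdsxx' && echo "rarcds: match"
has_motif 'xxrarcd|xx' || echo "rarcd|: no match"

# A class written as [s|t] treats | as an ordinary character:
printf '%s\n' 'rarcd|' | grep -Eq 'r.r..[s|t]' && echo "[s|t] also accepts a literal |"
```

This only answers yes or no for a line; it does not report where the match is.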

Using Perl's index() or substr() would probably be the best way to go, I think. I will work on this tomorrow just to know for myself how to do it, but I will be excited to see what others come up with before I can post again.

this gives me something to think about tonight. heh

Hmm, some of the logic in this would be like so if index() is used.

Load the string into the index function. Note that index() looks for a literal substring, not a pattern, so the text that the pattern matched has to be captured first.
index() returns the leftmost occurrence at or after a given starting position. So if there are 2 matching strings I am at a loss, unless you take the return value of the first index() call, add 1, and use that as the starting position of the next search, tossing this in a loop till the end of the string.

The return value of index() is the number of characters before the match starts, so adding 6 gives the count up to and including the 6-character string. That should fulfill your request.
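
The "search again from just past the previous hit" loop can be sketched in shell with awk's match() standing in for Perl's index(). This is only an illustration under my own assumptions: the function name motif_ends is made up, and it reports the position of the last character of each match (start + 5), which is how I read the "inclusive of the 6 character string" requirement.

```shell
#!/bin/sh
# motif_ends STRING (hypothetical helper): print the 1-based position of
# the last character of every r-r--s / r-r--t match in STRING, one per
# line. Overlapping matches are reported too.
motif_ends() {
    printf '%s\n' "$1" | awk '
    {
        aa = "[acdefghiklmnpqrstvyz]"
        re = "r" aa "r" aa aa "[st]"
        rest = $0        # part of the line not yet searched
        offset = 0       # characters already dropped from the left
        while (match(rest, re)) {
            start = offset + RSTART      # 1-based start in the full line
            print start + 5              # the 6-character match ends here
            # Resume the search one character after this match started.
            rest = substr(rest, RSTART + 1)
            offset += RSTART
        }
    }'
}

motif_ends mwaaagglwrsraragtlralf    # prints 17
motif_ends rararstt                  # prints 6 and 8 (overlapping matches)
```

Restarting one character after the start of each hit, rather than after its end, is what lets the overlapping match in the second call show up.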

what do you guys think?

Last edited by Optimus_P; 09-18-2002 at 07:41 PM..
# 3  
Old 09-18-2002
If I understand what you're asking, try this. This script does what I think you want done. At least, I think it does...
Code:
#! /usr/bin/ksh

##  r-r--s
##  r-r--t

longset="[acdefghiklmnpqrstvyz]"
pattern="r${longset}r${longset}${longset}[ts]"
typeset -u upshift

linen=0
IFS=""
while read input ; do
        orig=$input
        matches=0
        pos=1
        ((linen=linen+1))
        image=""
        while ((${#input})) ; do
                preamble="Line: ${linen} At position"
                if [[ $input = *(?)${pattern}*(?) ]] ; then
                        ((matches=matches+1))
                        leftover=${input#*${pattern}}
                        temp=${input%${leftover}}
                        lead=${temp%${pattern}}
                        this=${temp#${lead}}
                        upshift=${this}
                        input=$leftover
                        if ((${#lead})) ; then
                                echo $preamble $pos ${#lead} unmatched characters
                                image="${image}${lead}"
                                ((pos=pos+${#lead}))
                        fi
                        echo $preamble $pos MATCH: $this
                        image="${image}${upshift}"
                        ((pos=pos+${#this}))
                else
                        if ((matches)) ; then
                                echo $preamble $pos ${#input} trailing characters
                        fi
                        image="${image}${input}"
                        input=""
                fi
        done
        if ((matches)) ; then
                echo "$image"
                echo
                echo
        fi
done
exit 0

Put the lines to be searched into a data file and feed that file to this script. Something like:
./thisscript < data.file
# 4  
Old 09-19-2002
Code:
#!/usr/bin/perl -w

while (<>) {
        chomp;

        if ($_ =~ /(r[acdefghiklmnpqrstvyz]r[acdefghiklmnpqrstvyz]{2}[st])/ ) {      # see if the current line matches what we are looking for.
                $found = index($_,$1);  # zero-based offset of the first match in the line
                write (STDOUT);         # print out the report.
        }
}



format STDOUT =
@<<<<< @<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$.,($found+6),(s/(r[acdefghiklmnpqrstvyz]r[acdefghiklmnpqrstvyz]{2}[st])/\U$1/g)
              ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$_
              ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$_
+-----------------------------+
.

format STDOUT_TOP =
+=================================+
|COOL STUFF BY OPTIMUSP at UNIXCOM|
+=================================+
Line Position Text
==== ======== ====
.

Code:
MY TEST DATA
mwaaagglwrsraragtlralfrsrdaalfpgcerglhcstrarjxsatacsavswlkkfaslgshstykpskleflmrstskktrkedharlralngll
ykaltdllctpevsqelydlnvelskvsltpdfsacraywkttlsaeqnahmeavlqrsaahmslisywqsqtldpgmkettlykmisgtlmphnpaapq
srpqapvcvgsimrrstsrlwstkggkikgsgawcgrgrwls

So far this is the base code. The only thing left is something to iterate through the line to see if there is more than 1 match. But then again, if you have at least 1 match you can study that line a bit more for a final inspection.
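
For that final inspection, one option is to mark every match in the line so they stand out. Here is a rough sketch in shell with awk; the function name mark_motifs is made up for the example. It upper-cases each match it finds scanning left to right, so of two overlapping matches only the first gets marked.

```shell
#!/bin/sh
# mark_motifs (hypothetical helper): read lines on stdin; for each line
# with at least one match, print "line-number: text" with every matched
# r-r--s / r-r--t string upper-cased.
mark_motifs() {
    awk '
    {
        aa = "[acdefghiklmnpqrstvyz]"
        re = "r" aa "r" aa aa "[st]"
        rest = $0; out = ""; hits = 0
        while (match(rest, re)) {
            # Copy the unmatched lead as is, then the match in upper case.
            out = out substr(rest, 1, RSTART - 1) toupper(substr(rest, RSTART, RLENGTH))
            rest = substr(rest, RSTART + RLENGTH)
            hits++
        }
        if (hits) print NR ": " out rest
    }'
}

printf 'aararagtbb\nnothing\nrcrdesxxrarkkt\n' | mark_motifs
# prints:
# 1: aaRARAGTbb
# 3: RCRDESxxRARKKT
```

To run it over a file of sequences, something like mark_motifs < data.file should do.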

Last edited by Optimus_P; 09-19-2002 at 04:46 PM..
# 5  
Old 09-20-2002

Okay

Thanks the ksh script works fine.
I added some refinements including a log file.

I haven't tried the Perl yet but I am sure it will work.

Thanks guys.
