Sliding window for string manipulation


 
# 1  
Old 05-14-2012

I have a string of "0"s and "1"s that I need to analyze. For each "1", I need to determine whether it lies in a neighborhood that is enriched for "1"s, meaning it is one of at least three "1"s in some 4-character window. My desired output is the count of "1"s that lie in an enriched area.

For example:
Input sequence= 0100101000111011010111000

Output = 9
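
To make the expected result concrete, here is a small bash sketch that lists every 4-character window holding at least three "1"s. It is not part of the original post; the function name enriched_windows is made up for illustration, and it assumes the string contains only "0" and "1".

```shell
#!/usr/bin/env bash
# enriched_windows STRING: print each 4-character window with at least three "1"s.
# Hypothetical helper for illustration; assumes STRING contains only 0s and 1s.
enriched_windows() {
    local s=$1 i win ones
    for (( i = 0; i + 4 <= ${#s}; i++ )); do
        win=${s:i:4}
        ones=${win//0/}              # drop the zeros; the length left is the number of 1s
        if (( ${#ones} >= 3 )); then
            echo "positions $(( i + 1 ))-$(( i + 4 )): $win"
        fi
    done
}

enriched_windows 0100101000111011010111000
```

For the sample input this lists eight windows, starting with "positions 10-13: 0111". The "1"s they cover sit at positions 11, 12, 13, 15, 16, 18, 20, 21 and 22 (counting from 1), which is where the 9 comes from.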

So far my code looks like the following:
Code:
    echo $string
    length=$(echo ${#string})
    rlength=$[$length-3]
    i=3
    count=0
    while [ $i -lt $rlength ];
    do
        res=$(echo ${string:$[$i]:1})
        if [ $[$res] -eq "1" ]
        then
            if [ $[$(echo ${string:$[$i-3]:1}) + $(echo ${string:$[$i-2]:1}) + $(echo ${string:$[$i-1]:1}) ] -ge "2" ]
            then
                count=$[$count+1]
            elif [ $[$(echo ${string:$[$i-3]:1}) + $(echo ${string:$[$i-2]:1}) + $(echo ${string:$[$i-1]:1}) ] -ge "2" ]
            then
                count=$[$count+1]
            elif [ $[$(echo ${string:$[$i-3]:1}) + $(echo ${string:$[$i-2]:1}) + $(echo ${string:$[$i-1]:1}) ] -ge "2" ]
            then
                count=$[$count+1]
            elif [ $[$(echo ${string:$[$i-3]:1}) + $(echo ${string:$[$i-2]:1}) + $(echo ${string:$[$i-1]:1}) ] -ge "2" ]
            then
                count=$[$count+1]
            fi
        fi
        i=$[$i+1]
    done
    echo $count

It runs, but it has two problems:
1) Most importantly, it is as slow as a snail.
2) It misses the first three characters of the string and the last three. I could live with this if necessary, as long as the rest of the code runs more quickly.

Any and all suggestions are welcome. I am still new to this, so a description of what any suggested code is doing would be really, really useful.

Last edited by monstrousturtle; 05-14-2012 at 03:18 PM.. Reason: clarity of the code
# 2  
Old 05-14-2012
Put this in "script.awk":
Code:
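# With FS="" each character is one field, so $i is the i-th character.
{
for (i=1;i<=NF;i++) {
  if ($i==1) {
    # try each of the four 4-character windows that contain position i
    for (j=1;j<=4;j++) {
      ones=0;
      for (k=(i+j-4);k<=(i+j-1);k++) {
        if (k>0) {            # skip positions before the start of the string
          if ($k==1) {
            ones++;
          }
          if (ones>=3) {
            e[i]=1;           # position i is a "1" inside an enriched window
          }
        }
      }
    }
  }
}
}
# count the marked positions; count+0 prints 0 rather than a blank line if there are none
END{for (i in e) count++;print count+0}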

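Then run (an empty FS splits the line into single characters in gawk and mawk; POSIX leaves this unspecified):
Code:
echo "$string" | awk -v FS="" -f script.awk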

BTW, your script is giving "6" for this sample input...

Last edited by bartus11; 05-14-2012 at 04:13 PM.. Reason: fixed for first three characters
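
For comparison, here is a minimal pure-bash sketch of the same idea. It is not from either poster: the function name count_enriched and its optional width and threshold arguments are assumptions made for illustration, and it expects a string of only "0" and "1". It keeps a running sum for one window sliding across the string and uses only builtin expansions, so it avoids the $(echo ...) command substitutions in the original script, each of which starts a subshell.

```shell
#!/usr/bin/env bash
# count_enriched STRING [WIDTH] [MIN_ONES]
# Hypothetical helper: counts the "1"s that fall inside at least one
# WIDTH-character window containing MIN_ONES or more "1"s.
# Assumes STRING holds only the characters 0 and 1.
count_enriched() {
    local s=$1 w=${2:-4} min=${3:-3}
    local n=${#s} i k sum=0 count=0
    local -a hit=()

    for (( i = 0; i < n; i++ )); do
        sum=$(( sum + ${s:i:1} ))                 # digit entering the window on the right
        if (( i >= w )); then
            sum=$(( sum - ${s:i-w:1} ))           # digit leaving the window on the left
        fi
        # Once a full window ends at i, mark all of its positions if it is enriched.
        if (( i >= w - 1 && sum >= min )); then
            for (( k = i - w + 1; k <= i; k++ )); do
                hit[k]=1
            done
        fi
    done

    # Count the marked positions that actually hold a "1".
    for (( i = 0; i < n; i++ )); do
        if [[ ${s:i:1} == 1 && ${hit[i]:-0} == 1 ]]; then
            count=$(( count + 1 ))
        fi
    done
    echo "$count"
}

count_enriched 0100101000111011010111000
```

For the sample input this prints 9, and it covers the first and last three characters as well. As far as I can tell it agrees with the awk script for any string of four or more characters; for shorter strings the awk script also accepts truncated windows, while this sketch only considers full ones.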