I have been trying to work out how to solve this problem, but I cannot find a way to do it, so I finally decided to ask. I have a large number of sequences of different lengths, aligned in the following format:
>Sequence ID1
TAGATGTGCCCGTGGGTTTCTAGATGTGCCCGTGGGTTTC
>Sequence ID2
TTGATGTCGTGGGTTTCCCGTAGATGTGCCCGTGGGTT
>Sequence ID3
TTGATGTGCCAGTTTCCCGTTAGATGTGCCCGTGGGTTTC
>Sequence ID4
TTGATGTGTCCCGTCGACACTAGATGTGCCCGTGGG
>Sequence ID5
TTGATTCCCGTCGACACCGGTAGATGTGCCCGTGGGTTTC
Now, what I need is a 'window' of, say, 10 characters that slides along the entire alignment in steps of, say, 5 characters, writing each window to its own file with a consecutive number, such as Block1, Block2, etc. Thus, I will end up with the following files:
Block1=
>Sequence ID1
TAGATGTGCC
>Sequence ID2
TTGATGTCGT
>Sequence ID3
TTGATGTGCC
>Sequence ID4
TTGATGTGTC
>Sequence ID5
TTGATTCCCG
Block2=
>Sequence ID1
GTGCCCGTGG
>Sequence ID2
GTCGTGGGTT
>Sequence ID3
GTGCCAGTTT
>Sequence ID4
GTGTCCCGTC
>Sequence ID5
TCCCGTCGAC
Block3=
>Sequence ID1
CGTGGGTTTC
>Sequence ID2
GGGTTTCCCG
>Sequence ID3
AGTTTCCCGT
>Sequence ID4
CCGTCGACAC
>Sequence ID5
TCGACACCGG
And so on. Most probably the last block will have a window of fewer than 10 characters, and that is perfectly fine.
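To make the expected behaviour concrete, here is a minimal Python sketch of the slicing described above. It is only an illustration under some assumptions:

- The function names (`read_fasta`, `sliding_blocks`, `write_blocks`) and the output file names (`Block1.fasta`, `Block2.fasta`, ...) are hypothetical.
- Windows are cut by plain character position.
- A new window starts every `step` characters until the longest sequence is exhausted.
- A sequence too short to reach a given window is simply left out of that block.

```python
from pathlib import Path


def read_fasta(lines):
    """Parse FASTA-style lines into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def sliding_blocks(records, window=10, step=5):
    """Yield one list of (header, slice) pairs per window position."""
    longest = max((len(seq) for _, seq in records), default=0)
    for start in range(0, longest, step):
        block = []
        for header, seq in records:
            piece = seq[start:start + window]
            if piece:  # assumption: skip sequences that end before this window
                block.append((header, piece))
        yield block


def write_blocks(records, out_dir=".", window=10, step=5):
    """Write each block to Block<N>.fasta (hypothetical naming) in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, block in enumerate(sliding_blocks(records, window, step), start=1):
        path = out_dir / f"Block{number}.fasta"
        with path.open("w") as handle:
            for header, piece in block:
                handle.write(f"{header}\n{piece}\n")
        paths.append(path)
    return paths
```

With the alignment saved in a file, something like `write_blocks(read_fasta(open("alignment.fasta")), "blocks")` would produce the numbered files; the input file name and output directory here are placeholders.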
Any help with this problem will be greatly appreciated!