Tricky task with DNA sequences.


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Tricky task with DNA sequences.
# 1  
Old 08-31-2011
Tricky task with DNA sequences.

I am trying to reverse and complement my DNA sequences. The file format is FASTA, something like this:
Quote:
>Influenza
TGGATATGAT
>Dengue
AGTTACTCCGAAA
>Varicella
CTTTGGATA
Now, to reverse the sequence, I should start reading from right to left. At the same should be complemented. Thus, "A" should be read as "T"; "C" should be read as "G"; "T" should be converted into "A"; and "G" must be seen as "C" in the reversed-complemented sequence. From my example, the resulting file should look something like this:
Quote:
>Influenza
ATCATATCCA
>Dengue
TTTCGGAGTAACT
>Varicella
TATCCAAAG
Any help will be greatly appreciated!
# 2  
Old 08-31-2011
You can use sth like that to reverse and complement dna chain (given as parameter to the script):
Code:
#!/bin/bash

# get chain as parameter
dna=$1

function rrev() {
	case $1 in
		"T")
		echo -n A
		;;
		"A")
		echo -n T
		;;
		"G")
		echo -n C
		;;
		"C")
		echo -n G
		;;
	esac
}

reversed=`echo $dna | rev`

for i in $(seq 0 $((${#dna}-1))); do
	rrev ${reversed:$i:1};
done
echo

Run script as follow:
Code:
$ ./rev.sh ATCATATCCA
TGGATATGAT

# 3  
Old 08-31-2011
Quote:
Thus, "A" should be read as "T"; "C" should be read as "G"; "T" should be converted into "A"; and "G" must be seen as "C" in the reversed-complemented sequence
Is there some algorithm behind or this is a fixed mapping?
I mean there are only A, C, G and T and T, G, C and A?
# 4  
Old 08-31-2011
If the answer is yes:

Code:
perl -nle'BEGIN {
  @map{ A, C, G, T } = ( T, G, C, A )
  }
  print /^>/ ?
    $_ :
      join //, map $map{ $_ }, split //, scalar reverse
  ' infile

This User Gave Thanks to radoulov For This Post:
# 5  
Old 08-31-2011
Ok, got one, which takes file as parameter:

Code:
#!/bin/bash

file=$1

function rrev() {
	case $1 in
		"T")
		echo -n A
		;;
		"A")
		echo -n T
		;;
		"G")
		echo -n C
		;;
		"C")
		echo -n G
		;;
	esac
}

while read line; do

	if [ `echo $line | grep -c "^>"` -eq 0 ]; then
		orig_revd=`echo $line | rev`
		for i in $(seq 0 $((${#line}-1))); do
        		rrev ${orig_revd:$i:1};
		done
		echo
	else
		echo $line
	fi

done < $file

Edit:
@up - looks much simplier in perl :-)
# 6  
Old 08-31-2011
With awk:

Code:
awk 'BEGIN {
  j = n = split("A C G T", t)
  for (i = 0; ++i <= n;)
    map[t[i]] = t[j--]
  }
{
  if (/^>/) print
  else {
    for (i = length; i; i--)
	  printf "%s", map[substr($0, i, 1)]
    print x
	}	  
  }'  infile

# 7  
Old 09-01-2011
Radulov

That's great! The only part that is missing is the headers (>Sequence ID) for each sequence. How can I modify your perl script so it can include the headers for each and every sequence?
Thanks!
Login or Register to Ask a Question

Previous Thread | Next Thread

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Convert a DNA sequence into Amino Acid

I am trying to write a bash script that would be able to read DNA sequences (each line in the file is a sequence) from a file, where sequences are separated by an empty line. I am then to find the amino acid that these DNA sequences encode per codon (each group of three literals.) For example, if I... (3 Replies)
Discussion started by: faizlo
3 Replies

2. Shell Programming and Scripting

Shell script for changing the accession number of DNA sequences in a FASTA file

Hi, I am having a file of dna sequences in fasta format which look like this: >admin_1_45 atatagcaga >admin_1_46 atatagcagaatatatat with many such thousands of sequences in a single file. I want to the replace the accession Id "admin_1_45" similarly in following sequences to... (5 Replies)
Discussion started by: margarita
5 Replies

3. HP-UX

Tricky situation getting IP address

Hi, I have a multihomed system HP-UX with two NIC cards having IP address 10.9.0.13 & 10.9.0.45 I have two weblogic servers running one listening on "10.9.0.13" and the other on "10.9.0.45" Given a PID how is it possible to extract the IP Address that the weblogic server is using and... (1 Reply)
Discussion started by: mohtashims
1 Replies

4. Solaris

Tricky egrep

Hi folks! My first post here. I'm working on a script that retrieves a range of files from a list depending on a range of time. UPDATE: I've seen it could be difficult to read all this thing, so I'll make a summarize it.. How come I do this and take a result.. grep "..\:.." lista.new |... (4 Replies)
Discussion started by: kl0x
4 Replies

5. Shell Programming and Scripting

Extracting DNA sequences from GenBank files using Perl

Hi all, Using Perl, I need to extract DNA bases from a GenBank file for a given plant species. A sample GenBank file is here... Nucleotide This is saved on my computer as NC_001666.gb. I also have a file that is saved on my computer as NC_001666.txt. This text file has a list of all... (5 Replies)
Discussion started by: akreibich07
5 Replies

6. Shell Programming and Scripting

Parse an XML task list to create each task.xml file

I have an task definition listing xml file that contains a list of tasks such as <TASKLIST <TASK definition="Completion date" id="Taskname1" Some other <CODE name="Code12" <Parameter pname="Dog" input="5.6" units="feet" etc /Parameter> <Parameter... (3 Replies)
Discussion started by: MissI
3 Replies

7. Shell Programming and Scripting

comment and Uncomment single task out of multiple task

I have a file contains TASK gsnmpproxy { CommandLine = $SMCHOME/bin/gsnmpProxy.exe } TASK gsnmpdbgui { CommandLine = $SMCHOME/bin/gsnmpdbgui.exe I would like to comment and than uncomment specific task eg TASK gsnmpproxy Pls suggest how to do in shell script (9 Replies)
Discussion started by: madhusmita
9 Replies

8. Shell Programming and Scripting

Tricky Sed

Hello. I am trying to convert occurrences of 'NULL' from a datafile. The 'NULL' occurences appears at this: |NULL| NULL|NULL| NULL|NULL| NULL|NULL| NULL| There should be 52 fields per line. I would like any occurrence of | NULL| or |NULL| to appear as '||' Currently I am using this sed... (2 Replies)
Discussion started by: bestbuyernc
2 Replies

9. Windows & DOS: Issues & Discussions

Tricky one...

Here's my problem: I have a laptop running Windows XP Pro with no internal CD or Floppy drives. I want to install Linux on it. I don't care about the Windows XP Pro installation, in fact I would like to install Linux over the entirety of the HD. However I cannot boot from any external CD drive... (1 Reply)
Discussion started by: saabir
1 Replies
Login or Register to Ask a Question