OK, this is a bit messy because I wrote it very quickly. First I created a sed script (call the file dna.sed):
(You don't need the /g at the end of these commands; again, I created this quickly.)
Then a script to process the DNA sequence lines (it assumes one sequence per line):
Again, the script expects to read the sequences one at a time; you can redirect its input or feed it from a pipe, etc.
My example below uses just the sample line you provided.
Hi all,
I have a requirement where the variable name starts with $, like
$Amd=/home/student/test/
How do I work with it? Can someone help me? I am in great confusion. (5 Replies)
Hi all,
Using Perl, I need to extract DNA bases from a GenBank file for a given plant species. A sample GenBank file is here...
Nucleotide
This is saved on my computer as NC_001666.gb. I also have a file that is saved on my computer as NC_001666.txt. This text file has a list of all... (5 Replies)
I am trying to reverse and complement my DNA sequences. The file format is FASTA, something like this:
Now, to reverse the sequence, I should start reading from right to left. At the same time, each base should be complemented. Thus, "A" should be read as "T"; "C" should be read as "G"; "T" should be converted... (8 Replies)
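A minimal Python sketch of the reverse-complement step described above, assuming plain FASTA input in which header lines start with ">" and a record's sequence may span several lines. The function names are hypothetical, and any character other than A, C, G or T is passed through unchanged.

```python
# Complement table for the four DNA bases, in upper and lower case.
# Characters not listed here (for example N) are left as they are.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then read the result right to left."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_complement_fasta(lines):
    """Yield FASTA lines with each record's sequence reverse-complemented.

    Header lines pass through unchanged. The sequence lines of a record
    are joined before reversing, so multi-line records are handled.
    Sequence lines that appear before the first header are ignored.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header
                yield reverse_complement("".join(chunks))
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header
        yield reverse_complement("".join(chunks))
```

For instance, reverse_complement("ATGC") complements the string to "TACG" and then reverses it to "GCAT".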
Hi all,
I have a file like this
ID 3BP5L_HUMAN Reviewed; 393 AA.
AC Q7L8J4; Q96FI5; Q9BQH8; Q9C0E3;
DT 05-FEB-2008, integrated into UniProtKB/Swiss-Prot.
DT 05-JUL-2004, sequence version 1.
DT 05-SEP-2012, entry version 71.
FT COILED 59 140 ... (1 Reply)
I am looking for a simple way to convert ranges to a numerical sequence, assigning the original value of each range to every individual number in that range.
Thank you
given data
13196-13199 0
13200 4
13201 10
13202-13207 3
13208-13210 7
desired... (3 Replies)
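One way to sketch the range expansion in Python. The desired output is cut off in the post, so the output format here (one "number value" pair per line) is an assumption, as are the function name and the restriction to non-negative integers.

```python
def expand_ranges(lines):
    """Yield (number, value) pairs, expanding 'start-end value' lines.

    A line with a single number, such as '13200 4', yields one pair.
    Lines that do not have exactly two fields are skipped.
    Assumes non-negative integers, so '-' only ever separates a range.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        span, value = fields
        start, _, end = span.partition("-")
        first = int(start)
        last = int(end) if end else first
        for number in range(first, last + 1):
            yield number, value


# The sample data given in the post.
SAMPLE = """13196-13199 0
13200 4
13201 10
13202-13207 3
13208-13210 7"""

for number, value in expand_ranges(SAMPLE.splitlines()):
    print(number, value)
```

With the sample data, the first range produces the four lines 13196 0 through 13199 0, and each single number is passed through with its own value.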
Hi,
I have a file of DNA sequences in FASTA format that looks like this:
>admin_1_45
atatagcaga
>admin_1_46
atatagcagaatatatat
with many thousands of such sequences in a single file. I want to replace the accession ID "admin_1_45", and similarly those in the following sequences, to... (5 Replies)
If I run the rm -rf * command in a parent directory:
/data > rm -rf *
Is there any way to know which files will be deleted first?
Start using code tags please, ty. (2 Replies)
Discussion started by: sameermohite
2 Replies
ssake
SSAKE(1) General Commands Manual SSAKE(1)
NAME
ssake - assembling millions of very short DNA sequences
SYNOPSIS
Progressive assembly of millions of short DNA sequences by k-mer search through a prefix tree and 3' extension.
OPTIONS
-f Fasta file containing all the [paired (-p 1) / unpaired (-p 0)] reads (required); paired reads must now be separated by ":"
-s Fasta file containing sequences to use as seeds exclusively (specify only if different from read set, optional)
-m Minimum number of overlapping bases with the seed/contig during overhang consensus build up (default -m 16)
-o Minimum number of reads needed to call a base during an extension (default -o 3)
-r Minimum base ratio used to accept an overhang consensus base (default -r 0.7)
-t Trim up to -t base(s) on the contig end when all possibilities have been exhausted for an extension (default -t 0)
-p Paired-end reads used? (-p 1=yes, -p 0=no, default -p 0)
-v Runs in verbose mode (-v 1=yes, -v 0=no, default -v 0, optional)
-b Base name for your output files (optional)
============ Options below only considered with -p 1 ============
-d Mean distance expected/observed between paired-end reads (default -d 200, optional)
-e Error (%) allowed on mean distance e.g. -e 0.75 == distance +/- 75% (default -e 0.75, optional)
-k Minimum number of links (read pairs) to compute scaffold (default -k 2, optional)
-a Maximum link ratio between two best contig pairs *higher values lead to least accurate scaffolding* (default -a 0.70, optional)
-z Minimum contig size to track paired-end reads (default -z 50, optional)
-g Fasta file containing unpaired sequence reads (optional)
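As a worked example of the -d and -e defaults listed above, here is a short Python sketch of the distance window that "-e 0.75 == distance +/- 75%" implies. This is only a reading of that option text, not SSAKE's own code, and the function name is hypothetical.

```python
def distance_window(mean_distance=200, error=0.75):
    """Return the (low, high) distances implied by mean +/- error.

    Defaults mirror the documented -d 200 and -e 0.75.
    """
    low = mean_distance * (1 - error)
    high = mean_distance * (1 + error)
    return low, high


low, high = distance_window()
print(low, high)  # 50.0 350.0
```

Under this reading, the defaults accept pair distances from 50 to 350 bases.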
SEE ALSO
/usr/share/doc/ssake/SSAKE.readme
AUTHORS
This manual page was written by Andreas Tille <tille@debian.org> for the Debian system (but may be used by others). Permission is granted
to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 or any later version published by
the Free Software Foundation.
On Debian systems, the complete text of the GNU General Public License can be found in /usr/share/common-licenses/GPL.
January 2008 SSAKE(1)