How to count the length of fasta sequences? Post: 303033621

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extract length wise sequences from fastq file

I have a fastq file from small RNA sequencing with sequence lengths between 15 and 30. I want to keep only the sequences with lengths between 21 and 25 and write them to another fastq file. How can I do that?

2. Shell Programming and Scripting

Shell script for changing the accession number of DNA sequences in a FASTA file

Hi, I have a file of DNA sequences in FASTA format that looks like this: >admin_1_45 atatagcaga >admin_1_46 atatagcagaatatatat with many thousands of such sequences in a single file. I want to replace the accession ID "admin_1_45", and similarly in the following sequences, to...

3. Shell Programming and Scripting

Extract sequences from a FASTA file based on another file

4. Shell Programming and Scripting

Count and search by sequence in multiple fasta file

Hello, I have 10 fasta files with sequenced read information, with read sizes from 15 to 35. I have combined the reads, collapsed them into unique reads, and filtered for unique reads 18 - 26 bp long. Now I want to count each unique read's appearances in all the fasta files and make a table...

5. Shell Programming and Scripting

Shorten header of protein sequences in fasta file

I have a fasta file as follows >sp|O15090|FABP4_HUMAN Fatty acid-binding protein, adipocyte OS=Homo sapiens GN=FABP4 PE=1 SV=3 MCDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDVITIKSESTFKN TEISFILGQEFDEVTADDRKVKSTITLDGGVLVHVQKWDGKSTTIKRKREDDKLVVECVM KGVTSTRVYERA >sp|L18484|AP2A2_RAT AP-2...

6. UNIX for Dummies Questions & Answers

Select distinct sequences from fasta file and list

Hi, how can I extract sequences from a fasta file according to certain criteria? The beginning of my file (containing more than 1000 sequences in total) looks like this: >H8V34IS02I59VP SDACNDLTIALLQIAREVRVCNPTFSFRWHPQVKDEVMRECFDCIRQGLG YPSMRNDPILIANCMNWHGHPLEEARQWVHQACMSPCPSTKHGFQPFRMA...

7. Shell Programming and Scripting

Getting unique sequences from multiple fasta file

Hi, I have a fasta file with multiple sequences. How can I get only the unique sequences from the file? For example, my_file.fasta: >seq1 TCTCAAAGAAAGCTGTGCTGCATACTGTACAAAACTTTGTCTGGAGAGATGGAGAATCTCATTGACTTTACAGGTGTGGACGGTCTTCAGAGATGGCTCAAGCTAACATTCCCTGACACACCTATAGGGAAAGAGCTAAC >seq2...

8. Shell Programming and Scripting

Outputting sequences based on length with sed

I have this file: >ID1 AA >ID2 TTTTTT >ID-3 AAAAAAAAA >ID4 TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC And I am trying to use this script to output sequences longer than 15 characters: sed -r '/^>/N;{/^.{,15}$/d}' The desired output would be this: >ID4...

9. Shell Programming and Scripting

Shorten header of protein sequences in fasta file to only organism name

I have a fasta file as follows >sp|Q8WWQ8|STAB2_HUMAN Stabilin-2 OS=Homo sapiens OX=9606 GN=STAB2 PE=1 SV=3 MMLQHLVIFCLGLVVQNFCSPAETTGQARRCDRKSLLTIRTECRSCALNLGVKCPDGYTM ITSGSVGVRDCRYTFEVRTYSLSLPGCRHICRKDYLQPRCCPGRWGPDCIECPGGAGSPC NGRGSCAEGMEGNGTCSCQEGFGGTACETCADDNLFGPSCSSVCNCVHGVCNSGLDGDGT...

10. UNIX for Beginners Questions & Answers

How to add specific bases at the beginning and ending of all the fasta sequences?

Hi, I have to add 7 bases of specific nucleotide at the beginning and ending of all the fasta sequences of a file. For example, I have a multi fasta file namely test.fasta as given below test.fasta >TalAA18_Xoo_CIAT_NZ_CP033194.1:_2936369-2939570:+1...

LEARN ABOUT DEBIAN

reprof

REPROF(1)							   User Commands							 REPROF(1)

NAME

       reprof - predict protein secondary structure and solvent accessibility

SYNOPSIS

       reprof -i [query.blastPsiMat] [OPTIONS]

       reprof -i [query.fasta] [OPTIONS]

       reprof -i [query.blastPsiMat|query.fasta] --mutations [mutations.txt] [OPTIONS]

DESCRIPTION

       Predict protein secondary structure and solvent accessibility.

   Output Format
       The output format is self-explanatory, i.e. the columns of the output are described in the output file itself.

OPTIONS

       -i, --input=FILE
	   Input BLAST PSSM matrix file (from Blast -Q option) or input (single) FASTA file.

       -o, --out=FILE
	   Either an output file or a directory. If not provided or a directory, the suffix of the input filename (i.e. .fasta or .blastPsiMat) is
	   replaced to create an output filename.

       --mutations=[all|FILE]
	   Either the keyword "all" to predict all possible mutations, or a file containing mutations, one per line, such as "C12M", meaning C is
	   mutated to M at position 12:

	    C30Y
	    R31W
	    G48D

	   This mutation code is also attached to the output filename using "_".  An additional file ending "_ORI" contains the prediction using
	   no evolutionary information even if a BLAST PSSM matrix was provided.

       --modeldir=DIR
	   Directory where the model and feature files are stored.  Default: /usr/share/reprof.
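
       The mutation codes accepted by --mutations (for example "C12M") can be checked before they are passed to reprof. The following is
       only a minimal sketch in POSIX sh, assuming each code is one uppercase letter, a position, and one uppercase letter; the function
       name parse_mutation is hypothetical and is not part of reprof.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of reprof): split a mutation code such as
# "C12M" into original residue, position, and replacement residue.
# Assumption: a code is exactly <uppercase letter><digits><uppercase letter>.
parse_mutation() {
    code=$1
    # Reject anything that does not match the assumed pattern.
    if ! printf '%s\n' "$code" | grep -Eq '^[A-Z][0-9]+[A-Z]$'; then
        echo "invalid mutation code: $code" >&2
        return 1
    fi
    from=$(printf '%s\n' "$code" | cut -c1)                  # first character
    pos=$(printf '%s\n' "$code" | sed 's/^.\(.*\).$/\1/')    # digits in between
    to=$(printf '%s\n' "$code" | sed 's/.*\(.\)$/\1/')       # last character
    printf '%s %s %s\n' "$from" "$pos" "$to"
}

parse_mutation C30Y   # prints: C 30 Y
```

       Running such a check over every line of a mutations file would catch malformed entries early; whether reprof itself applies the
       same rules is not stated in this manual page.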

AUTHOR

       Peter Hoenigschmid hoenigschmid@rostlab.org, Burkhard Rost

EXAMPLES

       Prediction from BLAST PSSM matrix for best results:
	    reprof -i /usr/share/doc/reprof/examples/example.Q -o /tmp/example.Q.reprof

       Prediction from FASTA file:
	    reprof -i /usr/share/doc/reprof/examples/example.fasta -o /tmp/example.fasta.reprof

       Prediction from BLAST PSSM matrix file using the mutation mode:
	    reprof -i /usr/share/doc/reprof/examples/example.Q -o /tmp/mutations_example.Q.reprof --mutations /usr/share/doc/reprof/examples/mutations.txt
	    # Result files for the above call are going to be:
	    # /tmp/mutations_example.Q.{reprof,reprof_F172P,reprof_M1Q,reprof_N34Y,reprof_ORI} - see --mutations for a description of the extensions.
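
       The naming of the result files in mutation mode can be sketched as follows. This is an illustration inferred from the example
       above and from the description of --mutations, not reprof's own code; the function name expected_outputs is hypothetical, and it
       assumes the mutations file lists explicit codes (not the keyword "all").

```shell
#!/bin/sh
# Hypothetical helper (NOT part of reprof): list the result files one would
# expect from a --mutations run, following the naming shown in the example:
# the -o value itself, one "<out>_<code>" per mutation, and "<out>_ORI".
expected_outputs() {
    out=$1        # value passed to -o
    mutations=$2  # file with one mutation code per line
    printf '%s\n' "$out"
    # Read codes in file order; also handle a last line without a newline.
    while IFS= read -r code || [ -n "$code" ]; do
        # Skip blank lines.
        if [ -n "$code" ]; then
            printf '%s_%s\n' "$out" "$code"
        fi
    done < "$mutations"
    printf '%s_ORI\n' "$out"
}
```

       For instance, calling expected_outputs with /tmp/mutations_example.Q.reprof and a mutations file gives a list that can be compared
       against the files actually present after a run.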

COPYRIGHT

       This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
       the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

       This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.

       You should have received a copy of the GNU General Public License along with this program.  If not, see <http://www.gnu.org/licenses/>.

BUGS

       https://rostlab.org/bugzilla3/enter_bug.cgi?product=reprof

SEE ALSO

       blast2(1)

       http://rostlab.org/

1.0.1								    2012-01-13								 REPROF(1)