Sponsored Content
Full Discussion: grep FASTA files
Top Forums UNIX for Dummies Questions & Answers grep FASTA files Post 302430266 by Xterra on Thursday 17th of June 2010 03:50:17 AM
Old 06-17-2010
pseudocoder

Would it be a way to do the same with bash? It will be easier for me to understand.
I was wondering if there is any way to calculate the frequency of each sequence? In other words, let assume that after 'trimming' the sequences there are several that are identical, would it be possible to determine the frequency and include it as part of the ID line? Something like this:

Quote:
> Seq A Freq 50
AGAGATAGATAGAGCTGAT
> Seq B Freq 25
AGAGATAGATAGAGCTGAT
> Seq C Freq 25
AGAGATAGATAGAGCTGAT


Thanks

Last edited by Xterra; 06-17-2010 at 05:00 AM..
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

fasta format?

Hi, I'm in need of creating a file in the fasta format: >1A6A.A HVIIQAEFYLNPDQSGEFMFDFDGDEIFHVDMAKKETVWRLEEFGRFASFEAQGALANIAVDKANLEIMTKRSNYTPITN VPPEVTVLTNSPVELREPNVLICFIDKFTPPVVNVTWLRNGKPVTTGVSETVFLPREDHLFRKFHYLPFLPSTEDVYDCR VEHWGLDEPLLKHWEF >1A6A.B ... (5 Replies)
Discussion started by: lost
5 Replies

2. Shell Programming and Scripting

grep for certain files using a file as input to grep and then move

Hi All, I need to grep few files which has words like the below in the file name , which i want to put it in a file and and grep for the files which contain these names and move it to a new directory , full file name -C20091210.1000-20091210.1100_SMGBSC3:1000... (2 Replies)
Discussion started by: anita07
2 Replies

3. Shell Programming and Scripting

Changing from FASTA to PHYLIP format

I really need some help with this task. I have a bunch of FASTA files with hundreds of DNA sequences that look like this: >SeqID1 AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC TGACAGGTTCAAAGGCTGCAGTGCAGTGACGTGACGATTT >Sequence 22... (13 Replies)
Discussion started by: Xterra
13 Replies

4. UNIX for Dummies Questions & Answers

renaming (renumbering) fasta files

I have a fasta file that looks like this: >Noname ACCAAAATAATTCATGATATACTCAGATCCATCTGAGGGTTTCACCACTTGTAGAGCTAT CAGAAGAATGTCAATCAACTGTCCGAGAAAAAAGAATCCCAGG >Noname ACTATAAACCCTATTTCTCTTTCTAAAAATTGAAATATTAAAGAAACTAGCACTAGCCTG ACCTTTAGCCAGACTTCTCACTCTTAATGCTGCGGACAAACAGA ... I want to... (2 Replies)
Discussion started by: Oyster
2 Replies

5. UNIX for Dummies Questions & Answers

Breaking a fasta formatted file into multiple files containing each gene separately

Hey, I've been trying to break a massive fasta formatted file into files containing each gene separately. Could anyone help me? I've tried to use the following code but i've recieved errors every time: for i in *.rtf.out do awk '/^>/{f=++d".fasta"} {print > $i.out}' $i done (1 Reply)
Discussion started by: Ann Mc Cartney
1 Replies

6. UNIX for Dummies Questions & Answers

How to change sequence name in along fasta file?

Hi I have an alignment file (.fasta) with ~80 sequences. They look like this- >JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0 GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT... (2 Replies)
Discussion started by: baika
2 Replies

7. UNIX for Dummies Questions & Answers

Fasta header modification

Hi, I need some help with modifying fasta headers. I have a fasta file with thousands of contigs and I need to modify their headers with the information obtained from a second file. File 1 contains the fasta sequences: >contig0001 length=11115 numreads=10777 agatgtagatctct... (6 Replies)
Discussion started by: Lokaps
6 Replies

8. UNIX for Dummies Questions & Answers

Round up -FASTA file

I have the following script: awk 'FNR==NR{s+=$3;next;} { print $1 , $2, 100*$3/s }' and the following file: >P39PT-1224 Freq 900 cccctacgacggcattggtaatggctcagctgctccggatcccgcaagccatcttggatatgagggttcgtcggcctcttcagccaagg-cccccagcagaacatccagctgatcg >P39PT-784 Freq 2... (2 Replies)
Discussion started by: Xterra
2 Replies

9. Shell Programming and Scripting

Help with reformat single-line multi-fasta into multi-line multi-fasta

Input File: >Seq1 ASDADAFASFASFADGSDGFSDFSDFSDFSDFSDFSDFSDFSDFSDFSDFSD >Seq2 SDASDAQEQWEQeqAdfaasd >Seq3 ASDSALGHIUDFJANCAGPATHLACJHPAUTYNJKG ...... Desired Output File >Seq1 ASDADAFASF ASFADGSDGF SDFSDFSDFS DFSDFSDFSD FSDFSDFSDF SD >Seq2 (4 Replies)
Discussion started by: patrick87
4 Replies

10. UNIX for Beginners Questions & Answers

How to append two fasta files?

I have two fasta files as shown below, File:1 >Contig_1:90600-91187 AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGGAATTGATGACGGTC >Contig_98:35323-35886 GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG >Contig_24:26615-28387... (11 Replies)
Discussion started by: dineshkumarsrk
11 Replies
Bio::Assembly::Singlet(3pm)				User Contributed Perl Documentation			       Bio::Assembly::Singlet(3pm)

NAME
Bio::Assembly::Singlet - Perl module to hold and manipulate singlets from sequence assembly contigs. SYNOPSIS
# Module loading use Bio::Assembly::IO; # Assembly loading methods $aio = Bio::Assembly::IO->new( -file => 'test.ace.1', -format => 'phrap' ); $assembly = $aio->next_assembly; foreach $singlet ($assembly->all_singlets) { # do something } # OR, if you want to build the singlet yourself, use Bio::Assembly::Singlet; $singlet = Bio::Assembly::Singlet->new( -id => 'Singlet1', -seqref => $seq ); DESCRIPTION
A singlet is a sequence that phrap was unable to align to any other sequences. FEEDBACK
Mailing Lists User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing lists Your participation is much appreciated. bioperl-l@bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists Support Please direct usage questions or support issues to the mailing list: bioperl-l@bioperl.org rather than to the module maintainer directly. Many experienced and reponsive experts will be able look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible. Reporting Bugs Report bugs to the Bioperl bug tracking system to help us keep track the bugs and their resolution. Bug reports can be submitted via the web: https://redmine.open-bio.org/projects/bioperl/ AUTHOR - Chad S. Matsalla bioinformatics1 at dieselwurks.com APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _ new Title : new Usage : $singlet = $io->new( -seqref => $seq ) Function: Create a new singlet object Returns : A Bio::Assembly::Singlet object Args : -seqref => Bio::Seq-compliant sequence object for the singlet seqref Title : seqref Usage : $seqref = $singlet->seqref($seq); Function: Get/set the sequence to which this singlet refers Returns : A Bio::Seq-compliant object Args : A Bio::Seq-compliant or Bio::Seq::Quality object _seq_to_singlet Title : _seq_to_singlet Usage : $singlet->seqref($seq) Function: Transform a sequence into a singlet Returns : 1 for sucess Args : A Bio::Seq-compliant object perl v5.14.2 2012-03-02 Bio::Assembly::Singlet(3pm)
All times are GMT -4. The time now is 08:26 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy