grep FASTA files Post: 302430308

Bio::Tools::Run::Alignment::StandAloneFasta(3pm)	User Contributed Perl Documentation	  Bio::Tools::Run::Alignment::StandAloneFasta(3pm)

NAME

       Bio::Tools::Run::Alignment::StandAloneFasta - Object for the local execution of the Fasta3 programs ((t)fasta3, (t)fastx3, (t)fasty3,
       ssearch3)

SYNOPSIS

	 #!/usr/bin/perl
	 use Bio::Tools::Run::Alignment::StandAloneFasta;
	 use Bio::SeqIO;
	 use strict;
	 my @arg=(
	 'b' =>'15',
	 'O' =>'resultfile',
	 'H'=>'',
	 'program'=>'fasta34'
	 );

	 my $factory=Bio::Tools::Run::Alignment::StandAloneFasta->new(@arg);
	 $factory->ktup(1);

	 $factory->library('p');

	 #print result file name
	 print $factory->O;

	 my @fastreport=$factory->run($ARGV[0]);

	 foreach (@fastreport) {
	   print "Parsed fasta report:\n";
	   my $result = $_->next_result;
	   while( my $hit = $result->next_hit()) {
	     print "\thit name: ", $hit->name(), "\n";
	     while( my $hsp = $hit->next_hsp()) {
	       print "E: ", $hsp->evalue(), " frac_identical: ",
	             $hsp->frac_identical(), "\n";
	     }
	   }
	 }

	  #pass in seq objects
	  my $sio = Bio::SeqIO->new(-file=>$ARGV[0],-format=>"fasta");
	  my $seq = $sio->next_seq;
	  #reuse the array declared above; pass the sequence object rather than the file name
	  @fastreport=$factory->run($seq);

DESCRIPTION

       This wrapper works with version 3 of the FASTA program package (see W. R. Pearson and D. J. Lipman (1988), "Improved Tools for Biological
       Sequence Analysis", PNAS 85:2444-2448 (Pearson and Lipman, 1988); W. R. Pearson (1996), "Effective protein sequence comparison", Meth.
       Enzymol. 266:227-258 (Pearson, 1996); Pearson et al. (1997), Genomics 46:24-36 (Zhang et al., 1997); Pearson (1999), Meth. in Molecular
       Biology 132:185-219 (Pearson, 1999)).  Version 3 of the FASTA package contains many programs for searching DNA and protein databases and
       one program (prss3) for evaluating statistical significance from randomly shuffled sequences.

       Fasta is available at ftp://ftp.virginia.edu/pub/fasta

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one
       of the Bioperl mailing lists.  Your participation is much appreciated.

	 bioperl-l@bioperl.org			- General discussion
	 http://bioperl.org/wiki/Mailing_lists	- About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly
       address it. Please include a thorough description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution.  Bug reports can be submitted via
       the web:

	 http://redmine.open-bio.org/projects/bioperl/

AUTHOR -  Tiequan Zhang
	      Adapted for bioperl by Shawn Hoon
	      Enhanced by Jason Stajich

       Email tqzhang1973@yahoo.com
	     shawnh@fugu-sg.org
	     jason-at-bioperl.org

Appendix
       The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.

   program_name
	Title	: program_name
	Usage	: $factory->program_name()
	Function: holds the program name
	Returns:  string
	Args	: None

   executable
	Title	: executable
	Usage	: my $exe = $factory->executable('fasta34');
	Function: Finds the full path to the FASTA executable
	Returns : string representing the full path to the exe
	Args	: [optional] name of executable to set path to
		  [optional] boolean flag whether or not to warn when exe is not found

   program_dir
	Title	: program_dir
	Usage	: $factory->program_dir(@params)
	Function: returns the program directory, obtained from ENV variable.
	Returns:  string
	Args	:

   run
	Title	: run

	Usage	: my @fasta_object = $factory->run($input,$onefile);
		  where $factory is the factory object for the executable FASTA program;
		  $input is the name of a file containing sequences in FASTA
		  format, a Bio::Seq object, or an array of Bio::Seq objects;
		  $onefile is 0 if you want to save the outputs to different files
		  (default: outputs are saved in one file)

	Function: Attempts to run an executable FASTA program
		  and return an array of fasta objects containing the fasta report
	Returns : array of fasta report objects
		  If the user specifies the output file(s),
		  the raw fasta report will be saved
	Args	: sequence object OR array reference of sequence objects, OR
		  name of a file containing FASTA-formatted sequences

   library
	Title	: library
	Usage	: my $lb = $self->library
	Function: Fetch or set the name of the library to search against
	Returns : The name of the library
	Args	: No argument if the user wants to fetch the name of the library file.
		  To set the library to search against: a letter, or a string of
		  letters preceded by % (e.g. P or %pn; the letter is the character
		  in the third field of any line of the fastlibs file), or the name
		  of a library file (if the environment variable FASTLIBS is not set)
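
The three argument forms accepted by library() can be told apart by shape alone. The following is a minimal sketch in plain Perl, assuming a hypothetical helper classify_library_arg() and a made-up file name swissprot.fasta; neither is part of the module, and the sketch treats any single letter as a FASTLIBS letter even though a one-letter file name is also possible when FASTLIBS is not set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: label the kind of
# argument that library() is documented to accept.
sub classify_library_arg {
    my ($arg) = @_;
    return 'fetch current value' unless defined $arg;        # no argument
    return 'FASTLIBS letter'     if $arg =~ /^[A-Za-z]$/;     # e.g. P
    return 'FASTLIBS letter set' if $arg =~ /^%[A-Za-z]+$/;   # e.g. %pn
    return 'library file name';                               # anything else
}

# Illustrative inputs only; 'swissprot.fasta' is an assumed file name.
for my $arg (undef, 'P', '%pn', 'swissprot.fasta') {
    my $shown = defined $arg ? "'$arg'" : 'no argument';
    printf "%-20s => %s\n", $shown, classify_library_arg($arg);
}
```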

   output
	Title	: output
	Usage	: $obj->output($newval)
	Function: The output directory if we want to use this
	Example :
	Returns : value of output (a scalar)
	Args	: on set, new value (a scalar or undef, optional)

   ktup
	Title	:  ktup
	Usage	:  my $ktup = $self->ktup
	Function:  Fetch or set the ktup value for executable FASTA programs
	Example :
	Returns :  The value of ktup  if defined, else undef is returned
	Args	:  No argument if the user wants to fetch the ktup value; an integer value between 1 and 6 if the user wants to set the
		  ktup value
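
The fetch-or-set contract described for ktup can be illustrated with a small stand-alone accessor. This is a sketch only: the KtupDemo package is hypothetical, and rejecting an out-of-range value with die is an assumption made for the example, not a statement about how the module itself reacts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT the module's real implementation.
package KtupDemo;

sub new { return bless {}, shift }

# Combined getter/setter: with an argument, validate and store it;
# without one, return the stored value (undef if never set).
sub ktup {
    my ($self, $value) = @_;
    if (defined $value) {
        # Assumed behaviour: reject anything outside 1-6 by dying.
        die "ktup must be an integer between 1 and 6\n"
            unless $value =~ /^[1-6]$/;
        $self->{ktup} = $value;
    }
    return $self->{ktup};
}

package main;

my $demo = KtupDemo->new;
print defined $demo->ktup ? "ktup is set\n" : "ktup is undef\n";
$demo->ktup(2);
print "ktup is now ", $demo->ktup, "\n";
eval { $demo->ktup(9) };
print "rejected: $@" if $@;
```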

   _setinput
	Title	:  _setinput
	Usage	:  Internal function, not to be called directly
	Function:  Create input file(s) for FASTA executable
	Example :
	Returns : array of Bio::Seq object reference
	Args	: Seq object reference or input file name

   _exist
	Title	: _exist
	Usage	: Internal function, not to be called directly
	Function: Determine whether an executable FASTA program can be found.
		  Cf. the DESCRIPTION section of this POD for how to make sure
		  your FASTA installation can be found. This method checks for
		  the existence of the FASTA executable in the path.
	Returns : 1 if FASTA program found at expected location, 0 otherwise.
	Args	:  none

   _setparams
	Title	:  _setparams
	Usage	:  Internal function, not to be called directly
	Function:  Create parameter inputs for FASTA executable
	Returns : part of parameter string to be passed to FASTA program
	Args	: none

perl v5.12.3							    2011-06-18			  Bio::Tools::Run::Alignment::StandAloneFasta(3pm)