grep FASTA files Post: 302430266

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

fasta format?

Hi, I'm in need of creating a file in the fasta format:
>1A6A.A
HVIIQAEFYLNPDQSGEFMFDFDGDEIFHVDMAKKETVWRLEEFGRFASFEAQGALANIAVDKANLEIMTKRSNYTPITN
VPPEVTVLTNSPVELREPNVLICFIDKFTPPVVNVTWLRNGKPVTTGVSETVFLPREDHLFRKFHYLPFLPSTEDVYDCR
VEHWGLDEPLLKHWEF
>1A6A.B
...

2. Shell Programming and Scripting

grep for certain files using a file as input to grep and then move

Hi All, I need to grep a few files which have words like the below in the file name, which I want to put in a file, and then grep for the files which contain these names and move them to a new directory. Full file name -C20091210.1000-20091210.1100_SMGBSC3:1000...

3. Shell Programming and Scripting

Changing from FASTA to PHYLIP format

I really need some help with this task. I have a bunch of FASTA files with hundreds of DNA sequences that look like this:
>SeqID1
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
TGACAGGTTCAAAGGCTGCAGTGCAGTGACGTGACGATTT
>Sequence 22...

4. UNIX for Dummies Questions & Answers

renaming (renumbering) fasta files

I have a fasta file that looks like this:
>Noname
ACCAAAATAATTCATGATATACTCAGATCCATCTGAGGGTTTCACCACTTGTAGAGCTAT
CAGAAGAATGTCAATCAACTGTCCGAGAAAAAAGAATCCCAGG
>Noname
ACTATAAACCCTATTTCTCTTTCTAAAAATTGAAATATTAAAGAAACTAGCACTAGCCTG
ACCTTTAGCCAGACTTCTCACTCTTAATGCTGCGGACAAACAGA
...
I want to...

5. UNIX for Dummies Questions & Answers

Breaking a fasta formatted file into multiple files containing each gene separately

Hey, I've been trying to break a massive fasta formatted file into files containing each gene separately. Could anyone help me? I've tried to use the following code but I've received errors every time: for i in *.rtf.out do awk '/^>/{f=++d".fasta"} {print > $i.out}' $i done

6. UNIX for Dummies Questions & Answers

How to change sequence name in a long fasta file?

Hi I have an alignment file (.fasta) with ~80 sequences. They look like this-
>JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0
GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT
TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT...

7. UNIX for Dummies Questions & Answers

Fasta header modification

Hi, I need some help with modifying fasta headers. I have a fasta file with thousands of contigs and I need to modify their headers with the information obtained from a second file. File 1 contains the fasta sequences: >contig0001 length=11115 numreads=10777 agatgtagatctct...

8. UNIX for Dummies Questions & Answers

Round up -FASTA file

I have the following script: awk 'FNR==NR{s+=$3;next;} { print $1 , $2, 100*$3/s }' and the following file: >P39PT-1224 Freq 900 cccctacgacggcattggtaatggctcagctgctccggatcccgcaagccatcttggatatgagggttcgtcggcctcttcagccaagg-cccccagcagaacatccagctgatcg >P39PT-784 Freq 2...

9. Shell Programming and Scripting

Help with reformat single-line multi-fasta into multi-line multi-fasta

Input File:
>Seq1
ASDADAFASFASFADGSDGFSDFSDFSDFSDFSDFSDFSDFSDFSDFSDFSD
>Seq2
SDASDAQEQWEQeqAdfaasd
>Seq3
ASDSALGHIUDFJANCAGPATHLACJHPAUTYNJKG
......

Desired Output File
>Seq1
ASDADAFASF
ASFADGSDGF
SDFSDFSDFS
DFSDFSDFSD
FSDFSDFSDF
SD
>Seq2

10. UNIX for Beginners Questions & Answers

How to append two fasta files?

I have two fasta files as shown below,
File:1
>Contig_1:90600-91187
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGGAATTGATGACGGTC
>Contig_98:35323-35886
GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG
>Contig_24:26615-28387...

LEARN ABOUT DEBIAN

fastx_clipper

FASTX_CLIPPER(1)						   User Commands						  FASTX_CLIPPER(1)

NAME

       fastx_clipper - FASTA/Q Clipper

DESCRIPTION

       usage: fastx_clipper [-h] [-a ADAPTER] [-D] [-l N] [-n] [-d N] [-c] [-C] [-o] [-v] [-z] [-i INFILE] [-o OUTFILE]

       Part of FASTX Toolkit 0.0.13.2 by A. Gordon (gordon@cshl.edu)

       [-h]   = This helpful help screen.

       [-a ADAPTER] = ADAPTER string. Default is CCTTAAGG (dummy adapter).

       [-l N] = Discard sequences shorter than N nucleotides. Default is 5.

       [-d N] = Keep the adapter and N bases after it. (Using '-d 0' is the same as not using '-d' at all, which is the default.)

       [-c]   = Discard non-clipped sequences (i.e. - keep only sequences which contained the adapter).

       [-C]   = Discard clipped sequences (i.e. - keep only sequences which did not contain the adapter).

       [-k]   = Report Adapter-Only sequences.

       [-n]   = keep sequences with unknown (N) nucleotides. default is to discard such sequences.

       [-v]   = Verbose - report number of sequences. If [-o] is specified, the report will be printed to STDOUT. If [-o] is not specified (and output goes to STDOUT), the report will be printed to STDERR.

       [-z]   = Compress output with GZIP.

       [-D]   = DEBUG output.

       [-M N] = Require minimum adapter alignment length of N. If fewer than N nucleotides align with the adapter, don't clip it.

       [-i INFILE]  = FASTA/Q input file. Default is STDIN.

       [-o OUTFILE] = FASTA/Q output file. Default is STDOUT.
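
The options above can be combined in a single call. The following is a minimal sketch, not an official example: the file names (example_reads.fa, example_clipped.fa, example_report.txt) and the two reads are made up for illustration, and the adapter is the dummy default listed above. It uses only flags documented on this page and skips the run if fastx_clipper is not installed.

```shell
# Build a tiny FASTA file to experiment with (made-up reads, one sequence per line).
printf '%s\n' \
  '>read1' 'ACGTACGTACGTCCTTAAGGTTTT' \
  '>read2' 'ACGTACGTACGTACGTACGTACGT' \
  > example_reads.fa

# Run the clipper only if it is available on this machine.
if command -v fastx_clipper >/dev/null 2>&1; then
  # -a  adapter string to clip (here the documented dummy default)
  # -l  discard sequences shorter than 5 nucleotides
  # -c  keep only sequences that contained the adapter
  # -v  print the report; since -o is given, the report goes to STDOUT,
  #     which is redirected to a (hypothetical) report file here
  fastx_clipper -a CCTTAAGG -l 5 -c -v \
    -i example_reads.fa -o example_clipped.fa > example_report.txt
else
  echo "fastx_clipper not found; skipping the clipping step" >&2
fi
```

Swapping -c for -C would instead keep only the reads in which no adapter was found.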

SEE ALSO

       The quality of this automatically generated manpage might be insufficient.  It is suggested to visit

	      http://hannonlab.cshl.edu/fastx_toolkit/commandline.html

       to get a better layout as well as an overview about connected FASTX tools.

fastx_clipper 0.0.13.2						     May 2012							  FASTX_CLIPPER(1)