grep FASTA files Post: 302430178

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

fasta format?

Hi, I'm in need of creating a file in the fasta format: >1A6A.A HVIIQAEFYLNPDQSGEFMFDFDGDEIFHVDMAKKETVWRLEEFGRFASFEAQGALANIAVDKANLEIMTKRSNYTPITN VPPEVTVLTNSPVELREPNVLICFIDKFTPPVVNVTWLRNGKPVTTGVSETVFLPREDHLFRKFHYLPFLPSTEDVYDCR VEHWGLDEPLLKHWEF >1A6A.B ...

2. Shell Programming and Scripting

grep for certain files using a file as input to grep and then move

Hi All, I need to grep a few files that have words like the ones below in the file name. I want to put those words in a file, grep for the files whose names contain them, and move those files to a new directory. Full file name -C20091210.1000-20091210.1100_SMGBSC3:1000...

3. Shell Programming and Scripting

Changing from FASTA to PHYLIP format

I really need some help with this task. I have a bunch of FASTA files with hundreds of DNA sequences that look like this: >SeqID1 AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC TGACAGGTTCAAAGGCTGCAGTGCAGTGACGTGACGATTT >Sequence 22...

4. UNIX for Dummies Questions & Answers

renaming (renumbering) fasta files

I have a fasta file that looks like this: >Noname ACCAAAATAATTCATGATATACTCAGATCCATCTGAGGGTTTCACCACTTGTAGAGCTAT CAGAAGAATGTCAATCAACTGTCCGAGAAAAAAGAATCCCAGG >Noname ACTATAAACCCTATTTCTCTTTCTAAAAATTGAAATATTAAAGAAACTAGCACTAGCCTG ACCTTTAGCCAGACTTCTCACTCTTAATGCTGCGGACAAACAGA ... I want to...

5. UNIX for Dummies Questions & Answers

Breaking a fasta formatted file into multiple files containing each gene separately

Hey, I've been trying to break a massive fasta formatted file into files containing each gene separately. Could anyone help me? I've tried to use the following code but I've received errors every time: for i in *.rtf.out do awk '/^>/{f=++d".fasta"} {print > $i.out}' $i done

6. UNIX for Dummies Questions & Answers

How to change sequence names in a long fasta file?

Hi I have an alignment file (.fasta) with ~80 sequences. They look like this- >JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0 GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT...

7. UNIX for Dummies Questions & Answers

Fasta header modification

Hi, I need some help with modifying fasta headers. I have a fasta file with thousands of contigs and I need to modify their headers with the information obtained from a second file. File 1 contains the fasta sequences: >contig0001 length=11115 numreads=10777 agatgtagatctct...

8. UNIX for Dummies Questions & Answers

Round up -FASTA file

I have the following script: awk 'FNR==NR{s+=$3;next;} { print $1 , $2, 100*$3/s }' and the following file: >P39PT-1224 Freq 900 cccctacgacggcattggtaatggctcagctgctccggatcccgcaagccatcttggatatgagggttcgtcggcctcttcagccaagg-cccccagcagaacatccagctgatcg >P39PT-784 Freq 2...

9. Shell Programming and Scripting

Help with reformatting single-line multi-fasta into multi-line multi-fasta

Input File: >Seq1 ASDADAFASFASFADGSDGFSDFSDFSDFSDFSDFSDFSDFSDFSDFSDFSD >Seq2 SDASDAQEQWEQeqAdfaasd >Seq3 ASDSALGHIUDFJANCAGPATHLACJHPAUTYNJKG ...... Desired Output File >Seq1 ASDADAFASF ASFADGSDGF SDFSDFSDFS DFSDFSDFSD FSDFSDFSDF SD >Seq2

10. UNIX for Beginners Questions & Answers

How to append two fasta files?

I have two fasta files as shown below, File:1 >Contig_1:90600-91187 AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGGAATTGATGACGGTC >Contig_98:35323-35886 GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG >Contig_24:26615-28387...

LEARN ABOUT DEBIAN

asn2fsa

ASN2FSA(1)						     NCBI Tools User's Manual							ASN2FSA(1)

NAME

       asn2fsa - convert biological sequence data from ASN.1 to FASTA

SYNOPSIS

       asn2fsa [-] [-A acc] [-D] [-E] [-H] [-L filename] [-T] [-a type] [-b] [-c] [-d path] [-e N] [-f path] [-g] [-h filename] [-i filename] [-k]
       [-l] [-m] [-o filename] [-p path] [-q filename] [-r] [-s] [-u] [-v filename] [-x str] [-z]

DESCRIPTION

       asn2fsa converts biological sequence data from ASN.1 to FASTA.
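
       As a rough illustration of a typical conversion, the sketch below assembles an asn2fsa command line from flags listed in the SYNOPSIS (-i, -o, -e). The file names example.asn and example.fasta are hypothetical placeholders, and the script only runs the tool if it is installed and the input file exists; otherwise it prints the command it would have run.

```shell
#!/bin/sh
# Hypothetical file names, chosen only for illustration (no spaces, so
# plain word splitting of $cmd is safe here).
input="example.asn"      # assumed ASN.1 input file
output="example.fasta"   # assumed nucleotide FASTA output file

# -i: single input file, -o: nucleotide output file,
# -e: line length (the manual gives 70 as default, range 10 to 120).
cmd="asn2fsa -i $input -o $output -e 60"

if command -v asn2fsa >/dev/null 2>&1 && [ -f "$input" ]; then
    # Tool and input are both present: run the conversion.
    $cmd
else
    # Otherwise just show what would be executed.
    echo "would run: $cmd"
fi
```

       If the input type is not detected automatically, the -a option may be added to state it explicitly.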

OPTIONS

       A summary of options is included below.

       -      Print usage message

       -A acc Accession to fetch

       -D     Use Dash for Gap

       -E     Extended Seq-ids

       -H     HTML spans

       -L filename
	      Log file

       -T     Use Threads

       -a type
	      Input ASN.1 type:
	      a      Automatic (default)
	      z      Any
	      e      Seq-entry
	      b      Bioseq
	      s      Bioseq-set
	      m      Seq-submit
	      t      batch processing (suitable for official releases; autodetects specific type)

       -b     Bioseq-set is Binary

       -c     Bioseq-set is Compressed

       -d path
	      Path to ReadDB Database

       -e N   Line length (70 by default; may range from 10 to 120)

       -f path
	      Path to indexed FASTA data

       -g     Expand delta gaps into Ns

       -h filename
	      Far component cache output file name

       -i filename
	      Single input file (standard input by default)

       -k     Local fetching

       -l     Lock components in advance

       -m     Master style for near segmented sequences

       -o filename
	      Nucleotide Output file name

       -p path
	      Path to ASN.1 Files

       -q filename
	      Quality score output file name

       -r     Remote fetching from NCBI

       -s     Far genomic contig for quality scores

       -u     Recurse

       -v filename
	      Protein output file name

       -x str File selection substring (.ent by default) [String]

       -z     Print quality score gap as -1
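
       To make the effect of a line-length setting such as -e concrete, here is a small awk sketch that re-wraps the sequence lines of a FASTA stream to a fixed width. It is not asn2fsa's implementation, only an illustration of what a line length means for FASTA output; the function name wrap_fasta is made up for this example, and it does not enforce the 10 to 120 range that -e accepts.

```shell
#!/bin/sh
# wrap_fasta WIDTH: read FASTA on stdin, write it to stdout with every
# sequence re-wrapped to WIDTH columns. Header lines (starting with ">")
# pass through unchanged. Hypothetical helper, not part of asn2fsa.
wrap_fasta() {
    awk -v w="$1" '
        # Print the buffered sequence in chunks of w characters.
        function flush(   i) {
            for (i = 1; i <= length(seq); i += w)
                print substr(seq, i, w)
            seq = ""
        }
        # New record: emit the previous sequence, then the header.
        /^>/ { if (seq != "") flush(); print; next }
        # Sequence line: append to the buffer for the current record.
        { seq = seq $0 }
        # End of input: emit whatever is still buffered.
        END { if (seq != "") flush() }
    '
}

# Usage: wrap a 12-base toy sequence to 5 columns.
printf '>Seq1\nACGTACGTACGT\n' | wrap_fasta 5
```

       Passing 70 as the width would mimic the default line length stated for -e.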

AUTHOR

       The National Center for Biotechnology Information.

SEE ALSO

       asn2all(1), asn2asn(1), asn2ff(1), asn2gb(1), asn2xml(1), asndhuff(1).

NCBI
								    2011-09-02								ASN2FSA(1)