09-24-2014
Hi Marion, welcome to the forums. Could you please post the expected output as well?
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
Hi, buddies out there.
I have a text file (only one column) which I created using the vi editor. The file contains duplicate rows and I would like to select only the distinct rows. How can I do this with a Unix command?
file content =
apple
apple
orange
watermelon
apple
orange
Can it be done... (7 Replies)
Discussion started by: merry susana
7 Replies
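For the distinct-rows question above, a minimal sketch using the sample data from the post. The function name `distinct_lines` is made up for illustration; the awk idiom keeps the first occurrence of each line and preserves the original order.

```shell
# Hypothetical helper: print each distinct line once, in first-seen order.
distinct_lines() {
    # seen[$0]++ evaluates to 0 the first time a line appears,
    # so the negated test is true only for first occurrences.
    awk '!seen[$0]++' "$@"
}

# Sample data copied from the post.
printf '%s\n' apple apple orange watermelon apple orange | distinct_lines
```

If the output order does not matter, `sort -u file` gives the same set of lines in sorted order.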
2. Shell Programming and Scripting
Hi ,
I have a similar problem.
Can anyone please help me with a shell script or a Perl script?
I have a flat file like this
fruit country
apple germany
apple india
banana pakistan
banana saudi
mango india
I want to get an output like
fruit country
apple ... (7 Replies)
Discussion started by: smalya
7 Replies
3. Shell Programming and Scripting
Hi, I have the following file:
LOG:015608::ERR:2310:map_spsrec:Invalid parameter
LOG:015608::ERR:2471:map_dgdrec:Invalid parameter
LOG:015608::ERR:2487:map_nnmrec:Invalid number
LOG:015608::ERR:2310:map_nmrec:Invalid number
LOG:015608::ERR:2438:map_nmrec:Invalid number
As a delimiter I... (2 Replies)
Discussion started by: apenkov
2 Replies
4. Shell Programming and Scripting
Hi,
I have a file of DNA sequences in FASTA format which looks like this:
>admin_1_45
atatagcaga
>admin_1_46
atatagcagaatatatat
with many thousands of such sequences in a single file. I want to replace the accession ID "admin_1_45", and similarly those of the following sequences, to... (5 Replies)
Discussion started by: margarita
5 Replies
5. Shell Programming and Scripting
I have two files. File1 is shown below.
>153L:B|PDBID|CHAIN|SEQUENCE
RTDCYGNVNRIDTTGASCKTAKPEGLSYCGVSASKKIAERDLQAMDRYKTIIKKVGEKLCVEPAVIAGIISRESHAGKVL
KNGWGDRGNGFGLMQVDKRSHKPQGTWNGEVHITQGTTILINFIKTIQKKFPSWTKDQQLKGGISAYNAGAGNVRSYARM
DIGTTHDDYANDVVARAQYYKQHGY
>16VP:A|PDBID|CHAIN|SEQUENCE... (7 Replies)
Discussion started by: nelsonfrans
7 Replies
6. Shell Programming and Scripting
I have a fasta file as follows
>sp|O15090|FABP4_HUMAN Fatty acid-binding protein, adipocyte OS=Homo sapiens GN=FABP4 PE=1 SV=3
MCDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDVITIKSESTFKN
TEISFILGQEFDEVTADDRKVKSTITLDGGVLVHVQKWDGKSTTIKRKREDDKLVVECVM
KGVTSTRVYERA
>sp|L18484|AP2A2_RAT AP-2... (3 Replies)
Discussion started by: alexypaul
3 Replies
7. Shell Programming and Scripting
Hi,
I have a FASTA file with multiple sequences. How can I get only the unique sequences from the file?
For example
my_file.fasta
>seq1
TCTCAAAGAAAGCTGTGCTGCATACTGTACAAAACTTTGTCTGGAGAGATGGAGAATCTCATTGACTTTACAGGTGTGGACGGTCTTCAGAGATGGCTCAAGCTAACATTCCCTGACACACCTATAGGGAAAGAGCTAAC
>seq2... (3 Replies)
Discussion started by: Ibk
3 Replies
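One possible approach to the unique-sequences question above, as a sketch rather than a definitive answer. It assumes that "unique" means an exact, case-sensitive match of the whole sequence, that the first header seen for a sequence is the one to keep, and that printing each sequence on a single line is acceptable. The name `unique_fasta` is hypothetical.

```shell
# Hypothetical helper: keep the first FASTA record for each distinct sequence.
unique_fasta() {
    awk '
        # Print the buffered record unless its sequence was already printed.
        function emit() {
            if (header != "" && !(seq in seen)) {
                seen[seq] = 1
                print header
                print seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }   # new record starts
        { seq = seq $0 }                               # join wrapped sequence lines
        END { emit() }                                 # flush the last record
    ' "$@"
}

# Small made-up input: seq2 repeats the sequence of seq1 and is dropped.
printf '>seq1\nACGT\nAC\n>seq2\nACGTAC\n>seq3\nTTTT\n' | unique_fasta
```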
8. UNIX for Beginners Questions & Answers
I could calculate the length of every FASTA sequence with the following command:
awk '/^>/{if (l!="") print l; print; l=0; next}{l+=length($0)}END{print l}' unique.fasta
But I need to calculate the length of a particular FASTA sequence specified/listed in another txt file. The results to be... (14 Replies)
Discussion started by: dineshkumarsrk
14 Replies
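A sketch of how the command above might be restricted to listed sequences. Since the post is cut off, the details are assumptions: the ID file holds one ID per line, an ID is the first word of the header without the leading ">", and the output is "ID, tab, length". The function name and the sample files are made up.

```shell
# Hypothetical helper. $1: file with one sequence ID per line, $2: FASTA file.
fasta_lengths_for_ids() {
    awk '
        # First file: remember the wanted IDs.
        FILENAME == ARGV[1] { if ($1 != "") want[$1] = 1; next }
        # Header line: report the previous record if wanted, then start a new one.
        /^>/ {
            if (keep) print id "\t" len
            id = substr($1, 2); len = 0
            keep = (id in want)
            next
        }
        keep { len += length($0) }            # sum the wrapped sequence lines
        END  { if (keep) print id "\t" len }  # report the last record
    ' "$1" "$2"
}

# Made-up demo files in a temporary directory.
demo=$(mktemp -d)
printf '>a desc\nACGT\nAC\n>b\nTT\n>c\nGGGGG\n' > "$demo/unique.fasta"
printf 'c\na\n' > "$demo/ids.txt"
fasta_lengths_for_ids "$demo/ids.txt" "$demo/unique.fasta"
rm -rf "$demo"
```

Results come out in the order the sequences appear in the FASTA file, not in the order of the ID list.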
9. Shell Programming and Scripting
I have a fasta file as follows
>sp|Q8WWQ8|STAB2_HUMAN Stabilin-2 OS=Homo sapiens OX=9606 GN=STAB2 PE=1 SV=3
MMLQHLVIFCLGLVVQNFCSPAETTGQARRCDRKSLLTIRTECRSCALNLGVKCPDGYTM
ITSGSVGVRDCRYTFEVRTYSLSLPGCRHICRKDYLQPRCCPGRWGPDCIECPGGAGSPC
NGRGSCAEGMEGNGTCSCQEGFGGTACETCADDNLFGPSCSSVCNCVHGVCNSGLDGDGT... (3 Replies)
Discussion started by: jerrild
3 Replies
10. UNIX for Beginners Questions & Answers
Hi,
I have to add 7 bases of a specific nucleotide at the beginning and end of all the FASTA sequences in a file. For example, I have a multi-FASTA file named test.fasta, as given below
test.fasta
>TalAA18_Xoo_CIAT_NZ_CP033194.1:_2936369-2939570:+1... (1 Reply)
Discussion started by: dineshkumarsrk
1 Reply
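A rough sketch for the padding question above. The post does not say which nucleotide is wanted, so the letter `N` in the demo is only a placeholder, and `pad_fasta` is a hypothetical name. The sketch joins wrapped sequence lines so the padding lands at the true start and end of each sequence, which means the output has one line per sequence.

```shell
# Hypothetical helper. $1: nucleotide letter, $2: how many, rest: FASTA file(s) or stdin.
pad_fasta() {
    base=$1; count=$2; shift 2
    awk -v base="$base" -v count="$count" '
        BEGIN { for (i = 0; i < count; i++) pad = pad base }   # build the padding once
        # Print the buffered record with padding on both sides.
        function emit() {
            if (header != "") { print header; print pad seq pad }
        }
        /^>/ { emit(); header = $0; seq = ""; next }
        { seq = seq $0 }
        END { emit() }
    ' "$@"
}

# Made-up input; N is a placeholder for the nucleotide of interest.
printf '>s1\nACGT\nGG\n>s2\nTT\n' | pad_fasta N 7
```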
LEARN ABOUT DEBIAN
fastacmd
FASTACMD(1) NCBI Tools User's Manual FASTACMD(1)
NAME
fastacmd - retrieve FASTA sequences from a BLAST database
SYNOPSIS
fastacmd [-] [-D N] [-I] [-L start,stop] [-P N] [-S N] [-T] [-a] [-c] [-d str] [-i str] [-l N] [-o filename] [-p type] [-s str] [-t]
DESCRIPTION
fastacmd retrieves FASTA formatted sequences from a blast(1) database formatted using the `-o' option. An example fastacmd call would be
fastacmd -d nr -s p38398
OPTIONS
A summary of options is included below.
-           Print usage message
-D N        Dump the entire database in some format:
                1 fasta
                2 GI list
                3 Accession.version list
-I          Print database information only (overrides all other options)
-L start,stop
            Range of sequence to extract (0 in start is beginning of sequence, 0 in stop is end of sequence, default is whole sequence)
-P N        Retrieve sequences with Protein Identification Group (PIG) N.
-S N        Strand on subsequence (nucleotide only):
                1 top (default)
                2 bottom
-T          Print taxonomic information for requested sequence(s)
-a          Retrieve duplicate accessions
-c          Use ^A (\001) as non-redundant defline separator
-d str      Database (default is nr)
-i str      Input file with GIs/accessions/loci for batch retrieval
-l N        Line length for sequence (default = 80)
-o filename
            Output file (default = stdout)
-p type
            Type of file:
                G guess (default): look for protein, then nucleotide
                T protein
                F nucleotide
-s str      Comma-delimited search string(s). GIs, accessions, loci, or fullSeq-id strings may be used, e.g., 555, AC147927, 'gnl|dbname|tag'
-t          Definition line should contain target GI only
EXIT STATUS
0 Completed successfully.
1 An error (other than those below) occurred.
2 The BLAST database was not found.
3 A search (accession, GI, or taxonomy info) failed.
4 No taxonomy database was found.
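
The exit codes above can be checked from a script. The following is a minimal sketch: the helper name `fastacmd_status_text` is made up, the messages are copied from the table, and the sample call is the one shown in the DESCRIPTION. It only runs fastacmd if the command is actually installed.

```shell
# Hypothetical helper: translate a fastacmd exit status into the man page wording.
fastacmd_status_text() {
    case "$1" in
        0) echo "Completed successfully." ;;
        1) echo "An error (other than those below) occurred." ;;
        2) echo "The BLAST database was not found." ;;
        3) echo "A search (accession, GI, or taxonomy info) failed." ;;
        4) echo "No taxonomy database was found." ;;
        *) echo "Status $1 is not listed in the man page." ;;
    esac
}

# Run the example lookup only when fastacmd is available on this machine.
if command -v fastacmd >/dev/null 2>&1; then
    status=0
    fastacmd -d nr -s p38398 || status=$?
    fastacmd_status_text "$status"
fi
```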
AUTHOR
The National Center for Biotechnology Information.
SEE ALSO
blast(1), /usr/share/doc/blast2/fastacmd.html.
NCBI
2005-11-04 FASTACMD(1)