06-27-2012
How to change sequence names in a long FASTA file?
Hi
I have an alignment file (.fasta) with ~80 sequences. They look like this:
>JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0
GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT
TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT
TCAGATGTTTATTATAAAAATTAGATGAAAAATATGTTAATATACAAGTA
>JV501.contig00066(+):24356-42404|sequence_index=0|block_index=4|species=JV501|JV501_4_0
AATGACGATTTAGATGAAAAATAT...
The sequence names are too long. I want to keep just the part before the first dot (JV101, JV501, and so on) and delete everything after it. I am new to Unix, so please suggest an easy Unix command to do this.
Thanks
Baika
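One possible approach, as a minimal sketch: use sed to cut each header line at its first dot. This assumes every header starts with ">" and that the part you want to keep (JV101, JV501, ...) never contains a dot itself. The function name and the file names are only placeholders, not anything from the original file.

```shell
# Hypothetical helper: shorten FASTA headers to the text before the first dot.
shorten_fasta_headers() {
    # Only on lines starting with '>': delete from the first '.' to end of line.
    # Sequence lines do not match /^>/ and pass through unchanged.
    sed '/^>/ s/\..*$//' "$@"
}

# Usage (file names are placeholders):
# shorten_fasta_headers alignment.fasta > alignment.renamed.fasta
```

The output goes to a new file rather than overwriting the input, so the original alignment stays intact until the renamed headers have been checked, for example with `grep '>' alignment.renamed.fasta`.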