08-05-2013
Hi Don,
I expect the output file to be around 250 KB to 300 KB. Every sequence has one header line starting with >sp.... Each sequence line has 60 letters. The change might happen anywhere in the sequence; it is not restricted to the first line. Each new record starts on a new line with >sp, and the end of each sequence is marked with *.
Thanks,
Kaavya
---------- Post updated at 03:35 PM ---------- Previous update was at 03:35 PM ----------
Hi Don,
The position count should start after the header line (the header itself is not counted).
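To make the description above concrete, here is a minimal sketch in Python of how such a file could be read and how a position could be counted from the first letter after the header. It is not a solution posted in this thread. It assumes the "change" is a single-letter substitution, that positions are 1-based, and that the trailing * is not counted and stays at the end of the sequence. The names read_records, change_letter and wrap, and the demo header, are made up for illustration.

```python
from typing import Iterator, List, Tuple

LINE_WIDTH = 60  # letters per sequence line, as described in the post


def read_records(lines: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; sequence lines are joined into one string."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def change_letter(sequence: str, position: int, new_letter: str) -> str:
    """Replace one letter at a 1-based position counted after the header.

    Assumption: the trailing '*' is an end marker, so it is not counted
    and is put back unchanged.
    """
    body = sequence.rstrip("*")
    terminator = sequence[len(body):]
    if not 1 <= position <= len(body):
        raise ValueError(f"position {position} is outside 1..{len(body)}")
    return body[:position - 1] + new_letter + body[position:] + terminator


def wrap(sequence: str, width: int = LINE_WIDTH) -> List[str]:
    """Split a joined sequence back into lines of at most `width` letters."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


# Hypothetical record: one full 60-letter line plus a short last line.
demo = [">sp|DEMO|hypothetical\n", "A" * 60 + "\n", "CDE*\n"]
for header, sequence in read_records(demo):
    print(header)
    # Position 61 is the first letter of the second sequence line.
    print("\n".join(wrap(change_letter(sequence, 61, "K"))))
```

Joining the sequence lines first means a position never has to be translated into a line and column, so a change on any line is handled the same way as one on the first line.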
FASTA_FORMATTER(1) User Commands FASTA_FORMATTER(1)
NAME
fasta_formatter - changes the width of sequences line in a FASTA file
DESCRIPTION
usage: fasta_formatter [-h] [-i INFILE] [-o OUTFILE] [-w N] [-t] [-e]
Part of FASTX Toolkit 0.0.13.2 by gordon@cshl.edu
[-h] = This helpful help screen.
[-i INFILE] = FASTA/Q input file. default is STDIN.
[-o OUTFILE] = FASTA/Q output file. default is STDOUT.
[-w N] = max. sequence line width for output FASTA file.
    When ZERO (the default), sequence lines will NOT be wrapped - all nucleotides of each sequences will appear on a single line (good for scripting).
[-t] = Output tabulated format (instead of FASTA format).
    Sequence-Identifiers will be on first column, Nucleotides will appear on second column (as single line).
[-e] = Output empty sequences (default is to discard them).
    Empty sequences are ones who have only a sequence identifier, but not actual nucleotides.
Input Example:
>MY-ID
AAAAAGGGGG
CCCCCTTTTT
AGCTN
Output example with unlimited line width [-w 0]:
>MY-ID
AAAAAGGGGGCCCCCTTTTTAGCTN
Output example with max. line width=7 [-w 7]:
>MY-ID
AAAAAGG
GGGCCCC
CTTTTTA
GCTN
Output example with tabular output [-t]:
MY-ID AAAAAGGGGGCCCCCTTTTTAGCTN
example of empty sequence: (will be discarded unless [-e] is used)
>REGULAR-SEQUENCE-1
AAAGGGTTTCCC
>EMPTY-SEQUENCE
>REGULAR-SEQUENCE-2
AAGTAGTAGTAGTAGT
GTATTTTATAT
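The options described above can be mimicked with a short Python sketch. This is only an illustration of the documented behaviour (joining, wrapping at a width, tabular output, dropping empty sequences), not the FASTX Toolkit's own code. The function name format_fasta and its parameters are hypothetical stand-ins for -w, -t and -e.

```python
from typing import List


def format_fasta(text: str, width: int = 0, tabular: bool = False,
                 keep_empty: bool = False) -> str:
    """Re-wrap FASTA text; width=0 puts each sequence on a single line."""
    records: List[List[str]] = []  # each entry is [identifier, sequence]
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line  # join wrapped sequence lines

    out: List[str] = []
    for identifier, sequence in records:
        if not sequence and not keep_empty:
            continue  # empty sequences are discarded by default
        if tabular:
            out.append(f"{identifier}\t{sequence}")
            continue
        out.append(">" + identifier)
        if width > 0:
            out.extend(sequence[i:i + width]
                       for i in range(0, len(sequence), width))
        elif sequence:
            out.append(sequence)
    return "\n".join(out) + "\n" if out else ""


sample = ">MY-ID\nAAAAAGGGGG\nCCCCCTTTTT\nAGCTN\n"
print(format_fasta(sample), end="")                # like -w 0
print(format_fasta(sample, width=7), end="")       # like -w 7
print(format_fasta(sample, tabular=True), end="")  # like -t
```

Joining every record into a single string first keeps the wrapping step independent of how the input happened to be wrapped.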
SEE ALSO
The quality of this automatically generated manpage might be insufficient. It is suggested to visit
http://hannonlab.cshl.edu/fastx_toolkit/commandline.html
to get a better layout as well as an overview about connected FASTX tools.
fasta_formatter 0.0.13.2 May 2012 FASTA_FORMATTER(1)