Hi,
I have a typical situation. I have 4 files with different headers (the number of headers is variable).
I need to make a merged file which has the headers combined from all files (common columns should appear only once).
For example -
File 1
H1|H2|H3|H4
11|12|13|14
21|22|23|23... (1 Reply)
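For the header-merging question above, here is a minimal Python sketch, not a definitive answer. It assumes every file is pipe-delimited with a header on its first line, and that cells missing from a file should be left empty. The function name `merge_tables` and the second sample file are made up for illustration; only the `H1|H2|H3|H4` layout comes from the post.

```python
import csv
import io

def merge_tables(tables, delimiter="|", fill=""):
    """Merge delimited tables, keeping each header once in first-seen order."""
    headers = []   # union of all column names, in first-seen order
    rows = []      # every data row as a dict keyed by column name
    for text in tables:
        reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
        for name in reader.fieldnames or []:
            if name not in headers:
                headers.append(name)
        rows.extend(reader)
    out = io.StringIO()
    # restval fills in columns that a given input file did not have
    writer = csv.DictWriter(out, fieldnames=headers, delimiter=delimiter,
                            restval=fill, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

file1 = "H1|H2|H3|H4\n11|12|13|14\n"
file2 = "H1|H5\n31|35\n"   # hypothetical second file, not from the post
print(merge_tables([file1, file2]), end="")
```

With these two inputs the merged header is `H1|H2|H3|H4|H5`, and the row from the second file becomes `31||||35`. For real files you would read each one into a string (or pass open file objects to `csv.DictReader` directly).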
Hello,
I have a BioPerl problem that I thought could be done with awk: converting FASTA format to tabular format. (Note: the row length is not the same for each entry, as the entries were combined from different files!)
input.fasta:
>YAL069W-1.334 Putative promoter sequence... (6 Replies)
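The post is cut off before the desired output is shown, so the following is only a sketch under one assumption: "tabular" means one line per record, with the header and the joined sequence separated by a tab. The function name `fasta_to_tab` and the sample records are hypothetical.

```python
def fasta_to_tab(lines):
    """Yield 'header<TAB>sequence' for each FASTA record, joining wrapped lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:        # emit the previous record
                yield header + "\t" + "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)           # rows may have different lengths
    if header is not None:                # emit the last record
        yield header + "\t" + "".join(chunks)

# Made-up sample with uneven row lengths
sample = """>seq1 demo record
ACGT
AC
>seq2
GGTTA
""".splitlines()
for row in fasta_to_tab(sample):
    print(row)
```

Because the sequence rows are simply concatenated, it does not matter that the row length differs between entries.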
Hi
I have an alignment file (.fasta) with ~80 sequences. They look like this-
>JV101.contig00066(+):25302-42404|sequence_index=0|block_index=4|species=JV101|JV101_4_0
GAGGTTAATTATCGATAACGTTTAATTAAAGTGTTTAGGTGTCATAATTT
TAAATGACGATTTCTCATTACCATACACCTAAATTATCATCAATCTGAAT... (2 Replies)
Hi,
I want to match the sequence id (a substring of the line starting with '>') and extract the information up to the next '>' line. Please help.
input
> fefrwefrwef X900
AGAGGGAATTGG
AGGGGCCTGGAG
GGTTCTCTTC
> fefrwefrwef X932
AGAGGGAATTGG
AGGAGGTGGAG
GGTTCTCTTC
> fefrwefrwef X937... (2 Replies)
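One possible way to do this in Python, as a rough sketch: keep a flag that is switched on when a '>' line carries a wanted id and switched off at the next '>' line that does not. It assumes the id is a whole whitespace-separated word on the header line (as `X900` and `X932` are in the sample); `select_records` is just an illustrative name.

```python
def select_records(lines, wanted):
    """Return the FASTA lines of records whose header contains a wanted ID."""
    wanted = set(wanted)
    keep = False
    out = []
    for line in lines:
        if line.startswith(">"):
            # Compare whole tokens so that X93 does not match X932.
            keep = bool(wanted & set(line[1:].split()))
        if keep:
            out.append(line)
    return out

records = [
    "> fefrwefrwef X900", "AGAGGGAATTGG", "AGGGGCCTGGAG", "GGTTCTCTTC",
    "> fefrwefrwef X932", "AGAGGGAATTGG", "AGGAGGTGGAG", "GGTTCTCTTC",
]
print("\n".join(select_records(records, ["X932"])))
```

Matching on whole tokens rather than substrings is a design choice; if the ids can appear inside longer words, a substring or regex test would be needed instead.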
I have two files. File1 is shown below.
>153L:B|PDBID|CHAIN|SEQUENCE
RTDCYGNVNRIDTTGASCKTAKPEGLSYCGVSASKKIAERDLQAMDRYKTIIKKVGEKLCVEPAVIAGIISRESHAGKVL
KNGWGDRGNGFGLMQVDKRSHKPQGTWNGEVHITQGTTILINFIKTIQKKFPSWTKDQQLKGGISAYNAGAGNVRSYARM
DIGTTHDDYANDVVARAQYYKQHGY
>16VP:A|PDBID|CHAIN|SEQUENCE... (7 Replies)
Hi. Unix rookie here. Been looking for a few days for reference documents on how BSD UNIX lays the logical file format onto a disk. Goal is to view/edit with hex editor for data repair. Lots of docs are available for how to use Unix commands (like xxd), but I want to learn the map of how Unix... (4 Replies)
I have the following script:
awk 'FNR==NR{s+=$3;next;} { print $1 , $2, 100*$3/s }'
and the following file:
>P39PT-1224 Freq 900
cccctacgacggcattggtaatggctcagctgctccggatcccgcaagccatcttggatatgagggttcgtcggcctcttcagccaagg-cccccagcagaacatccagctgatcg
>P39PT-784 Freq 2... (2 Replies)
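A note on the awk one-liner above: the `FNR==NR` test is only true while awk reads its first file argument, so the script needs the same file named twice (first pass sums column 3, second pass prints the percentage). The question itself is cut off, so the sketch below only guesses at the goal: express each `Freq` value as a percentage of the total, looking at header lines only. The function name and the shortened sequences are hypothetical.

```python
def freq_percentages(lines):
    """Return (name, label, percent) for each header line like '>ID Freq N'."""
    headers = [line.split() for line in lines if line.startswith(">")]
    # First pass: total of the third column over all header lines
    total = sum(float(fields[2]) for fields in headers)
    # Second pass: each value as a percentage of that total
    return [(f[0], f[1], 100 * float(f[2]) / total) for f in headers]

# Sequences shortened for illustration; only the header lines come from the post
sample = [">P39PT-1224 Freq 900", "cccctacgacgg", ">P39PT-784 Freq 2", "cccctacgacgg"]
for name, label, pct in freq_percentages(sample):
    print(name, label, round(pct, 2))
```

Unlike the awk version, this skips the sequence lines, where column 3 is empty. It will raise `ZeroDivisionError` if all frequencies are zero.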
I would like to extract all entries containing the following patterns: ccccta & ccccccccc from the following infile:
>P39PT-1224_Freq_900
cccctacgacggcattggtaatggctcccgcaagccatctctcttcagccaagg
>P39PT-784_Freq_2
cccctacgacggcattggtaatggcacccgcaagccatctctcttccccccccc
>P39PT-678_Freq_5... (4 Replies)
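It is not clear from the post whether "&" means an entry must contain both patterns or either one, so this sketch supports both through a flag. It assumes the records have already been read into (header, sequence) pairs; `filter_by_patterns` and the sample records are made up for illustration.

```python
def filter_by_patterns(records, patterns, require_all=True):
    """Keep (header, sequence) pairs whose sequence contains the patterns."""
    match = all if require_all else any
    return [(h, s) for h, s in records if match(p in s for p in patterns)]

# Hypothetical records: rec1 has only the first pattern, rec2 has both
sample = [
    (">rec1", "ccccta" + "gatt"),
    (">rec2", "ccccta" + "gatt" + "c" * 9),
]
patterns = ["ccccta", "ccccccccc"]
for header, seq in filter_by_patterns(sample, patterns):
    print(header)
    print(seq)
```

Pass `require_all=False` to keep entries that contain at least one of the patterns.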
I have two fasta files as shown below,
File:1
>Contig_1:90600-91187
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGGAATTGATGACGGTC
>Contig_98:35323-35886
GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG
>Contig_24:26615-28387... (11 Replies)
I have 5 sequences in a fasta file namely gene1.fasta as follows,
gene1.fasta
>1256
ATGTAGC
>GEP
TAGAG
>GTY578
ATGCATA
>67_iga
ATGCTGA
>90_ld
ATGCTG
I need to rename the gene1.fasta file based on the sequence position specified in list.txt as follows,
list.txt
position1=org5... (5 Replies)
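The list.txt sample is cut off after its first line, so the following is a guess at the intent: each line has the form `positionN=name`, and the Nth sequence header in the fasta file should be replaced by that name. Everything except `position1=org5` and the fasta records is an assumption, including the function name `rename_by_position`.

```python
import re

def rename_by_position(fasta_lines, mapping_lines):
    """Replace the Nth header with the name given by a 'positionN=name' line."""
    names = {}
    for line in mapping_lines:
        m = re.fullmatch(r"position(\d+)=(\S+)", line.strip())
        if m:
            names[int(m.group(1))] = m.group(2)
    out, index = [], 0
    for line in fasta_lines:
        if line.startswith(">"):
            index += 1                                # 1-based record position
            line = ">" + names.get(index, line[1:])   # unmapped headers are kept
        out.append(line)
    return out

fasta = [">1256", "ATGTAGC", ">GEP", "TAGAG"]
mapping = ["position1=org5", "position2=org9"]   # second entry is hypothetical
print("\n".join(rename_by_position(fasta, mapping)))
```

Headers without a matching position in the list are left unchanged, which seems the safer default when the list is incomplete.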
Discussion started by: dineshkumarsrk
LEARN ABOUT DEBIAN
pynast
VERSION:(1)                    User Commands                    VERSION:(1)
NAME
PyNAST - alignment of short DNA sequences
SYNOPSIS
pynast [options] {-i input_fp -t template_fp}
DESCRIPTION
[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)
Example usage:
pynast -i my_input.fasta -t my_template.fasta
OPTIONS
--version
show program's version number and exit
-h, --help
show this help message and exit
-t TEMPLATE_FP, --template_fp=TEMPLATE_FP
path to template alignment file [REQUIRED]
-i INPUT_FP, --input_fp=INPUT_FP
path to input fasta file [REQUIRED]
-v, --verbose
Print status and other information during execution [default: False]
-p MIN_PCT_ID, --min_pct_id=MIN_PCT_ID
minimum percent sequence identity to consider a sequence a match [default: 75.0]
-l MIN_LEN, --min_len=MIN_LEN
minimum sequence length to include in NAST alignment [default: 1000]
-m PAIRWISE_ALIGNMENT_METHOD, --pairwise_alignment_method=PAIRWISE_ALIGNMENT_METHOD
method for performing pairwise alignment [default: uclust]
-a FASTA_OUT_FP, --fasta_out_fp=FASTA_OUT_FP
path to store resulting alignment file [default: derived from input filepath]
-g LOG_FP, --log_fp=LOG_FP
path to store log file [default: derived from input filepath]
-f FAILURE_FP, --failure_fp=FAILURE_FP
path to store file of seqs which fail to align [default: derived from input filepath]
-e MAX_E_VALUE, --max_e_value=MAX_E_VALUE
Deprecated. Will be removed in PyNAST 1.2
-d BLAST_DB, --blast_db=BLAST_DB
Deprecated. Will be removed in PyNAST 1.2
SEE ALSO
http://pynast.sourceforge.net
Version: pynast 1.1 August 2011 VERSION:(1)