This is what I get
So I ran it in two steps: the first awk,
then the second awk.
And I get the same error.
I am also attaching the input and intermediate files; the output file is empty.
Thanks!
I am attempting to replace positions 44-46 with YYY if positions 48-50 = XXX.
awk -F "" '{if (substr($0,48,3)=="XXX") $44="YYY"}1' OFS="" $filename > $tempfile
But this is not working: positions 44-46 are still spaces in my tempfile instead of YYY. Any suggestions would be greatly appreciated. (9 Replies)
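One way to approach this, as a minimal sketch rather than a definitive fix: rebuild the line from substrings instead of splitting on an empty field separator. Splitting with `-F ""` is an extension that POSIX does not specify, and `$44="YYY"` assigns to a single field rather than to three character positions. The function name `replace_cols` is made up for illustration; lines where positions 48-50 are not XXX pass through unchanged.

```shell
# Hypothetical helper: reads lines on stdin, writes the result to stdout.
replace_cols() {
  awk '{
    # If positions 48-50 hold XXX, overwrite positions 44-46 with YYY.
    if (substr($0, 48, 3) == "XXX")
      $0 = substr($0, 1, 43) "YYY" substr($0, 47)
    print
  }'
}
```

It could then be called as `replace_cols < "$filename" > "$tempfile"`, with the variables quoted in case the paths contain spaces.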
In my file the data looks like this:
1,2,3
3,4,5,6,7,8
10,11,23,24
I want to turn it into:
1,2,3,?,?,?
3,4,5,6,7,8
10,11,23,24,?,?
Here the maximum number of words (separated by commas) in a line is 6, so every line should contain 6 words. Lines with fewer than 6 words should be padded with '?' as a word.
i have... (3 Replies)
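A possible sketch for the padding described above, assuming the target width is known in advance (6 here) and that no field itself contains a comma. The name `pad_fields` and its optional width argument are illustrative only. Assigning to a field beyond `NF` makes awk extend the record and rebuild it with `OFS`.

```shell
# Hypothetical helper: pad_fields [WIDTH] < input
pad_fields() {
  awk -F, -v OFS=, -v want="${1:-6}" '{
    # Append "?" fields until the line has the wanted number of fields.
    for (i = NF + 1; i <= want; i++) $i = "?"
    print
  }'
}
```

If the maximum width is not known beforehand, a first pass over the file to find the largest `NF` would be needed before padding.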
Hi
My file has a series of rows up to 160 characters in length.
There are 7 columns for each row.
In each row, column 1 starts at position 4
column 2 starts at position 12
column 3 starts at position 43
column 4 starts at position 82
column 5 starts at... (7 Replies)
Greetings.
I need to extract text between two character positions, e.g: all text between character 4921 and 6534.
The text blocks are FASTA-format sequence of whole chromosomes, so basically a million A, T, G, C, combinations. E.g:
>Chr_1
ACCTGTTCAACTCTCAGGACTCTCAGGTCAACTCTCAG... (3 Replies)
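A rough sketch of one way to pull out such a range, under several assumptions: the file holds a single FASTA record, positions are 1-based and inclusive, and they count sequence characters only (header line and line breaks excluded). The function name `extract_range` is hypothetical.

```shell
# Hypothetical helper: extract_range FROM TO < file.fasta
extract_range() {
  awk -v from="$1" -v to="$2" '
    /^>/ { next }              # skip the FASTA header line
    { seq = seq $0 }           # join the wrapped sequence lines
    END { print substr(seq, from, to - from + 1) }'
}
```

For the example in the question this would be called as `extract_range 4921 6534 < file.fasta`. The whole sequence is held in memory, which is normally acceptable for a few million characters.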
Hello,
For example:
12........6789101112..............20212223242526..................50 ( Positions)
LName FName DOB (LName runs from 1 to 6, FName from 8 to 15, and date of birth from 21 to 29)
CURTIS KENNETH ... (5 Replies)
I have files with hundreds of sequences with frequency values reported as "Freq X" and missing characters represented by a dash ("-"), something like this
>39sample Freq 4
TAGATGTGCCCGTGGGTTTCCCGTCAACACCGGATAGTAGCAGCACTA
>22sample Freq 15
T-GATGTCGTGGGTTTCCCGTCAACACCGGCAAATAGTAGCAGCACTA... (12 Replies)
hi.
I have a fixed-length text file as input where character positions 4-5 (two character positions starting from the 4th position) hold the LOB indicator. The file structure is something like below:
10126Apple DrinkOmaha
10231Milkshake New Jersey
103 Billabong Illinois
... (6 Replies)
Discussion started by: kumarjt
tfbs::word::consensus
TFBS::Word::Consensus(3pm) User Contributed Perl Documentation TFBS::Word::Consensus(3pm)
NAME
TFBS::Word - IUPAC DNA consensus word-based pattern class
DESCRIPTION
TFBS::Word is a base class consisting of a universal constructor called by its subclasses (TFBS::Matrix::*), and word pattern manipulation
methods that are independent of the word type. It is not meant to be instantiated itself.
FEEDBACK
Please send bug reports and other comments to the author.
AUTHOR - Boris Lenhard
Boris Lenhard <Boris.Lenhard@cgb.ki.se>
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.
new
Title : new
Usage : my $pwm = TFBS::Matrix::PWM->new(%args)
Function: constructor for the TFBS::Matrix::PWM object
Returns : a new TFBS::Matrix::PWM object
Args : # you must specify the -word argument:
-word, # a string consisting of letters in
# IUPAC degenerate DNA alphabet
# (any of ACGTSWKMRYBDHVN)
#######
-name, # string, OPTIONAL
-ID, # string, OPTIONAL
-class, # string, OPTIONAL
-tags # a hash reference, OPTIONAL
search_seq
Title : search_seq
Usage : my $siteset = $pwm->search_seq(%args)
Function: scans a nucleotide sequence with the pattern represented
by the PWM
Returns : a TFBS::SiteSet object
Args : # you must specify either one of the following three:
-file, # the name of a fasta file (single sequence)
#or
-seqobj # a Bio::Seq object
# (more accurately, a Bio::PrimarySeq object or a
# subclass thereof)
#or
-seqstring # a string containing the sequence
-max_mismatches, # number of allowed positions in the site that do
# not match the consensus
# OPTIONAL: default 0
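To illustrate what scanning with a consensus word and -max_mismatches means, here is a small awk sketch. It is not the TFBS implementation and does not call the module: it only checks the forward strand, it uses the standard IUPAC code table, and the function name `scan_consensus` and its output format (start position, matched site, mismatch count) are assumptions made for this example.

```shell
# Hypothetical helper: scan_consensus WORD [MAX_MISMATCHES] < sequence
scan_consensus() {
  awk -v word="$1" -v maxmm="${2:-0}" '
    BEGIN {
      # IUPAC code -> bases it stands for
      code["A"]="A";   code["C"]="C";   code["G"]="G";   code["T"]="T"
      code["R"]="AG";  code["Y"]="CT";  code["S"]="CG";  code["W"]="AT"
      code["K"]="GT";  code["M"]="AC";  code["B"]="CGT"; code["D"]="AGT"
      code["H"]="ACT"; code["V"]="ACG"; code["N"]="ACGT"
      wlen = length(word)
    }
    /^>/ { next }                     # skip FASTA header lines
    { seq = seq toupper($0) }         # join sequence lines
    END {
      for (start = 1; start + wlen - 1 <= length(seq); start++) {
        mm = 0
        # Count positions where the base is not covered by the code.
        for (i = 1; i <= wlen && mm <= maxmm; i++)
          if (index(code[substr(word, i, 1)], substr(seq, start + i - 1, 1)) == 0)
            mm++
        if (mm <= maxmm) print start, substr(seq, start, wlen), mm
      }
    }'
}
```

For instance, `scan_consensus ACGW 0 < seq.fasta` would report every forward-strand window that matches ACGA or ACGT exactly.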
search_aln
Title : search_aln
Usage : my $site_pair_set = $pwm->search_aln(%args)
Function: Scans a pairwise alignment of nucleotide sequences
with the pattern represented by the word: it reports only
those hits that are present in equivalent positions of both
sequences and exceed a specified threshold score in both, AND
are found in regions of the alignment above the specified
conservation cutoff value.
Returns : a TFBS::SitePairSet object
Args : # you must specify either one of the following three:
-file, # the name of the alignment file in Clustal
# format
#or
-alignobj # a Bio::SimpleAlign object
# (more accurately, a Bio::PrimarySeq object or a
# subclass thereof)
#or
-alignstring # a multi-line string containing the alignment
# in clustal format
#############
-max_mismatches, # number of allowed positions in the site that do
# not match the consensus
# OPTIONAL: default 0
-window, # size of the sliding window (in nucleotides)
# for calculating local conservation in the
# alignment
# OPTIONAL: default 50
-cutoff # conservation cutoff (%) for including the
# region in the results of the pattern search
# OPTIONAL: default "70%"
to_PWM
validate_word
length
perl v5.14.2 2008-01-24 TFBS::Word::Consensus(3pm)