Am trying to print all sequence that matches this pattern CGTTGggtTTCATT and their positions in my file but "ggt" can be any nucleotide. The sequence in big letters must match.
I want to search a file for a string and then if the string is found I need the line that the string is on - but also the previous two lines from the file (that the pattern will not be found in)
This is on solaris
Can you help? (2 Replies)
Hi,
I am new to this forum and i would like to get help in this issue.
I have a file 1.txt as shown:
apple
banana
orange
apple
grapes
banana
orange
grapes
orange
....
Now i would like to search for pattern say apple or orange and then put a # at the beginning of the pattern... (2 Replies)
Hi,
I think you ppl did not get my question correctly, let me explain
I have 1.txt with following entries as shown:
0152364|134444|10.20.30.40|015236433
0233654|122555|10.20.30.50|023365433
**
**
**
In file 2.txt I have the following entries as shown:
... (1 Reply)
I am trying to do some thing like this ..
In a file , if pattern found insert new pattern at the begining of the line containing the pattern.
example:
in a file I have this.
gtrow0unit1/gctunit_crrownorth_stage5_outnet_feedthru_pin
if i find feedthru_pin want to insert !! at the... (7 Replies)
Hi all,
I am trying to extract the values ( text between the xml tags) based on the Order Number.
here is the sample input
<?xml version="1.0" encoding="UTF-8"?>
<NJCustomer>
<Header>
<MessageIdentifier>Y504173382</MessageIdentifier>
... (13 Replies)
I am trying to search a file for a patterns ERR- in a file and return a count for each of the error reported
Input file is a free flowing file without any format
example of output
ERR-00001=5
....
ERR-01010=10
.....
ERR-99999=10 (4 Replies)
Hi,
I have two files file1.txt and file2.txt. Please see the attachments.
In file2.txt (which actually is a diff output between two versions of file1.txt.), I extract the pattern corresponding to 1172c1172. Now ,In file1.txt I have to search for this pattern 1172c1172 and if found, I have to... (9 Replies)
Hi,
I've written a script to search for an Oracle ORA- error on a log file, print that line and the .trc file associated with it as well as the dateline of when I assumed the error occured. In most it is the first dateline previous to the error.
Unfortunately, this is not a fool proof script.... (2 Replies)
I have this fileA
TEST FILE ABC
this file contains ABC;
TEST FILE DGHT this file contains DGHT;
TEST FILE 123
this file contains ABC,
this file contains DEF,
this file contains XYZ,
this file contains KLM
;
I want to have a fileZ that has only (begin search pattern for will be... (2 Replies)
Discussion started by: vbabz
2 Replies
LEARN ABOUT DEBIAN
tfbs::word::consensus
TFBS::Word::Consensus(3pm) User Contributed Perl Documentation TFBS::Word::Consensus(3pm)NAME
TFBS::Word - IUPAC DNA consensus word-based pattern class =head1 DESCRIPTION
TFBS::Word is a base class consisting of universal constructor called by its subclasses (TFBS::Matrix::*), and word pattern manipulation
methods that are independent of the word type. It is not meant to be instantiated itself.
FEEDBACK
Please send bug reports and other comments to the author.
AUTHOR - Boris Lenhard
Boris Lenhard <Boris.Lenhard@cgb.ki.se>
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.
new
Title : new
Usage : my $pwm = TFBS::Matrix::PWM->new(%args)
Function: constructor for the TFBS::Matrix::PWM object
Returns : a new TFBS::Matrix::PWM object
Args : # you must specify the -word argument:
-word, # a strig consisting of letters in
# IUPAC degenerate DNA alphabet
# (any of ACGTSWKMPYBDHVN)
#######
-name, # string, OPTIONAL
-ID, # string, OPTIONAL
-class, # string, OPTIONAL
-tags # a hash reference reference, OPTIONAL
search_seq
Title : search_seq
Usage : my $siteset = $pwm->search_seq(%args)
Function: scans a nucleotide sequence with the pattern represented
by the PWM
Returns : a TFBS::SiteSet object
Args : # you must specify either one of the following three:
-file, # the name od a fasta file (single sequence)
#or
-seqobj # a Bio::Seq object
# (more accurately, a Bio::PrimarySeqobject or a
# subclass thereof)
#or
-seqstring # a string containing the sequence
-max_mismatches, # number of allowed positions in the site that do
# not match the consensus
# OPTIONAL: default 0
search_aln
Title : search_aln
Usage : my $site_pair_set = $pwm->search_aln(%args)
Function: Scans a pairwise alignment of nucleotide sequences
with the pattern represented by the word: it reports only
those hits that are present in equivalent positions of both
sequences and exceed a specified threshold score in both, AND
are found in regions of the alignment above the specified
conservation cutoff value.
Returns : a TFBS::SitePairSet object
Args : # you must specify either one of the following three:
-file, # the name of the alignment file in Clustal
format
#or
-alignobj # a Bio::SimpleAlign object
# (more accurately, a Bio::PrimarySeqobject or a
# subclass thereof)
#or
-alignstring # a multi-line string containing the alignment
# in clustal format
#############
-max_mismatches, # number of allowed positions in the site that do
# not match the consensus
# OPTIONAL: default 0
-window, # size of the sliding window (inn nucleotides)
# for calculating local conservation in the
# alignment
# OPTIONAL: default 50
-cutoff # conservation cutoff (%) for including the
# region in the results of the pattern search
# OPTIONAL: default "70%"
to_PWM
validate_word
length
perl v5.14.2 2008-01-24 TFBS::Word::Consensus(3pm)