10-27-2015
Hi RudiC, thank you. Will this help deal with the problem of the 345 million records in the score table? Is there a way we can avoid the 3 for loops? Thanks a lot for your help.
Bio::SearchDist(3pm) User Contributed Perl Documentation Bio::SearchDist(3pm)
NAME
Bio::SearchDist - A perl wrapper around Sean Eddy's histogram object
SYNOPSIS
$dis = Bio::SearchDist->new();
foreach $score ( @scores ) {
    $dis->add_score($score);
}
if( $dis->fit_evd() ) {
    foreach $score ( @scores ) {
        $evalue = $dis->evalue($score);
        print "Score $score had an evalue of $evalue\n";
    }
} else {
    warn("Could not fit histogram to an EVD!");
}
DESCRIPTION
The Bio::SearchDist object is a wrapper around Sean Eddy's excellent histogram object. The histogram object can basically take in a number
of scores which are sensibly distributed somewhere around 0 and which come from a supposed Extreme Value Distribution (EVD). Having added all
the scores from a database search via the add_score method, you can then fit an extreme value distribution using fit_evd(). Once fitted, you
can then get out the evalue for each score (or a new score) using evalue($score).
The fitting procedure is better described in Sean Eddy's own code (available from http://hmmer.janelia.org/, or in the histogram.h header
file in Compile/SW). Basically it fits an EVD via a maximum likelihood method with pruning of the top end of the distribution so that real
positives are discarded in the fitting procedure. This comes from an original idea of Richard Mott's and the likelihood fitting is from a
book by Lawless [should ref here].
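As a rough illustration of what a fitted EVD gives you, the sketch below uses the standard Gumbel (type I extreme value) tail probability, P(S >= x) = 1 - exp(-exp(-lambda * (x - mu))), and scales it by the number of scores to get an expected count. The names evd_pvalue, evd_evalue, $mu, $lambda and $n are hypothetical and are not part of Bio::SearchDist. The fitting step itself (maximum likelihood with pruning) is not shown, and the module's own evalue() may differ in detail.

```perl
use strict;
use warnings;

# Tail probability of a Gumbel (type I extreme value) distribution:
#   P(S >= x) = 1 - exp(-exp(-lambda * (x - mu)))
# $mu (location) and $lambda (scale) are assumed to be already fitted.
sub evd_pvalue {
    my ($x, $mu, $lambda) = @_;
    return 1 - exp( -exp( -$lambda * ($x - $mu) ) );
}

# Expected number of scores >= x among $n scores from that distribution.
sub evd_evalue {
    my ($x, $mu, $lambda, $n) = @_;
    return $n * evd_pvalue($x, $mu, $lambda);
}

# Illustrative parameters only; real values would come from the fit.
my ($mu, $lambda, $n) = (0, 0.7, 1000);
for my $score (0, 10, 20) {
    printf "score %d: E = %.3g\n", $score, evd_evalue($score, $mu, $lambda, $n);
}
```

Higher scores give smaller expected counts, which is why a small evalue marks a score as unlikely to have come from the background distribution.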
The object relies on the fact that the scores are sensibly distributed around 0 and that integer bins are sensible for the histogram.
Scores based on bits are often ideal for this (bit-based scoring mechanisms are what this histogram object was originally designed for).
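To make the integer-bin assumption concrete, here is a minimal pure-Perl sketch that counts scores in unit-width bins. The function name bin_scores and the choice of floor() for bin assignment are assumptions made for illustration; this is not the histogram code that Bio::SearchDist wraps.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Count scores in unit-width bins; bin k holds scores in [k, k+1).
# Illustration of integer binning only, not the wrapped histogram engine.
sub bin_scores {
    my @scores = @_;
    my %count;
    $count{ floor($_) }++ for @scores;
    return \%count;
}

# Example scores scattered around 0, as the module expects.
my $hist = bin_scores(-1.2, -0.4, 0.3, 0.9, 2.5);
for my $bin (sort { $a <=> $b } keys %$hist) {
    printf "%3d: %s\n", $bin, '*' x $hist->{$bin};
}
```

With bit scores, a bin width of 1 usually gives a reasonable number of occupied bins; scores on a much finer or coarser scale would need rescaling before integer bins make sense.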
CONTACT
The original code this was based on comes from the histogram module as part of the HMMer2 package. Look at http://hmmer.janelia.org/
Its use in Bioperl is via the Compiled XS extension, which is cared for by Ewan Birney (birney@ebi.ac.uk). Please contact Ewan first about
the use of this module.
FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address
it. Please include a thorough description of the problem with code and data examples if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the
web:
https://redmine.open-bio.org/projects/bioperl/
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
add_score
Title : add_score
Usage : $dis->add_score(300);
Function: Adds a single score to the distribution
Returns : nothing
Args :
fit_evd
Title : fit_evd
Usage : $dis->fit_evd();
Function: fits an evd to the current distribution
Returns : 1 if it fits successfully, 0 if not
Args :
fit_Gaussian
Title : fit_Gaussian
Usage :
Function:
Example :
Returns :
Args :
evalue
Title : evalue
Usage : $eval = $dis->evalue($score)
Function: Returns the evalue of this score
Returns : float
Args :
_engine
Title : _engine
Usage : $obj->_engine($newval)
Function: underlying bp_sw:: histogram engine
Returns : value of _engine
Args : newvalue (optional)
perl v5.14.2 2012-03-02 Bio::SearchDist(3pm)