# BioPerl module for Bio::SearchDist
# Please direct questions and support issues to <bioperl-l@bioperl.org>
# Cared for by Ewan Birney <birney@ebi.ac.uk>
# Copyright Ewan Birney
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code

Bio::SearchDist - A Perl wrapper around Sean Eddy's histogram object
  $dis = Bio::SearchDist->new();
  foreach $score ( @scores ) {
      $dis->add_score($score);
  }

  if( $dis->fit_evd() ) {
      foreach $score ( @scores ) {
          $evalue = $dis->evalue($score);
          print "Score $score had an evalue of $evalue\n";
      }
  } else {
      warn("Could not fit histogram to an EVD!");
  }
The Bio::SearchDist object is a wrapper around Sean Eddy's excellent
histogram object. The histogram object takes in a number of scores
which are sensibly distributed somewhere around 0 and which come from
a supposed Extreme Value Distribution (EVD). Having added all the
scores from a database search via the add_score method, you can then
fit an extreme value distribution using fit_evd(). Once fitted, you
can then get out the evalue for each score (or a new score) using
evalue().
The fitting procedure is better described in Sean Eddy's own code
(available from http://hmmer.janelia.org/, or in the histogram.h header
file in Compile/SW). Basically it fits an EVD via a maximum likelihood
method with pruning of the top end of the distribution so that real
positives are discarded in the fitting procedure. This comes from
an original idea of Richard Mott's, and the likelihood fitting
is from a book by Lawless [should ref here].
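
As a rough illustration of what an evalue means once an EVD has been
fitted, the sketch below assumes the usual Gumbel form of the
distribution, with location mu and scale lambda, and takes the evalue
to be the number of scores times the tail probability. The helper
names evd_pvalue and evd_evalue and the parameter values are
hypothetical; they are not part of Bio::SearchDist, and the compiled
engine may differ in detail (for example in how it handles numerical
limits).

```perl
use strict;
use warnings;

# Tail probability P(S >= x) under a Gumbel-type EVD.
# $mu is the location and $lambda the scale parameter; in practice
# both would come out of the fitting step (assumed here).
sub evd_pvalue {
    my ($x, $mu, $lambda) = @_;
    return 1 - exp( -exp( -$lambda * ($x - $mu) ) );
}

# Expected number of scores >= x among $n scores drawn from that EVD.
sub evd_evalue {
    my ($x, $mu, $lambda, $n) = @_;
    return $n * evd_pvalue($x, $mu, $lambda);
}

# Hypothetical fitted parameters, for illustration only.
my ($mu, $lambda, $n) = (-20.0, 0.25, 10_000);
foreach my $score (0, 20, 40) {
    printf "score %d: evalue %.3g\n",
        $score, evd_evalue($score, $mu, $lambda, $n);
}
```

Higher scores sit further into the tail, so their evalue shrinks; this
is the quantity the Synopsis prints for each score.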
The object relies on the fact that the scores are sensibly distributed
around 0 and that integer bins are sensible for the histogram. Scores
based on bits are often ideal for this (bit-based scoring mechanisms
are what this histogram object was originally designed for).
The original code this was based on comes from the histogram module as
part of the HMMer2 package. Look at http://hmmer.janelia.org/

Its use in Bioperl is via the Compiled XS extension which is cared for
by Ewan Birney (birney@ebi.ac.uk). Please contact Ewan first about
the use of this module.
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists
Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:

  https://github.com/bioperl/bioperl-live/issues
The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _

# Let the code begin...
package Bio::SearchDist;

require Bio::Ext::Align;

print STDERR ("\nThe C-compiled engine for histogram object (Bio::Ext::Align) has not been installed.\n Please install the bioperl-ext package\n\n");

use base qw(Bio::Root::Root);
my($class,@args) = @_;
my $self = $class->SUPER::new(@args);
my($min, $max, $lump) =
    $self->_rearrange([qw(MIN MAX LUMP)], @args);

$self->_engine(&Bio::Ext::Align::new_Histogram($min,$max,$lump));
 Usage   : $dis->add_score(300);
 Function: Adds a single score to the distribution
my ($self,$score) = @_;
$eng = $self->_engine();
#$eng->AddToHistogram($score);
 Usage   : $dis->fit_evd();
 Function: Fits an EVD to the current distribution
 Returns : 1 if it fits successfully, 0 if not
my ($self,@args) = @_;
return $self->_engine()->fit_EVD(10000,1);
my ($self,$high) = @_;
if( ! defined $high ) {

return $self->_engine()->fit_Gaussian($high);
 Usage   : $eval = $dis->evalue($score)
 Function: Returns the evalue of this score
my ($self,$score) = @_;
return $self->_engine()->evalue($score);
 Usage   : $obj->_engine($newval)
 Function: underlying bp_sw:: histogram engine
 Returns : value of _engine
 Args    : newvalue (optional)
my ($self,$value) = @_;
if( defined $value) {
    $self->{'_engine'} = $value;
}
return $self->{'_engine'};