#
# BioPerl module for Bio::Search::Tiling::TilingI
#
# Please direct questions and support issues to <bioperl-l@bioperl.org>
#
# Cared for by Mark A. Jensen <maj@fortinbras.us>
#
# Copyright Mark A. Jensen
#
# You may distribute this module under the same terms as perl itself

# POD documentation - main docs before the code

=head1 NAME

Bio::Search::Tiling::TilingI - Abstract interface for an HSP tiling module

=head1 SYNOPSIS

Not used directly. Useful POD here for developers, however.

The interface is designed to simplify the following code conversion,
from SearchUtils-based statistics to tiling-based statistics:

 # Bio::Search::SearchUtils-based
 while ( my $hit = $result->next_hit ) {
    printf( "E-value: %g; Fraction aligned: %f; Number identical: %d\n",
      $hit->significance, $hit->frac_aligned_query, $hit->num_identical);
 }

 # TilingI-based (MyTiling stands for any concrete TilingI implementation)
 while ( my $hit = $result->next_hit ) {
    my $tiling = Bio::Search::Tiling::MyTiling->new($hit);
    printf( "E-value: %g; Fraction aligned: %f; Number identical: %d\n",
      $hit->significance, $tiling->frac_aligned_query, $tiling->num_identical);
 }

=head1 DESCRIPTION

This module provides strong suggestions for any intended HSP tiling
object implementation. An object subclassing TilingI should override
the methods defined here according to their descriptions below.

See the section STATISTICS METHODS for hints on implementing methods
that are valid across different algorithms and report types.
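
The override pattern described above can be sketched as follows. This is
only an illustration under stated assumptions: C<My::TilingI> is a
hypothetical stand-in for this interface so that the example runs without
BioPerl installed, and C<My::Tiling> is a hypothetical concrete class. A
real implementation would inherit from Bio::Search::Tiling::TilingI and
rely on the C<throw_not_implemented> method it gets from Bio::Root::Root.

```perl
use strict;
use warnings;

# Stand-in for Bio::Search::Tiling::TilingI (assumption: the real
# interface inherits throw_not_implemented from Bio::Root::Root).
package My::TilingI;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub throw_not_implemented {
    my ($self) = @_;
    my $method = ( caller(1) )[3];    # name of the abstract method called
    die "Abstract method '$method' is not implemented by "
        . ref($self) . "\n";
}
sub identities    { shift->throw_not_implemented }
sub conserved     { shift->throw_not_implemented }
sub num_identical { shift->identities(@_) }    # alias, resolved at call time

# Hypothetical concrete class: overrides identities() only.
package My::Tiling;
our @ISA = ('My::TilingI');
sub identities { $_[0]{identities} }

package main;

my $tiling = My::Tiling->new( identities => 34 );
print $tiling->num_identical, "\n";    # the alias reaches the override

# conserved() was not overridden, so it still throws.
my $ok = eval { $tiling->conserved; 1 };
print "conserved() is still abstract: $@" unless $ok;
```

Because the aliases dispatch through the object, a subclass only needs to
override the primary method; the alias picks up the override automatically.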

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.

  bioperl-l@bioperl.org                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

=head2 Support

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem,
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via
the web:

  https://github.com/bioperl/bioperl-live/issues

=head1 AUTHOR - Mark A. Jensen

Email maj@fortinbras.us

=head1 APPENDIX

The rest of the documentation details each of the object methods.
Internal methods are usually prefixed with an underscore (_).

=cut

# Let the code begin...

package Bio::Search::Tiling::TilingI;
use strict;
use warnings;

# Object preamble - inherits from Bio::Root::Root

use base qw(Bio::Root::Root);

=head2 STATISTICS METHODS

The tiling statistics can be thought of as global counterparts to
similar statistics defined for the individual HSPs. We therefore
prescribe definitions for many of the synonymous methods defined in
L<Bio::Search::HSP::HSPI>.

The tiling statistics must be able to keep track of the coordinate
systems in which both the query and subject sequences exist; i.e.,
either nucleotide or amino acid. This information is typically
inferred from the name of the algorithm used to perform the original
search (contained in C<$hit_object-E<gt>algorithm>). Here is a table
of algorithm information that may be useful (if you trust us).

 algorithm   query on hit   coordinates(q/h)
 ---------   ------------   ----------------
 blastn      dna on dna     dna/dna
 blastp      aa  on aa      aa/aa
 blastx      xna on aa      dna/aa
 tblastn     aa  on xna     aa/dna
 tblastx     xna on xna     dna/dna
 fasta       dna on dna     dna/dna
 fastx       xna on aa      dna/aa
 fasty       xna on aa      dna/aa
 tfasta      aa  on xna     aa/dna
 tfasty      aa  on xna     aa/dna
 megablast   dna on dna     dna/dna

 xna: translated nucleotide data
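
One way an implementation might use the table above is as a lookup keyed
on the lowercased algorithm name. The following is a minimal sketch, not
part of this interface: C<%COORDS> and C<coord_type> are hypothetical
names introduced for illustration, and the entries are transcribed from
the table as given.

```perl
use strict;
use warnings;

# Coordinate systems per algorithm, transcribed from the table above:
# [ query coordinates, hit coordinates ]
my %COORDS = (
    blastn    => [ 'dna', 'dna' ],
    blastp    => [ 'aa',  'aa'  ],
    blastx    => [ 'dna', 'aa'  ],
    tblastn   => [ 'aa',  'dna' ],
    tblastx   => [ 'dna', 'dna' ],
    fasta     => [ 'dna', 'dna' ],
    fastx     => [ 'dna', 'aa'  ],
    fasty     => [ 'dna', 'aa'  ],
    tfasta    => [ 'aa',  'dna' ],
    tfasty    => [ 'aa',  'dna' ],
    megablast => [ 'dna', 'dna' ],
);

# coord_type() is a hypothetical helper. $type 'query' selects the query
# side; any other value ('hit', 'subject') selects the hit side.
sub coord_type {
    my ($algorithm, $type) = @_;
    my $pair = $COORDS{ lc($algorithm) }
        or die "Unrecognized algorithm '$algorithm'\n";
    return $type eq 'query' ? $pair->[0] : $pair->[1];
}

# Algorithm names are matched case-insensitively.
print coord_type( 'TBLASTN', 'query' ), '/',
      coord_type( 'TBLASTN', 'hit' ), "\n";    # prints "aa/dna"
```

Dying on an unknown algorithm is a design choice for the sketch; an
implementation could equally fall back to a default or issue a warning.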

Statistics methods must also be aware of differences in reporting
among the algorithms. Hit attributes are not necessarily normalized
over all algorithms. Devs, please feel free to add examples to the
list below.

=over

=item NCBI BLAST vs WU-BLAST (AB-BLAST) lengths

The total length of the alignment is reported differently between
these two flavors. C<$hit_object-E<gt>length()> will contain the
number in the denominator of the stats line; i.e., 120 in

 Identical = 34/120 Positives = 67/120

NCBI BLAST uses the total length of the query sequence as input by
the user (a.k.a. "with gaps"). WU-BLAST uses the length of the query
sequence actually aligned by the algorithm (a.k.a. "without gaps").

=back
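
To make the role of that denominator concrete, here is a small sketch
that pulls the counts out of the sample stats line quoted above and forms
a fraction from them. C<parse_stats_line> is a hypothetical helper written
for this example; it is not how the BioPerl parsers work, and real report
lines may be formatted differently. Whether the 120 refers to input
length or aligned length depends on the BLAST flavor, as described above.

```perl
use strict;
use warnings;

# Sample stats line as quoted in the text above.
my $line = 'Identical = 34/120 Positives = 67/120';

# parse_stats_line() is a hypothetical helper for illustration only.
# Returns a hash reference, or nothing if the line does not match.
sub parse_stats_line {
    my ($text) = @_;
    my ( $identical, $length ) =
        $text =~ m{Identical\s*=\s*(\d+)/(\d+)}
        or return;
    my ($positives) = $text =~ m{Positives\s*=\s*(\d+)/\d+};
    return {
        identical => $identical,
        positives => $positives,
        length    => $length,      # the denominator discussed above
    };
}

my $stats = parse_stats_line($line);
printf "length=%d frac_identical=%.3f\n",
    $stats->{length}, $stats->{identical} / $stats->{length};
# prints "length=120 frac_identical=0.283"
```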

Finally, developers should remember that sequence data may or may not
be associated with the HSPs contained in the hit object. This will
typically depend on whether a full report (e.g., C<blastall -m0>) or a
summary (e.g., C<blastall -m8>) was parsed. Statistics methods that
depend directly on the sequence data will need to check that
the data is present.
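
Such a check might look like the sketch below. It assumes the HSPs respond
to C<query_string> and C<hit_string>, as objects implementing
L<Bio::Search::HSP::HSPI> do. C<My::MockHSP> and C<has_seq_data> are
hypothetical names used only so that the example runs on its own.

```perl
use strict;
use warnings;

# Stand-in for an HSP object, for illustration only. Real HSPs implement
# Bio::Search::HSP::HSPI, which provides query_string and hit_string.
package My::MockHSP;
sub new          { my ($class, %args) = @_; return bless {%args}, $class }
sub query_string { $_[0]{query_string} }
sub hit_string   { $_[0]{hit_string} }

package main;

# has_seq_data() is a hypothetical helper: true only if at least one HSP
# is given and every HSP carries non-empty query and hit strings.
sub has_seq_data {
    my (@hsps) = @_;
    return 0 unless @hsps;
    for my $hsp (@hsps) {
        for my $str ( $hsp->query_string, $hsp->hit_string ) {
            return 0 unless defined $str && length $str;
        }
    }
    return 1;
}

my $full = My::MockHSP->new(
    query_string => 'MKV-LA',
    hit_string   => 'MKVELA',
);
my $summary = My::MockHSP->new();    # e.g., parsed from a tabular summary

printf "full report: %s\n", has_seq_data($full)    ? 'yes' : 'no';
printf "summary:     %s\n", has_seq_data($summary) ? 'yes' : 'no';
```

A statistics method that needs the aligned strings could call a check like
this first and fall back to an estimate, or throw, when it returns false.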

=head2 identities

 Title   : identities
 Alias   : num_identical
 Usage   : $num_identities = $tiling->identities()
 Function: Return the estimated or exact number of identities in the
           tiling, accounting for overlapping HSPs
 Returns : number of identical residue pairs

=cut

sub identities {
    my ($self, @args) = @_;
    $self->throw_not_implemented;
}

# alias matching the Bio::Search::HSP::HSPI method name
sub num_identical { shift->identities( @_ ) }

=head2 conserved

 Title   : conserved
 Alias   : num_conserved
 Usage   : $num_conserved = $tiling->conserved()
 Function: Return the estimated or exact number of conserved sites in the
           tiling, accounting for overlapping HSPs
 Returns : number of conserved residue pairs

=cut

sub conserved {
    my ($self, @args) = @_;
    $self->throw_not_implemented;
}

# alias matching the Bio::Search::HSP::HSPI method name
sub num_conserved { shift->conserved( @_ ) }

=head2 length

 Title   : length
 Usage   : $max_length = $tiling->length($type)
 Function: Return the total number of residues of the subject or query
           sequence covered by the tiling
 Returns : number of "logical" residues covered
 Args    : scalar $type, one of 'hit', 'subject', 'query'

=cut

sub length {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

=head2 frac_identical

 Title   : frac_identical
 Usage   : $tiling->frac_identical($type)
 Function: Return the fraction of sequence length consisting
           of identical pairs
 Returns : scalar float
 Args    : scalar $type, one of 'hit', 'subject', 'query'
 Note    : This method must take account of the $type coordinate
           system and the length reporting method (see STATISTICS
           METHODS)

=cut

sub frac_identical {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

=head2 percent_identity

 Title   : percent_identity
 Usage   : $tiling->percent_identity($type)
 Function: Return the fraction of sequence length consisting
           of identical pairs as a percentage
 Returns : scalar float
 Args    : scalar $type, one of 'hit', 'subject', 'query'

=cut

sub percent_identity {
    my ($self, $type, @args) = @_;
    return $self->frac_identical($type, @args) * 100;
}

=head2 frac_conserved

 Title   : frac_conserved
 Usage   : $tiling->frac_conserved($type)
 Function: Return the fraction of sequence length consisting
           of conserved pairs
 Returns : scalar float
 Args    : scalar $type, one of 'hit', 'subject', 'query'
 Note    : This method must take account of the $type coordinate
           system and the length reporting method (see STATISTICS
           METHODS)

=cut

sub frac_conserved {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

=head2 percent_conserved

 Title   : percent_conserved
 Usage   : $tiling->percent_conserved($type)
 Function: Return the fraction of sequence length consisting
           of conserved pairs as a percentage
 Returns : scalar float
 Args    : scalar $type, one of 'hit', 'subject', 'query'

=cut

sub percent_conserved {
    my ($self, $type, @args) = @_;
    return $self->frac_conserved($type, @args) * 100;
}

=head2 frac_aligned

 Title   : frac_aligned
 Aliases : frac_aligned_query, frac_aligned_hit
 Usage   : $tiling->frac_aligned($type)
 Function: Return the fraction of B<input> sequence length
           that was aligned by the algorithm
 Returns : scalar float
 Args    : scalar $type, one of 'hit', 'subject', 'query'
 Note    : This method must take account of the $type coordinate
           system and the length reporting method (see STATISTICS
           METHODS)

=cut

sub frac_aligned {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

# aliases for back compat
sub frac_aligned_query { shift->frac_aligned('query', @_) }
sub frac_aligned_hit   { shift->frac_aligned('hit', @_) }

=head2 range

 Title   : range
 Usage   : $tiling->range($type)
 Function: Return the extent of the longest tiling
           as ($min_coord, $max_coord)
 Returns : array of two scalar integers
 Args    : scalar $type, one of 'hit', 'subject', 'query'

=cut

sub range {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

=head1 TILING ITERATORS

=head2 next_tiling

 Title   : next_tiling
 Usage   : @hsps = $self->next_tiling($type);
 Function: Obtain a tiling of HSPs over the $type ('hit', 'subject',
           'query') sequence
 Returns : an array of HSPI objects
 Args    : scalar $type: one of 'hit', 'subject', 'query', with
           'subject' an alias for 'hit'

=cut

sub next_tiling {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

=head2 rewind_tilings

 Title   : rewind_tilings
 Alias   : rewind
 Usage   : $self->rewind_tilings($type)
 Function: Reset the next_tiling($type) iterator
 Returns : True on success
 Args    : scalar $type: one of 'hit', 'subject', 'query', with
           'subject' an alias for 'hit'

=cut

sub rewind_tilings {
    my ($self, $type, @args) = @_;
    $self->throw_not_implemented;
}

# alias
sub rewind { shift->rewind_tilings(@_) }

=head1 INFORMATIONAL ACCESSORS

=head2 algorithm

 Title   : algorithm
 Usage   : $tiling->algorithm
 Function: Retrieve the algorithm name associated with the
           invocant's hit object
 Returns : scalar string

=cut

sub algorithm {
    my ($self, @args) = @_;
    $self->throw_not_implemented;
}