3 Bio::DB::GFF::Feature -- A relative segment identified by a feature type
11 Bio::DB::GFF::Feature is a stretch of sequence that corresponding to a
12 single annotation in a GFF database. It inherits from
13 Bio::DB::GFF::RelSegment, and so has all the support for relative
14 addressing of this class and its ancestors. It also inherits from
15 Bio::SeqFeatureI and so has the familiar start(), stop(),
16 primary_tag() and location() methods (it implements Bio::LocationI
19 Bio::DB::GFF::Feature adds new methods to retrieve the annotation
20 type, group, and other GFF attributes. Annotation types are
21 represented by Bio::DB::GFF::Typename objects, a simple class that has
22 two methods called method() and source(). These correspond to the
23 method and source fields of a GFF file.
25 Annotation groups serve the dual purpose of giving the annotation a
26 human-readable name, and providing higher-order groupings of
27 subfeatures into features. The groups returned by this module are
28 objects of the Bio::DB::GFF::Featname class.
30 Bio::DB::GFF::Feature inherits from and implements the abstract
31 methods of Bio::SeqFeatureI, allowing it to interoperate with other
34 Generally, you will not create or manipulate Bio::DB::GFF::Feature
35 objects directly, but use those that are returned by the
36 Bio::DB::GFF::RelSegment-E<gt>features() method.
38 =head2 Important note about start() vs end()
40 If features are derived from segments that use relative addressing
41 (which is the default), then start() will be less than end() if the
42 feature is on the opposite strand from the reference sequence. This
43 breaks Bio::SeqI compliance, but is necessary to avoid having the real
44 genomic locations designated by start() and end() swap places when
45 changing reference points.
47 To avoid this behavior, call $segment-E<gt>absolute(1) before fetching
48 features from it. This will force everything into absolute
53 my $segment = $db->segment('CHROMOSOME_I');
54 $segment->absolute(1);
55 my @features = $segment->features('transcript');
59 The remainder of this document describes the public and private
60 methods implemented by this module.
64 package Bio
::DB
::GFF
::Feature
;
68 use Bio
::DB
::GFF
::Util
::Rearrange
;
69 use Bio
::DB
::GFF
::Featname
;
70 use Bio
::DB
::GFF
::Typename
;
71 use Bio
::DB
::GFF
::Homol
;
75 use vars
qw($AUTOLOAD);
76 use base qw(Bio::DB::GFF::RelSegment Bio::SeqFeatureI Bio::Root::Root);
80 *segments
= *get_SeqFeatures
= \
&sub_SeqFeature
;
82 my %CONSTANT_TAGS = (method
=>1, source
=>1, score
=>1, phase
=>1, notes
=>1, id
=>1, group
=>1);
84 =head2 new_from_parent
86 Title : new_from_parent
87 Usage : $f = Bio::DB::GFF::Feature->new_from_parent(@args);
88 Function: create a new feature object
89 Returns : new Bio::DB::GFF::Feature object
93 This method is called by Bio::DB::GFF to create a new feature using
94 information obtained from the GFF database. It is one of two similar
95 constructors. This one is called when the feature is generated from a
96 RelSegment object, and should inherit the coordinate system of that
99 The 13 arguments are positional (sorry):
101 $parent a Bio::DB::GFF::RelSegment object (or descendent)
102 $start start of this feature
103 $stop stop of this feature
104 $method this feature's GFF method
105 $source this feature's GFF source
106 $score this feature's score
107 $fstrand this feature's strand (relative to the source
108 sequence, which has its own strandedness!)
109 $phase this feature's phase
110 $group this feature's group (a Bio::DB::GFF::Featname object)
111 $db_id this feature's internal database ID
112 $group_id this feature's internal group database ID
113 $tstart this feature's target start
114 $tstop this feature's target stop
116 tstart and tstop are not used for anything at the moment, since the
117 information is embedded in the group object.
121 # this is called for a feature that is attached to a parent sequence,
122 # in which case it inherits its coordinate reference system and strandedness
123 sub new_from_parent
{
127 $method,$source,$score,
129 $group,$db_id,$group_id,
130 $tstart,$tstop) = @_;
132 ($start,$stop) = ($stop,$start) if defined($fstrand) and $fstrand eq '-';
133 my $class = $group ?
$group->class : $parent->class;
136 factory
=> $parent->{factory
},
137 sourceseq
=> $parent->{sourceseq
},
138 strand
=> $parent->{strand
},
139 ref => $parent->{ref},
140 refstart
=> $parent->{refstart
},
141 refstrand
=> $parent->{refstrand
},
142 absolute
=> $parent->{absolute
},
145 type
=> Bio
::DB
::GFF
::Typename
->new($method,$source),
151 group_id
=> $group_id,
160 Usage : $f = Bio::DB::GFF::Feature->new(@args);
161 Function: create a new feature object
162 Returns : new Bio::DB::GFF::Feature object
166 This method is called by Bio::DB::GFF to create a new feature using
167 information obtained from the GFF database. It is one of two similar
168 constructors. This one is called when the feature is generated
169 without reference to a RelSegment object, and should therefore use its
170 default coordinate system (relative to itself).
172 The 11 arguments are positional:
174 $factory a Bio::DB::GFF adaptor object (or descendent)
175 $srcseq the source sequence
176 $start start of this feature
177 $stop stop of this feature
178 $method this feature's GFF method
179 $source this feature's GFF source
180 $score this feature's score
181 $fstrand this feature's strand (relative to the source
182 sequence, which has its own strandedness!)
183 $phase this feature's phase
184 $group this feature's group
185 $db_id this feature's internal database ID
189 # 'This is called when creating a feature from scratch. It does not have
190 # an inherited coordinate system.
198 $score,$fstrand,$phase,
199 $group,$db_id,$group_id,
200 $tstart,$tstop) = @_;
202 my $self = bless { },$package;
203 ($start,$stop) = ($stop,$start) if defined($fstrand) and $fstrand eq '-';
205 my $class = $group ?
$group->class : 'Sequence';
207 @
{$self}{qw(factory sourceseq start stop strand class)} =
208 ($factory,$srcseq,$start,$stop,$fstrand,$class);
210 # if the target start and stop are defined, then we use this information to create
211 # the reference sequence
212 # THIS SHOULD BE BUILT INTO RELSEGMENT
213 if (0 && $tstart ne '' && $tstop ne '') {
214 if ($tstart < $tstop) {
215 @
{$self}{qw(ref refstart refstrand)} = ($group,$start - $tstart + 1,'+');
217 @
{$self}{'start','stop'} = @
{$self}{'stop','start'};
218 @
{$self}{qw(ref refstart refstrand)} = ($group,$tstop + $stop - 1,'-');
222 @
{$self}{qw(ref refstart refstrand)} = ($srcseq,1,'+');
225 @
{$self}{qw(type fstrand score phase group db_id group_id absolute)} =
226 (Bio
::DB
::GFF
::Typename
->new($method,$source),$fstrand,$score,$phase,
227 $group,$db_id,$group_id,$factory->{absolute
});
235 Usage : $type = $f->type([$newtype])
236 Function: get or set the feature type
237 Returns : a Bio::DB::GFF::Typename object
238 Args : a new Typename object (optional)
241 This method gets or sets the type of the feature. The type is a
242 Bio::DB::GFF::Typename object, which encapsulates the feature method
245 The method() and source() methods described next provide shortcuts to
246 the individual fields of the type.
252 my $d = $self->{type
};
253 $self->{type
} = shift if @_;
260 Usage : $method = $f->method([$newmethod])
261 Function: get or set the feature method
263 Args : a new method (optional)
266 This method gets or sets the feature method. It is a convenience
267 feature that delegates the task to the feature's type object.
273 my $d = $self->{type
}->method;
274 $self->{type
}->method(shift) if @_;
281 Usage : $source = $f->source([$newsource])
282 Function: get or set the feature source
284 Args : a new source (optional)
287 This method gets or sets the feature source. It is a convenience
288 feature that delegates the task to the feature's type object.
294 my $d = $self->{type
}->source;
295 $self->{type
}->source(shift) if @_;
302 Usage : $score = $f->score([$newscore])
303 Function: get or set the feature score
305 Args : a new score (optional)
308 This method gets or sets the feature score.
314 my $d = $self->{score
};
315 $self->{score
} = shift if @_;
322 Usage : $phase = $f->phase([$phase])
323 Function: get or set the feature phase
325 Args : a new phase (optional)
328 This method gets or sets the feature phase.
334 my $d = $self->{phase
};
335 $self->{phase
} = shift if @_;
342 Usage : $strand = $f->strand
343 Function: get the feature strand
348 Returns the strand of the feature. Unlike the other methods, the
349 strand cannot be changed once the object is created (due to coordinate
356 return 0 unless $self->{fstrand
};
357 if ($self->absolute) {
358 return Bio
::DB
::GFF
::RelSegment
::_to_strand
($self->{fstrand
});
360 return $self->SUPER::strand
|| Bio
::DB
::GFF
::RelSegment
::_to_strand
($self->{fstrand
});
366 Usage : $group = $f->group([$new_group])
367 Function: get or set the feature group
368 Returns : a Bio::DB::GFF::Featname object
369 Args : a new group (optional)
372 This method gets or sets the feature group. The group is a
373 Bio::DB::GFF::Featname object, which has an ID and a class.
379 my $d = $self->{group
};
380 $self->{group
} = shift if @_;
387 Usage : $display_id = $f->display_id([$display_id])
388 Function: get or set the feature display id
389 Returns : a Bio::DB::GFF::Featname object
390 Args : a new display_id (optional)
393 This method is an alias for group(). It is provided for
394 Bio::SeqFeatureI compatibility.
401 Usage : $info = $f->info([$new_info])
402 Function: get or set the feature group
403 Returns : a Bio::DB::GFF::Featname object
404 Args : a new group (optional)
407 This method is an alias for group(). It is provided for AcePerl
413 *display_id
= \
&group
;
414 *display_name
= \
&group
;
419 Usage : $target = $f->target([$new_target])
420 Function: get or set the feature target
421 Returns : a Bio::DB::GFF::Homol object
422 Args : a new group (optional)
425 This method works like group(), but only returns the group if it
426 implements the start() method. This is typical for
427 similarity/assembly features, where the target encodes the start and
428 stop location of the alignment.
430 The returned object is of type Bio::DB::GFF::Homol, which is a
431 subclass of Bio::DB::GFF::Segment.
438 my $group = $self->group or return;
439 return unless $group->can('start');
443 =head2 flatten_target
445 Title : flatten_target
446 Usage : $target = $f->flatten_target($f->target)
447 Function: flatten a target object
448 Returns : a string (GFF2), an array [GFF2.5] or an array ref [GFF3]
449 Args : a target object (required), GFF version (optional)
452 This method flattens a target object into text for
453 GFF dumping. If a second argument is provided, version-specific
454 vocabulary is used for the flattened target.
460 my $t = shift || return;
463 return 0 unless $t->can('start');
464 my $class = $t->class;
466 my $start = $t->start;
472 print STDERR
qq(Target
"$class:$name"), "tstart $start", "tstop $stop\n";
473 return (qq(Target
"$class:$name"), "tstart $start", "tstop $stop");
476 return [Target
=>"$name $start $stop"];
479 return qq(Target
"$class:$name" $start $stop);
483 # override parent a smidgeon so that setting the ref for top-level feature
484 # sets ref for all subfeatures
487 my $result = $self->SUPER::refseq
(@_);
489 my $newref = $self->SUPER::refseq
;
490 for my $sub ($self->get_SeqFeatures) {
501 Usage : $hit = $f->hit([$new_hit])
502 Function: get or set the feature hit
503 Returns : a Bio::DB::GFF::Featname object
504 Args : a new group (optional)
507 This is the same as target(), for compatibility with
508 Bio::SeqFeature::SimilarityPair.
518 Function: get the feature ID
519 Returns : a database identifier
523 This method retrieves the database identifier for the feature. It
528 sub id
{ shift->{db_id
} }
529 sub primary_id
{ shift->{db_id
} }
534 Usage : $id = $f->group_id
535 Function: get the feature ID
536 Returns : a database identifier
540 This method retrieves the database group identifier for the feature.
541 It cannot be changed. Often the group identifier is more useful than
542 the feature identifier, since it is used to refer to a complex object
547 sub group_id
{ shift->{group_id
} }
552 Usage : $feature = $f->clone
553 Function: make a copy of the feature
554 Returns : a new Bio::DB::GFF::Feature object
558 This method returns a copy of the feature.
564 my $clone = $self->SUPER::clone
;
566 if (ref(my $t = $clone->type)) {
567 my $type = $t->can('clone') ?
$t->clone : bless {%$t},ref $t;
571 if (ref(my $g = $clone->group)) {
572 my $group = $g->can('clone') ?
$g->clone : bless {%$g},ref $g;
573 $clone->group($group);
576 if (my $merged = $self->{merged_segs
}) {
577 $clone->{merged_segs
} = { %$merged };
586 Usage : $flag = $f->compound([$newflag])
587 Function: get or set the compound flag
589 Args : a new flag (optional)
592 This method gets or sets a flag indicated that the feature is not a
593 primary one from the database, but the result of aggregation.
599 my $d = $self->{compound
};
600 $self->{compound
} = shift if @_;
604 =head2 sub_SeqFeature
606 Title : sub_SeqFeature
607 Usage : @feat = $feature->sub_SeqFeature([$method])
608 Function: get subfeatures
609 Returns : a list of Bio::DB::GFF::Feature objects
610 Args : a feature method (optional)
613 This method returns a list of any subfeatures that belong to the main
614 feature. For those features that contain heterogeneous subfeatures,
615 you can retrieve a subset of the subfeatures by providing a method
618 This method may also be called as segments() or get_SeqFeatures().
625 my $subfeat = $self->{subfeatures
} or return;
626 $self->sort_features;
629 my $features = $subfeat->{lc $type} or return;
632 @a = map {@
{$_}} values %{$subfeat};
637 =head2 add_subfeature
639 Title : add_subfeature
640 Usage : $feature->add_subfeature($feature)
641 Function: add a subfeature to the feature
643 Args : a Bio::DB::GFF::Feature object
646 This method adds a new subfeature to the object. It is used
647 internally by aggregators, but is available for public use as well.
654 my $type = $feature->method;
655 my $subfeat = $self->{subfeatures
}{lc $type} ||= [];
656 push @
{$subfeat},$feature;
662 Usage : $sf->attach_seq($seq)
663 Function: Attaches a Bio::Seq object to this feature. This
664 Bio::Seq object is for the *entire* sequence: ie
667 Returns : TRUE on success
668 Args : a Bio::PrimarySeqI compliant object
678 Usage : my $location = $seqfeature->location()
679 Function: returns a location object suitable for identifying location
680 of feature on sequence or parent feature
681 Returns : Bio::LocationI object
688 require Bio
::Location
::Split
unless Bio
::Location
::Split
->can('new');
689 require Bio
::Location
::Simple
unless Bio
::Location
::Simple
->can('new');
692 if (my @segments = $self->segments) {
693 $location = Bio
::Location
::Split
->new(-seq_id
=> $self->seq_id);
694 foreach (@segments) {
695 $location->add_sub_Location($_->location);
698 $location = Bio
::Location
::Simple
->new(-start
=> $self->start,
700 -strand
=> $self->strand,
701 -seq_id
=> $self->seq_id);
709 Usage : $whole_seq = $sf->entire_seq()
710 Function: gives the entire sequence that this seqfeature is attached to
712 Returns : a Bio::PrimarySeqI compliant object, or undef if there is no
721 $self->factory->segment($self->sourceseq);
724 =head2 merged_segments
726 Title : merged_segments
727 Usage : @segs = $feature->merged_segments([$method])
728 Function: get merged subfeatures
729 Returns : a list of Bio::DB::GFF::Feature objects
730 Args : a feature method (optional)
733 This method acts like sub_SeqFeature, except that it merges
734 overlapping segments of the same time into contiguous features. For
735 those features that contain heterogeneous subfeatures, you can
736 retrieve a subset of the subfeatures by providing a method name to
739 A side-effect of this method is that the features are returned in
740 sorted order by their start tposition.
746 sub merged_segments
{
749 $type ||= ''; # prevent uninitialized variable warnings
751 my $truename = overload
::StrVal
($self);
753 return @
{$self->{merged_segs
}{$type}} if exists $self->{merged_segs
}{$type};
754 my @segs = map { $_->[0] }
755 sort { $a->[1] <=> $b->[1] ||
756 $a->[2] cmp $b->[2] }
757 map { [$_, $_->start, $_->type] } $self->sub_SeqFeature($type);
759 # attempt to merge overlapping segments
762 my $previous = $merged[-1] if @merged;
763 my ($pscore,$score) = (eval{$previous->score}||0,eval{$s->score}||0);
764 if (defined($previous)
765 && $previous->stop+1 >= $s->start
767 && $previous->method eq $s->method
769 if ($self->absolute && $self->strand < 0) {
770 $previous->{start
} = $s->{start
};
772 $previous->{stop
} = $s->{stop
};
774 # fix up the target too
775 my $g = $previous->{group
};
776 if ( ref($g) && $g->isa('Bio::DB::GFF::Homol')) {
777 my $cg = $s->{group
};
778 $g->{stop
} = $cg->{stop
};
781 elsif (defined($previous)
782 && $previous->start == $s->start
783 && $previous->stop == $s->stop
784 && $previous->method eq $s->method
790 my $copy = $s->clone;
794 $self->{merged_segs
}{$type} = \
@merged;
801 Usage : @methods = $feature->sub_types
802 Function: get methods of all sub-seqfeatures
803 Returns : a list of method names
807 For those features that contain subfeatures, this method will return a
808 unique list of method names of those subfeatures, suitable for use
809 with sub_SeqFeature().
815 my $subfeat = $self->{subfeatures
} or return;
816 return keys %$subfeat;
822 Usage : @attributes = $feature->attributes($name)
823 Function: get the "attributes" on a particular feature
824 Returns : an array of string
828 Some GFF version 2 files use the groups column to store a series of
829 attribute/value pairs. In this interpretation of GFF, the first such
830 pair is treated as the primary group for the feature; subsequent pairs
831 are treated as attributes. Two attributes have special meaning:
832 "Note" is for backward compatibility and is used for unstructured text
833 remarks. "Alias" is considered as a synonym for the feature name.
835 @gene_names = $feature->attributes('Gene');
836 @aliases = $feature->attributes('Alias');
838 If no name is provided, then attributes() returns a flattened hash, of
839 attribute=E<gt>value pairs. This lets you do:
841 %attributes = $db->attributes;
847 my $factory = $self->factory;
848 defined(my $id = $self->id) or return;
849 $factory->attributes($id,@_)
856 Usage : @notes = $feature->notes
857 Function: get the "notes" on a particular feature
858 Returns : an array of string
862 Some GFF version 2 files use the groups column to store various notes
863 and remarks. Adaptors can elect to store the notes in the database,
864 or just ignore them. For those adaptors that store the notes, the
865 notes() method will return them as a list.
871 $self->attributes('Note');
877 Usage : @aliases = $feature->aliases
878 Function: get the "aliases" on a particular feature
879 Returns : an array of string
883 This method will return a list of attributes of type 'Alias'.
889 $self->attributes('Alias');
894 =head2 Autogenerated Methods
897 Usage : @subfeat = $feature->Method
898 Function: Return subfeatures using autogenerated methods
899 Returns : a list of Bio::DB::GFF::Feature objects
903 Any method that begins with an initial capital letter will be passed
904 to AUTOLOAD and treated as a call to sub_SeqFeature with the method
905 name used as the method argument. For instance, this call:
907 @exons = $feature->Exon;
909 is equivalent to this call:
911 @exons = $feature->sub_SeqFeature('exon');
915 =head2 SeqFeatureI methods
917 The following Bio::SeqFeatureI methods are implemented:
919 primary_tag(), source_tag(), all_tags(), has_tag(), each_tag_value() [renamed get_tag_values()].
923 *primary_tag
= \
&method
;
924 *source_tag
= \
&source
;
927 my %atts = $self->attributes;
928 my @tags = keys %atts;
930 # autogenerated methods
931 #if (my $subfeat = $self->{subfeatures}) {
932 # push @tags,keys %$subfeat;
937 *get_all_tags
= \
&all_tags
;
942 my %att = $self->attributes;
943 my %tags = map {$_=>1} ( $self->all_tags );
948 *each_tag_value
= \
&get_tag_values
;
953 return $self->$tag() if $CONSTANT_TAGS{$tag};
955 my $atts = $self->attributes;
956 return @
{$atts->{$tag}} if $atts && $atts->{$tag};
959 return $self->$tag(); # try autogenerated tag
963 my($pack,$func_name) = $AUTOLOAD=~/(.+)::([^:]+)$/;
967 # ignore DESTROY calls
968 return if $func_name eq 'DESTROY';
970 # fetch subfeatures if func_name has an initial cap
971 # return sort {$a->start <=> $b->start} $self->sub_SeqFeature($func_name) if $func_name =~ /^[A-Z]/;
972 return $self->sub_SeqFeature($func_name) if $func_name =~ /^[A-Z]/;
974 # error message of last resort
975 $self->throw(qq(Can
't locate object method "$func_name" via package "$pack"));
980 Title : adjust_bounds
981 Usage : $feature->adjust_bounds
982 Function: adjust the bounds of a feature
983 Returns : ($start,$stop,$strand)
987 This method adjusts the boundaries of the feature to enclose all its
988 subfeatures. It returns the new start, stop and strand of the
993 # adjust a feature so that its boundaries are synched with its subparts' boundaries.
994 # this works recursively, so subfeatures can contain other features
998 my $g = $self->{group
};
1002 if (my $subfeat = $self->{subfeatures
}) {
1003 for my $list (values %$subfeat) {
1004 for my $feat (@
$list) {
1005 # fix up our bounds to hold largest subfeature
1006 my($start,$stop,$strand) = $feat->adjust_bounds($shrink);
1008 if (defined($self->{fstrand
})) {
1009 $self->debug("Subfeature's strand ($strand) doesn't match parent strand ($self->{fstrand})\n") if $self->{fstrand
} ne $strand;
1011 $self->{fstrand
} = $strand;
1014 my ($low,$high) = $start < $stop ?
($start,$stop) : ($stop,$start);
1015 if ($shrink && !$first++) {
1016 # first subfeature resets start & stop:
1017 $self->{start
} = $self->{fstrand
} ne '-' ?
$low : $high;
1018 $self->{stop
} = $self->{fstrand
} ne '-' ?
$high : $low;
1020 if ($self->{fstrand
} ne '-') {
1021 $self->{start
} = $low
1022 if (!defined($self->{start
})) || $low < $self->{start
};
1023 $self->{stop
} = $high
1024 if (!defined($self->{stop
})) || $high > $self->{stop
};
1026 $self->{start
} = $high
1027 if (!defined($self->{start
})) || $high > $self->{start
};
1028 $self->{stop
} = $low
1029 if (!defined($self->{stop
})) || $low < $self->{stop
};
1033 # fix up endpoints of targets too (for homologies only)
1034 my $h = $feat->group;
1035 next unless $h && $h->isa('Bio::DB::GFF::Homol');
1036 next unless $g && $g->isa('Bio::DB::GFF::Homol');
1038 ($start,$stop) = ($h->{start
},$h->{stop
});
1039 if ($shrink && !$tfirst++) {
1040 $g->{start
} = $start;
1043 if ($start <= $stop) {
1044 $g->{start
} = $start if (!defined($g->{start
})) || $start < $g->{start
};
1045 $g->{stop
} = $stop if (!defined($g->{stop
})) || $stop > $g->{stop
};
1047 $g->{start
} = $start if (!defined($g->{start
})) || $start > $g->{start
};
1048 $g->{stop
} = $stop if (!defined($g->{stop
})) || $stop < $g->{stop
};
1055 ($self->{start
},$self->{stop
},$self->strand);
1058 =head2 sort_features
1060 Title : sort_features
1061 Usage : $feature->sort_features
1062 Function: sort features
1067 This method sorts subfeatures in ascending order by their start
1068 position. For reverse strand features, it sorts subfeatures in
1069 descending order. After this is called sub_SeqFeature will return the
1072 This method is called internally by merged_segments().
1079 return if $self->{sorted
}++;
1080 my $strand = $self->strand or return;
1081 my $subfeat = $self->{subfeatures
} or return;
1082 for my $type (keys %$subfeat) {
1083 $subfeat->{$type} = [map { $_->[0] }
1084 sort {$a->[1] <=> $b->[1] }
1085 map { [$_,$_->start] }
1086 @
{$subfeat->{$type}}] if $strand > 0;
1087 $subfeat->{$type} = [map { $_->[0] }
1088 sort {$b->[1] <=> $a->[1]}
1089 map { [$_,$_->start] }
1090 @
{$subfeat->{$type}}] if $strand < 0;
1097 Usage : $string = $feature->asString
1098 Function: return human-readabled representation of feature
1103 This method returns a human-readable representation of the feature and
1104 is called by the overloaded "" operator.
1110 my $type = $self->type;
1111 my $name = $self->group;
1112 return "$type($name)" if $name;
1114 # my $type = $self->method;
1115 # my $id = $self->group || 'unidentified';
1116 # return join '/',$id,$type,$self->SUPER::asString;
1121 return $self->group || $self->SUPER::name
;
1127 Usage : $string = $feature->gff_string
1128 Function: return GFF2 of GFF2.5 representation of feature
1137 my $version = $self->version;
1139 # gff3_string and gff_string are synonymous if the version is set to 3
1140 return $self->gff3_string(@_) if $version == 3;
1142 my ($start,$stop) = ($self->start,$self->stop);
1144 # the defined() tests prevent uninitialized variable warnings, when dealing with clone objects
1145 # whose endpoints may be undefined
1146 ($start,$stop) = ($stop,$start) if defined($start) && defined($stop) && $start > $stop;
1148 my ($class,$name) = ('','');
1149 my $strand = ('-','.','+')[$self->strand+1];
1153 if (my $t = $self->target) {
1154 push @group, $version == 2.5 ?
$self->flatten_target($t,2.5)
1155 : $self->flatten_target($t);
1157 elsif (my $g = $self->group) {
1158 $class = $g->class || '';
1159 $name = $g->name || '';
1160 ($name =~ /\S\s\S/)?
(push @group, "$class '$name'"):(push @group,"$class $name");
1163 # add exhaustive list of attributes
1164 my $att = $self->attributes;
1165 for ( keys %$att ) {
1166 for my $v ( @
{$att->{$_}} ) {
1167 $v = qq("$v") if $v=~ /\S\s+\S/;
1168 push @group, qq($_ $v);
1172 my $group_field = join ' ; ',@group;
1173 my $ref = $self->refseq;
1174 my $n = ref($ref) ?
$ref->name : $ref;
1175 my $phase = $self->phase;
1176 $phase = '.' unless defined $phase;
1179 $self->source,$self->method,
1180 (defined $start ?
$start : '.'),
1181 (defined $stop ?
$stop : '.'),
1182 (defined $self->score ?
$self->score : '.'),
1183 (defined $strand ?
$strand : '.'),
1191 Usage : $string = $feature->gff3_string([$recurse])
1192 Function: return GFF3 representation of feature
1194 Args : An optional flag, which if true, will cause the feature to recurse over
1202 my ($recurse,$parent) = @_;
1203 my ($start,$stop) = ($self->start,$self->stop);
1205 # the defined() tests prevent uninitialized variable warnings, when dealing with clone objects
1206 # whose endpoints may be undefined
1207 ($start,$stop) = ($stop,$start) if defined($start) && defined($stop) && $start > $stop;
1209 my $strand = ('-','.','+')[$self->strand+1];
1210 my $ref = $self->refseq;
1211 my $n = ref($ref) ?
$ref->name : $ref;
1212 my $phase = $self->phase;
1213 $phase = '.' unless defined $phase;
1215 my ($class,$name) = ('','');
1217 if (my $g = $self->group) {
1218 $class = $g->class || '';
1219 $name = $g->name || '';
1220 $name = "$class:$name" if defined $class;
1221 push @group,[ID
=> $name] if !defined($parent) || $name ne $parent;
1224 push @group,[Parent
=> $parent] if defined $parent && $parent ne '';
1226 if (my $t = $self->target) {
1227 $strand = '-' if $t->stop < $t->start;
1228 push @group, $self->flatten_target($t,3);
1231 my @attributes = $self->attributes;
1232 while (@attributes) {
1233 push @group,[shift(@attributes),shift(@attributes)]
1235 my $group_field = join ';',map {join '=',_escape
($_->[0]),_escape
($_->[1])} @group;
1236 my $string = join("\t",$n,$self->source,$self->method,$start||'.',$stop||'.',
1237 $self->score||'.',$strand||'.',$phase,$group_field);
1240 foreach ($self->sub_SeqFeature) {
1241 $string .= $_->gff3_string(1,$name);
1250 Usage : $feature->version()
1251 Function: get/set the GFF version to be returned by gff_string
1252 Returns : the GFF version (default is 2)
1253 Args : the GFF version (2, 2.5 of 3)
1259 my ($self, $version) = @_;
1260 $self->{version
} = $version if $version;
1261 return $self->{version
} || 2;
1266 my $toencode = shift;
1267 $toencode =~ s/([^a-zA-Z0-9_. :?^*\(\)\[\]@!-])/uc sprintf("%%%02x",ord($1))/eg;
1268 $toencode =~ tr/ /+/;
1275 Usage : $link = $feature->cmap_link
1276 Function: returns a URL link to the corresponding feature in cmap
1281 If integrated cmap/gbrowse installation, it returns a link to the map otherwise
1282 it returns a link to a feature search on the feature name. See the cmap
1283 documentation for more information.
1285 This function is intended primarily to be used in gbrowse conf files.
1288 link = sub {my $self = shift; return $self->cmap_viewer_link(data_source);}
1293 sub cmap_viewer_link
{
1294 # Use ONLY if CMap is installed
1296 my $data_source = shift;
1297 my $group_id = $self->group_id;
1298 my $factory = $self->factory; # aka adaptor
1302 if ($factory->can("create_cmap_viewer_link")){
1303 $link_str = $factory->create_cmap_viewer_link(
1304 data_source
=> $data_source,
1305 group_id
=> $group_id,
1308 my $name = $self->name();
1309 $link_str = '/cgi-bin/cmap/feature_search?features='
1311 . '&search_field=feature_name&order_by=&data_source='
1320 =head1 A Note About Similarities
1322 The current default aggregator for GFF "similarity" features creates a
1323 composite Bio::DB::GFF::Feature object of type "gapped_alignment".
1324 The target() method for the feature as a whole will return a
1325 RelSegment object that is as long as the extremes of the similarity
1326 hit target, but will not necessarily be the same length as the query
1327 sequence. The length of each "similarity" subfeature will be exactly
1328 the same length as its target(). These subfeatures are essentially
1329 the HSPs of the match.
1331 The following illustrates this:
1333 @similarities = $segment->feature('similarity:BLASTN');
1334 $sim = $similarities[0];
1336 print $sim->type; # yields "gapped_similarity:BLASTN"
1338 $query_length = $sim->length;
1339 $target_length = $sim->target->length; # $query_length != $target_length
1341 @matches = $sim->Similarity; # use autogenerated method
1342 $query1_length = $matches[0]->length;
1343 $target1_length = $matches[0]->target->length; # $query1_length == $target1_length
1345 If you merge segments by calling merged_segments(), then the length of
1346 the query sequence segments will no longer necessarily equal the
1347 length of the targets, because the alignment information will have
1348 been lost. Nevertheless, the targets are adjusted so that the first
1349 and last base pairs of the query match the first and last base pairs
1358 This module is still under development.
1362 L<bioperl>, L<Bio::DB::GFF>, L<Bio::DB::RelSegment>
1366 Lincoln Stein E<lt>lstein@cshl.orgE<gt>.
1368 Copyright (c) 2001 Cold Spring Harbor Laboratory.
1370 This library is free software; you can redistribute it and/or modify
1371 it under the same terms as Perl itself.