maint: remove Travis stuff which has been replaced with Github actions (#325)
[bioperl-live.git] / lib / Bio / SeqIO / swiss.pm
blob6728443dfc6db1aeced4d386a833326c87e388a6
2 # BioPerl module for Bio::SeqIO::swiss
4 # Copyright Elia Stupka
6 # You may distribute this module under the same terms as perl itself
8 # POD documentation - main docs before the code
10 =head1 NAME
12 Bio::SeqIO::swiss - Swissprot sequence input/output stream
14 =head1 SYNOPSIS
16 It is probably best not to use this object directly, but
17 rather go through the SeqIO handler system:
19 use Bio::SeqIO;
21 $stream = Bio::SeqIO->new(-file => $filename,
22 -format => 'swiss');
24 while ( my $seq = $stream->next_seq() ) {
25 # do something with $seq
28 =head1 DESCRIPTION
30 This object can transform Bio::Seq objects to and from Swiss-Pprot flat
31 file databases.
33 There is a lot of flexibility here about how to dump things which needs
34 to be documented.
36 =head2 GN (Gene name) line management details
38 A Uniprot/Swiss-Prot entry holds information on one protein
39 sequence. If that sequence is identical across genes and species, they
40 are all merged into one entry. This creates complex needs for several
41 annotation fields in swiss-prot format.
43 The latest syntax for GN line is described in the user manual:
45 http://www.expasy.ch/sprot/userman.html#GN_line
47 Each of the possibly multiple genes in an entry can have Name,
48 Synonyms (only if there is a name), OrderedLocusNames (names from
49 genomic sequences) and ORFNames (temporary or cosmid names). "Name"
50 here really means "symbol". This complexity is now dealt with the
51 following way:
53 A new Bio::AnnotationI class was created in order to store the
54 data in tag-value pairs. This class (Bio::Annotation::TagTree)
55 is stored in the Bio::Annotation::Collection object and is
56 accessed like all other annotations. The tag name is 'gene_name'.
58 There is a single Bio::Annotation::TagTree per sequence record, which
59 corresponds to the original class that stored this data
60 (Bio::Annotation::StructuredValue). Depending on how we progress
61 this may change to represent each group of gene names.
63 For now, to access the gene name tree annotation, one uses the below method:
65 my ($gene) = $seq->annotation->get_Annotations('gene_name');
67 If you are only interested in displaying the values, value() returns a
68 string with similar formatting.
70 There are several ways to get directly at the information you want if you
71 know the element (tag) for the data. For gene names all data is stored with
72 the element-tag pairs:
74 "element1=tag1, tag2, tag3; element2=tag4, tag5;"
76 This normally means the element will be 'Name', 'Synonyms', etc. and the
77 gene names the values. Using findval(), you can do the following:
79 # grab a flattened list of all gene names
80 my @names = $ann->findval('Name');
82 # or iterated through the nodes and grab the name for each group
83 for my $node ($ann->findnode('gene_name')) {
84 my @names = $node->findval('Name');
87 The current method for parsing gene name data (and reconstructing gene name
88 output) is very generic. This is somewhat preemptive if, for instance, UniProt
89 decides to update and add another element name to the current ones using the
90 same formatting layout. Under those circumstances, one can iterate through the
91 tag tree in a safe way and retrieve all node data like so.
93 # retrieve the gene name nodes (groups like names, synonyms, etc).
94 for my $ann ($seq->annotation->get_Annotations('gene_name')) {
96 # each gene name group
97 for my $node ($ann->findnode('gene_name')) {
98 print "Gene name:\n";
100 # each gene name node (tag => value pair)
101 for my $n ($node->children) {
102 print "\t".$n->element.": ".$n->children."\n";
107 For more uses see Bio::Annotation::TagTree.
109 Since Uniprot/Swiss-Prot format have been around for quite some time, the
110 parser is also able to read in the older GN line syntax where genes
111 are separated by AND and various symbols by OR. The first symbol is
112 taken to be the 'Name' and the remaining ones are stored as 'Synonyms'.
114 Also, for UniProt output we support using other Bio::AnnotationI, but in this
115 case we only use the stringified version of the annotation. This is to allow for
116 backwards compatibility with code that previously used
117 Bio::Annotation::SimpleValue or other Bio::AnnotationI classes.
119 =head2 Optional functions
121 =over 3
123 =item _show_dna()
125 (output only) shows the dna or not
127 =item _post_sort()
129 (output only) provides a sorting func which is applied to the FTHelpers
130 before printing
132 =item _id_generation_func()
134 This is function which is called as
136 print "ID ", $func($seq), "\n";
138 To generate the ID line. If it is not there, it generates a sensible ID
139 line using a number of tools.
141 If you want to output annotations in Swissprot format they need to be
142 stored in a Bio::Annotation::Collection object which is accessible
143 through the Bio::SeqI interface method L<annotation()|annotation>.
145 The following are the names of the keys which are polled from a
146 L<Bio::Annotation::Collection> object.
148 reference - Should contain Bio::Annotation::Reference objects
149 comment - Should contain Bio::Annotation::Comment objects
150 dblink - Should contain Bio::Annotation::DBLink objects
151 gene_name - Should contain Bio::Annotation::SimpleValue object
153 =back
155 =head1 FEEDBACK
157 =head2 Mailing Lists
159 User feedback is an integral part of the evolution of this
160 and other Bioperl modules. Send your comments and suggestions,
161 preferably to one of the Bioperl mailing lists.
162 Your participation is much appreciated.
164 bioperl-l@bioperl.org - General discussion
165 http://bioperl.org/wiki/Mailing_lists - About the mailing lists
167 =head2 Support
169 Please direct usage questions or support issues to the mailing list:
171 I<bioperl-l@bioperl.org>
173 rather than to the module maintainer directly. Many experienced and
174 reponsive experts will be able look at the problem and quickly
175 address it. Please include a thorough description of the problem
176 with code and data examples if at all possible.
178 =head2 Reporting Bugs
180 Report bugs to the Bioperl bug tracking system to help us keep track
181 the bugs and their resolution.
182 Bug reports can be submitted via the web:
184 https://github.com/bioperl/bioperl-live/issues
186 =head1 AUTHOR - Elia Stupka
188 Email elia@tll.org.sg
190 =head1 APPENDIX
192 The rest of the documentation details each of the object methods.
193 Internal methods are usually preceded with a _
195 =cut
197 # Let the code begin...
199 package Bio::SeqIO::swiss;
201 use vars qw(@Unknown_names @Unknown_genus);
202 use strict;
203 use Bio::SeqIO::FTHelper;
204 use Bio::SeqFeature::Generic;
205 use Bio::Species;
206 use Bio::Tools::SeqStats;
207 use Bio::Seq::SeqFactory;
208 use Bio::Annotation::Collection;
209 use Bio::Annotation::Comment;
210 use Bio::Annotation::Reference;
211 use Bio::Annotation::DBLink;
212 use Bio::Annotation::SimpleValue;
213 use Bio::Annotation::TagTree;
215 use base qw(Bio::SeqIO);
217 our $LINE_LENGTH = 76;
219 # this is for doing species name parsing
220 @Unknown_names=('other', 'unidentified',
221 'unknown organism', 'not specified',
222 'not shown', 'Unspecified', 'Unknown',
223 'None', 'unclassified', 'unidentified organism',
224 'not supplied'
226 # dictionary of synonyms for taxid 32644
227 # all above can be part of valid species name
228 @Unknown_genus = qw(unknown unclassified uncultured unidentified);
230 # if there are any other gene name tags, they are added to the end
231 our @GENE_NAME_ORDER = qw(Name Synonyms OrderedLocusNames ORFNames);
233 sub _initialize {
234 my($self,@args) = @_;
235 $self->SUPER::_initialize(@args);
236 # hash for functions for decoding keys.
237 $self->{'_func_ftunit_hash'} = {};
238 # sets this to one by default. People can change it
239 $self->_show_dna(1);
240 if ( ! defined $self->sequence_factory ) {
241 $self->sequence_factory(Bio::Seq::SeqFactory->new
242 (-verbose => $self->verbose(),
243 -type => 'Bio::Seq::RichSeq'));
247 =head2 next_seq
249 Title : next_seq
250 Usage : $seq = $stream->next_seq()
251 Function: returns the next sequence in the stream
252 Returns : Bio::Seq object
253 Args :
255 =cut
257 sub next_seq {
258 my ($self,@args) = @_;
259 my ($pseq,$c,$line,$name,$desc,$acc,$seqc,$mol,$div, $sptr,$seq_div,
260 $date,$comment,@date_arr);
261 my $genename = "";
262 my ($annotation, %params, @features) = ( Bio::Annotation::Collection->new());
264 local $_;
266 1 while defined($_ = $self->_readline) && /^\s+$/;
267 return unless defined $_ && /^ID\s/;
269 # fixed to allow _DIVISION to be optional for bug #946
270 # see bug report for more information
272 # 9/6/06 Note: Swiss/TrEMBL sequences have no division acc. to UniProt
273 # release notes; this is fixed to simplify the regex parsing
274 # STANDARD (SwissProt) and PRELIMINARY (TrEMBL) added to namespace()
275 unless( m{^
276 ID \s+ #
277 (\S+) \s+ # $1 entryname
278 ([^\s;]+); \s+ # $2 DataClass
279 (?:PRT;)? \s+ # Molecule Type (optional)
280 [0-9]+[ ]AA \. # Sequencelength (capture?)
282 }ox ) {
283 # I couldn't find any new current UniProt sequences
284 # that matched this format:
285 # || m/^ID\s+(\S+)\s+(_([^\s_]+))? /ox ) {
286 $self->throw("swissprot stream with no ID. Not swissprot in my book");
288 ($name, $seq_div) = ($1, $2);
289 $params{'-namespace'} =
290 ($seq_div eq 'Reviewed' || $seq_div eq 'STANDARD') ? 'Swiss-Prot' :
291 ($seq_div eq 'Unreviewed' || $seq_div eq 'PRELIMINARY') ? 'TrEMBL' :
292 $seq_div;
293 # we shouldn't be setting the division, but for now...
294 my ($junk, $division) = split q(_), $name;
295 $params{'-division'} = $division;
296 $params{'-alphabet'} = 'protein';
297 # this is important to have the id for display in e.g. FTHelper, otherwise
298 # you won't know which entry caused an error
299 $params{'-display_id'} = $name;
301 BEFORE_FEATURE_TABLE :
302 while ( defined($_ = $self->_readline) ) {
303 # Exit at start of Feature table and at the sequence at the
304 # latest HL 05/11/2000
305 last if( /^(FT|SQ)/ );
307 # Description line(s)
308 if (/^DE\s+(\S.*\S)/) {
309 $desc .= $desc ? " $1" : $1;
311 #Gene name
312 elsif (/^GN\s+(.*)/) {
313 $genename .= " " if $genename;
314 $genename .= $1;
316 #accession number(s)
317 elsif ( /^AC\s+(.+)/) {
318 my @accs = split(/[; ]+/, $1); # allow space in addition
319 $params{'-accession_number'} = shift @accs
320 unless defined $params{'-accession_number'};
321 push @{$params{'-secondary_accessions'}}, @accs;
323 #date and sequence version
324 elsif ( /^DT\s+(.*)/ ) {
325 my $line = $1;
326 my ($date, $version) = split(' ', $line, 2);
327 $date =~ tr/,//d; # remove comma if new version
328 if ($version =~ /\(Rel\. (\d+), Last sequence update\)/ || # old
329 /sequence version (\d+)/) { #new
330 my $update = Bio::Annotation::SimpleValue->new
331 (-tagname => 'seq_update',
332 -value => $1
334 $annotation->add_Annotation($update);
335 } elsif ($version =~ /\(Rel\. (\d+), Last annotation update\)/ || #old
336 /entry version (\d+)/) { #new
337 $params{'-version'} = $1;
339 push @{$params{'-dates'}}, $date;
341 # Evidence level
342 elsif ( /^PE\s+(.*)/ ) {
343 my $line = $1;
344 $line =~ s/;\s*//; # trim trailing semicolon and any spaces at the end of the line
345 my $evidence = Bio::Annotation::SimpleValue->new
346 (-tagname => 'evidence',
347 -value => $line
349 $annotation->add_Annotation($evidence);
351 # Organism name and phylogenetic information
352 elsif (/^O[SCG]/) {
353 my $species = $self->_read_swissprot_Species($_);
354 $params{'-species'}= $species;
355 # now we are one line ahead -- so continue without reading the next
356 # line HL 05/11/2000
358 # References
359 elsif (/^R/) {
360 my $refs = $self->_read_swissprot_References($_);
361 foreach my $r (@$refs) {
362 $annotation->add_Annotation('reference',$r);
365 # Comments
366 elsif (/^CC\s{3}(.*)/) {
367 $comment .= $1;
368 $comment .= "\n";
369 while (defined ($_ = $self->_readline) && /^CC\s{3}(.*)/ ) {
370 $comment .= $1 . "\n";
373 # note: don't try to process comments here -- they may contain
374 # structure. LP 07/30/2000
376 my $commobj = Bio::Annotation::Comment->new(-tagname => 'comment',
377 -text => $comment);
378 $annotation->add_Annotation('comment',$commobj);
379 $comment = "";
380 $self->_pushback($_);
382 #DBLinks
383 # old regexp
384 # /^DR\s+(\S+)\;\s+(\S+)\;\s+(\S+)[\;\.](.*)$/) {
385 # new regexp from Andreas Kahari bug #1584
386 elsif (/^DR\s+(\S+)\;\s+(\S+)\;\s+([^;]+)[\;\.](.*)$/) {
387 my ($database,$primaryid,$optional,$comment) = ($1,$2,$3,$4);
389 # drop leading and training spaces and trailing .
390 $comment =~ s/\.\s*$//;
391 $comment =~ s/^\s+//;
393 my $dblinkobj = Bio::Annotation::DBLink->new
394 (-database => $database,
395 -primary_id => $primaryid,
396 -optional_id => $optional,
397 -comment => $comment,
398 -tagname => 'dblink',
401 $annotation->add_Annotation('dblink',$dblinkobj);
403 #keywords
404 elsif ( /^KW\s+(.*)$/ ) {
405 my @kw = split(/\s*\;\s*/,$1);
406 defined $kw[-1] && $kw[-1] =~ s/\.$//;
407 push @{$params{'-keywords'}}, @kw;
410 # process and parse the gene name line if there was one (note: we
411 # can't do this above b/c GN may be multi-line and we can't
412 # unequivocally determine whether we've seen the last GN line in
413 # the new format)
414 if ($genename) {
415 my @stags;
416 if ($genename =~ /\w=\w/) {
417 # new format (e.g., Name=RCHY1; Synonyms=ZNF363, CHIMP)
418 for my $n (split(m{\s+and\s+},$genename)) {
419 my @genenames;
420 for my $section (split(m{\s*;\s*},$n)) {
421 my ($tag, $rest) = split("=",$section);
422 $rest ||= '';
423 for my $val (split(m{\s*,\s*},$rest)) {
424 push @genenames, [$tag => $val];
427 push @stags, ['gene_name' => \@genenames];
429 } else {
430 # old format
431 for my $section (split(/ AND /, $genename)) {
432 my @genenames;
433 $section =~ s/[\(\)\.]//g;
434 my @names = split(m{\s+OR\s+}, $section);
435 push @genenames, ['Name' => shift @names];
436 push @genenames, map {['Synonyms' => $_]} @names;
437 push @stags, ['gene_name' => \@genenames]
439 } #use Data::Dumper; print Dumper $gn, $genename;# exit;
440 my $gn = Bio::Annotation::TagTree->new(-tagname => 'gene_name',
441 -value => ['gene_names' => \@stags]);
442 $annotation->add_Annotation('gene_name', $gn);
445 FEATURE_TABLE :
446 # if there is no feature table, or if we've got beyond, exit loop or don't
447 # even enter HL 05/11/2000
448 while (defined $_ && /^FT/ ) {
449 my $ftunit = $self->_read_FTHelper_swissprot($_);
451 # process ftunit
452 # when parsing of the line fails we get undef returned
453 if ($ftunit) {
454 push(@features,
455 $ftunit->_generic_seqfeature($self->location_factory(),
456 $params{'-seqid'}, "SwissProt"));
457 } else {
458 $self->warn("failed to parse feature table line for seq " .
459 $params{'-display_id'}. "\n$_");
461 $_ = $self->_readline;
463 while ( defined($_) && ! /^SQ/ ) {
464 $_ = $self->_readline;
466 $seqc = "";
467 while ( defined ($_ = $self->_readline) ) {
468 last if m{^//};
469 s/[^A-Za-z]//g;
470 $seqc .= uc($_);
473 my $seq= $self->sequence_factory->create
474 (-verbose => $self->verbose,
475 %params,
476 -seq => $seqc,
477 -desc => $desc,
478 -features => \@features,
479 -annotation => $annotation,
482 # The annotation doesn't get added by the constructor
483 $seq->annotation($annotation);
485 return $seq;
488 =head2 write_seq
490 Title : write_seq
491 Usage : $stream->write_seq($seq)
492 Function: writes the $seq object (must be seq) to the stream
493 Returns : 1 for success and 0 for error
494 Args : array of 1 to n Bio::SeqI objects
497 =cut
499 sub write_seq {
500 my ($self,@seqs) = @_;
501 foreach my $seq ( @seqs ) {
502 $self->throw("Attempting to write with no seq!") unless defined $seq;
504 if ( ! ref $seq || ! $seq->isa('Bio::SeqI') ) {
505 $self->warn(" $seq is not a SeqI compliant module. Attempting to dump, but may fail!");
508 my $i;
509 my $str = $seq->seq;
511 my $div;
512 my $ns = ($seq->can('namespace')) && $seq->namespace();
513 my $len = $seq->length();
515 if ( !$seq->can('division') || ! defined ($div = $seq->division()) ) {
516 $div = 'UNK';
519 # namespace dictates database, takes precedent over division. Sorry!
520 if (defined($ns) && $ns ne '') {
521 $div = ($ns eq 'Swiss-Prot') ? 'Reviewed' :
522 ($ns eq 'TrEMBL') ? 'Unreviewed' :
523 $ns;
524 } else {
525 $ns = 'Swiss-Prot';
526 # division not reset; acts as fallback
529 $self->warn("No whitespace allowed in SWISS-PROT display id [". $seq->display_id. "]")
530 if $seq->display_id =~ /\s/;
532 my $temp_line;
533 if ( $self->_id_generation_func ) {
534 $temp_line = &{$self->_id_generation_func}($seq);
535 } else {
536 #$temp_line = sprintf ("%10s STANDARD; %3s; %d AA.",
537 # $seq->primary_id()."_".$div,$mol,$len);
538 # Reconstructing the ID relies heavily upon the input source having
539 # been in a format that is parsed as this routine expects it -- that is,
540 # by this module itself. This is bad, I think, and immediately breaks
541 # if e.g. the Bio::DB::GenPept module is used as input.
542 # Hence, switch to display_id(); _every_ sequence is supposed to have
543 # this. HL 2000/09/03
544 # Changed to reflect ID line changes in UniProt
545 # Oct 2006 - removal of molecule type - see bug 2134
546 $temp_line = sprintf ("%-24s%-12s%9d AA.",
547 $seq->display_id(), $div.';', $len);
550 $self->_print( "ID $temp_line\n");
552 # if there, write the accession line
553 local($^W) = 0; # suppressing warnings about uninitialized fields
555 if ( $self->_ac_generation_func ) {
556 $temp_line = &{$self->_ac_generation_func}($seq);
557 $self->_print( "AC $temp_line\n");
559 elsif ($seq->can('accession_number') ) {
560 my $ac_line = $seq->accession_number;
561 if ($seq->can('get_secondary_accessions') ) {
562 foreach my $sacc ($seq->get_secondary_accessions) {
563 $ac_line .= "; ". $sacc;;
565 $ac_line .= ";";
568 $self->_write_line_swissprot_regex("AC ","AC ",$ac_line,
569 "\\s\+\|\$",$LINE_LENGTH);
571 # otherwise - cannot print <sigh>
574 # Date lines and sequence versions (changed 6/15/2006)
575 # This is rebuilt from scratch using the current SwissProt/UniProt format
576 if ( $seq->can('get_dates') ) {
577 my @dates = $seq->get_dates();
578 my $ct = 1;
579 my $seq_version = $seq->version;
580 my ($update_version) = $seq->annotation->get_Annotations("seq_update");
581 foreach my $dt (@dates) {
582 $self->_write_line_swissprot_regex("DT ","DT ",
583 $dt.', integrated into UniProtKB/'.$ns.'.',
584 "\\s\+\|\$",$LINE_LENGTH) if $ct == 1;
585 $self->_write_line_swissprot_regex("DT ","DT ",
586 $dt.", sequence version ".$update_version->display_text.'.',
587 "\\s\+\|\$",$LINE_LENGTH) if $ct == 2;
588 $self->_write_line_swissprot_regex("DT ","DT ",
589 $dt.", entry version $seq_version.",
590 "\\s\+\|\$",$LINE_LENGTH) if $ct == 3;
591 $ct++;
595 #Definition lines
596 $self->_write_line_swissprot_regex("DE ","DE ",$seq->desc(),"\\s\+\|\$",$LINE_LENGTH);
598 #Gene name; print out new format
599 foreach my $gene ( my @genes = $seq->annotation->get_Annotations('gene_name') ) {
600 # gene is a Bio::Annotation::TagTree;
601 if ($gene->isa('Bio::Annotation::TagTree')) {
602 my @genelines;
603 for my $node ($gene->findnode('gene_name')) {
604 # check for Name and Synonym first, then the rest get tagged on
605 my $geneline = "GN ";
606 my %genedata = $node->hash;
607 for my $tag (@GENE_NAME_ORDER) {
608 if (exists $genedata{$tag}) {
609 $geneline .= (ref $genedata{$tag} eq 'ARRAY') ?
610 "$tag=".join(', ',@{$genedata{$tag}})."; " :
611 "$tag=$genedata{$tag}; ";
612 delete $genedata{$tag};
615 # add rest
616 for my $tag (sort keys %genedata) {
617 $geneline .= (ref $genedata{$tag} eq 'ARRAY') ?
618 "$tag=".join(', ',@{$genedata{$tag}})."; " :
619 "$tag=$genedata{$tag}; ";
620 delete $genedata{$tag};
622 push @genelines, "$geneline\n";
624 $self->_print(join("GN and\n",@genelines));
625 } else { # fall back to getting stringified output
626 $self->_write_line_swissprot_regex("GN ","GN ",
627 $gene->display_text,
628 "\\s\+\|\$",
629 $LINE_LENGTH);
633 # Organism lines
634 if ($seq->can('species') && (my $spec = $seq->species)) {
635 my @class = $spec->classification();
636 shift(@class);
637 my $species = $spec->species;
638 my $genus = $spec->genus;
639 my $OS = $spec->scientific_name;
640 if ($class[-1] =~ /viruses/i) {
641 $OS = $species;
642 $OS .= " ". $spec->sub_species if $spec->sub_species;
644 foreach (($spec->variant, $spec->common_name)) {
645 $OS .= " ($_)" if $_;
647 $self->_print( "OS $OS.\n");
648 my $OC = join('; ', reverse(@class)) .'.';
649 $self->_write_line_swissprot_regex("OC ","OC ",$OC,"\; \|\$",$LINE_LENGTH);
650 if ($spec->organelle) {
651 $self->_write_line_swissprot_regex("OG ","OG ",$spec->organelle,"\; \|\$",$LINE_LENGTH);
653 if ($spec->ncbi_taxid) {
654 $self->_print("OX NCBI_TaxID=".$spec->ncbi_taxid.";\n");
658 # Reference lines
659 my $t = 1;
660 foreach my $ref ( $seq->annotation->get_Annotations('reference') ) {
661 $self->_print( "RN [$t]\n");
662 # changed by lorenz 08/03/00
663 # j.gilbert and h.lapp agreed that the rp line in swissprot seems
664 # more like a comment than a parseable value, so print it as is
665 if ($ref->rp) {
666 $self->_write_line_swissprot_regex("RP ","RP ",$ref->rp,
667 "\\s\+\|\$",$LINE_LENGTH);
669 if ($ref->comment) {
670 $self->_write_line_swissprot_regex("RC ","RC ",$ref->comment,
671 "\\s\+\|\$",$LINE_LENGTH);
673 if ($ref->medline or $ref->pubmed or $ref->doi) {
674 # new RX format in swissprot LP 09/17/00
675 # RX line can now have a DOI, Heikki 13 Feb 2008
677 my $line;
678 $line .= "MEDLINE=". $ref->medline. '; ' if $ref->medline;
679 $line .= "PubMed=". $ref->pubmed. '; ' if $ref->pubmed;
680 $line .= "DOI=". $ref->doi. '; ' if $ref->doi;
681 chop $line;
683 $self->_write_line_swissprot_regex("RX ","RX ",
684 $line,
685 "\\s\+\|\$",$LINE_LENGTH);
688 my $author = $ref->authors .';' if($ref->authors);
689 my $title = $ref->title .';' if( $ref->title);
690 my $rg = $ref->rg . ';' if $ref->rg;
691 $author =~ s/([\w\.]) (\w)/$1#$2/g; # add word wrap protection char '#'
693 $self->_write_line_swissprot_regex("RG ","RG ",$rg,"\\s\+\|\$",$LINE_LENGTH) if $rg;
694 $self->_write_line_swissprot_regex("RA ","RA ",$author,"\\s\+\|\$",$LINE_LENGTH) if $author;
695 $self->_write_line_swissprot_regex("RT ","RT ",$title,'[\s\-]+|$',$LINE_LENGTH) if $title;
696 $self->_write_line_swissprot_regex("RL ","RL ",$ref->location,"\\s\+\|\$",$LINE_LENGTH);
697 $t++;
700 # Comment lines
702 foreach my $comment ( $seq->annotation->get_Annotations('comment') ) {
703 foreach my $cline (split ("\n", $comment->text)) {
704 while (length $cline > 74) {
705 $self->_print("CC ",(substr $cline,0,74),"\n");
706 $cline = substr $cline,74;
708 $self->_print("CC ",$cline,"\n");
712 # Database xref lines
714 foreach my $dblink ( $seq->annotation->get_Annotations('dblink') ) {
715 my ($primary_id) = $dblink->primary_id;
717 if (defined($dblink->comment) && ($dblink->comment) ) {
718 $self->_print("DR ",$dblink->database,"; ",$primary_id,"; ",
719 $dblink->optional_id,"; ",$dblink->comment,".\n");
720 } elsif ($dblink->optional_id) {
721 $self->_print("DR ",$dblink->database,"; ",
722 $primary_id,"; ",
723 $dblink->optional_id,".\n");
724 } else {
725 $self->_print("DR ",$dblink->database,
726 "; ",$primary_id,"; ","-.\n");
730 # Evidence lines
732 foreach my $evidence ( $seq->annotation->get_Annotations('evidence') ) {
733 $self->_print("PE ",$evidence->value,";\n");
736 # if there, write the kw line
738 my $kw;
739 if ( my $func = $self->_kw_generation_func ) {
740 $kw = &{$func}($seq);
741 } elsif ( $seq->can('keywords') ) {
742 $kw = $seq->keywords;
743 if ( ref($kw) =~ /ARRAY/i ) {
744 $kw = join("; ", @$kw);
746 $kw .= '.' if $kw and $kw !~ /\.$/ ;
748 $kw =~ s/([\w\.]) (\w)/$1#$2/g; # add word wrap protection char '#'
749 $self->_write_line_swissprot_regex("KW ","KW ",
750 $kw, "\\s\+\|\$",$LINE_LENGTH)
751 if $kw;
754 #Check if there is seqfeatures before printing the FT line
755 my @feats = $seq->can('top_SeqFeatures') ? $seq->top_SeqFeatures : ();
756 if ($feats[0]) {
757 if ( defined $self->_post_sort ) {
759 # we need to read things into an array. Process. Sort them. Print 'em
761 my $post_sort_func = $self->_post_sort();
762 my @fth;
764 foreach my $sf ( @feats ) {
765 push(@fth,Bio::SeqIO::FTHelper::from_SeqFeature($sf,$seq));
767 @fth = sort { &$post_sort_func($a,$b) } @fth;
769 foreach my $fth ( @fth ) {
770 $self->_print_swissprot_FTHelper($fth);
772 } else {
773 # not post sorted. And so we can print as we get them.
774 # lower memory load...
776 foreach my $sf ( @feats ) {
777 my @fth = Bio::SeqIO::FTHelper::from_SeqFeature($sf,$seq);
778 foreach my $fth ( @fth ) {
779 if ( ! $fth->isa('Bio::SeqIO::FTHelper') ) {
780 $sf->throw("Cannot process FTHelper... $fth");
782 $self->_print_swissprot_FTHelper($fth);
787 if ( $self->_show_dna() == 0 ) {
788 return;
791 # finished printing features.
793 # molecular weight
794 my $mw = ${Bio::Tools::SeqStats->get_mol_wt($seq->primary_seq)}[0];
795 # checksum
796 # was crc32 checksum, changed it to crc64
797 my $crc64 = $self->_crc64(\$str);
798 $self->_print( sprintf("SQ SEQUENCE %4d AA; %d MW; %16s CRC64;\n",
799 $len,$mw,$crc64));
800 $self->_print( " ");
801 my $linepos;
802 for ($i = 0; $i < length($str); $i += 10) {
803 $self->_print( " ", substr($str,$i,10));
804 $linepos += 11;
805 if ( ($i+10)%60 == 0 && (($i+10) < length($str))) {
806 $self->_print( "\n ");
809 $self->_print( "\n//\n");
811 $self->flush if $self->_flush_on_write && defined $self->_fh;
812 return 1;
816 # Thanks to James Gilbert for the following two. LP 08/01/2000
818 =head2 _generateCRCTable
820 Title : _generateCRCTable
821 Usage :
822 Function:
823 Example :
824 Returns :
825 Args :
828 =cut
830 sub _generateCRCTable {
831 # 10001000001010010010001110000100
832 # 32
833 my $poly = 0xEDB88320;
834 my ($self) = shift;
836 $self->{'_crcTable'} = [];
837 foreach my $i (0..255) {
838 my $crc = $i;
839 for (my $j=8; $j > 0; $j--) {
840 if ($crc & 1) {
841 $crc = ($crc >> 1) ^ $poly;
842 } else {
843 $crc >>= 1;
846 ${$self->{'_crcTable'}}[$i] = $crc;
851 =head2 _crc32
853 Title : _crc32
854 Usage :
855 Function:
856 Example :
857 Returns :
858 Args :
861 =cut
863 sub _crc32 {
864 my( $self, $str ) = @_;
866 $self->throw("Argument to crc32() must be ref to scalar")
867 unless ref($str) eq 'SCALAR';
869 $self->_generateCRCTable() unless exists $self->{'_crcTable'};
871 my $len = length($$str);
873 my $crc = 0xFFFFFFFF;
874 for (my $i = 0; $i < $len; $i++) {
875 # Get upper case value of each letter
876 my $int = ord uc substr $$str, $i, 1;
877 $crc = (($crc >> 8) & 0x00FFFFFF) ^
878 ${$self->{'_crcTable'}}[ ($crc ^ $int) & 0xFF ];
880 return $crc;
883 =head2 _crc64
885 Title : _crc64
886 Usage :
887 Function:
888 Example :
889 Returns :
890 Args :
893 =cut
895 sub _crc64{
896 my ($self, $sequence) = @_;
897 my $POLY64REVh = 0xd8000000;
898 my @CRCTableh = 256;
899 my @CRCTablel = 256;
900 my $initialized;
902 my $seq = $$sequence;
904 my $crcl = 0;
905 my $crch = 0;
906 if (!$initialized) {
907 $initialized = 1;
908 for (my $i=0; $i<256; $i++) {
909 my $partl = $i;
910 my $parth = 0;
911 for (my $j=0; $j<8; $j++) {
912 my $rflag = $partl & 1;
913 $partl >>= 1;
914 $partl |= (1 << 31) if $parth & 1;
915 $parth >>= 1;
916 $parth ^= $POLY64REVh if $rflag;
918 $CRCTableh[$i] = $parth;
919 $CRCTablel[$i] = $partl;
923 foreach (split '', $seq) {
924 my $shr = ($crch & 0xFF) << 24;
925 my $temp1h = $crch >> 8;
926 my $temp1l = ($crcl >> 8) | $shr;
927 my $tableindex = ($crcl ^ (unpack "C", $_)) & 0xFF;
928 $crch = $temp1h ^ $CRCTableh[$tableindex];
929 $crcl = $temp1l ^ $CRCTablel[$tableindex];
931 my $crc64 = sprintf("%08X%08X", $crch, $crcl);
932 return $crc64;
935 =head2 _print_swissprot_FTHelper
937 Title : _print_swissprot_FTHelper
938 Usage :
939 Function:
940 Example :
941 Returns :
942 Args :
945 =cut
947 sub _print_swissprot_FTHelper {
948 my ($self,$fth,$always_quote) = @_;
949 $always_quote ||= 0;
950 my ($start,$end) = ('?', '?');
952 if ( ! ref $fth || ! $fth->isa('Bio::SeqIO::FTHelper') ) {
953 $fth->warn("$fth is not a FTHelper class. ".
954 "Attempting to print, but there could be tears!");
956 my $desc = "";
958 for my $tag ( qw(description gene note product) ) {
959 if ( exists $fth->field->{$tag} ) {
960 $desc = @{$fth->field->{$tag}}[0].".";
961 last;
964 $desc =~ s/\.$//;
966 my $ftid = "";
967 if ( exists $fth->field->{'FTId'} ) {
968 $ftid = @{$fth->field->{'FTId'}}[0]. '.';
971 my $key =substr($fth->key,0,8);
972 my $loc = $fth->loc;
973 if ( $loc =~ /(\?|\d+|\>\d+|<\d+)?\.\.(\?|\d+|<\d+|>\d+)?/ ) {
974 $start = $1 if defined $1;
975 $end = $2 if defined $2;
977 # to_FTString only returns one value when start == end, #JB955
978 # so if no match is found, assume it is both start and end #JB955
979 } elsif ( $loc =~ /join\((\d+)((?:,\d+)+)?\)/) {
980 my @y = ($1);
981 if ( defined( my $m = $2) ) {
982 $m =~ s/^\,//;
983 push @y, split(/,/,$m);
985 for my $x ( @y ) {
986 $self->_write_line_swissprot_regex(
987 sprintf("FT %-8s %6s %6s ",
988 $key,
989 $x ,$x),
990 "FT ",
991 $desc.'.','\s+|$',$LINE_LENGTH);
993 return;
994 } else {
995 $start = $end = $fth->loc;
997 if ($desc) {
998 $self->_write_line_swissprot_regex(sprintf("FT %-8s %6s %6s ",
999 $key,
1000 $start ,$end),
1001 "FT ",
1002 $desc. '.', '\s+|$', $LINE_LENGTH);
1003 } else { #HELIX and STRAND do not have descriptions
1004 $self->_write_line_swissprot_regex(sprintf("FT %-8s %6s %6s",
1005 $key,
1006 $start ,$end),
1007 "FT ",
1008 ' ', '\s+|$', $LINE_LENGTH);
1012 if ($ftid) {
1013 $self->_write_line_swissprot_regex("FT ",
1014 "FT ",
1015 "/FTId=$ftid",'.|$',$LINE_LENGTH);
1022 =head2 _read_swissprot_References
1024 Title : _read_swissprot_References
1025 Usage :
1026 Function: Reads references from swissprot format. Internal function really
1027 Example :
1028 Returns :
1029 Args :
1032 =cut
1034 sub _read_swissprot_References{
1035 my ($self,$line) = @_;
1036 my ($b1, $b2, $rp, $rg, $title, $loc, $au, $med, $com, $pubmed, $doi);
1037 my @refs;
1038 local $_ = $line;
1039 while ( defined $_ ) {
1040 if ( /^[^R]/ || /^RN/ ) {
1041 if ( $rp ) {
1042 $rg =~ s/;\s*$//g if defined($rg);
1043 if (defined($au)) {
1044 $au =~ s/;\s*$//;
1045 } else {
1046 $au = $rg;
1048 $title =~ s/;\s*$//g if defined($title);
1049 push @refs, Bio::Annotation::Reference->new
1050 (-title => $title,
1051 -start => $b1,
1052 -end => $b2,
1053 -authors => $au,
1054 -location=> $loc,
1055 -medline => $med,
1056 -pubmed => $pubmed,
1057 -doi => $doi,
1058 -comment => $com,
1059 -rp => $rp,
1060 -rg => $rg,
1061 -tagname => 'reference',
1063 # reset state for the next reference
1064 $rp = '';
1066 if (index($_,'R') != 0) {
1067 $self->_pushback($_); # want this line to go back on the list
1068 last; # may be the safest exit point HL 05/11/2000
1070 # don't forget to reset the state for the next reference
1071 $b1 = $b2 = $rg = $med = $com = $pubmed = $doi = undef;
1072 $title = $loc = $au = undef;
1073 } elsif ( /^RP\s{3}(.+? OF (\d+)-(\d+).*)/) {
1074 $rp .= $1;
1075 $b1 = $2;
1076 $b2 = $3;
1077 } elsif ( /^RP\s{3}(.*)/) {
1078 if ($rp) {
1079 $rp .= " ".$1;
1080 } else {
1081 $rp = $1;
1083 } elsif (/^RX\s{3}(.*)/) { # each reference can have only one RX line
1084 my $line = $1;
1085 $med = $1 if $line =~ /MEDLINE=(\d+);/;
1086 $pubmed = $1 if $line =~ /PubMed=(\d+);/;
1087 $doi = $1 if $line =~ /DOI=(.+);/;
1088 } elsif ( /^RA\s{3}(.*)/ ) {
1089 $au .= $au ? " $1" : $1;
1090 } elsif ( /^RG\s{3}(.*)/ ) {
1091 $rg .= $rg ? " $1" : $1;
1092 } elsif ( /^RT\s{3}(.*)/ ) {
1093 if ($title) {
1094 my $tline = $1;
1095 $title .= ($title =~ /[\w;,:\?!]$/) ? " $tline" : $tline;
1096 } else {
1097 $title = $1;
1099 } elsif (/^RL\s{3}(.*)/ ) {
1100 $loc .= $loc ? " $1" : $1;
1101 } elsif ( /^RC\s{3}(.*)/ ) {
1102 $com .= $com ? " $1" : $1;
1104 $_ = $self->_readline;
1106 return \@refs;
1110 =head2 _read_swissprot_Species
1112 Title : _read_swissprot_Species
1113 Usage :
1114 Function: Reads the swissprot Organism species and classification
1115 lines.
1116 Able to deal with unconventional species names.
1117 Example : OS Unknown prokaryotic organism
1118 $genus = undef ; $species = Unknown prokaryotic organism
1119 Returns : A Bio::Species object
1120 Args :
1122 =cut
1124 sub _read_swissprot_Species {
1125 my( $self,$line ) = @_;
1126 my $org;
1127 local $_ = $line;
1129 my( $sub_species, $species, $genus, $common, $variant, $ncbi_taxid, $sci_name, $class_lines, $descr );
1130 my $osline = "";
1131 my $do_genus_check = 1;
1132 while ( defined $_ ) {
1133 last unless /^O[SCGX]/;
1134 # believe it or not, but OS may come multiple times -- at this time
1135 # we can't capture multiple species
1136 if (/^OS\s+(\S.+)/ && (! defined($sci_name))) {
1137 $osline .= " " if $osline;
1138 $osline .= $1;
1139 if ($osline =~ s/(,|, and|\.)$//) {
1140 # OS lines are usually like:
1141 # Homo sapiens (human)
1142 # where we have $sci_name followed by $descr (common name) in
1143 # brackets, but we can also have:
1144 # Venerupis (Ruditapes) philippinarum
1145 # where we have brackets but they don't indicate a $descr
1146 if ($osline =~ /[^\(\)]+\(.+\)[^\(\)]+$/) {
1147 #*** Danger! no idea if this will pick up some syntaxes for
1148 # common names as well)
1149 $sci_name = $osline;
1150 $sci_name =~ s/\.$//;
1151 $descr = '';
1152 $do_genus_check = 0;
1153 } else {
1154 ($sci_name, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
1156 $sci_name =~ s/\s+$//;
1158 while ($descr =~ /\(([^\)]+)\)/g) {
1159 my $item = $1;
1160 # strain etc may not necessarily come first (yes, swissprot
1161 # is messy)
1162 if ((! defined($variant)) &&
1163 (($item =~ /(^|[^\(\w])([Ss]train|isolate|serogroup|serotype|subtype|clone)\b/) ||
1164 ($item =~ /^(biovar|pv\.|type\s+)/))) {
1165 $variant = $item;
1166 } elsif ($item =~ s/^subsp\.\s+//) {
1167 if (! $sub_species) {
1168 $sub_species = $item;
1169 } elsif (! $variant) {
1170 $variant = $item;
1172 } elsif (! defined($common)) {
1173 # we're only interested in the first common name
1174 $common = $item;
1175 if ((index($common, '(') >= 0) &&
1176 (index($common, ')') < 0)) {
1177 $common .= ')';
1182 } elsif (s/^OC\s+(\S.+)$//) {
1183 $class_lines .= $1;
1184 } elsif (/^OG\s+(.*)/) {
1185 $org = $1;
1186 } elsif (/^OX\s+(.*)/ && (! defined($ncbi_taxid))) {
1187 my $taxstring = $1;
1188 # we only keep the first one and ignore all others
1189 if ($taxstring =~ /NCBI_TaxID=([\w\d]+)/) {
1190 $ncbi_taxid = $1;
1191 } else {
1192 $self->throw("$taxstring doesn't look like NCBI_TaxID");
1195 $_ = $self->_readline;
1197 $self->_pushback($_); # pushback the last line because we need it
1199 $sci_name || return;
1201 # if the organism belongs to taxid 32644 then no Bio::Species object.
1202 return if grep { $_ eq $sci_name } @Unknown_names;
1204 # Convert data in classification lines into classification array.
1205 # Remove trailing . then split on ';' or '.;' so that classification that is 2
1206 # or more words will still get matched, use map() to remove trailing/leading/intervening
1207 # spaces
1208 $class_lines=~s/\.\s*$//;
1209 my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /[;\.]*;/, $class_lines;
1211 if ($class[0] =~ /viruses/i) {
1212 # viruses have different OS/OC syntax
1213 my @virusnames = split(/\s+/, $sci_name);
1214 $species = (@virusnames > 1) ? pop(@virusnames) : '';
1215 $genus = join(" ", @virusnames);
1216 $sub_species = $descr;
1217 } elsif ($do_genus_check) {
1218 # do we have a genus?
1219 my $possible_genus = $class[-1];
1220 $possible_genus .= "|$class[-2]" if $class[-2];
1221 if ($sci_name =~ /^($possible_genus)/) {
1222 $genus = $1;
1223 ($species) = $sci_name =~ /^$genus\s+(.+)/;
1224 } else {
1225 $species = $sci_name;
1227 # is this organism of rank species or is it lower?
1228 # (doesn't catch everything, but at least the guess isn't dangerous)
1229 if ($species && $species =~ /subsp\.|var\./) {
1230 ($species, $sub_species) = $species =~ /(.+)\s+((?:subsp\.|var\.).+)/;
1234 # Bio::Species array needs array in Species -> Kingdom direction
1235 unless ($class[-1] eq $sci_name) {
1236 push(@class, $sci_name);
1238 @class = reverse @class;
1240 my $taxon = Bio::Species->new();
1241 $taxon->scientific_name($sci_name);
1242 $taxon->classification(@class);
1243 $taxon->common_name($common) if $common;
1244 $taxon->sub_species($sub_species) if $sub_species;
1245 $taxon->organelle($org) if $org;
1246 $taxon->ncbi_taxid($ncbi_taxid) if $ncbi_taxid;
1247 $taxon->variant($variant) if $variant;
1249 # done
1250 return $taxon;
1253 =head2 _filehandle
1255 Title : _filehandle
1256 Usage : $obj->_filehandle($newval)
1257 Function:
1258 Example :
1259 Returns : value of _filehandle
1260 Args : newvalue (optional)
1263 =cut
1265 # inherited from SeqIO.pm ! HL 05/11/2000
1267 =head2 _read_FTHelper_swissprot
1269 Title : _read_FTHelper_swissprot
1270 Usage : _read_FTHelper_swissprot(\$buffer)
1271 Function: reads the next FT key line
1272 Example :
1273 Returns : Bio::SeqIO::FTHelper object
1274 Args :
1277 =cut
1279 sub _read_FTHelper_swissprot {
1280 my ($self,$line ) = @_;
1281 # initial version implemented by HL 05/10/2000
1282 # FIXME this may not be perfect, so please review
1283 # lots of cleaning up by JES 2004/07/01, still may not be perfect =)
1284 # FTId now sepated from description as a qualifier
1286 local $_ = $line;
1287 my ($key, # The key of the feature
1288 $loc, # The location line from the feature
1289 $desc, # The descriptive text
1290 $ftid, # feature Id is like a qualifier but there can be only one of them
1292 if ( m/^FT\s{3}(\w+)\s+([\d\?\<]+)[\s.]+([\d\?\>]+)\s*(.*)$/ox) {
1293 $key = $1;
1294 my $loc1 = $2;
1295 my $loc2 = $3;
1296 $loc = "$loc1..$loc2";
1297 if ($4 && (length($4) > 0)) {
1298 $desc = $4;
1299 chomp($desc);
1300 } else {
1301 $desc = "";
1303 } elsif ( m/^FT\s{3}(\w+)\s+([\d\?\<]+)\s+(.*)$/ox) {
1304 $key = $1;
1305 my $loc1 = $2;
1306 my $loc2 = $2;
1307 $loc = "$loc1";
1308 if ($3 && (length($3) > 0)) {
1309 $desc = $3;
1310 chomp($desc);
1311 } else {
1312 $desc = "";
1316 while ( defined($_ = $self->_readline) && /^FT\s{4,}(\S.*)$/ ) {
1317 my $continuation_line = $1;
1318 if ( $continuation_line =~ /.FTId=(.*)\./ ) {
1319 $ftid=$1;
1321 elsif ( $desc) {
1322 $desc .= " $continuation_line";
1323 } else {
1324 $desc = $continuation_line;
1326 chomp $desc;
1328 $self->_pushback($_);
1329 unless( $key ) {
1330 # No feature key. What's this?
1331 $self->warn("No feature key in putative feature table line: $line");
1332 return;
1335 # Make the new FTHelper object
1336 my $out = Bio::SeqIO::FTHelper->new(-verbose => $self->verbose());
1337 $out->key($key);
1338 $out->loc($loc);
1340 # store the description if there is one
1341 if ( $desc && length($desc) ) {
1342 $desc =~ s/\.$//;
1343 push(@{$out->field->{"description"}}, $desc);
1345 # Store the qualifier i.e. FTId
1346 if ( $ftid ) {
1347 push(@{$out->field->{"FTId"}}, $ftid);
1349 return $out;
1353 =head2 _write_line_swissprot
1355 Title : _write_line_swissprot
1356 Usage :
1357 Function: internal function
1358 Example :
1359 Returns :
1360 Args :
1363 =cut
1365 sub _write_line_swissprot{
1366 my ($self,$pre1,$pre2,$line,$length) = @_;
1368 $length || $self->throw( "Miscalled write_line_swissprot without length. Programming error!");
1369 my $subl = $length - length $pre2;
1370 my $linel = length $line;
1371 my $i;
1373 my $sub = substr($line,0,$length - length $pre1);
1375 $self->_print( "$pre1$sub\n");
1377 for ($i= ($length - length $pre1);$i < $linel;) {
1378 $sub = substr($line,$i,($subl));
1379 $self->_print( "$pre2$sub\n");
1380 $i += $subl;
1385 =head2 _write_line_swissprot_regex
1387 Title : _write_line_swissprot_regex
1388 Usage :
1389 Function: internal function for writing lines of specified
1390 length, with different first and the next line
1391 left hand headers and split at specific points in the
1392 text
1393 Example :
1394 Returns : nothing
1395 Args : file handle, first header, second header, text-line, regex for line breaks, total line length
1398 =cut
1400 sub _write_line_swissprot_regex {
1401 my ($self,$pre1,$pre2,$line,$regex,$length) = @_;
1403 #print STDOUT "Going to print with $line!\n";
1405 $length || $self->throw( "Miscalled write_line_swissprot without length. Programming error!");
1407 if ( length $pre1 != length $pre2 ) {
1408 $self->warn( "len 1 is ". length ($pre1) . " len 2 is ". length ($pre2) . "\n");
1409 $self->throw( "Programming error - cannot called write_line_swissprot_regex with different length \npre1 ($pre1) and \npre2 ($pre2) tags!");
1412 my $subl = $length - (length $pre1) -1 ;
1414 my $first_line = 1;
1415 while ($line =~ m/(.{1,$subl})($regex)/g) {
1416 my $s = $1.$2;
1417 $s =~ s/([\w\.])#(\w)/$1 $2/g # remove word wrap protection char '#'
1418 if $pre1 eq "RA " or $pre1 eq "KW ";
1419 # remove annoying extra spaces at the end of the wrapped lines
1420 substr($s, -1, 1, '') if substr($s, -1, 1) eq ' ';
1421 if ($first_line) {
1422 $self->_print( "$pre1$s\n");
1423 $first_line = 0;
1424 } else {
1425 $self->_print( "$pre2$s\n");
1431 =head2 _post_sort
1433 Title : _post_sort
1434 Usage : $obj->_post_sort($newval)
1435 Function:
1436 Returns : value of _post_sort
1437 Args : newvalue (optional)
1440 =cut
1442 sub _post_sort{
1443 my $obj = shift;
1444 if ( @_ ) {
1445 my $value = shift;
1446 $obj->{'_post_sort'} = $value;
1448 return $obj->{'_post_sort'};
1452 =head2 _show_dna
1454 Title : _show_dna
1455 Usage : $obj->_show_dna($newval)
1456 Function:
1457 Returns : value of _show_dna
1458 Args : newvalue (optional)
1461 =cut
1463 sub _show_dna{
1464 my $obj = shift;
1465 if ( @_ ) {
1466 my $value = shift;
1467 $obj->{'_show_dna'} = $value;
1469 return $obj->{'_show_dna'};
1473 =head2 _id_generation_func
1475 Title : _id_generation_func
1476 Usage : $obj->_id_generation_func($newval)
1477 Function:
1478 Returns : value of _id_generation_func
1479 Args : newvalue (optional)
1482 =cut
1484 sub _id_generation_func{
1485 my $obj = shift;
1486 if ( @_ ) {
1487 my $value = shift;
1488 $obj->{'_id_generation_func'} = $value;
1490 return $obj->{'_id_generation_func'};
1494 =head2 _ac_generation_func
1496 Title : _ac_generation_func
1497 Usage : $obj->_ac_generation_func($newval)
1498 Function:
1499 Returns : value of _ac_generation_func
1500 Args : newvalue (optional)
1503 =cut
1505 sub _ac_generation_func{
1506 my $obj = shift;
1507 if ( @_ ) {
1508 my $value = shift;
1509 $obj->{'_ac_generation_func'} = $value;
1511 return $obj->{'_ac_generation_func'};
1515 =head2 _sv_generation_func
1517 Title : _sv_generation_func
1518 Usage : $obj->_sv_generation_func($newval)
1519 Function:
1520 Returns : value of _sv_generation_func
1521 Args : newvalue (optional)
1524 =cut
1526 sub _sv_generation_func{
1527 my $obj = shift;
1528 if ( @_ ) {
1529 my $value = shift;
1530 $obj->{'_sv_generation_func'} = $value;
1532 return $obj->{'_sv_generation_func'};
1536 =head2 _kw_generation_func
1538 Title : _kw_generation_func
1539 Usage : $obj->_kw_generation_func($newval)
1540 Function:
1541 Returns : value of _kw_generation_func
1542 Args : newvalue (optional)
1545 =cut
1547 sub _kw_generation_func{
1548 my $obj = shift;
1549 if ( @_ ) {
1550 my $value = shift;
1551 $obj->{'_kw_generation_func'} = $value;
1553 return $obj->{'_kw_generation_func'};