# BioPerl module for Bio::Variation::IO
#
# Please direct questions and support issues to <bioperl-l@bioperl.org>
#
# Cared for by Heikki Lehvaslaiho <heikki-at-bioperl-dot-org>
#
# Copyright Heikki Lehvaslaiho
#
# You may distribute this module under the same terms as perl itself
#
# POD documentation - main docs before the code

=head1 NAME

Bio::Variation::IO - Handler for sequence variation IO Formats

=head1 SYNOPSIS

    use Bio::Variation::IO;

    $in  = Bio::Variation::IO->new(-file => "inputfilename" ,
                                   -format => 'flat');
    $out = Bio::Variation::IO->new(-file => ">outputfilename" ,
                                   -format => 'xml');

    while ( my $seq = $in->next() ) {
        $out->write($seq);
    }

    # or

    use Bio::Variation::IO;

    #input file format can be read from the file extension (dat|xml)
    $in  = Bio::Variation::IO->newFh(-file => "inputfilename");
    $out = Bio::Variation::IO->newFh(-format => 'xml');

    # World's shortest flat<->xml format converter:
    print $out $_ while <$in>;

=head1 DESCRIPTION

Bio::Variation::IO is a handler module for the formats in the
Variation IO set (e.g., Bio::Variation::IO::flat). It is the officially
sanctioned way of getting at the format objects, which most people
should use.

The structure, conventions and most of the code are inherited from
the L<Bio::SeqIO> module. The main difference is that instead of using
the methods next_seq and write_seq, you drop '_seq' from the method
names and call next and write.

The idea is that you request a stream object for a particular format.
All the stream objects have a notion of an internal file that is read
from or written to. A particular Bio::Variation::IO object instance is
configured for either input or output. A specific example of a stream
object is the Bio::Variation::IO::flat object.

Each stream object has functions

   $stream->next();

and

   $stream->write($seqDiff);

also

   $stream->type() # returns 'INPUT' or 'OUTPUT'

As an added bonus, you can recover a filehandle that is tied to the
Bio::Variation::IO object, allowing you to use the standard E<lt>E<gt>
and print operations to read and write sequence variation objects:

    use Bio::Variation::IO;

    $stream = Bio::Variation::IO->newFh(-format => 'flat');
    # read from standard input

    while ( $seq = <$stream> ) {
        # do something with $seq
    }

and

    print $stream $seq; # when stream is in output mode

This makes the simplest ever reformatter:

    # input and output format names are taken from the command line
    $format1 = shift;
    $format2 = shift;

    use Bio::Variation::IO;

    $in  = Bio::Variation::IO->newFh(-format => $format1 );
    $out = Bio::Variation::IO->newFh(-format => $format2 );

    print $out $_ while <$in>;

=head1 CONSTRUCTORS

=head2 Bio::Variation::IO-E<gt>new()

   $seqIO = Bio::Variation::IO->new(-file => 'filename',   -format=>$format);
   $seqIO = Bio::Variation::IO->new(-fh   => \*FILEHANDLE, -format=>$format);
   $seqIO = Bio::Variation::IO->new(-format => $format);

The new() class method constructs a new Bio::Variation::IO object. The
returned object can be used to retrieve or print
Bio::Variation::SeqDiff objects. new() accepts the following parameters:

=over 4

=item -file

A file path to be opened for reading or writing. The usual Perl
conventions apply:

   'file'       # open file for reading
   '>file'      # open file for writing
   '>>file'     # open file for appending
   '+<file'     # open file read/write
   'command |'  # open a pipe from the command
   '| command'  # open a pipe to the command

=item -fh

You may provide new() with a previously-opened filehandle. For
example, to read from STDIN:

   $seqIO = Bio::Variation::IO->new(-fh => \*STDIN);

Note that you must pass filehandles as references to globs.

If neither a filehandle nor a filename is specified, then the module
will read from the @ARGV array or STDIN, using the familiar E<lt>E<gt>
semantics.

=item -format

Specify the format of the file. Supported formats include:

   flat        pseudo EMBL format
   xml         seqvar xml format

If no format is specified and a filename is given, then the module
will attempt to deduce it from the filename extension (.dat for flat,
.xml for xml). If this is unsuccessful, the flat format is assumed.

The format name is case insensitive. 'FLAT', 'Flat' and 'flat' are
all supported.

=back

=head2 Bio::Variation::IO-E<gt>newFh()

   $fh = Bio::Variation::IO->newFh(-fh   => \*FILEHANDLE, -format=>$format);
   $fh = Bio::Variation::IO->newFh(-format => $format);

   # e.g.
   $out = Bio::Variation::IO->newFh( '-FORMAT' => 'flat');
   print $out $seqDiff;

This constructor behaves like new(), but returns a tied filehandle
rather than a Bio::Variation::IO object. You can read sequence variation
objects from this handle using the familiar E<lt>E<gt> operator, and
write to it using print(). The usual array and $_ semantics work. For
example, you can read all objects into an array like this:

   @seqDiffs = <$fh>;

Other operations, such as read(), sysread(), write(), close(), and printf()
are not supported.

=head1 OBJECT METHODS

See below for more detailed summaries. The main methods are:

=head2 $sequence = $seqIO-E<gt>next()

Fetch the next sequence from the stream.

=head2 $seqIO-E<gt>write($sequence [,$another_sequence,...])

Write the specified sequence(s) to the stream.

=head2 TIEHANDLE(), READLINE(), PRINT()

These provide the tie interface. See L<perltie> for more details.
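
The tie mechanism can be illustrated without any BioPerl classes. The
following is a minimal, self-contained sketch and not the code used by
this module: the package names My::Stream and My::Stream::Handle and the
in-memory record list are hypothetical, chosen only to show how
TIEHANDLE(), READLINE() and PRINT() can forward E<lt>E<gt> and print()
to the next() and write() methods of a stream object.

    use strict;
    use warnings;

    package My::Stream;            # hypothetical stand-in for a format object
    sub new   { my ($class, @records) = @_;
                return bless { in => [@records], out => [] }, $class }
    sub next  { my $self = shift; return shift @{ $self->{in} } }
    sub write { my ($self, @records) = @_;
                push @{ $self->{out} }, @records; return 1 }

    package My::Stream::Handle;    # hypothetical tie class
    sub TIEHANDLE { my ($class, $stream) = @_;
                    return bless { stream => $stream }, $class }
    sub READLINE {
        my $self = shift;
        # scalar context: hand back a single record
        return $self->{stream}->next unless wantarray;
        # list context: drain the stream
        my (@all, $record);
        push @all, $record while defined( $record = $self->{stream}->next );
        return @all;
    }
    sub PRINT { my $self = shift; return $self->{stream}->write(@_) }

    package main;
    use Symbol ();

    my $stream = My::Stream->new(qw(a b c));
    my $fh     = Symbol::gensym();          # anonymous glob to tie
    tie *$fh, 'My::Stream::Handle', $stream;

    my $first = <$fh>;             # READLINE in scalar context
    my @rest  = <$fh>;             # READLINE in list context
    print $fh $first, @rest;       # PRINT, which forwards to write()

Because only these three tie methods are defined in the sketch, other
filehandle operations on $fh would fail, which mirrors the restriction
described for newFh().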

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to the
Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

=head2 Support

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:

  https://github.com/bioperl/bioperl-live/issues

=head1 AUTHOR - Heikki Lehvaslaiho

Email: heikki-at-bioperl-dot-org

=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _

=cut

# Let the code begin...

package Bio::Variation::IO;

use strict;

use base qw(Bio::SeqIO Bio::Root::IO);

=head2 new

 Title   : new
 Usage   : $stream = Bio::Variation::IO->new(-file => $filename, -format => 'Format')
 Function: Returns a new seqstream
 Returns : A Bio::Variation::IO::Handler initialised with the appropriate format
 Args    : -file => $filename
           -format => format
           -fh => filehandle to attach to

=cut

sub new {
   my ($class, %param) = @_;
   my ($format);

   @param{ map { lc $_ } keys %param } = values %param;  # lowercase keys
   $format = $param{'-format'}
             || $class->_guess_format( $param{-file} || $ARGV[0] )
             || 'flat';   # fall back to the flat format
   $format = "\L$format"; # normalize capitalization to lower case

   return unless $class->_load_format_module($format);
   return "Bio::Variation::IO::$format"->new(%param);
}

=head2 format

 Title   : format
 Usage   : $format = $stream->format()
 Function: Get the variation format
 Returns : variation format

=cut

# format() method inherited from Bio::Root::IO

sub _load_format_module {
  my ($class, $format) = @_;
  my $module = "Bio::Variation::IO::" . $format;
  my $ok;
  # trap load failures so that a readable message can be printed
  eval {
      $ok = $class->_load_module($module);
  };
  if ( $@ ) {
    print STDERR <<END;
$class: $format cannot be found
Exception $@
For more information about the IO system please see the IO docs.
This includes ways of checking for formats at compile time, not run time
END
  }
  return $ok;
}

=head2 next

 Title   : next
 Usage   : $seqDiff = $stream->next
 Function: reads the next $seqDiff object from the stream
 Returns : a Bio::Variation::SeqDiff object

=cut

sub next {
   my ($self, $seq) = @_;
   $self->throw("Sorry, you cannot read from a generic Bio::Variation::IO object.");
}

sub next_seq {
   my ($self, $seq) = @_;
   $self->throw("These are not sequence objects. Use method 'next' instead of 'next_seq'.");
}

=head2 write

 Title   : write
 Usage   : $stream->write($seq)
 Function: writes the $seq object into the stream
 Returns : 1 for success and 0 for error
 Args    : Bio::Variation::SeqDiff object

=cut

sub write {
   my ($self, $seq) = @_;
   $self->throw("Sorry, you cannot write to a generic Bio::Variation::IO object.");
}

sub write_seq {
   my ($self, $seq) = @_;
   $self->warn("These are not sequence objects. Use method 'write' instead of 'write_seq'.");
   $self->write($seq);
}

=head2 _guess_format

 Title   : _guess_format
 Usage   : $obj->_guess_format($filename)
 Returns : guessed format of filename (lower case)

=cut

sub _guess_format {
   my $class = shift;           # discard the invocant
   return unless $_ = shift;    # no filename, nothing to guess
   return 'flat' if /\.dat$/i;
   return 'xml'  if /\.xml$/i;
}