=head1 NAME

lwpcook - The libwww-perl cookbook

=head1 DESCRIPTION

This document contains some examples that show typical usage of the
libwww-perl library.  You should consult the documentation for the
individual modules for more detail.

All examples should be runnable programs. You can, in most cases, test
the code sections by piping the program text directly to perl.

=head1 GET

It is very easy to use this library to just fetch documents from the
net.  The LWP::Simple module provides the get() function that returns
the document specified by its URL argument:

  use LWP::Simple;
  $doc = get 'http://www.linpro.no/lwp/';

or, as a perl one-liner using the getprint() function:

  perl -MLWP::Simple -e 'getprint "http://www.linpro.no/lwp/"'

or, how about fetching the latest perl by running this command:

  perl -MLWP::Simple -e '
    getstore "ftp://ftp.sunet.se/pub/lang/perl/CPAN/src/latest.tar.gz",
             "perl.tar.gz"'

You will probably first want to find a CPAN site closer to you by
running something like the following command:

  perl -MLWP::Simple -e 'getprint "http://www.perl.com/perl/CPAN/CPAN.html"'

Enough of this simple stuff!  The LWP object oriented interface gives
you more control over the request sent to the server.  Using this
interface you have full control over headers sent and how you want to
handle the response returned.

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;
  $ua->agent("$0/0.1 " . $ua->agent);
  # $ua->agent("Mozilla/8.0") # pretend we are a very capable browser

  $req = HTTP::Request->new(GET => 'http://www.linpro.no/lwp');
  $req->header('Accept' => 'text/html');

  # send request
  $res = $ua->request($req);

  # check the outcome
  if ($res->is_success) {
     print $res->decoded_content;
  }
  else {
     print "Error: " . $res->status_line . "\n";
  }

The lwp-request program (alias GET) that is distributed with the
library can also be used to fetch documents from WWW servers.

=head1 HEAD

If you just want to check if a document is present (i.e. the URL is
valid), call the head() function from LWP::Simple and test its return
value: it is true when the document exists.

The head() function really returns a list of meta-information about
the document.  The first three values of the list returned are the
document type, the size of the document, and the last modification
time of the document.
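
As a minimal sketch of the check described above, the following program
tests each URL given on the command line with head() and, when the
request succeeds, prints the first three values of the returned list.
The C<describe_head> helper and the output format are assumptions made
for illustration; they are not part of the library.

```perl
use strict;
use warnings;
use LWP::Simple qw(head);

# Hypothetical helper: format the first three values that head() returns.
sub describe_head {
    my ($type, $length, $mtime) = @_;
    return sprintf "type=%s size=%s modified=%s",
        defined $type   ? $type                 : 'unknown',
        defined $length ? $length               : 'unknown',
        defined $mtime  ? scalar gmtime($mtime) : 'unknown';
}

# Check every URL given on the command line; without arguments nothing is fetched.
for my $url (@ARGV) {
    my @info = head($url);    # empty list if the request failed
    if (@info) {
        print "$url: ", describe_head(@info), "\n";
    }
    else {
        print "$url: not available\n";
    }
}
```

Servers are free to omit any of these headers, which is why the helper
treats every value as optional.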

More control over the request or access to all header values returned
requires that you use the object oriented interface described for GET
above.  Just s/GET/HEAD/g.
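
A minimal sketch of that substitution: build a HEAD request with the
same constructor used for GET and list every header field of the
response.  The C<head_request> and C<show_headers> helpers and the
command-line handling are assumptions made for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Build a HEAD request; same constructor as for GET, only the method differs.
sub head_request {
    my ($url) = @_;
    return HTTP::Request->new(HEAD => $url);
}

# Hypothetical helper: send the request and print every response header.
sub show_headers {
    my ($ua, $url) = @_;
    my $res = $ua->request(head_request($url));
    print $res->status_line, "\n";
    for my $name ($res->header_field_names) {
        print "$name: ", join(', ', $res->header($name)), "\n";
    }
    return $res;
}

# Runs only when URLs are given on the command line.
if (@ARGV) {
    my $ua = LWP::UserAgent->new;
    show_headers($ua, $_) for @ARGV;
}
```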

=head1 POST

There is no simple procedural interface for posting data to a WWW
server.  You must use the object oriented interface for this.  The
most common POST operation is to access a WWW form application:

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;

  my $req = HTTP::Request->new(POST => 'http://www.perl.com/cgi-bin/BugGlimpse');
  $req->content_type('application/x-www-form-urlencoded');
  $req->content('match=www&errors=0');

  my $res = $ua->request($req);
  print $res->as_string;

Lazy people use the HTTP::Request::Common module to set up a suitable
POST request message.  It handles all the escaping issues and has a
suitable default for the content_type:

  use HTTP::Request::Common qw(POST);
  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;

  my $req = POST 'http://www.perl.com/cgi-bin/BugGlimpse',
                [ search => 'www', errors => 0 ];

  print $ua->request($req)->as_string;
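
To see what the escaping amounts to, a request built with POST can be
inspected without sending it.  The sketch below uses a made-up field
value containing spaces and an ampersand; the URL is the one used
above and is never contacted.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# A field value with characters that must be escaped in a form body
# (the value itself is invented for this illustration).
my $req = POST 'http://www.perl.com/cgi-bin/BugGlimpse',
               [ search => 'fish & chips', errors => 0 ];

print $req->method, "\n";          # request method
print $req->content_type, "\n";    # default content type chosen by POST
print $req->content, "\n";         # escaped form body
```

In the printed body the ampersand inside the value should appear as
C<%26>, so it cannot be confused with the separator between fields.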

The lwp-request program (alias POST) that is distributed with the
library can also be used for posting data.

=head1 PROXIES

Some sites use proxies to go through firewall machines, or just as a
cache in order to improve performance.  Proxies can also be used for
accessing resources through protocols not supported directly (or
supported badly :-) by the libwww-perl library.

You should initialize your proxy settings before you start sending
requests:

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;
  $ua->env_proxy; # initialize from environment variables
  # or
  $ua->proxy(ftp  => 'http://proxy.myorg.com');
  $ua->proxy(wais => 'http://proxy.myorg.com');
  $ua->no_proxy(qw(no se fi));

  my $req = HTTP::Request->new(GET => 'wais://xxx.com/');
  print $ua->request($req)->as_string;

The LWP::Simple interface will call env_proxy() for you automatically.
Applications that use the $ua->env_proxy() method will normally not
use the $ua->proxy() and $ua->no_proxy() methods.
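
The effect of these calls can be checked by reading the settings back:
called with only a scheme, proxy() returns the proxy currently
configured for it.  A minimal sketch, reusing the placeholder host
proxy.myorg.com from above; no request is sent.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Route ftp and wais requests through the (placeholder) proxy.
$ua->proxy([qw(ftp wais)] => 'http://proxy.myorg.com');

# With only a scheme as argument, proxy() returns the current setting.
for my $scheme (qw(ftp wais http)) {
    my $proxy = $ua->proxy($scheme);
    printf "%-4s -> %s\n", $scheme, defined $proxy ? $proxy : 'no proxy set';
}
```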

Some proxies also require that you send them a username/password in
order to let requests through.  You should be able to add the
required header, with something like this:

  use LWP::UserAgent;

  $ua = LWP::UserAgent->new;
  $ua->proxy(['http', 'ftp'] => 'http://username:password@proxy.myorg.com');

  $req = HTTP::Request->new('GET',"http://www.perl.com");

  $res = $ua->request($req);
  print $res->decoded_content if $res->is_success;

Replace C<proxy.myorg.com>, C<username> and
C<password> with something suitable for your site.

=head1 ACCESS TO PROTECTED DOCUMENTS

Documents protected by basic authorization can easily be accessed
like this:

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;
  $req = HTTP::Request->new(GET => 'http://www.linpro.no/secret/');
  $req->authorization_basic('aas', 'mypassword');
  print $ua->request($req)->as_string;

The other alternative is to provide a subclass of I<LWP::UserAgent> that
overrides the get_basic_credentials() method.  Study the I<lwp-request>
program for an example of this.
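
A minimal sketch of such a subclass follows.  The class name
C<MyAgent> is hypothetical, and the fixed username and password are
the placeholder values used above; a real program would more likely
prompt for them or look them up per realm.

```perl
package MyAgent;    # hypothetical subclass name
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
our @ISA = ('LWP::UserAgent');

# Called by LWP when a server (or proxy) answers with an
# authentication challenge; returns a (username, password) pair.
sub get_basic_credentials {
    my ($self, $realm, $uri, $isproxy) = @_;
    return if $isproxy;              # no proxy credentials in this sketch
    return ('aas', 'mypassword');    # same placeholder values as above
}

package main;

my $ua = MyAgent->new;

# Fetch the protected URL given on the command line, if any.
if (@ARGV) {
    my $res = $ua->request(HTTP::Request->new(GET => $ARGV[0]));
    print $res->status_line, "\n";
}
```

Compared with authorization_basic(), this keeps the credentials out of
the individual requests and only supplies them when a server asks.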

=head1 COOKIES

Some sites like to play games with cookies.  By default LWP ignores
cookies provided by the servers it visits.  LWP will collect cookies
and respond to cookie requests if you set up a cookie jar.

  use LWP::UserAgent;
  use HTTP::Cookies;

  $ua = LWP::UserAgent->new;
  $ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt",
                                     autosave => 1));

  # and then send requests just as you used to do
  $res = $ua->request(HTTP::Request->new(GET => "http://www.yahoo.no"));
  print $res->status_line, "\n";

As you visit sites that send you cookies to keep, the file
F<lwpcookies.txt> will grow.
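
To see what a cookie jar does with a response, the following sketch
feeds a hand-made response to an in-memory HTTP::Cookies object and
lets it decorate a follow-up request, which is roughly what the user
agent does for you once a jar is configured.  The host name
www.example.com and the cookie C<session=abc123> are made up for the
illustration, and nothing is sent over the network.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

my $jar = HTTP::Cookies->new;    # in-memory jar, no file involved

# A hand-made response that sets a cookie (values are invented).
my $req = HTTP::Request->new(GET => 'http://www.example.com/');
my $res = HTTP::Response->new(200, 'OK');
$res->request($req);             # the jar needs to know the originating request
$res->header('Set-Cookie' => 'session=abc123; path=/');

# Store the cookie from the response in the jar.
$jar->extract_cookies($res);

# A later request to the same host gets a Cookie header added.
my $next = HTTP::Request->new(GET => 'http://www.example.com/page');
$jar->add_cookie_header($next);

print $next->as_string;
```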

=head1 HTTPS

URLs with https scheme are accessed in exactly the same way as with
http scheme, provided that an SSL interface module for LWP has been
properly installed (see the F<README.SSL> file found in the
libwww-perl distribution for more details).  If no SSL interface is
installed for LWP to use, then you will get "501 Protocol scheme
'https' is not supported" errors when accessing such URLs.
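
Whether an SSL interface is available can be checked up front with the
is_protocol_supported() method of LWP::UserAgent.  A minimal sketch;
the output format is an arbitrary choice.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Report for each scheme whether this installation can handle it.
for my $scheme (qw(http https)) {
    printf "%-5s %s\n", $scheme,
        $ua->is_protocol_supported($scheme) ? 'supported' : 'not supported';
}
```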

Here's an example of fetching and printing a WWW page using SSL:

  use LWP::UserAgent;

  my $ua = LWP::UserAgent->new;
  my $req = HTTP::Request->new(GET => 'https://www.helsinki.fi/');
  my $res = $ua->request($req);
  if ($res->is_success) {
      print $res->as_string;
  }
  else {
      print "Failed: ", $res->status_line, "\n";
  }

=head1 MIRRORING

If you want to mirror documents from a WWW server, then try to run
code similar to this at regular intervals:

  use LWP::Simple;

  %mirrors = (
     'http://www.sn.no/'             => 'sn.html',
     'http://www.perl.com/'          => 'perl.html',
     'http://www.sn.no/libwww-perl/' => 'lwp.html',
     'gopher://gopher.sn.no/'        => 'gopher.html',
  );

  while (($url, $localfile) = each(%mirrors)) {
     mirror($url, $localfile);
  }

Or, as a perl one-liner:

  perl -MLWP::Simple -e 'mirror("http://www.perl.com/", "perl.html")';

The document will not be transferred unless it has been updated.
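
The mirror() function returns the HTTP status code of the transfer, so
a script can tell an updated file from an unchanged one.  A minimal
sketch; the C<mirror_outcome> helper and its labels are assumptions
made for illustration.

```perl
use strict;
use warnings;
use LWP::Simple;    # exports mirror(), is_success() and the RC_* constants

# Hypothetical helper: turn the status code returned by mirror()
# into a short label.
sub mirror_outcome {
    my ($rc) = @_;
    return 'unchanged' if $rc == RC_NOT_MODIFIED;    # 304: local copy is current
    return 'updated'   if is_success($rc);           # 2xx: file was (re)written
    return "failed ($rc)";
}

# Usage: perl mirror.pl URL FILE
if (@ARGV == 2) {
    my ($url, $localfile) = @ARGV;
    print "$url: ", mirror_outcome(mirror($url, $localfile)), "\n";
}
```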

=head1 LARGE DOCUMENTS

If the document you want to fetch is too large to be kept in memory,
then you have two alternatives.  You can instruct the library to write
the document content to a file (second $ua->request() argument is a file
name):

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;

  my $req = HTTP::Request->new(GET =>
     'http://www.linpro.no/lwp/libwww-perl-5.46.tar.gz');
  $res = $ua->request($req, "libwww-perl.tar.gz");
  if ($res->is_success) {
     print "ok\n";
  }
  else {
     print $res->status_line, "\n";
  }

Or you can process the document as it arrives (second $ua->request()
argument is a code reference):

  use LWP::UserAgent;
  $ua = LWP::UserAgent->new;
  $URL = 'ftp://ftp.unit.no/pub/rfc/rfc-index.txt';

  my $expected_length;
  my $bytes_received = 0;
  my $res =
     $ua->request(HTTP::Request->new(GET => $URL),
               sub {
                   my($chunk, $res) = @_;
                   $bytes_received += length($chunk);
                   unless (defined $expected_length) {
                      $expected_length = $res->content_length || 0;
                   }
                   if ($expected_length) {
                        printf STDERR "%d%% - ",
                                  100 * $bytes_received / $expected_length;
                   }
                   print STDERR "$bytes_received bytes received\n";

                   # XXX Should really do something with the chunk itself
                   # print $chunk;
               });
  print $res->status_line, "\n";
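
As a sketch of doing something with the chunk itself, the callback
below counts the lines of a document while it arrives, so the content
never has to be held in memory as a whole.  The C<count_lines> helper
is hypothetical, and the third argument to request() is only a hint
for the chunk size.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Hypothetical helper: stream the document at $url and count its lines
# chunk by chunk.
sub count_lines {
    my ($ua, $url) = @_;
    my $lines = 0;
    my $res = $ua->request(
        HTTP::Request->new(GET => $url),
        sub {
            my ($chunk, $response) = @_;
            $lines += ($chunk =~ tr/\n//);    # count newlines in this chunk
        },
        4096,    # chunk size hint in bytes; actual chunks may differ
    );
    return ($res, $lines);
}

# Usage: perl count.pl URL
if (@ARGV) {
    my ($res, $lines) = count_lines(LWP::UserAgent->new, $ARGV[0]);
    print $res->status_line, ": $lines lines\n";
}
```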

=head1 COPYRIGHT

Copyright 1996-2001, Gisle Aas

This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.