3 # Put the full path to perl on the first line, run the program with
4 # `perl CGIservlet`, or put a symbolic link to perl in
5 # the startup directory if you need a special version of perl.
7 if(grep(/\-\-help/i, @ARGV))
9 print << 'ENDOFHELPTEXT';
11 # A HTTPd "connector" for running CGI scripts on unix systems as WWW
12 # accessible Web sites. The servlet starts a true HTTP daemon that channels
13 # HTTP requests to forked daughter processes. CGIservlet.pl is NOT a
14 # full fledged server. Moreover, this servlet is definitely NOT intended
15 # as a replacement of a real server (e.g., Apache). It's design goal was
16 # SIMPLICITY, and not mileage.
18 # Note that a HTTP server can be accessed on your local machine WITHOUT
19 # internet access (but WITH a DNS?):
20 # use "http://localhost[:port]/[path]" or "http://127.0.0.1[:port]/[path]"
21 # as the URL. It is also easy to restrict access to the servlet to localhost
22 # users (i.e., the computer running the servlet).
25 # - A testbed for CGI-scripts and document-trees outside the primary server.
26 # When developing new scripts and services, you don't want to mess up your
27 # current Web-site. CGIservlet is an easy way to start a temporary (private)
28 # server. CGIservlet allows to test separate HTTP server components, e.g.,
29 # user authentication, in isolation.
31 # - A special purpose temporary server (WWW everywhere/anytime).
32 # We run identification and other experiments over the inter-/intra-net using
33 # CGI-scripts. This means a lot of development and changes and only little
34 # actual run-time. The people doing this do not want "scripting" access to our
35 # departmental server with all its restrictions and security. So we need a
36 # small, lightweigth, easy-to-configure server that can be run by each
37 # investigator on her own account (and risk).
39 # - Interactive WWW presentations.
40 # Not everyone is content with the features of "standard" office presentation
41 # software. HTML and its associated browsers are an alternative (especially
42 # under Linux). However, you need a server to realize the full interactive
43 # nature of the WWW. CGIservlet with the necessary scripts can be run from
44 # a floppie (a Web server in 100 kB). The CGIservlet can actually run a
45 # (small) web site from RAM, without disk access (if you DO NOT use the
46 # 2>pid.log redirection on startup).
47 # With the "localhost" or "127.0.0.1" id in your browser you can use the
50 # When the servlet is started with the -r option, only requests from "localhost"
51 # or "127.0.0.1" are accepted (default) or from addresses indicated after the
54 # Running demo's and more information can be found at
55 # http://www.fon.hum.uva.nl/rob/OSS/OSS.html
58 ############################################################################
60 # Changes (document ALL changes with date, name and email here):
62 # 18 Jun 2012 - Added --env, %UserEnv user defined ENV variables. Freeze %ENV.
63 # 06 Jun 2012 - Added HTTP Cookie string to an Environment variable: COOKIE_JAR
64 # 29 May 2012 - Added .log -> / to @RegAliasTranslation, blocks attempts to read
66 # 22 May 2012 - Blocked "hidden" files and directories starting with "." in
67 # default @RegAliasTranslation. Removed obsolete CVS reference.
68 # 21 May 2012 - Inserted the -m[emory] switch for loading and serving from RAM.
69 # The option was mentioned in the manual, but the cli switch was
71 # 22 Jul 2003 - Plain output using binary print io. `cat ...`
72 # 22 Jul 2003 - Added 'use CGI::Carp qw(fatalsToBrowser);' line
73 # for debugging. Standard this is commented out
74 # for security reasons (suggested by Jochen_Hayek@ACM.org).
75 # 22 Jul 2003 - Added error checking to doarg (suggested by Jochen_Hayek@ACM.org)
76 # 22 Jul 2003 - Removed SERVER_PORT from HTTP_HOST (Bug found by Jochen_Hayek@ACM.org)
77 # 22 Jul 2003 - Updated documentation. Added CGIservlet directory
78 # to the search path of CGIservletSETUP.pl
79 # 20 May 2003 - Made sure recycled (double) pid's do not mess up the
80 # @brood list and added a --help switch.
81 # 20 May 2003 - Added a maximum running time for child processes
82 # with command line switch -xterm.
83 # 15 Jan 2002 - Version 1.3
84 # 19 Oct 2001 - Included browsing of directories and a new -s
85 # security switch. With security toggled of
86 # directories can be browsed and all mime-types
87 # are served, either as 'text/plain' or as
88 # 'application/octed-stream'.
89 # 18 May 2001 - Added some HTTP HTTP lines.
90 # 13 Jun 2000 - Included the possibility to add POST request
91 # to GET query-strings (and change the request
92 # method). The -l ($Maxlength) maximum length
93 # option now covers POST requests too.
94 # 8 Dec 1999 - Included hooks for compression when running from RAM.
95 # 2 Dec 1999 - Autoflush enabled.
96 # 2 Dec 1999 - Allow running a Web Site from RAM.
97 # 2 Dec 1999 - Changed the behavior of CGIservletSETUP. CGIservlet
98 # will eval ALL setup files, the one in the CGIscriptor
99 # subdirectory (if any) AND the one in the current
100 # directory. (also added a close(SETUP) command)
101 # 26 Nov 1999 - Added some minimal security for 'automatic', out of
102 # the box installation.
103 # 26 Nov 1999 - Made the text/osshell mime-type functional (i.e.,
104 # without any scripts, implement a dynamic web server)
105 # Linited to '.cgi' extension.
106 # 26 Nov 1999 - Added aliasing of URL paths, both one-to-one lookups
107 # and full regular expression, i.e., $Path =~ s/.../.../g
109 # 28 Sep 1999 - Made all client supplied HTTP parameter names lowercase
110 # to handle inconsistencies in case use.
111 # 29 Jul 1999 - Allowed for a SETUP configuration file 'CGIservletSETUP.pl'.
112 # Use $beginarg from the 'CGIscriptor/' directory if it exists.
113 # (R.J.J.H.vanSon@uva.nl)
116 ############################################################################
120 # 23 Mar 2000 - An odd server side network error is reported by Netscape
121 # when a Post is initiated from a Javascript Submit of a
122 # <FORM>. This was found on Red Hat 6.1 Linux with perl 5.00503,
123 # 5.00503 and 5.6.0. But not on IRIX or Red Hat 5.0, 7.x.
125 ############################################################################
129 # Whenever an HTTP request is received, the specified CGI script is
130 # started inside a child process as if it was inside a real server (e.g.,
131 # Apache). The evironment variables are set more or less as in Apache.
132 # Note that CGIservlet only uses a SINGLE script for ALL requests.
133 # No attemps for security are made, it is the script's responsibility to
134 # check access rights and the validity of the request.
135 # When no scripts are given, CGIservlet runs as a bare bone WWW server
136 # configurable to execute scripts (the default setting is as a
139 # Author and copyright (c) :
142 # R.J.J.H.vanSon@gmail.com
146 # copying freely from the mhttpd server by Jerry LeVan (levan@eagle.eku.edu)
147 # Date: July 22, 2012
149 # Env: Perl 5.002 and later
152 ################################################################################
156 # This program is free software; you can redistribute it and/or #
157 # modify it under the terms of the GNU General Public License #
158 # as published by the Free Software Foundation; either version 2 #
159 # of the License, or (at your option) any later version. #
161 # This program is distributed in the hope that it will be useful, #
162 # but WITHOUT ANY WARRANTY; without even the implied warranty of #
163 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
164 # GNU General Public License for more details. #
166 # You should have received a copy of the GNU General Public License #
167 # along with this program; if not, write to the Free Software #
168 # Foundation, Inc., 59 Temple Place - Suite 330, #
169 # Boston, MA 02111-1307, USA. #
171 ################################################################################
173 # Note: CGIservlet.pl was directly inspired by Jerry LeVan's
174 # (levan@eagle.eku.edu) simple mhttpd server which again was
175 # inspired by work of others. CGIservlet is used as a bare bones
176 # socket server for a single CGI script at a time.
178 # Use: CGIservlet.pl -<switch> <argument> 2>pid.log & (sh/bash)
179 # CGIservlet.pl -<switch> <argument> >&pid.log & (csh)
181 # The servlet prints out pid and port number on STDERR. It is
182 # adviced to store these in a separate file (this will become the
184 # NOTE: When running CGIservlet from a Memmory Image (i.e. RAM),
185 # do NOT redirect the error output to a file, but use something
186 # like MAILTO or /dev/null!
188 # Stop: sh pid.log (kills the server process)
190 # The first line in the file that receives STDERR output is a command
191 # to stop CGIservlet.
194 # CGIservlet.pl -p 2345 -d /cgi-bin/CGIscriptor.pl -t /WWW 2>pid.log &
195 # CGIservlet.pl -p 8080 -b 'require "CGIscriptor.pl";' -t $PWD -e \
196 # 'Handle_Request();' 2>pid.log &
198 # The following example settings implement a static WWW server using 'cat'
199 # (and prohibiting Queries):
204 # '$ENV{QUERY_STRING}="";$ENV{PATH_INFO}=~/\.([\w]+)$/; "Content-type: ".$mimeType{uc($1)}."\n\n";'
210 # This is identical to the (static) behaviour of CGIservlet when
211 # -e '' -d '' -x '' is used.
212 # The CGIservlet command should be run from the intended server-root directory.
214 # Another setting will use a package 'CGIscriptor.pl' with a function
215 # 'HandleRequest()' to implement an interactive WWW server with inline
219 # -b 'require "CGIscriptor.pl";'
220 # -e 'HandleRequest();'
226 # Look below or in the CGIservletSETUP.pl file for the current default
230 # ###############################################################################
232 # There are many switches to tailor the workings of CGIservlet.pl.
233 # Some are fairly esoteric and you should only look for them if you
234 # need something special urgently. When building a Web site,
235 # the specific options you need will "suggest" themselves (e.g., port
236 # number, script, or server-root directory). Most default settings
239 # You can add your own configuration in a file called
240 # 'CGIservletSETUP.pl'. This file will be executed ("eval"-ed)
241 # after the default setup, but before the command line options take
242 # effect. CGIservlet looks for the SETUP file in the startup directory
243 # and in the CGIscriptor subdirectory.
244 # (Note that the $beginarg variable is evaluated AFTER the setup file).
246 # In any case, it is best to change the default settings instead of
247 # using the option switches. All defaults are put in a single block.
249 # switches and arguments:
251 # -p[ort] port number
252 # For example -p 2345
253 # Obviously the port CGIservlet listenes to. Suggested Default: -p 8008
255 # -a[lias] Alias1 RealURL1 ...
256 # For example -a '/Stimulus.aifc' '/catAIFC.xmr'
257 # Replaces the given Alias URL path by its real URL path. Accepts full
258 # regular expressions too (identified by NON-URL characters).
259 # That is, on each request it performs (in order):
260 # if($AliasTranslation{$Path})
262 # $Path = $AliasTranslation{$Path};
264 # elsif(@RegAliasTranslation)
267 # for($i=0; $i<scalar(@RegAliasTranslation); ++$i)
269 # my $Alias = $RegAliasTranslation[$i];
270 # my $RealURL = $RegURLTranslation[$i];
271 # last if ($Path =~ s#$Alias#$RealURL#g);
274 # The effects can be quite drastic, so be
275 # carefull. Note also, that entering many Regular Expression
276 # aliases could slow down your servlet. Checking stops after
278 # Full regular expression alias translations are done in the
279 # order given! They are recognized as Aliases containing
280 # regexp's (i.e., non-URL) operator characters like '^' and
282 # Note: The command line is NOT a good place for entering
283 # Aliases, change the code below or add aliases to
284 # CGIservletSETUP.pl.
290 # -b[egin] perl commands
291 # For example -b 'require "CGIscriptor.pl";' or
292 # 'require "/WWW/cgi-bin/XMLelement.pl";'
293 # Perl commands evaluated at server startup
295 # -d[o] perl script file
296 # For example -d '/WWW/cgi-bin/CGIscriptor.pl'
297 # The actual CGI-script started as a perl {do "scriptfile"} command.
298 # The PATH_INFO and the QUERY are pushed on @ARGV.
302 # -exec shell command
303 # OS shell script or command, e.g., -x 'CGIscriptor.pl' or
304 # -x '/WWW/cgi-bin/my-script'
305 # The actual CGI-script started as `my-script \'$Path\' \'$QueryString\'`.
306 # -qx and -exec[ute] are aliases of -x. For security reasons, Paths or
307 # queries containing '-quotes are rejected.
309 # -e[val] perl commands
310 # For example -e 'Handle_Request();'
311 # The argument is evaluated as perl code. The actual CGI-script
312 # can be loaded once with -b 'require module.pm' and you only have to
313 # call the central function(s).
317 # For example -t "$PWD" or -t "/WWW/documents"
318 # The root of the server hierachy. Defaults to the working directory
319 # at startup time (`pwd`)
321 # -w[elcome] filepath
322 # For example -w "/index.html" (default)
323 # The default welcome page used when no path is entered. Note that
324 # this path can point to anything (or nothing real at all).
327 # The following arguments supply some rudimentary security. It is the
328 # responsibility of the script to ensure that the requests are indeed
331 # -c[hildren] maximum child processes
333 # The maximum number of subprocesses started. If there are more requests,
334 # the oldest requests are "killed". This should take care of "zombie"
335 # processes and server overloading. Note that new requests will be
336 # serviced irrespective of the success of killing it's older siblings.
338 # -xtime maximum running time of a child
339 # For example -xtime 36000
340 # The maximum time a child may run in seconds. After a new request has
341 # been servised, all children that have run for longer than this time
342 # will be killed. This stops runaway processes, often connected to
345 # -l[ength] maximum length of HTTP request in bytes
346 # For example -l 32768
347 # This prevents overloading the server with enormous queries. Reading of
348 # requests simply stops when this limit is reached. This DOES affect
349 # POST requests. If the combined length of the COMPLETE HTTP request,
350 # including headers, exceeds this limit, the whole request is dropped.
352 # -r[estrict] [Remote-address [Remote-host]]
353 # For example -r 127.0.0.1 (default of -r)
354 # A space separated list of client IP addresses and/or domain names that
355 # should be serviced. Default, i.e., '-r' without any addresses or domain
356 # names, is the localhost IP address '127.0.0.1'.
357 # When using CGIservlet for local purposes only (e.g., development or a
358 # presentation), it would be unsafe to allow others to access the servlet.
359 # If -r is used (or the corresponding @RemoteAddr or @RemoteHost lists are
360 # filled in the code below), all requests from clients whose Remote-address
361 # or Remote-host do not match the indicated addresses will be rejected.
362 # Partial addresses and domain names are allowed. Matching is done according
363 # to Remote-addr =~ /^\Q$pattern\E/ (front to back) and
364 # Remote-host =~ /\Q$pattern\E$/ (back to front)
366 # --env name=value,name=value
367 # Watch double dash. Define $ENV{name}=value for every pair. These are
368 # internally stored in %UserEnv, eg, $UserEnv{name}=value; This is set anew
369 # in the Child with every request. That is, Changes in %ENV are not stored.
373 # A toggle switch that blocks all access to files with undefined
374 # mime-types (or to serve ascii files as "text/plain"), and blocking directory
375 # browsing. Defaults to blocking what is not explicitely allowed.
379 # Reads complete Web site into memory and runs from this image.
380 # Set $UseRAMimage = 1; to activate memory-only running.
381 # Under certain circumstance, this can improve security.
382 # Note, however, that running osshellscripts from this image
383 # makes any "security" related claims very shaky.
388 # Retrieving the domain name of the Client (i.e., Remote-host) is a
389 # very slow process and normally useless. To skip it, enter this
390 # option. Note that you cannot use '-r Remote-host' anymore after
391 # you enter -n, only IP addresses will work.
393 # Configuration with the CGIservletSETUP.pl file
395 # You can add your own configuration in a file
396 # called 'CGIservletSETUP.pl'. This file will be executed ("eval"-ed)
397 # after the default setup, but before the command line options take
398 # effect. CGIservlet looks for the SETUP file in the startup directory
399 # and in the CGIservlet and CGIscriptor subdirectories.
400 # (Note that the $beginarg variable is evaluated even later).
402 # Changing POST to GET requests
404 # CGIservlet normally only handles requests with the GET method. Processing
405 # the input from POST requests is left to the reading application. POST
406 # requests add some extra complexity to processing requests. Sometimes,
407 # the reading application doesn't handle POST requests. CGIservlet
408 # already has to manage the HTTP request. Therefore, it can easily
409 # handle the POST request. If the variable $POSTtoGET is set to any
410 # non-false value, the content of whole POST request is added to the
411 # QUERY_STRING environment variable (preceeded by a '&' if necessary).
412 # The content-length is set to 0. If $POSTtoGET equals 'GET', the method
413 # will also be changed to 'GET'.
416 # All of the arguments of -d, -e, and -x are processed sequentially
417 # in this order. This might not be what you want so you should be
418 # carefull when using multiple executable arguments.
419 # If none of the executable arguments is DEFINED (i.e., they are entered
420 # as -d '' -e '' -x ''), each request is treated as a simple
421 # text-retrieval. THIS CAN BE A SECURITY RISK!
423 # The wiring of an interactive web-server, which also calls shell
424 # scripts with the extension '.cgi', is in place. You can
425 # "activate" it by changing the "my $ExecuteOSshell = 0;" line to
426 # "my $ExecuteOSshell = 1;".
427 # If you have trouble doing this, it might be a good idea
428 # to reconsider using a dynamic web server. Executing shell
429 # scripts inside a web server is a rather dangerous practise.
431 # CGIservlet can run its "standard" web server from memory.
432 # At startup, all files are read into a hash table. Upon
433 # request, the contents of the file are placed in the
434 # environment variable: CGI_FILE_CONTENTS.
435 # No further disk access is necessary. This means that:
436 # 1 CGIservlet can run a WWW site from a removable disk,
438 # 2 The web servlet can run without any read or write privilege.
439 # 3 The integrity of the Web-site contents can be secured at the
442 # To compres the memory (RAM) immage, you should hook the
443 # compression function to
444 # $CompressRAMimage = sub { return shift;};
445 # and the decompression function to
446 # $DecompressRAMimage = sub { return shift;};
452 ###################################################################################
455 use strict
; # Should realy be used!
457 use Carp
; # could come in handy (can be missed, I think)
459 # For debugging: uncommenting the use-line below will send
460 # nicely formanted output to the client. However, it is
461 # generally not a good idea to enable clients to test your
462 # scripts and look for holes (SECURITY).
463 # use CGI::Carp qw(fatalsToBrowser);
465 $| = 1; # Autoflush (i'm not sure whether this is usefull)
467 my $version = "1.301";
468 my $program = "CGIservlet.pl";
470 ##################################################################
472 # print some information to STDERR, e.g., the process number #
474 ##################################################################
475 sub logmsg
{ print STDERR
"kill -KILL $$;exit;\n", # Stop CGIservlet
476 "$0 $$: @_ at ", scalar localtime, "\n" }
478 ############################################################
480 # Parse arguments (you can define DEFAULT VALUES here) #
482 ############################################################
484 my $port = 8008; # The port number
486 # Add POST requests to the QUERY_STRING, change method to
487 # GET if the value is 'GET'
488 my $POSTtoGET = 0; # Add POST requests to the query string
490 # (Fast) direct translation of full URL paths
491 my %AliasTranslation = (); # Alias => RealURL pairs (ONLY paths)
492 # Regular expression alias translation, in order of application
493 # (this can be quite slow)
494 my @RegAliasTranslation = ('^(\..*|.*/\..*)$','\.htm$', '^.*\.log$'); # Full regular expression alias/url pairs: URL
495 my @RegURLTranslation = ('/','.html', '/'); # Full regular expression alias/url pairs: PATH
497 my $textroot = $ENV{'PWD'} || `pwd`; # current working directory
498 chomp($textroot); # Remove nasty newline, if present
499 my $doarg = ''; # do "filename",
501 my $beginarg = ''; # eval($Argument) at the start of the program
502 my $evalarg = ''; # eval($Argument) for each request
503 my $execarg = ''; # execute `command \'$textroot$Path\' \'$QueryString\'`
505 my $welcome = '/index.html'; # Default path
507 # Rudimentary security, overflow detection
508 my $MaxBrood = 32; # Maximum number of running children
509 my $MaxTime = 36000; # Maximum time a child may run in seconds
510 my $MaxLength = 2**15; # Maximum Request Length
511 my $Secure = 1; # Block browsing directories and text files or not
514 # If one of the following lists contains any client addresses or names, all others are
515 # blocked (be carefull, your site will be inaccessible if you misspell them).
516 my @RemoteHost = (); # Accepted Hosts, suggest: localhost
517 my @RemoteAddr = (); # Accepted IP addresses, suggest: @RemoteAddr=('127.0.0.1')
518 my $DefaultRemoteAddr = '127.0.0.1'; # default, use localhost IP address
519 my $NONAME = 0; # if 1, do NOT ask for REMOTE_HOST (faster)
521 # Initialization ready
522 my %FrozenEnv = %ENV; # Freeze %ENV
524 # Store the whole Web Site in a hash table and use this RAM memory image (if non-zero)
526 # Empty function handlers for data compression
527 # In general, these must be redefined in the $beginarg
528 my $CompressRAMimage = sub { return shift;};
529 my $DecompressRAMimage = sub { return shift;};
531 # Execute shell CGI scripts when no -d, -e, or -x are supplied
532 my $ExecuteOSshell = 0; # Do you REALY want this? It is dangerous
534 #################################################################
536 # Configure CGIservlet with a setup file (overides the #
537 # default settings, but not the command line options). #
538 # Note that, if it exists, the setup file in the CGIscriptor #
539 # subdirectory is processed EVEN if there is a SETUP file #
540 # in the current directory. #
542 #################################################################
543 # There exists a CGIservlet subdirectory and it contains
544 # a CGIservletSETUP.pl file
545 if((-e
'./CGIservlet/CGIservletSETUP.pl') &&
546 open(SETUP
, '<./CGIservlet/CGIservletSETUP.pl'))
549 my $SetupCode = join("", <SETUP
>);
550 # 'Eval' is used to ensure that the values are entered in the current
551 # package (contrary to what 'do' and 'require' do).
552 (eval $SetupCode) || die "$! $@\n";
555 # There exists a CGIscriptor subdirectory and it contains
556 # a CGIservletSETUP.pl file
557 if((-e
'./CGIscriptor/CGIservletSETUP.pl') &&
558 open(SETUP
, '<./CGIscriptor/CGIservletSETUP.pl'))
561 my $SetupCode = join("", <SETUP
>);
562 # 'Eval' is used to ensure that the values are entered in the current
563 # package (contrary to what 'do' and 'require' do).
564 (eval $SetupCode) || die "$! $@\n";
567 # There is a CGIservletSETUP.pl file in the current directory
568 if((-e
'./CGIservletSETUP.pl') &&
569 open(SETUP
, '<./CGIservletSETUP.pl'))
572 my $SetupCode = join("", <SETUP
>);
573 # 'Eval' is used to ensure that the values are entered in the current
574 # package (contrary to what 'do' and 'require' do).
575 (eval $SetupCode) || die "-e $SetupCode: $! $@\n";
579 ######################################
581 # process arguments and defaults #
583 ######################################
585 while ($_ = shift(@ARGV))
590 $port = shift(@ARGV);
594 $doarg = shift(@ARGV);
596 elsif(/\-(x|qx|exec)/is) # Execute
598 $execarg = shift(@ARGV);
600 elsif(/\-b/is) # Begin
602 $beginarg = shift(@ARGV);
604 elsif(/^\-e/is) # Evaluate
606 $evalarg = shift(@ARGV);
608 elsif(/\-t/is) # Textroot
610 $textroot = shift(@ARGV);
612 elsif(/\-w/is) # Default welcome page
614 $welcome = shift(@ARGV);
616 elsif(/\-c/is) # Maximum Children
618 $MaxBrood = shift(@ARGV) || $MaxBrood;
620 elsif(/\-xtime/is) # Maximum running time
622 $MaxTime = shift(@ARGV) || $MaxTime;
624 elsif(/\-l/is) # Maximum Length
626 $MaxLength = shift(@ARGV) || $MaxLength;
628 elsif(/\-m/is) # Run from RAM
632 elsif(/\-a/is) # Aliases
634 while(@ARGV && $ARGV[0] !~ /^\-/) # while not a parameter
636 my $Alias = shift(@ARGV);
637 my $RealURL = $ARGV[0] !~ /^\-/ ?
shift(@ARGV) : "";
638 next unless $Alias && $RealURL;
640 # Simple straight translations
641 unless($Alias =~ m/[\$\Q^*&@!\?(){}[];:\E]/)
643 $AliasTranslation{$Alias} = $RealURL;
645 else # Full regular expressions
647 push(@RegAliasTranslation, $Alias);
648 push(@RegURLTranslation, $RealURL);
653 elsif(/\-r/is) # Remote host or address
655 while(@ARGV && $ARGV[0] !~ /^\-/) # while not a parameter
657 my $Remote = shift(@ARGV);
658 if($Remote =~ /[\d\.]+/) # A host IP address
660 push(@RemoteAddr, $Remote);
662 else # A host domain name, less secure
664 push(@RemoteHost, $Remote);
668 # Use the default Remote Host (Client) IP address (e.g., localhost)
669 # if no addresses or domain names are entered.
670 push(@RemoteAddr, $DefaultRemoteAddr) unless @RemoteAddr || @RemoteHost;
672 elsif(/^\-\-env/is) # Environment variables
674 while(@ARGV && $ARGV[0] !~ /^\-/) # while not a parameter
676 my $envlist = shift(@ARGV);
677 foreach my $envstring (split(',', $envlist))
679 my ($name, $value) = split('=', $envstring);
681 # Store the Environment variable
682 $UserEnv{$name} = $value;
686 elsif(/\-s/is) # Secure or not
688 $Secure = !$Secure; # Toggle blocking directory browsing and ASCII file access
690 elsif(/\-n/is) # Do NOT extract Remote host
694 else # perform unreliable magick without switches
696 if(/^[0-9]+$/ && $_ > 1024) # A (large) number must be a port
700 elsif(-T
&& /\.pl$/) # Text file with extension .pl is a Perl file
704 elsif(-T
&& /\.pm$/) # Text file with extension .pm is a Perl module file
708 elsif(-x
) # Executables can be executed
712 elsif(-d
) # A directory can only be the root
716 elsif(-T
&& /^\// && /\
.html
$/) # An html file path is the default path
720 elsif(-T
) # A text file is something to do
724 elsif(/[\s\{\`\[\@\%]/) # I give up, just try it
726 $evalarg = shift(@ARGV);
731 ################################################
733 # All argument values are known. #
734 # Initialize environment variables. #
735 # (should be accessible to eval($beginarg)) #
737 ################################################
740 $ENV{'SERVER_SOFTWARE'} = "$program $version";
741 $ENV{'GATEWAY_INTERFACE'} = "CGI/1.1";
742 $ENV{'SERVER_PORT'} = "$port";
743 $ENV{'CGI_HOME'} = $textroot;
744 $ENV{'SERVER_ROOT'} = $textroot; # Server Root Directory
745 $ENV{'DOCUMENT_ROOT'} = $textroot; # Server Root Directory
746 $ENV{'SCRIPT_NAME'} = $doarg.$execarg.$evalarg; # Combine executable arguments
748 $FrozenEnv{'SERVER_SOFTWARE'} = $ENV{'SERVER_SOFTWARE'};
749 $FrozenEnv{'GATEWAY_INTERFACE'} = $ENV{'GATEWAY_INTERFACE'};
750 $FrozenEnv{'SERVER_PORT'} = $ENV{'SERVER_PORT'};
751 $FrozenEnv{'CGI_HOME'} = $ENV{'CGI_HOME'};
752 $FrozenEnv{'SERVER_ROOT'} = $ENV{'SERVER_ROOT'}; # Server Root Directory
753 $FrozenEnv{'DOCUMENT_ROOT'} = $ENV{'DOCUMENT_ROOT'}; # Server Root Directory
754 $FrozenEnv{'SCRIPT_NAME'} = $ENV{'SCRIPT_NAME'}; # Combine executable arguments
756 ################################################
758 # The initial argument should be evaluated #
760 ################################################
762 eval($beginarg) if $beginarg;
764 ################################################
766 # The initial argument has been evaluated #
768 ################################################
770 # Socket related code
771 my $proto = getprotobyname('tcp');
772 $port = $1 if $port =~ /(\d+)/; # untaint port number
774 socket(Server
, PF_INET
, SOCK_STREAM
, $proto) || die "socket: $!";
775 setsockopt(Server
, &SOL_SOCKET
, &SO_REUSEADDR
,
776 pack("l", 1)) || die "setsockopt: $!";
777 bind(Server
, sockaddr_in
($port, INADDR_ANY
)) || die "bind: $!";
778 listen(Server
,SOMAXCONN
) || die "listen: $!";
781 # Report start of server
782 logmsg
"server started on port $port";
784 # Set up SIG vector (every signal will kill the process that receives it)
785 $SIG{CHLD
} = 'IGNORE';
786 $SIG{'KILL'} = "SigHandler";
787 $SIG{'TERM'} = "SigHandler";
788 $SIG{'QUIT'} = "SigHandler";
789 $SIG{'HUP'} = "SigHandler";
791 # Define text mime types served if no scripts are defined
792 # Note that the "text/osshell" mime-type is executed by CGIservlet ITSELF!
793 # You should remove it if you don't want that!
795 'HTML'=> "text/html",
796 'TXT' => "text/plain",
797 'PL' => "text/plain", # This is incorrect, of course
798 'JPG' => "image/jpeg",
799 'JPEG' => "image/jpeg",
800 'GIF' => "image/gif",
801 'AU' => "audio/basic",
802 'AIF' => "audio/aiff",
803 'AIFC' => "audio/aiff",
804 'AIFF' => "audio/aiff",
805 'GZ' => "application/gzip",
806 'TGZ' => "application/tar",
807 #'CGI' => "text/osshell", # Executes SERVER side shell scripts, HIGHLY DANGEROUS
808 'WAV' => "audio/wav",
809 'OGG' => "audio/x-vorbis",
810 'PDF' => "application/pdf",
811 'PS' => "application/postscript"
814 ################################################
816 # Fill the RAM image of the web site #
818 ################################################
820 my %WWWramImage = ();
824 my @WWWfilelist = `find $textroot ! -type l ! -type d -print`;
826 foreach $WWWfile (@WWWfilelist)
829 # Skip unsupported file types
830 $WWWfile =~ /\.(\w+)$/;
831 my $WWWfileExtension = uc($1);
832 next unless $mimeType{$WWWfileExtension};
833 # Store GnuZipped image of file
834 $WWWramImage{$WWWfile} = "";
835 open(FILEIN
, "<$WWWfile") || die "$WWWfile could not be opened: $!\n";
837 while(sysread(FILEIN
, $Buffer, 1024))
839 $WWWramImage{$WWWfile} .= $Buffer;
842 my $CompressedPtr = &$CompressRAMimage(\
${WWWramImage
{$WWWfile}});
843 $WWWramImage{$WWWfile} = $$CompressedPtr;
844 $TotalSize += length($WWWramImage{$WWWfile});
847 # Report size of Web RAM image
848 print STDERR
"Total number of $TotalSize bytes read in memory image\n";
851 ################################################
853 # The RAM image of the web site has been #
856 ################################################
858 # Map HTTP request parameters to Environment variables
859 # HTTP request => Environment variable
861 'content-length' => 'CONTENT_LENGTH', # Necessary for POST
862 'user-agent' => 'HTTP_USER_AGENT',
863 'accept' => 'HTTP_ACCEPT',
864 'content-type' => 'CONTENT_TYPE',
865 'auth-type' => 'AUTH_TYPE',
866 'ident' => 'REMOTE_IDENT',
867 'referer' => 'HTTP_REFERER',
868 'user' => 'REMOTE_USER',
869 'address' => 'REMOTE_ADDR',
870 'connection' => 'HTTP_CONNECTION',
871 'accept-language' => 'HTTP_ACCEPT_LANGUAGE',
872 'accept-encoding' => 'HTTP_ACCEPT_ENCODING',
873 'accept-charset' => 'HTTP_ACCEPT_CHARSET',
874 'host' => 'HTTP_HOST',
875 'cookie' => 'COOKIE_JAR'
878 ###############################################################################
880 # Now we start with the real work. When there is a request, get the required #
881 # values and fork a child to service it. #
883 ###############################################################################
886 my %StartTime = (); # Start time of the children
889 # When someone knocks on the door
894 if(!($paddr = accept(Client
,Server
)) ) # Knock knock
896 exit 1; # This went terrribly wrong
899 # Fork to child and parent
900 if(($child =fork()) == 0)
903 my ($port,$iaddr) = sockaddr_in
($paddr);
904 my $address = inet_ntoa
($iaddr); # The IP address of the Client
905 # The following is EXTREMELY slow and generally unnecessary.
906 # Use -n or set $NONAME = 1; if you don't need it.
907 my $name = $NONAME ?
'' : gethostbyaddr($iaddr,AF_INET
);
911 # Before doing anything else, check whether the client should be
913 # Is IP addr on the list?
914 if(@RemoteAddr && !grep(/^\Q$address\E/, @RemoteAddr))
916 print STDERR
"Reject $address $name\n";
919 # Is name on the list?
920 if(@RemoteHost && !grep(/\Q$name\E$/, @RemoteHost))
922 print STDERR
"Reject $name $address\n";
927 # Grab a line without using buffered input... Important for
928 # Post methods since they have to read the Client input stream.
933 alarm 120 ; # prevent deadly spin if other end goes away
934 while(sysread(Client
, $ch, 1)>0)
938 last if $HTTPlength > $MaxLength; # Protect against overflow
940 next if $ch eq "\r"; # skip <cr>
943 last unless $string =~ /\S/; # stop if empty line
944 push (@Input, split(' ', $string)); # Collect input in list
948 alarm 0; # clear alarm
951 foreach my $varname (keys(%FrozenEnv))
953 $ENV{$varname} = $FrozenEnv{$varname};
956 # Extract input arguments
957 my $method = shift(@Input);
958 my $Request = shift(@Input);
959 my $protocol = shift(@Input);
960 my ($Path, $QueryString) = split('\?', $Request);
965 while($HTTPparameter = lc(shift(@Input)))
967 chop($HTTPparameter);
968 $HTTPtable{$HTTPparameter} = "";
969 while(@Input && $Input[0] !~ /\:$/)
971 $HTTPtable{$HTTPparameter} .= " " if $HTTPtable{$HTTPparameter};
972 $HTTPtable{$HTTPparameter} .= shift(@Input);
975 # Host can get the :SERVER_PORT appended. Set the correct SERVER_PORT
976 # and remove it from the host.
977 if($HTTPtable{'host'})
979 # Store current port number
980 if($HTTPtable{'host'} =~ /\:(\d+)\s*$/)
982 $ENV{'SERVER_PORT'} = $1;
984 # Remove port number from host
985 $HTTPtable{'host'} =~ s/\:(\d+)\s*$//g;
988 # Translate the Aliases
989 $Path = GetAlias
($Path);
991 # HTTP servers should always add the default path
992 $Path = $welcome if !$Path || $Path eq '/'; # The common default path
994 # Set fixed environment variables
995 $ENV{'PATH_INFO'} = "$Path";
996 $ENV{'QUERY_STRING'} = "$QueryString";
997 $ENV{'PATH_TRANSLATED'} = "$textroot$Path";
998 $ENV{'SERVER_PROTOCOL'} = "$protocol";
999 $ENV{'REQUEST_METHOD'} = "$method";
1000 $ENV{'REMOTE_ADDR'} = "$address"; # The IP address of the Client
1001 $ENV{'REMOTE_HOST'} = "$name";
1003 # Load all request information in the %ENV.
1004 # MUST be done with a pre-defined list of parameter names (security).
1005 foreach $HTTPparameter (keys(%HTTPtype))
1007 my $Label = $HTTPtype{$HTTPparameter};
1008 # The following adds environment variables FROM THE REQUEST.
1009 # It is a VERY, VERY bad idea to just use the client supplied
1011 $ENV{$Label} = $HTTPtable{$HTTPparameter} unless exists($ENV{$Label});
1012 # (The last part prevents overwriting existing environment variables)
1015 # SECURITY: Check length of POST request. Stop if request is too long
1016 die if $HTTPlength + $ENV{'CONTENT_LENGTH'} > $MaxLength;
1018 # If POST requests are unwanted, they can be added tot the query string
1019 # NOTE: the method is set to GET if $POSTtoGET equals 'GET', otherwise,
1020 # the method stays POST and only the content length is set to 0
1021 if($POSTtoGET && $ENV{'REQUEST_METHOD'} =~ /^POST$/i)
1023 my $POSTlength = $ENV{'CONTENT_LENGTH'} || 0;
1026 # Add '&' if there is a query string already
1027 if($ENV{'QUERY_STRING'})
1029 # Before we add something to the string, check length again
1030 die if $HTTPlength + $ENV{'CONTENT_LENGTH'} + 1 > $MaxLength;
1032 $ENV{'QUERY_STRING'} .= '&';
1036 while($POSTlength > 0 && $ReadBytes > 0)
1039 $ReadBytes = sysread(Client
, $Read, $POSTlength);
1040 $ENV{'QUERY_STRING'} .= $Read;
1041 $POSTlength -= $ReadBytes;
1044 # All has been read, the content length becomes 0
1045 $ENV{'CONTENT_LENGTH'} = 0;
1047 $ENV{'REQUEST_METHOD'} = 'GET' if $POSTtoGET eq 'GET';
1050 # Reset User defience Env variables
1051 foreach my $varname (keys(%UserEnv))
1053 $ENV{$varname} = $UserEnv{$varname};
1055 # Clean out the User Environment variables in the child
1059 # Connect STDOUT and STDIN to the client
1060 open(STDIN
, "<&Client");
1061 open(STDOUT
, ">&Client");
1062 print STDOUT
"HTTP/1.1 200 OK\n"; # Supply HTTP protocol information
1063 print STDOUT
"Date: ".gmtime()." GMT\n"; # Current date
1064 print STDOUT
"Server: $program $version\n"; # This program
1065 print STDOUT
"Connection: close\n"; # Don't allow persistent connections
1067 # Start processing of request (note that ALL scripts will be executed if
1068 # present, i.e., if -d, -x, and -e are entered, they are alle processed).
1070 # If in memory-only mode, store the requested file in an environment
1071 # variable: CGI_FILE_CONTENTS
1072 undef($ENV{'CGI_FILE_CONTENTS'}); # Make sure the ENV var doesn't exist
1075 my $DecompressedPtr = &$DecompressRAMimage(\
${WWWramImage
{"$textroot$Path"}});
1076 $ENV{'CGI_FILE_CONTENTS'} = $$DecompressedPtr;
1077 # Decompression does not seem to work
1081 @ARGV = ("$textroot$Path", $QueryString);
1082 # This was suggested by Jochen_Hayek@ACM.org
1085 # The perl script should do the printing
1086 my ($return) = do "$doarg";
1088 warn "couldn't parse $doarg: $@" if $@
;
1089 warn "couldn't $doarg: $!" unless defined $return;
1090 warn "couldn't run $doarg" unless $return;
1093 # evaluate perl command
1094 print STDOUT
eval($evalarg) if $evalarg;
1096 # execute shell command
1099 my $shellscript = $execarg;
1101 # Attempts to use Paths or Queries containing '-quotes are rejected.
1102 # Executing these would compromise security.
1103 die "Quotes in path: $textroot$Path\n" if "$textroot$Path" =~ /\'/;
1104 $shellscript .= " '$textroot$Path'" if $Path;
1106 die "Quotes in query: $QueryString\n" if $QueryString =~ /\'/;
1107 $shellscript .= " '$QueryString'" if $QueryString;
1108 $shellscript = qx{$shellscript};
1109 print STDOUT
$shellscript;
1112 # Output files if no scripts are given (actually, this should be
1113 # handled by a script). Unknown mimetypes are killed.
1114 # This is more or less a functional (dynamic) Web server in itself.
1115 unless($doarg || $execarg || $evalarg) # Request not already handled
1117 die ".. trick: $address $name $Path $QueryString\n"
1118 if $Path =~ m@\
.\
./@
; # No tricks!
1120 # Handle mime-types and directory browsing
1121 $Path =~ /\.([\w]+)$/; # Get extension
1122 my $extension = uc($1);
1123 my $browse = ($Path =~ m@
/\s
*$@
|| -d
"$textroot$Path") ?
1 : 0;
1124 my $mime = $browse ?
"" : $mimeType{$extension};
1126 # Serve up text and binary files unless they the $Secure option is given
1127 $mime = "text/plain" if !$mime && !$browse && (-T
"$textroot$Path") && !$Secure;
1128 $mime = "application/octet-stream" if !$mime && !$browse && (-B
"$textroot$Path") && !$Secure;
1130 # Remove final / in directory paths
1131 $Path =~ s@
/\s
*$@
@g;
1133 # Block illegal mime-types
1134 die "Illegal mime type:$extension\n" unless $mime || $browse; # illegal mime's are killed
1136 # Print out the document
1137 if(($mime eq 'text/osshell') && $ExecuteOSshell) # Don't use this unless you know what you're doing
1139 # Note that CGI scripts must supply their own content type
1140 # Some rudimentary security tests
1141 # Kill child if the path contains any non-URL characters
1142 die "ATTACK: ADDR:$ENV{'REMOTE_ADDR'} HOST:$ENV{'REMOTE_HOST'} URL=$Path '$QueryString'\n"
1143 if $Path =~ m@
[^\w\
-\
.\
/]@
; # Exclusive list of allowed characters
1144 # If you want to execute server side shell scripts, use the 'text/osshell'
1145 # mime-type (see above) but remember that there is NO SECURITY implemented
1147 # IF YOU DIDN'T GET THE MESSAGE YET, YOU COULD NOW OPEN YOUR COMPUTER TO THE WHOLE
1148 # INTERNET TO PLAY WITH!
1149 # Plain Web site from DISK
1150 unless($UseRAMimage)
1152 print STDOUT
`$textroot$Path`; # This is Russian Roulette
1154 else # Use a RAM image of the web site
1156 my $ShellInterpreter = '/usr/bin/sh';
1157 if($ENV{'CGI_FILE_CONTENTS'} =~ /^\#\!\s*([^\r\n]+)/isg)
1159 $ShellInterpreter = $1;
1161 # Execute shell script
1162 open(RAMOUT
, "| $ShellInterpreter") || die "ERROR open RAMOUT $ShellInterpreter $textroot$Path $! $@\n";
1163 (print RAMOUT
$ENV{'CGI_FILE_CONTENTS'}) || die "ERROR print RAMOUT $ShellInterpreter $textroot$Path $! $@\n";
1169 # Content-type and document
1170 print STDOUT
"Content-type: $mime\n\n";
1171 # Plain Web site from DISK
1172 unless($UseRAMimage)
1175 my $number_of_bytes = 0;
1176 open(BINARY
, "<$textroot$Path") || die "$textroot$Path: $!";
1178 # read and write block of 1024 bytes
1179 while($number_of_bytes = sysread(BINARY
, $String, 1024))
1181 syswrite(STDOUT
, $String, $number_of_bytes); # Actually print the file content
1185 # Alternative output using the UNIX shell
1186 # print STDOUT `cat '$textroot$Path'`; # lazy, let the OS do the work
1188 else # Use a RAM image of the web site
1190 print STDOUT
$ENV{'CGI_FILE_CONTENTS'};
1194 elsif($browse && !$Secure) # Block directory browsing in the Secure setup
1196 # Content-type and document
1197 print STDOUT
"Content-type: text/html\n\n";
1198 opendir(BROWSE
, "$textroot$Path") || die "<$textroot$Path: $!\n";
1200 print "<HTML>\n<HEAD>\n<TITLE>$Path</TITLE></HEAD>\n<BODY>\n<H1>$Path</H1>\n<pre>\n<dl>";
1203 foreach $DirEntry (sort {lc($a) cmp lc($b)} readdir(BROWSE
))
1205 my $CurrentPath = $Path;
1207 if($DirEntry eq '..')
1209 my $ParentDir = $CurrentPath;
1210 $ParentDir =~ s@
/[^/]+$@
@g;
1211 $ParentDir = '/' unless $ParentDir;
1212 print "<dt> <a href='$ParentDir'><h3>Parent directory</h3></a></dt>\n";
1214 next if $DirEntry !~ /[^\.\/\\\
:]/;
1217 my $Alias = GetAlias
("$CurrentPath/$DirEntry");
1218 if($Alias ne "$CurrentPath/$DirEntry")
1220 $Alias =~ m@
/([^/]+)$@
;
1225 my $Date = localtime($^T - (-M "$textroot$CurrentPath/$DirEntry")*3600*24);
1226 my $Size = -s "$textroot$CurrentPath/$DirEntry";
1227 $Size = sprintf("%6.0F kB", $Size/1024);
1228 my $Type = `file
$textroot$CurrentPath/$DirEntry`;
1229 $Type =~ s@\s*$textroot$CurrentPath/$DirEntry\s*\:\s*@@ig;
1231 print "<dt> <a href='$CurrentPath/$DirEntry'>";
1232 printf("%-40s", $DirEntry."</a>");
1233 print "\t$Size\t$Date\t$Type</dt>\n";
1236 print "</dl></pre></BODY>\n</HTML>\n";
1241 close(STDOUT) || die "STDOUT: $!\n";
1242 close(STDIN) || die "STDIN: $!\n";
1243 close(Client) || die "Client: $!\n";
1245 exit 0; # Kill Child
1250 # parent code...some systems will have to worry about waiting
1251 # before they can actually close the link to the Client
1252 my $current_time = time();
1254 # Determine which of the children are actually still alive
1255 # and kill those that have run for too long (probably not connected anymore)
1256 my @old_brood = @brood;
1257 @brood = (); # empty brood
1258 foreach (@old_brood)
1260 # Kill the child if it runs for longer than MaxTime
1261 if(($StartTime{$_} - $current_time) > $MaxTime)
1266 # Store children that are alive
1267 if(kill (0, $_)) # Alive?
1273 delete($StartTime{$_});
1277 # Weed out overflow of children (zombies etc.), keep pid for
1278 # removing the StartTime later on
1280 for($oldest=0; $oldest < scalar(@brood)-$MaxBrood; ++$oldest)
1282 kill "KILL", $brood[$oldest] if $brood[$oldest]; # Remove
1285 # Child pid could be recycled, i.e., $child could be stored
1286 # in @brood already. Remove it
1287 @brood = grep($_ != $child, @brood);
1289 # Push new child on the list
1290 push (@brood, $child);
1291 $StartTime{$child} = $current_time;
1293 close Client; # This is it, ready!
1297 # Interupt handler for shutting down
1304 # Subroutine for Aliases
1305 # Uses Global variables: %AliasTranslation, @RegAliasTranslation, and @RegURLTranslation
1306 sub GetAlias # ($Path)->AliasURL
1310 # Translate the Aliases
1311 if($AliasTranslation{$Path})
1313 $Path = $AliasTranslation{$Path};
1315 elsif(@RegAliasTranslation)
1318 for($i=0; $i<scalar(@RegAliasTranslation); ++$i)
1320 my $Alias = $RegAliasTranslation[$i];
1321 my $RealURL = $RegURLTranslation[$i];
1322 last if ($Path =~ s#$Alias#$RealURL#g);
1330 CGIservlet - a HTTPd "connector" for running CGI scripts on unix systems as WWW
1331 accessible Web sites.
1335 The servlet starts a true HTTP daemon that channels
1336 HTTP requests to forked daughter processes. Can run
1337 a (small) WWW-site from memory.
1341 Whenever an HTTP request is received, the specified CGI script is
1342 started inside a child process as if it was inside a real server (e.g.,
1343 Apache). The evironment variables are set more or less as in Apache.
1344 Note that CGIservlet only uses a SINGLE script for ALL requests.
1345 No attemps for security are made, it is the script's responsibility to
1346 check access rights and the validity of the request.
1347 Can store the files of Web site in memory and serve them
1350 =head1 PREREQUISITES
1352 This script requires the C<strict>, Socket and Carp modules.
1360 =pod SCRIPT CATEGORIES