<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
 "http://www.w3.org/TR/REC-html40/loose.dtd">
<TITLE>Common Gateway Interface - 1.1 *Draft 03* [http://cgi-spec.golux.com/draft-coar-cgi-v11-03-clean.html]</TITLE>
<!--#if expr="$HTTP_USER_AGENT != /Lynx/" -->
<!--#set var="GUI" value="1" -->
<LINK HREF="mailto:Ken.Coar@Golux.Com" rev="revised">
<LINK REL="STYLESHEET" HREF="cgip-style-rfc.css" TYPE="text/css">
<META name="latexstyle" content="rfc">
<META name="author" content="Ken A L Coar">
<META name="institute" content="IBM Corporation">
<META name="date" content="25 June 1999">
<META name="expires" content="Expires 31 December 1999">
<META name="document" content="INTERNET-DRAFT">
<META name="file" content="&lt;draft-coar-cgi-v11-03.txt&gt;">
<META name="group" content="INTERNET-DRAFT">
<!--
  There are a lot of BNF fragments in this document. To make it work
  in all possible browsers (including Lynx, which is used to turn it
  into text/plain), we handle these by using PREformatted blocks with
  a universal internal margin of 2, inside one-level DL blocks.
-->
<!--
  HTML doesn't do paper pagination, so we need to fake it out. Basing
  our formatting upon RFC2068, there are four (4) lines of header and
  four (4) lines of footer for each page.

  Coar, et al.          CGI/1.1 Specification          May, 1998
  INTERNET-DRAFT        Expires 1 December 1998        [Page 2]
-->
<!--
  The following weirdness wrt non-breaking spaces is to get Lynx
  (which is barely TABLE-aware) to line the left/right justified
-->
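<TABLE WIDTH="100%" CELLPADDING=0 CELLSPACING=0>
 <TR><TD ALIGN="LEFT">INTERNET-DRAFT</TD><TD ALIGN="RIGHT">Ken A L Coar</TD></TR>
 <TR><TD ALIGN="LEFT">draft-coar-cgi-v11-03.{html,txt}</TD><TD ALIGN="RIGHT">IBM Corporation</TD></TR>
 <TR><TD ALIGN="LEFT">&nbsp;</TD><TD ALIGN="RIGHT">D.R.T. Robinson</TD></TR>
 <TR><TD ALIGN="LEFT">&nbsp;</TD><TD ALIGN="RIGHT">E*TRADE UK Ltd.</TD></TR>
 <TR><TD ALIGN="LEFT">&nbsp;</TD><TD ALIGN="RIGHT">25 June 1999</TD></TR>
</TABLE>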
The WWW Common Gateway Interface
<!--#include virtual="I-D-statement" -->
The Common Gateway Interface (CGI) is a simple interface for running
external programs, software or gateways under an information server
in a platform-independent manner. Currently, the supported information
servers are HTTP servers.
The interface has been in use by the World-Wide Web since 1993. This
specification defines the "current practice" parameters of the
'CGI/1.1' interface developed and documented at the U.S. National
Center for Supercomputing Applications [NCSA-CGI].
This document also defines the use of the CGI/1.1 interface
on the Unix and AmigaDOS(tm) systems.
Discussion of this draft occurs on the CGI-WG mailing list; see
<SAMP>&lt;URL:<A HREF="http://CGI-Spec.Golux.Com/">http://CGI-Spec.Golux.Com/</A>&gt;</SAMP>
for details on the mailing list and the status of the project.
<!--#if expr="$GUI" -->
The revision history of this draft is being maintained using Web-based
GUI notation, such as struck-through characters and colour-coded
sections. The following legend describes how to determine the origin
of a particular revision according to the colour of the text:
<DD>Revision 00, released 28 May 1998
<DD>Revision 01, released 28 December 1998
Major structure change: Section 4, "Request Metadata (Meta-Variables)"
was moved entirely under <A HREF="#7.0">Section 7</A>, "Data Input to the
CGI Script."
Due to the size of this change, it is noted here and the text in its
former location does <EM>not</EM> appear as struck through. This has
caused major <A HREF="#6.0">sections 5</A> and following to decrement
large text movements are likewise not marked up. References to RFC
1738 were changed to 2396 (1738's replacement).
<DD>Revision 02, released 2 April 1999
Added text to <A HREF="#8.3">section 8.3</A> defining correct handling
of requests using "chunked" Transfer-Encoding. Labelled metavariable
names in <A HREF="#8.0">section 8</A> with the appropriate detail section
numbers.
Clarified allowed usage of <SAMP>Status</SAMP> and
<SAMP>Location</SAMP> response header fields. Included new
Internet-Draft language.
<DD>Revision 03, released 25 June 1999
Changed references from "HTTP" to "Protocol-Specific" for the listing of
things like HTTP_ACCEPT. Changed 'entity-body' and 'content-body' to
'message-body.' Added a note that response headers must comply with
requirements of the protocol level in use. Added a lot of stuff about
security (section 11). Clarified a bunch of productions. Pointed out
that zero-length and omitted values are indistinguishable in this
specification. Clarified production describing order of fields in
script response header. Clarified issues surrounding encoding of
data. Acknowledged additional contributors, and changed one of
the authors' addresses.
1 Introduction
 1.1 Purpose
 1.2 Requirements
 1.3 Specifications
 1.4 Terminology
2 Notational Conventions and Generic Grammar
 2.1 Augmented BNF
 2.2 Basic Rules
3 Protocol Parameters
 3.1 URL Encoding
 3.2 The Script-URI
4 Invoking the Script
5 The CGI Script Command Line
6 Data Input to the CGI Script
 6.1 Request Metadata (Metavariables)
  6.1.1 AUTH_TYPE
  6.1.2 CONTENT_LENGTH
  6.1.3 CONTENT_TYPE
  6.1.4 GATEWAY_INTERFACE
  6.1.5 Protocol-Specific Metavariables
  6.1.6 PATH_INFO
  6.1.7 PATH_TRANSLATED
  6.1.8 QUERY_STRING
  6.1.9 REMOTE_ADDR
  6.1.10 REMOTE_HOST
  6.1.11 REMOTE_IDENT
  6.1.12 REMOTE_USER
  6.1.13 REQUEST_METHOD
  6.1.14 SCRIPT_NAME
  6.1.15 SERVER_NAME
  6.1.16 SERVER_PORT
  6.1.17 SERVER_PROTOCOL
  6.1.18 SERVER_SOFTWARE
 6.2 Request Message-Bodies
7 Data Output from the CGI Script
 7.1 Non-Parsed Header Output
 7.2 Parsed Header Output
  7.2.1 CGI header fields
   7.2.1.1 Content-Type
   7.2.1.2 Location
   7.2.1.3 Status
   7.2.1.4 Extension header fields
  7.2.2 HTTP header fields
8 Server Implementation
 8.1 Requirements for Servers
  8.1.1 Script-URI
  8.1.2 Request Message-body Handling
  8.1.3 Required Metavariables
  8.1.4 Response Compliance
 8.2 Recommendations for Servers
 8.3 Summary of Metavariables
9 Script Implementation
 9.1 Requirements for Scripts
 9.2 Recommendations for Scripts
10 System Specifications
 10.1 AmigaDOS
 10.2 Unix
11 Security Considerations
 11.1 Safe Methods
 11.2 HTTP Header Fields Containing Sensitive Information
 11.3 Script Interference with the Server
 11.4 Data Length and Buffering Considerations
 11.5 Stateless Processing
12 Acknowledgments
13 References
14 Authors' Addresses
Together the HTTP [<A HREF="#[3]">3</A>, <A HREF="#[8]">8</A>] server
and the CGI script are responsible for servicing a client request by
sending back responses. The client request comprises a Universal
Resource Identifier (URI) [<A HREF="#[1]">1</A>], a request method, and
various ancillary information about the request provided by the
transport mechanism.
The CGI defines the abstract parameters, known as metavariables,
which describe the client's request. Together with a
concrete programmer interface this specifies a platform-independent
interface between the script and the HTTP server.
This specification uses the same words as RFC 1123
[<A HREF="#[5]">5</A>] to define the
significance of each particular requirement. These are:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
<DT><EM>must</EM>
This word or the adjective 'required' means that the item is an
absolute requirement of the specification.
<DT><EM>should</EM>
This word or the adjective 'recommended' means that there may
exist valid reasons in particular circumstances to ignore this
item, but the full implications should be understood and the case
carefully weighed before choosing a different course.
<DT><EM>may</EM>
This word or the adjective 'optional' means that this item is
truly optional. One vendor may choose to include the item because
a particular marketplace requires it or because it enhances the
product, for example; another vendor may omit the same item.
An implementation is not compliant if it fails to satisfy one or more
of the 'must' requirements for the protocols it implements. An
implementation that satisfies all of the 'must' and all of the
'should' requirements for its features is said to be 'unconditionally
compliant'; one that satisfies all of the 'must' requirements but not
all of the 'should' requirements for its features is said to be
'conditionally compliant.'
Not all of the functions and features of the CGI are defined in the
main part of this specification. The following phrases are used to
describe the features which are not specified:
<DT><EM>system defined</EM>
The feature may differ between systems, but must be the same for
different implementations using the same system. A system will
usually identify a class of operating-systems. Some systems are
defined in section 10 of this document.
New systems may be defined
by new specifications without revision of this document.
<DT><EM>implementation defined</EM>
The behaviour of the feature may vary from implementation to
implementation, but a particular implementation must document its
This specification uses many terms defined in the HTTP/1.1
specification [<A HREF="#[8]">8</A>]; however, the following terms are
used here in a sense which may not accord with their definitions in
that document, or with their common meaning.
<DT><EM>metavariable</EM>
A named parameter that carries information from the server to the
script. It is not necessarily a variable in the operating-system's
environment, although that is the most common implementation.
<DT><EM>script</EM>
The software which is invoked by the server <EM>via</EM> this
interface. It need not be a standalone program, but could be a
dynamically-loaded or shared library, or even a subroutine in the
server. It <EM>may</EM> be a set of statements
interpreted at run-time, as the term 'script' is frequently
understood, but that is not a requirement and within the context
of this specification the term has the broader definition stated.
<DT><EM>server</EM>
The application program which invokes the script in order to service
2. Notational Conventions and Generic Grammar
All of the mechanisms specified in this document are described in
both prose and an augmented Backus-Naur Form (BNF) similar to that
used by RFC 822 [<A HREF="#[6]">6</A>]. This augmented BNF contains
the following constructs:
<DT>name = definition
definition by the equal character ("="). Whitespace is only
significant in that continuation lines of a definition are
Quotation marks (") surround literal text, except for a literal
quotation mark, which is surrounded by angle-brackets ("<" and ">").
Unless stated otherwise, the text is case-sensitive.
Alternative rules are separated by a vertical bar ("|").
<DT>(rule1 rule2 rule3)
Elements enclosed in parentheses are treated as a single element.
A rule preceded by an asterisk ("*") may have zero or more
occurrences. A rule preceded by an integer followed by an asterisk
must occur at least the specified number of times.
An element enclosed in square
brackets ("[" and "]") is optional.
The following rules are used throughout this specification to
describe basic parsing constructs.
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  alpha         = lowalpha | hialpha
  alphanum      = alpha | digit
  lowalpha      = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h"
                | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p"
                | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x"
                | "y" | "z"
  hialpha       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H"
                | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P"
                | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X"
                | "Y" | "Z"
  digit         = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7"
                | "8" | "9"
  hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a"
                | "b" | "c" | "d" | "e" | "f"
  escaped       = "%" hex hex
  OCTET         = <any 8-bit sequence of data>
  CHAR          = <any US-ASCII character (octets 0 - 127)>
  CTL           = <any US-ASCII control character
                  (octets 0 - 31) and DEL (127)>
  CR            = <US-ASCII CR, carriage return (13)>
  LF            = <US-ASCII LF, linefeed (10)>
  SP            = <US-ASCII SP, space (32)>
  HT            = <US-ASCII HT, horizontal tab (9)>
  tspecial      = "(" | ")" | "@" | "," | ";" | ":" | "\" | <">
                | "/" | "[" | "]" | "?" | "<" | ">" | "{" | "}"
  token         = 1*<any CHAR except CTLs or tspecials>
  quoted-string = ( <"> *qdtext <"> ) | ( "<" *qatext ">")
  qdtext        = <any CHAR except <"> and CTLs but including LWSP>
  qatext        = <any CHAR except "<", ">" and CTLs but
                  including LWSP>
  mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
  unreserved    = alphanum | mark
  reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" |
  uric          = reserved | unreserved | escaped
Note that newline (NL) need not be a single character, but can be a
3. Protocol Parameters
Some variables and constructs used here are described as being
'URL-encoded'. This encoding is described in section
[<A HREF="#[4]">4</A>].
An alternate "shortcut" encoding for representing the space
character exists and is in common use. Scripts MUST be prepared to
recognise both '+' and '%20' as an encoded space in a
Note that some unsafe characters may have different semantics if
they are encoded. The definition of which characters are unsafe
depends on the context.
For example, the following two URLs do not
necessarily refer to the same resource:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  http://somehost.com/somedir%2Fvalue
  http://somehost.com/somedir/value
See RFC 2396 [<A HREF="#[4]">4</A>]
for authoritative treatment of this issue.
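The two points made in this section can be sketched in a few lines of Python. This is an illustration only, not part of the specification: the helper names `decode_form_value` and `split_then_decode` are hypothetical, and the code assumes that a script splits a path on unencoded "/" characters before it decodes each segment.

```python
from urllib.parse import unquote, unquote_plus


def decode_form_value(raw):
    """Decode a URL-encoded value, treating both '+' and '%20' as a space."""
    return unquote_plus(raw)


def split_then_decode(path):
    """Split on unencoded '/' first, then decode each segment.

    Decoding before splitting would turn '%2F' into a path separator
    and so change the meaning of the URL.
    """
    return [unquote(segment) for segment in path.split("/")]
```

With these helpers, `somedir%2Fvalue` yields one segment containing a literal slash, while `somedir/value` yields two segments, which is why the two URLs shown in this section need not name the same resource.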
The 'Script-URI' is defined as the URI of the resource identified
by the metavariables. Often,
this URI will be the same as
the URI requested by the client (the 'Client-URI'); however, it need
not be. Instead, it could be a URI invented by the server, and so it
can only be used in the context of the server and its CGI interface.
The Script-URI has the syntax of generic-RL as defined in section 2.1
of RFC 1808 [<A HREF="#[7]">7</A>], with the exception that object
fragment identifiers are not permitted:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  <scheme>://<host>:<port>/<path>?<query>
The various components of the
are defined by some of the
<A HREF="#4.0">section 4</A>
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  script-uri = protocol "://" SERVER_NAME ":" SERVER_PORT enc-script
               enc-path-info "?" QUERY_STRING
where 'protocol' is obtained
from SERVER_PROTOCOL, 'enc-script' is a
URL-encoded version of SCRIPT_NAME and 'enc-path-info' is a
URL-encoded version of PATH_INFO. See
<A HREF="#4.6">section 4.6</A> for more information about the PATH_INFO
Note that the scheme and the protocol are <EM>not</EM> identical;
for instance, a resource accessed <EM>via</EM> an SSL mechanism
may have a Client-URI with a scheme of "<SAMP>https</SAMP>"
rather than "<SAMP>http</SAMP>". CGI/1.1 provides no means
for the script to reconstruct this, and therefore
the Script-URI includes the base protocol used.
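As a rough sketch of the script-uri production, the following Python function assembles a Script-URI from a dictionary of metavariables. The function name `build_script_uri` is hypothetical. Two choices here are assumptions rather than requirements of this specification: the protocol name is taken as the part of SERVER_PROTOCOL before the "/" and is lower-cased, and the "?" is omitted when QUERY_STRING is empty.

```python
from urllib.parse import quote


def build_script_uri(env):
    """Assemble a Script-URI from CGI metavariables held in a dict."""
    # Assumption: "HTTP/1.1" -> "http"; the draft only says the protocol
    # is obtained from SERVER_PROTOCOL.
    protocol = env["SERVER_PROTOCOL"].split("/", 1)[0].lower()
    # quote() leaves "/" unencoded by default, so path structure survives.
    enc_script = quote(env.get("SCRIPT_NAME", ""))
    enc_path_info = quote(env.get("PATH_INFO", ""))
    uri = "{}://{}:{}{}{}".format(
        protocol, env["SERVER_NAME"], env["SERVER_PORT"],
        enc_script, enc_path_info,
    )
    # QUERY_STRING is already URL-encoded, so it is appended unchanged.
    query = env.get("QUERY_STRING", "")
    if query:
        uri += "?" + query
    return uri
```

Because the result carries the base protocol rather than the client's scheme, a request that arrived over SSL still produces an "http" URI here.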
4. Invoking the Script
script is invoked in a system defined manner. Unless specified
otherwise, the file containing the script will be invoked as an
5. The CGI Script Command Line
Some systems support a method for supplying an array of strings to
the CGI script. This is only used in the case of an 'indexed' query.
This is identified by a "GET" or "HEAD" HTTP request with a URL
search string not containing any unencoded "=" characters. For such a
request, servers SHOULD parse the search string
into words, using the following rules:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  search-string = search-word *( "+" search-word )
  search-word   = 1*schar
  schar         = xunreserved | escaped | xreserved
  xunreserved   = alpha | digit | xsafe | extra
  xsafe         = "$" | "-" | "_" | "."
  xreserved     = ";" | "/" | "?" | ":" | "@" | "&"
After parsing, each word is URL-decoded, optionally encoded in a
system defined manner,
and then the argument list is set to the list of words.
If the server cannot create any part of the argument list, then the
server SHOULD NOT generate any command line information. For example, the
number of arguments may be greater than operating system or server
limitations permit, or one of the words may not be representable as an
argument.
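The word-splitting rules for an indexed query can be sketched as follows. This is a minimal illustration, not a definitive server implementation: the function name `indexed_query_args` is hypothetical, the optional system defined encoding step is skipped, and the sketch does not validate each character against the schar production.

```python
from urllib.parse import unquote


def indexed_query_args(method, query_string):
    """Return the argument words for an 'indexed' query, or None.

    None means that no command line information should be generated.
    """
    if method not in ("GET", "HEAD"):
        return None
    # An unencoded "=" means this is not an indexed query; "%3D" is fine.
    if not query_string or "=" in query_string:
        return None
    words = query_string.split("+")
    # search-word = 1*schar, so an empty word cannot be represented.
    if any(word == "" for word in words):
        return None
    # Each word is URL-decoded only after the split on "+".
    return [unquote(word) for word in words]
```

Returning None for the whole list, rather than a partial list, mirrors the rule that a server which cannot create any part of the argument list should generate no command line information at all.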
Scripts SHOULD check to see if the QUERY_STRING value contains an
unencoded "=" character, and SHOULD NOT use the command line arguments
if it does.
6. Data Input to the CGI Script
Information about a request comes from two different sources: the
request header, and any associated
make portions of this information available to
6.1. Request Metadata
implementation MUST define a mechanism
to pass data about the request from
the server to the script.
The metavariables containing these
are accessed by the script in a system
representation of the characters in the
This specification does not distinguish between the representation of
null values and missing ones. Whether null or missing values
(such as a query component of "?" or "", respectively) are represented
by undefined metavariables or by metavariables with values of "" is
implementation-defined.
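Since null and missing values are not distinguished, a script can normalise both to the empty string when it reads a metavariable. The sketch below assumes the common case in which metavariables are exposed as environment variables; the helper name `metavariable` is hypothetical.

```python
import os


def metavariable(name, environ=None):
    """Return a metavariable's value, mapping 'undefined' to ''."""
    env = os.environ if environ is None else environ
    # An undefined variable and one set to "" are treated identically.
    return env.get(name, "")
```

A script written this way behaves the same on a server that omits QUERY_STRING and on one that sets it to "".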
Case is not significant in the metavariable
names, in that there cannot be two metavariables
whose names differ in case only. Here they are
shown using a canonical representation of capitals plus underscore
("_"). The actual representation of the names is system defined; for
a particular system the representation MAY be defined differently
considered case-sensitive except as noted
defined by this specification are:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
Metavariables with names beginning with the protocol name (<EM>e.g.</EM>,
"HTTP_ACCEPT") are also canonical in their description of request header
fields. The number and meaning of these fields may change independently
of this specification. (See also <A HREF="#6.1.5">section 6.1.5</A>.)
This variable is specific to requests made
required access authentication for external
access, then the server
from the '<SAMP>auth-scheme</SAMP>' token in
the request's "<SAMP>Authorization</SAMP>" header
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  AUTH_TYPE   = "" | auth-scheme
  auth-scheme = "Basic" | "Digest" | token
HTTP access authentication schemes are described in section 11 of the
HTTP/1.1 specification [<A HREF="#[8]">8</A>]. The auth-scheme is
provide this metavariable
to scripts if the request
header included an "<SAMP>Authorization</SAMP>" field
that was authenticated.
6.1.2. CONTENT_LENGTH
size of the message-body
entity attached to the request, if any, in decimal
number of octets. If no data are attached, then this
metavariable is either NULL or not
defined. The syntax is the same as for
the HTTP "<SAMP>Content-Length</SAMP>" header field (section 14.14, HTTP/1.1
specification [<A HREF="#[8]">8</A>]).
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  CONTENT_LENGTH = "" | 1*digit
Servers MUST provide this metavariable
to scripts if the request
was accompanied by a
message-body entity.
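A script might use CONTENT_LENGTH along the following lines. This is a sketch under stated assumptions: the function name `read_request_body` is hypothetical, the message-body is assumed to arrive on standard input (as on the Unix system), and a short read is treated as an error, which is a design choice rather than a rule of this specification.

```python
import sys


def read_request_body(environ, stream=None):
    """Read exactly CONTENT_LENGTH octets of message-body, if any."""
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        # NULL or undefined: no data are attached to the request.
        return b""
    # CONTENT_LENGTH = "" | 1*digit
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("malformed CONTENT_LENGTH: %r" % raw)
    length = int(raw)
    source = sys.stdin.buffer if stream is None else stream
    # Read no more than the declared length; do not wait for end-of-file.
    body = source.read(length)
    if len(body) != length:
        raise ValueError("message-body shorter than CONTENT_LENGTH")
    return body
```

Passing a file-like object as `stream` makes the function easy to exercise without a server, for example with an in-memory byte buffer.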
If the request includes a
the Internet Media Type
[<A HREF="#[9]">9</A>] of the attached
entity if the type was provided <EM>via</EM>
a "<SAMP>Content-Type</SAMP>" field in the
request header, or if the server can determine it in the absence
of a supplied "<SAMP>Content-Type</SAMP>" field. The syntax is the
same as for the HTTP
"<SAMP>Content-Type</SAMP>" header field.
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  CONTENT_TYPE = "" | media-type
  media-type   = type "/" subtype *( ";" parameter)
  parameter    = attribute "=" value
  value        = token | quoted-string
and parameter attribute names are not
case-sensitive. Parameter values MAY be case sensitive.
Media types and their use in HTTP are described
in section 3.7 of the
HTTP/1.1 specification [<A HREF="#[8]">8</A>].
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  application/x-www-form-urlencoded
There is no default value for this variable. If and only if it is
unset, then the script MAY attempt to determine the media type from
the data received. If the type remains unknown, then
the script MAY choose to either assume a
<SAMP>application/octet-stream</SAMP>
or reject the request with a 415 ("Unsupported Media Type")
error. See <A HREF="#7.2.1.3">section 7.2.1.3</A>
for more information about returning error status values.
Servers MUST provide this metavariable to scripts if
a "<SAMP>Content-Type</SAMP>" field was present
in the original request header. If the server receives a request
with an attached entity but no "<SAMP>Content-Type</SAMP>"
header field, it MAY attempt to
determine the correct datatype, or it MAY omit this
metavariable when
communicating the request information to the script.
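The media-type syntax and its case rules can be illustrated with a small parser. This is a simplified sketch: the function name `parse_content_type` is hypothetical, parameter values containing ";" inside a quoted string are not handled, and only the double-quoted form of quoted-string is unwrapped.

```python
def parse_content_type(value):
    """Split a CONTENT_TYPE value into (media_type, parameters)."""
    if value == "":
        # There is no default value; the caller decides what to do.
        return None, {}
    parts = [part.strip() for part in value.split(";")]
    # Type and subtype are not case-sensitive, so normalise them.
    media_type = parts[0].lower()
    if media_type.count("/") != 1:
        raise ValueError("malformed media type: %r" % value)
    params = {}
    for part in parts[1:]:
        if not part:
            continue
        attribute, _, param_value = part.partition("=")
        param_value = param_value.strip()
        if len(param_value) >= 2 and param_value[0] == '"' and param_value[-1] == '"':
            param_value = param_value[1:-1]
        # Attribute names are not case-sensitive; values may be.
        params[attribute.strip().lower()] = param_value
    return media_type, params
```

A script that receives `(None, {})` can then choose between sniffing the data, assuming application/octet-stream, or rejecting the request with a 415 status, as described above.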
6.1.4. GATEWAY_INTERFACE
the dialect of CGI being used
by the server to communicate with the script.
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  GATEWAY_INTERFACE = "CGI" "/" major "." minor
Note that the major and minor numbers are treated as separate
integers and hence each may be incremented higher than a single
digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
is lower than CGI/12.3. Leading zeros in either
the major or the minor number MUST be ignored by scripts and
SHOULD NOT be generated by servers.
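The ordering rule is easy to get wrong if the version is compared as a string or as a decimal fraction. A minimal sketch, using a hypothetical helper named `parse_gateway_interface`, parses the two numbers as separate integers so that tuples compare correctly:

```python
def parse_gateway_interface(value):
    """Parse 'CGI/major.minor' into a (major, minor) tuple of integers."""
    name, _, version = value.partition("/")
    if name != "CGI":
        raise ValueError("not a CGI version string: %r" % value)
    major, _, minor = version.partition(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError("malformed version: %r" % value)
    # int() discards leading zeros, as scripts are required to do.
    return int(major), int(minor)
```

Comparing the resulting tuples gives the order stated above: (2, 4) sorts before (2, 13), which sorts before (12, 3).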
This document defines the 1.1 version of the CGI interface
Servers MUST provide this metavariable
6.1.5. Protocol-Specific Metavariables
These metavariables are specific to the protocol
<EM>via</EM> which the request is made.
Interpretation of these variables depends on the value of
SERVER_PROTOCOL (see
<A HREF="#6.1.17">section 6.1.17</A>).
Metavariables with names beginning with "HTTP_" contain
values from the request header, if the
scheme used was HTTP.
Each HTTP header field name is converted to upper case, has all occurrences of
"-" replaced with "_",
and has "HTTP_" prepended to form
the metavariable name.
Similar transformations are applied for other protocols.
The header data MAY be presented as sent
by the client, or MAY be rewritten in ways which do not change its
semantics. If multiple header fields with the same field-name are received
then the server MUST rewrite them as though they
had been received as a single header field having the same
semantics before being represented in a metavariable.
Similarly, a header field that is received on more than one line
MUST be merged into a single line. The server MUST, if necessary,
change the representation of the data (for example, the character
set) to be appropriate for a CGI metavariable.
<!-- ###NOTE: See if 2068 describes this thoroughly, and
point there if so. -->
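The name transformation and the merging of repeated fields might look like this on the server side. The names `metavariable_name` and `headers_to_metavariables` are hypothetical. Joining repeated fields with ", " follows the HTTP/1.1 rule for list-valued header fields, and collapsing all runs of whitespace is a simplification of the line-merging requirement that could alter quoted strings containing multiple spaces.

```python
def metavariable_name(field_name):
    """Convert an HTTP header field name to its metavariable name."""
    return "HTTP_" + field_name.upper().replace("-", "_")


def headers_to_metavariables(
    header_fields,
    skip=("authorization", "content-length", "content-type"),
):
    """Build protocol-specific metavariables from (name, value) pairs."""
    merged = {}
    for name, value in header_fields:
        # Assumption: this server withholds authentication data and the
        # fields already exposed through other metavariables.
        if name.lower() in skip:
            continue
        key = metavariable_name(name)
        # Merge a folded (multi-line) value into a single line.
        value = " ".join(value.split())
        if key in merged:
            merged[key] = merged[key] + ", " + value
        else:
            merged[key] = value
    return merged
```

Because field names are compared case-insensitively, "Accept" and "accept" end up in the same HTTP_ACCEPT metavariable.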
Servers are not required to create
metavariables for all
header fields that they
receive. In particular, they MAY
decline to make available any
header fields carrying authentication information, such as
"<SAMP>Authorization</SAMP>", or
which are available to the script
<EM>via</EM> other metavariables,
such as "<SAMP>Content-Length</SAMP>" and "<SAMP>Content-Type</SAMP>".
a path to be interpreted by the CGI script. It identifies the
resource or sub-resource to be returned by the CGI
script, and it is derived from the portion
of the URI path following the script name but preceding
and semantics are similar to a decoded HTTP URL
[<A HREF="#[4]">4</A>]), with the exception
that a PATH_INFO of "/"
represents a single void path segment.
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  PATH_INFO = "" | ( "/" path )
  path      = segment *( "/" segment )
  pchar     = <any CHAR except "/">
The PATH_INFO string is the trailing part of the <path> component of
the Script-URI
(see <A HREF="#3.2">section 3.2</A>)
that follows the SCRIPT_NAME
portion of the path.
Servers MAY impose their own restrictions and
limitations on what values they will accept for PATH_INFO, and MAY
reject or edit any values they
consider objectionable before passing
Servers MUST make this URI component available
to CGI scripts. The PATH_INFO
value is case-sensitive, and the
server MUST preserve the case of the PATH_INFO element of the URI
when making it available to scripts.
6.1.7. PATH_TRANSLATED
PATH_TRANSLATED is derived by taking any path-info component of the
request URI (see
<A HREF="#6.1.6">section 6.1.6</A>), decoding it
(see <A HREF="#3.1">section 3.1</A>), parsing it as a URI in its own
right, and performing any virtual-to-physical
translation appropriate to map it onto the
server's document repository structure.
If the request URI includes no path-info
component, the PATH_TRANSLATED metavariable SHOULD NOT be defined.
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  PATH_TRANSLATED = *CHAR
For a request such as the following:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  http://somehost.com/cgi-bin/somescript/this%2eis%2ethe%2epath%2einfo
the PATH_INFO component would be decoded, and the result
parsed as though it were a request for the following:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  http://somehost.com/this.is.the.path.info
This would then be translated to a
location in the server's document repository,
perhaps a filesystem path something like this:
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  /usr/local/www/htdocs/this.is.the.path.info
The result of the translation is the value of PATH_TRANSLATED.
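One possible translation, for a server whose document repository is a POSIX filesystem tree, is sketched below. The function name `translate_path` and the `document_root` parameter are hypothetical, and the actual algorithm is implementation defined; this sketch only decodes the path-info, normalises it, and joins it to the root.

```python
import posixpath
from urllib.parse import unquote


def translate_path(path_info, document_root):
    """Map a path-info component onto a filesystem location."""
    if path_info == "":
        # No path-info component: PATH_TRANSLATED is left undefined.
        return None
    decoded = unquote(path_info)
    # Normalising an absolute path discards any leading ".." segments,
    # so the result cannot climb above the document root.
    normalized = posixpath.normpath("/" + decoded.lstrip("/"))
    return posixpath.join(document_root, normalized.lstrip("/"))
```

For example, a path-info of `/this%2eis%2ethe%2epath%2einfo` with a document root of `/usr/local/www/htdocs` yields `/usr/local/www/htdocs/this.is.the.path.info`. The result need not name an existing file.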
The value of PATH_TRANSLATED may or may not map to a valid
Servers MUST preserve the case of the path-info
segment if and only if the underlying
supports case-sensitive
is only case-aware, case-preserving, or case-blind
servers are not required to preserve the
case of the original segment through the translation.
algorithm the server uses to derive PATH_TRANSLATED is
implementation defined; CGI scripts which use this variable may
suffer limited portability.
Servers SHOULD provide this metavariable
to scripts if and only if the request URI includes a
path-info component.
string; the <query> part of the
<A HREF="#3.2">section 3.2</A>.)
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  QUERY_STRING = query-string
  query-string = *uric
The URL syntax for a query
string is described in
[<A HREF="#[4]">4</A>].
Servers MUST supply this value to scripts.
The QUERY_STRING value is case-sensitive.
If the Script-URI does not include a query component,
the QUERY_STRING metavariable MUST be defined as an empty string ("").
The IP address of the client
sending the request to the server. This
is not necessarily that of the user agent
(such as if the request came through a proxy).
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  REMOTE_ADDR = hostnumber
  hostnumber  = ipv4-address | ipv6-address
The definitions of <SAMP>ipv4-address</SAMP> and <SAMP>ipv6-address</SAMP>
are provided in Appendix B of RFC 2373 [<A HREF="#[13]">13</A>].
Servers MUST supply this value to scripts.
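A script that needs to tell the two hostnumber forms apart can lean on Python's standard `ipaddress` module, as in this sketch. The helper name `parse_remote_addr` is hypothetical, and the module's accepted textual forms are assumed, not verified here, to match the RFC 2373 definitions.

```python
import ipaddress


def parse_remote_addr(value):
    """Parse REMOTE_ADDR into an IPv4Address or IPv6Address object.

    Raises ValueError if the value is neither form.
    """
    return ipaddress.ip_address(value)
```

The `version` attribute of the returned object is 4 or 6, which lets a script branch on the address family without parsing the string itself.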
The fully qualified domain name of the
client sending the request to
the server, if available, otherwise NULL.
(See <A HREF="#6.1.9">section 6.1.9</A>.)
Fully qualified domain names take the form as described in
section 3.5 of RFC 1034 [<A HREF="#[10]">10</A>] and section 2.1 of
RFC 1123 [<A HREF="#[5]">5</A>]. Domain names are not case sensitive.
Servers SHOULD provide this information to
6.1.11. REMOTE_IDENT
The identity information reported about the connection by an
RFC 1413 [<A HREF="#[11]">11</A>] request to the remote agent, if
to support this feature, or not to request the data
for efficiency reasons.
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  REMOTE_IDENT = *CHAR
may be used for authentication purposes, but the level
of trust reposed in them should be minimal.
Servers MAY supply this information to scripts if the
RFC 1413 [<A HREF="#[11]">11</A>] lookup is performed.
If the request required authentication using the "Basic"
mechanism (<EM>i.e.</EM>, the AUTH_TYPE
metavariable is set
to "Basic"), then the value of the REMOTE_USER
metavariable is set to the
user-ID supplied. In all other cases
the value of this metavariable
</P><!--#if expr="! $GUI" -->
<P></P><!--#endif -->
  REMOTE_USER = *OCTET
This variable is specific to requests made <EM>via</EM> the
Servers SHOULD provide this metavariable
1432 6.1.13. REQUEST_METHOD
1439 method with which the request was made, as described in section
1440 5.1.1 of the HTTP/1.0 specification [<A HREF="#[3]">3</A>] and
1441 section 5.1.1 of the
1442 HTTP/1.1 specification [<A HREF="#[8]">8</A>].
1443 </P><!--#if expr="! $GUI" -->
1444 <P></P><!--#endif -->
1446 REQUEST_METHOD = http-method
1447 http-method = "GET" | "HEAD" | "POST" | "PUT" | "DELETE"
1448 | "OPTIONS" | "TRACE" | extension-method
1449 extension-method = token
1452 The method is case sensitive.
1453 CGI/1.1 servers MAY choose to process some methods
1454 directly rather than passing them to scripts.
1457 This variable is specific to requests made with HTTP.
1460 Servers MUST provide this metavariable
1473 set to a URL path that could identify the CGI script (rather than the
1475 output). The syntax and semantics are identical to a
1476 decoded HTTP URL 'path' token
1478 [<A HREF="#[4]">4</A>]).
1479 </P><!--#if expr="! $GUI" -->
1480 <P></P><!--#endif -->
1482 SCRIPT_NAME = "" | ( "/" [ path ] )
1485 The SCRIPT_NAME string is some leading part of the <path> component
1486 of the Script-URI derived in some
1487 implementation defined manner.
1488 No PATH_INFO or QUERY_STRING segments
1489 (see sections <A HREF="#6.1.6">6.1.6</A> and
1490 <A HREF="#6.1.8">6.1.8</A>) are included
1491 in the SCRIPT_NAME value.
1494 Servers MUST provide this metavariable
1509 derived from the <host> part of the
1511 (see <A HREF="#3.2">section 3.2</A>).
1512 </P><!--#if expr="! $GUI" -->
1513 <P></P><!--#endif -->
1515 SERVER_NAME = hostname | hostnumber
1518 Servers MUST provide this metavariable
1532 request was received, as used in the <port>
1533 part of the Script-URI.
1534 </P><!--#if expr="! $GUI" -->
1535 <P></P><!--#endif -->
1537 SERVER_PORT = 1*digit
1540 If the <port> portion of the Script-URI is blank, the actual
1541 port number upon which the request was received MUST be supplied.
1544 Servers MUST provide this metavariable
1550 6.1.17. SERVER_PROTOCOL
1558 name and revision of the information protocol with which
1561 arrived. This is not necessarily the same as the protocol version used by
1562 the server in its response to the client.
1563 </P><!--#if expr="! $GUI" -->
1564 <P></P><!--#endif -->
1566 SERVER_PROTOCOL = HTTP-Version | extension-version
1568 HTTP-Version = "HTTP" "/" 1*digit "." 1*digit
1569 extension-version = protocol "/" 1*digit "." 1*digit
1570 protocol = 1*( alpha | digit | "+" | "-" | "." )
1571 extension-token = token
1574 'protocol' is a version of the <scheme> part of the
1576 not identical to it. For example, the scheme of a request may be
1577 "<SAMP>https</SAMP>" while the protocol remains "<SAMP>http</SAMP>".
1578 The protocol is not case sensitive, but
1579 by convention, 'protocol' is in
1583 A well-known extension token value is "INCLUDED",
1584 which signals that the current document is being included as part of
1585 a composite document, rather than being the direct target of the
1589 Servers MUST provide this metavariable
1595 6.1.18. SERVER_SOFTWARE
1602 name and version of the information server software answering the
1603 request (and running the gateway).
1604 </P><!--#if expr="! $GUI" -->
1605 <P></P><!--#endif -->
1607 SERVER_SOFTWARE = 1*product
1608 product = token [ "/" product-version ]
1609 product-version = token
1612 Servers MUST provide this metavariable
1618 6.2. Request Message-Bodies
1622 As there may be a data entity attached to the request, there MUST be
1623 a system defined method for the script to read
1625 defined otherwise, this will be <EM>via</EM> the 'standard input' file
1629 If the CONTENT_LENGTH value (see <A HREF="#6.1.2">section 6.1.2</A>)
1630 is non-NULL, the server MUST supply at least that many bytes to
1631 scripts on the standard input stream.
1633 not obliged to read the data.
1634 Servers MAY signal an EOF condition after CONTENT_LENGTH bytes have been
1636 not obligated to do so. Therefore, scripts
1638 attempt to read more than CONTENT_LENGTH bytes, even if more data
1642 For non-parsed header (NPH) scripts (see
1643 <A HREF="#7.1">section 7.1</A>
1646 attempt to ensure that the data
1647 supplied to the script are precisely
1648 as supplied by the client and unaltered by
1652 <A HREF="#8.1.2">Section 8.1.2</A> describes the requirements of
1653 servers with regard to requests that include
1659 7. Data Output from the CGI Script
1663 There MUST be a system defined method for the script to send data
1664 back to the server or client; a script MUST always return some data.
1665 Unless defined otherwise, this will be <EM>via</EM> the 'standard
1666 output' file descriptor.
1669 There are two forms of output that scripts can supply to servers: non-parsed
1670 header (NPH) output, and parsed header output.
1671 Servers MUST support parsed header
1672 output and MAY support NPH output. The method of
1673 distinguishing between the two
1674 types of output (or scripts) is implementation defined.
1677 Servers MAY implement a timeout period within which data must be
1678 received from scripts. If a server implementation defines such
1679 a timeout and receives no data from a script within the timeout
1680 period, the server MAY terminate the script process and SHOULD
1681 abort the client request with
1683 '504 Gateway Timeout' or a
1684 '500 Internal Server Error' response.
1689 7.1. Non-Parsed Header Output
1693 Scripts using the NPH output form
1694 MUST return a complete HTTP response message, as described
1695 in Section 6 of the HTTP specifications
1696 [<A HREF="#[3]">3</A>, <A HREF="#[8]">8</A>].
1698 MUST use the SERVER_PROTOCOL variable to determine the appropriate format
1703 SHOULD attempt to ensure that the script output is sent
1704 directly to the client, with minimal
1705 internal and no transport-visible
1711 7.2. Parsed Header Output
1715 Scripts using the parsed header output form MUST supply
1716 a CGI response message to the server
1718 </P><!--#if expr="! $GUI" -->
1719 <P></P><!--#endif -->
1721 CGI-Response = *optional-field CGI-Field *optional-field NL [ Message-Body ]
1722 optional-field = ( CGI-Field | HTTP-Field )
1723 CGI-Field = Content-type
1728 <P><!-- ##### If HTTP defines x-headers, remove ours except x-cgi- -->
1729 The response comprises a header and a body, separated by a blank line.
1730 The body may be NULL.
1731 The header fields are either CGI header fields to be interpreted by
1732 the server, or HTTP header fields
1733 to be included in the response returned
1735 if the request method is HTTP. At least one
1737 supplied, but no CGI field name may be used more than once
1739 If a body is supplied, then a "<SAMP>Content-type</SAMP>"
1740 header field MUST be
1741 supplied by the script;
1742 otherwise the script MUST send a "<SAMP>Location</SAMP>"
1743 or "<SAMP>Status</SAMP>" header field. If a
1744 <SAMP>Location</SAMP> CGI-Field
1745 is returned, then the script MUST NOT supply
1749 Each header field in a CGI-Response MUST be specified on a single line;
1750 CGI/1.1 does not support continuation lines.
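The rules above can be sketched in Python as follows. This is a minimal illustration, not a normative implementation; the function name build_cgi_response and its parameters are hypothetical, and NL is assumed to be a single LF as on Unix systems.

```python
def build_cgi_response(body=b"", content_type=None, status=None, location=None):
    """Assemble a parsed-header CGI response (header, blank line, body) as bytes."""
    if body and content_type is None:
        raise ValueError("a body requires a Content-type field")
    if not body and status is None and location is None:
        raise ValueError("without a body, send a Location or Status field")

    # Each CGI field appears at most once because each has one parameter.
    fields = []
    if content_type is not None:
        fields.append(("Content-type", content_type))
    if status is not None:
        fields.append(("Status", status))
    if location is not None:
        fields.append(("Location", location))

    lines = []
    for name, value in fields:
        # Continuation lines are not supported, so reject embedded line breaks.
        if "\r" in value or "\n" in value:
            raise ValueError("header fields must fit on a single line")
        lines.append(f"{name}: {value}\n")

    # The blank line separates the header from the (possibly empty) body.
    return "".join(lines).encode("ascii") + b"\n" + body
```

A script would typically write the returned bytes to its standard output, for example with `sys.stdout.buffer.write(...)`.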
1755 7.2.1. CGI header fields
1759 The CGI header fields have the generic syntax:
1760 </P><!--#if expr="! $GUI" -->
1761 <P></P><!--#endif -->
1763 generic-field = field-name ":" [ field-value ] NL
1765 field-value = *( field-content | LWSP )
1766 field-content = *( token | tspecial | quoted-string )
1769 The field-name is not case sensitive; a NULL field value is
1770 equivalent to the header field not being sent.
1775 7.2.1.1. Content-Type
1779 The Internet Media Type [<A HREF="#[9]">9</A>] of the entity
1780 body, which is to be sent unmodified to the client.
1781 </P><!--#if expr="! $GUI" -->
1782 <P></P><!--#endif -->
1784 Content-Type = "Content-Type" ":" media-type NL
1787 This is actually an HTTP-Field
1788 rather than a CGI-Field, but
1789 it is listed here because of its importance in the CGI dialogue as
1790 a member of the "one of these is required" set of header
1800 This is used to specify to the server that the script is returning a
1801 reference to a document rather than an actual document.
1802 </P><!--#if expr="! $GUI" -->
1803 <P></P><!--#endif -->
1805 Location = "Location" ":"
1806 ( fragment-URI | rel-URL-abs-path ) NL
1807 fragment-URI = URI [ "#" fragmentid ]
1808 URI = scheme ":" *qchar
1810 rel-URL-abs-path = "/" [ hpath ] [ "?" query-string ]
1811 hpath = fpsegment *( "/" psegment )
1814 hchar = alpha | digit | safe | extra
1815 | ":" | "@" | "&" | "="
1819 value is either an absolute URI with optional fragment,
1820 as defined in RFC 1630 [<A HREF="#[1]">1</A>], or an absolute path
1821 within the server's URI space (<EM>i.e.</EM>,
1822 omitting the scheme and network-related fields) and optional
1823 query-string. If an absolute URI is returned by the script,
1825 server MUST generate a
1826 '302 redirect' HTTP response
1827 message unless the script has supplied an
1828 explicit Status response header field.
1829 Scripts returning an absolute URI MAY choose to
1830 provide a message-body. Servers MUST make any appropriate modifications
1831 to the script's output to ensure the response to the user-agent complies
1832 with the response protocol version.
1833 If the Location value is a path, then the server
1835 the response that it would have produced in response to a request
1837 </P><!--#if expr="! $GUI" -->
1838 <P></P><!--#endif -->
1840 scheme "://" SERVER_NAME ":" SERVER_PORT rel-URL-abs-path
1843 Note: If the request was accompanied by a
1845 (such as for a POST request), and the script
1846 redirects the request with a Location field, the
1849 available to the resource that is the target of the redirect.
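One way a server might distinguish the two forms of Location value is sketched below in Python. The names plan_location and _ABSOLUTE_URI are hypothetical, the scheme pattern is an assumption based on common URI syntax, and the reason phrase "Moved Temporarily" is the one used for 302 in the HTTP specifications cited here.

```python
import re

# Assumed scheme syntax: a letter followed by letters, digits, "+", "-" or ".".
_ABSOLUTE_URI = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def plan_location(value, script_status=None):
    """Decide how a server might act on a Location field value."""
    if value.startswith("/"):
        # Absolute path: the server produces the response for that local resource.
        return ("internal", value)
    if _ABSOLUTE_URI.match(value):
        # Absolute URI: 302 unless the script supplied an explicit Status.
        return ("redirect", script_status or "302 Moved Temporarily")
    return ("invalid", None)
```

The sketch only classifies the value; generating the redirected response itself is left to the server.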
1858 The "<SAMP>Status</SAMP>" header field is used to indicate to the server what
1859 status code the server MUST use in the response message.
1860 </P><!--#if expr="! $GUI" -->
1861 <P></P><!--#endif -->
1863 Status = "Status" ":" digit digit digit SP reason-phrase NL
1864 reason-phrase = *<CHAR, excluding CTLs, NL>
1867 The valid status codes are listed in section 6.1.1 of the HTTP/1.0
1868 specification [<A HREF="#[3]">3</A>]. If the SERVER_PROTOCOL is
1869 "HTTP/1.1", then the status codes defined in the HTTP/1.1
1870 specification [<A HREF="#[8]">8</A>] may
1871 be used. If the script does not return a "<SAMP>Status</SAMP>" header
1872 field, then "200 OK" SHOULD be assumed by the server.
1875 If a script is being used to handle a particular error or condition
1876 encountered by the server, such as a '404 Not Found' error, the script
1877 SHOULD use the "<SAMP>Status</SAMP>" CGI header field to propagate the error
1878 condition back to the client. <EM>E.g.</EM>, in the example mentioned it
1879 SHOULD include a "Status: 404 Not Found" in the
1880 header data returned to the server.
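A server-side reading of the Status field, including the "200 OK" default, might look like the following Python sketch. The function names are hypothetical, and optional white space after the colon is assumed to be acceptable.

```python
import re

# "Status" ":" digit digit digit SP reason-phrase; the field name is not case sensitive.
_STATUS = re.compile(r"status:[ \t]*([0-9]{3}) ([^\x00-\x1f\x7f]*)", re.IGNORECASE)


def parse_status_field(line):
    """Return (code, reason) for a Status field line, or None if it is not one."""
    match = _STATUS.fullmatch(line.rstrip("\r\n"))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def effective_status(header_lines):
    """Return the script's status, or (200, 'OK') when no Status field was sent."""
    for line in header_lines:
        parsed = parse_status_field(line)
        if parsed is not None:
            return parsed
    return 200, "OK"
```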
1885 7.2.1.4. Extension header fields
1889 Scripts MAY include in their CGI response header additional fields
1890 not defined in this or the HTTP specification.
1891 These are called "extension" fields,
1892 and have the syntax of a <SAMP>generic-field</SAMP> as defined in
1893 <A HREF="#7.2.1">section 7.2.1</A>. The name of an extension field
1894 MUST NOT conflict with a field name defined in this or any other
1895 specification; extension field names SHOULD begin with "X-CGI-"
1896 to ensure uniqueness.
1901 7.2.2. HTTP header fields
1905 The script MAY return any other header fields defined by the
1907 for the SERVER_PROTOCOL (HTTP/1.0 [<A HREF="#[3]">3</A>] or HTTP/1.1
1908 [<A HREF="#[8]">8</A>]).
1909 Servers MUST resolve conflicts between CGI header
1910 and HTTP header formats or names (see <A HREF="#8.0">section 8</A>).
1915 8. Server Implementation
1919 This section defines the requirements that must be met by HTTP
1920 servers in order to provide a coherent and correct CGI/1.1
1921 environment in which scripts may function. It is intended
1922 primarily for server implementors, but it is useful for
1923 script authors to be familiar with the information as well.
1928 8.1. Requirements for Servers
1932 In order to be considered CGI/1.1-compliant, a server must meet
1933 certain basic criteria and provide certain minimal functionality.
1934 The details of these requirements are described in the following sections.
1943 Servers MUST support the standard mechanism (described below) which
1945 script authors to determine
1946 what URL to use in documents
1947 which reference the script;
1948 specifically, what URL to use in order to
1949 achieve particular settings of the
1951 mechanism is as follows:
1955 MUST translate the header data from the CGI header field syntax to
1957 header field syntax if these differ. For example, the character
1959 newline (such as Unix's ASCII LF) used by CGI scripts may not be the
1960 same as that used by HTTP (ASCII CR followed by LF). The server MUST
1961 also resolve any conflicts between header fields returned by the script
1962 and header fields that it would otherwise send itself.
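The translation and conflict resolution described above can be sketched in Python. Both function names are hypothetical, and the policy that a script-supplied field replaces a server default of the same name is an assumption made for this example; the specification leaves the resolution policy to the server.

```python
def translate_header_block(block: bytes) -> bytes:
    """Rewrite NL-terminated CGI header lines as CRLF-terminated lines."""
    # bytes.splitlines() treats LF, CR LF, and bare CR as line boundaries.
    lines = [line for line in block.splitlines() if line]
    # The final CRLF is the blank line that ends the header.
    return b"".join(line + b"\r\n" for line in lines) + b"\r\n"


def merge_fields(server_fields, script_fields):
    """Combine (name, value) pairs; assumed policy: script fields win on conflict."""
    merged = {name.lower(): (name, value) for name, value in server_fields}
    for name, value in script_fields:
        # Field names are compared without regard to case.
        merged[name.lower()] = (name, value)
    return list(merged.values())
```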
1967 8.1.2. Request Message-body Handling
1971 These are the requirements for server handling of message-bodies directed
1972 to CGI/1.1 resources:
1975 <LI>The message-body the server provides to the CGI script MUST
1976 have any transfer encodings removed.
1978 <LI>The server MUST derive and provide a value for the CONTENT_LENGTH
1979 metavariable that reflects the length of the message-body after any
1982 <LI>The server MUST leave intact any content-encodings of the message-body.
1988 8.1.3. Required Metavariables
1992 Servers MUST provide scripts with certain information and
1994 as described in <A HREF="#8.3">section 8.3</A>.
1999 8.1.4. Response Compliance
2003 Servers MUST ensure that responses sent to the user-agent meet all
2004 requirements of the protocol level in effect. This may involve
2005 modifying, deleting, or augmenting any header
2006 fields and/or message-body supplied by the script.
2011 8.2. Recommendations for Servers
2015 Servers SHOULD provide the "<SAMP>query</SAMP>" component of the Script-URI
2016 as command-line arguments to scripts if it does not
2017 contain any unencoded '=' characters and the command-line arguments can
2018 be generated in an unambiguous manner.
2019 (See <A HREF="#5.0">section 5</A>.)
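A minimal Python sketch of this recommendation follows. It assumes the search words in the query are separated by "+" and individually URL-decoded; the function name query_to_argv is hypothetical, and any platform-specific escaping of the resulting words is not shown.

```python
from urllib.parse import unquote


def query_to_argv(query):
    """Return command-line words for a query, or None if it should not be used."""
    # An unencoded "=" means the query is not a plain list of search words.
    if not query or "=" in query:
        return None
    return [unquote(word) for word in query.split("+")]
```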
2022 Servers SHOULD set the AUTH_TYPE
2023 metavariable to the value of the
2024 '<SAMP>auth-scheme</SAMP>' token of the "<SAMP>Authorization</SAMP>"
2025 field if it was supplied as part of the request header.
2026 (See <A HREF="#6.1.1">section 6.1.1</A>.)
2029 Where applicable, servers SHOULD set the current working directory
2030 to the directory in which the script is located before invoking
2034 Servers MAY reject with error '404 Not Found'
2035 any requests that would result in
2036 an encoded "/" being decoded into PATH_INFO or SCRIPT_NAME, as this
2037 might represent a loss of information to the script.
2040 Although the server and the CGI script need not be consistent in
2041 their handling of URL paths (client URLs and the PATH_INFO data,
2042 respectively), server authors may wish to impose consistency.
2043 So the server implementation SHOULD define its behaviour for the
2047 <LI>define any restrictions on allowed characters, in particular
2048 whether ASCII NUL is permitted;
2050 <LI>define any restrictions on allowed path segments, in particular
2051 whether non-terminal NULL segments are permitted;
2053 <LI>define the behaviour for <SAMP>"."</SAMP> or <SAMP>".."</SAMP> path
2054 segments; <EM>i.e.</EM>, whether they are prohibited, treated as
2056 segments or interpreted in accordance with the relative URL
2057 specification [<A HREF="#[7]">7</A>];
2059 <LI>define any limits of the implementation, including limits on path or
2060 search string lengths, and limits on the volume of header data the server
2062 </LI><!-- ##### Move the field resolution/translation para below here -->
2065 Servers MAY generate the
2067 any way from the client URI,
2068 or from any other data (but the behaviour SHOULD be documented).
2071 For non-parsed header (NPH) scripts (see
2072 <A HREF="#7.1">section 7.1</A>), servers SHOULD
2073 attempt to ensure that the script input comes directly from the
2074 client, with minimal buffering. For all scripts the data will be
2075 as supplied by the client.
2085 Servers MUST provide the following
2087 scripts. See the individual descriptions for exceptions and semantics.
2088 </P><!--#if expr="! $GUI" -->
2089 <P></P><!--#endif -->
2091 CONTENT_LENGTH (section <A HREF="#6.1.2">6.1.2</A>)
2092 CONTENT_TYPE (section <A HREF="#6.1.3">6.1.3</A>)
2093 GATEWAY_INTERFACE (section <A HREF="#6.1.4">6.1.4</A>)
2094 PATH_INFO (section <A HREF="#6.1.6">6.1.6</A>)
2095 QUERY_STRING (section <A HREF="#6.1.8">6.1.8</A>)
2096 REMOTE_ADDR (section <A HREF="#6.1.9">6.1.9</A>)
2097 REQUEST_METHOD (section <A HREF="#6.1.13">6.1.13</A>)
2098 SCRIPT_NAME (section <A HREF="#6.1.14">6.1.14</A>)
2099 SERVER_NAME (section <A HREF="#6.1.15">6.1.15</A>)
2100 SERVER_PORT (section <A HREF="#6.1.16">6.1.16</A>)
2101 SERVER_PROTOCOL (section <A HREF="#6.1.17">6.1.17</A>)
2102 SERVER_SOFTWARE (section <A HREF="#6.1.18">6.1.18</A>)
2105 Servers SHOULD define the following
2106 metavariables for scripts.
2107 See the individual descriptions for exceptions and semantics.
2108 </P><!--#if expr="! $GUI" -->
2109 <P></P><!--#endif -->
2111 AUTH_TYPE (section <A HREF="#6.1.1">6.1.1</A>)
2112 REMOTE_HOST (section <A HREF="#6.1.10">6.1.10</A>)
2115 In addition, servers SHOULD provide
2116 metavariables for all fields present
2117 in the HTTP request header, with the exception of those involved with
2118 access control. Servers MAY at their discretion provide
2120 for access control fields.
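As a sketch of this recommendation, the Python fragment below derives metavariable names from request header fields and omits the access control fields by default. It assumes the usual CGI naming convention (an "HTTP_" prefix, upper case, and "-" replaced by "_"); the names header_metavariables and ACCESS_CONTROL_FIELDS are hypothetical.

```python
# Fields treated as access control related in this sketch (an assumption).
ACCESS_CONTROL_FIELDS = {"authorization", "proxy-authorization"}


def header_metavariables(headers, include_access_control=False):
    """Map (field-name, value) pairs to HTTP_* metavariables."""
    env = {}
    for name, value in headers:
        if name.lower() in ACCESS_CONTROL_FIELDS and not include_access_control:
            continue
        env["HTTP_" + name.upper().replace("-", "_")] = value
    return env
```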
2123 Servers MAY define the following
2124 metavariables. See the individual
2125 descriptions for exceptions and semantics.
2126 </P><!--#if expr="! $GUI" -->
2127 <P></P><!--#endif -->
2129 PATH_TRANSLATED (section <A HREF="#6.1.7">6.1.7</A>)
2130 REMOTE_IDENT (section <A HREF="#6.1.11">6.1.11</A>)
2131 REMOTE_USER (section <A HREF="#6.1.12">6.1.12</A>)
2135 at their discretion define additional implementation-specific
2136 extension metavariables
2137 provided their names do not
2138 conflict with defined header field names. Implementation-specific
2139 metavariable names SHOULD
2140 be prefixed with "X_" (<EM>e.g.</EM>,
2141 "X_DBA") to avoid the potential for such conflicts.
2147 9. Script Implementation
2151 This section defines the requirements and recommendations for scripts
2152 that are intended to function in a CGI/1.1 environment. It is intended
2153 primarily as a reference for script authors, but server implementors
2154 should be familiar with these issues as well.
2159 9.1. Requirements for Scripts
2163 Scripts using the parsed-header method to communicate with servers
2164 MUST supply a response header to the server.
2165 (See <A HREF="#7.0">section 7</A>.)
2168 Scripts using the NPH method to communicate with servers MUST
2169 provide complete HTTP responses, and MUST use the value of the
2170 SERVER_PROTOCOL metavariable
2171 to determine the appropriate format.
2172 (See <A HREF="#7.1">section 7.1</A>.)
2175 Scripts MUST check the value of the REQUEST_METHOD
2176 metavariable in order
2177 to provide an appropriate response.
2178 (See <A HREF="#6.1.13">section 6.1.13</A>.)
2181 Scripts MUST be prepared to handle URL-encoded values in
2183 In addition, they MUST recognise both "+" and "%20" in URL-encoded
2184 quantities as representing the space character.
2185 (See <A HREF="#3.1">section 3.1</A>.)
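In Python, for example, the standard library already treats both forms as a space. The wrapper names below are hypothetical and are shown only to illustrate the requirement.

```python
from urllib.parse import parse_qs, unquote_plus


def decode_value(raw: str) -> str:
    """Decode one URL-encoded value; "+" and "%20" both become a space."""
    return unquote_plus(raw)


def decode_form(query: str) -> dict:
    """Decode a whole URL-encoded query into a dict of value lists."""
    return parse_qs(query, keep_blank_values=True)
```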
2188 Scripts MUST ignore leading zeros in the major and minor version numbers
2189 in the GATEWAY_INTERFACE
2190 metavariable value. (See
2191 <A HREF="#6.1.4">section 6.1.4</A>.)
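Parsing the two version numbers as integers is one simple way to satisfy this requirement, as the Python sketch below suggests. The function name parse_gateway_interface is hypothetical.

```python
import re


def parse_gateway_interface(value):
    """Return (major, minor) from a value such as "CGI/1.1", or None if malformed."""
    match = re.fullmatch(r"CGI/([0-9]+)\.([0-9]+)", value)
    if match is None:
        return None
    # int() discards leading zeros, so "CGI/01.001" compares equal to "CGI/1.1".
    return int(match.group(1)), int(match.group(2))
```

Treating the parts as separate integers also keeps "CGI/1.10" distinct from "CGI/1.1".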
2194 When processing requests that include a
2195 message-body, scripts
2196 MUST NOT read more than CONTENT_LENGTH bytes from the input stream.
2197 (See sections <A HREF="#6.1.2">6.1.2</A> and <A HREF="#6.2">6.2</A>.)
2202 9.2. Recommendations for Scripts
2206 Servers may interrupt or terminate script execution at any time
2207 and without warning, so scripts SHOULD be prepared to deal with
2208 abnormal termination.
2213 error '405 Method Not
2215 made using methods that they do not support. If the script does
2217 processing the PATH_INFO data, then it SHOULD reject the request with
2219 Found' if PATH_INFO is not NULL.
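These two checks might be performed at the start of a script, as in the following Python sketch. The function name preflight and its parameters are hypothetical; the returned string is intended for use as the value of a Status header field.

```python
def preflight(environ, supported=("GET", "HEAD"), uses_path_info=False):
    """Return a Status value to reject the request with, or None to proceed."""
    # The method is case sensitive, so an exact comparison is used.
    if environ.get("REQUEST_METHOD") not in supported:
        return "405 Method Not Allowed"
    # A script that ignores PATH_INFO rejects requests that carry one.
    if not uses_path_info and environ.get("PATH_INFO"):
        return "404 Not Found"
    return None
```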
2222 If a script is processing the output of a form, it SHOULD
2223 verify that the CONTENT_TYPE
2224 is "<SAMP>application/x-www-form-urlencoded</SAMP>" [<A HREF="#[2]">2</A>]
2225 or whatever other media type is expected.
2228 Scripts parsing PATH_INFO,
2229 PATH_TRANSLATED, or SCRIPT_NAME
2231 of void path segments ("<SAMP>//</SAMP>") and special path segments
2232 (<SAMP>"."</SAMP> and
2233 <SAMP>".."</SAMP>). They SHOULD either be removed from the path before
2235 system calls, or the request SHOULD be rejected with
2239 As it is impossible for
2240 scripts to determine the client URI that
2242 request without knowledge of the specific server in
2243 use, the script SHOULD NOT return "<SAMP>text/html</SAMP>"
2244 documents containing
2245 relative URL links without including a "<SAMP><BASE></SAMP>"
2246 tag in the document.
2249 When returning header fields,
2250 scripts SHOULD try to send the CGI
2251 header fields (see section
2252 <A HREF="#7.2">7.2</A>) as soon as possible, and
2254 before any HTTP header fields. This may
2255 help reduce the server's memory requirements.
2260 10. System Specifications
2270 The implementation of the CGI on an AmigaDOS operating system platform
2271 SHOULD use environment variables as the mechanism of providing
2272 request metadata to CGI scripts.
2275 <DT><STRONG>Environment variables</STRONG>
2279 These are accessed by the DOS library routine <SAMP>GetVar</SAMP>. The
2280 flags argument SHOULD be 0. Case is ignored, but upper case is
2281 recommended for compatibility with case-sensitive systems.
2284 <DT><STRONG>The current working directory</STRONG>
2288 The current working directory for the script is set to the directory
2289 containing the script.
2292 <DT><STRONG>Character set</STRONG>
2296 The US-ASCII character set is used for the definition of environment
2297 variable names and header
2298 field names; the newline (NL) sequence is LF;
2299 servers SHOULD also accept CR LF as a newline.
2310 The implementation of the CGI on a UNIX operating system platform
2311 SHOULD use environment variables as the mechanism of providing
2312 request metadata to CGI scripts.
2315 For Unix compatible operating systems, the following are defined:
2318 <DT><STRONG>Environment variables</STRONG>
2322 These are accessed by the C library routine <SAMP>getenv</SAMP>.
2325 <DT><STRONG>The command line</STRONG>
2329 This is accessed using the
2330 <SAMP>argc</SAMP> and <SAMP>argv</SAMP>
2331 arguments to <SAMP>main()</SAMP>. The words have any characters
2333 are 'active' in the Bourne shell escaped with a backslash.
2334 If the value of the QUERY_STRING
2336 contains an unencoded equals-sign '=', then the command line
2337 SHOULD NOT be used by the script.
2340 <DT><STRONG>The current working directory</STRONG>
2344 The current working directory for the script
2345 SHOULD be set to the directory
2346 containing the script.
2349 <DT><STRONG>Character set</STRONG>
2353 The US-ASCII character set is used for the definition of environment
2354 variable names and header field names; the newline (NL) sequence is LF;
2355 servers SHOULD also accept CR LF as a newline.
2362 11. Security Considerations
2372 As discussed in the security considerations of the HTTP
2373 specifications [<A HREF="#[3]">3</A>, <A HREF="#[8]">8</A>], the
2374 convention has been established that the
2375 GET and HEAD methods should be 'safe'; they should cause no
2376 side-effects and only have the significance of resource retrieval.
2379 CGI scripts are responsible for enforcing any HTTP security considerations
2380 [<A HREF="#[3]">3</A>, <A HREF="#[8]">8</A>]
2381 with respect to the protocol version level of the request and
2382 any side effects generated by the scripts on behalf of
2385 are the considerations of safe and idempotent methods. Idempotent
2386 requests are those that may be repeated an arbitrary number of times
2387 and produce side effects identical to a single request.
2393 11.2. Fields Containing Sensitive Information
2397 Some HTTP header fields may carry sensitive information which the server
2398 SHOULD NOT pass on to the script unless explicitly configured to do
2399 so. For example, if the server protects the script using the
2400 "<SAMP>Basic</SAMP>"
2401 authentication scheme, then the client will send an
2402 "<SAMP>Authorization</SAMP>"
2403 header field containing a username and password. If the server, rather
2404 than the script, validates this information then the password SHOULD
2405 NOT be passed on to the script <EM>via</EM> the HTTP_AUTHORIZATION
2407 without careful consideration.
2408 This also applies to the
2409 Proxy-Authorization header field and the corresponding
2410 HTTP_PROXY_AUTHORIZATION
2417 11.3. Interference with the Server
2421 The most common implementation of CGI invokes the script as a child
2422 process using the same user and group as the server process. It
2423 SHOULD therefore be ensured that the script cannot interfere with the
2424 server process, its configuration, or documents.
2427 If the script is executed by calling a function linked in to the
2428 server software (either at compile-time or run-time) then precautions
2429 SHOULD be taken to protect the core memory of the server, or to
2430 ensure that untrusted code cannot be executed.
2435 11.4. Data Length and Buffering Considerations
2439 This specification places no limits on the length of message-bodies
2440 presented to the script. Scripts should not assume that statically
2441 allocated buffers of any size are sufficient to contain the entire
2442 submission at one time. Use of a fixed length buffer without careful
2443 overflow checking may result in an attacker exploiting 'stack-smashing'
2444 or 'stack-overflow' vulnerabilities of the operating system.
2445 Scripts may spool large submissions to disk or other buffering media,
2446 but a rapid succession of large submissions may result in denial of
2447 service conditions. If the CONTENT_LENGTH of a message-body is larger
2448 than resource considerations allow, scripts should respond with an
2449 error status appropriate for the protocol version; potentially applicable
2450 status codes include '503 Service Unavailable' (HTTP/1.0 and HTTP/1.1),
2451 '413 Request Entity Too Large' (HTTP/1.1), and
2452 '414 Request-URI Too Long' (HTTP/1.1).
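The following Python sketch shows one way a script might bound its memory use while never reading more than CONTENT_LENGTH bytes. The limit MAX_BODY_BYTES, the function name read_body, and the use of '400 Bad Request' for a malformed length are assumptions made for this example only.

```python
MAX_BODY_BYTES = 1_000_000  # hypothetical limit chosen for this sketch


def read_body(environ, stream, limit=MAX_BODY_BYTES):
    """Return (status, body); status is None on success, else a Status value."""
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        return None, b""
    if not (raw.isascii() and raw.isdigit()):
        # Assumption: a malformed length is treated as a client error.
        return "400 Bad Request", b""
    length = int(raw)
    if length > limit:
        if environ.get("SERVER_PROTOCOL") == "HTTP/1.1":
            return "413 Request Entity Too Large", b""
        return "503 Service Unavailable", b""
    chunks, remaining = [], length
    while remaining > 0:
        # Never request more than the bytes still owed by CONTENT_LENGTH.
        chunk = stream.read(remaining)
        if not chunk:
            break  # the server signalled EOF early
        chunks.append(chunk)
        remaining -= len(chunk)
    return None, b"".join(chunks)
```

On a Unix system the stream argument would typically be `sys.stdin.buffer` and the environ argument `os.environ`.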
2457 11.5. Stateless Processing
2461 The stateless nature of the Web makes each script execution and resource
2462 retrieval independent of all others even when multiple requests constitute a
2463 single conceptual Web transaction. Because of this, a script should not
2464 make any assumptions about the context of the user-agent submitting a
2465 request. In particular, scripts should examine data obtained from the client
2466 and verify that they are valid, both in form and content, before allowing
2467 them to be used for sensitive purposes such as input to other
2468 applications, commands, or operating system services. These uses
2469 include, but are not
2470 limited to: system call arguments, database writes, dynamically evaluated
2471 source code, and input to billing or other secure processes. It is important
2472 that applications be protected from invalid input regardless of whether
2473 the invalidity is the result of user error, logic error, or malicious action.
2476 Authors of scripts involved in multi-request transactions should be
2477 particularly cautious about validating the state information;
2478 undesirable effects may result from the substitution of dangerous
2479 values for portions of the submission which might otherwise be
2480 presumed safe. Subversion of this type occurs when alterations
2481 are made to data from a prior stage of the transaction that were
2482 not meant to be controlled by the client (<EM>e.g.</EM>, hidden
2483 HTML form elements, cookies, embedded URLs, <EM>etc.</EM>).
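One possible defence, which this specification does not prescribe, is to attach a keyed digest to state that is passed through the client and to verify it on the next request. The Python sketch below assumes a secret key known only to the script; the function names sign_state and verify_state and the "|" separator are hypothetical.

```python
import hashlib
import hmac


def sign_state(value: str, key: bytes) -> str:
    """Append an HMAC-SHA256 tag to a state value before sending it to the client."""
    tag = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}|{tag}"


def verify_state(token: str, key: bytes):
    """Return the original value if the tag matches, otherwise None."""
    value, sep, tag = token.rpartition("|")
    if not sep:
        return None
    expected = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("ascii")):
        return value
    return None
```

A verified value still needs the form and content checks described above; the tag only shows that the client did not alter it.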
2488 12. Acknowledgements
2492 This work is based on a draft published in 1997 by David R. Robinson,
2493 which in turn was based on the original CGI interface that arose out of
2494 discussions on the <EM>www-talk</EM> mailing list. In particular,
2495 Rob McCool, John Franks, Ari Luotonen,
2497 Tony Sanders deserve special recognition for their efforts in
2498 defining and implementing the early versions of this interface.
2501 This document has also greatly benefited from the comments and
2502 suggestions made by Chris Adie, Dave Kristol,
2503 Mike Meyer, David Morris, Jeremy Madea,
2504 Patrick M<SUP>c</SUP>Manus, Adam Donahue,
2505 Ross Patterson, and Harald Alvestrand.
2514 <DT><A NAME="[1]">[1]</A>
2516 <DD>Berners-Lee, T., 'Universal Resource Identifiers in WWW: A
2517 Unifying Syntax for the Expression of Names and Addresses of
2518 Objects on the Network as used in the World-Wide Web', RFC 1630,
2523 <DT><A NAME="[2]">[2]</A>
2525 <DD>Berners-Lee, T. and Connolly, D., 'Hypertext Markup Language -
2526 2.0', RFC 1866, MIT/W3C, November 1995.
2530 <DT><A NAME="[3]">[3]</A>
2532 <DD>Berners-Lee, T., Fielding, R. T. and Frystyk, H.,
2533 'Hypertext Transfer Protocol -- HTTP/1.0', RFC 1945, MIT/LCS,
2534 UC Irvine, May 1996.
2539 <DT><A NAME="[4]">[4]</A>
2541 <DD>Berners-Lee, T., Fielding, R., and Masinter, L., Editors,
2542 'Uniform Resource Identifiers (URI): Generic Syntax', RFC 2396,
2543 MIT, U.C. Irvine, Xerox Corporation, August 1998.
2548 <DT><A NAME="[5]">[5]</A>
2550 <DD>Braden, R., Editor, 'Requirements for Internet Hosts --
2551 Application and Support', STD 3, RFC 1123, IETF, October 1989.
2555 <DT><A NAME="[6]">[6]</A>
2557 <DD>Crocker, D.H., 'Standard for the Format of ARPA Internet Text
2558 Messages', STD 11, RFC 822, University of Delaware, August 1982.
2562 <DT><A NAME="[7]">[7]</A>
2564 <DD>Fielding, R., 'Relative Uniform Resource Locators', RFC 1808,
2565 UC Irvine, June 1995.
2569 <DT><A NAME="[8]">[8]</A>
2571 <DD>Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and
2572 Berners-Lee, T., 'Hypertext Transfer Protocol -- HTTP/1.1',
2573 RFC 2068, UC Irvine, DEC,
2574 MIT/LCS, January 1997.
2578 <DT><A NAME="[9]">[9]</A>
2580 <DD>Freed, N. and Borenstein, N., 'Multipurpose Internet Mail
2581 Extensions (MIME) Part Two: Media Types', RFC 2046, Innosoft,
2582 First Virtual, November 1996.
2586 <DT><A NAME="[10]">[10]</A>
2588 <DD>Mockapetris, P., 'Domain Names - Concepts and Facilities',
2589 STD 13, RFC 1034, ISI, November 1987.
2593 <DT><A NAME="[11]">[11]</A>
2595 <DD>St. Johns, M., 'Identification Protocol', RFC 1413, US
2596 Department of Defense, February 1993.
2600 <DT><A NAME="[12]">[12]</A>
2602 <DD>'Coded Character Set -- 7-bit American Standard Code for
2603 Information Interchange', ANSI X3.4-1986.
2607 <DT><A NAME="[13]">[13]</A>
2609 <DD>Hinden, R. and Deering, S.,
2610 'IP Version 6 Addressing Architecture', RFC 2373,
2611 Nokia, Cisco Systems,
2620 14. Authors' Addresses
2629 7824 Mayfaire Crest Lane, Suite 202
2631 Raleigh, NC 27615-4875
2636 Tel: +1 (919) 254.4237
2638 Fax: +1 (919) 254.5250
2642 HREF="mailto:Ken.Coar@Golux.Com"
2643 ><SAMP>Ken.Coar@Golux.Com</SAMP></A>
2652 Mount Pleasant House
2663 Tel: +44 (1223) 566926
2665 Fax: +44 (1223) 506288
2669 HREF="mailto:drtr@etrade.co.uk"
2670 ><SAMP>drtr@etrade.co.uk</SAMP></A>