\declaremodule{standard}{email.Message}
\modulesynopsis{The base class representing email messages.}

The central class in the \module{email} package is the
\class{Message} class; it is the base class for the \module{email}
object model.  \class{Message} provides the core functionality for
setting and querying header fields, and for accessing message bodies.

Conceptually, a \class{Message} object consists of \emph{headers} and
\emph{payloads}.  Headers are \rfc{2822} style field names and
values where the field name and value are separated by a colon.  The
colon is not part of either the field name or the field value.

Headers are stored and returned in case-preserving form but are
matched case-insensitively.  There may also be a single
\emph{Unix-From} header, also known as the envelope header or the
\code{From_} header.  The payload is either a string in the case of
simple message objects, a list of \class{Message} objects for
multipart MIME documents, or a single \class{Message} instance for
\mimetype{message/rfc822} type objects.

\class{Message} objects provide a mapping style interface for
accessing the message headers, and an explicit interface for accessing
both the headers and the payload.  They also provide convenience methods
for generating a flat text representation of the message object tree, for
accessing commonly used header parameters, and for recursively walking
over the object tree.

Here are the methods of the \class{Message} class:

\begin{classdesc}{Message}{}
The constructor takes no arguments.
\end{classdesc}

\begin{methoddesc}[Message]{as_string}{\optional{unixfrom}}
Return the entire formatted message as a string.  Optional
\var{unixfrom}, when true, causes the \emph{Unix-From} envelope
header to be included in the returned string; it defaults to 0.
\end{methoddesc}

\begin{methoddesc}[Message]{__str__}{}
Equivalent to \method{aMessage.as_string(unixfrom=1)}.
\end{methoddesc}

\begin{methoddesc}[Message]{is_multipart}{}
Return 1 if the message's payload is a list of sub-\class{Message}
objects, otherwise return 0.  When \method{is_multipart()} returns 0,
the payload should be either a string object or a single
\class{Message} instance.
\end{methoddesc}

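As a minimal sketch of the distinction that \method{is_multipart()}
draws, the following builds one scalar payload and one list payload
with \method{set_payload()}.  It assumes a current Python 3
interpreter in which the class is importable from
\module{email.message}; that import path is an assumption, since this
text names the module \module{email.Message}.  The results are wrapped
in \code{bool()} because the exact return type may differ between
versions.

```python
from email.message import Message  # assumed modern import path

# A simple message: the payload is a single string (a scalar).
leaf = Message()
leaf.set_payload('Just some text\n')

# A container: the payload is a list of sub-Message objects.
container = Message()
container.set_payload([leaf, Message()])

leaf_is_multipart = bool(leaf.is_multipart())
container_is_multipart = bool(container.is_multipart())
```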
\begin{methoddesc}[Message]{set_unixfrom}{unixfrom}
Set the \emph{Unix-From} header (a.k.a.\ the envelope header or
\code{From_} header) to \var{unixfrom}, which should be a string.
\end{methoddesc}

\begin{methoddesc}[Message]{get_unixfrom}{}
Return the \emph{Unix-From} header.  Defaults to \code{None} if the
\emph{Unix-From} header was never set.
\end{methoddesc}

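The interplay between the \emph{Unix-From} accessors and
\method{as_string()} can be sketched as follows.  This assumes a
current Python 3 interpreter with the class importable from
\module{email.message} (an assumption relative to this text); the
envelope line itself is a made-up illustration value.

```python
from email.message import Message  # assumed modern import path

msg = Message()
msg['Subject'] = 'Hello'
msg.set_payload('Body text\n')

# No envelope header has been set yet.
initial_unixfrom = msg.get_unixfrom()

# The address and date below are made-up illustration values.
msg.set_unixfrom('From sender@example.com Sat Jan  1 00:00:00 2000')

without_envelope = msg.as_string()            # unixfrom defaults to false
with_envelope = msg.as_string(unixfrom=True)  # envelope line comes first
```

Only the second string starts with the envelope line; the headers and
body are the same in both.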
\begin{methoddesc}[Message]{add_payload}{payload}
Add \var{payload} to the message object's existing payload.  If, prior
to calling this method, the object's payload was \code{None}
(i.e. never before set), then after this method is called, the payload
will be the argument \var{payload}.

If the object's payload was already a list
(i.e. \method{is_multipart()} returns 1), then \var{payload} is
appended to the end of the existing payload list.

For any other type of existing payload, \method{add_payload()} will
replace the payload with a list consisting of the old payload
and \var{payload}, but only if the document is already a MIME
multipart document.  This condition is satisfied if the main type of
the message's \mailheader{Content-Type} header is
\mimetype{multipart}, or if there is no \mailheader{Content-Type}
header.  In any other situation,
\exception{MultipartConversionError} is raised.
\end{methoddesc}

\begin{methoddesc}[Message]{attach}{payload}
Synonymous with \method{add_payload()}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_payload}{\optional{i\optional{, decode}}}
Return the current payload, which will be a list of \class{Message}
objects when \method{is_multipart()} returns 1, or a scalar (either a
string or a single \class{Message} instance) when
\method{is_multipart()} returns 0.

With optional \var{i}, \method{get_payload()} will return the
\var{i}-th element of the payload, counting from zero, if
\method{is_multipart()} returns 1.  An \exception{IndexError} will be raised
if \var{i} is less than 0 or greater than or equal to the number of
items in the payload.  If the payload is scalar
(i.e. \method{is_multipart()} returns 0) and \var{i} is given, a
\exception{TypeError} is raised.

Optional \var{decode} is a flag indicating whether the payload should be
decoded or not, according to the \mailheader{Content-Transfer-Encoding} header.
When true and the message is not a multipart, the payload will be
decoded if this header's value is \samp{quoted-printable} or
\samp{base64}.  If some other encoding is used, or if the
\mailheader{Content-Transfer-Encoding} header is
missing, the payload is returned as-is (undecoded).  If the message is
a multipart and the \var{decode} flag is true, then \code{None} is
returned.
\end{methoddesc}

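The three modes of \method{get_payload()} described above (plain,
indexed, and decoding) can be sketched as follows.  The example
assumes a current Python 3 interpreter with the class importable from
\module{email.message}; in that environment the decoded payload is
expected to come back as a bytes object, which is an assumption beyond
what this text states.

```python
import base64
from email.message import Message  # assumed modern import path

# A scalar payload carrying base64-encoded data.
encoded = Message()
encoded['Content-Transfer-Encoding'] = 'base64'
encoded.set_payload(base64.b64encode(b'hello world').decode('ascii'))

raw = encoded.get_payload()                 # returned as-is
decoded = encoded.get_payload(decode=True)  # decoded per the header

# A list payload: individual parts are selected by index.
first, second = Message(), Message()
container = Message()
container.set_payload([first, second])

selected = container.get_payload(1)
multipart_decoded = container.get_payload(decode=True)  # None for multiparts

# Indexing a scalar payload is an error.
try:
    encoded.get_payload(0)
    scalar_index_error = None
except TypeError as exc:
    scalar_index_error = exc
```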
\begin{methoddesc}[Message]{set_payload}{payload}
Set the entire message object's payload to \var{payload}.  It is the
client's responsibility to maintain the payload invariants.
\end{methoddesc}

The following methods implement a mapping-like interface for accessing
the message object's \rfc{2822} headers.  Note that there are some
semantic differences between these methods and a normal mapping
(i.e. dictionary) interface.  For example, in a dictionary there are
no duplicate keys, but here there may be duplicate message headers.  Also,
in dictionaries there is no guaranteed order to the keys returned by
\method{keys()}, but in a \class{Message} object, there is an explicit
order.  These semantic differences are intentional and are biased
toward maximal convenience.

Note that in all cases, any optional \emph{Unix-From} header the message
may have is not included in the mapping interface.

\begin{methoddesc}[Message]{__len__}{}
Return the total number of headers, including duplicates.
\end{methoddesc}

\begin{methoddesc}[Message]{__contains__}{name}
Return true if the message object has a field named \var{name}.
Matching is done case-insensitively and \var{name} should not include the
trailing colon.  Used for the \code{in} operator,
e.g.:

\begin{verbatim}
if 'message-id' in myMessage:
    print 'Message-ID:', myMessage['message-id']
\end{verbatim}
\end{methoddesc}

\begin{methoddesc}[Message]{__getitem__}{name}
Return the value of the named header field.  \var{name} should not
include the colon field separator.  If the header is missing,
\code{None} is returned; a \exception{KeyError} is never raised.

Note that if the named field appears more than once in the message's
headers, exactly which of those field values will be returned is
undefined.  Use the \method{get_all()} method to get the values of all
the extant named headers.
\end{methoddesc}

\begin{methoddesc}[Message]{__setitem__}{name, val}
Add a header to the message with field name \var{name} and value
\var{val}.  The field is appended to the end of the message's existing
fields.

Note that this does \emph{not} overwrite or delete any existing header
with the same name.  If you want to ensure that the new header is the
only one present in the message with field name
\var{name}, first use \method{__delitem__()} to delete all named
fields, e.g.:

\begin{verbatim}
del msg['subject']
msg['subject'] = 'Python roolz!'
\end{verbatim}
\end{methoddesc}

\begin{methoddesc}[Message]{__delitem__}{name}
Delete all occurrences of the field with name \var{name} from the
message's headers.  No exception is raised if the named field isn't
present in the headers.
\end{methoddesc}

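The ways in which this interface departs from an ordinary dictionary
can be seen together in one short sketch.  It assumes a current
Python 3 interpreter with the class importable from
\module{email.message} (an assumption relative to this text); the
header name \code{X-No-Such-Header} is made up for illustration.

```python
from email.message import Message  # assumed modern import path

msg = Message()
msg['Subject'] = 'first'
msg['Subject'] = 'second'   # appended; the first header is kept

count_with_duplicates = len(msg)
has_subject = 'SUBJECT' in msg           # matching ignores case
missing_value = msg['X-No-Such-Header']  # None rather than a KeyError

# Replace rather than append: delete every occurrence first.
del msg['subject']
msg['Subject'] = 'only one'

del msg['X-No-Such-Header']  # deleting an absent field raises nothing
final_count = len(msg)
final_subject = msg['Subject']
```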
\begin{methoddesc}[Message]{has_key}{name}
Return 1 if the message contains a header field named \var{name},
otherwise return 0.
\end{methoddesc}

\begin{methoddesc}[Message]{keys}{}
Return a list of all the message's header field names.  These keys
will be listed in the order in which they were added to the message
via \method{__setitem__()}, and may contain duplicates.  Any fields
deleted and then subsequently re-added are always appended to the end
of the header list.
\end{methoddesc}

\begin{methoddesc}[Message]{values}{}
Return a list of all the message's field values.  These will be listed
in the order in which they were added to the message via
\method{__setitem__()}, and may contain duplicates.  Any fields
deleted and then subsequently re-added are always appended to the end
of the header list.
\end{methoddesc}

\begin{methoddesc}[Message]{items}{}
Return a list of 2-tuples containing all the message's field names and
values.  These will be listed in the order in which they were added to
the message via \method{__setitem__()}, and may contain duplicates.
Any fields deleted and then subsequently re-added are always appended
to the end of the header list.
\end{methoddesc}

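The ordering rules shared by \method{keys()}, \method{values()} and
\method{items()} can be sketched as follows, assuming a current
Python 3 interpreter with the class importable from
\module{email.message} (an assumption relative to this text).  The
addresses are made-up illustration values.

```python
from email.message import Message  # assumed modern import path

msg = Message()
msg['To'] = 'alice@example.com'
msg['From'] = 'bob@example.com'
msg['To'] = 'carol@example.com'   # duplicate field name

names = list(msg.keys())
values = list(msg.values())
pairs = list(msg.items())

# Deleting and re-adding a field moves it to the end of the list.
del msg['From']
msg['From'] = 'bob@example.com'
names_after_readd = list(msg.keys())
```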
\begin{methoddesc}[Message]{get}{name\optional{, failobj}}
Return the value of the named header field.  This is identical to
\method{__getitem__()} except that optional \var{failobj} is returned
if the named header is missing (defaults to \code{None}).
\end{methoddesc}

Here are some additional useful methods:

\begin{methoddesc}[Message]{get_all}{name\optional{, failobj}}
Return a list of all the values for the field named \var{name}.  These
will be listed in the order in which they were added to the message
via \method{__setitem__()}.  Any fields
deleted and then subsequently re-added are always appended to the end
of the list.

If there are no such named headers in the message, \var{failobj} is
returned (defaults to \code{None}).
\end{methoddesc}

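The difference between \method{get()} and \method{get_all()} matters
mostly for headers that repeat.  The following is a small sketch,
assuming a current Python 3 interpreter with the class importable from
\module{email.message} (an assumption relative to this text); the host
names and the \code{X-Priority} header are illustration values.

```python
from email.message import Message  # assumed modern import path

msg = Message()
msg['Received'] = 'from a.example.com'
msg['Received'] = 'from b.example.com'

one_value = msg.get('Received')             # a single value only
all_values = msg.get_all('received')        # every value, in order added
fallback = msg.get('X-Priority', 'normal')  # failobj for a missing header
no_values = msg.get_all('X-Priority', [])   # failobj works here too
default_failobj = msg.get_all('X-Priority')
```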
\begin{methoddesc}[Message]{add_header}{_name, _value, **_params}
Extended header setting.  This method is similar to
\method{__setitem__()} except that additional header parameters can be
provided as keyword arguments.  \var{_name} is the header to set and
\var{_value} is the \emph{primary} value for the header.

For each item in the keyword argument dictionary \var{_params}, the
key is taken as the parameter name, with underscores converted to
dashes (since dashes are illegal in Python identifiers).  Normally,
the parameter will be added as \code{key="value"} unless the value is
\code{None}, in which case only the key will be added.

Here's an example:

\begin{verbatim}
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
\end{verbatim}

This will add a header that looks like

\begin{verbatim}
Content-Disposition: attachment; filename="bud.gif"
\end{verbatim}
\end{methoddesc}

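The underscore conversion and the special handling of \code{None} can
be sketched in the same way.  This assumes a current Python 3
interpreter with the class importable from \module{email.message} (an
assumption relative to this text); the header \code{X-Demo} and the
parameter names \code{file_name} and \code{read_only} are hypothetical
and chosen only to exercise both rules.

```python
from email.message import Message  # assumed modern import path

msg = Message()

# file_name and read_only are made-up parameter names for illustration.
msg.add_header('X-Demo', 'primary', file_name='bud.gif', read_only=None)

header_value = msg['X-Demo']
```

The stored value should begin with the primary value, carry
\code{file-name} as a quoted \code{key="value"} pair, and carry
\code{read-only} as a bare key.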
\begin{methoddesc}[Message]{get_type}{\optional{failobj}}
Return the message's content type, as a string of the form
\mimetype{maintype/subtype} as taken from the
\mailheader{Content-Type} header.
The returned string is coerced to lowercase.

If there is no \mailheader{Content-Type} header in the message,
\var{failobj} is returned (defaults to \code{None}).
\end{methoddesc}

\begin{methoddesc}[Message]{get_main_type}{\optional{failobj}}
Return the message's \emph{main} content type.  This essentially returns the
\var{maintype} part of the string returned by \method{get_type()}, with the
same semantics for \var{failobj}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_subtype}{\optional{failobj}}
Return the message's sub-content type.  This essentially returns the
\var{subtype} part of the string returned by \method{get_type()}, with the
same semantics for \var{failobj}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_params}{\optional{failobj\optional{, header}}}
Return the message's \mailheader{Content-Type} parameters, as a list.  The
elements of the returned list are 2-tuples of key/value pairs, as
split on the \character{=} sign.  The left hand side of the
\character{=} is the key, while the right hand side is the value.  If
there is no \character{=} sign in the parameter, the value is the empty
string.  The value is always unquoted with \method{Utils.unquote()}.

Optional \var{failobj} is the object to return if there is no
\mailheader{Content-Type} header.  Optional \var{header} is the header to
search instead of \mailheader{Content-Type}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_param}{param\optional{,
    failobj\optional{, header}}}
Return the value of the \mailheader{Content-Type} header's parameter
\var{param} as a string.  If the message has no \mailheader{Content-Type}
header or if there is no such parameter, then \var{failobj} is
returned (defaults to \code{None}).

Optional \var{header}, if given, specifies the message header to use
instead of \mailheader{Content-Type}.
\end{methoddesc}

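A short sketch of both parameter accessors follows.  It assumes a
current Python 3 interpreter with the class importable from
\module{email.message} (an assumption relative to this text); the
header values are illustration data.

```python
from email.message import Message  # assumed modern import path

msg = Message()
msg['Content-Type'] = 'text/plain; charset="us-ascii"; format=flowed'
msg['Content-Disposition'] = 'attachment; filename="bud.gif"'

params = msg.get_params()                   # list of (key, value) 2-tuples
charset = msg.get_param('charset')          # quotes are stripped
missing = msg.get_param('boundary', 'n/a')  # failobj: no such parameter
filename = msg.get_param('filename', header='Content-Disposition')

bare = Message()                            # no Content-Type header at all
no_header = bare.get_params('n/a')          # failobj: no such header
```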
\begin{methoddesc}[Message]{get_charsets}{\optional{failobj}}
Return a list containing the character set names in the message.  If
the message is a \mimetype{multipart}, then the list will contain one
element for each subpart in the payload; otherwise, it will be a list
of length 1.

Each item in the list will be a string which is the value of the
\code{charset} parameter in the \mailheader{Content-Type} header for the
represented subpart.  However, if the subpart has no
\mailheader{Content-Type} header, no \code{charset} parameter, or is not of
the \mimetype{text} main MIME type, then that item in the returned list
will be \var{failobj}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_filename}{\optional{failobj}}
Return the value of the \code{filename} parameter of the
\mailheader{Content-Disposition} header of the message, or \var{failobj} if
the header is either missing or has no \code{filename} parameter.
The returned string will always be unquoted as per
\method{Utils.unquote()}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_boundary}{\optional{failobj}}
Return the value of the \code{boundary} parameter of the
\mailheader{Content-Type} header of the message, or \var{failobj} if
the header is either missing or has no \code{boundary} parameter.  The
returned string will always be unquoted as per
\method{Utils.unquote()}.
\end{methoddesc}

\begin{methoddesc}[Message]{set_boundary}{boundary}
Set the \code{boundary} parameter of the \mailheader{Content-Type} header
to \var{boundary}.  \method{set_boundary()} will always quote
\var{boundary}, so you should not quote it yourself.  A
\exception{HeaderParseError} is raised if the message object has no
\mailheader{Content-Type} header.

Note that using this method is subtly different from deleting the old
\mailheader{Content-Type} header and adding a new one with the new boundary
via \method{add_header()}, because \method{set_boundary()} preserves the
order of the \mailheader{Content-Type} header in the list of headers.
However, it does \emph{not} preserve any continuation lines which may
have been present in the original \mailheader{Content-Type} header.
\end{methoddesc}

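The order-preserving behaviour of \method{set_boundary()}, and the
error raised when there is no \mailheader{Content-Type} header, can be
sketched as follows.  The example assumes a current Python 3
interpreter, where the class is importable from \module{email.message}
and the exception from \module{email.errors}; both import paths are
assumptions relative to this text.  The boundary strings are
illustration values.

```python
from email import errors            # assumed modern location of the exception
from email.message import Message   # assumed modern import path

msg = Message()
msg['MIME-Version'] = '1.0'
msg['Content-Type'] = 'multipart/mixed; boundary="OLD"'
msg['Subject'] = 'demo'

old_boundary = msg.get_boundary()   # returned without the quotes
msg.set_boundary('NEW')             # pass the bare value; quoting is automatic
new_boundary = msg.get_boundary()
order_after = list(msg.keys())      # Content-Type keeps its position

# Without a Content-Type header there is nothing to update.
bare = Message()
try:
    bare.set_boundary('NEW')
    raised = False
except errors.HeaderParseError:
    raised = True

no_boundary = bare.get_boundary('n/a')  # failobj: header is missing
```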
\begin{methoddesc}[Message]{walk}{}
The \method{walk()} method is an all-purpose generator which can be
used to iterate over all the parts and subparts of a message object
tree, in depth-first traversal order.  You will typically use
\method{walk()} as the iterator in a \code{for ... in} loop; each
iteration returns the next subpart.

Here's an example that prints the MIME type of every part of a message
object tree:

\begin{verbatim}
>>> for part in msg.walk():
...     print part.get_type('text/plain')
multipart/report
text/plain
message/delivery-status
text/plain
text/plain
message/rfc822
\end{verbatim}
\end{methoddesc}

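To make the depth-first order concrete, the following sketch builds a
small nested tree by hand and records the order in which
\method{walk()} visits it.  It assumes a current Python 3 interpreter
with the class importable from \module{email.message} (an assumption
relative to this text).  \code{make_part} is a hypothetical helper
defined only for this example, and the payload strings are
placeholders.

```python
from email.message import Message  # assumed modern import path


def make_part(content_type, payload):
    """Hypothetical helper: build a Message with a type and a payload."""
    part = Message()
    part['Content-Type'] = content_type
    part.set_payload(payload)
    return part


text = make_part('text/plain', 'plain body\n')
html = make_part('text/html', '<p>html body</p>\n')
alternative = make_part('multipart/alternative', [text, html])
image = make_part('image/gif', 'placeholder, not real image data')
root = make_part('multipart/mixed', [alternative, image])

# The container is visited first, then each subtree in payload order.
visited = [part['Content-Type'] for part in root.walk()]
```

The nested \mimetype{multipart/alternative} container and both of its
children are visited before the sibling \mimetype{image/gif} part.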
\class{Message} objects can also optionally contain two instance
attributes, which can be used when generating the plain text of a MIME
message.

\begin{datadesc}{preamble}
The format of a MIME document allows for some text between the blank
line following the headers, and the first multipart boundary string.
Normally, this text is never visible in a MIME-aware mail reader
because it falls outside the standard MIME armor.  However, when
viewing the raw text of the message, or when viewing the message in a
non-MIME aware reader, this text can become visible.

The \var{preamble} attribute contains this leading extra-armor text
for MIME documents.  When the \class{Parser} discovers some text after
the headers but before the first boundary string, it assigns this text
to the message's \var{preamble} attribute.  When the \class{Generator}
is writing out the plain text representation of a MIME message, and it
finds the message has a \var{preamble} attribute, it will write this
text in the area between the headers and the first boundary.

Note that if the message object has no preamble, the
\var{preamble} attribute will be \code{None}.
\end{datadesc}

\begin{datadesc}{epilogue}
The \var{epilogue} attribute acts the same way as the \var{preamble}
attribute, except that it contains text that appears between the last
boundary and the end of the message.

One note: when generating the flat text for a \mimetype{multipart}
message that has no \var{epilogue} (using the standard
\class{Generator} class), no newline is added after the closing
boundary line.  If the message object has an \var{epilogue} and its
value does not start with a newline, a newline is printed after the
closing boundary.  This seems a little clumsy, but it makes the most
practical sense.  The upshot is that if you want to ensure that a
newline gets printed after your closing \mimetype{multipart} boundary,
set the \var{epilogue} to the empty string.
\end{datadesc}
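
Where the two attributes end up in the flat text can be sketched as
follows.  The example assumes a current Python 3 interpreter with the
class importable from \module{email.message} (an assumption relative
to this text), and it checks only the relative position of the texts;
the exact newline handling around the closing boundary may differ
between versions.  The boundary string and the texts are illustration
values.

```python
from email.message import Message  # assumed modern import path

part = Message()
part['Content-Type'] = 'text/plain'
part.set_payload('body\n')

msg = Message()
msg['Content-Type'] = 'multipart/mixed; boundary="BOUNDARY"'
msg.set_payload([part])

defaults = (msg.preamble, msg.epilogue)  # both None until assigned

msg.preamble = 'Visible only to non-MIME readers.\n'
msg.epilogue = 'Trailing text after the last boundary.\n'
flat = msg.as_string()

# Locate the boundary lines so the positions can be compared.
first_boundary = flat.index('--BOUNDARY')
closing_boundary = flat.index('--BOUNDARY--')
preamble_pos = flat.index('Visible only')
epilogue_pos = flat.index('Trailing text')
```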