\section{Standard Module \module{rfc822}}

This module defines a class, \class{Message}, which represents a
collection of ``email headers'' as defined by the Internet standard
\rfc{822}. It is used in various contexts, usually to read such
headers from a file.

Note that there's a separate module to read \UNIX{}, MH, and MMDF
style mailbox files: \module{mailbox}\refstmodindex{mailbox}.

\begin{classdesc}{Message}{file\optional{, seekable}}
A \class{Message} instance is instantiated with an open file object as
parameter. The optional \var{seekable} parameter indicates whether the
file object is seekable; the default value is \code{1} for true.
Instantiation reads headers from the file up to a blank line and
stores them in the instance; after instantiation, the file is
positioned directly after the blank line that terminates the headers.

Input lines as read from the file may either be terminated by CR-LF or
by a single linefeed; a terminating CR-LF is replaced by a single
linefeed before the line is stored.

All header matching is done independent of upper or lower case;
e.g. \code{\var{m}['From']}, \code{\var{m}['from']} and
\code{\var{m}['FROM']} all yield the same result.
\end{classdesc}
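
The reading behaviour described for \class{Message} can be sketched in
a few lines. This is only an illustration, not the module's actual
code: the helper names \function{read_headers()} and
\function{lookup()} are hypothetical, and continuation lines are
ignored for brevity.

```python
import io

def read_headers(fp):
    """Hypothetical helper: collect header lines up to the blank line."""
    headers = []
    while True:
        line = fp.readline()
        if not line:
            break                      # end of file before a blank line
        if line.endswith('\r\n'):
            line = line[:-2] + '\n'    # CR-LF is stored as a single linefeed
        if line == '\n':
            break                      # blank line terminates the headers
        headers.append(line)
    return headers

def lookup(headers, name):
    """Hypothetical helper: case-insensitive match on the header name."""
    prefix = name.lower() + ':'
    for line in headers:
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return None

fp = io.StringIO('From: jack@cwi.nl\r\nSubject: test\r\n\r\nbody text\n')
headers = read_headers(fp)
rest = fp.read()   # the file is now positioned at the start of the body
```

After the call, \code{headers} holds the two header lines with plain
linefeeds, \code{rest} is the body, and \code{lookup(headers, 'FROM')}
and \code{lookup(headers, 'from')} give the same value.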

\begin{funcdesc}{parsedate}{date}
Attempts to parse a date according to the rules in \rfc{822}.
However, some mailers don't follow that format as specified, so
\function{parsedate()} tries to guess correctly in such cases.
\var{date} is a string containing an \rfc{822} date, such as
\code{'Mon, 20 Nov 1995 19:12:08 -0500'}. If it succeeds in parsing
the date, \function{parsedate()} returns a 9-tuple that can be passed
directly to \function{time.mktime()}; otherwise \code{None} will be
returned.
\end{funcdesc}

\begin{funcdesc}{parsedate_tz}{date}
Performs the same function as \function{parsedate()}, but returns
either \code{None} or a 10-tuple; the first 9 elements make up a tuple
that can be passed directly to \function{time.mktime()}, and the tenth
is the offset of the date's timezone from UTC (which is the official
term for Greenwich Mean Time). (Note that the sign of the timezone
offset is the opposite of the sign of the \code{time.timezone}
variable for the same timezone; the latter variable follows the
\POSIX{} standard while this module follows \rfc{822}.) If the input
string has no timezone, the last element of the tuple returned is
\code{None}.
\end{funcdesc}

\begin{funcdesc}{mktime_tz}{tuple}
Turn a 10-tuple as returned by \function{parsedate_tz()} into a UTC
timestamp. If the timezone item in the tuple is \code{None}, assume
local time. Minor deficiency: this first interprets the first 8
elements as a local time and then compensates for the timezone
difference; this may yield a slight error around daylight saving time
switch dates. Not enough to worry about for common use.
\end{funcdesc}
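
The three date functions can be tried out together. The sketch below
rests on one assumption: it imports the functions of the same names
from \module{email.utils}, because \module{rfc822} itself is not
available in Python 3. The two may differ in details, for example in
how a missing timezone is reported.

```python
from datetime import datetime, timezone
# Assumption: email.utils stands in for rfc822, which Python 3 lacks.
from email.utils import parsedate, parsedate_tz, mktime_tz

date = 'Mon, 20 Nov 1995 19:12:08 -0500'

nine = parsedate(date)        # 9-tuple, year through second come first
ten = parsedate_tz(date)      # same fields plus the UTC offset
offset = ten[9]               # -18000 seconds, i.e. five hours west of UTC

timestamp = mktime_tz(ten)    # seconds since the epoch, in UTC
utc = datetime.fromtimestamp(timestamp, tz=timezone.utc)
# utc is 1995-11-21 00:12:08+00:00

bad = parsedate('not a date') # unparsable input gives None
```

The negative offset for \code{-0500} shows the \rfc{822} sign
convention, which is the opposite of \code{time.timezone}.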

\subsection{Message Objects}
\label{message-objects}

A \class{Message} instance has the following methods:

\begin{methoddesc}{rewindbody}{}
Seek to the start of the message body. This only works if the file
object is seekable.
\end{methoddesc}

\begin{methoddesc}{getallmatchingheaders}{name}
Return a list of lines consisting of all headers matching
\var{name}, if any. Each physical line, whether it is a continuation
line or not, is a separate list item. Return the empty list if no
header matches \var{name}.
\end{methoddesc}

\begin{methoddesc}{getfirstmatchingheader}{name}
Return a list of lines comprising the first header matching
\var{name}, and its continuation line(s), if any. Return \code{None}
if there is no header matching \var{name}.
\end{methoddesc}

\begin{methoddesc}{getrawheader}{name}
Return a single string consisting of the text after the colon in the
first header matching \var{name}. This includes leading whitespace,
the trailing linefeed, and internal linefeeds and whitespace if
any continuation line(s) were present. Return \code{None} if there is
no header matching \var{name}.
\end{methoddesc}

\begin{methoddesc}{getheader}{name}
Like \code{getrawheader(\var{name})}, but strip leading and trailing
whitespace. Internal whitespace is not stripped.
\end{methoddesc}
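
The relation between the raw and the stripped form of a header is
easiest to see on a header with a continuation line. The functions
below are hypothetical stand-ins written for this illustration. They
work on a plain list of header lines and are not the methods'
actual implementation.

```python
def first_matching_header(headers, name):
    """Lines of the first header matching name, or None (sketch)."""
    prefix = name.lower() + ':'
    found = []
    for line in headers:
        if found:
            if line[:1] in (' ', '\t'):
                found.append(line)     # continuation line of the match
                continue
            break                      # next header starts, stop here
        if line.lower().startswith(prefix):
            found.append(line)
    return found or None

def raw_header(headers, name):
    """Text after the colon, whitespace and linefeeds kept (sketch)."""
    lines = first_matching_header(headers, name)
    if lines is None:
        return None
    return lines[0][len(name) + 1:] + ''.join(lines[1:])

def header(headers, name):
    """Like raw_header(), with leading and trailing whitespace removed."""
    raw = raw_header(headers, name)
    return None if raw is None else raw.strip()

demo = ['Subject: hello\n', ' world\n', 'From: jack@cwi.nl\n']
raw = raw_header(demo, 'Subject')    # ' hello\n world\n'
clean = header(demo, 'Subject')      # 'hello\n world'
```

Note that the internal linefeed and the space from the continuation
line survive in \code{clean}; only the ends are stripped.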

\begin{methoddesc}{getaddr}{name}
Return a pair \code{(\var{full name}, \var{email address})} parsed
from the string returned by \code{getheader(\var{name})}. If no
header matching \var{name} exists, return \code{(None, None)};
otherwise both the full name and the address are (possibly empty)
strings.

Example: If \var{m}'s first \code{From} header contains the string
\code{'jack@cwi.nl (Jack Jansen)'}, then
\code{m.getaddr('From')} will yield the pair
\code{('Jack Jansen', 'jack@cwi.nl')}.
If the header contained
\code{'Jack Jansen <jack@cwi.nl>'} instead, it would yield the
same result.
\end{methoddesc}

\begin{methoddesc}{getaddrlist}{name}
This is similar to \code{getaddr(\var{list})}, but parses a header
containing a list of email addresses (e.g. a \code{To} header) and
returns a list of \code{(\var{full name}, \var{email address})} pairs
(even if there was only one address in the header). If there is no
header matching \var{name}, return an empty list.

XXX The current version of this function is not really correct. It
yields bogus results if a full name contains a comma.
\end{methoddesc}
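
The two address forms mentioned for \method{getaddr()} can be compared
directly. As an assumption made for this sketch, the parsing is done
with \function{parseaddr()} and \function{getaddresses()} from
\module{email.utils}, which take a header value as a string rather
than a header name. They are used here only as a stand-in that runs on
Python 3 and may not match the methods above in every detail.

```python
# Assumption: email.utils stands in for the Message methods.
from email.utils import parseaddr, getaddresses

# Both spellings of the same sender give the same (name, address) pair.
pair_a = parseaddr('jack@cwi.nl (Jack Jansen)')
pair_b = parseaddr('Jack Jansen <jack@cwi.nl>')

# A header value holding several addresses gives a list of pairs.
to_value = 'Jack Jansen <jack@cwi.nl>, Guido van Rossum <guido@cwi.nl>'
pairs = getaddresses([to_value])
```

Here \code{pair_a} and \code{pair_b} are both
\code{('Jack Jansen', 'jack@cwi.nl')}, and \code{pairs} has one entry
per address.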

\begin{methoddesc}{getdate}{name}
Retrieve a header using \method{getheader()} and parse it into a 9-tuple
compatible with \function{time.mktime()}. If there is no header matching
\var{name}, or it is unparsable, return \code{None}.

Date parsing appears to be a black art, and not all mailers adhere to
the standard. While this function has been tested and found correct on
a large collection of email from many sources, it is still possible
that it may occasionally yield an incorrect result.
\end{methoddesc}

\begin{methoddesc}{getdate_tz}{name}
Retrieve a header using \method{getheader()} and parse it into a
10-tuple; the first 9 elements will make a tuple compatible with
\function{time.mktime()}, and the 10th is a number giving the offset
of the date's timezone from UTC. Similarly to \method{getdate()}, if
there is no header matching \var{name}, or it is unparsable, return
\code{None}.
\end{methoddesc}

\class{Message} instances also support a read-only mapping interface.
In particular: \code{\var{m}[name]} is like
\code{\var{m}.getheader(name)} but raises \exception{KeyError} if
there is no matching header; and \code{len(\var{m})},
\code{\var{m}.has_key(name)}, \code{\var{m}.keys()},
\code{\var{m}.values()} and \code{\var{m}.items()} act as expected.

Finally, \class{Message} instances have two public instance variables:

\begin{memberdesc}{headers}
A list containing the entire set of header lines, in the order in
which they were read. Each line contains a trailing newline. The
blank line terminating the headers is not contained in the list.
\end{memberdesc}

\begin{memberdesc}{fp}
The file object passed at instantiation time.
\end{memberdesc}