external/ibm-public/postfix/dist/implementation-notes/MILTER

   1 Distribution of Milter responsibility
   2 =====================================
   3
   4 Milters look at the SMTP commands as well as the message content.
   5 In Postfix these are handled by different processes:
   6
   7 - smtpd(8) (the SMTP server) focuses on the SMTP commands, strips
   8   the SMTP encapsulation, and passes envelope information and message
   9   content to the cleanup server.
  10
  11 - the cleanup(8) server parses the message content (it understands
  12   headers, body, and MIME structure), and creates a queue file with
  13   envelope and content information. The cleanup server adds additional
  14   envelope records, such as when to send a "delayed mail" notice.
  15
  16 If we want to support message modifications (add/delete recipient,
  17 add/delete/replace header, replace body) then it pretty much has
  18 to be implemented in the cleanup server, if we want to avoid extra
  19 temporary files.
  20
  21 Network versus local submission
  22 ===============================
  23
  24 As of Sendmail 8.12, all mail is received via SMTP, so all mail is
  25 subject to Miltering (local submissions are queued in a submission
  26 queue and then delivered via SMTP to the main MTA, or appended to
  27 $HOME/dead.letter). In Postfix, local submissions are received by
  28 the pickup server, which feeds the mail into the cleanup server
  29 after doing basic sanity checks.
  30
  31 How do we set up the Milters with SMTP mail versus local submissions?
  32
  33 - SMTP mail: smtpd creates Milter contexts, and sends them, including
  34   their sockets, to the cleanup server. The smtpd is responsible
  35   for sending the Milter abort and close messages. Both smtpd and
  36   cleanup are responsible for closing their Milter socket. Since
  37   smtpd and cleanup inspect mail at different times, there is no
  38   conflict with access to the Milter socket.
  39
  40 - Local submission: the cleanup server creates Milter contexts.
  41   The cleanup server provides dummy connect and helo information,
  42   or perhaps none at all, and provides sender and recipient events.
  43   The cleanup server is responsible for sending the Milter abort
  44   and close messages, and for closing the Milter socket.
  45
  46 A special case of local submission is "sendmail -t". This creates
  47 a record stream in which recipients appear after content. However,
  48 Milters expect to receive envelope information before content, not
  49 after.  This is not a problem: just like a queue manager, the
  50 cleanup-side Milter client can jump around through the queue file
  51 and send the information to the Milter in the expected order.
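
The reordering can be illustrated with a minimal Python sketch. The
record type letters and the list-of-tuples layout are hypothetical
stand-ins for illustration; they are not the Postfix queue file
format, and a real client would seek within the file instead of
scanning a list twice.

```python
# Hypothetical record types; illustrative labels only.
SENDER, RECIPIENT, CONTENT = "S", "R", "N"

def milter_order(records):
    """Return records with envelope records ahead of content records.

    Two passes over the same stream: the first picks up envelope
    records, the second picks up content. Order within each group
    is preserved.
    """
    envelope = [r for r in records if r[0] in (SENDER, RECIPIENT)]
    content = [r for r in records if r[0] == CONTENT]
    return envelope + content

# A "sendmail -t" style stream: the recipient was extracted from the
# message header, so its record appears after the content.
stream = [
    (SENDER, "alice@example.com"),
    (CONTENT, "To: bob@example.com"),
    (CONTENT, ""),
    (CONTENT, "hello"),
    (RECIPIENT, "bob@example.com"),
]
ordered = milter_order(stream)
```

In this sketch the Milter would see the sender, then the recipient,
then the three content records.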
  52
  53 Interaction with XCLIENT, "postsuper -r", and external content filters
  54 ======================================================================
  55
  56 Milter applications expect that the MTA supplies context information
  57 in the form of Sendmail-like macros (j=hostname, {client_name}=the
  58 SMTP client hostname, etc.). Not all these macros have a Postfix
  59 equivalent. Postfix 2.3 makes a subset available.
  60
  61 If Postfix does not implement a specific macro, people can usually
  62 work around it. But we should avoid inconsistency. If Postfix can
  63 make macro X available at Milter protocol stage Y, then it must
  64 also be able to make that macro available at all later Milter
  65 protocol stages, even when some of those stages are handled by a
  66 different Postfix process.
  67
  68 Thus, when adding Milter support for a specific Sendmail-like macro
  69 to the SMTP server:
  70
  71 - We may have to update the XCLIENT protocol, so that Milter
  72   applications can be tested with XCLIENT. If not, then we must
  73   prominently document everywhere that XCLIENT does not provide
   74   a 100% accurate simulation for Milters. An additional complication
  75   is that the SMTP command length is limited, and that each XCLIENT
  76   command resets the SMTP server to the 220 stage and generates
  77   "connect" events for anvil(8) and for Milters.
  78
  79 - The SMTP server has to send the corresponding attribute to the
  80   cleanup server.  The cleanup server then stores the attribute in
  81   the queue file, so that Milters produce consistent results when
  82   mail is re-queued with "postsuper -r".
  83
  84 But wait, there is more. If mail is filtered by an external content
  85 filter, then it needs to preserve all the Milter attributes so that
  86 after "postsuper -r", Milters produce the exact same result as when
  87 mail was received originally by Postfix. Specifically, after
  88 "postsuper -r" a signing Milter must not sign mail that it did not
  89 sign on the first pass through Postfix, and it must not reject mail
  90 that it accepted on the first pass through Postfix.
  91
  92 Instead of trying to re-create the Milter execution environment
  93 after "postsuper -r" we simply disable Milter processing. The
  94 rationale for this is: if mail was Miltered before it was written
   95 to the queue file, then there is no need to Milter it again.
  96
  97 We might want to take a similar approach with external (signing or
  98 blocking) content filters: don't filter mail that has already been
  99 filtered, and don't filter mail that didn't need to be filtered.
 100 Such mail can be recognized by the absence of a "content_filter"
 101 record. To make the implementation efficient, the cleanup server
 102 would have to record the presence of a "content_filter" record in
 103 the queue file header.
 104
 105 Message envelope or content modifications
 106 =========================================
 107
 108 Milters can send modification requests after receiving the end of
 109 the message body.  If we can implement all the header/body-related
 110 Milter operations in the cleanup server, then we can try to edit
 111 the queue file in place, without ever having to make a temporary
 112 copy. Once a Milter is done editing, the queue file can be used as
 113 input for the next Milter, and so on. Finally, the cleanup server
 114 calls fsync() and waits for successful return.
 115
 116 To implement in-place queue file edits, we need to introduce
 117 surprisingly little change to the existing Postfix queue file
 118 structure.  All we need is a way to specify a jump from one place
 119 in the file to another.
 120
 121 Postfix does not store queue files as plain text files. Instead all
 122 information is stored in records with an explicit type and length
 123 for sender, recipient, arrival time, and so on.  Even the content
 124 that makes up the message header and body is stored as records with
 125 an explicit type and length.  This organization makes it very easy
 126 to introduce pointer records, which is what we will use to jump
 127 from one place in a queue file to another place.
 128
 129 - Deleting a recipient or header record is easy - just mark the
 130   record as killed.  When deleting a recipient, we must kill all
 131   recipient records that result from virtual alias expansion of the
 132   original recipient address. When deleting a very long header or
 133   body line, multiple queue file records may need to be killed. We
 134   won't try to reuse the deleted space for other purposes.
 135
 136 - Replacing header or body records involves pointer records.
 137   Basically, a record is replaced by overwriting it with a forward
 138   pointer to space after the end of the queue file, putting the new
 139   record there, followed by a reverse pointer to the record that
 140   follows the replaced information. If the replaced record is shorter
 141   than a pointer record, we relocate the records that follow it to
 142   the new area, until we have enough space for the forward pointer
 143   record. See below for a discussion on what it takes to make this
 144   safe.
 145
 146   Postfix queue files are segmented. The first segment is for
 147   envelope records, the second for message header and body content,
 148   and the third segment is for information that was extracted or
 149   generated from the message header and body content.  Each segment
 150   is terminated by a marker record. For now we don't want to change
 151   their location. In particular, we want to avoid moving the start
 152   of a segment.
 153
 154   To ensure that we can always replace a header or body record by
 155   a pointer record, without having to relocate a marker record, the
 156   cleanup server always places a dummy pointer record at the end
 157   of the headers and at the end of the body.
 158
 159   When a Milter wants to replace an entire body, we have the option
 160   to overwrite existing body records until we run out of space, and
  161   then write a pointer to space at the end of the queue file,
 162   followed by the remainder of the body, and a pointer to the marker
 163   that ends the message content segment.
 164
 165 - Appending a recipient or header record involves pointer records
 166   as well. This requires that the queue file already contains a
 167   dummy pointer record at the place where we want to append recipient
 168   or header content (Milters currently do not replace individual
 169   body records, but we could add this if need be).  To append,
 170   change the dummy pointer into a forward pointer to space after
 171   the end of a message, put the new record there, followed by a
 172   reverse pointer to the record that follows the forward pointer.
 173
 174   To append another record, replace the reverse pointer by a forward
 175   pointer to space after the end of a message, put the new record
 176   there, followed by the value of the reverse pointer that we
 177   replace. Thus, there is no one-to-one correspondence between
 178   forward and backward pointers! In fact, there can be multiple
 179   forward pointers for one reverse pointer.
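
The replace and append operations can be sketched with a toy model
in Python. This is a simplification under stated assumptions, not
the Postfix implementation: the queue file is a list, a list index
plays the role of an absolute file offset, every record occupies
one slot (so record lengths and relocation are not modeled), and
the record kinds and function names are hypothetical.

```python
# Illustrative record kinds; the names are hypothetical.
NORMAL, POINTER, END = "N", "P", "E"

def read_records(qf):
    """Walk from position 0, following pointers, until the end marker."""
    out, pos = [], 0
    while qf[pos][0] != END:
        kind, value = qf[pos]
        if kind == POINTER and value is not None:
            pos = value              # jump to an absolute position
            continue
        if kind == NORMAL:
            out.append(value)
        pos += 1                     # dummy pointers (None) are skipped
    return out

def replace_record(qf, pos, text):
    """Overwrite the record at pos with a forward pointer to new space."""
    qf[pos] = (POINTER, len(qf))     # forward pointer
    qf.append((NORMAL, text))        # the replacement record
    qf.append((POINTER, pos + 1))    # reverse pointer to what followed

def append_record(qf, ptr_pos, text):
    """Append text at the dummy or reverse pointer stored at ptr_pos.

    Returns the position of the new reverse pointer, which is where
    the next append should be made.
    """
    kind, target = qf[ptr_pos]
    assert kind == POINTER
    # A dummy pointer returns to the record that follows it. A reverse
    # pointer hands its existing target to the new reverse pointer.
    return_to = ptr_pos + 1 if target is None else target
    qf[ptr_pos] = (POINTER, len(qf)) # forward pointer to new space
    qf.append((NORMAL, text))
    qf.append((POINTER, return_to))  # reverse pointer
    return len(qf) - 1

qf = [(NORMAL, "From: a"), (NORMAL, "Subject: hi"),
      (POINTER, None), (END, None)]
p = append_record(qf, 2, "X-Added: 1")
p = append_record(qf, p, "X-Added: 2")
replace_record(qf, 1, "Subject: hello")
result = read_records(qf)
```

After the two appends, the forward pointers at positions 2 and 5
both lead, eventually, to the single reverse pointer at position 7
that returns to the end marker. This is the many-to-one relation
between forward and reverse pointers described above.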
 180
 181 When relocating a record we must not relocate the target of a jump
 182 ==================================================================
 183
 184 As discussed above, when replacing an existing record, we overwrite
 185 it with a forward pointer to the new information. If the old record
 186 is too small we relocate one or more records that follow the record
 187 that's being replaced, until we have enough space for the forward
 188 pointer record.
 189
 190 Now we have to become really careful. Could we end up relocating a
 191 record that is the target of a forward or reverse pointer, and thus
 192 corrupt the queue file? The answer is NO.
 193
 194 - We never relocate end-of-segment marker records. Instead, the
 195   cleanup server writes dummy pointer records to guarantee that
 196   there is always space for a pointer.
 197
 198 - When a record is the target of a forward pointer, it is "edited"
 199   information that is preceded either by the end-of-queue-file
 200   marker record, or it is preceded by the reverse pointer at the
 201   end of earlier written "edited" information. Thus, the target of
 202   a forward pointer will not be relocated to make space for a pointer
 203   record.
 204
 205 - When a record is the target of a reverse pointer, it is always
 206   preceded by a forward pointer record (or by a forward pointer
 207   record followed by some unused space). Thus, the target of a
 208   reverse pointer will not be relocated to make space for a pointer
 209   record.
 210
 211 Could we end up relocating a pointer record?  Yes, but that is OK,
 212 as long as pointers contain absolute offsets.
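
A small Python sketch shows why absolute offsets matter here. The
two resolver functions and the numbers are hypothetical; the
relative encoding is shown only as a contrast and is not something
these notes propose.

```python
def resolve_absolute(pointer_pos, stored):
    """Absolute pointer: the stored value is the target offset itself."""
    return stored

def resolve_relative(pointer_pos, stored):
    """Relative pointer: the target depends on where the pointer lives."""
    return pointer_pos + stored

target = 40
old_pos, new_pos = 10, 100       # the pointer record moves from 10 to 100

abs_stored = target              # absolute encoding of the target
rel_stored = target - old_pos    # relative encoding of the same target

# Resolve both pointers after the record has been relocated.
abs_after = resolve_absolute(new_pos, abs_stored)
rel_after = resolve_relative(new_pos, rel_stored)
```

The absolute pointer still resolves to offset 40 after the move,
while the relative one now resolves to 130 and would have to be
rewritten whenever its record is relocated.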
 213
 214 Pointer records introduce the possibility of loops
 215 ==================================================
 216
 217 When a queue file is damaged, a bogus pointer value may send Postfix
 218 into a loop. This must not happen.
 219
 220 Detecting loops is not trivial:
 221
 222 - A sequence of multiple forward pointers may be followed by one
 223   legitimate reverse pointer to the location after the first forward
 224   pointer. See above for a discussion of how to append a record to
 225   an appended record.
 226
 227 - We do know, however, that there will not be more reverse pointers
 228   than forward pointers. But this does not help much.
 229
 230 Perhaps we can include a record count at the start of the queue
 231 file, so that the record walking code knows that it's looking at
  232 some records more than once, and can return an error indication.
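
The record-count idea can be sketched in Python as follows. This is
a sketch under assumptions, not Postfix code: the queue file is
modeled as a list whose indexes stand in for file offsets, the
record kinds and the exception name are hypothetical, and
out-of-range pointer values are not checked.

```python
# Illustrative record kinds; the names are hypothetical.
NORMAL, POINTER, END = "N", "P", "E"

class QueueFileLoop(Exception):
    """Raised when the walk visits more records than the file holds."""

def walk_with_limit(qf, record_count):
    """Follow pointers, but give up after record_count visits."""
    out, pos, visited = [], 0, 0
    while True:
        visited += 1
        if visited > record_count:
            raise QueueFileLoop("visited more than %d records"
                                % record_count)
        kind, value = qf[pos]
        if kind == END:
            return out
        if kind == POINTER and value is not None:
            pos = value              # jump to an absolute position
            continue
        if kind == NORMAL:
            out.append(value)
        pos += 1

good = [(NORMAL, "a"), (POINTER, 3), (END, None),
        (NORMAL, "b"), (POINTER, 2)]
# Same file, but the last pointer is damaged and jumps back to 1.
damaged = good[:4] + [(POINTER, 1)]

good_result = walk_with_limit(good, len(good))
try:
    walk_with_limit(damaged, len(damaged))
    loop_detected = False
except QueueFileLoop:
    loop_detected = True
```

A walk of an undamaged file visits each record at most once, so it
never exceeds the count. The damaged file cycles through positions
1, 3, and 4 and trips the limit on the sixth visit.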
 233
 234 How many bytes do we need for a pointer record?
 235 ===============================================
 236
 237 A pointer record would look like this:
 238
 239     type (1 byte)
 240     offset (see below)
 241
 242 Postfix uses long for queue file size/offset information, and stores
  243 it as %15ld in the SIZE record at the start of the queue file.
  244 This is somewhat less than a 64-bit long, but it is enough for
  245 some time to come, and it is easily changed without breaking forward
 246 or backward compatibility.
 247
 248 It does mean, however, that a pointer record can easily exceed the
 249 length of a header record. This is why we go through the trouble
 250 of record relocation and dummy records.
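
The size problem is easy to see in a short Python sketch. It
assumes that a pointer offset is stored in the same fixed-width
format as the SIZE record; the function name and the sample header
are hypothetical, and the record type and any length bytes are left
out of the comparison.

```python
OFFSET_FORMAT = "%15d"   # Python spelling of the C format "%15ld"

def format_offset(offset):
    """Format a file offset as a fixed-width, space-padded decimal."""
    return OFFSET_FORMAT % offset

payload = format_offset(4096)
short_header = "To: bob"

# True when the pointer payload could overwrite the header in place.
fits_in_place = len(payload) <= len(short_header)

largest_in_field = 10**15 - 1    # largest value that fits in 15 digits
largest_64bit = 2**63 - 1        # largest signed 64-bit value
```

The payload is 15 characters wide even for a small offset, so it
does not fit over the 7-character header. A full 64-bit value needs
19 digits, which is why the 15-digit field holds somewhat less than
a 64-bit long.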
 251
 252 In Postfix 2.4 we fixed this by adding padding to short message
 253 header records so that we can always write a pointer record over a
  254 message header.  This immensely simplifies the code.