Postfix Stress-Dependent Configuration

-------------------------------------------------------------------------------

Overview

7 This document describes the symptoms of Postfix SMTP server overload. It
8 presents permanent main.cf changes to avoid overload during normal operation,
9 and temporary main.cf changes to cope with an unexpected burst of mail. This
10 document makes specific suggestions for Postfix 2.5 and later which support
11 stress-adaptive behavior, and for earlier Postfix versions that don't.
13 Topics covered in this document:
15 * Symptoms of Postfix SMTP server overload
16 * Service more SMTP clients at the same time
17 * Spend less time per SMTP client
18 * Disconnect suspicious SMTP clients
19 * Temporary measures for older Postfix releases
20 * Automatic stress-adaptive behavior
21 * Detecting support for stress-adaptive behavior
22 * Forcing stress-adaptive behavior on or off
23 * Other measures to off-load zombies

Symptoms of Postfix SMTP server overload

28 Under normal conditions, the Postfix SMTP server responds immediately when an
29 SMTP client connects to it; the time to deliver mail is noticeable only with
30 large messages. Performance degrades dramatically when the number of SMTP
31 clients exceeds the number of Postfix SMTP server processes. When an SMTP
32 client connects while all Postfix SMTP server processes are busy, the client
33 must wait until a server process becomes available.
35 SMTP server overload may be caused by a surge of legitimate mail (example: a
36 DNS registrar opens a new zone for registrations), by mistake (mail explosion
37 caused by a forwarding loop) or by malice (worm outbreak, botnet, or other
38 illegitimate activity).
40 Symptoms of Postfix SMTP server overload are:
42 * Remote SMTP clients experience a long delay before Postfix sends the "220
43 hostname.example.com ESMTP Postfix" greeting.
45 o NOTE: Broken DNS configurations can also cause lengthy delays before
46 Postfix sends "220 hostname.example.com ...". These delays also exist
47 when Postfix is NOT overloaded.
49 o NOTE: To avoid "overload" delays for end-user mail clients, enable the
50 "submission" service entry in master.cf (present since Postfix 2.1),
51 and tell users to connect to this instead of the public SMTP service.
53 * The Postfix SMTP server logs an increased number of "lost connection after
54 CONNECT" events. This happens because remote SMTP clients disconnect before
55 Postfix answers the connection.
57 o NOTE: A portscan for open SMTP ports can also result in "lost
58 connection ..." logfile messages.
60 * Postfix 2.3 and later logs a warning that all server ports are busy:
62 Oct 3 20:39:27 spike postfix/master[28905]: warning: service "smtp"
63 (25) has reached its process limit "30": new clients may experience
65 Oct 3 20:39:27 spike postfix/master[28905]: warning: to avoid this
66 condition, increase the process count in master.cf or reduce the
67 service time per client
69 Legitimate mail that doesn't get through during an episode of Postfix SMTP
70 server overload is not necessarily lost. It should still arrive once the
71 situation returns to normal, as long as the overload condition is temporary.
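
The symptoms listed above can be tallied from a log file. The following is a
minimal Python sketch and not part of Postfix. The function name is
hypothetical, and the match strings are taken from the log excerpts quoted in
this document, so they may need adjusting for other Postfix versions or syslog
formats.

```python
import re
from collections import Counter

# Assumption: the warning text matches the master(8) excerpt quoted above.
PROCESS_LIMIT = re.compile(
    r'service "(\S+)" \(\S+\) has reached its process limit "(\d+)"'
)


def summarize_overload_symptoms(log_lines):
    """Count two overload symptoms in an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        # Clients that gave up before Postfix sent its 220 greeting.
        if "lost connection after CONNECT" in line:
            counts["lost_after_connect"] += 1
        # master(8) warning that a service ran out of processes.
        if PROCESS_LIMIT.search(line):
            counts["process_limit_warnings"] += 1
    return counts
```

Pass it the lines of the mail log file. Keep in mind that a portscan can also
produce "lost connection" messages, so the two counters are best read together.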

Service more SMTP clients at the same time

75 One measure to avoid the "all server processes busy" condition is to service
76 more SMTP clients simultaneously. For this you need to increase the number of
77 Postfix SMTP server processes. This will improve the responsiveness for remote
78 SMTP clients, as long as the server machine has enough hardware and software
79 resources to run the additional processes, and as long as the file system can
80 keep up with the additional load.
82 * You increase the number of SMTP server processes either by increasing the
83 default_process_limit in main.cf (line 3 below), or by increasing the SMTP
84 server's "maxproc" field in master.cf (line 10 below). Either way, you need
85 to issue a "postfix reload" command to make the change effective.
* Process limits above 1000 require Postfix version 2.4 or later, and an
  operating system that supports kernel-based event filters (BSD kqueue(2),
  Linux epoll(7), or Solaris /dev/poll).
* More processes use more memory. You can reduce the Postfix memory footprint
  by using cdb: lookup tables instead of Berkeley DB's hash: or btree:
  tables.

     1 /etc/postfix/main.cf:
     2     # Raise the global process limit, 100 since Postfix 2.0.
     3     default_process_limit = 200
     4
     5 /etc/postfix/master.cf:
     6     # =============================================================
     7     # service type  private unpriv  chroot  wakeup  maxproc command
     8     # =============================================================
     9     # Raise the SMTP service process limit only.
    10     smtp      inet  n       -       n       -       200     smtpd

106 * NOTE: older versions of the SMTPD_POLICY_README document contain a mistake:
107 they configure a fixed number of policy daemon processes. When you raise
108 the SMTP server's "maxproc" field in master.cf, SMTP server processes will
109 report problems when connecting to policy server processes, because there
110 aren't enough of them. Examples of errors are "connection refused" or
111 "operation timed out".
113 To fix, edit master.cf and specify a zero "maxproc" field in all policy
114 server entries; see line 6 in the example below. Issue a "postfix reload"
115 command to make the change effective.

     1 /etc/postfix/master.cf:
     2     # =============================================================
     3     # service type  private unpriv  chroot  wakeup  maxproc command
     4     # =============================================================
     5     # Disable the policy service process limit.
     6     policy    unix  -       n       n       -       0       spawn
     7         user=nobody argv=/some/where/policy-server
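
To check which process limit applies to a service, recall that "-" in the
maxproc field means that the service uses default_process_limit, and that 0
means no limit. The following Python sketch illustrates this, assuming a
single-line master.cf entry with the eight standard fields. The function name
is hypothetical, and the sketch does not handle continuation lines or comments.

```python
def effective_maxproc(master_cf_line, default_process_limit=100):
    """Return the process limit for one master.cf service entry.

    A result of 0 means that the service has no process limit.
    """
    fields = master_cf_line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 whitespace-separated fields")
    # Field order: service type private unpriv chroot wakeup maxproc command
    maxproc = fields[6]
    if maxproc == "-":
        # "-" selects the main.cf default_process_limit value.
        return default_process_limit
    return int(maxproc)
```

For example, the "smtp" entry shown earlier yields 200, while the same entry
with "-" in the maxproc field yields whatever default_process_limit is passed
in.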

Spend less time per SMTP client

127 When increasing the number of SMTP server processes is not practical, you can
128 improve Postfix server responsiveness by eliminating delays. When Postfix
129 spends less time per SMTP session, the same number of SMTP server processes can
130 service more clients in a given amount of time.
132 * Eliminate non-functional RBL lookups (blocklists that are no longer in
133 operation). These lookups can degrade performance. Postfix logs a warning
134 when an RBL server does not respond.
136 * Eliminate redundant RBL lookups (people often use multiple Spamhaus RBLs
137 that include each other). To find out whether RBLs include other RBLs, look
138 up the websites that document the RBL's policies.
140 * Eliminate header_checks and body_checks, and keep just a few emergency
141 patterns to block the latest worm explosion or backscatter mail. See
142 BACKSCATTER_README for examples of the latter.
144 * Group your header_checks and body_checks patterns to avoid unnecessary
145 pattern matching operations:

    /etc/postfix/header_checks:
        if /^Subject:/
        /^Subject: virus found in mail from you/ reject
        /^Subject: ..other../ reject
        endif

        if /^Received:/
        /^Received: from (postfix\.org) / reject forged client name in
        /^Received: from ..other../ reject ....
        endif
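
The benefit of grouping can be illustrated with a simplified model that counts
pattern-match attempts for one header line. This Python sketch is not how
Postfix implements header_checks. It only assumes that patterns are tried in
order until one matches, and that patterns inside an if/endif block are skipped
when the guard pattern does not match. The function names and sample patterns
are hypothetical.

```python
import re


def flat_attempts(header, patterns):
    """Count regex evaluations when every pattern is tried in order."""
    attempts = 0
    for pattern in patterns:
        attempts += 1
        if re.search(pattern, header):
            break  # first match wins
    return attempts


def grouped_attempts(header, groups):
    """Count regex evaluations when patterns sit behind a guard pattern.

    groups is a list of (guard, [patterns]) pairs, like if/endif blocks.
    """
    attempts = 0
    for guard, patterns in groups:
        attempts += 1  # the guard itself costs one evaluation
        if not re.search(guard, header):
            continue  # the whole block is skipped
        for pattern in patterns:
            attempts += 1
            if re.search(pattern, header):
                return attempts
    return attempts
```

With two groups of two patterns each, a header that matches neither guard costs
two evaluations instead of four. The saving grows with the number of patterns
per group.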

Disconnect suspicious SMTP clients

Under conditions of overload you can improve Postfix SMTP server responsiveness
by hanging up on suspicious clients, so that other clients get a chance to talk
to Postfix.
* Use "521" SMTP reply codes (Postfix 2.6 and later) or "421" (Postfix
  2.3-2.5) to hang up on clients that match botnet-related RBLs (see next
  bullet) or that match selected non-RBL restrictions such as SMTP access
  maps. The Postfix SMTP server will reject mail and disconnect without
  waiting for the remote SMTP client to send a QUIT command.
171 * To hang up connections from blacklisted zombies, you can set specific
172 Postfix SMTP server reject codes for specific RBLs, and for individual
173 responses from specific RBLs. We'll use zen.spamhaus.org as an example; by
174 the time you read this document, details may have changed. Right now, their
175 documents say that a response of 127.0.0.10 or 127.0.0.11 indicates a
176 dynamic client IP address, which means that the machine is probably running
177 a bot of some kind. To give a 521 response instead of the default 554
178 response, use something like:

     1 /etc/postfix/main.cf:
     2     smtpd_client_restrictions =
     4        reject_rbl_client zen.spamhaus.org=127.0.0.10
     5        reject_rbl_client zen.spamhaus.org=127.0.0.11
     6        reject_rbl_client zen.spamhaus.org
     7
     8     rbl_reply_maps = hash:/etc/postfix/rbl_reply_maps
     9
    10 /etc/postfix/rbl_reply_maps:
    11     # With Postfix 2.3-2.5 use "421" to hang up connections.
    12     zen.spamhaus.org=127.0.0.10 521 4.7.1 Service unavailable;
    13      $rbl_class [$rbl_what] blocked using
    14      $rbl_domain${rbl_reason?; $rbl_reason}
    15
    16     zen.spamhaus.org=127.0.0.11 521 4.7.1 Service unavailable;
    17      $rbl_class [$rbl_what] blocked using
    18      $rbl_domain${rbl_reason?; $rbl_reason}

199 Although the above example shows three RBL lookups (lines 4-6), Postfix
200 will only do a single DNS query, so it does not affect the performance.
202 * With Postfix 2.3-2.5, use reply code 421 (521 will not cause Postfix to
203 disconnect). The down-side of replying with 421 is that it works only for
204 zombies and other malware. If the client is running a real MTA, then it may
205 connect again several times until the mail expires in its queue. When this
206 is a problem, stick with the default 554 reply, and use
207 "smtpd_hard_error_limit = 1" as described below.
* You can automatically turn on the above overload measure with Postfix 2.5
  and later, or with earlier releases that contain the stress-adaptive
  behavior source code patch from the mirrors listed at
  http://www.postfix.org/download.html. Simply replace line 8 above with:
214 8 rbl_reply_maps = ${stress?hash:/etc/postfix/rbl_reply_maps}
216 More information about automatic stress-adaptive behavior is in section
217 "Automatic stress-adaptive behavior".
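
The choice of hang-up reply code described in this section depends only on the
Postfix version. As a minimal sketch in Python, assuming a plain numeric
version string such as "2.5.4" (the function name is hypothetical):

```python
def hangup_reply_code(postfix_version):
    """Return the reply code that makes smtpd disconnect, or None.

    None means that no hang-up code is described for this version; stay
    with the default 554 reply in that case.
    """
    # Assumption: the version looks like "2.6" or "2.5.4", digits only.
    major, minor = (int(part) for part in postfix_version.split(".")[:2])
    if (major, minor) >= (2, 6):
        return 521  # Postfix 2.6 and later
    if (major, minor) >= (2, 3):
        return 421  # Postfix 2.3-2.5; 521 would not disconnect here
    return None
```

Remember the down-side of 421 noted above: a real MTA may keep retrying until
the mail expires in its queue.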

Temporary measures for older Postfix releases

See the next section, "Automatic stress-adaptive behavior", if you are running
Postfix version 2.5 or later, or if you have applied the source code patch for
stress-adaptive behavior from the mirrors listed at
http://www.postfix.org/download.html.
The following measures can be applied temporarily during overload. They still
allow most legitimate clients to connect and send mail, but may affect some
legitimate clients.
* Reduce smtpd_timeout (default: 300s). Experience on the postfix-users list
  from a variety of sysadmins shows that reducing the "normal" smtpd_timeout
  to 60s is unlikely to affect legitimate clients. However, it is unlikely to
  become the Postfix default because it's not RFC compliant. Setting
  smtpd_timeout to 10s (line 2 below) or even 5s under stress will still
  allow most legitimate clients to connect and send mail, but may delay mail
  from some clients. No mail should be lost, as long as this measure is used
  only temporarily.
* Reduce smtpd_hard_error_limit (default: 20). Setting this to 1 under stress
  (line 3 below) helps by disconnecting clients after a single error, giving
  other clients a chance to connect. However, this may cause significant
  delays with legitimate mail, such as a mailing list that contains a few
  addresses of no-longer-active users who didn't bother to unsubscribe. No
  mail should be lost, as long as this measure is used only temporarily.
246 * Use an smtpd_junk_command_limit of 1 instead of the default 100. This
247 prevents clients from keeping idle connections open by repeatedly sending
248 NOOP or RSET commands.

    1 /etc/postfix/main.cf:
    2     smtpd_timeout = 10s
    3     smtpd_hard_error_limit = 1
    4     smtpd_junk_command_limit = 1

No mail should be lost with these measures, as long as they are used only
temporarily. The next section of this document introduces a way to automate
this process.

Automatic stress-adaptive behavior

261 Postfix version 2.5 introduces automatic stress-adaptive behavior. This is also
262 available as a source code patch for Postfix versions 2.4 and 2.3 from the
263 mirrors listed at http://www.postfix.org/download.html.
265 It works as follows. When a "public" network service such as the SMTP server
266 runs into an "all server ports are busy" condition, the Postfix master(8)
267 daemon logs a warning, restarts the service (without interrupting existing
268 network sessions), and runs the service with "-o stress=yes" on the server
269 process command line:
271 80821 ?? S 0:00.24 smtpd -n smtp -t inet -u -c -o stress=yes
273 Normally, the Postfix master(8) daemon runs such a service with "-o stress=" on
274 the command line (i.e. with an empty parameter value):
276 83326 ?? S 0:00.28 smtpd -n smtp -t inet -u -c -o stress=
278 Services that have local access only never have "-o stress" parameters on the
279 command line. This includes services internal to Postfix such as the queue
280 manager, and services that listen on a loopback interface only, such as after-
281 filter SMTP services.
283 The "stress" parameter value is the key to making main.cf parameter settings
284 stress adaptive. The following settings are the default with Postfix 2.6 and
285 later. With earlier Postfix versions that have stress-adaptive support, append
286 the lines below to the main.cf file and issue a "postfix reload" command:
288 1 smtpd_timeout = ${stress?10}${stress:300}s
289 2 smtpd_hard_error_limit = ${stress?1}${stress:20}
290 3 smtpd_junk_command_limit = ${stress?1}${stress:100}
* Line 1: under conditions of stress, use an smtpd_timeout value of 10
  seconds instead of the default 300 seconds. Experience on the postfix-users
  list from a variety of sysadmins shows that reducing the "normal"
  smtpd_timeout to 60s is unlikely to affect legitimate clients. However, it
  is unlikely to become the Postfix default because it's not RFC compliant.
  Setting smtpd_timeout to 10s (as in line 1) or even 5s under stress will
  still allow most legitimate clients to connect and send mail, but may delay
  mail from some clients. No mail should be lost, as long as this measure is
  used only temporarily.
* Line 2: under conditions of stress, use an smtpd_hard_error_limit of 1
  instead of the default 20. This helps by disconnecting clients after a
  single error, giving other clients a chance to connect. However, this may
  cause significant delays with legitimate mail, such as a mailing list that
  contains a few addresses of no-longer-active users who didn't bother to
  unsubscribe. No mail should be lost, as long as this measure is used only
  temporarily.
312 * Line 3: under conditions of stress, use an smtpd_junk_command_limit of 1
313 instead of the default 100. This prevents clients from keeping idle
314 connections open by repeatedly sending NOOP or RSET commands.
316 The syntax of ${name?value} and ${name:value} is explained at the beginning of
317 the postconf(5) manual page.
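
To make the two conditional forms concrete: ${name?value} expands to "value"
when $name is non-empty, and ${name:value} expands to "value" when $name is
empty. The following Python sketch mimics only these two forms for
illustration. It is not the Postfix parser, the function name is hypothetical,
and it does not handle nested braces or plain $name references.

```python
import re

# Matches ${name?value} and ${name:value} with no nested braces in value.
_CONDITIONAL = re.compile(r"\$\{(\w+)([?:])([^{}]*)\}")


def expand_conditionals(text, params):
    """Expand ${name?value} and ${name:value} using the params mapping."""

    def replace(match):
        name, operator, value = match.groups()
        is_set = bool(params.get(name, ""))
        if operator == "?":
            # ${name?value}: use value only when $name is non-empty.
            return value if is_set else ""
        # ${name:value}: use value only when $name is empty.
        return "" if is_set else value

    return _CONDITIONAL.sub(replace, text)
```

With stress set to "yes", the smtpd_timeout setting shown above expands to
"10s"; with an empty stress value it expands to "300s".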
NOTE: Please keep in mind that the stress-adaptive feature is a fairly
desperate measure to keep some legitimate mail flowing under overload
conditions. If a site is reaching the SMTP server process limit when there
isn't an attack or bot flood occurring, then either the process limit needs to
be raised or more hardware needs to be added.

Detecting support for stress-adaptive behavior

327 To find out if your Postfix installation supports stress-adaptive behavior, use
328 the "ps" command, and look for the smtpd processes. Postfix has stress-adaptive
329 support when you see "-o stress=" or "-o stress=yes" command-line options.
330 Remember that Postfix never enables stress-adaptive behavior on servers that
331 listen on local addresses only.
333 The following example is for FreeBSD or Linux. On Solaris, HP-UX and other
334 System-V flavors, use "ps -ef" instead of "ps ax".
337 83326 ?? S 0:00.28 smtpd -n smtp -t inet -u -c -o stress=
338 84345 ?? Ss 0:00.11 /usr/bin/perl /usr/libexec/postfix/smtpd-
341 You can't use postconf(1) to detect stress-adaptive support. The postconf(1)
342 command ignores the existence of the stress parameter in main.cf, because the
343 parameter has no effect there. Command-line "-o parameter" settings always take
344 precedence over main.cf parameter settings.
346 If you configure stress-adaptive behavior in main.cf when it isn't supported,
347 nothing bad will happen. The processes will run as if the stress parameter
348 always has an empty value.
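
The "ps" check described in this section can be automated. The following is a
minimal Python sketch, assuming that the output of "ps ax" (or "ps -ef") has
already been captured as text. The function name is hypothetical.

```python
import re

# Captures the value after "-o stress=", which may be empty.
_STRESS_OPTION = re.compile(r"-o stress=(\S*)")


def stress_option_values(ps_output):
    """Return the stress value of each smtpd process that has the option.

    An empty list suggests that stress-adaptive behavior is not supported,
    or that only local-only services were listed.
    """
    values = []
    for line in ps_output.splitlines():
        if "smtpd" not in line:
            continue
        match = _STRESS_OPTION.search(line)
        if match:
            values.append(match.group(1))
    return values
```

A result such as [""] means that the option is present with an empty value
(no stress), and ["yes"] means that the service is currently running in
stress mode.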

Forcing stress-adaptive behavior on or off

352 You can manually force stress-adaptive behavior on, by adding a "-o stress=yes"
353 command-line option in master.cf. This can be useful for testing overrides on
354 the SMTP service. Issue "postfix reload" to make the change effective.
356 Note: setting the stress parameter in main.cf has no effect for services that
357 accept remote connections.

    /etc/postfix/master.cf:
        # =============================================================
        # service type  private unpriv  chroot  wakeup  maxproc command
        # =============================================================
        smtp      inet  n       -       n       -       -       smtpd
            -o stress=yes

368 To permanently force stress-adaptive behavior off with a specific service,
369 specify "-o stress=" on its master.cf command line. This may be desirable for
370 the "submission" service. Issue "postfix reload" to make the change effective.
372 Note: setting the stress parameter in main.cf has no effect for services that
373 accept remote connections.

    /etc/postfix/master.cf:
        # =============================================================
        # service type  private unpriv  chroot  wakeup  maxproc command
        # =============================================================
        submission inet n       -       n       -       -       smtpd
            -o stress=

Other measures to off-load zombies

OpenBSD spamd implements a daemon that handles all connections from "new"
clients. Only well-behaved mail clients are allowed to talk to the mail server.
Other clients are tarpitted, and will never get a chance to affect mail server
performance.
391 At some point in the future, Postfix may come with a simple front-end daemon
392 that does basic greylisting and pipelining detection to keep zombies and other
393 ratware away from Postfix itself. This would use the "pass" service type which
394 has been available in stable Postfix releases since Postfix 2.5.

Credits

398 * Thanks to the postfix-users mailing list members for sharing early
399 experiences with the stress-adaptive feature.
400 * The RBL example and several other paragraphs of text were adapted from
401 postfix-users postings by Noel Jones.
402 * Wietse implemented stress-adaptive behavior as the smallest possible patch
403 while he should be working on other things.