1 <?php
2 /**
3 * This program is free software; you can redistribute it and/or modify
4 * it under the terms of the GNU General Public License as published by
5 * the Free Software Foundation; either version 2 of the License, or
6 * (at your option) any later version.
8 * This program is distributed in the hope that it will be useful,
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11 * GNU General Public License for more details.
13 * You should have received a copy of the GNU General Public License along
14 * with this program; if not, write to the Free Software Foundation, Inc.,
15 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
16 * http://www.gnu.org/copyleft/gpl.html
18 * @file
21 namespace MediaWiki\Deferred;
23 use ErrorPageError;
24 use LogicException;
25 use MediaWiki\Logger\LoggerFactory;
26 use MWExceptionHandler;
27 use Throwable;
28 use Wikimedia\Rdbms\IDatabase;
29 use Wikimedia\ScopedCallback;
31 /**
32 * Defer callable updates to run later in the PHP process
34 * This is a performance feature that enables MediaWiki to produce faster web responses.
35 * It allows you to postpone non-blocking work (e.g. work that does not change the web
36 * response) to after the HTTP response has been sent to the client (i.e. web browser).
38 * Once the response is finalized and sent to the browser, the webserver process stays
39 * for a little while longer (detached from the web request) to run your POSTSEND tasks.
41 * There is also a PRESEND option, which runs your task right before the finalized response
42 * is sent to the browser. This is for critical tasks that do need to block the response,
43 * but where you'd like to benefit from other DeferredUpdates features, such as:
45 * - MergeableUpdate: batch updates from different components without coupling
46 * or awareness of each other.
47 * - Automatic cancellation: pass an IDatabase object (for any wiki or database) to
48 * DeferredUpdates::addCallableUpdate or AtomicSectionUpdate.
49 * - Reducing lock contention: if the response is likely to take several seconds
50 * (e.g. uploading a large file to FileBackend, or saving an edit to a large article)
51 * much of that work may overlap with a database transaction that is staying open for
52 * the entire duration. By moving contentious writes out to a PRESEND update, these
53 * get their own transaction (after the main one is committed), which gives up some
54 * atomicity for improved throughput.
56 * ## Expectation and comparison to job queue
58 * When scheduling a POSTSEND via the DeferredUpdates system you can generally expect
59 * it to complete well before the client makes their next request. Updates run directly after
60 * the web response is sent, from the same process on the same server. This is unlike the JobQueue,
61 * where jobs may need to wait in line for some minutes or hours.
63 * If your update fails, the failure is not reported to the client and the update is not retried.
64 * For updates that need retries for system consistency or data integrity, it is recommended to
65 * implement them as jobs instead and use JobQueueGroup::lazyPush. This has the caveat of being delayed
66 * by default, the same as any other job.
68 * A hybrid solution is available via the EnqueueableDataUpdate interface. By implementing
69 * this interface, you can queue your update via the DeferredUpdates first, and if it fails,
70 * the system will automatically catch this and queue it as a job instead.
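 *
 * A minimal usage sketch of the two stages, assuming caller code that runs inside
 * MediaWiki. The closure bodies and the $dbw handle are hypothetical placeholders,
 * not part of this class:
 *
 * ```php
 * use MediaWiki\Deferred\DeferredUpdates;
 *
 * // POSTSEND (the default stage): runs after the response has been sent.
 * // A failure here is logged, but the client never sees it and it is not retried.
 * DeferredUpdates::addCallableUpdate( static function () {
 *     // e.g. refresh secondary data that does not affect the response
 * } );
 *
 * // PRESEND: runs after the main commit, but still blocks the response.
 * // $dbw is assumed to be an IDatabase handle with pending writes; if its open
 * // transaction is rolled back, this update is cancelled.
 * DeferredUpdates::addCallableUpdate(
 *     static function () {
 *         // e.g. a contentious write that should get its own transaction
 *     },
 *     DeferredUpdates::PRESEND,
 *     $dbw
 * );
 * ```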
72 * ## How it works during web requests
74 * 1. Your request route is executed (e.g. Action or SpecialPage class, or API).
75 * 2. Output is finalized and main database transaction is committed.
76 * 3. PRESEND updates run via DeferredUpdates::doUpdates.
77 * 4. The web response is sent to the browser.
78 * 5. POSTSEND updates run via DeferredUpdates::doUpdates.
80 * @see MediaWiki::preOutputCommit
81 * @see MediaWiki::restInPeace
83 * ## How it works for Maintenance scripts
85 * In CLI mode, no distinction is made between PRESEND and POSTSEND deferred updates,
86 * and the queue is periodically executed throughout the process.
88 * @see DeferredUpdates::tryOpportunisticExecute
90 * ## How it works internally
92 * Each update is added via DeferredUpdates::addUpdate and stored in either the PRESEND or
93 * POSTSEND queue. If an update gets queued while another update is already running, then
94 * we store in a "sub"-queue associated with the current update. This allows nested updates
95 * to be completed before other updates, which improves ordering for process caching.
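 *
 * A sketch of the nesting behaviour described above, assuming caller code inside
 * MediaWiki. The closure bodies are placeholders, and the order noted in the comments
 * follows from the sub-queue description rather than from a recorded run:
 *
 * ```php
 * use MediaWiki\Deferred\DeferredUpdates;
 *
 * DeferredUpdates::addCallableUpdate( static function () {
 *     // "outer": while this runs, any update it adds goes to its sub-queue
 *     DeferredUpdates::addCallableUpdate( static function () {
 *         // "inner": completed right after "outer", before "sibling"
 *     } );
 * } );
 *
 * DeferredUpdates::addCallableUpdate( static function () {
 *     // "sibling": stays in the top-queue and runs after "outer" and "inner"
 * } );
 * ```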
97 * @since 1.19
99 class DeferredUpdates {
100 /** @var int Process all updates; in web requests, use only after flushing output buffer */
101 public const ALL = 0;
102 /** @var int Specify/process updates that should run before flushing output buffer */
103 public const PRESEND = 1;
104 /** @var int Specify/process updates that should run after flushing output buffer */
105 public const POSTSEND = 2;
107 /** @var int[] List of "defer until" queue stages that can be reached */
108 public const STAGES = [ self::PRESEND, self::POSTSEND ];
110 /** @var DeferredUpdatesScopeStack|null Queue states based on recursion level */
111 private static $scopeStack;
114 * @var int Nesting level for preventOpportunisticUpdates()
116 private static $preventOpportunisticUpdates = 0;
118 private static function getScopeStack(): DeferredUpdatesScopeStack {
119 self::$scopeStack ??= new DeferredUpdatesScopeMediaWikiStack();
120 return self::$scopeStack;
124 * @param DeferredUpdatesScopeStack $scopeStack
125 * @internal Only for use in tests.
127 public static function setScopeStack( DeferredUpdatesScopeStack $scopeStack ): void {
128 if ( !defined( 'MW_PHPUNIT_TEST' ) ) {
129 throw new LogicException( 'Cannot reconfigure DeferredUpdates outside tests' );
131 self::$scopeStack = $scopeStack;
135 * Add an update to the pending update queue for execution at the appropriate time
137 * In CLI mode, updates may also run right away when safe (see tryOpportunisticExecute())
139 * If an update is already in progress, then what happens to this update is as follows:
140 * - If it has a "defer until" stage at/before the actual run stage of the innermost
141 * in-progress update, then it will go into the sub-queue of that in-progress update.
142 * As soon as that update completes, MergeableUpdate instances in its sub-queue will be
143 * merged into the top-queues and the non-MergeableUpdate instances will be executed.
144 * This is done to better isolate updates from the failures of other updates and reduce
145 * the chance of race conditions caused by updates not fully seeing the intended changes
146 * of previously enqueued and executed updates.
147 * - If it has a "defer until" stage later than the actual run stage of the innermost
148 * in-progress update, then it will go into the normal top-queue for that stage.
150 * @param DeferrableUpdate $update Some object that implements doUpdate()
151 * @param int $stage One of (DeferredUpdates::PRESEND, DeferredUpdates::POSTSEND)
152 * @since 1.28 Added the $stage parameter
154 public static function addUpdate( DeferrableUpdate $update, $stage = self::POSTSEND ) {
155 self::getScopeStack()->current()->addUpdate( $update, $stage );
156 self::tryOpportunisticExecute();
160 * Add an update to the pending update queue that invokes the specified callback when run
162 * @param callable $callable One of the following:
163 * - A Closure callback that takes the caller name as its argument
164 * - A non-Closure callback that takes no arguments
165 * @param int $stage One of (DeferredUpdates::PRESEND, DeferredUpdates::POSTSEND)
166 * @param IDatabase|IDatabase[] $dependeeDbws DB handles which might have pending writes
167 * upon which this update depends. If any of the handles already has an open transaction,
168 * a rollback thereof will cause this update to be cancelled (if it has not already run).
169 * [optional] (since 1.28)
170 * @since 1.27 Added $stage parameter
171 * @since 1.28 Added the $dbw parameter (now $dependeeDbws)
172 * @since 1.43 Closures are now given the caller name parameter
174 public static function addCallableUpdate(
175 $callable,
176 $stage = self::POSTSEND,
177 $dependeeDbws = []
179 self::addUpdate( new MWCallableUpdate( $callable, wfGetCaller(), $dependeeDbws ), $stage );
183 * Run an update, and, if an error was thrown, catch/log it and enqueue the update as
184 * a job in the job queue system if possible (e.g. implements EnqueueableDataUpdate)
186 * @param DeferrableUpdate $update
187 * @return Throwable|null
189 private static function run( DeferrableUpdate $update ): ?Throwable {
190 $logger = LoggerFactory::getInstance( 'DeferredUpdates' );
192 $type = get_class( $update )
193 . ( $update instanceof DeferrableCallback ? '_' . $update->getOrigin() : '' );
194 $updateId = spl_object_id( $update );
195 $logger->debug( "DeferredUpdates::run: started $type #{updateId}", [ 'updateId' => $updateId ] );
197 $updateException = null;
199 $startTime = microtime( true );
200 try {
201 self::attemptUpdate( $update );
202 } catch ( Throwable $updateException ) {
203 MWExceptionHandler::logException( $updateException );
204 $logger->error(
205 "Deferred update '{deferred_type}' failed to run.",
207 'deferred_type' => $type,
208 'exception' => $updateException,
211 self::getScopeStack()->onRunUpdateFailed( $update );
212 } finally {
213 $walltime = microtime( true ) - $startTime;
214 $logger->debug( "DeferredUpdates::run: ended $type #{updateId}, processing time: {walltime}", [
215 'updateId' => $updateId,
216 'walltime' => $walltime,
217 ] );
220 // Try to push the update as a job so it can run later if possible
221 if ( $updateException && $update instanceof EnqueueableDataUpdate ) {
222 try {
223 self::getScopeStack()->queueDataUpdate( $update );
224 } catch ( Throwable $jobException ) {
225 MWExceptionHandler::logException( $jobException );
226 $logger->error(
227 "Deferred update '{deferred_type}' failed to enqueue as a job.",
229 'deferred_type' => $type,
230 'exception' => $jobException,
233 self::getScopeStack()->onRunUpdateFailed( $update );
237 return $updateException;
241 * Consume and execute all pending updates
243 * Note that it is rarely the case that this method should be called outside of a few
244 * select entry points. For simplicity, that kind of recursion is discouraged. Recursion
245 * cannot happen if an explicit transaction round is active, which limits usage to updates
246 * with TRX_ROUND_ABSENT that do not leave open any transaction rounds of their own during
247 * the call to this method.
249 * In the less-common case of this being called within an in-progress DeferrableUpdate,
250 * this will not see any top-queue updates (since they were consumed and are being run
251 * inside an outer execution loop). In that case, it will instead operate on the sub-queue
252 * of the innermost in-progress update on the stack.
254 * @internal For use by MediaWiki, Maintenance, JobRunner, JobExecutor
255 * @param int $stage Which updates to process. One of
256 * (DeferredUpdates::PRESEND, DeferredUpdates::POSTSEND, DeferredUpdates::ALL)
258 public static function doUpdates( $stage = self::ALL ) {
259 /** @var ErrorPageError $guiError First presentable client-level error thrown */
260 $guiError = null;
261 /** @var Throwable $exception First of any error thrown */
262 $exception = null;
264 $scope = self::getScopeStack()->current();
266 // T249069: recursion is not possible once explicit transaction rounds are involved
267 $activeUpdate = $scope->getActiveUpdate();
268 if ( $activeUpdate ) {
269 $class = get_class( $activeUpdate );
270 if ( !( $activeUpdate instanceof TransactionRoundAwareUpdate ) ) {
271 throw new LogicException(
272 __METHOD__ . ": reached from $class, which is not TransactionRoundAwareUpdate"
275 if ( $activeUpdate->getTransactionRoundRequirement() !== $activeUpdate::TRX_ROUND_ABSENT ) {
276 throw new LogicException(
277 __METHOD__ . ": reached from $class, which does not specify TRX_ROUND_ABSENT"
282 $scope->processUpdates(
283 $stage,
284 static function ( DeferrableUpdate $update, $activeStage ) use ( &$guiError, &$exception ) {
285 $scopeStack = self::getScopeStack();
286 $childScope = $scopeStack->descend( $activeStage, $update );
287 try {
288 $e = self::run( $update );
289 $guiError = $guiError ?: ( $e instanceof ErrorPageError ? $e : null );
290 $exception = $exception ?: $e;
291 // Any addUpdate() calls between descend() and ascend() used the sub-queue.
292 // In rare cases, DeferrableUpdate::doUpdate() will process them by calling
293 // doUpdates() itself. In any case, process the remaining updates in the sub-queue
294 // by running them, enqueueing them, or transferring them to the parent scope
295 // queues as appropriate...
296 $childScope->processUpdates(
297 $activeStage,
298 static function ( DeferrableUpdate $sub ) use ( &$guiError, &$exception ) {
299 $e = self::run( $sub );
300 $guiError = $guiError ?: ( $e instanceof ErrorPageError ? $e : null );
301 $exception = $exception ?: $e;
304 } finally {
305 $scopeStack->ascend();
310 // VW-style hack to work around T190178, so we can make sure
311 // PageMetaDataUpdater doesn't throw exceptions.
312 if ( $exception && defined( 'MW_PHPUNIT_TEST' ) ) {
313 throw $exception;
316 // Throw the first of any GUI errors as long as the context is HTTP pre-send. However,
317 // callers should check permissions *before* enqueueing updates. If the main transaction
318 // round actions succeed but some deferred updates fail due to permissions errors then
319 // there is a risk that some secondary data was not properly updated.
320 if ( $guiError && $stage === self::PRESEND && !headers_sent() ) {
321 throw $guiError;
326 * Consume and execute pending updates now if possible, instead of waiting.
328 * In web requests, updates are always deferred until the end of the request.
330 * In CLI mode, updates run earlier and more often. This is important for long-running
331 * Maintenance scripts that would otherwise grow an excessively large queue, which increases
332 * memory use, and risks losing all updates if the script ends early or crashes.
334 * The following conditions are required for updates to run early in CLI mode:
336 * - No update is already in progress (ensure linear flow, recursion guard).
337 * - LBFactory indicates that we don't have any "busy" database connections, i.e.
338 * there are no pending writes or otherwise active and uncommitted transactions,
339 * except if the transaction is empty and merely used for primary DB read queries,
340 * in which case the transaction (and its repeatable-read snapshot) can be safely flushed.
342 * How this works:
344 * - When a maintenance script calls {@link Maintenance::commitTransaction()},
345 * tryOpportunisticExecute() will be called after commit.
347 * - When a maintenance script calls {@link Maintenance::commitTransactionRound()},
348 * tryOpportunisticExecute() will be called after all the commits.
350 * - For maintenance scripts that don't do much with the database, we also call
351 * tryOpportunisticExecute() after every addUpdate() call.
353 * - Upon the completion of Maintenance::execute() via Maintenance::shutdown(),
354 * any remaining updates are run.
356 * Note that this method runs both PRESEND and POSTSEND updates and thus should not be called
357 * during web requests. It is only intended for long-running Maintenance scripts.
359 * @internal For use by Maintenance
360 * @since 1.28
361 * @return bool Whether updates were allowed to run
363 public static function tryOpportunisticExecute(): bool {
364 // Leave execution up to the current loop if an update is already in progress
365 // or if updates are explicitly disabled
366 if ( self::getRecursiveExecutionStackDepth()
367 || self::$preventOpportunisticUpdates
369 return false;
372 if ( self::getScopeStack()->allowOpportunisticUpdates() ) {
373 self::doUpdates( self::ALL );
374 return true;
377 return false;
381 * Prevent opportunistic updates until the returned ScopedCallback is
382 * consumed.
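 *
 * A usage sketch, assuming a long-running caller such as a Maintenance script.
 * The $scope variable name is illustrative:
 *
 * ```php
 * use MediaWiki\Deferred\DeferredUpdates;
 * use Wikimedia\ScopedCallback;
 *
 * $scope = DeferredUpdates::preventOpportunisticUpdates();
 * // While $scope is alive, addUpdate() only enqueues, because
 * // tryOpportunisticExecute() returns false without running anything.
 * // ... batch of work that must not be interleaved with deferred updates ...
 * ScopedCallback::consume( $scope ); // or simply let $scope go out of scope
 * ```
 *
 * Because the counter is a nesting level, calls can be stacked. Opportunistic
 * execution resumes only once every returned ScopedCallback has been consumed.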
384 * @return ScopedCallback
386 public static function preventOpportunisticUpdates() {
387 self::$preventOpportunisticUpdates++;
388 return new ScopedCallback( static function () {
389 self::$preventOpportunisticUpdates--;
390 } );
394 * Get the number of pending updates for the current execution context
396 * If an update is in progress, then this operates on the sub-queues of the
397 * innermost in-progress update. Otherwise, it acts on the top-queues.
399 * @return int
400 * @since 1.28
402 public static function pendingUpdatesCount() {
403 return self::getScopeStack()->current()->pendingUpdatesCount();
407 * Get a list of the pending updates for the current execution context
409 * If an update is in progress, then this operates on the sub-queues of the
410 * innermost in-progress update. Otherwise, it acts on the top-queues.
412 * @param int $stage Look for updates with this "defer until" stage. One of
413 * (DeferredUpdates::PRESEND, DeferredUpdates::POSTSEND, DeferredUpdates::ALL)
414 * @return DeferrableUpdate[]
415 * @internal This method should only be used for unit tests
416 * @since 1.29
418 public static function getPendingUpdates( $stage = self::ALL ) {
419 return self::getScopeStack()->current()->getPendingUpdates( $stage );
423 * Cancel all pending updates for the current execution context
425 * If an update is in progress, then this operates on the sub-queues of the
426 * innermost in-progress update. Otherwise, it acts on the top-queues.
428 * @internal This method should only be used for unit tests
430 public static function clearPendingUpdates() {
431 self::getScopeStack()->current()->clearPendingUpdates();
435 * Get the number of in-progress calls to DeferredUpdates::doUpdates()
437 * @return int
438 * @internal This method should only be used for unit tests
440 public static function getRecursiveExecutionStackDepth() {
441 return self::getScopeStack()->getRecursiveDepth();
445 * Attempt to run an update with the appropriate transaction round state if needed
447 * It is allowed for a DeferrableUpdate to directly execute one or more other DeferrableUpdate
448 * instances without queueing them by calling this method. In that case, the outer update
449 * must use TransactionRoundAwareUpdate::TRX_ROUND_ABSENT, e.g. by extending
450 * TransactionRoundDefiningUpdate, so that this method can give each update its own
451 * transaction round.
453 * @param DeferrableUpdate $update
454 * @since 1.34
456 public static function attemptUpdate( DeferrableUpdate $update ) {
457 self::getScopeStack()->onRunUpdateStart( $update );
459 $update->doUpdate();
461 self::getScopeStack()->onRunUpdateEnd( $update );
465 /** @deprecated class alias since 1.42 */
466 class_alias( DeferredUpdates::class, 'DeferredUpdates' );