/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set ts=8 sts=2 et sw=2 tw=80: */
/* This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
// PHC is a probabilistic heap checker. A tiny fraction of randomly chosen
// heap allocations are subject to some expensive checking via the use of OS
// page access protection. A failed check triggers a crash, whereupon useful
// information about the failure is put into the crash report. The cost and
// coverage for each user is minimal, but spread over the entire user base the
// coverage becomes significant.
//
// The idea comes from Chromium, where it is called GWP-ASAN. (Firefox uses
// PHC as the name because GWP-ASAN is long, awkward, and doesn't have any
// particular meaning.)
//
// In the current implementation up to 64 allocations per process can become
// PHC allocations. These allocations must be page-sized or smaller. Each PHC
// allocation gets its own page, and when the allocation is freed its page is
// marked inaccessible until the page is reused for another allocation. This
// means that a use-after-free defect (which includes double-frees) will be
// caught if the use occurs before the page is reused for another allocation.
// The crash report will contain stack traces for the allocation site, the
// free site, and the use-after-free site, which is often enough to diagnose
// the defect.
//
// Also, each PHC allocation is followed by a guard page. The PHC allocation
// is positioned so that its end abuts the guard page (or as close as
// possible, given alignment constraints). This means that a bounds violation
// at the end of the allocation (overflow) will be caught. The crash report
// will contain stack traces for the allocation site and the bounds violation
// use site, which is often enough to diagnose the defect.
//
// (A bounds violation at the start of the allocation (underflow) will not be
// caught, unless it is sufficiently large to hit the preceding allocation's
// guard page, which is not that likely. It would be possible to look more
// assiduously for underflow by randomly placing some allocations at the end
// of the page and some at the start of the page, and GWP-ASAN does this. PHC
// does not, however, because overflow is likely to be much more common than
// underflow in practice.)
//
// We use a simple heuristic to categorize a guard page access as overflow or
// underflow: if the address falls in the lower half of the guard page, we
// assume it is overflow, otherwise we assume it is underflow. More
// sophisticated heuristics are possible, but this one is very simple, and it
// is likely that most overflows/underflows in practice are very close to the
// page boundary.
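// For example (an illustrative sketch, not code from this file): for a guard
// page starting at address `guard`, an access at `addr` would be classified
// as:
//
//   bool isOverflow = uintptr_t(addr) - uintptr_t(guard) < kPageSize / 2;
//
// i.e. the lower half of the guard page is attributed to the allocation page
// before it (overflow), and the upper half to the one after it (underflow).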
// The design space for the randomization strategy is large. The current
// implementation has a large random delay before it starts operating, and a
// small random delay between each PHC allocation attempt. Each freed PHC
// allocation is quarantined for a medium random delay before being reused, in
// order to increase the chance of catching UAFs.
//
// The basic cost of PHC's operation is as follows.
//
// - The physical memory cost is 64 pages plus some metadata (including stack
//   traces) for each page. This amounts to 256 KiB per process on
//   architectures with 4 KiB pages and 1024 KiB on macOS/AArch64 which uses
//   16 KiB pages.
//
// - The virtual memory cost is the physical memory cost plus the guard pages:
//   another 64 pages. This amounts to another 256 KiB per process on
//   architectures with 4 KiB pages and 1024 KiB on macOS/AArch64 which uses
//   16 KiB pages. PHC is currently only enabled on 64-bit platforms so the
//   impact of the virtual memory usage is negligible.
//
// - Every allocation requires a size check and a decrement-and-check of an
//   atomic counter. When the counter reaches zero a PHC allocation can occur,
//   which involves marking a page as accessible and getting a stack trace for
//   the allocation site. Otherwise, mozjemalloc performs the allocation.
//
// - Every deallocation requires a range check on the pointer to see if it
//   involves a PHC allocation. (The choice to only do PHC allocations that
//   are a page or smaller enables this range check, because the 64 pages are
//   contiguous. Allowing larger allocations would make this more complicated,
//   and we definitely don't want something as slow as a hash table lookup on
//   every deallocation.) PHC deallocations involve marking a page as
//   inaccessible and getting a stack trace for the deallocation site.
//
// Note that calls to realloc(), free(), and malloc_usable_size() will
// immediately crash if the given pointer falls within a page allocation's
// page, but does not point to the start of the allocation itself.
//
//   void* p = malloc(64);
//   free(p + 1);  // p+1 doesn't point to the allocation start; crash
//
// Such crashes will not have the PHC fields in the crash report.
//
// PHC-specific tests can be run with the following commands:
// - gtests: `./mach gtest '*PHC*'`
// - xpcshell-tests: `./mach test toolkit/crashreporter/test/unit`
//   - This runs some non-PHC tests as well.
#include "PHC.h"

#include <stdlib.h>
#include <time.h>

#include <algorithm>

#ifdef XP_WIN
#  include <process.h>
#else
#  include <sys/mman.h>
#  include <sys/types.h>
#  include <pthread.h>
#  include <unistd.h>
#endif

#include "mozjemalloc.h"
#include "FdPrintf.h"
#include "Mutex.h"
#include "mozilla/Assertions.h"
#include "mozilla/Atomics.h"
#include "mozilla/Attributes.h"
#include "mozilla/CheckedInt.h"
#include "mozilla/Maybe.h"
#include "mozilla/StackWalk.h"
#include "mozilla/ThreadLocal.h"
#include "mozilla/XorShift128PlusRNG.h"
using namespace mozilla;

//---------------------------------------------------------------------------
// Utilities
//---------------------------------------------------------------------------

#ifdef ANDROID
// Android doesn't have pthread_atfork defined in pthread.h.
extern "C" MOZ_EXPORT int pthread_atfork(void (*)(void), void (*)(void),
                                         void (*)(void));
#endif

#ifndef DISALLOW_COPY_AND_ASSIGN
#  define DISALLOW_COPY_AND_ASSIGN(T) \
    T(const T&);                      \
    void operator=(const T&)
#endif
// This class provides infallible operations for the small number of heap
// allocations that PHC does for itself. It would be nice if we could use the
// InfallibleAllocPolicy from mozalloc, but PHC cannot use mozalloc.
class InfallibleAllocPolicy {
 public:
  static void AbortOnFailure(const void* aP) {
    if (!aP) {
      MOZ_CRASH("PHC failed to allocate");
    }
  }

  template <class T>
  static T* new_() {
    void* p = MozJemalloc::malloc(sizeof(T));
    AbortOnFailure(p);
    return new (p) T;
  }
};
//---------------------------------------------------------------------------
// Stack traces
//---------------------------------------------------------------------------

// This code is similar to the equivalent code within DMD.
class StackTrace : public phc::StackTrace {
 public:
  StackTrace() = default;

  void Clear() { mLength = 0; }

  void Fill();

 private:
  static void StackWalkCallback(uint32_t aFrameNumber, void* aPc, void* aSp,
                                void* aClosure) {
    StackTrace* st = (StackTrace*)aClosure;
    MOZ_ASSERT(st->mLength < kMaxFrames);
    st->mPcs[st->mLength] = aPc;
    st->mLength++;
    MOZ_ASSERT(st->mLength == aFrameNumber);
  }
};
// WARNING WARNING WARNING: this function must only be called when PHC::mMutex
// is *not* locked, otherwise we might get deadlocks.
//
// How? On Windows, MozStackWalk() can lock a mutex, M, from the shared
// library loader. Another thread might call malloc() while holding M locked
// (when loading a shared library) and try to lock PHC::mMutex, causing a
// deadlock. So PHC::mMutex can't be locked during the call to MozStackWalk().
// (For details, see https://bugzilla.mozilla.org/show_bug.cgi?id=374829#c8.
// On Linux, something similar can happen; see bug 824340. So we just disallow
// it on all platforms.)
//
// In DMD, to avoid this problem we temporarily unlock the equivalent mutex
// for the MozStackWalk() call. But that's grotty, and things are a bit
// different here, so we just require that stack traces be obtained before
// locking PHC::mMutex.
//
// Unfortunately, there is no reliable way at compile-time or run-time to
// ensure this pre-condition. Hence this large comment.
void StackTrace::Fill() {
  mLength = 0;

// These ifdefs should be kept in sync with the conditions in
// phc_implies_frame_pointers in build/moz.configure/memory.configure
#if defined(XP_WIN) && defined(_M_IX86)
  // This avoids MozStackWalk(), which causes unusably slow startup on Win32
  // when it is called during static initialization (see bug 1241684).
  //
  // This code is cribbed from the Gecko Profiler, which also uses
  // FramePointerStackWalk() on Win32: Registers::SyncPopulate() for the
  // frame pointer, and GetStackTop() for the stack end.
  CONTEXT context;
  RtlCaptureContext(&context);
  void** fp = reinterpret_cast<void**>(context.Ebp);

  PNT_TIB pTib = reinterpret_cast<PNT_TIB>(NtCurrentTeb());
  void* stackEnd = static_cast<void*>(pTib->StackBase);
  FramePointerStackWalk(StackWalkCallback, kMaxFrames, this, fp, stackEnd);
#elif defined(XP_DARWIN)
  // This avoids MozStackWalk(), which has become unusably slow on Mac due to
  // changes in libunwind.
  //
  // This code is cribbed from the Gecko Profiler, which also uses
  // FramePointerStackWalk() on Mac: Registers::SyncPopulate() for the frame
  // pointer, and GetStackTop() for the stack end.
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wframe-address"
  void** fp = reinterpret_cast<void**>(__builtin_frame_address(1));
#  pragma GCC diagnostic pop
  void* stackEnd = pthread_get_stackaddr_np(pthread_self());
  FramePointerStackWalk(StackWalkCallback, kMaxFrames, this, fp, stackEnd);
#else
  MozStackWalk(StackWalkCallback, nullptr, kMaxFrames, this);
#endif
}
//---------------------------------------------------------------------------
// Logging
//---------------------------------------------------------------------------

// Change this to 1 to enable some PHC logging. Useful for debugging.
#define PHC_LOGGING 0

#if PHC_LOGGING

static size_t GetPid() { return size_t(getpid()); }

static size_t GetTid() {
#  if defined(XP_WIN)
  return size_t(GetCurrentThreadId());
#  else
  return size_t(pthread_self());
#  endif
}

#  if defined(XP_WIN)
#    define LOG_STDERR \
      reinterpret_cast<intptr_t>(GetStdHandle(STD_ERROR_HANDLE))
#  else
#    define LOG_STDERR 2
#  endif
#  define LOG(fmt, ...)                                                \
    FdPrintf(LOG_STDERR, "PHC[%zu,%zu,~%zu] " fmt, GetPid(), GetTid(), \
             size_t(PHC::Now()), ##__VA_ARGS__)

#else

#  define LOG(fmt, ...)

#endif  // PHC_LOGGING
//---------------------------------------------------------------------------
// Global state
//---------------------------------------------------------------------------

// Throughout this entire file time is measured as the number of sub-page
// allocations performed (by PHC and mozjemalloc combined). `Time` is 64-bit
// because we could have more than 2**32 allocations in a long-running
// session. `Delay` is 32-bit because the delays used within PHC are always
// much smaller than 2**32. Delay must be unsigned so that IsPowerOfTwo() can
// work on some Delay values.
using Time = uint64_t;   // A moment in time.
using Delay = uint32_t;  // A time duration.
static constexpr Delay DELAY_MAX = UINT32_MAX / 2;

// PHC only runs if the page size is 4 KiB; anything more is uncommon and
// would use too much memory. So we hardwire this size for all platforms but
// macOS on ARM processors. For the latter we make an exception because the
// minimum page size supported is 16 KiB so there's no way to go below that.
static const size_t kPageSize =
#if defined(XP_DARWIN) && defined(__aarch64__)
    16384
#else
    4096
#endif
    ;

// We align the PHC area to a multiple of the jemalloc and JS GC chunk size
// (both use 1MB aligned chunks) so that their address computations don't lead
// from non-PHC memory into PHC memory causing misleading PHC stacks to be
// attached to a crash report.
static const size_t kPhcAlign = 1024 * 1024;

static_assert(IsPowerOfTwo(kPhcAlign));
static_assert((kPhcAlign % kPageSize) == 0);
// There are two kinds of page.
// - Allocation pages, from which allocations are made.
// - Guard pages, which are never touched by PHC.
//
// These page kinds are interleaved; each allocation page has a guard page on
// either side.
#ifdef EARLY_BETA_OR_EARLIER
static const size_t kNumAllocPages = kPageSize == 4096 ? 4096 : 1024;
#else
// This will use between 82KiB and 1.1MiB per process (depending on how many
// objects are currently allocated). We will tune this in the future.
static const size_t kNumAllocPages = kPageSize == 4096 ? 256 : 64;
#endif
static const size_t kNumAllPages = kNumAllocPages * 2 + 1;

// The total size of the allocation pages and guard pages.
static const size_t kAllPagesSize = kNumAllPages * kPageSize;

// jemalloc adds a guard page to the end of our allocation, see the comment in
// AllocAllPages() for more information.
static const size_t kAllPagesJemallocSize = kAllPagesSize - kPageSize;

// The amount to decrement from the shared allocation delay each time a
// thread's local allocation delay reaches zero.
static const Delay kDelayDecrementAmount = 256;

// When PHC is disabled on the current thread wait this many allocations
// before accessing sAllocDelay once more.
static const Delay kDelayBackoffAmount = 64;

// When PHC is disabled globally reset the shared delay by this many
// allocations to keep code running on the fast path.
static const Delay kDelayResetWhenDisabled = 64 * 1024;

// The default state for PHC. Either Enabled or OnlyFree.
#define DEFAULT_STATE mozilla::phc::OnlyFree

// The maximum time.
static const Time kMaxTime = ~(Time(0));
// Truncate aRnd to the range (1 .. aAvgDelay*2). If aRnd is random, this
// results in an average value of aAvgDelay + 0.5, which is close enough to
// aAvgDelay. aAvgDelay must be a power-of-two for speed.
constexpr Delay Rnd64ToDelay(Delay aAvgDelay, uint64_t aRnd) {
  MOZ_ASSERT(IsPowerOfTwo(aAvgDelay), "must be a power of two");

  return (aRnd & (uint64_t(aAvgDelay) * 2 - 1)) + 1;
}
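// For example (an illustration, not code from this file): with aAvgDelay = 8
// the expression keeps the low four bits of aRnd and adds 1, producing a
// delay in the range 1..16:
//
//   Rnd64ToDelay(8, 0) == 1, Rnd64ToDelay(8, 15) == 16, average 8.5.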
static Delay CheckProbability(int64_t aProb) {
  // Limit delays calculated from prefs to 0x80000000, this is the largest
  // power-of-two that fits in a Delay since it is a uint32_t.
  // The minimum is 2 that way not every allocation goes straight to PHC.
  return RoundUpPow2(std::clamp(aProb, int64_t(2), int64_t(0x80000000)));
}
// Maps a pointer to a PHC-specific structure:
// - Nothing
// - A guard page (it is unspecified which one)
// - An allocation page (with an index < kNumAllocPages)
//
// The standard way of handling a PtrKind is to check IsNothing(), and if that
// fails, to check IsGuardPage(), and if that fails, to call AllocPage().
class PtrKind {
 private:
  enum class Tag : uint8_t {
    Nothing,
    GuardPage,
    AllocPage,
  };

  Tag mTag;
  uintptr_t mIndex;  // Only used if mTag == Tag::AllocPage.

 public:
  // Detect what a pointer points to. This constructor must be fast because it
  // is called for every call to free(), realloc(), malloc_usable_size(), and
  // jemalloc_ptr_info().
  PtrKind(const void* aPtr, const uint8_t* aPagesStart,
          const uint8_t* aPagesLimit) {
    if (!(aPagesStart <= aPtr && aPtr < aPagesLimit)) {
      mTag = Tag::Nothing;
    } else {
      uintptr_t offset = static_cast<const uint8_t*>(aPtr) - aPagesStart;
      uintptr_t allPageIndex = offset / kPageSize;
      MOZ_ASSERT(allPageIndex < kNumAllPages);
      if (allPageIndex & 1) {
        // Odd-indexed pages are allocation pages.
        uintptr_t allocPageIndex = allPageIndex / 2;
        MOZ_ASSERT(allocPageIndex < kNumAllocPages);
        mTag = Tag::AllocPage;
        mIndex = allocPageIndex;
      } else {
        // Even-numbered pages are guard pages.
        mTag = Tag::GuardPage;
      }
    }
  }

  bool IsNothing() const { return mTag == Tag::Nothing; }
  bool IsGuardPage() const { return mTag == Tag::GuardPage; }

  // This should only be called after IsNothing() and IsGuardPage() have been
  // checked and failed.
  uintptr_t AllocPageIndex() const {
    MOZ_RELEASE_ASSERT(mTag == Tag::AllocPage);
    return mIndex;
  }
};
// On macOS, the first __thread/thread_local access calls malloc, which leads
// to an infinite loop. So we use pthread-based TLS instead, which somehow
// doesn't have this problem.
#if !defined(XP_DARWIN)
#  define PHC_THREAD_LOCAL(T) MOZ_THREAD_LOCAL(T)
#else
#  define PHC_THREAD_LOCAL(T) \
    detail::ThreadLocal<T, detail::ThreadLocalKeyStorage>
#endif
// The virtual address space reserved by PHC. It is shared, immutable global
// state. Initialized by phc_init() and never changed after that. phc_init()
// runs early enough that no synchronization is needed.
class PHCRegion {
 private:
  // The bounds of the allocated pages.
  uint8_t* const mPagesStart;
  uint8_t* const mPagesLimit;

  // Allocates the allocation pages and the guard pages, contiguously.
  uint8_t* AllocAllPages() {
    // The memory allocated here is never freed, because it would happen at
    // process termination when it would be of little use.
    //
    // We can rely on jemalloc's behaviour that when it allocates memory
    // aligned with its own chunk size it will over-allocate and guarantee
    // that the memory after the end of our allocation, but before the next
    // chunk, is decommitted and inaccessible. Elsewhere in PHC we assume that
    // we own that page (so that memory errors in it get caught by PHC) but
    // here we use kAllPagesJemallocSize which subtracts jemalloc's guard
    // page.
    void* pages = MozJemalloc::memalign(kPhcAlign, kAllPagesJemallocSize);
    if (!pages) {
      MOZ_CRASH();
    }

    // Make the pages inaccessible.
#ifdef XP_WIN
    if (!VirtualFree(pages, kAllPagesJemallocSize, MEM_DECOMMIT)) {
      MOZ_CRASH("VirtualFree failed");
    }
#else
    if (mmap(pages, kAllPagesJemallocSize, PROT_NONE,
             MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0) == MAP_FAILED) {
      MOZ_CRASH("mmap failed");
    }
#endif

    return static_cast<uint8_t*>(pages);
  }

 public:
  PHCRegion();

  class PtrKind PtrKind(const void* aPtr) {
    class PtrKind pk(aPtr, mPagesStart, mPagesLimit);
    return pk;
  }

  bool IsInFirstGuardPage(const void* aPtr) {
    return mPagesStart <= aPtr && aPtr < mPagesStart + kPageSize;
  }

  // Get the address of the allocation page referred to via an index. Used
  // when marking the page as accessible/inaccessible.
  uint8_t* AllocPagePtr(uintptr_t aIndex) {
    MOZ_ASSERT(aIndex < kNumAllocPages);
    // Multiply by two and add one to account for allocation pages *and* guard
    // pages.
    return mPagesStart + (2 * aIndex + 1) * kPageSize;
  }
};
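// To illustrate the index arithmetic above (an illustration, not code from
// this file): pages alternate guard/alloc, starting and ending with a guard
// page, G A G A G ... A G, so:
//
//   AllocPagePtr(0) == mPagesStart + 1 * kPageSize
//   AllocPagePtr(1) == mPagesStart + 3 * kPageSize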
// This type is used as a proof-of-lock token, to make it clear which
// functions require mMutex to be locked.
using PHCLock = const MutexAutoLock&;

// Shared, mutable global state. Many fields are protected by mMutex;
// functions that access those fields should take a PHCLock as proof that
// mMutex is held. Other fields are TLS or Atomic and don't need the lock.
class PHC {
  enum class AllocPageState {
    NeverAllocated = 0,
    InUse = 1,
    Freed = 2,
  };

  // Metadata for each allocation page.
  class AllocPageInfo {
   public:
    AllocPageInfo()
        : mState(AllocPageState::NeverAllocated),
          mBaseAddr(nullptr),
          mReuseTime(0) {}

    // The current allocation page state.
    AllocPageState mState;

    // The arena that the allocation is nominally from. This isn't meaningful
    // within PHC, which has no arenas. But it is necessary for reallocation
    // of page allocations as normal allocations, such as in this code:
    //
    //   p = moz_arena_malloc(arenaId, 4096);
    //   realloc(p, 8192);
    //
    // The realloc is more than one page, and thus too large for PHC to
    // handle. Therefore, if PHC handles the first allocation, it must ask
    // mozjemalloc to allocate the 8192 bytes in the correct arena, and to do
    // that, it must call MozJemalloc::moz_arena_malloc with the correct
    // arenaId under the covers. Therefore it must record that arenaId.
    //
    // This field is also needed for jemalloc_ptr_info() to work, because it
    // also returns the arena ID (but only in debug builds).
    //
    // - NeverAllocated: must be 0.
    // - InUse | Freed: can be any valid arena ID value.
    Maybe<arena_id_t> mArenaId;

    // The starting address of the allocation. Will not be the same as the
    // page address unless the allocation is a full page.
    // - NeverAllocated: must be 0.
    // - InUse | Freed: must be within the allocation page.
    uint8_t* mBaseAddr;

    // Usable size is computed as the number of bytes between the pointer and
    // the end of the allocation page. This might be bigger than the requested
    // size, especially if an outsized alignment is requested.
    size_t UsableSize() const {
      return mState == AllocPageState::NeverAllocated
                 ? 0
                 : kPageSize - (reinterpret_cast<uintptr_t>(mBaseAddr) &
                                (kPageSize - 1));
    }

    // The internal fragmentation for this allocation.
    size_t FragmentationBytes() const {
      MOZ_ASSERT(kPageSize >= UsableSize());
      return mState == AllocPageState::InUse ? kPageSize - UsableSize() : 0;
    }

    // The allocation stack.
    // - NeverAllocated: Nothing.
    // - InUse | Freed: Some.
    Maybe<StackTrace> mAllocStack;

    // The free stack.
    // - NeverAllocated | InUse: Nothing.
    // - Freed: Some.
    Maybe<StackTrace> mFreeStack;

    // The time at which the page is available for reuse, as measured against
    // mNow. When the page is in use this value will be kMaxTime.
    // - NeverAllocated: must be 0.
    // - InUse: must be kMaxTime.
    // - Freed: must be > 0 and < kMaxTime.
    Time mReuseTime;

    // The next index for a free list of pages.
    Maybe<uintptr_t> mNextPage;
  };
 public:
  // The RNG seeds here are poor, but non-reentrant since this can be called
  // from malloc(). SetState() will reset the RNG later.
  PHC() : mRNG(RandomSeed<1>(), RandomSeed<2>()) {
    mMutex.Init();
    if (!tlsIsDisabled.init()) {
      MOZ_CRASH();
    }
    if (!tlsAllocDelay.init()) {
      MOZ_CRASH();
    }
    if (!tlsLastDelay.init()) {
      MOZ_CRASH();
    }

    // This constructor is part of PHC's very early initialisation, see
    // phc_init(), and if PHC is default-on it'll start marking allocations
    // and we must setup the delay. However once XPCOM starts it'll call
    // SetState() which will re-initialise the RNG and allocation delay.
    MutexAutoLock lock(mMutex);

    ForceSetNewAllocDelay(Rnd64ToDelay(mAvgFirstAllocDelay, Random64(lock)));

    for (uintptr_t i = 0; i < kNumAllocPages; i++) {
      AppendPageToFreeList(lock, i);
    }
  }
  uint64_t Random64(PHCLock) { return mRNG.next(); }

  bool IsPageInUse(PHCLock, uintptr_t aIndex) {
    return mAllocPages[aIndex].mState == AllocPageState::InUse;
  }

  // Is the page free? And if so, has enough time passed that we can use it?
  bool IsPageAllocatable(PHCLock, uintptr_t aIndex, Time aNow) {
    const AllocPageInfo& page = mAllocPages[aIndex];
    return page.mState != AllocPageState::InUse && aNow >= page.mReuseTime;
  }

  // Get the address of the allocation page referred to via an index. Used
  // when checking pointers against page boundaries.
  uint8_t* AllocPageBaseAddr(PHCLock, uintptr_t aIndex) {
    return mAllocPages[aIndex].mBaseAddr;
  }

  Maybe<arena_id_t> PageArena(PHCLock aLock, uintptr_t aIndex) {
    const AllocPageInfo& page = mAllocPages[aIndex];
    AssertAllocPageInUse(aLock, page);

    return page.mArenaId;
  }

  size_t PageUsableSize(PHCLock aLock, uintptr_t aIndex) {
    const AllocPageInfo& page = mAllocPages[aIndex];
    AssertAllocPageInUse(aLock, page);

    return page.UsableSize();
  }
  // The total fragmentation in PHC.
  size_t FragmentationBytes() const {
    size_t sum = 0;
    for (const auto& page : mAllocPages) {
      sum += page.FragmentationBytes();
    }
    return sum;
  }

  void SetPageInUse(PHCLock aLock, uintptr_t aIndex,
                    const Maybe<arena_id_t>& aArenaId, uint8_t* aBaseAddr,
                    const StackTrace& aAllocStack) {
    AllocPageInfo& page = mAllocPages[aIndex];
    AssertAllocPageNotInUse(aLock, page);

    page.mState = AllocPageState::InUse;
    page.mArenaId = aArenaId;
    page.mBaseAddr = aBaseAddr;
    page.mAllocStack = Some(aAllocStack);
    page.mFreeStack = Nothing();
    page.mReuseTime = kMaxTime;
    MOZ_ASSERT(!page.mNextPage);
  }

#if PHC_LOGGING
  Time GetFreeTime(uintptr_t aIndex) const { return mFreeTime[aIndex]; }
#endif
  void ResizePageInUse(PHCLock aLock, uintptr_t aIndex,
                       const Maybe<arena_id_t>& aArenaId,
                       uint8_t* aNewBaseAddr, const StackTrace& aAllocStack) {
    AllocPageInfo& page = mAllocPages[aIndex];
    AssertAllocPageInUse(aLock, page);

    // page.mState is not changed.
    if (aArenaId.isSome()) {
      // Crash if the arenas don't match.
      MOZ_RELEASE_ASSERT(page.mArenaId == aArenaId);
    }
    page.mBaseAddr = aNewBaseAddr;
    // We could just keep the original alloc stack, but the realloc stack is
    // more recent and therefore seems more useful.
    page.mAllocStack = Some(aAllocStack);
    // page.mFreeStack is not changed.
    // page.mReuseTime is not changed.
    // page.mNextPage is not changed.
  }
  void SetPageFreed(PHCLock aLock, uintptr_t aIndex,
                    const Maybe<arena_id_t>& aArenaId,
                    const StackTrace& aFreeStack, Delay aReuseDelay) {
    AllocPageInfo& page = mAllocPages[aIndex];
    AssertAllocPageInUse(aLock, page);

    page.mState = AllocPageState::Freed;

    // page.mArenaId is left unchanged, for jemalloc_ptr_info() calls that
    // occur after freeing (e.g. in the PtrInfo test in TestJemalloc.cpp).
    if (aArenaId.isSome()) {
      // Crash if the arenas don't match.
      MOZ_RELEASE_ASSERT(page.mArenaId == aArenaId);
    }

    // page.mBaseAddr (and hence the usable size) is left unchanged, for
    // reporting on UAF, and for jemalloc_ptr_info() calls that occur after
    // freeing (e.g. in the PtrInfo test in TestJemalloc.cpp).

    // page.mAllocStack is left unchanged, for reporting on UAF.

    page.mFreeStack = Some(aFreeStack);
    Time now = Now();
#if PHC_LOGGING
    mFreeTime[aIndex] = now;
#endif
    page.mReuseTime = now + aReuseDelay;

    MOZ_ASSERT(!page.mNextPage);
    AppendPageToFreeList(aLock, aIndex);
  }
  static void CrashOnGuardPage(void* aPtr) {
    // An operation on a guard page? This is a bounds violation. Deliberately
    // touch the page in question to cause a crash that triggers the usual
    // PHC machinery.
    LOG("CrashOnGuardPage(%p), bounds violation\n", aPtr);
    *static_cast<uint8_t*>(aPtr) = 0;
    MOZ_CRASH("unreachable");
  }
  void EnsureValidAndInUse(PHCLock, void* aPtr, uintptr_t aIndex)
      MOZ_REQUIRES(mMutex) {
    const AllocPageInfo& page = mAllocPages[aIndex];

    // The pointer must point to the start of the allocation.
    MOZ_RELEASE_ASSERT(page.mBaseAddr == aPtr);

    if (page.mState == AllocPageState::Freed) {
      LOG("EnsureValidAndInUse(%p), use-after-free\n", aPtr);
      // An operation on a freed page? This is a particular kind of
      // use-after-free. Deliberately touch the page in question, in order to
      // cause a crash that triggers the usual PHC machinery. But unlock
      // mMutex first, because that self-same PHC machinery needs to re-lock
      // it, and the crash causes non-local control flow so mMutex won't be
      // unlocked the normal way in the caller.
      mMutex.Unlock();
      *static_cast<uint8_t*>(aPtr) = 0;
      MOZ_CRASH("unreachable");
    }
  }
  // This expects PHC::mMutex to be locked but can't check it with a
  // parameter since we try-lock it.
  void FillAddrInfo(uintptr_t aIndex, const void* aBaseAddr, bool isGuardPage,
                    phc::AddrInfo& aOut) {
    const AllocPageInfo& page = mAllocPages[aIndex];
    if (isGuardPage) {
      aOut.mKind = phc::AddrInfo::Kind::GuardPage;
    } else {
      switch (page.mState) {
        case AllocPageState::NeverAllocated:
          aOut.mKind = phc::AddrInfo::Kind::NeverAllocatedPage;
          break;

        case AllocPageState::InUse:
          aOut.mKind = phc::AddrInfo::Kind::InUsePage;
          break;

        case AllocPageState::Freed:
          aOut.mKind = phc::AddrInfo::Kind::FreedPage;
          break;

        default:
          MOZ_CRASH();
      }
    }
    aOut.mBaseAddr = page.mBaseAddr;
    aOut.mUsableSize = page.UsableSize();
    aOut.mAllocStack = page.mAllocStack;
    aOut.mFreeStack = page.mFreeStack;
  }
  void FillJemallocPtrInfo(PHCLock, const void* aPtr, uintptr_t aIndex,
                           jemalloc_ptr_info_t* aInfo) {
    const AllocPageInfo& page = mAllocPages[aIndex];
    switch (page.mState) {
      case AllocPageState::NeverAllocated:
        break;

      case AllocPageState::InUse: {
        // Only return TagLiveAlloc if the pointer is within the bounds of the
        // allocation's usable size.
        uint8_t* base = page.mBaseAddr;
        uint8_t* limit = base + page.UsableSize();
        if (base <= aPtr && aPtr < limit) {
          *aInfo = {TagLiveAlloc, page.mBaseAddr, page.UsableSize(),
                    page.mArenaId.valueOr(0)};
          return;
        }
        break;
      }

      case AllocPageState::Freed: {
        // Only return TagFreedAlloc if the pointer is within the bounds of
        // the former allocation's usable size.
        uint8_t* base = page.mBaseAddr;
        uint8_t* limit = base + page.UsableSize();
        if (base <= aPtr && aPtr < limit) {
          *aInfo = {TagFreedAlloc, page.mBaseAddr, page.UsableSize(),
                    page.mArenaId.valueOr(0)};
          return;
        }
        break;
      }

      default:
        MOZ_CRASH();
    }

    // Pointers into guard pages will end up here, as will pointers into
    // allocation pages that aren't within the allocation's bounds.
    *aInfo = {TagUnknown, nullptr, 0, 0};
  }
#ifndef XP_WIN
  static void prefork() MOZ_NO_THREAD_SAFETY_ANALYSIS {
    PHC::sPHC->mMutex.Lock();
  }
  static void postfork_parent() MOZ_NO_THREAD_SAFETY_ANALYSIS {
    PHC::sPHC->mMutex.Unlock();
  }
  static void postfork_child() { PHC::sPHC->mMutex.Init(); }
#endif

#if PHC_LOGGING
  void IncPageAllocHits(PHCLock) { mPageAllocHits++; }
  void IncPageAllocMisses(PHCLock) { mPageAllocMisses++; }
#else
  void IncPageAllocHits(PHCLock) {}
  void IncPageAllocMisses(PHCLock) {}
#endif
  phc::PHCStats GetPageStats(PHCLock) {
    phc::PHCStats stats;

    for (const auto& page : mAllocPages) {
      stats.mSlotsAllocated += page.mState == AllocPageState::InUse ? 1 : 0;
      stats.mSlotsFreed += page.mState == AllocPageState::Freed ? 1 : 0;
    }
    stats.mSlotsUnused =
        kNumAllocPages - stats.mSlotsAllocated - stats.mSlotsFreed;

    return stats;
  }

#if PHC_LOGGING
  size_t PageAllocHits(PHCLock) { return mPageAllocHits; }
  size_t PageAllocAttempts(PHCLock) {
    return mPageAllocHits + mPageAllocMisses;
  }

  // This is an integer because FdPrintf only supports integer printing.
  size_t PageAllocHitRate(PHCLock) {
    return mPageAllocHits * 100 / (mPageAllocHits + mPageAllocMisses);
  }
#endif
870 // Should we make new PHC allocations?
871 bool ShouldMakeNewAllocations() const {
872 return mPhcState == mozilla::phc::Enabled;
875 using PHCState = mozilla::phc::PHCState;
876 void SetState(PHCState aState) {
877 if (mPhcState != PHCState::Enabled && aState == PHCState::Enabled) {
878 MutexAutoLock lock(mMutex);
879 // Reset the RNG at this point with a better seed.
880 ResetRNG(lock);
882 ForceSetNewAllocDelay(Rnd64ToDelay(mAvgFirstAllocDelay, Random64(lock)));
885 mPhcState = aState;
888 void ResetRNG(MutexAutoLock&) {
889 mRNG = non_crypto::XorShift128PlusRNG(RandomSeed<0>(), RandomSeed<1>());
892 void SetProbabilities(int64_t aAvgDelayFirst, int64_t aAvgDelayNormal,
893 int64_t aAvgDelayPageReuse) {
894 MutexAutoLock lock(mMutex);
896 mAvgFirstAllocDelay = CheckProbability(aAvgDelayFirst);
897 mAvgAllocDelay = CheckProbability(aAvgDelayNormal);
898 mAvgPageReuseDelay = CheckProbability(aAvgDelayPageReuse);
901 static void DisableOnCurrentThread() {
902 MOZ_ASSERT(!tlsIsDisabled.get());
903 tlsIsDisabled.set(true);
906 void EnableOnCurrentThread() {
907 MOZ_ASSERT(tlsIsDisabled.get());
908 tlsIsDisabled.set(false);
911 static bool IsDisabledOnCurrentThread() { return tlsIsDisabled.get(); }
913 static Time Now() {
914 if (!sPHC) {
915 return 0;
918 return sPHC->mNow;
921 void AdvanceNow(uint32_t delay = 0) {
922 mNow += tlsLastDelay.get() - delay;
923 tlsLastDelay.set(delay);
926 // Decrements the delay and returns true if it's time to make a new PHC
927 // allocation.
928 static bool DecrementDelay() {
929 const Delay alloc_delay = tlsAllocDelay.get();
931 if (MOZ_LIKELY(alloc_delay > 0)) {
932 tlsAllocDelay.set(alloc_delay - 1);
933 return false;
935 // The local delay has expired; check the shared delay. This path is also
936 // executed on a new thread's first allocation, and the result is the same:
937 // all the thread's TLS fields will be initialised.
939 // This accesses sPHC but we want to ensure it's still a static member
940 // function so that sPHC isn't dereferenced until after the hot path above.
941 MOZ_ASSERT(sPHC);
942 sPHC->AdvanceNow();
944 // Use an atomic fetch-and-subtract. This uses unsigned underflow semantics
945 // to avoid doing a full compare-and-swap.
946 Delay new_delay = (sAllocDelay -= kDelayDecrementAmount);
947 Delay old_delay = new_delay + kDelayDecrementAmount;
948 if (MOZ_LIKELY(new_delay < DELAY_MAX)) {
949 // Normal case, we decremented the shared delay but it's not yet
950 // underflowed.
951 tlsAllocDelay.set(kDelayDecrementAmount);
952 tlsLastDelay.set(kDelayDecrementAmount);
953 LOG("Update sAllocDelay <- %zu, tlsAllocDelay <- %zu\n",
954 size_t(new_delay), size_t(kDelayDecrementAmount));
955 return false;
958 if (old_delay < new_delay) {
959 // The shared delay only just underflowed, so unless we hit exactly zero
960 // we should set our local counter and continue.
961 LOG("Update sAllocDelay <- %zu, tlsAllocDelay <- %zu\n",
962 size_t(new_delay), size_t(old_delay));
963 if (old_delay == 0) {
964 // We don't need to set tlsAllocDelay because it's already zero, we know
965 // because the condition at the beginning of this function failed.
966 return true;
968 tlsAllocDelay.set(old_delay);
969 tlsLastDelay.set(old_delay);
970 return false;
973 // The delay underflowed on another thread, or on a previous failed
974 // allocation by this thread. Return true and attempt the next allocation;
975 // if another thread wins the race we'll detect that before committing.
976 LOG("Update sAllocDelay <- %zu, tlsAllocDelay <- %zu\n", size_t(new_delay),
977 size_t(alloc_delay));
978 return true;
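The two-level countdown above can be sketched with a plain `std::atomic`: a batched fetch-and-subtract on a shared counter, using unsigned wrap-around to detect expiry without a compare-and-swap. The names and constants below are hypothetical stand-ins (a global instead of TLS, small sizes), not PHC's real ones.

```cpp
#include <atomic>
#include <cstdint>

using Delay = uint32_t;
// Hypothetical stand-ins for DELAY_MAX and kDelayDecrementAmount.
constexpr Delay kDelayMax = 1u << 31;
constexpr Delay kBatch = 256;

std::atomic<Delay> gSharedDelay{512};
Delay gLocalBatch = 0;  // stand-in for the tlsAllocDelay refill

// Returns true when the shared countdown has expired and this caller should
// attempt the next "allocation".
bool DecrementSharedDelay() {
  // fetch_sub returns the previous value; unsigned wrap-around is well
  // defined, so an underflow just produces a huge value (>= kDelayMax).
  Delay oldDelay = gSharedDelay.fetch_sub(kBatch);
  Delay newDelay = oldDelay - kBatch;
  if (newDelay < kDelayMax) {
    gLocalBatch = kBatch;  // normal case: took a full batch to count locally
    return false;
  }
  if (oldDelay < newDelay) {  // this very call crossed zero
    if (oldDelay == 0) {
      return true;  // we hit exactly zero: we win
    }
    gLocalBatch = oldDelay;  // take the remainder as our local batch
    return false;
  }
  return true;  // the delay had already underflowed on an earlier call
}
```

The `oldDelay < newDelay` test is how the wrap is spotted: subtraction only makes the value larger when it crossed zero.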
981 static void ResetLocalAllocDelay(Delay aDelay = 0) {
982 // We could take some delay from the shared delay but we'd need a
983 // compare-and-swap because this is called on paths that don't make
984 // allocations. Or we can set the local delay to zero and let it get
985 // initialised on the next allocation.
986 tlsAllocDelay.set(aDelay);
987 tlsLastDelay.set(aDelay);
990 static void ForceSetNewAllocDelay(Delay aNewAllocDelay) {
991 LOG("Setting sAllocDelay <- %zu\n", size_t(aNewAllocDelay));
992 sAllocDelay = aNewAllocDelay;
993 ResetLocalAllocDelay();
996 // Set a new allocation delay. Returns true if the old delay had underflowed
997 // below zero (it's unsigned, so an underflowed value reads as very large),
998 // indicating that we won the race to make the next allocation.
999 static bool SetNewAllocDelay(Delay aNewAllocDelay) {
1000 bool cas_retry;
1001 do {
1002 // We read the current delay on every iteration; the PHC allocation is
1003 // still "up for grabs" if sAllocDelay has underflowed below zero. This is
1004 // safe even while other threads continue to fetch-and-subtract sAllocDelay
1005 // in DecrementDelay(), for up to DELAY_MAX (2^31) calls to DecrementDelay().
1006 Delay read_delay = sAllocDelay;
1007 if (read_delay < DELAY_MAX) {
1008 // Another thread already set a valid delay.
1009 LOG("Observe delay %zu this thread lost the race\n",
1010 size_t(read_delay));
1011 ResetLocalAllocDelay();
1012 return false;
1013 } else {
1014 LOG("Preparing for CAS, read sAllocDelay %zu\n", size_t(read_delay));
1017 cas_retry = !sAllocDelay.compareExchange(read_delay, aNewAllocDelay);
1018 if (cas_retry) {
1019 LOG("Lost the CAS, sAllocDelay is now %zu\n", size_t(sAllocDelay));
1020 cpu_pause();
1021 // We raced against another thread and lost.
1023 } while (cas_retry);
1024 LOG("Won the CAS, set sAllocDelay = %zu\n", size_t(sAllocDelay));
1025 ResetLocalAllocDelay();
1026 return true;
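The race in SetNewAllocDelay() boils down to: whoever's compare-exchange lands first installs the fresh delay, and everyone who observes an already-valid (non-underflowed) value backs off. A minimal sketch with hypothetical names:

```cpp
#include <atomic>
#include <cstdint>

using Delay = uint32_t;
constexpr Delay kDelayMax = 1u << 31;

// Starts "underflowed": whoever installs a fresh delay first wins.
std::atomic<Delay> gDelay{~0u};

// Returns true if this thread's compare-exchange installed the fresh delay,
// false if another thread beat it to it.
bool TrySetNewDelay(Delay fresh) {
  Delay observed = gDelay.load();
  for (;;) {
    if (observed < kDelayMax) {
      return false;  // a valid delay is already in place: we lost the race
    }
    // On failure, compare_exchange_weak reloads `observed`, so the next
    // iteration re-checks whether another thread won in the meantime.
    if (gDelay.compare_exchange_weak(observed, fresh)) {
      return true;
    }
  }
}
```

`compare_exchange_weak` may fail spuriously; the loop absorbs both spurious failures and genuine lost races.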
1029 static Delay LocalAllocDelay() { return tlsAllocDelay.get(); }
1030 static Delay SharedAllocDelay() { return sAllocDelay; }
1032 static Delay LastDelay() { return tlsLastDelay.get(); }
1034 Maybe<uintptr_t> PopNextFreeIfAllocatable(const MutexAutoLock& lock,
1035 Time now) {
1036 if (!mFreePageListHead) {
1037 return Nothing();
1040 uintptr_t index = mFreePageListHead.value();
1042 MOZ_RELEASE_ASSERT(index < kNumAllocPages);
1043 AllocPageInfo& page = mAllocPages[index];
1044 AssertAllocPageNotInUse(lock, page);
1046 if (!IsPageAllocatable(lock, index, now)) {
1047 return Nothing();
1050 mFreePageListHead = page.mNextPage;
1051 page.mNextPage = Nothing();
1052 if (!mFreePageListHead) {
1053 mFreePageListTail = Nothing();
1056 return Some(index);
1059 void UnpopNextFree(const MutexAutoLock& lock, uintptr_t index) {
1060 AllocPageInfo& page = mAllocPages[index];
1061 MOZ_ASSERT(!page.mNextPage);
1063 page.mNextPage = mFreePageListHead;
1064 mFreePageListHead = Some(index);
1065 if (!mFreePageListTail) {
1066 mFreePageListTail = Some(index);
1070 void AppendPageToFreeList(const MutexAutoLock& lock, uintptr_t aIndex) {
1071 MOZ_RELEASE_ASSERT(aIndex < kNumAllocPages);
1072 AllocPageInfo& page = mAllocPages[aIndex];
1073 MOZ_ASSERT(!page.mNextPage);
1074 MOZ_ASSERT(mFreePageListHead != Some(aIndex) &&
1075 mFreePageListTail != Some(aIndex));
1077 if (!mFreePageListTail) {
1078 // The list is empty; this page will become both the head and the tail.
1079 MOZ_ASSERT(!mFreePageListHead);
1080 mFreePageListHead = Some(aIndex);
1081 } else {
1082 MOZ_ASSERT(mFreePageListTail.value() < kNumAllocPages);
1083 AllocPageInfo& tail_page = mAllocPages[mFreePageListTail.value()];
1084 MOZ_ASSERT(!tail_page.mNextPage);
1085 tail_page.mNextPage = Some(aIndex);
1087 page.mNextPage = Nothing();
1088 mFreePageListTail = Some(aIndex);
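The head/tail manipulation above (pop from the head, append at the tail, links stored per page) amounts to an index-based FIFO, which is what gives the "naturally ordered by last-freed time" property. A minimal stand-in, not PHC's AllocPageInfo:

```cpp
#include <cstddef>
#include <optional>

constexpr size_t kNumPages = 4;  // hypothetical; PHC uses kNumAllocPages

struct Page {
  std::optional<size_t> next;  // stand-in for AllocPageInfo::mNextPage
};

// A FIFO built from per-page indices: pop from the head, append at the tail,
// so pages come back out in the order they were freed.
struct FreeList {
  Page pages[kNumPages];
  std::optional<size_t> head;
  std::optional<size_t> tail;

  std::optional<size_t> Pop() {
    if (!head) {
      return std::nullopt;
    }
    size_t idx = *head;
    head = pages[idx].next;
    pages[idx].next.reset();
    if (!head) {
      tail.reset();  // the list is now empty
    }
    return idx;
  }

  void Append(size_t idx) {
    pages[idx].next.reset();
    if (!tail) {
      head = idx;  // the list was empty; idx becomes both head and tail
    } else {
      pages[*tail].next = idx;
    }
    tail = idx;
  }
};
```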
1091 private:
1092 template <int N>
1093 uint64_t RandomSeed() {
1094 // An older version of this code used RandomUint64() here, but on Mac that
1095 // function uses arc4random(), which can allocate, which would cause
1096 // re-entry, which would be bad. So we just use time(), a local variable
1097 // address and a global variable address. These are mediocre sources of
1098 // entropy, but good enough for PHC.
1099 static_assert(N == 0 || N == 1 || N == 2, "must be 0, 1 or 2");
1100 uint64_t seed;
1101 if (N == 0) {
1102 time_t t = time(nullptr);
1103 seed = t ^ (t << 32);
1104 } else if (N == 1) {
1105 seed = uintptr_t(&seed) ^ (uintptr_t(&seed) << 32);
1106 } else {
1107 seed = uintptr_t(&sRegion) ^ (uintptr_t(&sRegion) << 32);
1109 return seed;
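The seeding trick above can be sketched on its own: each source (the time, a stack address, a global's address) is spread across both halves of a 64-bit seed without ever calling an allocating API. The helper and function names here are hypothetical.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical helper: spread a value across both halves of a 64-bit seed.
static uint64_t Spread64(uint64_t v) { return v ^ (v << 32); }

static uint64_t gAnchor;  // a global whose address is one entropy source

// Allocation-free seed material in the spirit of RandomSeed<N>(): the time,
// a stack address, and a global's address. Mediocre entropy, but good enough.
uint64_t WeakSeed(int n) {
  uint64_t local = 0;
  switch (n) {
    case 0:
      return Spread64(static_cast<uint64_t>(time(nullptr)));
    case 1:
      return Spread64(
          static_cast<uint64_t>(reinterpret_cast<uintptr_t>(&local)));
    default:
      return Spread64(
          static_cast<uint64_t>(reinterpret_cast<uintptr_t>(&gAnchor)));
  }
}
```

Note `Spread64(v)` is zero only when `v` is zero, so a non-null address always yields a non-zero seed.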
1112 void AssertAllocPageInUse(PHCLock, const AllocPageInfo& aPage) {
1113 MOZ_ASSERT(aPage.mState == AllocPageState::InUse);
1114 // There is nothing to assert about aPage.mArenaId.
1115 MOZ_ASSERT(aPage.mBaseAddr);
1116 MOZ_ASSERT(aPage.UsableSize() > 0);
1117 MOZ_ASSERT(aPage.mAllocStack.isSome());
1118 MOZ_ASSERT(aPage.mFreeStack.isNothing());
1119 MOZ_ASSERT(aPage.mReuseTime == kMaxTime);
1120 MOZ_ASSERT(!aPage.mNextPage);
1123 void AssertAllocPageNotInUse(PHCLock, const AllocPageInfo& aPage) {
1124 // We can assert a lot about `NeverAllocated` pages, but not much about
1125 // `Freed` pages.
1126 #ifdef DEBUG
1127 bool isFresh = aPage.mState == AllocPageState::NeverAllocated;
1128 MOZ_ASSERT(isFresh || aPage.mState == AllocPageState::Freed);
1129 MOZ_ASSERT_IF(isFresh, aPage.mArenaId == Nothing());
1130 MOZ_ASSERT(isFresh == (aPage.mBaseAddr == nullptr));
1131 MOZ_ASSERT(isFresh == (aPage.mAllocStack.isNothing()));
1132 MOZ_ASSERT(isFresh == (aPage.mFreeStack.isNothing()));
1133 MOZ_ASSERT(aPage.mReuseTime != kMaxTime);
1134 #endif
1137 // To improve locality we try to order these fields by how frequently they
1138 // are modified, and place all the modified-together fields early, ideally
1139 // within a single cache line.
1140 public:
1141 // The mutex that protects the other members.
1142 alignas(kCacheLineSize) Mutex mMutex MOZ_UNANNOTATED;
1144 private:
1145 // The current time. We use ReleaseAcquire semantics since we attempt to
1146 // update this by larger increments and don't want to lose an entire update.
1147 Atomic<Time, ReleaseAcquire> mNow;
1149 // This will only ever be updated from one thread. The other threads should
1150 // eventually get the update.
1151 Atomic<PHCState, Relaxed> mPhcState =
1152 Atomic<PHCState, Relaxed>(DEFAULT_STATE);
1154 // RNG for deciding which allocations to treat specially. It doesn't need to
1155 // be high quality.
1157 // This is a raw pointer for the reason explained in the comment above
1158 // PHC's constructor. Don't change it to UniquePtr or anything like that.
1159 non_crypto::XorShift128PlusRNG mRNG;
1161 // A linked list of free pages. Pages are allocated from the head of the list
1162 // and returned to the tail. The list will naturally order itself by "last
1163 // freed time" so if the head of the list can't satisfy an allocation due to
1164 // time then none of the pages can.
1165 Maybe<uintptr_t> mFreePageListHead;
1166 Maybe<uintptr_t> mFreePageListTail;
1168 #if PHC_LOGGING
1169 // How many allocations that could have been page allocs actually were, as
1170 // constrained by kNumAllocPages? If the hit ratio isn't close to 100% it's
1171 // likely that the global constants are poorly chosen.
1172 size_t mPageAllocHits = 0;
1173 size_t mPageAllocMisses = 0;
1174 #endif
1176 // The remaining fields are updated much less often, place them on the next
1177 // cache line.
1179 // The average delay before doing any page allocations at the start of a
1180 // process. Note that roughly 1 million allocations occur in the main process
1181 // while starting the browser. The delay range is 1..mAvgFirstAllocDelay*2.
1182 alignas(kCacheLineSize) Delay mAvgFirstAllocDelay = 64 * 1024;
1184 // The average delay until the next attempted page allocation, once we get
1185 // past the first delay. The delay range is 1..mAvgAllocDelay*2.
1186 Delay mAvgAllocDelay = 16 * 1024;
1188 // The average delay before reusing a freed page. Should be significantly
1189 // larger than mAvgAllocDelay, otherwise there's not much point in having it.
1190 // The delay range is (mAvgPageReuseDelay / 2)..(mAvgPageReuseDelay / 2 * 3).
1191 // This differs from the other delay ranges in having a minimum well above 1,
1192 // because a delay of 1 is so short that there is a high likelihood of bad
1193 // stacks in any crash report.
1194 Delay mAvgPageReuseDelay = 256 * 1024;
1196 // When true, PHC does as little as possible.
1198 // (a) It does not allocate any new page allocations.
1200 // (b) It avoids doing any operations that might call malloc/free/etc., which
1201 // would cause re-entry into PHC. (In practice, MozStackWalk() is the
1202 // only such operation.) Note that calls to the functions in MozJemalloc
1203 // are ok.
1205 // For example, replace_malloc() will just fall back to mozjemalloc. However,
1206 // operations involving existing allocations are more complex, because those
1207 // existing allocations may be page allocations. For example, if
1208 // replace_free() is passed a page allocation on a PHC-disabled thread, it
1209 // will free the page allocation in the usual way, but it will get a dummy
1210 // freeStack in order to avoid calling MozStackWalk(), as per (b) above.
1212 // This single disabling mechanism has two distinct uses.
1214 // - It's used to prevent re-entry into PHC, which can cause correctness
1215 // problems. For example, consider this sequence.
1217 // 1. enter replace_free()
1218 // 2. which calls PageFree()
1219 // 3. which calls MozStackWalk()
1220 // 4. which locks a mutex M, and then calls malloc
1221 // 5. enter replace_malloc()
1222 // 6. which calls MaybePageAlloc()
1223 // 7. which calls MozStackWalk()
1224 // 8. which (re)locks a mutex M --> deadlock
1226 // We avoid this sequence by "disabling" the thread in PageFree() (at step
1227 // 2), which causes MaybePageAlloc() to fail, avoiding the call to
1228 // MozStackWalk() (at step 7).
1230 // In practice, realloc or free of a PHC allocation is unlikely on a thread
1231 // that is disabled because of this use: MozStackWalk() will probably only
1232 // realloc/free allocations that it allocated itself, but those won't be
1233 // page allocations because PHC is disabled before calling MozStackWalk().
1235 // (Note that MaybePageAlloc() could safely do a page allocation so long as
1236 // it avoided calling MozStackWalk() by getting a dummy allocStack. But it
1237 // wouldn't be useful, and it would prevent the second use below.)
1239 // - It's used to prevent PHC allocations in some tests that rely on
1240 // mozjemalloc's exact allocation behaviour, which PHC does not replicate
1241 // exactly. (Note that (b) isn't necessary for this use -- MozStackWalk()
1242 // could be safely called -- but it is necessary for the first use above.)
1244 static PHC_THREAD_LOCAL(bool) tlsIsDisabled;
1246 // Delay until the next attempt at a page allocation. The delay is made up of
1247 // two parts: the shared global delay and each thread's local portion of it:
1249 // delay = sAllocDelay + sum_all_threads(tlsAllocDelay)
1251 // Threads use their local delay to reduce contention on the shared delay.
1253 // See the comment in MaybePageAlloc() for an explanation of why it uses
1254 // ReleaseAcquire semantics.
1255 static Atomic<Delay, ReleaseAcquire> sAllocDelay;
1256 static PHC_THREAD_LOCAL(Delay) tlsAllocDelay;
1258 // The last value we set tlsAllocDelay to before starting to count down.
1259 static PHC_THREAD_LOCAL(Delay) tlsLastDelay;
1261 AllocPageInfo mAllocPages[kNumAllocPages];
1262 #if PHC_LOGGING
1263 Time mFreeTime[kNumAllocPages];
1264 #endif
1266 public:
1267 Delay GetAvgAllocDelay(const MutexAutoLock&) { return mAvgAllocDelay; }
1268 Delay GetAvgFirstAllocDelay(const MutexAutoLock&) {
1269 return mAvgFirstAllocDelay;
1271 Delay GetAvgPageReuseDelay(const MutexAutoLock&) {
1272 return mAvgPageReuseDelay;
1275 // Both of these are accessed early on hot code paths. We make them both
1276 // static variables rather than making sRegion a member of sPHC to keep
1277 // these hot code paths as fast as possible. They're both "write once" so
1278 // they can share a cache line.
1279 static PHCRegion* sRegion;
1280 static PHC* sPHC;
1283 // These globals are read together and hardly ever written. They should be on
1284 // the same cache line, and in a different cache line from data that is
1285 // manipulated often (mMutex and mNow are members of sPHC for that reason) so
1286 // that this cache line can be shared among cores. This makes a measurable
1287 // difference to calls to maybe_init().
1288 alignas(kCacheLineSize) PHCRegion* PHC::sRegion;
1289 PHC* PHC::sPHC;
1291 PHC_THREAD_LOCAL(bool) PHC::tlsIsDisabled;
1292 PHC_THREAD_LOCAL(Delay) PHC::tlsAllocDelay;
1293 Atomic<Delay, ReleaseAcquire> PHC::sAllocDelay;
1294 PHC_THREAD_LOCAL(Delay) PHC::tlsLastDelay;
1296 // This must be defined after the PHC class.
1297 PHCRegion::PHCRegion()
1298 : mPagesStart(AllocAllPages()), mPagesLimit(mPagesStart + kAllPagesSize) {
1299 LOG("AllocAllPages at %p..%p\n", mPagesStart, mPagesLimit);
1302 // When PHC wants to crash we first have to unlock so that the crash reporter
1303 // can call into PHC to look up its pointer. That also means the state must be
1304 // made consistent before calling PHCCrash. Because this can report an
1305 // arbitrary string, use of it must be reviewed by Firefox data stewards.
1306 static void PHCCrash(PHCLock, const char* aMessage)
1307 MOZ_REQUIRES(PHC::sPHC->mMutex) {
1308 PHC::sPHC->mMutex.Unlock();
1309 MOZ_CRASH_UNSAFE(aMessage);
1312 class AutoDisableOnCurrentThread {
1313 public:
1314 AutoDisableOnCurrentThread(const AutoDisableOnCurrentThread&) = delete;
1316 const AutoDisableOnCurrentThread& operator=(
1317 const AutoDisableOnCurrentThread&) = delete;
1319 explicit AutoDisableOnCurrentThread() { PHC::DisableOnCurrentThread(); }
1320 ~AutoDisableOnCurrentThread() { PHC::sPHC->EnableOnCurrentThread(); }
1323 //---------------------------------------------------------------------------
1324 // Initialisation
1325 //---------------------------------------------------------------------------
1327 // WARNING: this function runs *very* early -- before all static initializers
1328 // have run. For this reason, non-scalar globals (sRegion, sPHC) are allocated
1329 // dynamically (so we can guarantee their construction in this function) rather
1330 // than statically.
1331 static bool phc_init() {
1332 if (GetKernelPageSize() != kPageSize) {
1333 return false;
1336 // sRegion and sPHC are never freed. They live for the life of the process.
1337 PHC::sRegion = InfallibleAllocPolicy::new_<PHCRegion>();
1339 PHC::sPHC = InfallibleAllocPolicy::new_<PHC>();
1341 #ifndef XP_WIN
1342 // Avoid deadlocks when forking by acquiring our state lock prior to forking
1343 // and releasing it after forking. See |LogAlloc|'s |phc_init| for
1344 // in-depth details.
1345 pthread_atfork(PHC::prefork, PHC::postfork_parent, PHC::postfork_child);
1346 #endif
1348 return true;
1351 static inline bool maybe_init() {
1352 // This runs on hot paths and we can save some memory accesses by using sPHC
1353 // to test if we've already initialised PHC successfully.
1354 if (MOZ_UNLIKELY(!PHC::sPHC)) {
1355 // The lambda will only be called once and is thread safe.
1356 static bool sInitSuccess = []() { return phc_init(); }();
1357 return sInitSuccess;
1360 return true;
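The lambda-initialised static in maybe_init() relies on C++11's guarantee that a function-local static is constructed exactly once, even under concurrent first calls. A self-contained sketch (the counter is added here just to make the once-only property observable):

```cpp
#include <atomic>

std::atomic<int> gInitCalls{0};

static bool ExpensiveInit() {
  gInitCalls.fetch_add(1);
  return true;  // pretend initialisation succeeded
}

// Mirrors maybe_init()'s pattern: the lambda body runs exactly once, and
// every later caller just reads the cached result.
static bool MaybeInit() {
  static bool sInitSuccess = []() { return ExpensiveInit(); }();
  return sInitSuccess;
}
```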
1363 //---------------------------------------------------------------------------
1364 // Page allocation operations
1365 //---------------------------------------------------------------------------
1367 // This is the hot path for testing whether we should make a PHC allocation;
1368 // it should be inlined into the caller, while the remainder of the tests,
1369 // which are in MaybePageAlloc, need not be inlined.
1370 static MOZ_ALWAYS_INLINE bool ShouldPageAllocHot(size_t aReqSize) {
1371 if (MOZ_UNLIKELY(!maybe_init())) {
1372 return false;
1375 if (MOZ_UNLIKELY(aReqSize > kPageSize)) {
1376 return false;
1379 // Decrement the delay. If it's zero, we do a page allocation and reset the
1380 // delay to a random number.
1381 if (MOZ_LIKELY(!PHC::DecrementDelay())) {
1382 return false;
1385 return true;
1388 static void LogNoAlloc(size_t aReqSize, size_t aAlignment,
1389 Delay newAllocDelay) {
1390 // No pages are available, or VirtualAlloc/mprotect failed.
1391 #if PHC_LOGGING
1392 phc::PHCStats stats = PHC::sPHC->GetPageStats(lock);
1393 #endif
1394 LOG("No PageAlloc(%zu, %zu), sAllocDelay <- %zu, fullness %zu/%zu/%zu, "
1395 "hits %zu/%zu (%zu%%)\n",
1396 aReqSize, aAlignment, size_t(newAllocDelay), stats.mSlotsAllocated,
1397 stats.mSlotsFreed, kNumAllocPages, PHC::sPHC->PageAllocHits(lock),
1398 PHC::sPHC->PageAllocAttempts(lock), PHC::sPHC->PageAllocHitRate(lock));
1401 // Attempt a page allocation if the time and the size are right. Allocated
1402 // memory is zeroed if aZero is true. On failure, the caller should attempt a
1403 // normal allocation via MozJemalloc. It must not be called while
1404 // PHC::sPHC->mMutex is held, because it locks that mutex itself.
1405 static void* MaybePageAlloc(const Maybe<arena_id_t>& aArenaId, size_t aReqSize,
1406 size_t aAlignment, bool aZero) {
1407 MOZ_ASSERT(IsPowerOfTwo(aAlignment));
1408 MOZ_ASSERT(PHC::sPHC);
1409 if (!PHC::sPHC->ShouldMakeNewAllocations()) {
1410 // Reset the allocation delay so that we take the fast path most of the
1411 // time. Rather than take the lock and use the RNG which are unnecessary
1412 // when PHC is disabled, instead set the delay to a reasonably high number,
1413 // the default average first allocation delay. This is reset when PHC is
1414 // re-enabled anyway.
1415 PHC::ForceSetNewAllocDelay(kDelayResetWhenDisabled);
1416 return nullptr;
1419 if (PHC::IsDisabledOnCurrentThread()) {
1420 // We don't reset sAllocDelay since that might affect other threads. We
1421 // assume this is okay because either this thread will be re-enabled after
1422 // less than DELAY_MAX allocations or that there are other active threads
1423 // that will reset sAllocDelay. We do reset our local delay which will
1424 // cause this thread to "back off" from updating sAllocDelay on future
1425 // allocations.
1426 PHC::ResetLocalAllocDelay(kDelayBackoffAmount);
1427 return nullptr;
1430 // Disable on this thread *before* getting the stack trace.
1431 AutoDisableOnCurrentThread disable;
1433 // Get the stack trace *before* locking the mutex. If we return nullptr then
1434 // it was a waste, but it's not so frequent, and doing a stack walk while
1435 // the mutex is locked is problematic (see the big comment on
1436 // StackTrace::Fill() for details).
1437 StackTrace allocStack;
1438 allocStack.Fill();
1440 MutexAutoLock lock(PHC::sPHC->mMutex);
1442 Time now = PHC::Now();
1444 Delay newAllocDelay = Rnd64ToDelay(PHC::sPHC->GetAvgAllocDelay(lock),
1445 PHC::sPHC->Random64(lock));
1446 if (!PHC::sPHC->SetNewAllocDelay(newAllocDelay)) {
1447 return nullptr;
1450 // Pages are allocated from a free list populated in order of when they're
1451 // freed. If the page at the head of the list is too recently freed to be
1452 // reused then no other pages on the list will be either.
1454 Maybe<uintptr_t> mb_index = PHC::sPHC->PopNextFreeIfAllocatable(lock, now);
1455 if (!mb_index) {
1456 PHC::sPHC->IncPageAllocMisses(lock);
1457 LogNoAlloc(aReqSize, aAlignment, newAllocDelay);
1458 return nullptr;
1460 uintptr_t index = mb_index.value();
1462 #if PHC_LOGGING
1463 Time lifetime = 0;
1464 #endif
1465 uint8_t* pagePtr = PHC::sRegion->AllocPagePtr(index);
1466 MOZ_ASSERT(pagePtr);
1467 bool ok =
1468 #ifdef XP_WIN
1469 !!VirtualAlloc(pagePtr, kPageSize, MEM_COMMIT, PAGE_READWRITE);
1470 #else
1471 mprotect(pagePtr, kPageSize, PROT_READ | PROT_WRITE) == 0;
1472 #endif
1474 if (!ok) {
1475 PHC::sPHC->UnpopNextFree(lock, index);
1476 PHC::sPHC->IncPageAllocMisses(lock);
1477 LogNoAlloc(aReqSize, aAlignment, newAllocDelay);
1478 return nullptr;
1481 size_t usableSize = MozJemalloc::malloc_good_size(aReqSize);
1482 MOZ_ASSERT(usableSize > 0);
1484 // Put the allocation as close to the end of the page as possible,
1485 // allowing for alignment requirements.
1486 uint8_t* ptr = pagePtr + kPageSize - usableSize;
1487 if (aAlignment != 1) {
1488 ptr = reinterpret_cast<uint8_t*>(
1489 (reinterpret_cast<uintptr_t>(ptr) & ~(aAlignment - 1)));
1492 #if PHC_LOGGING
1493 Time then = PHC::sPHC->GetFreeTime(index);
1494 lifetime = then != 0 ? now - then : 0;
1495 #endif
1497 PHC::sPHC->SetPageInUse(lock, index, aArenaId, ptr, allocStack);
1499 if (aZero) {
1500 memset(ptr, 0, usableSize);
1501 } else {
1502 #ifdef DEBUG
1503 memset(ptr, kAllocJunk, usableSize);
1504 #endif
1507 PHC::sPHC->IncPageAllocHits(lock);
1508 #if PHC_LOGGING
1509 phc::PHCStats stats = PHC::sPHC->GetPageStats(lock);
1510 #endif
1511 LOG("PageAlloc(%zu, %zu) -> %p[%zu]/%p (%zu) (z%zu), sAllocDelay <- %zu, "
1512 "fullness %zu/%zu/%zu, hits %zu/%zu (%zu%%), lifetime %zu\n",
1513 aReqSize, aAlignment, pagePtr, index, ptr, usableSize, size_t(newAllocDelay),
1514 size_t(PHC::SharedAllocDelay()), stats.mSlotsAllocated, stats.mSlotsFreed,
1515 kNumAllocPages, PHC::sPHC->PageAllocHits(lock),
1516 PHC::sPHC->PageAllocAttempts(lock), PHC::sPHC->PageAllocHitRate(lock),
1517 lifetime);
1519 return ptr;
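The end-of-page placement inside MaybePageAlloc() is just pointer arithmetic: start from `pagePtr + kPageSize - usableSize`, then mask down to the alignment. Isolated as a sketch (the function name and page constant here are hypothetical):

```cpp
#include <cstdint>

constexpr uintptr_t kPageBytes = 4096;  // stand-in for kPageSize

// Place an allocation of usableSize bytes as close to the end of a page as a
// power-of-two alignment allows, as MaybePageAlloc() does. Any bytes shaved
// off by the rounding become slack between the allocation and the guard page.
uintptr_t PlaceAtPageEnd(uintptr_t pageBase, uintptr_t usableSize,
                         uintptr_t alignment) {
  uintptr_t ptr = pageBase + kPageBytes - usableSize;
  ptr &= ~(alignment - 1);  // round down; a no-op when alignment == 1
  return ptr;
}
```

With `alignment == 1` the mask is all ones, so the allocation abuts the guard page exactly; larger alignments trade a few bytes of slack for the required boundary.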
1522 static void FreePage(PHCLock aLock, uintptr_t aIndex,
1523 const Maybe<arena_id_t>& aArenaId,
1524 const StackTrace& aFreeStack, Delay aReuseDelay)
1525 MOZ_REQUIRES(PHC::sPHC->mMutex) {
1526 void* pagePtr = PHC::sRegion->AllocPagePtr(aIndex);
1528 #ifdef XP_WIN
1529 if (!VirtualFree(pagePtr, kPageSize, MEM_DECOMMIT)) {
1530 PHCCrash(aLock, "VirtualFree failed");
1532 #else
1533 if (mmap(pagePtr, kPageSize, PROT_NONE, MAP_FIXED | MAP_PRIVATE | MAP_ANON,
1534 -1, 0) == MAP_FAILED) {
1535 PHCCrash(aLock, "mmap failed");
1537 #endif
1539 PHC::sPHC->SetPageFreed(aLock, aIndex, aArenaId, aFreeStack, aReuseDelay);
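The non-Windows branch of FreePage() is worth isolating: remapping a fixed address with PROT_NONE both blocks every access (so a use-after-free faults) and discards the old contents, and a later mprotect "recommits" fresh zeroed memory. This is a POSIX-only sketch with hypothetical helper names, mirroring the calls above under the assumption of Linux/macOS `mmap` semantics.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Decommit: replace the page with an inaccessible anonymous mapping.
bool DecommitPage(void* page, size_t len) {
  return mmap(page, len, PROT_NONE, MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1,
              0) != MAP_FAILED;
}

// Recommit: make the (now fresh, zero-filled) page readable and writable.
bool RecommitPage(void* page, size_t len) {
  return mprotect(page, len, PROT_READ | PROT_WRITE) == 0;
}
```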
1542 //---------------------------------------------------------------------------
1543 // replace-malloc machinery
1544 //---------------------------------------------------------------------------
1546 // This handles malloc, moz_arena_malloc, and realloc-with-a-nullptr.
1547 MOZ_ALWAYS_INLINE static void* PageMalloc(const Maybe<arena_id_t>& aArenaId,
1548 size_t aReqSize) {
1549 void* ptr = ShouldPageAllocHot(aReqSize)
1550 // The test on aArenaId here helps the compiler optimise away
1551 // the construction of Nothing() in the caller.
1552 ? MaybePageAlloc(aArenaId.isSome() ? aArenaId : Nothing(),
1553 aReqSize, /* aAlignment */ 1,
1554 /* aZero */ false)
1555 : nullptr;
1556 return ptr ? ptr
1557 : (aArenaId.isSome()
1558 ? MozJemalloc::moz_arena_malloc(*aArenaId, aReqSize)
1559 : MozJemalloc::malloc(aReqSize));
1562 inline void* MozJemallocPHC::malloc(size_t aReqSize) {
1563 return PageMalloc(Nothing(), aReqSize);
1566 static Delay ReuseDelay(PHCLock aLock) {
1567 Delay avg_reuse_delay = PHC::sPHC->GetAvgPageReuseDelay(aLock);
1568 return (avg_reuse_delay / 2) +
1569 Rnd64ToDelay(avg_reuse_delay / 2, PHC::sPHC->Random64(aLock));
1572 // This handles both calloc and moz_arena_calloc.
1573 MOZ_ALWAYS_INLINE static void* PageCalloc(const Maybe<arena_id_t>& aArenaId,
1574 size_t aNum, size_t aReqSize) {
1575 CheckedInt<size_t> checkedSize = CheckedInt<size_t>(aNum) * aReqSize;
1576 if (!checkedSize.isValid()) {
1577 return nullptr;
1580 void* ptr = ShouldPageAllocHot(checkedSize.value())
1581 // The test on aArenaId here helps the compiler optimise away
1582 // the construction of Nothing() in the caller.
1583 ? MaybePageAlloc(aArenaId.isSome() ? aArenaId : Nothing(),
1584 checkedSize.value(), /* aAlignment */ 1,
1585 /* aZero */ true)
1586 : nullptr;
1587 return ptr ? ptr
1588 : (aArenaId.isSome()
1589 ? MozJemalloc::moz_arena_calloc(*aArenaId, aNum, aReqSize)
1590 : MozJemalloc::calloc(aNum, aReqSize));
1593 inline void* MozJemallocPHC::calloc(size_t aNum, size_t aReqSize) {
1594 return PageCalloc(Nothing(), aNum, aReqSize);
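PageCalloc()'s CheckedInt guard is the standard calloc overflow check: reject `aNum * aReqSize` when the product doesn't fit in `size_t`. A standalone sketch using the GCC/Clang overflow builtin (an assumption here; MSVC would need a different check):

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Overflow-checked num * size in the spirit of the CheckedInt test in
// PageCalloc(). Returns nullopt when the product overflows size_t.
std::optional<size_t> CheckedMul(size_t num, size_t size) {
  size_t total;
  if (__builtin_mul_overflow(num, size, &total)) {
    return std::nullopt;
  }
  return total;
}
```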
1597 // This function handles both realloc and moz_arena_realloc.
1599 // As always, realloc is complicated, and doubly so when there are two
1600 // different kinds of allocations in play. Here are the possible transitions,
1601 // and what we do in practice.
1603 // - normal-to-normal: This is straightforward and obviously necessary.
1605 // - normal-to-page: This is disallowed because it would require getting the
1606 // arenaId of the normal allocation, which isn't possible in non-DEBUG builds
1607 // for security reasons.
1609 // - page-to-page: This is done whenever possible, i.e. whenever the new size
1610 // is less than or equal to 4 KiB. This choice counterbalances the
1611 // disallowing of normal-to-page allocations, in order to avoid biasing
1612 // towards or away from page allocations. It always occurs in-place.
1614 // - page-to-normal: this is done only when necessary, i.e. only when the new
1615 // size is greater than 4 KiB. This choice naturally flows from the
1616 // prior choice on page-to-page transitions.
1618 // In summary: realloc doesn't change the allocation kind unless it must.
1620 // This function may return:
1621 // - Some(pointer) when PHC handled the reallocation.
1622 // - Some(nullptr) when PHC should have handled a page-to-normal transition
1623 // but couldn't because of OOM.
1624 // - Nothing() when PHC is disabled or the original allocation was not
1625 // under PHC.
1626 MOZ_ALWAYS_INLINE static Maybe<void*> MaybePageRealloc(
1627 const Maybe<arena_id_t>& aArenaId, void* aOldPtr, size_t aNewSize) {
1628 if (!aOldPtr) {
1629 // Null pointer. Treat like malloc(aNewSize).
1630 return Some(PageMalloc(aArenaId, aNewSize));
1633 if (!maybe_init()) {
1634 return Nothing();
1637 PtrKind pk = PHC::sRegion->PtrKind(aOldPtr);
1638 if (pk.IsNothing()) {
1639 // A normal-to-normal transition.
1640 return Nothing();
1643 if (pk.IsGuardPage()) {
1644 PHC::CrashOnGuardPage(aOldPtr);
1647 // At this point we know we have an allocation page.
1648 uintptr_t index = pk.AllocPageIndex();
1650 // A page-to-something transition.
1651 PHC::sPHC->AdvanceNow(PHC::LocalAllocDelay());
1653 // Note that `disable` has no effect unless it is emplaced below.
1654 Maybe<AutoDisableOnCurrentThread> disable;
1655 // Get the stack trace *before* locking the mutex.
1656 StackTrace stack;
1657 if (PHC::IsDisabledOnCurrentThread()) {
1658 // PHC is disabled on this thread. Leave the stack empty.
1659 } else {
1660 // Disable on this thread *before* getting the stack trace.
1661 disable.emplace();
1662 stack.Fill();
1665 MutexAutoLock lock(PHC::sPHC->mMutex);
1667 // Check for realloc() of a freed block.
1668 PHC::sPHC->EnsureValidAndInUse(lock, aOldPtr, index);
1670 if (aNewSize <= kPageSize && PHC::sPHC->ShouldMakeNewAllocations()) {
1671 // A page-to-page transition. Just keep using the page allocation. We do
1672 // this even if the thread is disabled, because it doesn't create a new
1673 // page allocation. Note that ResizePageInUse() checks aArenaId.
1675 // Move the bytes with memmove(), because the old allocation and the new
1676 // allocation overlap. Move the usable size rather than the requested size,
1677 // because the user might have used malloc_usable_size() and filled up the
1678 // usable size.
1679 size_t oldUsableSize = PHC::sPHC->PageUsableSize(lock, index);
1680 size_t newUsableSize = MozJemalloc::malloc_good_size(aNewSize);
1681 uint8_t* pagePtr = PHC::sRegion->AllocPagePtr(index);
1682 uint8_t* newPtr = pagePtr + kPageSize - newUsableSize;
1683 memmove(newPtr, aOldPtr, std::min(oldUsableSize, aNewSize));
1684 PHC::sPHC->ResizePageInUse(lock, index, aArenaId, newPtr, stack);
1685 LOG("PageRealloc-Reuse(%p, %zu) -> %p\n", aOldPtr, aNewSize, newPtr);
1686 return Some(newPtr);
  // A page-to-normal transition (with the new size greater than page-sized).
  // (Note that aArenaId is checked below.)
  void* newPtr;
  if (aArenaId.isSome()) {
    newPtr = MozJemalloc::moz_arena_malloc(*aArenaId, aNewSize);
  } else {
    Maybe<arena_id_t> oldArenaId = PHC::sPHC->PageArena(lock, index);
    newPtr = (oldArenaId.isSome()
                  ? MozJemalloc::moz_arena_malloc(*oldArenaId, aNewSize)
                  : MozJemalloc::malloc(aNewSize));
  }
  if (!newPtr) {
    return Some(nullptr);
  }

  Delay reuseDelay = ReuseDelay(lock);

  // Copy the usable size rather than the requested size, because the user
  // might have used malloc_usable_size() and filled up the usable size. Note
  // that FreePage() checks aArenaId (via SetPageFreed()).
  size_t oldUsableSize = PHC::sPHC->PageUsableSize(lock, index);
  memcpy(newPtr, aOldPtr, std::min(oldUsableSize, aNewSize));
  FreePage(lock, index, aArenaId, stack, reuseDelay);
  LOG("PageRealloc-Free(%p[%zu], %zu) -> %p, %zu delay, reuse at ~%zu\n",
      aOldPtr, index, aNewSize, newPtr, size_t(reuseDelay),
      size_t(PHC::Now()) + reuseDelay);

  return Some(newPtr);
}
MOZ_ALWAYS_INLINE static void* PageRealloc(const Maybe<arena_id_t>& aArenaId,
                                           void* aOldPtr, size_t aNewSize) {
  Maybe<void*> ptr = MaybePageRealloc(aArenaId, aOldPtr, aNewSize);

  return ptr.isSome()
             ? *ptr
             : (aArenaId.isSome() ? MozJemalloc::moz_arena_realloc(
                                        *aArenaId, aOldPtr, aNewSize)
                                  : MozJemalloc::realloc(aOldPtr, aNewSize));
}

inline void* MozJemallocPHC::realloc(void* aOldPtr, size_t aNewSize) {
  return PageRealloc(Nothing(), aOldPtr, aNewSize);
}
// This handles both free and moz_arena_free.
static void DoPageFree(const Maybe<arena_id_t>& aArenaId, void* aPtr) {
  PtrKind pk = PHC::sRegion->PtrKind(aPtr);
  if (pk.IsGuardPage()) {
    PHC::CrashOnGuardPage(aPtr);
  }

  // At this point we know we have an allocation page.
  PHC::sPHC->AdvanceNow(PHC::LocalAllocDelay());
  uintptr_t index = pk.AllocPageIndex();

  // Note that `disable` has no effect unless it is emplaced below.
  Maybe<AutoDisableOnCurrentThread> disable;
  // Get the stack trace *before* locking the mutex.
  StackTrace freeStack;
  if (PHC::IsDisabledOnCurrentThread()) {
    // PHC is disabled on this thread. Leave the stack empty.
  } else {
    // Disable on this thread *before* getting the stack trace.
    disable.emplace();
    freeStack.Fill();
  }

  MutexAutoLock lock(PHC::sPHC->mMutex);

  // Check for a double-free.
  PHC::sPHC->EnsureValidAndInUse(lock, aPtr, index);

  // Note that FreePage() checks aArenaId (via SetPageFreed()).
  Delay reuseDelay = ReuseDelay(lock);
  FreePage(lock, index, aArenaId, freeStack, reuseDelay);

#if PHC_LOGGING
  phc::PHCStats stats = PHC::sPHC->GetPageStats(lock);
#endif
  LOG("PageFree(%p[%zu]), %zu delay, reuse at ~%zu, fullness %zu/%zu/%zu\n",
      aPtr, index, size_t(reuseDelay), size_t(PHC::Now()) + reuseDelay,
      stats.mSlotsAllocated, stats.mSlotsFreed, kNumAllocPages);
}
MOZ_ALWAYS_INLINE static bool FastIsPHCPtr(void* aPtr) {
  if (MOZ_UNLIKELY(!maybe_init())) {
    return false;
  }

  PtrKind pk = PHC::sRegion->PtrKind(aPtr);
  return !pk.IsNothing();
}

MOZ_ALWAYS_INLINE static void PageFree(const Maybe<arena_id_t>& aArenaId,
                                       void* aPtr) {
  if (MOZ_UNLIKELY(FastIsPHCPtr(aPtr))) {
    // The ternary expression here helps the compiler optimise away the
    // construction of Nothing() in the caller.
    DoPageFree(aArenaId.isSome() ? aArenaId : Nothing(), aPtr);
    return;
  }

  aArenaId.isSome() ? MozJemalloc::moz_arena_free(*aArenaId, aPtr)
                    : MozJemalloc::free(aPtr);
}

inline void MozJemallocPHC::free(void* aPtr) { PageFree(Nothing(), aPtr); }
// This handles memalign and moz_arena_memalign.
MOZ_ALWAYS_INLINE static void* PageMemalign(const Maybe<arena_id_t>& aArenaId,
                                            size_t aAlignment,
                                            size_t aReqSize) {
  MOZ_RELEASE_ASSERT(IsPowerOfTwo(aAlignment));

  // PHC can't satisfy an alignment greater than a page size, so fall back to
  // mozjemalloc in that case.
  void* ptr = nullptr;
  if (ShouldPageAllocHot(aReqSize) && aAlignment <= kPageSize) {
    // The test on aArenaId here helps the compiler optimise away
    // the construction of Nothing() in the caller.
    ptr = MaybePageAlloc(aArenaId.isSome() ? aArenaId : Nothing(), aReqSize,
                         aAlignment, /* aZero */ false);
  }
  return ptr ? ptr
             : (aArenaId.isSome()
                    ? MozJemalloc::moz_arena_memalign(*aArenaId, aAlignment,
                                                      aReqSize)
                    : MozJemalloc::memalign(aAlignment, aReqSize));
}

inline void* MozJemallocPHC::memalign(size_t aAlignment, size_t aReqSize) {
  return PageMemalign(Nothing(), aAlignment, aReqSize);
}
inline size_t MozJemallocPHC::malloc_usable_size(usable_ptr_t aPtr) {
  if (!maybe_init()) {
    return MozJemalloc::malloc_usable_size(aPtr);
  }

  PtrKind pk = PHC::sRegion->PtrKind(aPtr);
  if (pk.IsNothing()) {
    // Not a page allocation. Measure it normally.
    return MozJemalloc::malloc_usable_size(aPtr);
  }

  if (pk.IsGuardPage()) {
    PHC::CrashOnGuardPage(const_cast<void*>(aPtr));
  }

  // At this point we know aPtr lands within an allocation page, due to the
  // math done in the PtrKind constructor. But if aPtr points to memory
  // before the base address of the allocation, we return 0.
  uintptr_t index = pk.AllocPageIndex();

  MutexAutoLock lock(PHC::sPHC->mMutex);

  void* pageBaseAddr = PHC::sPHC->AllocPageBaseAddr(lock, index);

  if (MOZ_UNLIKELY(aPtr < pageBaseAddr)) {
    return 0;
  }

  return PHC::sPHC->PageUsableSize(lock, index);
}

static size_t metadata_size() {
  return MozJemalloc::malloc_usable_size(PHC::sRegion) +
         MozJemalloc::malloc_usable_size(PHC::sPHC);
}
inline void MozJemallocPHC::jemalloc_stats_internal(
    jemalloc_stats_t* aStats, jemalloc_bin_stats_t* aBinStats) {
  MozJemalloc::jemalloc_stats_internal(aStats, aBinStats);

  if (!maybe_init()) {
    // If we're not initialised, then we're not using any additional memory and
    // have nothing to add to the report.
    return;
  }

  // We allocate our memory from jemalloc, so it has already counted our memory
  // usage within "mapped" and "allocated". We must subtract the memory we
  // allocated from jemalloc from "allocated" before adding in only the parts
  // that we have allocated out to Firefox.

  aStats->allocated -= kAllPagesJemallocSize;

  size_t allocated = 0;
  {
    MutexAutoLock lock(PHC::sPHC->mMutex);

    // Add usable space of in-use allocations to `allocated`.
    for (size_t i = 0; i < kNumAllocPages; i++) {
      if (PHC::sPHC->IsPageInUse(lock, i)) {
        allocated += PHC::sPHC->PageUsableSize(lock, i);
      }
    }
  }
  aStats->allocated += allocated;

  // guards is the gap between `allocated` and `mapped`. In some ways this
  // almost fits into aStats->wasted, since it feels like wasted memory.
  // However, wasted should only include committed memory and these guard
  // pages are uncommitted. Therefore we don't include it anywhere.
  // size_t guards = mapped - allocated;

  // aStats.page_cache and aStats.bin_unused are left unchanged because PHC
  // doesn't have anything corresponding to those.

  // The metadata is stored in normal heap allocations, so it's measured by
  // mozjemalloc as `allocated`. Move it into `bookkeeping`.
  // It's also reported under explicit/heap-overhead/phc/fragmentation in
  // about:memory.
  size_t bookkeeping = metadata_size();
  aStats->allocated -= bookkeeping;
  aStats->bookkeeping += bookkeeping;
}
inline void MozJemallocPHC::jemalloc_stats_lite(jemalloc_stats_lite_t* aStats) {
  MozJemalloc::jemalloc_stats_lite(aStats);
}

inline void MozJemallocPHC::jemalloc_ptr_info(const void* aPtr,
                                              jemalloc_ptr_info_t* aInfo) {
  if (!maybe_init()) {
    return MozJemalloc::jemalloc_ptr_info(aPtr, aInfo);
  }

  // We need to implement this properly, because various code locations do
  // things like checking that allocations are in the expected arena.
  PtrKind pk = PHC::sRegion->PtrKind(aPtr);
  if (pk.IsNothing()) {
    // Not a page allocation.
    return MozJemalloc::jemalloc_ptr_info(aPtr, aInfo);
  }

  if (pk.IsGuardPage()) {
    // Treat a guard page as unknown because there's no better alternative.
    *aInfo = {TagUnknown, nullptr, 0, 0};
    return;
  }

  // At this point we know we have an allocation page.
  uintptr_t index = pk.AllocPageIndex();

  MutexAutoLock lock(PHC::sPHC->mMutex);

  PHC::sPHC->FillJemallocPtrInfo(lock, aPtr, index, aInfo);
#if DEBUG
  LOG("JemallocPtrInfo(%p[%zu]) -> {%zu, %p, %zu, %zu}\n", aPtr, index,
      size_t(aInfo->tag), aInfo->addr, aInfo->size, aInfo->arenaId);
#else
  LOG("JemallocPtrInfo(%p[%zu]) -> {%zu, %p, %zu}\n", aPtr, index,
      size_t(aInfo->tag), aInfo->addr, aInfo->size);
#endif
}
inline void* MozJemallocPHC::moz_arena_malloc(arena_id_t aArenaId,
                                              size_t aReqSize) {
  return PageMalloc(Some(aArenaId), aReqSize);
}

inline void* MozJemallocPHC::moz_arena_calloc(arena_id_t aArenaId, size_t aNum,
                                              size_t aReqSize) {
  return PageCalloc(Some(aArenaId), aNum, aReqSize);
}

inline void* MozJemallocPHC::moz_arena_realloc(arena_id_t aArenaId,
                                               void* aOldPtr, size_t aNewSize) {
  return PageRealloc(Some(aArenaId), aOldPtr, aNewSize);
}

inline void MozJemallocPHC::moz_arena_free(arena_id_t aArenaId, void* aPtr) {
  return PageFree(Some(aArenaId), aPtr);
}

inline void* MozJemallocPHC::moz_arena_memalign(arena_id_t aArenaId,
                                                size_t aAlignment,
                                                size_t aReqSize) {
  return PageMemalign(Some(aArenaId), aAlignment, aReqSize);
}
namespace mozilla::phc {

bool IsPHCAllocation(const void* aPtr, AddrInfo* aOut) {
  if (!maybe_init()) {
    return false;
  }

  PtrKind pk = PHC::sRegion->PtrKind(aPtr);
  if (pk.IsNothing()) {
    return false;
  }

  bool isGuardPage = false;
  if (pk.IsGuardPage()) {
    if ((uintptr_t(aPtr) % kPageSize) < (kPageSize / 2)) {
      // The address is in the lower half of a guard page, so it's probably an
      // overflow. But first check that it is not on the very first guard
      // page, in which case it cannot be an overflow, and we ignore it.
      if (PHC::sRegion->IsInFirstGuardPage(aPtr)) {
        return false;
      }

      // Get the allocation page preceding this guard page.
      pk = PHC::sRegion->PtrKind(static_cast<const uint8_t*>(aPtr) - kPageSize);

    } else {
      // The address is in the upper half of a guard page, so it's probably an
      // underflow. Get the allocation page following this guard page.
      pk = PHC::sRegion->PtrKind(static_cast<const uint8_t*>(aPtr) + kPageSize);
    }

    // Make a note of the fact that we hit a guard page.
    isGuardPage = true;
  }

  // At this point we know we have an allocation page.
  uintptr_t index = pk.AllocPageIndex();

  if (aOut) {
    if (PHC::sPHC->mMutex.TryLock()) {
      PHC::sPHC->FillAddrInfo(index, aPtr, isGuardPage, *aOut);
      LOG("IsPHCAllocation: %zu, %p, %zu, %zu, %zu\n", size_t(aOut->mKind),
          aOut->mBaseAddr, aOut->mUsableSize,
          aOut->mAllocStack.isSome() ? aOut->mAllocStack->mLength : 0,
          aOut->mFreeStack.isSome() ? aOut->mFreeStack->mLength : 0);
      PHC::sPHC->mMutex.Unlock();
    } else {
      LOG("IsPHCAllocation: PHC is locked\n");
      aOut->mPhcWasLocked = true;
    }
  }
  return true;
}
void DisablePHCOnCurrentThread() {
  PHC::DisableOnCurrentThread();
  LOG("DisablePHCOnCurrentThread: %zu\n", 0ul);
}

void ReenablePHCOnCurrentThread() {
  PHC::sPHC->EnableOnCurrentThread();
  LOG("ReenablePHCOnCurrentThread: %zu\n", 0ul);
}

bool IsPHCEnabledOnCurrentThread() {
  bool enabled = !PHC::IsDisabledOnCurrentThread();
  LOG("IsPHCEnabledOnCurrentThread: %zu\n", size_t(enabled));
  return enabled;
}

void PHCMemoryUsage(MemoryUsage& aMemoryUsage) {
  if (!maybe_init()) {
    aMemoryUsage = MemoryUsage();
    return;
  }

  aMemoryUsage.mMetadataBytes = metadata_size();
  if (PHC::sPHC) {
    MutexAutoLock lock(PHC::sPHC->mMutex);
    aMemoryUsage.mFragmentationBytes = PHC::sPHC->FragmentationBytes();
  } else {
    aMemoryUsage.mFragmentationBytes = 0;
  }
}

void GetPHCStats(PHCStats& aStats) {
  if (!maybe_init()) {
    aStats = PHCStats();
    return;
  }

  MutexAutoLock lock(PHC::sPHC->mMutex);

  aStats = PHC::sPHC->GetPageStats(lock);
}

// Enable or disable PHC at runtime. If PHC is disabled it will still trap
// bad uses of previous allocations, but won't track any new allocations.
void SetPHCState(PHCState aState) {
  if (!maybe_init()) {
    return;
  }

  PHC::sPHC->SetState(aState);
}

void SetPHCProbabilities(int64_t aAvgDelayFirst, int64_t aAvgDelayNormal,
                         int64_t aAvgDelayPageReuse) {
  if (!maybe_init()) {
    return;
  }

  PHC::sPHC->SetProbabilities(aAvgDelayFirst, aAvgDelayNormal,
                              aAvgDelayPageReuse);
}

}  // namespace mozilla::phc