mm: memcontrol: flush percpu slab vmstats on kmem offlining
commitbee07b33db78d4ee7ed6a2fe810b9473d5471fe4
authorRoman Gushchin <guro@fb.com>
Fri, 30 Aug 2019 23:04:32 +0000 (30 16:04 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Sat, 31 Aug 2019 01:00:50 +0000 (30 18:00 -0700)
tree895b613a4b640996eb34dee8d0b1dedf2790f817
parent846d2db3e00048da3f650e0cfb0b8d67669cec3e
mm: memcontrol: flush percpu slab vmstats on kmem offlining

I've noticed that the "slab" value in memory.stat is sometimes 0, even
if some children memory cgroups have a non-zero "slab" value.  The
following investigation showed that this is the result of the kmem_cache
reparenting in combination with the per-cpu batching of slab vmstats.

At the offlining some vmstat value may leave in the percpu cache, not
being propagated upwards by the cgroup hierarchy.  It means that stats
on ancestor levels are lower than actual.  Later when slab pages are
released, the precise number of pages is substracted on the parent
level, making the value negative.  We don't show negative values, 0 is
printed instead.

To fix this issue, let's flush percpu slab memcg and lruvec stats on
memcg offlining.  This guarantees that numbers on all ancestor levels
are accurate and match the actual number of outstanding slab pages.

Link: http://lkml.kernel.org/r/20190819202338.363363-3-guro@fb.com
Fixes: fb2f2b0adb98 ("mm: memcg/slab: reparent memcg kmem_caches on cgroup removal")
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/mmzone.h
mm/memcontrol.c