linux-bl808/kernel
Roman Gushchin bcfe06bf26 mm: memcontrol: Use helpers to read page's memcg data
Patch series "mm: allow mapping accounted kernel pages to userspace", v6.

Currently a non-slab kernel page which has been charged to a memory cgroup
can't be mapped to userspace.  The underlying reason is simple: PageKmemcg
flag is defined as a page type (like buddy, offline, etc), so it takes a
bit from a page->mapped counter.  Pages with a type set can't be mapped to
userspace.

But in general the kmemcg flag has nothing to do with mapping to
userspace.  It only means that the page has been accounted by the page
allocator, so it has to be properly uncharged on release.

Some bpf maps are mapping the vmalloc-based memory to userspace, and their
memory can't be accounted because of this implementation detail.

This patchset removes this limitation by moving the PageKmemcg flag into
one of the free bits of the page->mem_cgroup pointer.  Also it formalizes
accesses to the page->mem_cgroup and page->obj_cgroups using new helpers,
adds several checks and removes a couple of obsolete functions.  As a
result, the code becomes more robust, with fewer open-coded bit tricks.
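
As a rough illustration of keeping a flag in the free low bits of a
pointer-sized word, here is a minimal userspace sketch.  It is not the
kernel's code: the names (memcg_sketch, SKETCH_FLAG_KMEM, sketch_pack, and
so on) and the chosen bit position are assumptions made for this example.
It relies only on the pointed-to object being at least 4-byte aligned, so
the two lowest bits of its address are always zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory cgroup; not the kernel's struct. */
struct memcg_sketch {
	int id;
};

/* Assumed layout: bit 1 carries the "kmem" flag, bits 0-1 are reserved. */
#define SKETCH_FLAG_KMEM  ((uintptr_t)1 << 1)
#define SKETCH_FLAGS_MASK ((uintptr_t)3)

/* Pack a pointer and an optional kmem flag into one word. */
static inline uintptr_t sketch_pack(struct memcg_sketch *memcg, bool kmem)
{
	uintptr_t word = (uintptr_t)memcg;

	/* The low bits must be free, which alignment guarantees here. */
	assert((word & SKETCH_FLAGS_MASK) == 0);
	return kmem ? (word | SKETCH_FLAG_KMEM) : word;
}

/* Recover the pointer by masking off the flag bits. */
static inline struct memcg_sketch *sketch_memcg(uintptr_t word)
{
	return (struct memcg_sketch *)(word & ~SKETCH_FLAGS_MASK);
}

/* Test the flag without touching the pointer part. */
static inline bool sketch_is_kmem(uintptr_t word)
{
	return (word & SKETCH_FLAG_KMEM) != 0;
}
```

Because the flag travels with the pointer, every reader has to mask it off
before dereferencing, which is why the series funnels all reads through
helpers.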

This patch (of 4):

Currently there are many open-coded reads of the page->mem_cgroup pointer,
as well as a couple of read helpers, which are barely used.

This is an obstacle to reusing some bits of the pointer for storing
additional information.  In fact, we already do this for
slab pages, where the last bit indicates that a pointer has an attached
vector of objcg pointers instead of a regular memcg pointer.

This commit uses 2 existing helpers and introduces a new helper to
convert all read sides to calls of these helpers:
  struct mem_cgroup *page_memcg(struct page *page);
  struct mem_cgroup *page_memcg_rcu(struct page *page);
  struct mem_cgroup *page_memcg_check(struct page *page);

page_memcg_check() is intended to be used in cases where the page can be a
slab page and have a memcg pointer pointing at an objcg vector.  It checks
the lowest bit and, if it is set, returns NULL.  page_memcg() contains a
VM_BUG_ON_PAGE() check for the page not being a slab page.

To make sure nobody accesses the field directly, struct page's
mem_cgroup/obj_cgroups is converted to unsigned long memcg_data.
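
The difference between the checked and unchecked read helpers can be
modelled in userspace roughly as follows.  This is a simplified sketch
under stated assumptions, not the kernel implementation: struct page is
reduced to the memcg_data word plus a hypothetical is_slab field, assert()
stands in for VM_BUG_ON_PAGE(), SKETCH_OBJCGS_BIT is an invented name for
the lowest bit, and page_memcg_rcu() is left out because RCU has no
userspace equivalent here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the real kernel structures are far larger. */
struct mem_cgroup {
	int id;
};

struct page {
	unsigned long memcg_data; /* pointer bits plus low flag bit */
	int is_slab;              /* hypothetical field for this sketch */
};

/* Assumed name for the lowest bit, which marks an objcg vector. */
#define SKETCH_OBJCGS_BIT 0x1UL

/* Unchecked read: the caller promises the page is not a slab page. */
static inline struct mem_cgroup *page_memcg(struct page *page)
{
	assert(!page->is_slab); /* stands in for VM_BUG_ON_PAGE() */
	/* Assumes unsigned long is pointer-sized, as on Linux. */
	return (struct mem_cgroup *)page->memcg_data;
}

/* Checked read: returns NULL when the word holds an objcg vector. */
static inline struct mem_cgroup *page_memcg_check(struct page *page)
{
	unsigned long memcg_data = page->memcg_data;

	if (memcg_data & SKETCH_OBJCGS_BIT)
		return NULL;
	return (struct mem_cgroup *)memcg_data;
}
```

Storing the value as an unsigned long rather than a typed pointer means a
stray page->mem_cgroup dereference no longer compiles, so callers are
pushed towards the helpers.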

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lkml.kernel.org/r/20201027001657.3398190-1-guro@fb.com
Link: https://lkml.kernel.org/r/20201027001657.3398190-2-guro@fb.com
Link: https://lore.kernel.org/bpf/20201201215900.3569844-2-guro@fb.com
2020-12-02 18:28:05 -08:00
bpf bpf: Add a BPF helper for getting the IMA hash of an inode 2020-11-26 00:04:04 +01:00
cgroup
configs
debug
dma swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single 2020-11-02 10:10:39 -05:00
entry entry: Fix the incorrect ordering of lockdep and RCU check 2020-11-04 18:06:14 +01:00
events perf/core: Fix a memory leak in perf_event_parse_addr_filter() 2020-11-07 13:07:26 +01:00
gcov
irq A set of fixes for interrupt chip drivers: 2020-11-08 09:52:57 -08:00
kcsan
livepatch
locking A couple of locking fixes: 2020-11-01 11:08:17 -08:00
power PM: sleep: fix typo in kernel/power/process.c 2020-10-27 19:11:44 +01:00
printk printk: ringbuffer: Replace zero-length array with flexible-array member 2020-10-30 16:57:42 -05:00
rcu stop_machine, rcu: Mark functions as notrace 2020-10-26 12:12:27 +01:00
sched cpufreq: Introduce governor flags 2020-11-10 18:31:17 +01:00
time
trace bpf: Add bpf_ktime_get_coarse_ns helper 2020-11-18 23:25:32 +01:00
.gitignore
acct.c
async.c
audit.c
audit.h
audit_fsnotify.c
audit_tree.c
audit_watch.c
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
compat.c
configs.c
context_tracking.c
cpu.c
cpu_pm.c
crash_core.c
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c don't dump the threads that had been already exiting when zapped. 2020-10-28 16:39:49 -04:00
extable.c
fail_function.c
fork.c mm: memcontrol: Use helpers to read page's memcg data 2020-12-02 18:28:05 -08:00
freezer.c
futex.c futex: Handle transient "ownerless" rtmutex state correctly 2020-11-07 22:07:04 +01:00
gen_kheaders.sh
groups.c
hung_task.c kernel/hung_task.c: make type annotations consistent 2020-11-02 12:14:19 -08:00
iomem.c
irq_work.c
jump_label.c
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kernel: make kcov_common_handle consider the current context 2020-11-02 18:00:20 -08:00
kexec.c
kexec_core.c
kexec_elf.c
kexec_file.c
kexec_internal.h
kheaders.c
kmod.c
kprobes.c kprobes: Tell lockdep about kprobe nesting 2020-11-04 09:46:06 -05:00
ksysfs.c
kthread.c kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled 2020-11-02 12:14:19 -08:00
latencytop.c
Makefile
module-internal.h
module.c bpf: Sanitize BTF data pointer after module is loaded 2020-11-25 00:05:21 +01:00
module_signature.c
module_signing.c
notifier.c
nsproxy.c
padata.c
panic.c
params.c params: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
pid.c
pid_namespace.c
profile.c
ptrace.c
range.c
reboot.c
regset.c
relay.c
resource.c
rseq.c
scftorture.c
scs.c
seccomp.c
signal.c ptrace: fix task_join_group_stop() for the case when current is traced 2020-11-02 12:14:19 -08:00
smp.c
smpboot.c
smpboot.h
softirq.c
stackleak.c
stacktrace.c
static_call.c
stop_machine.c stop_machine, rcu: Mark functions as notrace 2020-10-26 12:12:27 +01:00
sys.c
sys_ni.c
sysctl-test.c
sysctl.c
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c tracepoint: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c
up.c
user-return-notifier.c
user.c
user_namespace.c
usermode_driver.c
utsname.c
utsname_sysctl.c
watch_queue.c
watchdog.c
watchdog_hld.c
workqueue.c
workqueue_internal.h