vmstat: create separate function to fold per cpu diffs into local counters

The main idea behind this patchset is to reduce the vmstat update overhead
by avoiding interrupt enable/disable and the use of per cpu atomics.

This patch (of 3):

It is better to have a separate folding function because
refresh_cpu_vm_stats() also does other things like expire pages in the
page allocator caches.

If we have a separate function then refresh_cpu_vm_stats() is only called
from the local cpu which allows additional optimizations.

The folding function is only called when a cpu is being downed and
therefore no other processor will be accessing the counters.  Also
simplifies synchronization.

[akpm@linux-foundation.org: fix UP build]
Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Tejun Heo <tj@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
Christoph Lameter 2013-09-11 14:21:30 -07:00 committed by Linus Torvalds
parent d2cf5ad631
commit 2bb921e526
3 changed files with 37 additions and 8 deletions

View file

@ -415,11 +415,7 @@ EXPORT_SYMBOL(dec_zone_page_state);
#endif
/*
* Update the zone counters for one cpu.
*
* The cpu specified must be either the current cpu or a processor that
* is not online. If it is the current cpu then the execution thread must
* be pinned to the current cpu.
* Update the zone counters for the current cpu.
*
* Note that refresh_cpu_vm_stats strives to only access
* node local memory. The per cpu pagesets on remote zones are placed
@ -432,7 +428,7 @@ EXPORT_SYMBOL(dec_zone_page_state);
* with the global counters. These could cause remote node cache line
* bouncing and will have to be only done when necessary.
*/
void refresh_cpu_vm_stats(int cpu)
static void refresh_cpu_vm_stats(int cpu)
{
struct zone *zone;
int i;
@ -493,6 +489,38 @@ void refresh_cpu_vm_stats(int cpu)
atomic_long_add(global_diff[i], &vm_stat[i]);
}
/*
* Fold the data for an offline cpu into the global array.
* There cannot be any access by the offline cpu and therefore
* synchronization is simplified.
*/
void cpu_vm_stats_fold(int cpu)
{
struct zone *zone;
int i;
int global_diff[NR_VM_ZONE_STAT_ITEMS] = { 0, };
for_each_populated_zone(zone) {
struct per_cpu_pageset *p;
p = per_cpu_ptr(zone->pageset, cpu);
for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
if (p->vm_stat_diff[i]) {
int v;
v = p->vm_stat_diff[i];
p->vm_stat_diff[i] = 0;
atomic_long_add(v, &zone->vm_stat[i]);
global_diff[i] += v;
}
}
for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
if (global_diff[i])
atomic_long_add(global_diff[i], &vm_stat[i]);
}
/*
* this is only called if !populated_zone(zone), which implies no other users of
* pset->vm_stat_diff[] exsist.