Star64_linux/kernel
Tejun Heo a9ab775bca workqueue: directly restore CPU affinity of workers from CPU_ONLINE
Rebinding workers of a per-cpu pool after a CPU comes online involves
a lot of back-and-forth mostly because only the task itself could
adjust CPU affinity if PF_THREAD_BOUND was set.

As CPU_ONLINE itself couldn't adjust affinity, it had to somehow
coerce the workers themselves to perform set_cpus_allowed_ptr().  Due
to the various states a worker can be in, this led to three different
paths a worker may be rebound.  worker->rebind_work is queued to busy
workers.  Idle ones are signaled by unlinking worker->entry and call
idle_worker_rebind().  The manager isn't covered by either and
implements its own mechanism.

PF_THREAD_BOUND has been relaced with PF_NO_SETAFFINITY and CPU_ONLINE
itself now can manipulate CPU affinity of workers.  This patch
replaces the existing rebind mechanism with direct one where
CPU_ONLINE iterates over all workers using for_each_pool_worker(),
restores CPU affinity, and clears WORKER_UNBOUND.

There are a couple subtleties.  All bound idle workers should have
their runqueues set to that of the bound CPU; however, if the target
task isn't running, set_cpus_allowed_ptr() just updates the
cpus_allowed mask deferring the actual migration to when the task
wakes up.  This is worked around by waking up idle workers after
restoring CPU affinity before any workers can become bound.

Another subtlety is stems from matching @pool->nr_running with the
number of running unbound workers.  While DISASSOCIATED, all workers
are unbound and nr_running is zero.  As workers become bound again,
nr_running needs to be adjusted accordingly; however, there is no good
way to tell whether a given worker is running without poking into
scheduler internals.  Instead of clearing UNBOUND directly,
rebind_workers() replaces UNBOUND with another new NOT_RUNNING flag -
REBOUND, which will later be cleared by the workers themselves while
preparing for the next round of work item execution.  The only change
needed for the workers is clearing REBOUND along with PREP.

* This patch leaves for_each_busy_worker() without any user.  Removed.

* idle_worker_rebind(), busy_worker_rebind_fn(), worker->rebind_work
  and rebind logic in manager_workers() removed.

* worker_thread() now looks at WORKER_DIE instead of testing whether
  @worker->entry is empty to determine whether it needs to do
  something special as dying is the only special thing now.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2013-03-19 13:45:21 -07:00
..
debug KGDB/KDB fixes and cleanups 2013-03-02 08:31:39 -08:00
events hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
gcov kernel/gcov: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:39:33 -08:00
irq Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
power Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
sched sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY 2013-03-19 13:45:20 -07:00
time Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2013-02-28 19:48:26 -08:00
trace ImgTec Meta architecture changes for v3.9-rc1 2013-03-03 12:06:09 -08:00
.gitignore
acct.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
async.c Merge branch 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2013-02-19 22:01:33 -08:00
audit.c kernel/audit.c: avoid negative sleep durations 2013-01-11 14:54:56 -08:00
audit.h audit: optimize audit_compare_dname_path 2012-10-12 00:32:02 -04:00
audit_tree.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
audit_watch.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
auditfilter.c audit: fix auditfilter.c kernel-doc warnings 2013-01-10 14:35:23 -08:00
auditsc.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
backtracetest.c
bounds.c
capability.c
cgroup.c sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY 2013-03-19 13:45:20 -07:00
cgroup_freezer.c cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() 2012-11-19 08:13:38 -08:00
compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-02-23 18:50:11 -08:00
configs.c
context_tracking.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 18:19:48 -08:00
cpu.c Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 19:04:55 -08:00
cpu_pm.c
cpuset.c sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY 2013-03-19 13:45:20 -07:00
crash_dump.c
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-18 10:55:28 -08:00
delayacct.c cputime: Use accessors to read task cputime stats 2013-01-27 19:23:31 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c coredump: use a freezable_schedule for the coredump_finish wait 2013-02-27 19:10:11 -08:00
extable.c
fork.c fork: unshare: remove dead code 2013-02-27 19:10:12 -08:00
freezer.c freezer: change ptrace_stop/do_signal_stop to use freezable_schedule() 2012-10-26 14:27:49 -07:00
futex.c more file_inode() open-coded instances 2013-02-27 16:59:05 -05:00
futex_compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-02-23 18:50:11 -08:00
groups.c
hrtimer.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 19:05:45 -08:00
hung_task.c
irq_work.c Merge branch 'nohz/printk-v8' into irq/core 2013-02-05 00:48:46 +01:00
itimer.c
jump_label.c jump_label: Export jump_label_rate_limit() 2012-08-06 19:00:35 +03:00
kallsyms.c
kcmp.c kcmp: include linux/ptrace.h 2012-12-20 17:40:19 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking: Adjust spin lock inlining Kconfig options 2012-09-13 17:56:13 +02:00
Kconfig.preempt
kexec.c kexec: avoid freeing NULL pointer in image_crash_alloc() 2013-02-27 19:10:12 -08:00
kmod.c Merge branch 'master' into for-3.9-async 2013-01-23 09:31:01 -08:00
kprobes.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ksysfs.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 18:10:49 -08:00
kthread.c sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY 2013-03-19 13:45:20 -07:00
latencytop.c
lglock.c
lockdep.c lockdep: check that no locks held at freeze time 2013-02-27 19:10:11 -08:00
lockdep_internals.h
lockdep_proc.c lockdep: Use KSYM_NAME_LEN'ed buffer for __get_key_name() 2012-10-24 12:39:09 +02:00
lockdep_states.h
Makefile Merge branch 'akpm' (final batch from Andrew) 2013-02-27 20:58:09 -08:00
modsign_certificate.S MODSIGN: Avoid using .incbin in C source 2012-12-14 13:06:44 +10:30
modsign_pubkey.c keys: use keyring_alloc() to create module signing keyring 2012-12-20 17:40:21 -08:00
module-internal.h MODSIGN: Move the magic string to the end of a module and eliminate the search 2012-10-19 17:30:40 -07:00
module.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
module_signing.c MODSIGN: Don't use enum-type bitfields in module signature info block 2012-12-05 11:27:24 +10:30
mutex-debug.c
mutex-debug.h
mutex.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
mutex.h
notifier.c
nsproxy.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
padata.c padata: use __this_cpu_read per-cpu helper 2012-12-06 17:16:23 +08:00
panic.c taint: add explicit flag to show whether lock dep is still OK. 2013-01-21 17:17:57 +10:30
params.c
pid.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
pid_namespace.c pidns: Stop pid allocation when init dies 2012-12-25 16:10:05 -08:00
posix-cpu-timers.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 19:05:45 -08:00
posix-timers.c posix-timers: convert to idr_alloc() 2013-02-27 19:10:19 -08:00
printk.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2013-02-25 16:46:44 -08:00
profile.c profiling: Remove unused timer hook 2013-01-24 15:37:26 +01:00
ptrace.c uprobes: Add exports for module use 2013-02-08 17:47:13 +01:00
range.c
rcu.h rcu: Provide RCU CPU stall warnings for tiny RCU 2013-01-28 22:06:21 -08:00
rcupdate.c Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
rcutiny.c Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
rcutiny_plugin.h rcu: Provide RCU CPU stall warnings for tiny RCU 2013-01-28 22:06:21 -08:00
rcutorture.c rcu: Allow rcutorture to be built at low optimization levels 2013-02-04 12:18:20 -08:00
rcutree.c Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
rcutree.h Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
rcutree_plugin.h rcu: Make rcu_nocb_poll an early_param instead of module_param 2013-01-08 14:12:19 -08:00
rcutree_trace.c rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
relay.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
res_counter.c res_counter: return amount of charges after res_counter_uncharge() 2012-12-18 15:02:12 -08:00
resource.c kernel/resource.c: fix stack overflow in __reserve_region_with_split() 2012-10-06 03:05:31 +09:00
rtmutex-debug.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
rtmutex-debug.h
rtmutex-tester.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
rtmutex.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
rtmutex.h
rtmutex_common.h
rwsem.c lockdep, rwsem: provide down_write_nest_lock() 2013-01-11 14:54:55 -08:00
seccomp.c seccomp: Make syscall skipping and nr changes more consistent 2012-10-02 21:14:29 +10:00
semaphore.c
signal.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-03-02 19:32:06 -08:00
smp.c smp: make smp_call_function_many() use logic similar to smp_call_function_single() 2013-02-21 17:22:20 -08:00
smpboot.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
smpboot.h smpboot: Provide infrastructure for percpu hotplug threads 2012-08-13 17:01:07 +02:00
softirq.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-02-20 18:58:50 -08:00
spinlock.c
srcu.c srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock() 2013-02-07 15:19:36 -08:00
stacktrace.c
stop_machine.c stop_machine: Use smpboot threads 2013-02-14 15:29:38 +01:00
sys.c usermodehelper: cleanup/fix __orderly_poweroff() && argv_free() 2013-02-27 19:10:09 -08:00
sys_ni.c module: add syscall to load module from fd 2012-12-14 13:05:22 +10:30
sysctl.c Initial ARC Linux port with some fixes on top for 3.9-rc1 2013-03-02 07:58:56 -08:00
sysctl_binary.c sysctl: fix null checking in bin_dn_node_address() 2013-02-27 19:10:21 -08:00
task_work.c task_work: task_work_add() should not succeed after exit_task_work() 2012-09-13 16:47:34 +02:00
taskstats.c taskstats: cgroupstats_user_cmd() may leak on error 2012-10-06 03:05:31 +09:00
test_kprobes.c
time.c time: don't inline EXPORT_SYMBOL functions 2013-02-21 17:22:19 -08:00
timeconst.bc kernel: Replace timeconst.pl with a bc script 2013-02-16 23:17:25 +01:00
timer.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 18:19:48 -08:00
tracepoint.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
tsacct.c cputime: Use accessors to read task cputime stats 2013-01-27 19:23:31 +01:00
uid16.c
up.c
user-return-notifier.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
user.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
user_namespace.c userns: Allow any uid or gid mappings that don't overlap. 2013-01-26 22:12:04 -08:00
utsname.c kernel/utsname.c: fix wrong comment about clone_uts_ns() 2013-02-27 19:10:22 -08:00
utsname_sysctl.c kernel/utsname_sysctl.c: put get/get_uts() into CONFIG_PROC_SYSCTL code block 2013-02-27 19:10:22 -08:00
wait.c propagate name change to comments in kernel source 2012-12-06 10:39:54 +01:00
watchdog.c Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-22 19:25:09 -08:00
workqueue.c workqueue: directly restore CPU affinity of workers from CPU_ONLINE 2013-03-19 13:45:21 -07:00
workqueue_internal.h workqueue: directly restore CPU affinity of workers from CPU_ONLINE 2013-03-19 13:45:21 -07:00