linux-bl808/kernel
Oleg Nesterov d80e731eca epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.

epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.

This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.

__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.

ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.

The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.

In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.

Note:

	- we do not have wake_up_all_poll() but wake_up_poll()
	  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.

	- signalfd_cleanup() uses POLLHUP along with POLLFREE,
	  we need a couple of simple changes in eventpoll.c to
	  make sure it can't be "lost".

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Cc: <stable@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-24 11:42:50 -08:00
..
debug module: struct module_ref should contains long fields 2012-01-13 09:32:14 +10:30
events perf: Fix double start/stop in x86_pmu_start() 2012-02-07 16:58:56 +01:00
gcov
irq module_param: make bool parameters really bool (core code) 2012-01-13 09:32:18 +10:30
power PM / Freezer: Thaw only kernel threads if freezing of kernel threads fails 2012-02-04 22:23:05 +01:00
sched sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW 2012-01-27 12:49:41 +01:00
time Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
trace
.gitignore
acct.c Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
async.c
audit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit 2012-01-17 16:41:31 -08:00
audit.h audit: remove AUDIT_SETUP_CONTEXT as it isn't used 2012-01-17 16:16:57 -05:00
audit_tree.c
audit_watch.c
auditfilter.c audit: allow interfield comparison in audit rules 2012-01-17 16:17:01 -05:00
auditsc.c kernel-doc: fix new warnings in auditsc.c 2012-01-23 08:44:53 -08:00
backtracetest.c backtrace: replace timer with tasklet + completions 2008-06-27 18:09:16 +02:00
bounds.c
capability.c
cgroup.c
cgroup_freezer.c Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-01-09 12:59:24 -08:00
compat.c
configs.c
cpu.c
cpu_pm.c
cpuset.c Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-01-09 12:59:24 -08:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c sched: Fix ancient race in do_exit() 2012-01-27 11:55:36 +01:00
extable.c extable, core_kernel_data(): Make sure all archs define _sdata 2011-05-20 08:56:56 +02:00
fork.c epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() 2012-02-24 11:42:50 -08:00
freezer.c freezer: kill unused set_freezable_with_signal() 2011-11-23 09:28:17 -08:00
futex.c
futex_compat.c
groups.c
hrtimer.c
hung_task.c hung_task: fix false positive during vfork 2012-01-03 16:14:32 -08:00
irq_work.c
itimer.c
jump_label.c Merge remote-tracking branch 'tip/perf/core' into kvm-updates/3.3 2011-12-27 11:22:24 +02:00
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c
kfifo.c
kmod.c
kprobes.c kprobes: fix a memory leak in function pre_handler_kretprobe() 2012-02-03 16:16:41 -08:00
ksysfs.c
kthread.c
latencytop.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 08:02:58 -08:00
lockdep_internals.h
lockdep_proc.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_states.h
Makefile
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c
padata.c
panic.c
params.c module: make module param bint handle nul value 2012-02-14 11:02:15 +10:30
pid.c vfs: fix panic in __d_lookup() with high dentry hashtable counts 2012-02-13 20:45:38 -05:00
pid_namespace.c sysctl: add the kernel.ns_last_pid control 2012-01-12 20:13:11 -08:00
posix-cpu-timers.c
posix-timers.c
printk.c
profile.c
ptrace.c Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-14 18:36:33 -08:00
range.c
rcu.h
rcupdate.c rcu: Detect illegal rcu dereference in extended quiescent state 2011-12-11 10:31:30 -08:00
rcutiny.c
rcutiny_plugin.h rcu: Apply ACCESS_ONCE() to rcu_boost() return value 2011-12-11 10:33:19 -08:00
rcutorture.c rcu: Add missing __cpuinit annotation in rcutorture code 2012-01-16 09:44:05 -08:00
rcutree.c
rcutree.h
rcutree_plugin.h
rcutree_trace.c
relay.c relay: prevent integer overflow in relay_open() 2012-02-10 09:04:49 +01:00
res_counter.c net: introduce res_counter_charge_nofail() for socket allocations 2012-01-22 15:08:46 -05:00
resource.c
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
seccomp.c seccomp: audit abnormal end to a process due to seccomp 2012-01-17 16:16:55 -05:00
semaphore.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
signal.c user namespace: make signal.c respect user namespaces 2012-01-10 16:30:54 -08:00
smp.c
softirq.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys.c
sys_ni.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
sysctl.c x86: Panic on detection of stack overflow 2011-12-05 11:37:47 +01:00
sysctl_binary.c
sysctl_check.c
taskstats.c
test_kprobes.c kprobes: Fix selftest to clear flags field for reusing probes 2010-10-14 08:55:27 +02:00
time.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
timeconst.pl
timer.c
tracepoint.c
tsacct.c
uid16.c
up.c
user-return-notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user.c
user_namespace.c
utsname.c
utsname_sysctl.c
wait.c
watchdog.c bugs, x86: Fix printk levels for panic, softlockups and stack dumps 2012-01-26 21:28:45 +01:00
workqueue.c
workqueue_sched.h workqueue: implement concurrency managed dynamic worker pool 2010-06-29 10:07:14 +02:00