From c0135d07b013fa8f7ba9ec91b4369c372e6a28cb Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 22 Jan 2015 22:47:14 -0800 Subject: [PATCH 1/3] rcu: Clear need_qs flag to prevent splat If the scheduling-clock interrupt sets the current tasks need_qs flag, but if the current CPU passes through a quiescent state in the meantime, then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool RCU into thinking that additional rcu_read_unlock_special() processing is needed. This commit therefore clears the need_qs flag before checking for additional processing. For this problem to occur, we need rcu_preempt_data.passed_quiesce equal to true and current->rcu_read_unlock_special.b.need_qs also equal to true. This condition can occur as follows: 1. CPU 0 is aware of the current preemptible RCU grace period, but has not yet passed through a quiescent state. Among other things, this means that rcu_preempt_data.passed_quiesce is false. 2. Task A running on CPU 0 enters a preemptible RCU read-side critical section. 3. CPU 0 takes a scheduling-clock interrupt, which notices the RCU read-side critical section and the need for a quiescent state, and thus sets current->rcu_read_unlock_special.b.need_qs to true. 4. Task A is preempted, enters the scheduler, eventually invoking rcu_preempt_note_context_switch() which in turn invokes rcu_preempt_qs(). Because rcu_preempt_data.passed_quiesce is false, control enters the body of the "if" statement, which sets rcu_preempt_data.passed_quiesce to true. 5. At this point, CPU 0 takes an interrupt. The interrupt handler contains an RCU read-side critical section, and the rcu_read_unlock() notes that current->rcu_read_unlock_special is nonzero, and thus invokes rcu_read_unlock_special(). 6. Once in rcu_read_unlock_special(), the fact that current->rcu_read_unlock_special.b.need_qs is true becomes apparent, so rcu_read_unlock_special() invokes rcu_preempt_qs(). Recursively, given that we interrupted out of that same function in the preceding step. 7. Because rcu_preempt_data.passed_quiesce is now true, rcu_preempt_qs() does nothing, and simply returns. 8. Upon return to rcu_read_unlock_special(), it is noted that current->rcu_read_unlock_special is still nonzero (because the interrupted rcu_preempt_qs() had not yet gotten around to clearing current->rcu_read_unlock_special.b.need_qs). 9. Execution proceeds to the WARN_ON_ONCE(), which notes that we are in an interrupt handler and thus duly splats. The solution, as noted above, is to make rcu_read_unlock_special() clear out current->rcu_read_unlock_special.b.need_qs after calling rcu_preempt_qs(). The interrupted rcu_preempt_qs() will clear it again, but this is harmless. The worst that happens is that we clobber another attempt to set this field, but this is not a problem because we just got done reporting a quiescent state. Reported-by: Sasha Levin Signed-off-by: Paul E. McKenney [ paulmck: Fix embarrassing build bug noted by Sasha Levin. ] Tested-by: Sasha Levin --- kernel/rcu/tree_plugin.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 2e850a51bb8f..bca28b00f7e6 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -327,6 +327,7 @@ void rcu_read_unlock_special(struct task_struct *t) special = t->rcu_read_unlock_special; if (special.b.need_qs) { rcu_preempt_qs(); + t->rcu_read_unlock_special.b.need_qs = false; if (!t->rcu_read_unlock_special.s) { local_irq_restore(flags); return; From 1ea76fbadd667b19c4fa4466f3a3b55a505e83d9 Mon Sep 17 00:00:00 2001 From: Jiang Liu Date: Mon, 16 Feb 2015 10:11:13 +0800 Subject: [PATCH 2/3] x86/irq: Fix regression caused by commit b568b8601f05 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit b568b8601f05 ("Treat SCI interrupt as normal GSI interrupt") accidently removes support of legacy PIC interrupt when fixing a regression for Xen, which causes a nasty regression on HP/Compaq nc6000 where we fail to register the ACPI interrupt, and thus lose eg. thermal notifications leading a potentially overheated machine. So reintroduce support of legacy PIC based ACPI SCI interrupt. Reported-by: Ville Syrjälä Tested-by: Ville Syrjälä Signed-off-by: Jiang Liu Signed-off-by: Peter Zijlstra (Intel) Acked-by: Pavel Machek Cc: # 3.19+ Cc: H. Peter Anvin Cc: Len Brown Cc: Linus Torvalds Cc: Rafael J. Wysocki Cc: Sander Eikelenboom Cc: linux-pm@vger.kernel.org Link: http://lkml.kernel.org/r/1424052673-22974-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Ingo Molnar --- arch/x86/kernel/acpi/boot.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c index a18fff361c7f..8b5916342902 100644 --- a/arch/x86/kernel/acpi/boot.c +++ b/arch/x86/kernel/acpi/boot.c @@ -613,6 +613,11 @@ int acpi_gsi_to_irq(u32 gsi, unsigned int *irqp) { int rc, irq, trigger, polarity; + if (acpi_irq_model == ACPI_IRQ_MODEL_PIC) { + *irqp = gsi; + return 0; + } + rc = acpi_get_override_irq(gsi, &trigger, &polarity); if (rc == 0) { trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE; From d97eb8966c91f2c9d05f0a22eb89ed5b76d966d1 Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Wed, 4 Feb 2015 13:33:33 +0100 Subject: [PATCH 3/3] x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable() When an interrupt is migrated away from a cpu it will stay in its vector_irq array until smp_irq_move_cleanup_interrupt succeeded. The cfg->move_in_progress flag is cleared already when the IPI was sent. When the interrupt is destroyed after migration its 'struct irq_desc' is freed and the vector_irq arrays are cleaned up. But since cfg->move_in_progress is already 0 the references at cpus before the last migration will not be cleared. So this would leave a reference to an already destroyed irq alive. When the cpu is taken down at this point, the check_irq_vectors_for_cpu_disable() function finds a valid irq number in the vector_irq array, but gets NULL for its descriptor and dereferences it, causing a kernel panic. This has been observed on real systems at shutdown. Add a check to check_irq_vectors_for_cpu_disable() for a valid 'struct irq_desc' to prevent this issue. Signed-off-by: Joerg Roedel Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Jiang Liu Cc: H. Peter Anvin Cc: Jan Beulich Cc: K. Y. Srinivasan Cc: Linus Torvalds Cc: Prarit Bhargava Cc: Rasmus Villemoes Cc: Yinghai Lu Cc: alnovak@suse.com Cc: joro@8bytes.org Link: http://lkml.kernel.org/r/20150204132754.GA10078@suse.de Signed-off-by: Ingo Molnar --- arch/x86/kernel/irq.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 705ef8d48e2d..67b1cbe0093a 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -302,6 +302,9 @@ int check_irq_vectors_for_cpu_disable(void) irq = __this_cpu_read(vector_irq[vector]); if (irq >= 0) { desc = irq_to_desc(irq); + if (!desc) + continue; + data = irq_desc_get_irq_data(desc); cpumask_copy(&affinity_new, data->affinity); cpu_clear(this_cpu, affinity_new);