Bug#692607: linux-image-3.2.0-4-686-pae: Kernel crash when coming out of screen saver
Dale Schroeder <dale@BriannasSaladDressing.com> writes:
> I've been able to narrow this down a bit more. One of my other
> systems still had 3.2.0-3-686-pae (3.2.23-1) in its apt archives. This
> kernel also boots successfully and does not hang at mdm startup. So,
> the problem did not exist in any of the 3.2.0-3 series released to
> Wheezy, but was introduced sometime between this image and 3.2.32-1,
> the 1st of the 3.2.0-4 series released to Wheezy.
I note that the changelog for
linux (3.2.29-1) unstable; urgency=low
includes
* [x86] drm/i915: Fix i8xx interrupt handling (Closes: #655152)
which is extremely suspicious in this context. I wonder if anyone
experiencing this bug has tried reverting this patch:
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=59;filename=drm-i915-i8xx-interrupt-handler.patch;att=1;bug=655152
Note that it is another shot in the dark - I have absolutely no idea
what's going on here. But I have a feeling that patch is replacing an
annoying bug on one platform with a critical bug on another.
Not sure that is a good tradeoff...
The debug output kindly provided by Сергей in
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=36;filename=dmesg.txt;att=4;bug=692607
shows (don't know why I included &work->entry - that's pointless):
[ 59.548011] work->data=0x00000000, &work->entry=de33c56c, work->entry.next= (null), work->entry.prev= (null)
which clearly tells us that the problem is related to
i915_handle_error() calling
queue_work(dev_priv->wq, &dev_priv->error_work);
with an uninitialized error_work. As noted earlier, this is supposed to
be initialized in intel_irq_init(), so either that has not happened (yet?)
or something has zeroed it out later.
I am putting a beer on the first alternative.
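To make the failure mode concrete, here is a minimal userspace sketch of
why a zeroed work item fails a sanity check at queue time. It assumes the
check in kernel/workqueue.c amounts to a list_empty() test on
work->entry, which is what the NULL next/prev pointers in the debug
output suggest; it is not a copy of the kernel code. All model_* names
are hypothetical stand-ins, not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; every model_* name is hypothetical. */
struct model_list_head {
	struct model_list_head *next, *prev;
};

struct model_work {
	struct model_list_head entry;
	void (*func)(struct model_work *work);
};

/* An unlinked list node points at itself, which is what INIT_LIST_HEAD() sets up. */
static void model_init_list_head(struct model_list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool model_list_empty(const struct model_list_head *head)
{
	return head->next == head;
}

/* Rough analogue of INIT_WORK(): self-link the entry and set the handler. */
static void model_init_work(struct model_work *work,
			    void (*func)(struct model_work *work))
{
	model_init_list_head(&work->entry);
	work->func = func;
}

/* Assumed sanity check: a work item may only be queued while its entry is unlinked. */
static bool model_work_is_queueable(const struct model_work *work)
{
	return model_list_empty(&work->entry);
}

/* Placeholder handler standing in for i915_error_work_func(). */
static void model_error_work_func(struct model_work *work)
{
	(void)work;
}
```

A zeroed struct has entry.next == NULL, which is not equal to &entry, so
it does not look like an unlinked node and the check rejects it. After
model_init_work() the same struct passes.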
Right. It's even bloody obvious (now that you all have pointed to the
releases surrounding the 655152 bugfix). That patch adds this
i8xx-specific function:
+static void i8xx_irq_preinstall(struct drm_device * dev)
+{
+	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
+	int pipe;
+
+	atomic_set(&dev_priv->irq_received, 0);
+
+	for_each_pipe(pipe)
+		I915_WRITE(PIPESTAT(pipe), 0);
+	I915_WRITE16(IMR, 0xffff);
+	I915_WRITE16(IER, 0x0);
+	POSTING_READ16(IER);
+}
replacing this for all chips matching "(INTEL_INFO(dev)->gen == 2)":
static void i915_driver_irq_preinstall(struct drm_device * dev)
{
	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
	int pipe;
	atomic_set(&dev_priv->irq_received, 0);
	INIT_WORK(&dev_priv->hotplug_work, i915_hotplug_work_func);
	INIT_WORK(&dev_priv->error_work, i915_error_work_func);
	if (I915_HAS_HOTPLUG(dev)) {
		I915_WRITE(PORT_HOTPLUG_EN, 0);
		I915_WRITE(PORT_HOTPLUG_STAT, I915_READ(PORT_HOTPLUG_STAT));
	}
	I915_WRITE(HWSTAM, 0xeffe);
	for_each_pipe(pipe)
		I915_WRITE(PIPESTAT(pipe), 0);
	I915_WRITE(IMR, 0xffffffff);
	I915_WRITE(IER, 0x0);
	POSTING_READ(IER);
}
Anyone able to spot the missing INIT_WORK() calls? Based on the
I915_HAS_HOTPLUG(dev) test, I assume that leaving the first one out was
intentional. But the second one cannot be left out, as demonstrated by
these bug reports.
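The regression can be sketched as a small userspace model of the hook
selection: gen2 hardware takes the new i8xx path, everything else keeps
the generic path, and only the generic path initializes error_work.
This is an illustration under stated assumptions, not driver code. The
model_* names and the with_fix flag are hypothetical, and booleans
stand in for the real INIT_WORK() calls.

```c
#include <stdbool.h>

/* Hypothetical model of the per-device state; booleans replace the work structs. */
struct model_dev {
	int gen;
	bool hotplug_work_initialized;
	bool error_work_initialized;
};

/* Mirrors the generic preinstall hook: both work items are set up here. */
static void model_generic_preinstall(struct model_dev *dev)
{
	dev->hotplug_work_initialized = true;
	dev->error_work_initialized = true;
}

/*
 * Mirrors the i8xx hook added by the backport. Without the fix it
 * initializes neither work item; with the fix it sets up error_work only.
 */
static void model_i8xx_preinstall(struct model_dev *dev, bool with_fix)
{
	if (with_fix)
		dev->error_work_initialized = true;
}

/* Selects the hook by generation, as the "gen == 2" match does, and reports the result. */
static bool model_error_work_ready(int gen, bool with_fix)
{
	struct model_dev dev = { .gen = gen };

	if (dev.gen == 2)
		model_i8xx_preinstall(&dev, with_fix);
	else
		model_generic_preinstall(&dev);

	return dev.error_work_initialized;
}
```

In this model only the combination of gen 2 and no fix leaves
error_work uninitialized, which matches the reports coming from i8xx
hardware only.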
I am attaching a proposed fix on top of the 655152 patch, which I have
not tested at all on actual Debian kernel sources. Might need context
adjustments. I'd appreciate it if anyone with crashing hardware could
test it.
Bjørn
From d2451aff41d2db6047586c22317cd247e4c000ca Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bjorn@mork.no>
Date: Thu, 28 Feb 2013 11:26:20 +0100
Subject: [PATCH] drm/i915: initialize error_work for i8xx interrupt handler
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The backport of upstream commit c2798b19bac2538393fc932bfbe59807a4734b3e
failed to initialize the error_work struct for gen2 hardware, resulting
in hitting a BUG in kernel/workqueue.c if/when the interrupt handler
tried to queue error handling work.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
drivers/gpu/drm/i915/i915_irq.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 47b08ce..bb9b943 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2102,6 +2102,8 @@ static void i8xx_irq_preinstall(struct drm_device * dev)
atomic_set(&dev_priv->irq_received, 0);
+ INIT_WORK(&dev_priv->error_work, i915_error_work_func);
+
for_each_pipe(pipe)
I915_WRITE(PIPESTAT(pipe), 0);
I915_WRITE16(IMR, 0xffff);
--
1.7.10.4