
Bug#692607: linux-image-3.2.0-4-686-pae: Kernel crash when coming out of screen saver



Dale Schroeder <dale@BriannasSaladDressing.com> writes:

> I've been able to narrow this down a bit more.  One of my other
> systems still had 3.2.0-3-686-pae (3.2.23-1) in its apt archives. This
> kernel also boots successfully and does not hang at mdm startup.  So,
> the problem did not exist in any of the 3.2.0-3 series released to
> Wheezy, but was introduced sometime between this image and 3.2.32-1,
> the 1st of the 3.2.0-4 series released to Wheezy.


I note that the changelog for

 linux (3.2.29-1) unstable; urgency=low

includes

   * [x86] drm/i915: Fix i8xx interrupt handling (Closes: #655152)

which is extremely suspicious in this context.  I wonder if anyone
experiencing this bug has tried reverting this patch?:

 http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=59;filename=drm-i915-i8xx-interrupt-handler.patch;att=1;bug=655152

Note that it is another shot in the dark - I have absolutely no idea
what's going on here.  But I have a feeling that patch is replacing an
annoying bug on one platform with a critical bug on another.

Not sure that is a good tradeoff...


The debug output kindly provided by Сергей in
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=36;filename=dmesg.txt;att=4;bug=692607
shows (don't know why I included &work->entry - that's pointless):

 [   59.548011] work->data=0x00000000, &work->entry=de33c56c, work->entry.next=  (null), work->entry.prev=  (null)

which clearly tells us that the problem is related to
i915_handle_error() calling

 	queue_work(dev_priv->wq, &dev_priv->error_work);

with an uninitialized error_work.  As noted earlier, this is supposed to
be initialized in intel_irq_init() so either that has not happened (yet?)
or something has zeroed it out later.
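
To make the failure mode concrete, here is a small userspace model, not
kernel code.  It assumes the check being tripped in kernel/workqueue.c is
a list_empty() test on work->entry (the debug line above only shows the
NULL pointers, so that part is my assumption).  The struct layouts are
simplified stand-ins, and every model_* name is made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types, not the real definitions. */
struct list_head {
	struct list_head *next, *prev;
};

struct work_struct {
	unsigned long data;
	struct list_head entry;
	void (*func)(struct work_struct *work);
};

/* Same idea as the kernel's list_empty(): an empty list points to itself. */
static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Model of what INIT_WORK() sets up: self-linked entry plus the callback. */
static void model_init_work(struct work_struct *work,
			    void (*func)(struct work_struct *work))
{
	work->data = 0;
	work->entry.next = &work->entry;
	work->entry.prev = &work->entry;
	work->func = func;
}

/* Hypothetical helper: does this work item look safe to queue? */
static bool model_work_is_queueable(const struct work_struct *work)
{
	return list_empty(&work->entry) && work->func != NULL;
}

/* Placeholder callback standing in for i915_error_work_func(). */
static void model_error_work_func(struct work_struct *work)
{
	(void)work;
}

/* Returns 1 if a zeroed item is rejected and an initialized one accepted. */
static int model_demo(void)
{
	struct work_struct w;
	bool zeroed_ok, init_ok;

	memset(&w, 0, sizeof(w));	/* all zero, as in the dmesg line */
	zeroed_ok = model_work_is_queueable(&w);

	model_init_work(&w, model_error_work_func);
	init_ok = model_work_is_queueable(&w);

	assert(!zeroed_ok && init_ok);
	return !zeroed_ok && init_ok;
}
```

With entry.next == NULL the list head does not point to itself, so the
zeroed item fails the emptiness test even though it was never queued.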

I am putting a beer on the first alternative.

Right.  It's even bloody obvious (now that you all have pointed to the
releases surrounding the 655152 bugfix).  That patch adds this i8xx
specific function:

+static void i8xx_irq_preinstall(struct drm_device * dev)
+{
+	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
+	int pipe;
+
+	atomic_set(&dev_priv->irq_received, 0);
+
+	for_each_pipe(pipe)
+		I915_WRITE(PIPESTAT(pipe), 0);
+	I915_WRITE16(IMR, 0xffff);
+	I915_WRITE16(IER, 0x0);
+	POSTING_READ16(IER);
+}

replacing this for all chips matching "(INTEL_INFO(dev)->gen == 2)":

static void i915_driver_irq_preinstall(struct drm_device * dev)
{
	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
	int pipe;

	atomic_set(&dev_priv->irq_received, 0);

	INIT_WORK(&dev_priv->hotplug_work, i915_hotplug_work_func);
	INIT_WORK(&dev_priv->error_work, i915_error_work_func);

	if (I915_HAS_HOTPLUG(dev)) {
		I915_WRITE(PORT_HOTPLUG_EN, 0);
		I915_WRITE(PORT_HOTPLUG_STAT, I915_READ(PORT_HOTPLUG_STAT));
	}

	I915_WRITE(HWSTAM, 0xeffe);
	for_each_pipe(pipe)
		I915_WRITE(PIPESTAT(pipe), 0);
	I915_WRITE(IMR, 0xffffffff);
	I915_WRITE(IER, 0x0);
	POSTING_READ(IER);
}


Anyone able to spot the missing INIT_WORK() calls?  Based on the
I915_HAS_HOTPLUG(dev) test, I assume that leaving the first one out was
intentional.  But the second one cannot be left out, as demonstrated by
these bug reports.

I am attaching a proposed fix on top of the 655152 patch, which I have
not tested at all on actual Debian kernel sources.  Might need context
adjustments.  I'd appreciate it if anyone with crashing hardware could
test it.



Bjørn
From d2451aff41d2db6047586c22317cd247e4c000ca Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bjorn@mork.no>
Date: Thu, 28 Feb 2013 11:26:20 +0100
Subject: [PATCH] drm/i915: initialize error_work for i8xx interrupt handler
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The backport of upstream commit c2798b19bac2538393fc932bfbe59807a4734b3e
failed to initialize the error_work struct for gen2 hardware, resulting
in hitting a BUG in kernel/workqueue.c if/when the interrupt handler
tried to queue error handling work.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/gpu/drm/i915/i915_irq.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 47b08ce..bb9b943 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2102,6 +2102,8 @@ static void i8xx_irq_preinstall(struct drm_device * dev)
 
 	atomic_set(&dev_priv->irq_received, 0);
 
+	INIT_WORK(&dev_priv->error_work, i915_error_work_func);
+
 	for_each_pipe(pipe)
 		I915_WRITE(PIPESTAT(pipe), 0);
 	I915_WRITE16(IMR, 0xffff);
-- 
1.7.10.4

