Bug#662348: marked as done (i915: please save more useful debugging info when GPU locks up)

To: Jonathan Nieder <jrnieder@gmail.com>
Subject: Bug#662348: marked as done (i915: please save more useful debugging info when GPU locks up)
From: owner@bugs.debian.org (Debian Bug Tracking System)
Date: Sun, 25 Nov 2012 02:00:12 +0000
Message-id: <handler.662348.D662348.13538086456506.ackdone@bugs.debian.org>
References: <20121125015712.GA24819@elie.Belkin> <20120305035231.GB30344@burratino>

Your message dated Sat, 24 Nov 2012 17:57:12 -0800
with message-id <20121125015712.GA24819@elie.Belkin>
and subject line Re: i915: please save more useful debugging info when GPU locks up
has caused the Debian Bug report #662348,
regarding i915: please save more useful debugging info when GPU locks up
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
662348: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=662348
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems

--- Begin Message ---

To: submit@bugs.debian.org
Subject: i915: please save more useful debugging info when GPU locks up
From: Jonathan Nieder <jrnieder@gmail.com>
Date: Sun, 4 Mar 2012 21:52:31 -0600
Message-id: <20120305035231.GB30344@burratino>
In-reply-to: <2flsjmhexzy.fsf@diskless.uio.no>
References: <2flr5js1be6.fsf@login1.uio.no> <20110830050228.GA9068@elie.gateway.2wire.net> <20111017095009.GA22321@elie.hsd1.il.comcast.net> <2flpqhw7xam.fsf@diskless.uio.no> <20111017102000.GD22559@elie.hsd1.il.comcast.net> <2flsjmhexzy.fsf@diskless.uio.no>

Source: linux-2.6
Version: 2.6.32-41
Severity: wishlist
Tags: upstream patch

After an i915 GPU hang, apparently[*] the most convenient and useful
information a person can provide is the last batch buffer.  A patch
applied in the 2.6.34 merge window taught the driver to save this
information on error:

  9df30794f609 "drm/i915: Record batch buffer following GPU error" 2010-02-18

Ben, would the attached patches make sense for squeeze?  (Warning:
untested.  I don't have a machine using Intel graphics.)

With the first applied, one can do the following after a GPU hang.

	mount -t debugfs debugfs /sys/kernel/debug
	cat /sys/kernel/debug/dri/0/i915_error_state >i915_error_state

A later patch adds a notice to the kernel log inviting the user to do
that.
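
For anyone who wants to script the capture step, here is a minimal
sketch.  The function name save_i915_error_state and its two arguments
are made up for illustration and are not part of the patches; the
dri/0 path is the one used above, and debugfs is assumed to be mounted
already.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch series): copy the i915
# error state out of debugfs into a regular file.
#   $1: debugfs mount point (assumed default: /sys/kernel/debug)
#   $2: output file         (assumed default: ./i915_error_state)
save_i915_error_state() {
    debugfs_root=${1:-/sys/kernel/debug}
    out=${2:-i915_error_state}
    src="$debugfs_root/dri/0/i915_error_state"

    # Bail out before touching the output file if the source is
    # missing or unreadable (debugfs not mounted, or not root).
    if [ ! -r "$src" ]; then
        echo "cannot read $src (is debugfs mounted? are you root?)" >&2
        return 1
    fi

    cat "$src" >"$out"
}
```

Running "save_i915_error_state /sys/kernel/debug i915_error_state" as
root after a hang should then leave a file suitable for attaching to a
bug report.  Taking the mount point as a parameter is only there so
the helper can be tried against a scratch directory.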

The patches do not seem very risky (the first one adds the error state
collection feature and the ones afterwards are minor tweaks I noticed
while hunting for potential bugfixes relating to that), but this is
more complication than stable_kernel_rules.txt encourages.  So these
are not meant as candidates for inclusion in 2.6.32.y+drm33.z.

[*] judging from <http://intellinuxgraphics.org/how_to_report_bug.html>:
"for GPU hang, get the last batch buffer (see the guide)".

Chris Wilson (4):
  drm/i915: Record batch buffer following GPU error
  drm/i915: Only print an message if there was an error
  drm/i915: Only print 'generating error event' if we actually are
  drm/i915: Include 'i915_error_state' hint for when the GPU catches
    fire

Daniel Vetter (2):
  drm/i915: unload: fix error_work races
  drm/i915: unload: don't leak error state

 drivers/gpu/drm/i915/i915_debugfs.c |   85 +++++++++++
 drivers/gpu/drm/i915/i915_dma.c     |   10 +-
 drivers/gpu/drm/i915/i915_drv.h     |   21 +++
 drivers/gpu/drm/i915/i915_gem.c     |    2 +-
 drivers/gpu/drm/i915/i915_irq.c     |  266 ++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_reg.h     |    1 +
 6 files changed, 359 insertions(+), 26 deletions(-)

--- End Message ---

--- Begin Message ---

To: 662348-done@bugs.debian.org
Subject: Re: i915: please save more useful debugging info when GPU locks up
From: Jonathan Nieder <jrnieder@gmail.com>
Date: Sat, 24 Nov 2012 17:57:12 -0800
Message-id: <20121125015712.GA24819@elie.Belkin>
In-reply-to: <20120305035231.GB30344@burratino>
References: <2flr5js1be6.fsf@login1.uio.no> <20110830050228.GA9068@elie.gateway.2wire.net> <20111017095009.GA22321@elie.hsd1.il.comcast.net> <2flpqhw7xam.fsf@diskless.uio.no> <20111017102000.GD22559@elie.hsd1.il.comcast.net> <2flsjmhexzy.fsf@diskless.uio.no> <20120305035231.GB30344@burratino>

tags 662348 + wontfix
quit

Jonathan Nieder wrote:

> The patches do not seem very risky (the first one adds the error state
> collection feature and the ones afterwards are minor tweaks I noticed
> while hunting for potential bugfixes relating to that), but this is
> more complication than stable_kernel_rules.txt encourages.  So these
> are not meant as candidates for inclusion in 2.6.32.y+drm33.z.

In practice, updating to a newer kernel is generally a better debugging
strategy.  Closing.

Regards,
Jonathan
--- End Message ---
