Bug handling policy [2nd draft]

To: debian-kernel@lists.debian.org
Subject: Bug handling policy [2nd draft]
From: Ben Hutchings <ben@decadent.org.uk>
Date: Sat, 07 Nov 2009 23:48:11 +0000
Message-id: <1257637691.15927.620.camel@localhost>

This incorporates all the comments, I think.  I'll take further comments
on this draft until 14th November, then consider this final if no-one
objects.

Ben.

---

1. Required information

Submitters are expected to run reportbug or another tool that runs our
'bug' script under the kernel version in question.  The response to
reports without this information should be a request to follow up using
reportbug.  If we do not receive this information within a month of the
request, the bug may be closed.

Exceptions:
* If the kernel does not boot or is very unstable, instead of the usual
  system information we need the console messages via netconsole, serial
  console, or a photograph.
* If the report is relaying information about a bug acknowledged
  upstream, we do not need system information but we do need specific
  references (bugzilla.kernel.org or git commit id).
* If the bug is clearly not hardware-specific (e.g. packaging error), we
  do not need system information.
* If the bug is reported against a well-defined model, we may not need
  device listings.
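
As an illustration only, the one-month rule could be sketched in Python
like this.  The function name and the reading of 'a month' as 30 days
are assumptions made for the example; the policy itself does not define
either, and closing remains a judgment call ('may be closed').

```python
from datetime import date, timedelta

# Assumption: "a month" is approximated as 30 days.
RESPONSE_WINDOW = timedelta(days=30)


def may_close(requested_on: date, today: date, info_received: bool) -> bool:
    """Return True if a report still lacking the required information
    has passed the response window and may therefore be closed."""
    if info_received:
        return False
    return today - requested_on > RESPONSE_WINDOW


# Request sent on 7 November 2009, still no reply on 8 December 2009.
print(may_close(date(2009, 11, 7), date(2009, 12, 8), info_received=False))
```

The exceptions listed above change which information is required, not
the deadline, so they would feed into `info_received` rather than into
the date arithmetic.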

2. Severities

Many submitters report bugs with the wrong severity.  We interpret the
criteria as follows and will adjust severity as appropriate:

'critical: makes unrelated software on the system (or the whole system)
break...'
   The bug must make the kernel unbootable or unstable on common
   hardware or all systems that a specific flavour is supposed to
   support.  There is no 'unrelated software' since everything
   depends on the kernel.

'grave: makes the package in question unusable or mostly so...'
   If the kernel is unusable, this already qualifies as critical.

'grave: ...or causes data loss...'
   We exclude loss of data in memory due to a crash.  Only corruption
   of data in storage or communication, or silent failure to write data,
   qualifies.

important
   We include lack of support for new hardware that is generally
   available.
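
A rough sketch of how these interpretations combine, in Python.  The
class, its field names and the function name are all hypothetical, and
the sketch covers only the cases spelled out above; anything else
returns None and is left to the general BTS severity definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Report:
    # All field names are hypothetical, chosen for this example only.
    unbootable_or_unstable: bool = False
    common_hardware_or_whole_flavour: bool = False
    storage_or_network_corruption: bool = False
    silent_write_failure: bool = False
    unsupported_available_hardware: bool = False


def interpret_severity(r: Report) -> Optional[str]:
    """Apply the kernel team's reading of the severity criteria."""
    # 'critical': unbootable or unstable on common hardware, or on all
    # systems a flavour is supposed to support.
    if r.unbootable_or_unstable and r.common_hardware_or_whole_flavour:
        return "critical"
    # 'grave' (data loss): only corruption in storage or communication,
    # or silent failure to write.  Loss of in-memory data due to a crash
    # is deliberately not considered here.
    if r.storage_or_network_corruption or r.silent_write_failure:
        return "grave"
    # 'important' includes missing support for generally available hardware.
    if r.unsupported_available_hardware:
        return "important"
    return None  # not covered by the interpretations above


print(interpret_severity(Report(silent_write_failure=True)))
```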

3. Tagging

We do not use user-tags.  In order to aid bug triage we should make use
of the standard tags and 'forwarded' field defined by the BTS.  In
particular:

* Add 'moreinfo' whenever we are waiting for a response from the
  submitter and remove it when we are not
* Do not add 'unreproducible' to bugs that may be hardware-dependent
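
The 'moreinfo' rule maps directly onto the BTS control 'tags' command.
A small Python sketch that builds the message body to send to the
control server follows; the function name is made up for the example
and 123456 is a placeholder bug number.

```python
from typing import List


def moreinfo_commands(bug: int, waiting_for_submitter: bool) -> List[str]:
    """Build BTS control commands that add 'moreinfo' while we wait for
    the submitter and remove it once we are no longer waiting."""
    sign = "+" if waiting_for_submitter else "-"
    return [f"tags {bug} {sign} moreinfo", "thanks"]


# 123456 is a placeholder, not a real bug reference.
print("\n".join(moreinfo_commands(123456, waiting_for_submitter=True)))
```

Calling it with `waiting_for_submitter=False` yields the matching
'tags 123456 - moreinfo' line for when the submitter has replied.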

4. Analysis by maintainers

Generally we should not expect to be able to reproduce bugs without
having similar hardware.  We should consider:

* Searching bugzilla.kernel.org (including closed bugs) or another
  relevant bug tracker
* Searching kernel mailing lists
  - Of the many archives, http://news.gmane.org seems to suck least
  - Patches submitted to some lists are archived at
    http://patchwork.kernel.org/
* Viewing git commit logs for relevant source files
  - In case of a regression, from the known good to the bad version
  - In other cases, from the bad version forwards, in case the bug
    has been fixed since
* Searching kerneloops.org for similar oopses
* Matching the machine code and registers in an 'oops' against the
  source and deducing how the impossible happened (this doesn't work
  that often but when it does you look like a genius ;-)

5. Testing by submitter

Depending on the technical sophistication of the submitter and the
service requirements of the system in question (e.g. whether it's a
production server) we can request one or more of the following:

* Gathering more information passively (e.g. further logging, reporting
  contents of files in procfs or sysfs)
* Upgrading to the current stable/stable-proposed-updates/
  stable-security version, if it includes a fix for a similar bug
* Adding debug or fallback options to the kernel command line or
  module parameters
* Installing the unstable or backports version temporarily
* Rebuilding and installing the kernel with a specific patch added
  (the script debian/bin/test-patches should make this easy)
* Using 'git bisect' to find a specific upstream change that
  introduced the bug
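
For submitters unfamiliar with bisection, the idea behind the last item
can be shown with a simplified Python sketch.  It assumes a linear list
of commits ordered oldest to newest, with the first known good and the
last known bad; real history is a graph, which 'git bisect' handles and
this sketch does not.  The function and the commit names are
hypothetical.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search for the first bad commit.

    Assumes commits[0] is known good and commits[-1] is known bad, and
    that every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        # In practice this test means building and booting that kernel.
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


commits = [f"c{i}" for i in range(10)]           # c0 (good) ... c9 (bad)
print(first_bad(commits, lambda c: int(c[1:]) >= 6))
```

Each step halves the remaining range, which is why bisection is
practical even across thousands of upstream commits, but every step
still costs the submitter a kernel build and a reboot.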

When a bug occurs in what upstream considers the current or previous
stable release, and we cannot fix it, we ask the submitter to report it
upstream at bugzilla.kernel.org under a specific Product and Component,
and to tell us the upstream bug number.  We do not report bugs directly
because follow-up questions from upstream need to go to the submitter,
not to us.  Given the upstream bug number, we mark the bug as forwarded.
bts-link then updates its status.
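
Marking a bug as forwarded uses the BTS control 'forwarded' command.
A minimal Python sketch that builds the command from the upstream bug
number is shown here; the function name and both bug numbers are
placeholders, and the URL form assumes the usual Bugzilla
'show_bug.cgi?id=' address.

```python
from typing import List


def forwarded_commands(bug: int, upstream_bug: int) -> List[str]:
    """Build BTS control commands recording the upstream bug URL."""
    # Assumption: upstream bugs are addressed in the standard Bugzilla form.
    url = f"https://bugzilla.kernel.org/show_bug.cgi?id={upstream_bug}"
    return [f"forwarded {bug} {url}", "thanks"]


# Both numbers are placeholders, not real bug references.
print("\n".join(forwarded_commands(123456, 9999)))
```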

6. Keeping bugs separate

Many submitters search for a characteristic error message and treat this
as indicating a specific bug.  This can lead to many 'me too' follow-ups
where, for example, the message indicates a driver bug and the second
submitter is using a different driver from the original submitter.

In order to avoid the report turning into a mess of conflicting
information about two or more different bugs:
* We should try to respond to such a follow-up quickly, requesting a
  separate bug report
* We can use the BTS 'summary' command to improve the description of
  the bug
* As a last resort, it may be necessary to open new bugs with the
  relevant information, set their submitters accordingly, and close the
  original report

Where the original report describes more than one bug ('...and another
thing...'), we should clone it and deal with each separately.
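
Cloning can be done with the BTS control 'clone' and 'retitle'
commands.  The following Python sketch builds them for a report that
turns out to describe several bugs; the function name, the bug number
and the titles are invented for the example.

```python
from typing import List


def split_report_commands(bug: int, extra_titles: List[str]) -> List[str]:
    """Clone a report once per additional bug it describes and give
    each clone its own title."""
    if not extra_titles:
        raise ValueError("need at least one additional bug to split off")
    # The BTS refers to not-yet-numbered clones as -1, -2, ...
    new_ids = [f"-{i}" for i in range(1, len(extra_titles) + 1)]
    commands = [f"clone {bug} {' '.join(new_ids)}"]
    commands += [f"retitle {nid} {title}"
                 for nid, title in zip(new_ids, extra_titles)]
    commands.append("thanks")
    return commands


# 123456 and the title are placeholders.
print("\n".join(split_report_commands(123456, ["suspend fails on resume"])))
```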

7. Applying patches

Patches should normally be reviewed and accepted by the relevant
upstream maintainer (aside from necessary adjustments for an older
kernel version) before being applied.

8. Talking to submitters

We should always be polite to submitters.  Not only is this implied by
the Social Contract, but it is likely to lead to a faster resolution of
the bug.  If a submitter overrated the severity, quietly downgrade it.
If a submitter has done something stupid, request that they undo that
and report back.  'Sorry' and 'please' make a big difference in tone.

We will maintain general advice to submitters at
<http://wiki.debian.org/DebianKernelReportingBugs>.

Ben.

-- 
Ben Hutchings
It is impossible to make anything foolproof because fools are so ingenious.
