Bug#584314: base: System freezes at random time after Resume from Suspend (Regression)
Andreas Berger wrote:
> ok, i narrowed it down, but it is:
>
> found: linux-image-2.6.36-trunk-686, version 2.6.36-1~experimental.1
> not found: linux-image-2.6.37-rc4-686, version 2.6.37~rc4-1~experimental.1
>
> and this time i think i got a complete call trace, is attached
Nice. Alas, after looking at the Debian changelog and "git shortlog
v2.6.36..v2.6.37-rc4" output, no particular change jumps out as likely
to have fixed this corruption (and the places the kernel panicked
don't give any obvious clue).
Some ideas for narrowing it down:
- could you try suspending in single-user mode (i.e., kernel
parameters "single debug"), to rule out a problem in the i915
driver?
- likewise, does unloading other modules before suspend help?
- if nothing else gives a hint: can you bisect to find the fix? It
works like this:
1. Reproduce the bug with the unpatched kernel.
# apt-get install git-core build-essential
$ mkdir -p ~/src && cd ~/src; # later steps assume ~/src/linux
$ git clone git://github.com/torvalds/linux.git; # kernel.org is down
$ cd linux
$ git checkout v2.6.36
$ make localmodconfig; # minimal configuration
$ make deb-pkg; # with -j<n> for parallel build if wanted
# dpkg -i ../<linux-image package name>
# reboot
... test test test ...
Hopefully it reproduces the bug. Otherwise, declare victory and we
can figure out how Debian-specific changes screwed it up.
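Before testing, it may be worth confirming that the machine really booted
the kernel that was just built, since the boot loader can default to a
different entry. A minimal sketch of such a check; the function name
running_kernel_is is made up for illustration:

```shell
# Hypothetical helper: succeed only if the running kernel's release string
# (as printed by "uname -r") matches the expected one given as $1.
running_kernel_is() {
    [ "$(uname -r)" = "$1" ]
}
```

In a configured source tree, "make -s kernelrelease" prints the release
string of the kernel that tree builds, so the check could be used like this
(assuming the tree is still at the revision that was installed):
$ cd ~/src/linux
$ running_kernel_is "$(make -s kernelrelease)" || echo "wrong kernel booted"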
2. Reproduce the fix.
$ cd ~/src/linux
$ git checkout v2.6.37-rc4
$ yes "" | make silentoldconfig; # reuse configuration
$ make deb-pkg
# dpkg -i ../<linux-image package name>
# reboot
... test test test ...
Hopefully it does _not_ reproduce the bug. If it does, copy Debian's
config-2.6.37-rc4-686 to ~/src/linux/.config, rebuild, and try again.
If that fixes it, declare victory and we can figure out which
configuration change fixed it; if it doesn't, we can look for a
relevant Debian-specific patch.
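If the configuration does turn out to matter, comparing the two .config
files gives a first list of candidate options. A rough sketch; the function
name config_changes is made up for illustration, and the /boot path in the
usage line is an assumption about where the Debian package put its config:

```shell
# Hypothetical helper: list kernel config options that differ between two
# .config files. Lines starting with "<" come from the first file, lines
# starting with ">" from the second.
config_changes() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep only real settings and "# CONFIG_FOO is not set" lines, and sort
    # them so that mere ordering differences do not show up as changes.
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | LC_ALL=C sort > "$a"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | LC_ALL=C sort > "$b"
    # diff exits non-zero when the files differ; that is expected here.
    diff "$a" "$b" | grep '^[<>]' || :
    rm -f "$a" "$b"
}
```

For example:
$ config_changes ~/src/linux/.config /boot/config-2.6.37-rc4-686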
3. Great --- so v2.6.36 reproduces the bug and v2.6.37-rc4 reproduces
the fix. Tell git:
$ cd ~/src/linux
$ git bisect start v2.6.37-rc4 v2.6.36
Git checks out a revision halfway between to test.
$ yes "" | make silentoldconfig; # reuse configuration
$ make deb-pkg
# dpkg -i ../<linux-image package name>
# reboot
... test test test ...
$ cd ~/src/linux
$ git bisect good; # if it crashes
$ git bisect bad; # if it is stable
$ git bisect skip; # if some other bug makes it hard to test
Yes, "good" means "successfully demonstrates the bug". The naming is
a little confusing because git bisect is usually used to find changes
introducing bugs rather than changes fixing them.
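Since the inverted meaning is easy to get wrong after a few reboots, a
small wrapper can do the translation. This is only a sketch: the function
name bisect_term and the outcome words are made up for illustration and are
not part of git.

```shell
# Hypothetical helper (not part of git): translate what was observed on the
# test kernel into the word to pass to "git bisect" in this inverted search
# for a fix.
bisect_term() {
    case "$1" in
        crashed)    echo good ;;  # bug still present: before the fix
        stable)     echo bad ;;   # bug gone: at or after the fix
        untestable) echo skip ;;  # some other problem got in the way
        *)          echo "usage: bisect_term crashed|stable|untestable" >&2
                    return 1 ;;
    esac
}
```

It would be used as, e.g.:
$ git bisect "$(bisect_term crashed)"
Newer git releases (2.7 and later) also let you pick your own words with
"git bisect start --term-old=broken --term-new=fixed", which avoids the
inversion altogether.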
4. Repeat until bored:
$ yes "" | make silentoldconfig; # reuse configuration
$ make deb-pkg
# dpkg -i ../<linux-image package>
# reboot
... test test test ...
$ cd ~/src/linux
$ git bisect good; # or "bad" or "skip", as in step 3
Eventually it will report the "first bad commit" (i.e., the fix), which
is what we want. If you get bored before then, that's still
useful: "git bisect log" will show the results so far. (Even a
few rounds can narrow things down a lot.) If the gitk package is
installed, you can run "git bisect visualize" at any time to watch the
range of changes that might contain the fix shrink.
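To get a feel for how many build-and-reboot rounds the whole search takes:
each round roughly halves the remaining range, so the number of rounds
grows only with the logarithm of the number of commits. A small sketch of
that arithmetic (the function name bisect_rounds is made up here, and the
real count can differ a little because of merges and skipped revisions):

```shell
# Hypothetical helper: estimate how many bisection rounds are needed to
# single out one commit among $1 candidates, halving the range each round.
bisect_rounds() {
    n=$1
    rounds=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))      # half the range, rounded up
        rounds=$(( rounds + 1 ))
    done
    echo "$rounds"
}
```

For instance, "bisect_rounds 10000" prints 14. With a git that supports
"git rev-list --count", the size of the actual range can be fed in directly:
$ bisect_rounds "$(git rev-list --count v2.6.36..v2.6.37-rc4)"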
"man git-bisect" and /usr/share/doc/git-doc/git-bisect-lk2009.html
from the git-doc package have details.
Thanks much for your help so far!
Jonathan