
Bug#613979: [regression] "BUG: Unable to handle kernel paging request at ffffc90013cd8000" and no sound card recognized



On Mon, 2011-08-15 at 15:18 +0200, Paul Menzel wrote:
> Dear folks,
> 
> 
> On Mon, 2011-08-15 at 15:00 +0200, Paul Menzel wrote:
> 
> > On Thu, 2011-07-14 at 23:35 +0200, Svante Signell wrote:
> > > On Thu, 2011-07-14 at 13:27 -0500, Jonathan Nieder wrote:
> > 
> > > > Takashi Iwai wrote:
> > > > >>> On Wed, 2011-03-30 at 15:13 +0200, Clemens Ladisch wrote:
> > > > >>>>>>>>> Svante Signell wrote:
> > > > 
> > > > >>>>>>>>>> During boot of kernel 2.6.38 (and 2.6.37) udev bugs out:
> > > > >>>>>>>>>> Waiting for /dev to be fully populated
> > > > >>>>>>>>>> BUG: Unable to handle kernel paging request at ffffc90013cd8000
> > > > >>>>>>>>>> azx_probe+ ... [snd_hda_intel]
> > > > >>>>>>>>>> ...lots of output lost...
> > > > >>>>>>>>>> udevadm timeout 180 sec ...
> > > > >>>>>>>>>> udevd[390]: worker [439] failed while handling
> > > > >>>>>>>>>> '/devices/pci0000:80/0000:80:01.0'
> > > > >>>>>>>>>> 
> > > > >>>>>>>>>> After the timeout the boot continues! Have not yet tested if sound is
> > > > >>>>>>>>>> functional.
> > > > [...]
> > > > >>>> This is the azx_readw(chip, GCAP) in azx_create(); chip->remap_addr is
> > > > >>>> 0xffffc90011c08000 which does look like a valid pointer, but isn't.
> > > > [...]
> > > > > The point where it Oops implies that the problem isn't in the sound
> > > > > driver but rather in a breakage in a deeper level, either PCI core,
> > > > > x86 mm or ACPI/BIOS.
> > > > >
> > > > > Any chance to bisect the kernel?
> > > > 
> > > > Svante bisected it to v2.6.34-rc1~218^2~27 (x86/pci: Use
> > > > resource_size_t in update_res, 2010-02-10) --- thanks.  Which is
> > > > pretty weird, since I think phys_addr_t on an amd64 machine (and hence
> > > > resource_size_t) would be 64 bits, making that commit a no-op.
> > > > 
> > > > Svante, more questions (sorry):
> > > > 
> > > >  - could you try booting b74fd238a9cf and b74fd238a9cf^ again
> > > >    (to make sure we haven't hit a heisenbug) and send the
> > > >    corresponding full dmesg and .config files?
> > > 
> > > I am very sorry, but I won't have physical access to that box for a
> > > month. However, something that might be more interesting is the
> > > output mentioned in the second-to-last message, which concerned
> > > x86/pci/amd/.... I might get help finding that message in a few days.
> > 
> > Thanks to Jonathan’s help I was able to perform the bisection.
> > 
> >         $ git bisect start 0f2cc4ecd81dc1917a041dc93db0ada28f8356fa 8724fdb53d27d7b59b60c8a399cc67f9abfabb33
> >         […]
> >         $ git bisect log
> >         # bad: [0f2cc4ecd81dc1917a041dc93db0ada28f8356fa] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
> >         # good: [8724fdb53d27d7b59b60c8a399cc67f9abfabb33] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> >         git bisect start '0f2cc4ecd81dc1917a041dc93db0ada28f8356fa' '8724fdb53d27d7b59b60c8a399cc67f9abfabb33'
> >         # good: [14f3ad6f4a12495b32b0dd743bc7179f36658208] ipv6: Use 1280 as min MTU for ipv6 forwarding
> >         git bisect good 14f3ad6f4a12495b32b0dd743bc7179f36658208
> >         # good: [60f8a8d4c6c46bb080e8e65d30be31b172a39a78] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
> >         git bisect good 60f8a8d4c6c46bb080e8e65d30be31b172a39a78
> >         # bad: [7f5b09c15ab989ed5ce4adda0be42c1302df70b7] Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
> >         git bisect bad 7f5b09c15ab989ed5ce4adda0be42c1302df70b7
> >         # good: [9f445cb29918dc488b7a9a92ef018599cce33df7] USB: musb: disable double buffering for older RTL versions
> >         git bisect good 9f445cb29918dc488b7a9a92ef018599cce33df7
> >         # bad: [fb7b096d949fa852442ed9d8f982bce526ccfe7e] Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
> >         git bisect bad fb7b096d949fa852442ed9d8f982bce526ccfe7e
> >         # bad: [dce46a04d55d6358d2d4ab44a4946a19f9425fe2] early_res: Need to save the allocation name in drop_range_partial()
> >         git bisect bad dce46a04d55d6358d2d4ab44a4946a19f9425fe2
> >         # bad: [c252a5bb1f57afb1e336d68085217727ca7b2134] x86: Only call dma32_reserve_bootmem 64bit !CONFIG_NUMA
> >         git bisect bad c252a5bb1f57afb1e336d68085217727ca7b2134
> >         # bad: [9ad3f2c7c69659c343843393944d739fec1f2e73] x86/pci: Add cap_resource()
> >         git bisect bad 9ad3f2c7c69659c343843393944d739fec1f2e73
> >         # good: [27811d8cabe56e0c3622251b049086f49face4ff] x86: Move range related operation to one file
> >         git bisect good 27811d8cabe56e0c3622251b049086f49face4ff
> >         # bad: [3e3da00c01d050307e753fb7b3e84aefc16da0d0] x86/pci: AMD one chain system to use pci read out res
> >         git bisect bad 3e3da00c01d050307e753fb7b3e84aefc16da0d0
> > 
> > and `git bisect` pointed me to the same faulty commit that Svante
> > found: b74fd238 [1].
> > 
> >         commit b74fd238a9cf39a81d94152f375b756bf795b4af
> >         Author: Yinghai Lu <yinghai@kernel.org>
> >         Date:   Wed Feb 10 01:20:08 2010 -0800
> >         
> >             x86/pci: Use resource_size_t in update_res
> > 
> > (Is there a command to show that summary again? I could not find it in
> > the help.)
> > 
> > But before I started the bisection, I tried what Jonathan asked
> > Svante to do, and neither b74fd238a9cf nor b74fd238a9cf^ showed these
> > problems. Additionally, it is weird that I never had to test commit
> > b74fd238 during that bisection run.
> > 
> > I will rebuild and test it again¹ and report back. But I suspect
> > 3e3da00c [2] is the culprit.
> > 
> >         commit 3e3da00c01d050307e753fb7b3e84aefc16da0d0
> >         Author: Yinghai Lu <yinghai@kernel.org>
> >         Date:   Wed Feb 10 01:20:09 2010 -0800
> >         
> >             x86/pci: AMD one chain system to use pci read out res
> > 
> > I guess `make` takes care of all necessary recompilation, so no
> > `make clean` should be required(?).
> 
> So from my perspective I would say `git bisect` has a bug.
> 
>         $ git --version
>         git version 1.7.5.4
> 
> Here is the new result.
> 
>         $ git bisect good b74fd238a9cf39a81d94152f375b756bf795b4af
>         3e3da00c01d050307e753fb7b3e84aefc16da0d0 is the first bad commit
>         commit 3e3da00c01d050307e753fb7b3e84aefc16da0d0
>         Author: Yinghai Lu <yinghai@kernel.org>
>         Date:   Wed Feb 10 01:20:09 2010 -0800
>         
>             x86/pci: AMD one chain system to use pci read out res
>             
>             Found MSI amd k8 based laptops is hiding [0x70000000, 0x80000000) RAM
>             from e820.
>             
>             enable amd one chain even for all.
>             
>             -v2: use bool for found, according to Andrew
>             
>             Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>             LKML-Reference: <1265793639-15071-6-git-send-email-yinghai@kernel.org>
>             Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
>             Signed-off-by: H. Peter Anvin <hpa@zytor.com>
>         
>         :040000 040000 44e134ff22492c50d49c5e66880cfbf0b6738e50 f3e9511913b613c2f98e96136927b5ed44736e4e M	arch
> 
> Please note that with this patch the following messages also seem to
> start appearing in the Linux kernel ring buffer (`dmesg`).
> 
>         [    0.227319] pnp 00:0e: disabling [mem 0x00000000-0x0009ffff] because it overlaps 0000:00:00.0 BAR 0 [mem 0x00000000-0x07ffffff pref]
>         [    0.227324] pnp 00:0e: disabling [mem 0x000c0000-0x000bffff disabled] because it overlaps 0000:00:00.0 BAR 0 [mem 0x00000000-0x07ffffff pref]
>         [    0.227330] pnp 00:0e: disabling [mem 0x000e0000-0x000fffff] because it overlaps 0000:00:00.0 BAR 0 [mem 0x00000000-0x07ffffff pref]
>         [    0.227334] pnp 00:0e: disabling [mem 0x00100000-0x77ffffff] because it overlaps 0000:00:00.0 BAR 0 [mem 0x00000000-0x07ffffff pref]
> 
> 
> Thanks,
> 
> Paul
> 
> 
> > ¹ It would be nice if the SHA1 checksum of the commit could be
> > incorporated in the Debian package somehow.
> > 
> > 
> > [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b74fd238a9cf39a81d94152f375b756bf795b4af
> > [2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3e3da00c01d050307e753fb7b3e84aefc16da0d0


I'm back now and have physical access to the computer. How can I confirm
which commit is causing the problems?



