Bug#545517: New occurrence of this bug
Hi,
Using KMS with the latest Debian kernel (2.6.31-1~experimental.2) and
hibernate/resume cycles leads to memory corruption. Here are the IRC logs about this.
Regards,
Vincent
--
Vincent Danjean GPG key ID 0x9D025E87 vdanjean@debian.org
GPG key fingerprint: FC95 08A6 854D DB48 4B9A 8A94 0BF7 7867 9D02 5E87
Unofficial packages: http://moais.imag.fr/membres/vincent.danjean/deb.html
APT repo: deb http://perso.debian.org/~vdanjean/debian unstable main
**** BEGIN LOGGING AT Thu Oct 15 19:27:41 2009
oct 15 19:27:41 --> You are now talking on #debian-kernel
oct 15 19:27:41 --- Topic for #debian-kernel is 2.6.{18,24} etch update | 2.6.26 lenny | 2.6.30 squeeze | 2.6.30 sid | snapshots: deb http://kernel-archive.buildserver.net/debian-kernel | http://lists.debian.org/debian-kernel | http://wiki.debian.org/DebianKernel | kernel-archive.buildserver.net down until further notice
oct 15 19:27:41 --- Topic for #debian-kernel set by bwh!~bwh@82-69-137-158.dsl.in-addr.zen.co.uk at Thu Sep 10 18:41:23 2009
oct 15 19:29:02 <vdanjean> Hi. A few weeks ago I reported #545517 about possible corruption when using KMS, a new kernel and suspend-to-disk. I tried again today with the latest kernel in experimental. I have been experiencing some strange behavior since my last reboot. For example:
oct 15 19:29:22 <vdanjean> vdanjean@eyak:~$ bash
oct 15 19:29:22 <vdanjean> bash: relocation error: bash: symbol getdtablesize, version GLIBC_2.2.5 not defined in file libc.so.6 with link time reference
oct 15 19:29:22 <vdanjean> vdanjean@eyak:~$ tbl
oct 15 19:29:22 <vdanjean> Segmentation fault
oct 15 19:29:22 <vdanjean> vdanjean@eyak:~$ md5sum /lib/libc-2.9.so
oct 15 19:29:22 <vdanjean> 310b264a27dfbbe89f84f2cfb973c507 /lib/libc-2.9.so
oct 15 19:29:22 <vdanjean> vdanjean@eyak:~$ ls -l /lib/libc-2.9.so
oct 15 19:29:22 <vdanjean> -rwxr-xr-x 1 root root 1367432 oct 1 03:39 /lib/libc-2.9.so
oct 15 19:29:22 <vdanjean> vdanjean@eyak:~$ echo $SHELL
oct 15 19:29:22 <vdanjean> /bin/bash
oct 15 19:29:44 <vdanjean> It was working correctly before my last suspend/resume. Is there someone from the kernel team or the libc team who would be interested in some tests? (Otherwise, I will reboot my system.)
oct 15 19:45:19 <bwh> You sent kernel logs, right?
oct 15 19:48:17 <vdanjean> There is nothing in the current kernel log. And there was nothing when the kernel froze (when I initially reported the bug)
oct 15 19:48:59 <vdanjean> In /var/log/kernel.log, I just have: Oct 15 19:07:54 eyak kernel: [38763.827004] tbl[12855] general protection ip:7ff36ce14ffb sp:7fffe2d9aa88 error:0 in ld-2.9.so[7ff36ce02000+1d000]
oct 15 19:48:59 <vdanjean> Oct 15 19:08:00 eyak kernel: [38769.598878] tbl[12859] general protection ip:7f769937dffb sp:7fff2bf23f58 error:0 in ld-2.9.so[7f769936b000+1d000]
oct 15 19:48:59 <vdanjean> Oct 15 19:09:01 eyak kernel: [38830.925900] cron[12871] general protection ip:7f625756e5d8 sp:953143de47d91411 error:0 in ld-2.9.so[7f6257559000+1d000]
oct 15 19:48:59 <vdanjean> Oct 15 19:22:54 eyak kernel: [39663.286909] tbl[13535] general protection ip:7fac7f0e2ffb sp:7fff740857d8 error:0 in ld-2.9.so[7fac7f0d0000+1d000]
oct 15 19:48:59 <vdanjean> Oct 15 19:39:01 eyak kernel: [40630.056163] cron[13809] general protection ip:7f625756e5d8 sp:953143de47d91411 error:0 in ld-2.9.so[7f6257559000+1d000]
oct 15 19:49:19 <vdanjean> i.e. segfaults in user space
oct 15 19:50:19 <vdanjean> Can someone give me the current md5sum of /lib/libc-2.9.so?
oct 15 19:50:22 <waldi> i remember a similar bug some months ago
oct 15 19:50:49 <jcristau> vdanjean: amd64?
oct 15 19:50:52 <vdanjean> yes
oct 15 19:51:13 <jcristau> 5c53408b506a1ad4e986e11ecf3f18b0 /lib/libc-2.9.so
oct 15 19:51:14 <vdanjean> last update from sid last evening
oct 15 19:51:16 <jcristau> Version: 2.9-27
oct 15 19:51:58 <vdanjean> libc6:
oct 15 19:51:58 <vdanjean> Installed: 2.9-27
oct 15 19:52:10 <jcristau> fun.
oct 15 19:52:18 <vdanjean> Arghh
oct 15 19:52:44 <vdanjean> I hope the corruption is only in the cache, not on-disk
oct 15 19:53:16 <vdanjean> Do you see any other tests I should run before I reboot (without KMS)?
oct 15 19:56:17 <vdanjean> So, I will reboot and check the md5sum again. I will report back here.
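
A note on reading the "general protection" lines quoted above: each one gives the faulting instruction pointer (ip) and the mapping of ld-2.9.so as [base+size], so ip minus base is the offset of the fault inside the library. The sketch below is only an illustration in Python. The regular expression is an assumption fitted to the five lines pasted in this log, not a general parser for kernel messages, and the names LOG_LINES, PATTERN and fault_offset are made up for the example.

```python
import re

# The five kernel log lines pasted in the IRC log (syslog prefix trimmed).
LOG_LINES = [
    "tbl[12855] general protection ip:7ff36ce14ffb sp:7fffe2d9aa88 error:0 in ld-2.9.so[7ff36ce02000+1d000]",
    "tbl[12859] general protection ip:7f769937dffb sp:7fff2bf23f58 error:0 in ld-2.9.so[7f769936b000+1d000]",
    "cron[12871] general protection ip:7f625756e5d8 sp:953143de47d91411 error:0 in ld-2.9.so[7f6257559000+1d000]",
    "tbl[13535] general protection ip:7fac7f0e2ffb sp:7fff740857d8 error:0 in ld-2.9.so[7fac7f0d0000+1d000]",
    "cron[13809] general protection ip:7f625756e5d8 sp:953143de47d91411 error:0 in ld-2.9.so[7f6257559000+1d000]",
]

# Assumed line layout: prog[pid] general protection ip:X sp:Y error:N in obj[base+size]
PATTERN = re.compile(
    r"(?P<prog>[^\s\[]+)\[(?P<pid>\d+)\] general protection "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:\d+ "
    r"in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (program, object, offset) for one log line, or None.

    The offset is ip - base, i.e. the position of the faulting
    instruction inside the mapped object.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        # ip is outside the mapping named on the line; do not guess.
        return None
    return match["prog"], match["obj"], offset


if __name__ == "__main__":
    for line in LOG_LINES:
        result = fault_offset(line)
        if result is not None:
            prog, obj, offset = result
            print(f"{prog}: fault at {obj}+{offset:#x}")
```

For the lines in this log, the three tbl faults all land at offset 0x12ffb and both cron faults at 0x155d8, even though the tbl load address differs from run to run. That may be consistent with the same damaged pages being hit each time, but it does not prove it.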
**** BEGIN LOGGING AT Thu Oct 15 20:03:40 2009
oct 15 20:03:40 --> You are now talking on #debian-kernel
oct 15 20:04:22 <vdanjean> I rebooted. I got a fsck on / (due to mount count, not due to an error): nothing detected
oct 15 20:05:57 <vdanjean> Some inodes were deleted on /home (I hard powered off my laptop to stop it)
oct 15 20:06:02 <vdanjean> But the fun is:
oct 15 20:06:25 <vdanjean> vdanjean@eyak:~$ md5sum /lib/libc-2.9.so
oct 15 20:06:25 <vdanjean> 5c53408b506a1ad4e986e11ecf3f18b0 /lib/libc-2.9.so
oct 15 20:06:46 <vdanjean> So, it really was an in-memory corruption.
oct 15 20:07:33 <vdanjean> I suspect this is due to the use of KMS+suspend/reboot
oct 15 20:09:12 <vdanjean> Since my bug report 545517, I stopped using KMS (but still use the same kernel) and I have had no problems over many suspend/resume cycles.
oct 15 20:09:55 <vdanjean> Last evening, I installed the kernel from experimental and reenabled KMS. I got this bug today.
oct 15 20:10:15 <jcristau> suspend/resume, or hibernate?
oct 15 20:10:31 <vdanjean> hibernate (i.e. to disk, not to RAM)
oct 15 20:11:12 <vdanjean> There is nothing special in /var/log/Xorg.0.log
oct 15 20:11:20 <jcristau> yeah there wouldn't be
oct 15 20:12:48 <vdanjean> For info, when my laptop is at work, it is on a dock with an external DVI screen. In this case, I use xrandr to use the DVI screen and the Panel
oct 15 20:14:33 <vdanjean> Before hibernation, I always switch off the DVI screen (because it is my "main" at work and if I forget to switch it off, I have no menu, panel, ... until I type the right xrandr line on resume)
oct 15 20:15:19 <vdanjean> Here, it was a wakeup at home (ie without dock and DVI screen, only the LVDS)
oct 15 20:25:34 <vdanjean> I will add this information to the Debian bug (#545517) and to the freedesktop bug (https://bugs.freedesktop.org/show_bug.cgi?id=23836)
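
The check used in the log (compare the md5sum of /lib/libc-2.9.so with the value from a healthy machine running the same package version) can be scripted. The following is a minimal Python sketch, not a tool anyone in the discussion used. The function names md5_hex and file_matches are hypothetical, and the reference value is the one jcristau quoted for libc6 2.9-27 on amd64.

```python
import hashlib

# Reference checksum quoted by jcristau in the log for libc6 2.9-27 (amd64).
REFERENCE_MD5 = "5c53408b506a1ad4e986e11ecf3f18b0"


def md5_hex(stream, chunk_size=65536):
    """Return the hex MD5 digest of a binary file-like object."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def file_matches(path, expected_md5):
    """True if the file at `path` currently hashes to `expected_md5`.

    The data is read through the kernel page cache, so a mismatch alone
    cannot tell a corrupted cached copy from a corrupted file on disk.
    """
    with open(path, "rb") as stream:
        return md5_hex(stream) == expected_md5.lower()


if __name__ == "__main__":
    path = "/lib/libc-2.9.so"
    try:
        state = "matches" if file_matches(path, REFERENCE_MD5) else "DIFFERS from"
        print(f"{path} {state} the reference checksum")
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
```

In the log, the checksum differed before the reboot and matched afterwards, which is what pointed to the cached copy rather than the on-disk file.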