--- Begin Message ---
Package: linux-image-2.6.17-2-amd64
Version: 2.6.17-8
Severity: critical
Justification: breaks unrelated software
The specifications of the machine in question before the bug was found:
AMD64 3700+ processor (single core)
ASUS A8N-SLI deluxe motherboard
2Gb of 333MHz DDR in two 1Gb sticks
Debian Etch (dist-upgrade as of August 31)
Debian stock kernel 2.6.17-2-amd64
VMWare 5.5 using Any-to-Any patch, installed and running properly
Everything was happy until I added another 1Gb of RAM to the machine (in
the form of two 512Mb 400MHz DDR sticks). Upon doing so, the VMWare GUI
launches properly but starting a VM crashes the entire X session and a
number of other programs with the following output to syslog:
-------------------------------------------------------------------------------
Sep 4 00:40:31 localhost kernel: Active:73720 inactive:32596 dirty:683
writeback:0 unstable:0 free:648340 slab:9099 mapped:57706
pagetables:1790
Sep 4 00:40:31 localhost kernel: Node 0 DMA free:24kB min:24kB low:28kB
high:36kB active:0kB inactive:0kB present:11292kB pages_scanned:0
all_unreclaimable? yes
Sep 4 00:40:31 localhost kernel: lowmem_reserve[]: 0 3014 3014 3014
Sep 4 00:40:31 localhost kernel: Node 0 DMA32 free:2593336kB min:7012kB
low:8764kB high:10516kB active:294880kB inactive:130384kB
present:3086500kB pages_scanned:0 all_unreclaimable? no
Sep 4 00:40:31 localhost kernel: lowmem_reserve[]: 0 0 0 0
Sep 4 00:40:31 localhost kernel: Node 0 Normal free:0kB min:0kB low:0kB
high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Sep 4 00:40:31 localhost kernel: lowmem_reserve[]: 0 0 0 0
Sep 4 00:40:31 localhost kernel: Node 0 HighMem free:0kB min:128kB
low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Sep 4 00:40:31 localhost kernel: lowmem_reserve[]: 0 0 0 0
Sep 4 00:40:31 localhost kernel: Node 0 DMA: 0*4kB 1*8kB 1*16kB 0*32kB
0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24kB
Sep 4 00:40:31 localhost kernel: Node 0 DMA32: 2506*4kB 19942*8kB
21602*16kB 16772*32kB 10139*64kB 4347*128kB 843*256kB 109*512kB
29*1024kB 11*2048kB 3*4096kB = 2593336kB
Sep 4 00:40:31 localhost kernel: Node 0 Normal: empty
Sep 4 00:40:31 localhost kernel: Node 0 HighMem: empty
Sep 4 00:40:31 localhost kernel: Swap cache: add 44, delete 44, find
0/0, race 0+0
Sep 4 00:40:31 localhost kernel: Free swap = 3855420kB
Sep 4 00:40:31 localhost kernel: Total swap = 3855592kB
Sep 4 00:40:31 localhost kernel: Free swap: 3855420kB
Sep 4 00:40:31 localhost kernel: 786416 pages of RAM
Sep 4 00:40:31 localhost kernel: 13657 reserved pages
Sep 4 00:40:31 localhost kernel: 120890 pages shared
Sep 4 00:40:31 localhost kernel: 0 pages swap cached
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 200185 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21634
(artsd).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 191734 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21638
(konqueror).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 165438 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21639
(konqueror).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 139152 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21640
(kwrite).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 124096 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21641
(konqueror).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 97760 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21642
(konqueror).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 71478 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21643
(konqueror).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21644
(gaim) score 58404 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21651
(gaim).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 45197 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21644
(gaim).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21623
(kdesktop) score 30577 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21623
(kdesktop).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 30482 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21823
(wrapper-gtk24.s).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21613
(kded) score 30037 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21613
(kded).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 29581 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21856
(klauncher).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21658
(knotify) score 28944 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21658
(knotify).
Sep 4 00:40:31 localhost kernel: Out of Memory: Kill process 21489
(FahCore_a0.exe) score 28370 and children.
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21489
(FahCore_a0.exe).
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21490
(FahCore_a0.exe).
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21491
(FahCore_a0.exe).
Sep 4 00:40:31 localhost kernel: Out of memory: Killed process 21492
(FahCore_a0.exe).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21625
(kicker) score 26543 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21625
(kicker).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21654
(klipper) score 22490 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21654
(klipper).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21636
(kaccess) score 21457 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21636
(kaccess).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21620
(ksmserver) score 21218 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21620
(ksmserver).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 19539 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21606
(kdeinit).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21609
(dcopserver) score 18792 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21609
(dcopserver).
Sep 4 00:40:32 localhost kernel: Out of Memory: Kill process 21853
(vmware) score 11787 and children.
Sep 4 00:40:32 localhost kernel: Out of memory: Killed process 21855
(vmware-vmx).
Sep 4 00:40:36 localhost kdm_greet[21878]: Can't open default user face
-------------------------------------------------------------------------------
The following appears later in the syslog:
-------------------------------------------------------------------------------
Sep 4 00:42:28 localhost kernel: DMA memory shortage. Temporarily
falling back on virtual DMA
Sep 4 00:43:26 localhost kernel: vmware: page allocation failure.
order:3, mode:0x10d1
Sep 4 00:43:26 localhost kernel:
Sep 4 00:43:26 localhost kernel: Call Trace:
<ffffffff8020e081>{__alloc_pages+680}
<ffffffff8023a566>{__get_free_pages+14}
Sep 4 00:43:26 localhost kernel:
<ffffffff88160d7d>{:floppy:floppy_open+295}
<ffffffff802b6590>{do_open+160}
Sep 4 00:43:26 localhost kernel: <ffffffff802b64e8>{bdget+276}
<ffffffff802b6ab2>{blkdev_open+0}
Sep 4 00:43:26 localhost kernel:
<ffffffff802b6ad3>{blkdev_open+33} <ffffffff8021cadf>{__dentry_open+217}
Sep 4 00:43:26 localhost kernel:
<ffffffff80225701>{do_filp_open+42}
<ffffffff80214751>{get_unused_fd+102}
Sep 4 00:43:26 localhost kernel:
<ffffffff8021859e>{do_sys_open+68} <ffffffff8025ccd2>{ia32_sysret+0}
Sep 4 00:43:26 localhost kernel: Mem-info:
Sep 4 00:43:26 localhost kernel: Node 0 DMA per-cpu:
Sep 4 00:43:26 localhost kernel: cpu 0 hot: high 0, batch 1 used:0
Sep 4 00:43:26 localhost kernel: cpu 0 cold: high 0, batch 1 used:0
Sep 4 00:43:26 localhost kernel: Node 0 DMA32 per-cpu:
Sep 4 00:43:26 localhost kernel: cpu 0 hot: high 186, batch 31 used:159
Sep 4 00:43:26 localhost kernel: cpu 0 cold: high 62, batch 15 used:11
Sep 4 00:43:26 localhost kernel: Node 0 Normal per-cpu: empty
Sep 4 00:43:26 localhost kernel: Node 0 HighMem per-cpu: empty
Sep 4 00:43:26 localhost kernel: Free pages: 2313696kB (0kB
HighMem)
Sep 4 00:43:26 localhost kernel: Active:126170 inactive:48987 dirty:596
writeback:0 unstable:0 free:578424 slab:9283 mapped:115120
pagetables:2412
Sep 4 00:43:26 localhost kernel: Node 0 DMA free:32kB min:24kB low:28kB
high:36kB active:0kB inactive:0kB present:11292kB pages_scanned:0
all_unreclaimable? yes
Sep 4 00:43:26 localhost kernel: lowmem_reserve[]: 0 3014 3014 3014
Sep 4 00:43:26 localhost kernel: Node 0 DMA32 free:2313664kB min:7012kB
low:8764kB high:10516kB active:504680kB inactive:195948kB
present:3086500kB pages_scanned:0 all_unreclaimable? no
-------------------------------------------------------------------------------
I am hoping this is the right place to submit this information. Given
as the problem only seems to be reproducable with VMWare (thus requiring
a tainted kernel), I am submitting here instead of LKML. Removal of the
extra 1Gb of DDR (reducing the system to 2Gb again) fixes the problem;
at 2Gb, VMWare runs properly again. The problem is not unique to a
specific VM: two VMs (one Windows 2000, one Debian IA32 Sarge) both
exhibited the same behavior, as did a new one.
I would not mark this problem with such high severity except that the
following output to the syslog halted my tinkering for the evening:
-------------------------------------------------------------------------------
Sep 4 01:06:01 localhost kernel: md: syncing RAID array md0
Sep 4 01:06:01 localhost kernel: md: minimum _guaranteed_
reconstruction speed: 1000 KB/sec/disc.
Sep 4 01:06:01 localhost kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Sep 4 01:06:01 localhost kernel: md: using 128k window, over a total of
312571136 blocks.
-------------------------------------------------------------------------------
This happened immediately after my fourth X session crash as a result of
starting VMWare. In this case, whatever memory problem is arising
apparently caused my RAID5 to go out of sync. mdadm --detail /dev/md0
reported the array's status as "clean, resyncing". I did not feel
comfortable proceeding until the array reconstruction had completed.
The following is the output of a few potentially helpful commands from
my system:
-------------------------------------------------------------------------------
root@nestor:~# free
total used free shared buffers
cached
Mem: 3091600 1342420 1749180 0 7528
834596
-/+ buffers/cache: 500296 2591304
Swap: 3855592 116 3855476
root@nestor:~# cat /proc/meminfo
MemTotal: 3091600 kB
MemFree: 1740884 kB
Buffers: 7884 kB
Cached: 840740 kB
SwapCached: 16 kB
Active: 803144 kB
Inactive: 457132 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 3091600 kB
LowFree: 1740884 kB
SwapTotal: 3855592 kB
SwapFree: 3855476 kB
Dirty: 112 kB
Writeback: 0 kB
Mapped: 543992 kB
Slab: 50144 kB
CommitLimit: 5401392 kB
Committed_AS: 829868 kB
PageTables: 9344 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 298580 kB
VmallocChunk: 34359432699 kB
root@nestor:~# cat /proc/vmstat
nr_dirty 121
nr_writeback 0
nr_unstable 0
nr_page_table_pages 2338
nr_mapped 129519
nr_slab 12908
pgpgin 79491310
pgpgout 58817848
pswpin 18
pswpout 44
pgalloc_high 0
pgalloc_normal 0
pgalloc_dma32 67361665
pgalloc_dma 590
pgfree 67755244
pgactivate 4297035
pgdeactivate 2885886
pgfault 34368037
pgmajfault 46515
pgrefill_high 0
pgrefill_normal 0
pgrefill_dma32 8859331
pgrefill_dma 0
pgsteal_high 0
pgsteal_normal 0
pgsteal_dma32 18486611
pgsteal_dma 0
pgscan_kswapd_high 0
pgscan_kswapd_normal 0
pgscan_kswapd_dma32 21019296
pgscan_kswapd_dma 0
pgscan_direct_high 0
pgscan_direct_normal 0
pgscan_direct_dma32 1526688
pgscan_direct_dma 0
pginodesteal 526399
slabs_scanned 874310528
kswapd_steal 17458257
kswapd_inodesteal 10148069
pageoutrun 149720
allocstall 6287
pgrotated 2698
nr_bounce 0
root@nestor:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 116 1568544 14812 1015812 0 0 899 665 99 176 15 4 77 4
2 0 116 1568544 14812 1015812 0 0 0 0 1539 2059 59 41 0 0
1 0 116 1568420 14812 1015812 0 0 4 0 1416 2223 61 39 0 0
-------------------------------------------------------------------------------
More information available upon request. I'm not overly familiar with
the bug reporting process, so my e-mail is zep01@bahj.com (just in case
more information is needed and I'm not responding to questions or
somesuch).
Cheers!
-- System Information:
Debian Release: testing/unstable
APT prefers testing
APT policy: (600, 'testing'), (500, 'stable'), (400, 'unstable')
Architecture: amd64 (x86_64)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.17-2-amd64
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1)
Versions of packages linux-image-2.6.17-2-amd64 depends on:
ii e2fsprogs 1.39-1 ext2 file system utilities and lib
ii initramfs-tools [linux-initra 0.76 tools for generating an initramfs
ii module-init-tools 3.2.2-3 tools for managing Linux kernel mo
linux-image-2.6.17-2-amd64 recommends no packages.
-- debconf information:
linux-image-2.6.17-2-amd64/postinst/kimage-is-a-directory:
shared/kernel-image/really-run-bootloader: true
linux-image-2.6.17-2-amd64/preinst/abort-overwrite-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/preinst/overwriting-modules-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/postinst/old-system-map-link-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/preinst/failed-to-move-modules-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/preinst/abort-install-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/postinst/depmod-error-2.6.17-2-amd64: false
linux-image-2.6.17-2-amd64/preinst/elilo-initrd-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/postinst/bootloader-error-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/prerm/would-invalidate-boot-loader-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/preinst/lilo-has-ramdisk:
linux-image-2.6.17-2-amd64/postinst/old-dir-initrd-link-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/preinst/lilo-initrd-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/preinst/already-running-this-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/postinst/old-initrd-link-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/postinst/depmod-error-initrd-2.6.17-2-amd64: false
linux-image-2.6.17-2-amd64/postinst/create-kimage-link-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/prerm/removing-running-kernel-2.6.17-2-amd64: true
linux-image-2.6.17-2-amd64/postinst/bootloader-test-error-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/preinst/initrd-2.6.17-2-amd64:
linux-image-2.6.17-2-amd64/preinst/bootloader-initrd-2.6.17-2-amd64: true
--- End Message ---
--- Begin Message ---
On Mon, Sep 04, 2006 at 09:47:20PM +0000, Zachary Palmer wrote:
> Upon doing so, the VMWare GUI
> launches properly but starting a VM crashes the entire X session and a
> number of other programs with the following output to syslog:
We can't support tainted kernels. You have to reproduce this without
vmware modules loaded.
Bastian
--
A princess should not be afraid -- not with a brave knight to protect her.
-- McCoy, "Shore Leave", stardate 3025.3
--- End Message ---