[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#386056: marked as done (linux-image-2.6.17-2-amd64: Out of Memory syslog kernel message with more than 2Gb of RAM)



Your message dated Tue, 5 Sep 2006 00:25:04 +0200
with message-id <20060904222504.GB4259@wavehammer.waldi.eu.org>
and subject line Bug#386056: linux-image-2.6.17-2-amd64: Out of Memory syslog kernel message with more than 2Gb of RAM
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
Package: linux-image-2.6.17-2-amd64
Version: 2.6.17-8
Severity: critical
Justification: breaks unrelated software

The specifications of the machine in question before the bug was found:
    AMD64 3700+ processor (single core)
    ASUS A8N-SLI deluxe motherboard
    2Gb of 333MHz DDR in two 1Gb sticks
    Debian Etch (dist-upgrade as of August 31)
    Debian stock kernel 2.6.17-2-amd64
    VMWare 5.5 using Any-to-Any patch, installed and running properly

Everything was happy until I added another 1Gb of RAM to the machine (in
the form of two 512Mb 400MHz DDR sticks).  Upon doing so, the VMWare GUI
launches properly but starting a VM crashes the entire X session and a
number of other programs with the following output to syslog:

-------------------------------------------------------------------------------
Sep  4 00:40:31 localhost kernel: Active:73720 inactive:32596 dirty:683
writeback:0 unstable:0 free:648340 slab:9099 mapped:57706
pagetables:1790
Sep  4 00:40:31 localhost kernel: Node 0 DMA free:24kB min:24kB low:28kB
high:36kB active:0kB inactive:0kB present:11292kB pages_scanned:0
all_unreclaimable? yes
Sep  4 00:40:31 localhost kernel: lowmem_reserve[]: 0 3014 3014 3014
Sep  4 00:40:31 localhost kernel: Node 0 DMA32 free:2593336kB min:7012kB
low:8764kB high:10516kB active:294880kB inactive:130384kB
present:3086500kB pages_scanned:0 all_unreclaimable? no
Sep  4 00:40:31 localhost kernel: lowmem_reserve[]: 0 0 0 0
Sep  4 00:40:31 localhost kernel: Node 0 Normal free:0kB min:0kB low:0kB
high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Sep  4 00:40:31 localhost kernel: lowmem_reserve[]: 0 0 0 0
Sep  4 00:40:31 localhost kernel: Node 0 HighMem free:0kB min:128kB
low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Sep  4 00:40:31 localhost kernel: lowmem_reserve[]: 0 0 0 0
Sep  4 00:40:31 localhost kernel: Node 0 DMA: 0*4kB 1*8kB 1*16kB 0*32kB
0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 24kB
Sep  4 00:40:31 localhost kernel: Node 0 DMA32: 2506*4kB 19942*8kB
21602*16kB 16772*32kB 10139*64kB 4347*128kB 843*256kB 109*512kB
29*1024kB 11*2048kB 3*4096kB = 2593336kB
Sep  4 00:40:31 localhost kernel: Node 0 Normal: empty
Sep  4 00:40:31 localhost kernel: Node 0 HighMem: empty
Sep  4 00:40:31 localhost kernel: Swap cache: add 44, delete 44, find
0/0, race 0+0
Sep  4 00:40:31 localhost kernel: Free swap  = 3855420kB
Sep  4 00:40:31 localhost kernel: Total swap = 3855592kB
Sep  4 00:40:31 localhost kernel: Free swap:       3855420kB
Sep  4 00:40:31 localhost kernel: 786416 pages of RAM
Sep  4 00:40:31 localhost kernel: 13657 reserved pages
Sep  4 00:40:31 localhost kernel: 120890 pages shared
Sep  4 00:40:31 localhost kernel: 0 pages swap cached
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 200185 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21634
(artsd).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 191734 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21638
(konqueror).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 165438 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21639
(konqueror).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 139152 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21640
(kwrite).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 124096 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21641
(konqueror).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 97760 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21642
(konqueror).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 71478 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21643
(konqueror).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21644
(gaim) score 58404 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21651
(gaim).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 45197 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21644
(gaim).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21623
(kdesktop) score 30577 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21623
(kdesktop).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 30482 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21823
(wrapper-gtk24.s).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21613
(kded) score 30037 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21613
(kded).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 29581 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21856
(klauncher).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21658
(knotify) score 28944 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21658
(knotify).
Sep  4 00:40:31 localhost kernel: Out of Memory: Kill process 21489
(FahCore_a0.exe) score 28370 and children.
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21489
(FahCore_a0.exe).
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21490
(FahCore_a0.exe).
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21491
(FahCore_a0.exe).
Sep  4 00:40:31 localhost kernel: Out of memory: Killed process 21492
(FahCore_a0.exe).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21625
(kicker) score 26543 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21625
(kicker).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21654
(klipper) score 22490 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21654
(klipper).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21636
(kaccess) score 21457 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21636
(kaccess).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21620
(ksmserver) score 21218 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21620
(ksmserver).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21606
(kdeinit) score 19539 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21606
(kdeinit).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21609
(dcopserver) score 18792 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21609
(dcopserver).
Sep  4 00:40:32 localhost kernel: Out of Memory: Kill process 21853
(vmware) score 11787 and children.
Sep  4 00:40:32 localhost kernel: Out of memory: Killed process 21855
(vmware-vmx).
Sep  4 00:40:36 localhost kdm_greet[21878]: Can't open default user face
-------------------------------------------------------------------------------

The following appears later in the syslog:

-------------------------------------------------------------------------------
Sep  4 00:42:28 localhost kernel: DMA memory shortage. Temporarily
falling back on virtual DMA
Sep  4 00:43:26 localhost kernel: vmware: page allocation failure.
order:3, mode:0x10d1
Sep  4 00:43:26 localhost kernel:
Sep  4 00:43:26 localhost kernel: Call Trace:
<ffffffff8020e081>{__alloc_pages+680}
<ffffffff8023a566>{__get_free_pages+14}
Sep  4 00:43:26 localhost kernel:
<ffffffff88160d7d>{:floppy:floppy_open+295}
<ffffffff802b6590>{do_open+160}
Sep  4 00:43:26 localhost kernel:        <ffffffff802b64e8>{bdget+276}
<ffffffff802b6ab2>{blkdev_open+0}
Sep  4 00:43:26 localhost kernel:
<ffffffff802b6ad3>{blkdev_open+33} <ffffffff8021cadf>{__dentry_open+217}
Sep  4 00:43:26 localhost kernel:
<ffffffff80225701>{do_filp_open+42}
<ffffffff80214751>{get_unused_fd+102}
Sep  4 00:43:26 localhost kernel:
<ffffffff8021859e>{do_sys_open+68} <ffffffff8025ccd2>{ia32_sysret+0}
Sep  4 00:43:26 localhost kernel: Mem-info:
Sep  4 00:43:26 localhost kernel: Node 0 DMA per-cpu:
Sep  4 00:43:26 localhost kernel: cpu 0 hot: high 0, batch 1 used:0
Sep  4 00:43:26 localhost kernel: cpu 0 cold: high 0, batch 1 used:0
Sep  4 00:43:26 localhost kernel: Node 0 DMA32 per-cpu:
Sep  4 00:43:26 localhost kernel: cpu 0 hot: high 186, batch 31 used:159
Sep  4 00:43:26 localhost kernel: cpu 0 cold: high 62, batch 15 used:11
Sep  4 00:43:26 localhost kernel: Node 0 Normal per-cpu: empty
Sep  4 00:43:26 localhost kernel: Node 0 HighMem per-cpu: empty
Sep  4 00:43:26 localhost kernel: Free pages:     2313696kB (0kB
HighMem)
Sep  4 00:43:26 localhost kernel: Active:126170 inactive:48987 dirty:596
writeback:0 unstable:0 free:578424 slab:9283 mapped:115120
pagetables:2412
Sep  4 00:43:26 localhost kernel: Node 0 DMA free:32kB min:24kB low:28kB
high:36kB active:0kB inactive:0kB present:11292kB pages_scanned:0
all_unreclaimable? yes
Sep  4 00:43:26 localhost kernel: lowmem_reserve[]: 0 3014 3014 3014
Sep  4 00:43:26 localhost kernel: Node 0 DMA32 free:2313664kB min:7012kB
low:8764kB high:10516kB active:504680kB inactive:195948kB
present:3086500kB pages_scanned:0 all_unreclaimable? no
-------------------------------------------------------------------------------

I am hoping this is the right place to submit this information.  Given
as the problem only seems to be reproducable with VMWare (thus requiring
a tainted kernel), I am submitting here instead of LKML.  Removal of the
extra 1Gb of DDR (reducing the system to 2Gb again) fixes the problem;
at 2Gb, VMWare runs properly again.  The problem is not unique to a
specific VM: two VMs (one Windows 2000, one Debian IA32 Sarge) both
exhibited the same behavior, as did a new one.

I would not mark this problem with such high severity except that the
following output to the syslog halted my tinkering for the evening:

-------------------------------------------------------------------------------
Sep  4 01:06:01 localhost kernel: md: syncing RAID array md0
Sep  4 01:06:01 localhost kernel: md: minimum _guaranteed_
reconstruction speed: 1000 KB/sec/disc.
Sep  4 01:06:01 localhost kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Sep  4 01:06:01 localhost kernel: md: using 128k window, over a total of
312571136 blocks.
-------------------------------------------------------------------------------

This happened immediately after my fourth X session crash as a result of
starting VMWare.  In this case, whatever memory problem is arising
apparently caused my RAID5 to go out of sync.  mdadm --detail /dev/md0
reported the array's status as "clean, resyncing".  I did not feel
comfortable proceeding until the array reconstruction had completed.

The following is the output of a few potentially helpful commands from
my system:

-------------------------------------------------------------------------------
root@nestor:~# free
             total       used       free     shared    buffers
cached
Mem:       3091600    1342420    1749180          0       7528
834596
-/+ buffers/cache:     500296    2591304
Swap:      3855592        116    3855476
root@nestor:~# cat /proc/meminfo
MemTotal:      3091600 kB
MemFree:       1740884 kB
Buffers:          7884 kB
Cached:         840740 kB
SwapCached:         16 kB
Active:         803144 kB
Inactive:       457132 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3091600 kB
LowFree:       1740884 kB
SwapTotal:     3855592 kB
SwapFree:      3855476 kB
Dirty:             112 kB
Writeback:           0 kB
Mapped:         543992 kB
Slab:            50144 kB
CommitLimit:   5401392 kB
Committed_AS:   829868 kB
PageTables:       9344 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    298580 kB
VmallocChunk: 34359432699 kB
root@nestor:~# cat /proc/vmstat
nr_dirty 121
nr_writeback 0
nr_unstable 0
nr_page_table_pages 2338
nr_mapped 129519
nr_slab 12908
pgpgin 79491310
pgpgout 58817848
pswpin 18
pswpout 44
pgalloc_high 0
pgalloc_normal 0
pgalloc_dma32 67361665
pgalloc_dma 590
pgfree 67755244
pgactivate 4297035
pgdeactivate 2885886
pgfault 34368037
pgmajfault 46515
pgrefill_high 0
pgrefill_normal 0
pgrefill_dma32 8859331
pgrefill_dma 0
pgsteal_high 0
pgsteal_normal 0
pgsteal_dma32 18486611
pgsteal_dma 0
pgscan_kswapd_high 0
pgscan_kswapd_normal 0
pgscan_kswapd_dma32 21019296
pgscan_kswapd_dma 0
pgscan_direct_high 0
pgscan_direct_normal 0
pgscan_direct_dma32 1526688
pgscan_direct_dma 0
pginodesteal 526399
slabs_scanned 874310528
kswapd_steal 17458257
kswapd_inodesteal 10148069
pageoutrun 149720
allocstall 6287
pgrotated 2698
nr_bounce 0
root@nestor:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0    116 1568544  14812 1015812  0    0   899   665   99  176 15  4 77  4
 2  0    116 1568544  14812 1015812  0    0     0     0 1539 2059 59 41  0  0
 1  0    116 1568420  14812 1015812  0    0     4     0 1416 2223 61 39  0  0
-------------------------------------------------------------------------------

More information available upon request.  I'm not overly familiar with
the bug reporting process, so my e-mail is zep01@bahj.com (just in case
more information is needed and I'm not responding to questions or
somesuch).

Cheers!

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (600, 'testing'), (500, 'stable'), (400, 'unstable')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.17-2-amd64
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1)

Versions of packages linux-image-2.6.17-2-amd64 depends on:
ii  e2fsprogs                     1.39-1     ext2 file system utilities and lib
ii  initramfs-tools [linux-initra 0.76       tools for generating an initramfs
ii  module-init-tools             3.2.2-3    tools for managing Linux kernel mo

linux-image-2.6.17-2-amd64 recommends no packages.

-- debconf information:
  linux-image-2.6.17-2-amd64/postinst/kimage-is-a-directory:
  shared/kernel-image/really-run-bootloader: true
  linux-image-2.6.17-2-amd64/preinst/abort-overwrite-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/preinst/overwriting-modules-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/postinst/old-system-map-link-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/preinst/failed-to-move-modules-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/preinst/abort-install-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/postinst/depmod-error-2.6.17-2-amd64: false
  linux-image-2.6.17-2-amd64/preinst/elilo-initrd-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/postinst/bootloader-error-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/prerm/would-invalidate-boot-loader-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/preinst/lilo-has-ramdisk:
  linux-image-2.6.17-2-amd64/postinst/old-dir-initrd-link-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/preinst/lilo-initrd-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/preinst/already-running-this-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/postinst/old-initrd-link-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/postinst/depmod-error-initrd-2.6.17-2-amd64: false
  linux-image-2.6.17-2-amd64/postinst/create-kimage-link-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/prerm/removing-running-kernel-2.6.17-2-amd64: true
  linux-image-2.6.17-2-amd64/postinst/bootloader-test-error-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/preinst/initrd-2.6.17-2-amd64:
  linux-image-2.6.17-2-amd64/preinst/bootloader-initrd-2.6.17-2-amd64: true


--- End Message ---
--- Begin Message ---
On Mon, Sep 04, 2006 at 09:47:20PM +0000, Zachary Palmer wrote:
>                                            Upon doing so, the VMWare GUI
> launches properly but starting a VM crashes the entire X session and a
> number of other programs with the following output to syslog:

We can't support tainted kernels. You have to reproduce this without
vmware modules loaded.

Bastian

-- 
A princess should not be afraid -- not with a brave knight to protect her.
		-- McCoy, "Shore Leave", stardate 3025.3

--- End Message ---

Reply to: