
Re: Lost interrupt, page allocation failure, and kernel oops



First of all, thanks a lot to Rick and Thomas for your advice
and background information!

On 18.03.2006, at 07:03, Rick Thomas wrote:
> On Mar 17, 2006, at 9:49 PM, Kaspar Fischer wrote:
>> On 17.03.2006, at 19:35, Michael Schmitz wrote:
>>> Bad RAM, perhaps? Or other hardware dying?
>>
>> As to RAM, how can I test it? http://www.memtest86.com/ seems
>> to be for Intel architectures only.
>
> I wish I knew.

In the meantime I have found one:

  http://pyropus.ca/software/memtester/

And it seems that something *is* strange on my system. Looking
into /proc/meminfo,

bumbum:/tmp/ramtest# cat /proc/meminfo
MemTotal:       191464 kB
MemFree:          1568 kB
Buffers:           472 kB
Cached:          11248 kB
SwapCached:     165732 kB
Active:          98064 kB
Inactive:        79512 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       191464 kB
LowFree:          1568 kB
SwapTotal:      560532 kB
SwapFree:       370248 kB
Dirty:              76 kB
Writeback:           0 kB
Mapped:         169872 kB
Slab:             7116 kB
Committed_AS:   296188 kB
PageTables:       1388 kB
VmallocTotal:   793468 kB
VmallocUsed:     19664 kB
VmallocChunk:   773356 kB
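
As a side note, the figures discussed below can be read off this dump
mechanically. What follows is only a sketch in Python: the helper names
parse_meminfo and fits_in_ram are made up for illustration, SAMPLE is an
excerpt of the dump above, and memtester's "MB" is assumed to mean 2^20
bytes (its own output reports 160MB as 167772160 bytes).

```python
import re


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to values in kB."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def fits_in_ram(request_mib, info):
    """True if a lock request of request_mib MiB is no larger than MemTotal."""
    return request_mib * 1024 <= info["MemTotal"]


# Excerpt of the dump above (hypothetical variable name).
SAMPLE = """\
MemTotal:       191464 kB
MemFree:          1568 kB
SwapTotal:      560532 kB
SwapFree:       370248 kB
"""

info = parse_meminfo(SAMPLE)
swap_used_kb = info["SwapTotal"] - info["SwapFree"]  # swap currently in use
mem_total_mib = info["MemTotal"] / 1024              # usable RAM in MiB

print("swap used (kB):", swap_used_kb)
print("usable RAM (MiB):", round(mem_total_mib, 1))
for request in (160, 190):  # the two sizes tried further down
    print(request, "MiB fits in MemTotal:", fits_in_ram(request, info))
```

On these numbers a 190MB request (194560 kB) is larger than MemTotal
(191464 kB). Locked pages have to stay resident, so such a request could
never be held entirely in RAM, however much swap is free.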

I see that I have about 190MB of RAM, and running memtester on some
160MB of it works fine:

bumbum:/tmp/ramtest/memtester-4.0.5# ./memtester 160
memtester version 4.0.5 (32-bit)
Copyright (C) 2005 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 160MB (167772160 bytes)
got  160MB (167772160 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
...

However, when I do a "./memtester 190", memtester (again) tries
to mlock the memory -- and this time it fails, with the oom-killer
being invoked. Is this normal? I have 560MB of swap (190MB used
according to top), so I see no reason why the oom-killer should
come into play and kill my ssh sessions. I'd have expected pages
to be swapped out or mlock() to fail, but not something as drastic
as killing my ssh sessions.
Here is the precise log:

Apr  1 17:52:32 bumbum -- MARK --
Apr  1 18:12:32 bumbum -- MARK --
# That's where I did ./memtester 190
Apr  1 18:15:28 bumbum kernel: oom-killer: gfp_mask=0xd2
Apr  1 18:15:29 bumbum kernel: DMA per-cpu:
Apr  1 18:15:29 bumbum kernel: cpu 0 hot: low 24, high 72, batch 12
Apr  1 18:15:29 bumbum kernel: cpu 0 cold: low 0, high 24, batch 12
Apr  1 18:15:29 bumbum kernel: Normal per-cpu: empty
Apr  1 18:15:29 bumbum kernel: HighMem per-cpu: empty
Apr  1 18:15:29 bumbum kernel:
Apr  1 18:15:29 bumbum kernel: Free pages:         864kB (0kB HighMem)
Apr  1 18:15:29 bumbum kernel: Active:22360 inactive:22107 dirty:0 writeback:0 unstable:0 free:216 slab:1863 mapped:44397 pagetables:340
Apr  1 18:15:29 bumbum kernel: DMA free:864kB min:440kB low:880kB high:1320kB active:89440kB inactive:88428kB present:196608kB
Apr  1 18:15:29 bumbum kernel: protections[]: 220 220 220
Apr  1 18:15:29 bumbum kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
Apr  1 18:15:29 bumbum kernel: protections[]: 0 0 0
Apr  1 18:15:29 bumbum kernel: HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
Apr  1 18:15:29 bumbum kernel: protections[]: 0 0 0
Apr  1 18:15:29 bumbum kernel: DMA: 0*4kB 0*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 864kB
Apr  1 18:15:29 bumbum kernel: Normal: empty
Apr  1 18:15:29 bumbum kernel: HighMem: empty
Apr  1 18:15:29 bumbum kernel: Swap cache: add 49457, delete 6085, find 54/90, race 0+0
Apr  1 18:15:29 bumbum kernel: oom-killer: gfp_mask=0x1d2
Apr  1 18:15:29 bumbum kernel: DMA per-cpu:
Apr  1 18:15:29 bumbum kernel: cpu 0 hot: low 24, high 72, batch 12
Apr  1 18:15:29 bumbum kernel: cpu 0 cold: low 0, high 24, batch 12
Apr  1 18:15:29 bumbum kernel: Normal per-cpu: empty
Apr  1 18:15:29 bumbum kernel: HighMem per-cpu: empty
Apr  1 18:15:29 bumbum kernel:
Apr  1 18:15:29 bumbum kernel: Free pages:         864kB (0kB HighMem)
Apr  1 18:15:29 bumbum kernel: Active:21944 inactive:22480 dirty:0 writeback:0 unstable:0 free:216 slab:1837 mapped:44397 pagetables:334
Apr  1 18:15:29 bumbum kernel: DMA free:864kB min:440kB low:880kB high:1320kB active:87776kB inactive:89920kB present:196608kB
Apr  1 18:15:29 bumbum kernel: protections[]: 220 220 220
Apr  1 18:15:29 bumbum kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
Apr  1 18:15:29 bumbum kernel: protections[]: 0 0 0
Apr  1 18:15:29 bumbum kernel: HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
Apr  1 18:15:29 bumbum kernel: protections[]: 0 0 0
Apr  1 18:15:29 bumbum kernel: DMA: 0*4kB 0*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 864kB
Apr  1 18:15:29 bumbum kernel: Normal: empty
Apr  1 18:15:29 bumbum kernel: HighMem: empty
Apr  1 18:15:29 bumbum kernel: Swap cache: add 49485, delete 6103, find 55/97, race 0+0
Apr  1 18:15:29 bumbum kernel: oom-killer: gfp_mask=0x1d2
Apr  1 18:15:29 bumbum kernel: DMA per-cpu:
Apr  1 18:15:29 bumbum kernel: cpu 0 hot: low 24, high 72, batch 12
Apr  1 18:15:29 bumbum kernel: cpu 0 cold: low 0, high 24, batch 12
Apr  1 18:15:29 bumbum kernel: Normal per-cpu: empty
Apr  1 18:15:29 bumbum kernel: HighMem per-cpu: empty
Apr  1 18:15:29 bumbum kernel:
Apr  1 18:15:29 bumbum kernel: Free pages:         864kB (0kB HighMem)
Apr  1 18:15:29 bumbum kernel: Active:24778 inactive:19757 dirty:0 writeback:0 unstable:0 free:216 slab:1817 mapped:44402 pagetables:328
Apr  1 18:15:29 bumbum kernel: DMA free:864kB min:440kB low:880kB high:1320kB active:99112kB inactive:79028kB present:196608kB
Apr  1 18:15:29 bumbum kernel: protections[]: 220 220 220
Apr  1 18:15:29 bumbum kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
Apr  1 18:15:29 bumbum kernel: protections[]: 0 0 0
Apr  1 18:15:29 bumbum kernel: HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
Apr  1 18:15:29 bumbum kernel: protections[]: 0 0 0
Apr  1 18:15:29 bumbum kernel: DMA: 0*4kB 0*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 864kB
Apr  1 18:15:29 bumbum kernel: Normal: empty
Apr  1 18:15:29 bumbum kernel: HighMem: empty
Apr  1 18:15:29 bumbum kernel: Swap cache: add 49552, delete 6154, find 57/109, race 0+0
Apr  1 18:15:29 bumbum kernel: memtester: page allocation failure. order:0, mode:0xd2
Apr  1 18:15:29 bumbum kernel: Call trace:
Apr  1 18:15:29 bumbum kernel:  [c000ba7c] dump_stack+0x18/0x28
Apr  1 18:15:29 bumbum kernel:  [c003f1ec] __alloc_pages+0x324/0x388
Apr  1 18:15:29 bumbum kernel:  [c004ca18] do_anonymous_page+0x98/0x4a8
Apr  1 18:15:29 bumbum kernel:  [c004ce98] do_no_page+0x70/0x794
Apr  1 18:15:29 bumbum kernel:  [c004d824] handle_mm_fault+0xfc/0x1ec
Apr  1 18:15:29 bumbum kernel:  [c004aeec] get_user_pages+0x12c/0x50c
Apr  1 18:15:29 bumbum kernel:  [c004d99c] make_pages_present+0x7c/0xa8
Apr  1 18:15:29 bumbum kernel:  [c004dfd4] mlock_fixup+0xcc/0xe0
Apr  1 18:15:29 bumbum kernel:  [c004e104] do_mlock+0x11c/0x120
Apr  1 18:15:29 bumbum kernel:  [c004e1dc] sys_mlock+0xd4/0xe4
Apr  1 18:15:29 bumbum kernel:  [c0007d50] ret_from_syscall+0x0/0x4c
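
The numbers in the log above can be cross-checked with a little
arithmetic. This is a rough sketch in Python, assuming that the counters
on the "Active: ... free: ... mapped: ..." line are in pages of 4 kB
(memtester reported "pagesize is 4096"). The helper names pages_to_kb and
buddy_total_kb are hypothetical, not part of any tool.

```python
import re

PAGE_KB = 4  # assumption: 4096-byte pages, as reported by memtester


def pages_to_kb(pages):
    """Convert a page count from the oom-killer summary line to kB."""
    return pages * PAGE_KB


def buddy_total_kb(line):
    """Sum the 'COUNT*SIZEkB' entries of a per-zone free-list line."""
    return sum(int(count) * int(size)
               for count, size in re.findall(r"(\d+)\*(\d+)kB", line))


# The DMA free-list line from the log above, without the syslog prefix.
DMA_LINE = ("DMA: 0*4kB 0*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB "
            "1*512kB 0*1024kB 0*2048kB 0*4096kB = 864kB")

free_kb = pages_to_kb(216)        # free:216 from the summary line
active_kb = pages_to_kb(22360)    # Active:22360 from the first report
mapped_kb = pages_to_kb(44397)    # mapped:44397 from the first report

print("free (kB):", free_kb)
print("free lists add up to (kB):", buddy_total_kb(DMA_LINE))
print("active (kB):", active_kb)
print("mapped (MiB):", round(mapped_kb / 1024, 1))
```

The 216 free pages agree with the 864kB in both the "DMA free:" line and
the free-list line, and the 22360 active pages agree with active:89440kB.
The mapped count works out to roughly 173MB. Presumably most of that is
memtester's buffer, but that is a guess and not something the log states.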

Am I on a new track here, or is this just "normal behaviour"
which has nothing to do with my initial problems?

(I just can't manage to reproduce the original problem I had.
I have tried several times so far -- but I have to admit that
I have not yet had the courage to do it with my hard disk
mounted read/write, as was the case when the problem first
popped up...)

Many thanks,
Kaspar


