
Re: Lost interrupt, page allocation failure, and kernel oops



On Sat, 1 Apr 2006, Kaspar Fischer wrote:

> I see that I have 190MB RAM, and running memtest on some 160MB
> of these works fine:
> 
> bumbum:/tmp/ramtest/memtester-4.0.5# ./memtester 160
> memtester version 4.0.5 (32-bit)
> Copyright (C) 2005 Charles Cazabon.
> Licensed under the GNU General Public License version 2 (only).
> pagesize is 4096
> pagesizemask is 0xfffff000
> want 160MB (167772160 bytes)
> got  160MB (167772160 bytes), trying mlock ...locked.
> Loop 1:
>  Stuck Address       : ok
>  Random Value        : ok
>  Compare XOR         : ok
>  Compare SUB         : ok
>  Compare MUL         : ok
>  Compare DIV         : ok
>  Compare OR          : ok
>  Compare AND         : ok
>  Sequential Increment: ok
>  Solid Bits          : ok
> ...
> 
> However, when I do a "./memtester 190", memtester (again) tries
> to mlock the memory -- and this does fail with the oom-killer
> being invoked. Is this normal? I have 560MB swap (190MB used
> according to top), so I see no reason why oom-killer should
> come into play (and kill my ssh sessions). Or is this normal?
> (I'd have expected pages to be swapped out or mlock() to fail,
> but not something as drastic as killing my ssh sessions...)

 This seems normal :-(  Memtester is an ordinary application, so at best
it can only test whatever memory is left over by the kernel, your login
shell (and ssh), and the loaded libraries.  Also, the amount of memory it
uses (for buffers or arrays, I suppose) varies through the suite - I've
seen it start ok, then get cancelled later because it couldn't get the
memory it needed.

 Memtester will happily use swap space, but that seems pointless.  I tried
it while running X, icewm, and 'top' after adding more memory to my box
(2GB): top updated erratically, and during some of the tests the clock in
the task bar would not update for minutes at a time.

 What you want to test is the *physical* memory.  I think it is even
possible that a test to see if memory "decays" could be nullified by an
intervening write to disk.  So, turn swap off.  After that, try using a
little less memory than is shown as free.  Once you find an amount of
memory that lets the tests start, keep an eye on it - every time it runs
out of memory, reduce the size and repeat.  For serious use, single-user
mode is preferred if you can log in like that, and do everything you can
to reduce your login's footprint on the system memory.
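
 As a rough illustration of the "reduce the size and repeat" procedure
above, here is a minimal Python sketch.  It is only an assumption about how
one might automate this, not something memtester provides: the helper names
(free_mb, candidate_sizes, run_memtester, largest_working_size), the 8MB
margin and step, and the 16MB floor are all made up for the example.  It
also treats any non-zero exit as "try a smaller size", which cannot tell
running out of memory apart from a real memory fault, so you still need to
read memtester's own output.

```python
import subprocess


def free_mb(meminfo_text):
    """Return the MemFree value from /proc/meminfo-style text, in whole MB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            # The line looks like "MemFree:   176128 kB".
            return int(line.split()[1]) // 1024
    raise ValueError("no MemFree line found")


def candidate_sizes(start_mb, step_mb=8, floor_mb=16):
    """Yield sizes to try, shrinking by step_mb until floor_mb is reached."""
    size = start_mb
    while size >= floor_mb:
        yield size
        size -= step_mb


def run_memtester(size_mb, loops=1):
    """Run ./memtester once; True only if it exits with status 0.

    Assumes swap is already off (e.g. 'swapoff -a' as root) and that the
    memtester binary is in the current directory, as in the quoted session.
    """
    return subprocess.call(["./memtester", str(size_mb), str(loops)]) == 0


def largest_working_size(start_mb, try_size=run_memtester, **kwargs):
    """Return the first (largest) size for which try_size succeeds, or None."""
    for size in candidate_sizes(start_mb, **kwargs):
        if try_size(size):
            return size
    return None


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        # Start a little below what is currently shown as free (margin is a guess).
        start = free_mb(f.read()) - 8
    print("largest size that completed:", largest_working_size(start))
```

 The try_size argument is there only so the search can be exercised without
running memtester at all, for example with a lambda that pretends sizes
above some limit fail.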

 If anybody has a better way of using it to test as much as possible,
I'm all ears!  Perhaps a slimmed-down kernel, memtester built against
something like uClibc, and a script that tries to determine a safe
amount of memory to test (with init pointing to the script).  That
wouldn't help in your case, where you need to run ssh, and it would
probably need a machine-specific kernel.  Unfortunately, people on x86 and
x86_64 are spoiled when it comes to memory testing!

Ken
-- 
the first time as tragedy, the second time as farce
