
Re: cgroup OOM killer loop causes system to lockup (possible fix included)



Thanks for the response. I have sent this across to the guys at openssh-server.

Although, I did check the openssh source code myself, and from what I could tell, everything was being done correctly.

I have a feeling there's going to be a lot of 'buck passing' on this one :(

Cal

On 30/05/2011 21:25, Ben Hutchings wrote:
On Mon, 2011-05-30 at 21:03 +0100, Cal Leeming [Simplicity Media Ltd]
wrote:
More strangeness..

If I keep the kernel module loaded, but disable the entry
in /etc/network/interfaces for eth0, the oom_adj problem disappears.
But then, of course, I'm left with no network interface. I then tried
configuring sshd (via /etc/ssh/sshd_config) to listen only on 127.0.0.1,
effectively bypassing the eth0 interface whilst still allowing it to be
loaded. But the problem still happens.
[...]

My guess is that sshd tries to protect itself against the OOM-killer so
that you can still log in to a system that has gone OOM.  If there is no
network available, it doesn't do this because you cannot log in remotely
anyway.

The bug seems to be that sshd does not reset the OOM adjustment before
running the login shell (or other program).  Therefore, please report a
bug against openssh-server.
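
To make the suspected bug concrete, here is a minimal sketch of the pattern a daemon would be expected to follow: remember the OOM adjustment it started with, protect itself, and put the original value back in the child before exec'ing the user's program. This is not openssh code. The function names and the `path` parameter are hypothetical and exist only for illustration; the sketch assumes the legacy /proc/self/oom_adj interface discussed in this thread, where -17 disables OOM killing for the process.

```python
import os

# Legacy per-process OOM knob discussed in this thread (assumption: the
# kernel still exposes it; newer kernels prefer oom_score_adj).
OOM_ADJ_PATH = "/proc/self/oom_adj"
OOM_DISABLE = -17  # value that exempts a process from the OOM killer


def read_oom_adj(path=OOM_ADJ_PATH):
    """Return the current OOM adjustment as an integer."""
    with open(path) as f:
        return int(f.read().strip())


def write_oom_adj(value, path=OOM_ADJ_PATH):
    """Set the OOM adjustment (lowering it normally requires privileges)."""
    with open(path, "w") as f:
        f.write(str(value))


def spawn_with_original_oom_adj(argv, original_value, path=OOM_ADJ_PATH):
    """Fork, restore the saved OOM adjustment in the child, then exec argv.

    Hypothetical helper: without the restore step, the child (and every
    process it starts) inherits the daemon's protected value.
    """
    pid = os.fork()
    if pid == 0:
        try:
            write_oom_adj(original_value, path)  # undo the daemon's protection
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if the write or the exec failed
    _, status = os.waitpid(pid, 0)
    return status
```

A daemon following this pattern would call read_oom_adj() once at startup, then write_oom_adj(OOM_DISABLE) for itself, and pass the saved value to spawn_with_original_oom_adj() for each session. If the restore step is skipped, login shells keep the -17 value, which would match the behaviour reported in this thread.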

Ben.


