
Bug#487325: openssh-server: /etc/default/ssh setting for oom_adj confused



* Colin Watson <cjwatson@debian.org> [2008-06-21 03:07-0400]:
> tags 487325 pending
> thanks
> 
> On Fri, Jun 20, 2008 at 07:27:30PM -0400, Micah Anderson wrote:
> > I discovered recently during a testing migration that in a vserver
> > environment you do not have the capability to adjust /proc values.
> > 
> > This means that the oom_adj results in a lot of noise in the logfiles:
> > 
> > sshd[9363]: error writing /proc/self/oom_adj: Operation not permitted
> 
> I wonder if there's any way to detect this? Maybe I should just not
> print EPERM errors? If possible, I'd rather have a default that works
> for nearly everyone.

After some discussion with the Linux-Vserver folks, they found some
interesting information that I thought was worth adding. First, EPERM is
not the error that they expected. Second, inside a vserver guest it's
really strict about which flags you open it with: both O_CREAT and
O_TRUNC are forbidden, while O_WRONLY alone lets you write 0\n to it.
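
To illustrate the open-flags point, here is a minimal sketch of a write
that uses O_WRONLY only, with neither O_CREAT nor O_TRUNC. The helper
name write_adj_value is hypothetical and this is not sshd's actual code;
it only shows the kind of open call that was reported to work in a guest.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from sshd): write `value` plus a
 * trailing newline to an existing file at `path`.
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_adj_value(const char *path, const char *value)
{
    char buf[32];
    int len = snprintf(buf, sizeof(buf), "%s\n", value);
    if (len < 0 || (size_t)len >= sizeof(buf)) {
        errno = EINVAL;
        return -1;
    }

    /* O_WRONLY only: the file is neither created nor truncated. */
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, buf, (size_t)len);
    /* Keep write()'s errno; report a short write as EIO. */
    int saved_errno = (written < 0) ? errno : EIO;
    close(fd);

    if (written != (ssize_t)len) {
        errno = saved_errno;
        return -1;
    }
    return 0;
}
```

A caller would use it as write_adj_value("/proc/self/oom_adj", "0") and
log the error only if the return value is -1.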

The Linux-Vserver folks found that they didn't experience the exact same
problem I did: they could set SSHD_OOM_ADJUST=0 in /etc/default/ssh
without changing the initscript. If I did that, I would get an
error. This led us to look into what was different, and it turned out
that I am running a 2.6.18 kernel on the host (it's pretty normal to run
an etch host with guests that could be lenny, sid, or even redhat), and
they were running a newer kernel.

In 2.6.20, Linux upstream removed the unconditional
capable(CAP_SYS_RESOURCE) check. That means that in kernels older
than 2.6.20 the capability CAP_SYS_RESOURCE is required to modify the
oom_adj value at all. As capabilities are stripped in guests unless you
explicitly allow them, this is why I would get the error and they
wouldn't. With newer kernels, you only need the capability
CAP_SYS_RESOURCE to lower the value.
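
The rule can be written down as a small model. This is only a sketch of
the behaviour as described in this mail, not kernel code; struct kver
and oom_adj_write_allowed are hypothetical names made up for the
illustration.

```c
#include <stdbool.h>

/* Hypothetical kernel version triple, e.g. {2, 6, 18}. */
struct kver {
    int major, minor, patch;
};

/* True if version a is strictly older than version b. */
static bool kver_before(struct kver a, struct kver b)
{
    if (a.major != b.major)
        return a.major < b.major;
    if (a.minor != b.minor)
        return a.minor < b.minor;
    return a.patch < b.patch;
}

/*
 * Model of the rule described above (an assumption, not kernel source):
 *  - before 2.6.20, any write to oom_adj needs CAP_SYS_RESOURCE;
 *  - from 2.6.20 on, only lowering the value needs it.
 */
bool oom_adj_write_allowed(struct kver k, bool has_cap_sys_resource,
                           int old_adj, int new_adj)
{
    const struct kver v2_6_20 = {2, 6, 20};

    if (has_cap_sys_resource)
        return true;
    if (kver_before(k, v2_6_20))
        return false;           /* unconditional capability check */
    return new_adj >= old_adj;  /* only lowering is restricted */
}
```

Under this model, a guest without the capability on a 2.6.18 host is
refused even when writing 0 over 0, while the same write goes through
on 2.6.20 or later, which matches what we each saw.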

> I think perhaps the wording here is simply misleading. What "disable"
> means here is "disable the OOM-killer", that is "tell the kernel never
> to kill this process".
> 
> I've changed the text as follows:
> 
>   # OOM-killer adjustment for sshd (see
>   # linux/Documentation/filesystems/proc.txt; lower values reduce likelihood
>   # of being killed, while -17 means the OOM-killer will ignore sshd; set to
>   # the empty string to skip adjustment)

That clarifies the difference nicely!

> > After trial-and-error it seems like it shouldn't be set to anything at
> > all if it is supposed to be disabled. So, the environment variable
> > SSHD_OOM_ADJUST needs to be non-existent to actually disable it. I
> > don't understand why, unless there is some environment scrubbing going
> > on?
> 
> My intent was that the empty string would prevent fiddling with the
> OOM-killer, but that didn't work due to an implementation bug (the above
> should have been 'if (!oom_adj || !*oom_adj) return;'). I've fixed this
> in CVS.

Great.
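
As a minimal sketch of the corrected condition quoted above: both an
unset variable and an empty string should skip the adjustment. The
function name oom_adjust_requested is hypothetical; only the condition
itself comes from the quoted fix.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical wrapper around the quoted condition: returns false when
 * the setting is absent (NULL) or the empty string, true otherwise.
 */
bool oom_adjust_requested(const char *oom_adj)
{
    if (!oom_adj || !*oom_adj)
        return false;   /* unset or empty: leave the OOM-killer alone */
    return true;
}
```

It would be called as oom_adjust_requested(getenv("SSHD_OOM_ADJUST")),
with getenv coming from <stdlib.h>.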

> > It doesn't help that in /etc/init.d/ssh, we find this:
> > 
> > export SSHD_OOM_ADJUST=-17
> > 
> > right before the sourcing of the /etc/default/ssh file. 
> > 
> > So the only way to really disable this is to comment out both
> > the line in /etc/init.d/ssh where the environment variable is
> > set to -17 and the line in /etc/default/ssh where it is also set.
> 
> No, even at present, 'unset SSHD_OOM_ADJUST' in /etc/default/ssh would
> do it without having to edit the init script.

That's a better temporary solution, thanks for the suggestion.

> Thanks for your report,

Thanks for the quick response, it's appreciated,
Micah
