[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#628690: Acknowledgement (openssh-server inherits oom_adj -17 upon start under specific conditions, causing DoS potential for oom_killer)



Here is the latest conversation with openssh devs, which confirms this definitely falls within the remit of debian or kernel-mm.

On 31/05/2011 13:25, Gert Doering wrote:
> Hi,
>
> On Tue, May 31, 2011 at 12:11:13PM +0100, Cal Leeming [Simplicity Media Ltd] wrote:
>> Could you point out the line of code where oom_adj_save is set to the
>> original value, because I've looked everywhere, and from what I can
>> tell, it's only ever set to INT_MIN, and no where else is it changed.
>> (C is not my strongest language tho, so I most likely have overlooked
>> something). This is where I got thrown off.
>
> oom_adjust_setup() does this:
>
>                 if ((fp = fopen(oom_adj_path, "r+")) != NULL) {
>                         if (fscanf(fp, "%d", &oom_adj_save) != 1)
> verbose("error reading %s: %s", oom_adj_path,
>                                     strerror(errno));
>
> the "fscanf()" call will read an integer ("%d") from the file named,
> and write that number into the variable being pointed to (&oom_adj_save).
>
> The loop is a bit tricky to read as it takes different paths into
> account, and will exit after the first successful update.
>
> fscanf() will return the number of successful conversions, so if it
> was able to read "one number", the return value is "1", and it will
> jump to the else block
>
>                         else {
>                                 rewind(fp);
>                                 if (fprintf(fp, "%d\n", value) <= 0)
>                                         verbose("error writing %s: %s",
> oom_adj_path, strerror(errno));
>                                 else
>                                         verbose("Set %s from %d to %d",
> oom_adj_path, oom_adj_save, value);
>                         }
>
> where it will overwrite what is in that file with the new value
> ("value"), and then print the "Set ... from -17 to -17" message that
> you saw.

Ah, thank you for explaining this. Makes a lot more sense now :)

>
>
>>> The question here is why sshd is sometimes started with -17 and sometimes
>>> with 0 - that's the bug, not that sshd keeps what it's given.
>>>
>>> (Ask yourself: if sshd had no idea about oom_adj at all, would this make
>>> it buggy by not changing the value?)
>>
>> This was what I was trying to pinpoint down before. I had came to this
>> conclusion myself, sent it to the Debian bug list, and they dismissed
>> on the grounds that it was an openssh problem...
>
> I must admit that I have no idea what is causing it, but from the logs,
> it very much looks like sshd is started with "-17" in there - but only
> in the problem case.
>
>
>> So far, the buck has been passed from kernel-mm to debian-kernel, to
>> openssh, and now back to debian-kernel lol. The most annoying thing,
>> is that you can't get this bug to happen unless you physically test on
>> a machine which requires the bnx2 firmwire, so I get the feeling this
>> won't get resolved :/
>
> And *that* strongly points to a bug in the bnx2 stuff - if other programs
> change their behaviour based on the existance of a given driver, that
> does not smell very healthy.

Agreed.. I was thinking of adding some debug into the fs/proc/ code which does a kprint on every oom_adj read/write, but I couldn't figure out how to extract the pid from the task (pointer?).

>
> [..]
>>> Anyway, as a workaround for your system, you can certainly set
>>>
>>>  oom_adj_save = 0;
>>>
>>> in the beginning of port-linux.c / oom_adjust_restore(), to claim that
>>> "hey, this was the saved value to start with" and "restore" oom_adj to 0
>>> then - but that's just hiding the bug, not fixing it.
>>
>> I'm disappointed this wasn't the correct fix, I honestly thought I had
>> patched it right :(
>
> Well, that's the short hand - "just ignore everything that happened at
> init / save time, and forcibly write back '0', no matter what was there
> before".
>
>> But, on the other hand, ssh users should really never have a default
>> oom_adj of -17, so maybe 0 should be set as default anyway? If this is
>> not the case, could you give reasons why??
>
> Well, I would say "the default value in there is a matter of local policy",
> so what if someone wants to make sure that whatever is run from sshd is
> always privileged regarding oom, even if a local firefox etc. is running
> amock and you need to ssh-in and kill the GUI stuff...
>
> One might opt to run sshd (and all its children) at "-5" (slightly special
> treatment), or "0" (no special treatment), or even at "-17" - but that's
> local policy.

Ah, okay that's make sense.

>
>
> Mmmh.
>
> Since this seems to be inherited, it might even work if you just change
> the sshd startup script, and insert
>
>   echo 0 >/proc/self/oom_adj
>
> in there, right before it starts the sshd...  "local policy at work".

Yeah I was going to do this, but then I thought "well if this problem is occurring for openssh, then what else could it be affecting?". As you pointed out above, having the oom_adj changed based on the existence of a driver is really not good.

I will paste this convo trail into the debian ticket, and hopefully it'll help convince them this issue needs fixing.

>
> gert

Thanks again for taking the time to reply!




On 31/05/2011 12:24, Debian Bug Tracking System wrote:
Thank you for filing a new Bug report with Debian.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
  Debian OpenSSH Maintainers<debian-ssh@lists.debian.org>

If you wish to submit further information on this problem, please
send it to 628690@bugs.debian.org.

Please do not send mail to owner@bugs.debian.org unless you wish
to report a problem with the Bug-tracking system.





Reply to: