Bug#695182: Write couple of 1GB files for OOM crash

To: paul.szabo@sydney.edu.au
Cc: 695182@bugs.debian.org, ben@decadent.org.uk
Subject: Bug#695182: Write couple of 1GB files for OOM crash
From: Jonathan Nieder <jrnieder@gmail.com>
Date: Sun, 6 Jan 2013 19:51:52 -0800
Message-id: <[🔎] 20130107035152.GA27909@elie.Belkin>
Reply-to: Jonathan Nieder <jrnieder@gmail.com>, 695182@bugs.debian.org
In-reply-to: <[🔎] 201301070303.r0733Hs0026687@como.maths.usyd.edu.au>
References: <[🔎] 20130107012024.GC3823@elie.Belkin> <[🔎] 201301070303.r0733Hs0026687@como.maths.usyd.edu.au>

paul.szabo@sydney.edu.au wrote:

>>> +/* Easy call: do "echo 3 > /proc/sys/vm/drop_caches" */
>>
>> I don't understand this comment.  What does "Easy call" mean?
>>
>>> +void easy_drop_caches(void)
[...]
>> This should be declared in some appropriate header (e.g.,
>> include/linux/mm.h).
>
> I added one new call, so I could easily call it from elsewhere without
> having to (cross?)-declare many things; in fact would have preferred to
> call iterate_supers() and drop_slab() directly.
>
> Yes, easy_drop_caches() might be declared in some include file, but
> should that be <linux/mm.h> or <linux/fs.h>? Would add another file to
> those changed, seemed more direct this way.

Remember that once your patch is applied, it is part of the Linux
kernel that other people develop against.  So it is best to optimize
for readability of the result, not readability of the patch.  Another
file changed is not a big deal.

The existing drop_caches functionality is declared in linux/mm.h, so I
think this new function would be easiest to find there.

> Should I change something?

Please clarify the comment to explain what the function is meant
to do.  It will help future readers.

[...]
>>> +	/* Subtract min_free_kbytes */
>>> +	if (min_free_kbytes > 0)
>>> +		y = min_free_kbytes >> (PAGE_SHIFT - 10);
>>
>> Why the "if"?
>
> Sanity-check paranoia: because I do not trust the input values. Maybe
> should have
>   BUG_ON(min_free_kbytes < 0);
> but that does not belong there.

Elsewhere the code assumes that min_free_kbytes is nonnegative.  If
you want to check that assertion here, that's fine.  An assertion test
would be spelled as

	if (!WARN_ON(min_free_kbytes < 0))

since it's possible for the kernel to recover for additional debugging
when it fails.

[...]
>>>  	setpoint = (freerun + limit) / 2;
>>> -	x = div_s64((setpoint - dirty) << RATELIMIT_CALC_SHIFT,
>>> +	x = div_s64(((s64)setpoint - (s64)dirty) << RATELIMIT_CALC_SHIFT,
>>>  		    limit - setpoint + 1);
>>
>> Why are 'setpoint' and 'dirty' of type unsigned long instead of long
>> in the first place?  What happens if one of their values overflows a
>> long?
>
> I guess because most other such things are also unsigned long; and I
> guess they were so when they meant memory addresses. These are now page
> counts, cannot overflow (with current limit of 64GB RAM).

This could be worth a comment and WARN_ON to document the assumption
so other readers don't have to be distracted by the same question.
Something like

	/*
	 * Since memory addresses fit in an unsigned long, page
	 * counts cannot overflow a "long" and a cast to s64 is safe.
	 */
	WARN_ON(setpoint > LONG_MAX || dirty > LONG_MAX);
	BUILD_BUG_ON(sizeof(long) > sizeof(s64));

	x = div_s64(((s64)setpoint - ...

[...]
>>> --- mm/vmscan.c.old	2012-10-17 13:50:15.000000000 +1100
>>> +++ mm/vmscan.c	2013-01-06 09:50:49.000000000 +1100
>> [...]
>>> @@ -2726,9 +2731,87 @@ loop_again:
>>>  				nr_slab = shrink_slab(&shrink, sc.nr_scanned, lru_pages);
>>>  				sc.nr_reclaimed += reclaim_state->reclaimed_slab;
>>>  				total_scanned += sc.nr_scanned;
>>> +				if (unlikely(
>>> +				    i == 1 &&
>>> +				    nr_slab < 10 &&
>>> +				    (reclaim_state->reclaimed_slab) < 10 &&
>>> +				    zone_page_state(zone, NR_SLAB_RECLAIMABLE) > 10 &&
[...]
>> This is getting really deeply nested.  Would it be possible to split
>> out a function so this code could be more easily contemplated in
>> isolation?
>
> Hmm... I would much prefer to leave it as is.

Can you say a little about why?  At this point in the function, the
context is

	static unsigned long balance_pgdat()
	{
		...
		for (priority = DEF_PRIORITY; priority >= 0; priority--) {
			...
			for (i = 0; i <= end_zone; i++) {
				...
				if (!zone_watermark_ok_safe(zone, order, ...) {
					...
					if (unlikely(i == 1 && ...) {

It is easy to lose track at this point in a long function.  As
Documentation/CodingStyle explains:

	Now, some people will claim that having 8-character indentations makes
	the code move too far to the right, and makes it hard to read on a
	80-character terminal screen.  The answer to that is that if you need
	more than 3 levels of indentation, you're screwed anyway, and should fix
	your program.

and:

	Functions should be short and sweet, and do just one thing.  They should
	fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
	as we all know), and do one thing and do that well.

	The maximum length of a function is inversely proportional to the
	complexity and indentation level of that function.  So, if you have a
	conceptually simple function that is just one long (but simple)
	case-statement, where you have to do lots of small things for a lot of
	different cases, it's OK to have a longer function.

	However, if you have a complex function, and you suspect that a
	less-than-gifted first-year high-school student might not even
	understand what the function is all about, you should adhere to the
	maximum limits all the more closely.

When a typical kernel hacker drives by this part of this long function
because it showed up in a stacktrace for a bug she is tracking down,
it is going to take a lot longer than it ought to to figure out what
is going on.  If it's possible to come up with some logical subset of
that function to factor out (in a separate patch) before applying your
bugfix, that can help everyone's sanity.  No need to worry about the
waste of a function name within the same file having only one caller
--- that's the usual kernel style.

Thanks for your thoughtfulness, and hope that helps.

Regards,
Jonathan

Reply to:

Follow-Ups:
- Bug#695182: Write couple of 1GB files for OOM crash
  - From: paul.szabo@sydney.edu.au
- Bug#695182: Write couple of 1GB files for OOM crash
  - From: paul.szabo@sydney.edu.au

References:
- Bug#695182: Write couple of 1GB files for OOM crash
  - From: Jonathan Nieder <jrnieder@gmail.com>
- Bug#695182: Write couple of 1GB files for OOM crash
  - From: paul.szabo@sydney.edu.au

Prev by Date: Bug#695182: Write couple of 1GB files for OOM crash
Next by Date: Bug#695182: Write couple of 1GB files for OOM crash
Previous by thread: Bug#695182: Write couple of 1GB files for OOM crash
Next by thread: Bug#695182: Write couple of 1GB files for OOM crash
Index(es):
- Date
- Thread