[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH] libdpkg: Use the amount of available memory instead phys_mem/2



On 2021-01-24 01:47:18 [+0100], Guillem Jover wrote:
> Hi!
Hi,

> [ You might want to take a peek at doc/coding-style.txt, although
>   I've covered these in the review. :) ]

sadly, I've been lookig…

> On Fri, 2021-01-08 at 23:03:30 +0100, Sebastian Andrzej Siewior wrote:
> > On Linux there is the field `MemAvailable' in `/proc/meminfo' which
> 
> Please no unbalanced `' quotes. :)

okay.

> > holds the amount of memory which is available as of now. It considers
> > the fact that the page cache can purge (if not all) some memory can be
> > reclaimed (for instance by writting back dirty inodes) and some memory
> > should remain free just in case. This amount of memory can be used
> > without the need to swap-out some memory.  The complete definition can
> > be located at [0].
> > 
> > The advantage over PHYS_MEM/2 is that it considers the current
> > status/usage of the system with assumung that half of what is physically
>                                   ^assuming?

correct.

> > avilable is usable.
> 
> Right, it's probably a better metric to use, but I'm not sure we
> should take the entire value as is, instead of scaling it down,
> otherwise we might still try to use too much memory and leaving the
> rest of the system w/o.

If you do have an idea how to scale, sure.
So that value holds the amount of memory that should be available as of
now. That should fine unless two+ programs are currently doing something
according to that value. Otherwise we should be good. Knock knock wood.

I was afraid of the parallel decompressor eating too much memory. The
compressor is more or less limited as there (in my imagination) one
dpkg-build instance on a system/buildd. The decompressor on the other
hand runs on a user system which may already use more than physmem/2.
So if you have an idea how to limit it further…

> > [0] https://www.kernel.org/doc/html/latest/filesystems/proc.html
> > Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
> > ---
> > 
> > This is a bit of my multithreaded XZ decompressor for dpkg. For
> > compression the PhysMem/2 estimation is probably good enough since
> > mostly used the buildd and owns the system.. 
> > For decompression the amount of memory should be close to reality so it
> > does not start threads and allocate a lot of memory if the system is
> > quite busy at the amout if package upgrade/installation.
> 
> Once ready I'll queue this for 1.21.x, as it indeed didn't seem urgent
> to rush into 1.20.x at the point of the freeze.

Yes. The parallel decompression has been posted upstream for review and
I don't see this comming for Bullseye. So yes.
 
> > diff --git a/lib/dpkg/compress.c b/lib/dpkg/compress.c
> > index 41991317afe53..7ec9144a56290 100644
> > --- a/lib/dpkg/compress.c
> > +++ b/lib/dpkg/compress.c
> > @@ -523,6 +523,88 @@ filter_lzma_error(struct io_lzma *io, lzma_ret ret)
> >  	       dpkg_lzma_strerror(ret, io->status));
> >  }
> >  
> > +#ifdef HAVE_LZMA_MT
> > +# ifdef __linux__
> > +
> > +#include <sys/stat.h>
> > +#include <fcntl.h>
> 
> Please move this with the other header inclusions. The <sys/*> on its
> own paragraph after <compat.h>, and <fcntl.h> with the system ones,
> before <unistd.h>. And I'd probably just include these unconditionally.

Okay.

> > +/*
> > + * An estimate of how much memory is available. Swap will not be used, the page
> > + * cache may be purged, not everything will be reclaimed what might be
> > + * reclaimed, watermarks are considers.
> > + */
> > +static char str_MemAvailable[] = "MemAvailable";
> > +
> > +static int
> > +get_avail_mem(uint64_t *val)
> > +{
> > +	char buf[4096];
> > +	char *str;
> > +	ssize_t bytes;
> > +	int fd;
> > +
> > +	*val = 0;
> > +
> > +	fd = open("/proc/meminfo", O_RDONLY);
> > +	if (fd < 0)
> > +		return -1;
> > +
> > +	bytes = read(fd, buf, sizeof(buf));
> > +	close(fd);
> 
> We could use file_slurp() here.

from looking at file_slurp_fd() it uses fstat() to determine the amount
of memory to allocate. This does not work with the proc file. It always
reports 0 and you have to read until EOF to get the whole file.
In theorhy, we could read less since the string we look for is at the
beginning.
 
> > +	if (bytes <= 0)
> > +		return -1;
> > +
> > +	buf[bytes] = '\0';
> > +
> > +	str = buf;
> > +	while (1) {
> > +		char *end;
> > +
> > +		end = strchr(str, ':');
> > +		if (!end)
> > +			break;
> > +		if ((end - str) == sizeof(str_MemAvailable) - 1) {
> > +			if (!strncmp(str, str_MemAvailable,
> > +				     sizeof(str_MemAvailable) - 1)) {
> 
> Use == instead of !, which makes it obvious what the comparison is
> about.

okay.

> > +				uint64_t num;
> > +
> > +				str = end + 1;
> > +				num = strtoull(str, &end, 10);
> 
> I'm not very fond of strtou*() functions, as they still accept
> negative numbers, just use strtoimax(). We also need to reset
> errno and check it afterwards to be sure there's a range error.

Not sure what you are getting at. The kernel does not put negative
numbers. strtoimax() would have a limit at 2TiB which is not something I
expect on 32bit. But then you may run 32bit on a 64bit kernel ;)

I update as asked…

> > +				if (!num)
> > +					return -1;
> > +				if (num == ULLONG_MAX)
> > +					return -1;
> > +				/* it should end with ' kb\n' */
> 
> While I see the Linux kernel does not currently support any other
> unit, I'd be more comfortable checking for it explicitly so that in
> the unlikely case this ever changes, we do not compute bogus numbers.

Okay.

> > +				if (*end != ' ')
> > +					return -1;
> > +
> > +				num *= 1024;
> > +				*val = num;
> > +				return 0;
> > +			}
> > +		}
> > +
> > +		end = strchr(end + 1, '\n');
> > +		if (!end)
> > +			break;
> > +		str = end + 1;
> > +	}
> > +	return -1;
> > +}
> > +
> 
> Thanks,
> Guillem

Sebastian


Reply to: