
Re: Pageout not succeeding



Michael Kelly, on Fri 30 May 2025 12:22:23 +0100, wrote:
> On 29/05/2025 20:28, Michael Kelly wrote:
> > On 29/05/2025 14:49, Samuel Thibault wrote:
> > > So it's mach-defpager which needs to be given vm_privilege. If that's
> > > not already so, that is what really needs to be fixed. Normally that's
> > > already done by mach-defpager's wire_thread() which uses thread_wire(),
> > > but possibly a bug makes it miss some mach-defpager thread, and that's
> > > what needs to be fixed.
> > I'll look into that.
> 
> The only kernel threads that are not privileged are:
> 
> 1) The signal handler thread
> 
> 2) The default_pager thread for external objects

Ok. The signal handler thread is not supposed to be called under normal
conditions, and it would be hard to thread-wire it, so that should be
fine.

It's apparently on purpose that the default_pager for external objects
doesn't have privileges:

	 *	Threads handling external objects cannot have
	 *	privileges.  Otherwise a burst of data-requests for an
	 *	external object could empty the free-page queue,
	 *	because the fault code only reserves real pages for
	 *	requests sent to internal objects.
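
To illustrate the reasoning in that comment, here is a toy model in
plain C. It is not gnumach code: the names (toy_pool, toy_page_grab,
toy_burst) and the numbers are made up for the illustration, and the
real allocator is more involved. It only shows the general idea that
an unprivileged thread is stopped once the free queue drops to the
reserve, while a privileged one can drain the queue entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names and sizes, not the gnumach allocator. */
#define TOTAL_FREE_PAGES 8	/* pages on the free queue at start */
#define RESERVED_PAGES   3	/* pages kept back for vm-privileged threads */

struct toy_pool {
	int free_pages;
};

/* Take one page if allowed.  Unprivileged callers must leave the
   reserve untouched; privileged callers may go down to zero.  */
bool toy_page_grab(struct toy_pool *pool, bool vm_privileged)
{
	int floor = vm_privileged ? 0 : RESERVED_PAGES;

	assert(pool->free_pages >= 0);
	if (pool->free_pages <= floor)
		return false;
	pool->free_pages--;
	return true;
}

/* Simulate a burst of `requests` allocations from one kind of thread
   and return how many of them were granted.  */
int toy_burst(struct toy_pool *pool, bool vm_privileged, int requests)
{
	int granted = 0;

	for (int i = 0; i < requests; i++)
		if (toy_page_grab(pool, vm_privileged))
			granted++;
	return granted;
}
```

In this model, a burst of 20 unprivileged requests on a fresh pool is
granted 5 pages and leaves the 3 reserved ones, so a privileged pager
thread can still make progress afterwards. The same burst from a
privileged thread is granted all 8 pages and leaves nothing, which is
the situation the comment warns about for external objects.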

I'll add thread_wire calls to the rumpdisk translator, since it really
needs them.

> Here's verification (with my added printfs in the kernel debugger):

I have now added a printed flag to master.

> > > > -    if (double_paging && !object->pager_initialized) {
> > > > +    if (!object->pager_initialized) {
> > > >           vm_object_pager_create(object);
> > > >       }
> > > So this would be my recent fd63a4bbf6f2201846f37afba348c5db88364c44
> > > 
> > > The point of the patch was to cope with the case where there is no DMM.
> > > I indeed took the condition from the one for internal objects, which of
> > > course will never have double_paging even if we have a DMM. I have changed
> > > the condition, could you try it?
> > I've seen your update which I'll test locally.
> The new revision works when the DMM is present;

Ok, good :)

> > > Thanks again for the investigation, this will be really useful to fix
> > > building large packages etc.
> > 
> > I hope that I can help with that.
> > 
> I am wondering if it might be more meaningful if I was able to recreate the
> paging problems shown by your build environment rather than trying to create
> a test case that simulates it.

Not necessarily: as long as you manage to produce simple situations
that cause problems, solving them is very useful, because it's much
simpler to determine what is happening and to check whether the issue
is fixed than with random build runs, which can cause all sorts of
trouble.

> I'd imagine it would be difficult to create
> that exact environment on my own machine but perhaps I can test building of
> some large packages. Have you suggestions for which packages to build and
> what build parameters would be important (eg. parallelism) ?

It's essentially the packages in "Building" status on

https://buildd.debian.org/status/architecture.php?a=hurd-amd64&suite=
https://buildd.debian.org/status/architecture.php?a=hurd-i386&suite=

and which are not labelled "sthibault" (which are usually packages that
are known to break boxes for other reasons).

> Is SMP stable enough to be used for such tests

No, and it would only add more headaches to the problem solving.

> and if so does that apply on your build machine?

SMP is not stable enough for that.

> How are the issues that you have with the build presented?

Most often they end up with the "unable to recycle any page" warning and
hang. Sometimes they just hang without the warning. I have not
really investigated the situation since it was clearly an out-of-memory
issue due to swapping not working. I'll have to test again with swapping
fixed.

Samuel

