
Re: Deadlock during page-in and thread_suspend




On 24/09/2025 22:10, Samuel Thibault wrote:
Hello,

Michael Kelly, le mar. 23 sept. 2025 15:26:45 +0100, a ecrit:
3) thread0 in the worker generates a page fault. This causes page-in
involving a (top) vm_object which also has a shadow object. A mapping is
made between the top object/offset and a fictitious page to block other
threads from attempting the same page-in until thread0 has handled the page
fault. thread0 then traverses the object chain to the shadow and makes the
memory_object_data_request on the shadow object/offset and blocks itself
until the reply has arrived and been processed.

4) A signal is received by the process and is handled by (say) thread1. As
per normal signal handling, this results in thread0 being suspended by
thread1 via the system call to thread_suspend(). It can be immediately
suspended because thread0 is in TH_WAIT state and is interruptible (TH_UNINT
not set).
Ah, the suspension also holds any kernel activity of thread0, so it
won't be able to do what it promised? (paging in)
Exactly so. It also means that no other thread will be able to map the required page even after the memory_object_data_supply response has supplied the page itself (which does happen).
5) After thread1 has suspended thread0 it trips a page fault itself which
actually requires the same page that was being paged-in by thread0. thread1
now blocks indefinitely and cannot proceed until the original page-in
completes which of course it cannot as thread0 is suspended. thread0 will
only be resumed by thread1 and thread1 cannot continue because of the state
managed by thread0.
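
The circular wait in steps 4 and 5 can be written down as a tiny wait-for graph. This is only an illustrative sketch in plain C, not gnumach code: the array encoding, the TOY_NONE marker and the function name are all invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NONE (-1)   /* hypothetical marker: thread waits on nobody */

/*
 * waits_for[i] holds the index of the thread that thread i needs in
 * order to make progress, or TOY_NONE.  Follow the chain from 'start'
 * and report whether it leads back to 'start'.
 */
static bool toy_has_wait_cycle(const int waits_for[], int n, int start)
{
    assert(start >= 0 && start < n);

    int cur = start;
    for (int hops = 0; hops < n; hops++) {
        cur = waits_for[cur];
        if (cur == TOY_NONE)
            return false;   /* chain ends: someone can still run */
        if (cur == start)
            return true;    /* back where we began: circular wait */
    }
    return false;           /* a cycle exists but does not include 'start' */
}
```

With thread0 as index 0 and thread1 as index 1, step 5 corresponds to waits_for = {1, 0}: thread0 needs thread1 to resume it, and thread1 needs thread0 to finish the page-in, so the function reports a cycle.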

I have some confidence that the above sequence is broadly what is happening
but it's difficult to be certain. I've got to this stage by adding extra
state to data structures rather than the otherwise huge volume of debug
logging which normally alters the timing to the point of masking the problem
anyway. In any case, I think that the scenario described above is possible
and provides a good match against the evidence that I do have.

I have some very vague ideas for solutions but before discussing those it
would be helpful to have my analysis scrutinised for obvious error.

I've additional confidence that this is indeed what is happening after further scrutiny of more recent tests.

I don't yet have a solution for this problem. The vm_fault_page() implementation documents this restriction in a comment:

         *      2)      To prevent another thread from racing us down the
         *              shadow chain and entering a new page in the top
         *              object before we do, we must keep a busy page in
         *              the top object while following the shadow chain.

This is what prevents other threads from completing page-in of a page that actually becomes available after thread0 is suspended. I'm quite apprehensive about how hard it might be to re-implement page-in safely without that restriction. It seems to me that gnumach has many inter-dependencies between areas of code that are difficult to find without a very thorough knowledge of it. That in itself makes it hard for newcomers to contribute.
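
To make the effect of that busy page concrete, here is a toy model of the sequence in plain C. It is a sketch under loud assumptions: none of these types or functions exist in gnumach, the toy_ names are invented, and the real vm_fault_page() logic is far more involved. It only shows that data arriving from the pager does not help a second faulter while the busy placeholder is owned by a suspended thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not gnumach data structures. */
enum toy_state { TOY_RUNNING, TOY_WAITING, TOY_SUSPENDED };

struct toy_thread {
    enum toy_state state;
};

struct toy_page {
    bool busy;                 /* placeholder entered in the top object */
    bool data_present;         /* pager has supplied the data */
    struct toy_thread *owner;  /* the thread that must clear 'busy' */
};

/* The faulting thread enters a busy placeholder, then waits for the pager. */
static void toy_fault_begin(struct toy_page *p, struct toy_thread *t)
{
    assert(!p->busy);
    p->busy = true;
    p->owner = t;
    t->state = TOY_WAITING;
}

/* The pager's reply arrives; on its own this does not clear 'busy'. */
static void toy_data_supply(struct toy_page *p)
{
    p->data_present = true;
}

/* Only the owner can finish the fault, and only if it is not suspended. */
static bool toy_fault_finish(struct toy_page *p)
{
    if (!p->busy || !p->data_present || p->owner->state == TOY_SUSPENDED)
        return false;
    p->busy = false;
    p->owner->state = TOY_RUNNING;
    p->owner = NULL;
    return true;
}

/* A second faulter can use the page only once 'busy' has been cleared. */
static bool toy_fault_other(const struct toy_page *p)
{
    return !p->busy && p->data_present;
}

/* Replays steps 3 to 5; returns true if thread1's fault can never complete. */
static bool toy_scenario_deadlocks(bool suspend_thread0)
{
    struct toy_thread t0 = { TOY_RUNNING };
    struct toy_page page = { false, false, NULL };

    toy_fault_begin(&page, &t0);            /* step 3 */
    if (suspend_thread0)
        t0.state = TOY_SUSPENDED;           /* step 4 */
    toy_data_supply(&page);                 /* the reply does arrive */
    bool finished = toy_fault_finish(&page);
    bool thread1_ok = toy_fault_other(&page);   /* step 5 */
    return !finished && !thread1_ok;
}
```

In this model toy_scenario_deadlocks(true) reports the stuck state, while toy_scenario_deadlocks(false) completes normally, which matches the observation that the page is supplied yet nobody can map it.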

A method that might require less functional change would be to somehow transfer the responsibility for completing the page-in to another thread, although I cannot see how that could be managed efficiently.

Basically, I'm almost at a standstill on this one and could benefit from a nudge in the right direction.

All the best,

Mike.

