Re: libports and interrupted RPCs

To: debian-hurd@lists.debian.org
Subject: Re: libports and interrupted RPCs
From: Michael Kelly <mike@weatherwax.co.uk>
Date: Mon, 8 Sep 2025 07:05:39 +0100
Message-id: <7c79e2d2-2f92-4b0c-a2bb-d2e989c56b7b@weatherwax.co.uk>
In-reply-to: <aLzBxrs1mYcllHDZ@begin>
References: <9727ec8d-37dd-4649-bdc0-89c4f5821de9@weatherwax.co.uk> <aLTC1R9H4DZVSetS@begin> <cd260a32-1b2c-4658-be2d-27b077758379@weatherwax.co.uk> <aLjJaXWV5b56QJYC@begin> <fcee31ad-bddf-4a30-b891-62a7a7bb8c95@weatherwax.co.uk> <aLzBxrs1mYcllHDZ@begin>

On 07/09/2025 00:20, Samuel Thibault wrote:

You are not really sure what is happening with the rpc_info list while
you don't have the lock. Possibly currently it happens to be safe
because the item you are on will not move within the list, but this
looks very fragile to me. Maybe better record an array of the rpc_info
pointers that we want to cancel.

The changes I suggest do not access the list in this way after the mutex has been released. The next iteration restarts the scan from the (possibly new) head of the list. Admittedly, this will result in a number of passes down the list, and how that compares in performance to dynamically allocating memory for an array of variable size isn't clear. Both solutions still require the 'cancelling' state added to rpc_info that prevents the affected RPC thread from terminating.

3) Reset the RPC as no longer in cancellation and repeat from 1) until there
are no more RPCs to be cancelled by this thread.

We may end up in a livelock here, if somehow some other code keeps
making newer RPCs in the thread.

That does not occur because the first pass of the list marks the RPCs that will be cancelled in this call. Any RPCs added to (or removed from) the list later will not be considered for cancellation.
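
To make the scheme concrete, here is a minimal sketch of the mark-then-rescan loop. It is not the actual patch: struct rpc_sketch, list_lock, cancel_rpc() and interrupt_rpcs() are hypothetical stand-ins for the libports structures and functions, the single 'marked' flag assumes only one cancelling caller at a time, and the wait that the RPC's own thread would perform while 'cancelling' is set is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for libports' struct rpc_info.  */
struct rpc_sketch
{
  struct rpc_sketch *next;
  bool marked;       /* selected for cancellation by the current call */
  bool cancelling;   /* being cancelled: its thread must not finish yet */
  int cancel_count;  /* stands in for the effect of the real cancel */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the real cancellation; called WITHOUT list_lock held.  */
static void
cancel_rpc (struct rpc_sketch *rpc)
{
  rpc->cancel_count++;
}

/* Cancel every RPC that is on the list when the call starts.
   Returns the number of RPCs cancelled.  */
static int
interrupt_rpcs (struct rpc_sketch **head)
{
  struct rpc_sketch *rpc;
  int cancelled = 0;

  pthread_mutex_lock (&list_lock);

  /* First pass: mark the RPCs this call is responsible for.  RPCs that
     join the list later are never marked, so the loop below is bounded.  */
  for (rpc = *head; rpc != NULL; rpc = rpc->next)
    rpc->marked = true;

  for (;;)
    {
      /* Always restart from the (possibly new) head: no list position
         is carried across the window where the lock is dropped.  */
      for (rpc = *head; rpc != NULL; rpc = rpc->next)
        if (rpc->marked)
          break;
      if (rpc == NULL)
        break;

      rpc->marked = false;
      rpc->cancelling = true;   /* keeps the entry alive while unlocked */
      pthread_mutex_unlock (&list_lock);

      cancel_rpc (rpc);
      cancelled++;

      pthread_mutex_lock (&list_lock);
      rpc->cancelling = false;  /* the RPC's thread may now finish */
    }

  pthread_mutex_unlock (&list_lock);
  assert (cancelled >= 0);
  return cancelled;
}
```

Each marked entry is cancelled exactly once, at the cost of one rescan from the head per cancelled RPC. The array alternative would replace the rescans with a single allocation, but would still need the 'cancelling' state to keep the entries valid while the lock is released.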

I did wonder why all RPCs were being cancelled when the signal is delivered

It's not all RPCs, just the ones on the port that the signaled thread is
waiting an RPC for. That can indeed be a lot if a lot of threads happen
to be waiting on this port. It however looks safer this way: you'd never
really know which kind of interlocking condition there might be in the
server for the various threads blocked on the port. For instance if the
server was serving a shared condition variable, you might want to make
sure that everyone has a chance to wake up, and not only the one that is
getting interrupted and might try to be doing something else.

We just want to avoid a storm of interruptions, and I believe that not
cancelling a thread which is already being canceled can take us a long
way in that direction.

I didn't explicitly say RPCs 'on that port' but that is what I had in mind. I was aware that it wasn't all RPCs in the system.

I don't understand the suggestion about not re-cancelling a thread already in cancellation due to a signal. That occurs within the originating client, but isn't the storm of interruptions being generated on the server side?


Regards,

Mike.
