Re: libports and interrupted RPCs
Hi Samuel,
On 16/09/2025 00:00, Samuel Thibault wrote:
Michael Kelly, le ven. 12 sept. 2025 19:27:00 +0100, a ecrit:
On 08/09/2025 21:58, Samuel Thibault wrote:
Michael Kelly, le lun. 08 sept. 2025 07:05:39 +0100, a ecrit:
I think what is needed is a feature to guarantee the current sequential
behaviour per port_info but which also permits genuine concurrency across
different port_info. That isn't really true at the moment because of the
usage of the global _ports_lock. This could possibly be a port_info specific
mutex or perhaps an extension of the 'blocking' flags. I'm investigating
those options currently but it seems difficult to ensure locking order to
prevent deadlock.
I don't think we need a strict sequential guarantee.
Why not recording the port_info* pointers in a local array? This
requires extending their life in case of a concurrent termination of
the RPC, but that can be done by adding a reference counter and some
port_info_ref/_unref helpers that maintain it, and ports_end_rpc can
wait for the counter to lower to zero.
Well, if you think this approach is possibly valid after all, please
consider the method that I've attached. There is a diff showing minor
changes to 4 source files but a complete inclusion of interrupt-rpcs.c
since that is quite small and the solution is easier to comprehend when
read directly rather than via a diff.
In essence the 2 solutions follow the same method of capturing the RPCs
to cancel then cancelling each in turn as a 2nd stage. My solution adds
cl_prev and cl_next list item pointers to each rpc_info so that the ones
to cancel can be linked together to form a separate cancellation list.
This acts the part of the array you refer to. This improves my previous
solution by having a maximum number of iterations only double the
initial RPC list length.
The advantages of doing this are:
a) There is no need to call malloc() or similar to allocate memory for
the array of indeterminate size.
b) Any RPC that is in the cancellation list can actually run to
completion if it can do so before the call to hurd_thread_cancel() is
made. The benefit is that the thread associated with that RPC can then
start processing a new RPC rather than blocking until the
ports_interrupt_rpcs() completes.
Note that there is no reference to the head of cancellation list stored
in any data structure. It is only maintained locally within the call to
ports_interrupt_rpcs(). Note also that it is possible for
ports_interrupt_rpcs() to be called concurrently from another thread but
only one of these will capture the RPCs to cancel and the other would
return immediately with nothing to do. This is the issue I was concerned
about but it might be acceptable to expect synchronisation to be managed
by the callers instead.
The other aspect you solve with the reference counting is managed by the
'cancelling' flag associated with the rpc_info.
I'm sorry that perhaps I'm over complicating the solution but I was keen
to permit maximum concurrency especially as it seems to be causing
significant system issues under stress. I appreciate that you are very
busy and that this issue is perhaps not at the top of your job list so
considering this thread 'low priority' is fine with me.
All the best,
Mike.
diff -ur hurd.orig/libports/begin-rpc.c hurd/libports/begin-rpc.c
--- hurd.orig/libports/begin-rpc.c 2025-07-31 20:36:05.942918581 +0100
+++ hurd/libports/begin-rpc.c 2025-09-10 17:00:11.249190752 +0100
@@ -91,6 +91,8 @@
/* Record that that an RPC is in progress */
info->thread = hurd_thread_self ();
+ info->cl_next = info->cl_prev = NULL;
+ info->cancelling = FALSE;
info->next = pi->current_rpcs;
info->notifies = 0;
if (pi->current_rpcs)
diff -ur hurd.orig/libports/end-rpc.c hurd/libports/end-rpc.c
--- hurd.orig/libports/end-rpc.c 2025-07-31 20:36:05.942918581 +0100
+++ hurd/libports/end-rpc.c 2025-09-09 05:07:03.903620614 +0100
@@ -27,6 +27,17 @@
pthread_mutex_lock (&_ports_lock);
+ while (info->cancelling)
+ {
+ pthread_cond_wait(&_ports_block, &_ports_lock);
+ }
+
+ if (info->cl_next != NULL)
+ info->cl_next->cl_prev = info->cl_prev;
+
+ if (info->cl_prev != NULL)
+ info->cl_prev->cl_next = info->cl_next;
+
if (info->notifies)
_ports_remove_notified_rpc (info);
diff -ur hurd.orig/libports/ports.h hurd/libports/ports.h
--- hurd.orig/libports/ports.h 2025-07-31 20:36:05.946918537 +0100
+++ hurd/libports/ports.h 2025-09-09 05:06:25.719897838 +0100
@@ -22,6 +22,7 @@
#define _HURD_PORTS_
#include <mach.h>
+#include <stdbool.h>
#include <stdlib.h>
#include <hurd.h>
#include <hurd/ihash.h>
@@ -112,6 +113,8 @@
struct rpc_info *next, **prevp;
struct rpc_notify *notifies;
struct rpc_info *interrupted_next;
+ struct rpc_info* cl_next, *cl_prev;
+ bool cancelling;
};
/* An rpc has requested interruption on a port notification. */
/*
Copyright (C) 1995, 1996, 1997 Free Software Foundation, Inc.
Written by Michael I. Bushnell.
This file is part of the GNU Hurd.
The GNU Hurd is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2, or (at
your option) any later version.
The GNU Hurd is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */
#include "ports.h"
#include <hurd.h>
#include <stdbool.h>
void
ports_interrupt_rpcs (void *portstruct)
{
struct port_info *pi = portstruct;
struct rpc_info *rpc, *prev, *cancel;
thread_t self = hurd_thread_self ();
pthread_mutex_lock (&_ports_lock);
cancel = NULL;
/* Build a cancellation list of all RPCs that are:
1) not ourselves AND
2) not already marked for cancellation
'cancel' becomes the head of the cancellation list. */
for (rpc = pi->current_rpcs, prev = NULL; rpc; rpc = rpc->next)
{
if (rpc->thread != self
&& (rpc->cl_next == NULL && rpc->cl_prev == NULL))
{
if (cancel == NULL)
cancel = rpc;
rpc->cl_prev = prev;
if (prev != NULL)
prev->cl_next = rpc;
prev = rpc;
}
}
while (cancel)
{
thread_t thread = cancel->thread;
/* setting 'cancelling' guarantees that the 'cancel' data
structure remains accessible after the _ports_lock is
released and reacquired. */
cancel->cancelling = TRUE;
pthread_mutex_unlock (&_ports_lock);
hurd_thread_cancel (thread);
pthread_mutex_lock (&_ports_lock);
_ports_record_interruption (cancel);
cancel->cancelling = FALSE;
cancel = cancel->cl_next;
/* Don't remove 'cancel' from the cancellation list otherwise
any future calls to this function would add it onto a new
cancellation list resulting in pointless additional
cancellation. The cancellation list pointers are safely
managed when the RPC is completed.*/
pthread_cond_broadcast(&_ports_block);
}
pthread_mutex_unlock (&_ports_lock);
}
Reply to: