[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Deadlock with SIGALRM handling



On 22/06/2025 23:59, Samuel Thibault wrote:
I made a small alteration to unlock and relock the _hurd_itimer_lock when
the _hurd_siglock is required within setitimer_locked. There is no state
within the function that is invalidated by the unlock/lock
Yes, there is. For instance, restart_itimer copies
_hurd_itimerval.it_interval into `it', it expects that nothing modifies
_hurd_itimerval in between so the timing computation is respected.
My suggestion was indeed poor.
I'd say rather make callers of setitimer_locked just always take
_hurd_siglock first (or not for restart_itimer since it already has it)
and drop the hurd_siglocked parameter. It seems much simpler to take
the mutex spuriously in the setitimer(0) case than trying to fix this
double-lock issue.

I have an alternative proposal which reduces the length of time that _hurd_siglock is held. I've moved the testing and insertion of the signal preemptor to occur before the call to setitimer_locked takes place. The existing code installs the preemptor when a non-zero timeout is required and that only occurs in 2 places: the calls to __setitimer and fork_itimer. The preemptor is never removed so there is no need to check whether it needs to be installed within restart_itimer. Perhaps it should be removed when no timer is required?

I've attached my suggested patch. I expect that this addresses the deadlock issue that I noticed since it has run my stress-ng test for 3 hours without it occurring. Without the patch it would previously normally occur within 10 minutes. I did however see 2 instances of a separate deadlock involving thread_terminate/thread_abort that I have reported before. I don't have a full understanding of that one yet.

Samuel, I did try your suggestion of locking _hurd_siglock before each call to setitimer_locked but that implementation soon deadlocked. I can't remember for sure, but it was another lock that was involved, possibly _hurd_sigstate_lock. I didn't spend long looking at that because I had the idea for this patch. If this patch turns out as another blunder, then I'll reexamine what I did to implement your suggestion to see if the mistake was mine or elsewhere.

Mike.

--- setitimer.c.orig	2025-06-24 12:57:35.905712238 +0100
+++ setitimer.c	2025-06-24 14:22:04.717827650 +0100
@@ -131,8 +131,7 @@
 
 /* Forward declaration.  */
 static int setitimer_locked (const struct itimerval *new,
-			     struct itimerval *old, void *crit,
-			     int hurd_siglocked);
+			     struct itimerval *old, void *crit);
 
 static sighandler_t
 restart_itimer (struct hurd_signal_preemptor *preemptor,
@@ -146,7 +145,7 @@
   /* Either reload or disable the itimer.  */
   __spin_lock (&_hurd_itimer_lock);
   it.it_value = it.it_interval = _hurd_itimerval.it_interval;
-  setitimer_locked (&it, NULL, NULL, 1);
+  setitimer_locked (&it, NULL, NULL);
 
   /* Continue with normal delivery (or hold, etc.) of SIGALRM.  */
   return SIG_ERR;
@@ -158,7 +157,7 @@
 
 static int
 setitimer_locked (const struct itimerval *new, struct itimerval *old,
-		  void *crit, int hurd_siglocked)
+		  void *crit)
 {
   struct itimerval newval;
   struct timeval now, remaining, elapsed;
@@ -192,24 +191,6 @@
     {
       /* Make sure the itimer thread is set up.  */
 
-      /* Set up a signal preemptor global for all threads to
-	 run `restart_itimer' each time a SIGALRM would arrive.  */
-      static struct hurd_signal_preemptor preemptor =
-	{
-	  __sigmask (SIGALRM), SI_TIMER, SI_TIMER,
-	  &restart_itimer,
-	};
-      if (!hurd_siglocked)
-	__mutex_lock (&_hurd_siglock);
-      if (! preemptor.next && _hurdsig_preemptors != &preemptor)
-	{
-	  preemptor.next = _hurdsig_preemptors;
-	  _hurdsig_preemptors = &preemptor;
-	  _hurdsig_preempted_set |= preemptor.signals;
-	}
-      if (!hurd_siglocked)
-	__mutex_unlock (&_hurd_siglock);
-
       if (_hurd_itimer_port == MACH_PORT_NULL)
 	{
 	  /* Allocate a receive right that the itimer thread will
@@ -332,6 +313,32 @@
   return __hurd_fail (err);
 }
 
+/* Must be called with _hurd_siglock and _hurd_itimer_lock in the
+   locked state. It is important to acquire the _hurd_siglock
+   before the _hurd_itimer_lock. */
+static void
+install_preemptor(const struct itimerval *new)
+{
+  if (new != NULL &&
+      ((new->it_value.tv_sec | new->it_value.tv_usec) != 0))
+    {
+      /* Set up a signal preemptor global for all threads to
+	 run `restart_itimer' each time a SIGALRM would arrive.  */
+      static struct hurd_signal_preemptor preemptor =
+	{
+	  __sigmask (SIGALRM), SI_TIMER, SI_TIMER,
+	  &restart_itimer,
+	};
+
+      if (! preemptor.next && _hurdsig_preemptors != &preemptor)
+	{
+	  preemptor.next = _hurdsig_preemptors;
+	  _hurdsig_preemptors = &preemptor;
+	  _hurdsig_preempted_set |= preemptor.signals;
+	}
+    }
+}
+
 /* Set the timer WHICH to *NEW.  If OLD is not NULL,
    set *OLD to the old value of timer WHICH.
    Returns 0 on success, -1 on errors.  */
@@ -357,8 +364,12 @@
 
 retry:
   crit = _hurd_critical_section_lock ();
+  __mutex_lock (&_hurd_siglock);
   __spin_lock (&_hurd_itimer_lock);
-  ret = setitimer_locked (new, old, crit, 0);
+  install_preemptor(new);
+  __mutex_unlock (&_hurd_siglock);
+
+  ret = setitimer_locked (new, old, crit);
   if (ret == -1 && errno == EINTR)
     /* Got a signal while inside an RPC of the critical section, retry again */
     goto retry;
@@ -373,12 +384,15 @@
 
   struct itimerval it;
 
+  __mutex_lock (&_hurd_siglock);
   __spin_lock (&_hurd_itimer_lock);
   _hurd_itimer_thread = MACH_PORT_NULL;
   it = _hurd_itimerval;
   it.it_value = it.it_interval;
+  install_preemptor(&it);
+  __mutex_unlock (&_hurd_siglock);
 
-  setitimer_locked (&it, NULL, NULL, 0);
+  setitimer_locked (&it, NULL, NULL);
 
   (void) &fork_itimer;		/* Avoid gcc optimizing out the function.  */
 }

Reply to: