Bug#1107785: User-space watchdog timers vs suspend-to-idle
- To: Ben Hutchings <ben@decadent.org.uk>
- Cc: Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, "Rafael J. Wysocki" <rafael@kernel.org>, Len Brown <len.brown@intel.com>, Pavel Machek <pavel@ucw.cz>, Thomas Gleixner <tglx@linutronix.de>, LKML <linux-kernel@vger.kernel.org>, linux-pm@vger.kernel.org, 1107785@bugs.debian.org, Tiffany Yang <ynaffit@google.com>
- Subject: Bug#1107785: User-space watchdog timers vs suspend-to-idle
- From: John Stultz <jstultz@google.com>
- Date: Thu, 10 Jul 2025 15:34:43 -0700
- Message-id: <CANDhNCqK26S7p0nypKOytgvzKUL8CMMr4-JbN-8PkNc=Em6VYA@mail.gmail.com>
- Reply-to: John Stultz <jstultz@google.com>, 1107785@bugs.debian.org
- In-reply-to: <CANDhNCoYPX_5m-v_sR4TJ3Xj5TVtrMLP8Bswo_-_+BMXwWUkjg@mail.gmail.com>
- References: <3cbd9533b091576a62f597691ced375850d7464a.camel@decadent.org.uk> <CANDhNCoYPX_5m-v_sR4TJ3Xj5TVtrMLP8Bswo_-_+BMXwWUkjg@mail.gmail.com> <569c0792-7564-4b85-98e1-e5ea4b8bfb1f@kolahilft.de>
On Thu, Jul 10, 2025 at 2:59 PM John Stultz <jstultz@google.com> wrote:
> On Thu, Jul 10, 2025 at 12:52 PM Ben Hutchings <ben@decadent.org.uk> wrote:
> > There seems to be a longstanding issue with the combination of user-
> > space watchdog timers (using CLOCK_MONOTONIC) and suspend-to-idle. This
> > was reported at <https://bugzilla.kernel.org/show_bug.cgi?id=200595> and
> > more recently at <https://bugs.debian.org/1107785>.
> >
> > During suspend-to-idle the system may be woken by interrupts and the
> > CLOCK_MONOTONIC clock may tick while that happens, but no user-space
> > tasks are allowed to run. So when the system finally exits suspend, a
> > watchdog timer based on CLOCK_MONOTONIC may expire immediately without
> > the task being supervised ever having an opportunity to pet the
> > watchdog.
> >
> > This seems like a hard problem to solve!
>
> So I don't know much about suspend-to-idle, but I'm surprised it's not
> suspending timekeeping! That definitely seems problematic.
Hrm. The docs here seem to call out that timekeeping is supposed to be
suspended in s2idle:
https://docs.kernel.org/admin-guide/pm/sleep-states.html#suspend-to-idle
Looking at enter_s2idle_proper():
https://elixir.bootlin.com/linux/v6.16-rc5/source/drivers/cpuidle/cpuidle.c#L154
We call tick_freeze():
https://elixir.bootlin.com/linux/v6.16-rc5/source/kernel/time/tick-common.c#L524
Which calls timekeeping_suspend() when the last CPU's tick has been frozen.
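
The "last CPU freezes, then timekeeping stops" behaviour described above can be modeled in user space roughly as follows. This is only a simplified sketch, not the kernel source: struct model_state, model_tick_freeze() and model_tick_unfreeze() are hypothetical names invented for the example, and the real code also takes a lock and touches per-CPU tick state that is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_state {
    int online_cpus;            /* CPUs that must freeze their tick */
    int freeze_depth;           /* CPUs that have frozen so far */
    bool timekeeping_suspended; /* set once the last CPU has frozen */
};

/* Model of one CPU freezing its tick on the way into s2idle. */
void model_tick_freeze(struct model_state *s)
{
    assert(s->freeze_depth < s->online_cpus);
    s->freeze_depth++;
    if (s->freeze_depth == s->online_cpus)
        s->timekeeping_suspended = true; /* last CPU in: stop the clock */
    /* Otherwise only this CPU's local tick is considered stopped. */
}

/* Model of one CPU leaving s2idle: the first CPU out resumes timekeeping. */
void model_tick_unfreeze(struct model_state *s)
{
    assert(s->freeze_depth > 0);
    if (s->freeze_depth == s->online_cpus)
        s->timekeeping_suspended = false;
    s->freeze_depth--;
}
```

In this model, if even one of the online CPUs never calls model_tick_freeze(), timekeeping_suspended stays false and time keeps running.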
So it seems like the problem might be that not all of the CPUs are
entering s2idle, so time keeps running?
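
One way to check this from user space might be to compare CLOCK_MONOTONIC against CLOCK_BOOTTIME across a suspend cycle: on Linux, CLOCK_BOOTTIME also counts time during which timekeeping was suspended, while CLOCK_MONOTONIC does not. The sketch below is only an illustration; struct clock_sample and the helper names are made up for the example, and it assumes a Linux system where CLOCK_BOOTTIME is available.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a clock in nanoseconds, or -1 on failure. */
int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* One reading of both clocks, taken back to back. */
struct clock_sample {
    int64_t mono_ns; /* CLOCK_MONOTONIC: stops while timekeeping is suspended */
    int64_t boot_ns; /* CLOCK_BOOTTIME: includes suspended time */
};

/* Fill in a sample; returns 0 on success, -1 if either clock read failed. */
int take_sample(struct clock_sample *out)
{
    out->mono_ns = read_clock_ns(CLOCK_MONOTONIC);
    out->boot_ns = read_clock_ns(CLOCK_BOOTTIME);
    return (out->mono_ns < 0 || out->boot_ns < 0) ? -1 : 0;
}

/* Time counted by CLOCK_BOOTTIME but not by CLOCK_MONOTONIC between samples. */
int64_t suspended_ns(const struct clock_sample *before,
                     const struct clock_sample *after)
{
    return (after->boot_ns - before->boot_ns) -
           (after->mono_ns - before->mono_ns);
}
```

Taking one sample before entering s2idle and one after resume, a suspended_ns() result close to zero for a long sleep would suggest that timekeeping was never actually suspended. Small nonzero values are expected anyway, since the two clocks are read at slightly different instants.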
thanks
-john