Bug#984928: Acknowledgement (slurmctld: fails to start on reboot)
- To: 984928@bugs.debian.org
- Subject: Bug#984928: Acknowledgement (slurmctld: fails to start on reboot)
- From: David Bremner <bremner@debian.org>
- Date: Fri, 06 Aug 2021 11:01:48 -0300
- Message-id: <87czqqvfhv.fsf@tethera.net>
- Reply-to: David Bremner <bremner@debian.org>, 984928@bugs.debian.org
- In-reply-to: <878s682p12.fsf@tethera.net>
- References: <161537798134.1991510.12203898027789323171.reportbug@convex.cs.unb.ca> <handler.984928.B.161537798514834.ack@bugs.debian.org> <878s682p12.fsf@tethera.net> <161537798134.1991510.12203898027789323171.reportbug@convex.cs.unb.ca>
David Bremner <bremner@debian.org> writes:
> As a workaround, I noticed that setting the main ethernet interface to
> "auto" instead of "allow-hotplug" seems to fix the problem. By way of
> confirmation, on a different (virtual) machine changing the "auto" to
> "allow-hotplug" on the main ethernet interface causes the same problem
> to manifest.
>
> This is still a bit mysterious, since the messages complain about
> 127.0.0.1, which is of course the loopback interface, already marked
> "auto", and presumably up pretty early.
I think one underlying problem is that the systemd unit file for
slurmctld is incorrect. The details are in [1], but it seems that
network.target is not the right target to depend on, since it does not
guarantee that any interface is actually configured (I think it is very
rarely a useful target). I added the following drop-in:
# /etc/systemd/system/slurmctld.service.d/override.conf
[Unit]
After=network-online.target munge.service
Wants=network-online.target
This seems to help. I didn't check whether the second mention of
munge.service is really needed.
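
For anyone wanting to reproduce this, here is a minimal sketch of how the
drop-in could be installed and checked. The function names and the
optional directory argument are invented for this example; only the file
contents come from the snippet above. As far as I understand
systemd.unit(5), dependency lists such as After= in a drop-in are added
to those of the packaged unit rather than replacing them, so the merged
output should show both the packaged and the new entries.

```shell
#!/bin/sh
# Sketch only: write the drop-in shown above and ask systemd what it sees.
# Function names and the directory argument are hypothetical.

# write_override [UNIT_DIR]
# Creates UNIT_DIR/slurmctld.service.d/override.conf. UNIT_DIR defaults to
# /etc/systemd/system; pass a scratch directory to try it without root.
write_override() {
    unit_dir="${1:-/etc/systemd/system}"
    mkdir -p "$unit_dir/slurmctld.service.d" || return 1
    cat > "$unit_dir/slurmctld.service.d/override.conf" <<'EOF'
[Unit]
After=network-online.target munge.service
Wants=network-online.target
EOF
}

# show_ordering
# Reloads unit files and prints the merged After=/Wants= lists so the
# effect of the drop-in can be inspected. Needs root and a running systemd.
show_ordering() {
    systemctl daemon-reload || return 1
    systemctl show -p After -p Wants slurmctld.service
}
```

Run as root, "write_override && show_ordering" should list
network-online.target in both properties. "systemctl cat
slurmctld.service" should also print the override file below the
packaged unit.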
I've switched to systemd-networkd on the hosts in question, so I can't
easily test how this works with ifupdown, but I notice ifupdown provides
/lib/systemd/system/ifupdown-wait-online.service
which (guessing based on the name) should provide functionality similar
to the services documented in [1] for NetworkManager and systemd-networkd.
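
A rough way to check this on a given host, sketched under the same
caveat that I have not tested it with ifupdown. The function names are
invented for the example; the NetworkManager and systemd-networkd unit
names are the ones given in [1], and the ifupdown one is the file
mentioned above. As I read [1], network-online.target only waits for
something if one of these services is enabled.

```shell
#!/bin/sh
# Sketch only; function names are hypothetical.

# wait_online_unit STACK
# Prints the "wait online" service belonging to a network stack.
wait_online_unit() {
    case "$1" in
        networkd)       echo systemd-networkd-wait-online.service ;;
        NetworkManager) echo NetworkManager-wait-online.service ;;
        ifupdown)       echo ifupdown-wait-online.service ;;
        *)              echo "unknown network stack: $1" >&2; return 1 ;;
    esac
}

# check_wait_online STACK
# Reports whether that service is enabled. Needs a running systemd.
check_wait_online() {
    unit="$(wait_online_unit "$1")" || return 1
    systemctl is-enabled "$unit"
}
```

For example, "check_wait_online ifupdown" prints the enablement state of
ifupdown-wait-online.service, and "systemctl list-dependencies
network-online.target" shows what the target actually pulls in.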
[1]: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/