
Re: WORKAROUND (longish): was bookworm and network connections



On Sat 02 Sep 2023 at 08:19:56 (-0600), D. R. Evans wrote:
> Starting a new thread so that this doesn't get lost in the postings in
> the original thread.
> 
> The original thread was started at:
>   https://lists.debian.org/debian-user/2023/09/msg00024.html
> 
> That post contains a description of the problem.
> 
> I now have a workaround (although not an explanation) for the problem.
> 
> As I noted in the above thread, once the system was up, I could get
> the networking to function correctly by manually entering the
> commands:
> 
>   sudo nmcli connection down "Wired connection enp11s0(eth0)"
>   sudo nmcli connection up "Wired connection enp11s0(eth0)"

I don't see how those two commands are meant to do anything.
You appear to have two interfaces which connect to two different
things, one (11) to what looks like the internet, and the other (12)
to a LAN. From the choice of address (.1) for 12, it looks as if
this computer is intended to be the gateway for the LAN. So presumably
interface 11 is connected to a modem and your ISP.

What you haven't yet posted is anything telling NM that the modem is
on 11, and it looks as if the system chose/guessed 12 at boot time.
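One way to check that would be to ask NM which interface the profile is
bound to and whether it is set to autoconnect. A minimal, untested
sketch follows; the helper name show_binding is made up for
illustration, and the profile name is the one from your own commands:

```shell
# Print the interface a connection profile is bound to and its autoconnect flag.
# show_binding is a hypothetical helper name, not part of nmcli.
show_binding() {
    # $1 is the profile name, e.g. "Wired connection enp11s0(eth0)"
    nmcli -f connection.id,connection.interface-name,connection.autoconnect \
        connection show "$1"
}
```

Calling show_binding "Wired connection enp11s0(eth0)" should list those
three properties. An empty interface-name would mean the profile is not
tied to any particular device.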

> However, if I put those same nmcli commands in rc.local, the problem
> was not resolved.
> 
> After floundering for a while, and being suspicious that manual
> commands once the system was up were not being treated the same as the
> same commands in rc.local, I tried putting these lines in rc.local
> (whose output I log, so I can see what's happening):
> 
> ----
> nmcli connection down "Wired connection enp11s0(eth0)"
> nmcli
> 
> sleep 10
> 
> nmcli connection up "Wired connection enp11s0(eth0)"
> nmcli
> ----
> 
> These showed that after the first command, everything looked as it
> should, but after the second, everything had reverted to the broken
> state.
> 
> But it was still true that if I entered the same commands manually
> after the system had completed booting, the networking worked.
> 
> But I'm a slow typist, and I wondered if that 10-second pause might be
> too short. So I changed it to 20 seconds.
> 
> And lo! and behold! That worked.
> 
> I tried booting numerous times with a 10-second delay, and also with a
> 20-second delay. The results were consistent. With a 10-second delay,
> the network comes up in an unusable state. With a 20-second delay, it
> comes up in a working state. (Which to me suggests a race condition
> somewhere, but I'll let the developers deal with the exact cause and
> finding a proper fix.)

I've not used NM, but I looked at the man page for nmcli on Arch Linux,
and it says:

 "down [id | uuid | path | apath] ID...
    Deactivate a connection from a device without preventing the
    device from further auto-activation. Multiple connections can be
    passed to the command.

   "Be aware that this command deactivates the specified active
    connection, but the device on which the connection was active, is
    still ready to connect and will perform auto-activation by
    looking for a suitable connection that has the 'autoconnect' flag
    set. Note that the deactivating connection profile is internally
    blocked from autoconnecting again. Hence it will not autoconnect
    until reboot or until the user performs an action that unblocks
    autoconnect, like modifying the profile or explicitly activating
    it.

   "In most cases you may want to use device down command instead."
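Going by that last sentence, the device-level equivalent might look like
the sketch below. I have not tried it. The interface name enp11s0 is my
assumption, inferred from your connection name, and bounce_device is
just an illustrative helper name:

```shell
# Bounce the device itself rather than the connection profile.
# Assumption: the interface is enp11s0 (inferred from the profile name).
bounce_device() {
    iface="${1:-enp11s0}"
    # "disconnect" also stops the device from auto-activating again
    # until something reconnects it explicitly.
    nmcli device disconnect "$iface" &&
    # "connect" asks NM to find and activate a suitable profile.
    nmcli device connect "$iface"
}
```

Whether this behaves any differently at boot from the connection
down/up pair is something only a test on the affected machine would
show.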

I don't know whether you have configured autoconnections, and whether
they would time out, and over what time interval, but NM allegedly
has a logging facility that should carry a record of exactly what
is being tried, if you set the level accordingly.
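Since the trouble happens during boot, the level would have to be set
in NM's configuration rather than at runtime. A rough sketch, assuming
the usual conf.d drop-in directory and a [logging] section; the file
name 95-debug-logging.conf and both helper names are invented for the
example:

```shell
# Write a [logging] drop-in so the debug level is already active at boot.
# The default file name below is a made-up example; run as root.
nm_write_debug_conf() {
    conf="${1:-/etc/NetworkManager/conf.d/95-debug-logging.conf}"
    printf '[logging]\nlevel=DEBUG\ndomains=ALL\n' > "$conf"
}

# Show what NetworkManager logged during the current boot.
nm_boot_log() {
    journalctl -b -u NetworkManager --no-pager
}
```

After a reboot with the drop-in in place, nm_boot_log should show which
profile NM tried on which device, and when. Removing the file again
restores the default level.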

As for the fact that the problem started after an upgrade, well,
something might have changed in NM's approach to guessing, assuming
that guessing is what it has been doing all along. You were just
lucky until now.

> Of course, this is all just a workaround for what appears to be a
> problem during network initialisation in the boot process. But it does
> seem to work.
> 
> I will file a bug report.

The problem could be similar to those in other areas where static
methods have given way to dynamic ones, like using device names
(sdX …) in fstab, or ethN in /etc/network/interfaces, where the
assignments are no longer as predictable as they once were.

Cheers,
David.

