
Re: networking.service fails



Dear Debian community,

First of all, many thanks to everybody who replied to my message and
contributed ideas to solve the issue.

On 3 Apr 2022 22:41:44, Greg Wooledge wrote:
> So... you've got some wild stuff going on here.
>
> You have two interfaces.  One of them was named enp2s0 and then got
> renamed to eth0, which is usually the opposite direction from what one
> expects.
>
> The other was named enp3s0 and then got renamed to eth1.  Which, again,
> is the opposite of what one expects.
>
> Normally, the kernel assigns the name eth0 temporarily to the first
> interface that it finds, and then renames it to some "predictable"
> name according to obtuse schemes.  In your case, I'm imagining that
> your interface started as eth0, then got renamed to enp2s0, then renamed
> a second time back to eth0.
>
> At the time when ifup tried to run, the interface was named enp2s0, but
> your configuration tried to address it by the name eth0.
>
> So, there are many questions here.
>
> How many different interface configuration schemes are you using here?
> Which ones are they?
>
> a) Do you set the net.ifnames kernel parameter?  If so, what value are
>    you setting it to?

No, in my initial setup I didn't have this parameter.

> b) Do you have any non-loopback interfaces configured in the
>    /etc/network/interfaces file? If so, which ones, and how are
>    they configured?

The configuration is trivial: it adds both eth0 and eth1 to the bridge br0.

=== cut /etc/network/interfaces ===
auto lo
auto eth0
auto eth1

iface lo inet loopback

auto br0
iface br0 inet static
        address 10.0.1.100
        gateway 10.0.1.1
        netmask 255.0.0.0
        bridge_ports eth0 eth1
        bridge_maxwait 60
=== cut ===
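
One thing that stands out in the file above: eth0 and eth1 appear on
"auto" lines but have no "iface" stanza of their own. As far as I
understand ifupdown, a name that is marked auto without a matching iface
definition is reported as an "unknown interface"; whether that is the
cause of the failure here is an assumption, not a confirmed diagnosis.
A minimal Python sketch to list such names (the function name
auto_without_iface is made up for illustration, and the parser only
handles the simple subset of interfaces(5) used above):

```python
def auto_without_iface(text):
    """Return names from `auto`/`allow-auto` lines that have no `iface` stanza."""
    auto, defined = [], set()
    for raw in text.splitlines():
        line = raw.strip()
        # interfaces(5) treats lines starting with '#' as comments.
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] in ("auto", "allow-auto"):
            auto.extend(words[1:])
        elif words[0] == "iface" and len(words) > 1:
            defined.add(words[1])
    return [name for name in auto if name not in defined]


# The configuration quoted above, copied verbatim.
CONFIG = """\
auto lo
auto eth0
auto eth1

iface lo inet loopback

auto br0
iface br0 inet static
        address 10.0.1.100
        gateway 10.0.1.1
        netmask 255.0.0.0
        bridge_ports eth0 eth1
        bridge_maxwait 60
"""

print(auto_without_iface(CONFIG))
```

On a live system one would pass the contents of /etc/network/interfaces
instead of the CONFIG string; files pulled in via "source" lines are not
followed by this sketch.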

> c) Do you have an /etc/udev/rules.d/70-persistent-net.rules file?  Note
>    that this file was deprecated in buster.  If it continued to work in
>    buster, that was just a happy accident.  If it stopped working in
>    bullseye, that is not a surprise.

In my initial setup I had this file.

> d) Do you have any /etc/systemd/network/*.link files?  If so, what's in
>    them?

No, no files in that directory.

> e) Do you have network-manager installed?  If so, I don't know what
>    followup questions to ask about it, because I don't use it.

No, I don't have network-manager installed.

> f) Do you have any /etc/systemd/network/*.network files?  If so, what's
>    in those?

No, no files in that directory.

> g) Do you have any *other* interface configuration schemes in play that
>    I don't know about?

Nothing else that I can think of.

> Once you've identified all of the moving parts in this puzzle, then you
> can figure out which ones you actually want to keep, and which ones
> you should rip out.
>
> For comparison, my system (Debian 11 bullseye) has one interface.  I
> configure its name using a file in /etc/systemd/network/ which sets
> the name to lan0.  Then I configure it in /etc/network/interfaces
> using that name.
>
> Some relevant log file lines:
>
> Mar 26 08:01:54 unicorn kernel: [    1.042579] r8169 0000:02:00.0 eth0: RTL8168gu/8111gu, 18:60:24:77:5c:ec, XID 509, IRQ 127
> [...]
> Mar 26 08:01:54 unicorn kernel: [    1.057022] r8169 0000:02:00.0 lan0: renamed from eth0
> [...]
> Mar 26 08:01:59 unicorn kernel: [   20.372606] r8169 0000:02:00.0 lan0: Link is Up - 100Mbps/Full - flow control rx/tx
>
> That's just one of many possible ways to configure network interfaces.
> You might be using a different scheme.  The important thing is that
> you pick *one* scheme and set it up correctly.  If you've got two or
> more competing schemes in play, and they're undoing each other's work,
> that's not desirable.
>
> It's also worth pointing out that /etc/udev/rules.d/70-persistent-net.rules
> was deprecated in buster (Debian 10).  According to the release notes,
> it *may* work, or it may not.  Users were instructed to migrate away
> from it.
>
> It would not surprise me one bit if there's a race condition which causes
> the renaming done by 70-persistent-net.rules to occur at the wrong time,
> if it even happens at all.

Greg, thanks for your remarks and questions – I have learned something new
while looking here and there.

On 4 Apr 2022 08:50:42, Reco wrote:
> As /var/log/messages helpfully shows, your udev rules work.
> The problem is, next thing udev does is renaming your network interfaces
> back to (Un)Predictable Naming™ scheme.
> 
> Thus whatever stanzas you have in your interfaces(5) about eth0 and eth1
> fail, thus the whole networking.service fails.
> 
> 
> The conclusion is simple too:
> 
> 1) Remove 70-persistent-net.rules, it's not doing what it should anyway.
> 
> 2) Either use (Un)Predictable Network names in your interfaces, such as
> enp2s0 and enp3s0.
> 
> 3) Or use systemd network link files to rename network interfaces.
> 
> 4) Or add "net.ifnames=0" to kernel's cmdline, as others suggested.
> 
> Reco

Reco, I have applied (1) and (4). Specifically, I:

* Added net.ifnames=0 as kernel parameter.
* Removed /etc/udev/rules.d/70-persistent-net.rules
* Rebooted.
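
To double-check that the parameter really reached the kernel, the
command line can be inspected after boot. A small Python sketch,
assuming the command line is available as a string (the value below is
copied from the boot log further down in this message; on a running
system it could be read from /proc/cmdline). The helper name
kernel_param is made up for illustration:

```python
def kernel_param(cmdline, name):
    """Return the value of `name`, "" for a bare flag, or None if absent."""
    value = None
    for token in cmdline.split():
        key, _sep, val = token.partition("=")
        if key == name:
            # If the parameter is repeated, this sketch keeps the last one.
            value = val
    return value


# Copied from the "Command line:" entry in the boot log of this message.
CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-4.19.0-14-amd64 root=/dev/sdc1 ro 3 quiet "
    "net.ifnames=0"
)

print("net.ifnames =", kernel_param(CMDLINE, "net.ifnames"))
```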

Unfortunately after reboot the "issue" is still there :(

# systemctl status networking.service
* networking.service - Raise network interfaces
     Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sun 2022-04-10 12:37:14 CEST; 12h ago
       Docs: man:interfaces(5)
    Process: 966 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
   Main PID: 966 (code=exited, status=1/FAILURE)
        CPU: 81ms

Apr 10 12:37:13 debian systemd[1]: Starting Raise network interfaces...
Apr 10 12:37:13 debian ifup[966]: ifup: unknown interface eth0
Apr 10 12:37:13 debian ifup[966]: ifup: unknown interface eth1
Apr 10 12:37:14 debian systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
Apr 10 12:37:14 debian systemd[1]: networking.service: Failed with result 'exit-code'.
Apr 10 12:37:14 debian systemd[1]: Failed to start Raise network interfaces.

Here comes a combined log. I've added sub-second timestamps so that we
can be sure which event comes first (that was an issue in my original
report):

2022-04-10 12:37:13.488243 debian kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.0-14-amd64 root=/dev/sdc1 ro 3 quiet net.ifnames=0
2022-04-10 12:37:13.488284 debian systemd[1]: Starting Helper to synchronize boot up for ifupdown...
2022-04-10 12:37:13.488289 debian sh[299]: ifquery: unknown interface eth0
2022-04-10 12:37:13.488294 debian sh[299]: ifquery: unknown interface eth1
2022-04-10 12:37:13.488542 debian systemd[1]: Starting Raise network interfaces...
2022-04-10 12:37:13.488824 debian ifup[966]: ifup: unknown interface eth0
2022-04-10 12:37:13.488828 debian ifup[966]: ifup: unknown interface eth1
2022-04-10 12:37:13.488959 debian kernel: [    1.062952] r8169 0000:02:00.0: can't disable ASPM; OS doesn't have ASPM control
2022-04-10 12:37:13.488968 debian kernel: [    1.078567] r8169 0000:02:00.0 eth0: RTL8168g/8111g, 00:17:e8:92:b7:77, XID 4c0, IRQ 29
2022-04-10 12:37:13.488968 debian kernel: [    1.078586] r8169 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control
2022-04-10 12:37:13.489038 debian kernel: [    1.094570] r8169 0000:03:00.0 eth1: RTL8168g/8111g, 00:17:20:53:44:58, XID 4c0, IRQ 32
2022-04-10 12:37:13.529811 debian kernel: [   15.877077] br0: port 1(eth0) entered blocking state
2022-04-10 12:37:13.529821 debian kernel: [   15.877079] br0: port 1(eth0) entered disabled state
2022-04-10 12:37:13.529822 debian kernel: [   15.877131] device eth0 entered promiscuous mode
2022-04-10 12:37:13.561822 debian kernel: [   15.911538] r8169 0000:02:00.0: firmware: direct-loading firmware rtl_nic/rtl8168g-2.fw
2022-04-10 12:37:13.589806 debian kernel: [   15.936881] Generic FE-GE Realtek PHY r8169-0-200:00: attached PHY driver [Generic FE-GE Realtek PHY] (mii_bus:phy_addr=r8169-0-200:00, irq=IGNORE)
2022-04-10 12:37:13.797798 debian kernel: [   16.145273] r8169 0000:02:00.0 eth0: Link is Down
2022-04-10 12:37:13.797814 debian kernel: [   16.145714] br0: port 2(eth1) entered blocking state
2022-04-10 12:37:13.797815 debian kernel: [   16.145715] br0: port 2(eth1) entered disabled state
2022-04-10 12:37:13.797816 debian kernel: [   16.145764] device eth1 entered promiscuous mode
2022-04-10 12:37:13.829809 debian kernel: [   16.176910] Generic FE-GE Realtek PHY r8169-0-300:00: attached PHY driver [Generic FE-GE Realtek PHY] (mii_bus:phy_addr=r8169-0-300:00, irq=IGNORE)
2022-04-10 12:37:14.029829 debian kernel: [   16.377000] r8169 0000:03:00.0 eth1: Link is Down
2022-04-10 12:37:14.029842 debian kernel: [   16.378479] br0: port 2(eth1) entered blocking state
2022-04-10 12:37:14.029843 debian kernel: [   16.378481] br0: port 2(eth1) entered forwarding state
2022-04-10 12:37:14.029843 debian kernel: [   16.378488] br0: port 1(eth0) entered blocking state
2022-04-10 12:37:14.029844 debian kernel: [   16.378490] br0: port 1(eth0) entered forwarding state
2022-04-10 12:37:14.279635 debian systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
2022-04-10 12:37:14.279757 debian systemd[1]: networking.service: Failed with result 'exit-code'.
2022-04-10 12:37:14.279847 debian systemd[1]: Failed to start Raise network interfaces.
2022-04-10 12:37:14.513825 debian kernel: [   16.860939] br0: port 1(eth0) entered disabled state
2022-04-10 12:37:14.513836 debian kernel: [   16.861081] br0: port 2(eth1) entered disabled state
2022-04-10 12:37:16.289779 debian kernel: [   18.637162] r8169 0000:02:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx

What confuses me is that ifup is triggered before the interface is
started – is that expected? Should systemd retry after interfaces are
up? At the end of the day, br0 is created and functions just fine, so
the system works, but networking.service is marked as failed.
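
One way to cross-check the ordering is to compare the wall-clock stamp
of each kernel line with the seconds-since-boot value in square
brackets. Subtracting one from the other gives an implied boot time,
which should be the same for every line if the wall-clock stamps are
trustworthy. The Python sketch below does this for three lines copied
from the log above; the function name implied_boot_time is made up for
illustration. My assumption, not something the log itself proves, is
that early kernel messages get their wall-clock stamp when the logger
first reads them rather than when they happened, in which case the
bracketed values are the ones to trust.

```python
import re
from datetime import datetime, timedelta

# Matches "<date> <time> <host> kernel: [<seconds since boot>] ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) \S+ kernel: "
    r"\[\s*(\d+\.\d+)\]"
)


def implied_boot_time(line):
    """Wall-clock stamp minus kernel uptime; None for non-kernel lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    wall = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return wall - timedelta(seconds=float(m.group(2)))


# Three kernel lines copied from the combined log above.
SAMPLE = [
    "2022-04-10 12:37:13.488968 debian kernel: [    1.078567] "
    "r8169 0000:02:00.0 eth0: RTL8168g/8111g, 00:17:e8:92:b7:77, XID 4c0, IRQ 29",
    "2022-04-10 12:37:13.529811 debian kernel: [   15.877077] "
    "br0: port 1(eth0) entered blocking state",
    "2022-04-10 12:37:16.289779 debian kernel: [   18.637162] "
    "r8169 0000:02:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx",
]

for line in SAMPLE:
    print(implied_boot_time(line), "<-", line[:26])
```

For the second and third lines the implied boot time agrees to well
under a millisecond, while the first line (where eth0 is detected) is
off by more than 14 seconds. If the assumption above holds, eth0 and
eth1 existed at about 1 second after boot, long before ifup ran, so the
wall-clock order of those lines would not show that ifup came first.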

Thanks for any further ideas.

-- 
With best regards,
Dmitry

