Re: Ethernet interface numbering in etch

To: debian-devel@lists.debian.org
Subject: Re: Ethernet interface numbering in etch
From: Nathanael Nerode <neroden@fastmail.fm>
Date: Tue, 27 Mar 2007 00:06:09 -0400
Message-id: <20070327040609.GA4571@doctormoo.dyndns.org>
In-reply-to: <20070326220644.GG2930@borges.dodds.net>

After writing a very long message, I realized that there was a much simpler 
solution, so if you want to cut to the chase, skip to the end!

Steve Langasek wrote:
>Which do you think is the common case -- a system with more than one network
>interface where it's necessary to preserve interface ordering across
>reboots, or a system where the admin will frequently change out the network
>hardware and need to reuse the same interface names?

I would guess the second, or more specifically I would guess that in any given 
month,
 - (the total number of instances of admins swapping out network hardware and 
    needing to reuse the same interface names)
is greater than
 - (the total number of instances of admins setting up new systems with more than 
    one interface where it's necessary to preserve interface ordering across 
    reboots)

Specifically because:
* Most machines have only one interface.  (If Debian is running on more routers 
than workstations, obviously this would be wrong, but I doubt that's the case.)
* Lots of hardware is crummy and needs to be replaced at least once in a box's 
lifetime.

Also, there's another argument for defaulting to case two.
* Setup for case one only needs to be done (worst case) at the 
first reboot, and can be documented in installation notes which users with two 
network cards will probably read; setup for case two must be done (usually)
at the time of hardware malfunction and replacement, which is generally already
a stressful time, and the admin will have no hint as to where to look for the 
problem.

>In the absence of kernel guarantees about device ordering (as is the case
>with 2.6), the current udev implementation gives you the first at the
>expense of the second.  I believe this correctly optimizes for the common
>case.

I suspect that it incorrectly optimizes for the less common case.

----
Of course, it would be ideal if a compromise which handles both cases could be 
found.  The proposed solution at the top of thread is pretty good:

(1) Hardware which exists is assigned a static name, and keeps it;
(2) When the hardware stops existing, the name is released for reuse;
(3) A new piece of hardware will reuse the first released name.

This optimizes both the one-interface-replaced case (it's always eth0), and 
the typical multiple-interfaces-at-once case (they're stably named).

It wouldn't handle cases with two or more interfaces genuinely, physically 
hotplugged after boot *and* requiring stable names, but that's a genuinely rare 
case and anyone with that complicated a system should be writing custom rules.
(Most systems with large numbers of interfaces are routers, which either don't 
have swappable interfaces or have ones which are essentially "case two", 
hardware replacement.  Most other systems with two interfaces have at most one of 
them on a PCMCIA card or similar swappable device, and I've never heard of one with two 
different PCMCIA cards which need to retain consistent names distinct from 
each other.)

To be more specific about how this would work, network interface naming would
be a two-stage process, with all "new" interfaces delayed until after all "old" 
interfaces were believed to be up.
(1) When a "new" interface shows up which doesn't have a static name assigned yet,
    delay naming it until you believe all the "old" interfaces which do have 
    a static name assigned are up.
(2) When all the interfaces with static names assigned are believed to be up 
    (enough time has passed), assign static names to the "new" interfaces, 
    starting with the first unused one, even if it was previously assigned to 
    some interface which no longer exists (didn't come up).

Essentially udev would start with a list of "old" interfaces "known before this 
boot"; it would delay dealing with "new" interfaces until either all "old" 
interfaces were up, or a predefined time delay had passed.  No interface would be 
"new" more than once.  After a while multiple interfaces might end up with the 
same "stable" name assigned, but it would be very unlikely that they were 
interfaces which were actually used at the same time; generally one would be 
'active' and the others would be obsolete; alternatively, in the case of a system 
with several alternate PCMCIA cards, each one would have the same name but only 
one would be used at any given time.
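To make the two-stage process concrete, here is a minimal sketch in Python.  It 
is for illustration only: the real logic would have to live in udev's helper 
scripts, and the function name, the "known" table and the hardware ids are all 
hypothetical.  It assumes the delay has already expired, so "present" is simply 
the list of hardware seen so far, in order of appearance.

```python
def assign_names(known, present, prefix="eth"):
    """Return {hardware_id: name} for the interfaces present at this boot.

    known   -- names recorded on earlier boots, {hardware_id: name};
               updated in place (hypothetical stand-in for the rules file)
    present -- hardware ids seen before the timeout, in order of appearance
    """
    # Stage 1: "old" interfaces that came up keep their recorded names.
    active = {hw: known[hw] for hw in present if hw in known}
    in_use = set(active.values())

    # Stage 2: "new" interfaces take the first name that no present
    # interface holds, even if that name is still recorded for hardware
    # which did not come up.
    for hw in present:
        if hw in known:
            continue
        n = 0
        while f"{prefix}{n}" in in_use:
            n += 1
        name = f"{prefix}{n}"
        known[hw] = name   # recorded once; this interface is never "new" again
        active[hw] = name
        in_use.add(name)
    return active


# Card "aa" (eth0) was replaced by card "cc"; card "bb" (eth1) is still there.
known = {"aa": "eth0", "bb": "eth1"}
print(assign_names(known, ["bb", "cc"]))
```

In this example the replacement card takes over eth0 while eth1 stays put, and 
the table afterwards lists both the old and the new card under eth0, which is 
the "multiple interfaces with the same stable name" situation described above.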

The proposed solution above is pretty complicated (if straightforward) to 
implement, requiring some convoluted code in z45_persistent_net_generator and 
write_net_rules to test whether the interfaces with the persistent names were 
actually found and used (rather than merely whether a rule was written), and a 
delay, loop, and timeout if they weren't all found, and a fancier implementation 
of find_next_available to reuse abandoned rules.

Perhaps one could call it "semi_persistent_net_generator" -- the algorithm would 
mean that a name is persistent until you remove that hardware *and* insert new 
hardware.

Marco D'Itri wrote "Think harder about it and you will understand why this cannot 
be tested in practice," but of course that's bullshit.  It's perfectly 
implementable and testable -- for testing, the delay and timeout could be set 
quite long.  It's not perfectly *reliable*, but it's just as
reliable as anything which depends on udev "finishing" setting up /dev, and 
we've been able to handle that (gobs and gobs of stuff depends on that, and 
we've managed to live with it).

Implementation is complicated enough that I wouldn't ask anyone else to do it.
(And it's easier to just work around it than to implement it myself.)
However, if someone *does* implement semi_persistent_net_generator, it would 
probably be a better default than the current scripts.

--------- 

But perhaps the best "solution" is to document prominently that if you replace 
your network hardware, you should delete the line associated with the removed 
hardware from /etc/udev/rules.d/z25_persistent-net.rules before inserting the new 
hardware.  This would almost always give exactly the desired result, that the new
hardware would assume the name of the old hardware.
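
For anyone who would rather script that step than edit the file by hand, a 
rough sketch in Python follows.  It assumes that each rule sits on a single 
line and carries a NAME="ethN" key; the helper name and the sample rule text 
are made up for illustration, so check them against your own rules file first.

```python
def drop_rule(rules_text, name):
    """Return rules_text without the rule line that assigns NAME="<name>".

    Comment lines are always kept, even if they mention the name.
    """
    needle = f'NAME="{name}"'
    kept = [
        line
        for line in rules_text.splitlines(keepends=True)
        if line.lstrip().startswith("#") or needle not in line
    ]
    return "".join(kept)


# Illustrative rules text; the keys and addresses are placeholders.
sample = (
    "# old card (placeholder comment)\n"
    'SUBSYSTEM=="net", ATTRS{address}=="00:00:00:00:00:01", NAME="eth0"\n'
    'SUBSYSTEM=="net", ATTRS{address}=="00:00:00:00:00:02", NAME="eth1"\n'
)
print(drop_rule(sample, "eth0"))
```

One would read the rules file, pass its contents through drop_rule, and write 
the result back (as root) before inserting the new hardware.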

Actually, the same caveat should be documented with regard to 
z25_persistent-cd.rules.  I've had to swap out CD drives disturbingly often 
(dust, I think....), and thankfully I haven't yet had to do it with a machine 
running udev, because this would have bitten me as I wondered why the CD numbers 
kept going up and up and up.

-- 
Nathanael Nerode  <neroden@fastmail.fm>

"(Instead, we front-load the flamewars and grudges in
the interest of efficiency.)" --Steve Langasek,
http://lists.debian.org/debian-devel/2005/09/msg01056.html
