
Bug#991613: DHCPv6 problem in our image: needs "-D LL" when spawning dhclient



Hi Noah,

Thanks for your answer.

On 7/30/21 1:54 AM, Noah Meyerhans wrote:
> Control: severity -1 important
> 
> Please see https://www.debian.org/Bugs/Developer#severities
> 
> On Wed, Jul 28, 2021 at 05:22:43PM +0200, Thomas Goirand wrote:
>> After spawning a VM, it takes a long time to get networking (output from
>> the console):
>>
>> cloud-init[281]: Cloud-init v. 20.2 running 'init-local' at Wed, 28 Jul 2021 07:49:23 +0000. Up 2.98 seconds.
>> Started Initial cloud-init job (pre-networking).
>> Reached target Network (Pre).
>> Starting Raise network interfaces...
>> A start job is running for Raise network interfaces (6s / 5min 1s)
>> A start job is running for Raise network interfaces (7s / 5min 1s)
>> A start job is running for Raise network interfaces (7s / 5min 1s)
>> [...]
>> A start job is running for Raise network interfaces (5min 1s / 5min 1s)
>> Failed to start Raise network interfaces.
>>
>> A systemctl status networking.service shows:
>>
>>    Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
>>    Active: failed (Result: timeout) since Wed 2021-07-28 07:54:23 UTC; 52min ago
>>
>> This is specific to the Debian image. We've compared with Ubuntu 21.04.
> 
> It seems specific to the OpenStack image.  Which OpenStack images are
> you testing?  The FAI-generated images or the others?

The FAI one. I am hoping to get the others stopped for Bullseye.

> 
>> Ubuntu:
>> - Initial boot:
>> 2021-07-28T11:58:50.836457+00:00 pub1-network-3 dnsmasq-dhcp[3765807]: DHCPSOLICIT(tap67fa8c3f-8d) 00:02:00:00:ab:11:11:16:f0:97:0e:c5:c9:b6
>> 2021-07-28T11:58:50.836724+00:00 pub1-network-3 dnsmasq-dhcp[3765807]: DHCPREPLY(tap67fa8c3f-8d) <redacted>::3ba 00:02:00:00:ab:11:11:16:f0:97:0e:c5:c9:b6 host-<redacted>--3ba
>>
>> - Server side:
>> /var/lib/neutron/dhcp/dcf25c41-9057-4bc2-8475-a2e3c5d8c662/host:fa:16:3e:63:54:8c,tag:dhcpv6,host-<redacted>--3ba.dc3-a.pub1.infomaniak.cloud.,[<redacted>::3ba]
>> /var/lib/neutron/dhcp/dcf25c41-9057-4bc2-8475-a2e3c5d8c662/leases:1627559930 3042863103 <redacted>::3ba /host-<redacted>--3ba 00:02:00:00:ab:11:11:16:f0:97:0e:c5:c9:b6
>>
>> Then we do "openstack server rebuild" and get the same result.
> 
> What exactly are the semantics of "openstack server rebuild"?  Is the
> rebuilt host expected to be identical to the original?

It's the same as if you were doing "openstack server delete", then
"openstack server create", except that if you do a rebuild, the VM will
keep some of the attributes that were in the original VM: it will boot
on the same compute, will keep the same Neutron port (with same MAC),
and same IP address. With "openstack server rebuild" you can also pass a
"--image <image-name>" parameter if you wish to change image.

> 
>> Debian:
>> - Initial boot:
>> 2021-07-28T11:59:15.838131+00:00 pub1-network-3 dnsmasq-dhcp[3765807]: DHCPSOLICIT(tap67fa8c3f-8d) 00:01:00:01:28:94:03:11:fa:16:3e:f1:a9:da
>> 2021-07-28T11:59:15.838369+00:00 pub1-network-3 dnsmasq-dhcp[3765807]: DHCPADVERTISE(tap67fa8c3f-8d) <redacted>::143 00:01:00:01:28:94:03:11:fa:16:3e:f1:a9:da host-<redacted>--143
>> 2021-07-28T11:59:16.795826+00:00 pub1-network-3 dnsmasq-dhcp[3765807]: DHCPREQUEST(tap67fa8c3f-8d) 00:01:00:01:28:94:03:11:fa:16:3e:f1:a9:da
>> 2021-07-28T11:59:16.796177+00:00 pub1-network-3 dnsmasq-dhcp[3765807]: DHCPREPLY(tap67fa8c3f-8d) <redacted>::143 00:01:00:01:28:94:03:11:fa:16:3e:f1:a9:da host-<redacted>--143
> 
> These logs are coming from dnsmasq, not dhclient, which isn't installed
> on the FAI-generated images, so I guess you're talking about the images
> generated from openstack-debian-images?  Do our FAI generated images
> exhibit similar symptoms in the same environment?

dnsmasq is what acts as the DHCP server in OpenStack, so I'm not sure
where this came from (I merely copied a Redmine ticket from a
colleague, as I didn't do the investigation myself).
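
For what it's worth, the client identifiers in the dnsmasq log lines
quoted above can be decoded by hand. Here is a minimal sketch, assuming
the DUID layouts from RFC 8415 (type 1 = DUID-LLT, 2 = DUID-EN,
3 = DUID-LL). The function name decode_duid and the dictionary keys are
made up for illustration; they do not come from dnsmasq or dhclient.

```python
import datetime

# DUID type codes as defined in RFC 8415, section 11.
DUID_TYPES = {1: "DUID-LLT", 2: "DUID-EN", 3: "DUID-LL", 4: "DUID-UUID"}

# DUID-LLT timestamps count seconds since 2000-01-01 00:00 UTC.
DUID_EPOCH = datetime.datetime(2000, 1, 1, tzinfo=datetime.timezone.utc)


def decode_duid(text):
    """Decode a colon-separated DUID string (hypothetical helper)."""
    raw = bytes.fromhex(text.replace(":", ""))
    duid_type = int.from_bytes(raw[0:2], "big")
    info = {"type": DUID_TYPES.get(duid_type, "unknown")}
    if duid_type == 1:
        # DUID-LLT: hardware type, creation time, link-layer address.
        info["hw_type"] = int.from_bytes(raw[2:4], "big")
        seconds = int.from_bytes(raw[4:8], "big")
        info["time"] = DUID_EPOCH + datetime.timedelta(seconds=seconds)
        info["lladdr"] = raw[8:].hex(":")
    elif duid_type == 2:
        # DUID-EN: IANA enterprise number, then an opaque identifier.
        info["enterprise"] = int.from_bytes(raw[2:6], "big")
        info["identifier"] = raw[6:].hex(":")
    elif duid_type == 3:
        # DUID-LL: hardware type and link-layer address, no timestamp.
        info["hw_type"] = int.from_bytes(raw[2:4], "big")
        info["lladdr"] = raw[4:].hex(":")
    return info


# The two client identifiers seen in the dnsmasq logs.
print(decode_duid("00:02:00:00:ab:11:11:16:f0:97:0e:c5:c9:b6"))  # Ubuntu
print(decode_duid("00:01:00:01:28:94:03:11:fa:16:3e:f1:a9:da"))  # Debian
```

If I read this right, the Ubuntu client sends a DUID-EN (I believe the
enterprise number is the one systemd uses, but I have not checked),
while the Debian client sends a DUID-LLT, which embeds the time at which
the DUID was generated.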

>> So here, we probably need to get ifupdown to use the -D LL option
>> explicitly, but I'm not sure how to do this... Does ifupdown even have
>> an option for forcing that? It doesn't seem to be the case. :/
>>
>> Any help or comment would be welcome.
> 
> A host (VM or physical) would only normally generate a DUID once, on
> initial launch.

Yes, that's correct. But when you do an "openstack server rebuild", the
VM gets to do the initial launch twice...

> It shouldn't really matter which mechanism it uses to
> choose one except for the purposes of disambiguation in the event of
> interface reuse across installations.

That's exactly what's going on.
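
To illustrate why the rebuild matters, here is a minimal sketch, again
assuming the RFC 8415 layouts: a DUID-LLT contains the generation time,
so two "initial launches" on the same Neutron port yield two different
identifiers, whereas a DUID-LL (what "-D LL" asks dhclient for) depends
only on the MAC address. The helper names and the two timestamps are
invented for the example; the MAC is the one from the Neutron host file
quoted earlier.

```python
import struct


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def duid_llt(mac, seconds_since_2000, hw_type=1):
    """DUID-LLT: type 1, hardware type, 32-bit time, link-layer address."""
    return struct.pack("!HHI", 1, hw_type, seconds_since_2000) + mac_bytes(mac)


def duid_ll(mac, hw_type=1):
    """DUID-LL: type 3, hardware type, link-layer address (no time)."""
    return struct.pack("!HH", 3, hw_type) + mac_bytes(mac)


mac = "fa:16:3e:63:54:8c"        # MAC kept across "openstack server rebuild"
first_boot = 680_000_000         # illustrative timestamps, in seconds
after_rebuild = 680_003_600      # since 2000-01-01 (one hour apart)

# The LLT identifier changes with the generation time...
print(duid_llt(mac, first_boot).hex(":"))
print(duid_llt(mac, after_rebuild).hex(":"))

# ...while the LL identifier is stable as long as the MAC is.
print(duid_ll(mac).hex(":"))
```

With a stable identifier, the rebuilt VM would present the same DUID as
before, which is what I would expect dnsmasq to need in order to hand
out the existing lease again.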

> I think we need more details on exactly what software is involved and
> exactly what issue is wrong.

The problem is, IMO, a non-predictable (non-reproducible) DUID, which
probably confuses dnsmasq. If there's a solution on the dnsmasq side, or
at the OpenStack deployment level, that'd be fine with me as well: we
would just need to document it.

Cheers,

Thomas Goirand (zigo)

