Bug#863580: cloud.debian.org: Vagrant boxes randomly fail to come up when additional networks are configured



Package: cloud.debian.org
Severity: important

Dear Maintainer,

When additional networks are configured in the Vagrantfile, Vagrant
boxes randomly fail to come up. The issue occurs with both
debian/jessie64 and debian/contrib-jessie64.

Reproduction steps:

1. Install the latest versions of Vagrant and VirtualBox. The issue
   should be reproducible with older versions too.

2. Create a simple Vagrantfile with the following content:

----%----
# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|

  config.vm.box = "debian/contrib-jessie64"

  config.vm.define "machine" do |machine|
    machine.vm.hostname = "machine"
    machine.vm.network "private_network", ip: "10.64.128.10"
  end

end
----%----

3. Bring up and destroy the Vagrant machine repeatedly (several
   repetitions may be needed to reproduce the issue):

vagrant up && vagrant destroy -f
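
Since several repetitions may be needed, the loop above can be
scripted. The following is only a minimal sketch in Ruby, not a tested
tool: the method name first_failing_attempt and the injectable runner
are hypothetical helpers, not part of Vagrant. It stops at the first
failed "vagrant up" and skips the destroy so the broken machine can be
inspected.

```ruby
# Repeat "vagrant up && vagrant destroy -f" until "vagrant up" fails.
# Returns the 1-based number of the first failing attempt, or nil if
# every attempt succeeded. The runner is injectable (a hypothetical
# design choice) so the loop can be exercised without Vagrant.
def first_failing_attempt(max_attempts, runner: ->(cmd) { system(cmd) })
  (1..max_attempts).each do |attempt|
    # On failure, return before destroying so the machine stays around.
    return attempt unless runner.call("vagrant up")

    runner.call("vagrant destroy -f")
  end
  nil
end
```

Calling first_failing_attempt(20) from the directory that holds the
Vagrantfile would run at most 20 up/destroy cycles; the bound of 20 is
arbitrary.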


Expected results:

1. Step (3) of reproduction steps always succeeds.


Actual results:

1. Step (3) of reproduction steps intermittently fails, with error:

----%----
    machine: SSH auth method: private key
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.

If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.

If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.

If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
----%----


Additional information:

Host environment is Gentoo x86_64, VirtualBox 5.1.18, Vagrant 1.9.5,
debian/contrib-jessie64 8.7.0.

Some testing was also done with the bento/debian-8.7 base box, and the
issue could not be reproduced there. After some troubleshooting, it
turned out that with the debian/jessie64 box the network interfaces
eth0 and eth1 may "swap places": instead of the default (NAT) adapter
being eth0 and the private adapter being eth1, the default adapter
becomes available as eth1 and the private adapter as eth0. Vagrant
then cannot connect, because /etc/network/interfaces assumes that eth0
is the NAT adapter which gets its address from the built-in DHCP, so
the NAT adapter is left with no IP assigned.

After comparing the files in the two boxes, it turned out that
debian/contrib-jessie64 defines a different network adapter type for
the first network adapter than for the remaining ones (82540EM vs
Am79C973). This probably results in somewhat random ordering in the VM
itself when the adapters are being named. The relevant file where this
is defined is "box.ovf".

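To check a box for mixed adapter types without reading box.ovf by
hand, something like the following Ruby sketch may help. It assumes
the adapters are declared as <Adapter slot="N" ... type="MODEL">
elements; that is an assumption about the file layout, not something
verified for every box version. Both method names are hypothetical.

```ruby
# Map each adapter slot number to its declared adapter type.
# Assumption: box.ovf declares adapters as
#   <Adapter slot="0" ... type="82540EM" ...>
def adapter_types(ovf_xml)
  ovf_xml.scan(/<Adapter\b[^>]*>/).each_with_object({}) do |tag, types|
    slot = tag[/\bslot="(\d+)"/, 1]
    type = tag[/\btype="([^"]+)"/, 1]
    # Skip tags that lack either attribute instead of guessing.
    types[slot.to_i] = type if slot && type
  end
end

# True when the box declares more than one distinct adapter type.
def mixed_adapter_types?(ovf_xml)
  adapter_types(ovf_xml).values.uniq.size > 1
end
```

For example, adapter_types(File.read("box.ovf")) returns a hash of
slot numbers to adapter types for the box.ovf in the current
directory.
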
One can work around this issue by defining the private network as:

machine.vm.network "private_network", ip: "10.64.128.10", nic_type: "82540EM"

The best way, however, would be to fix this in the base box image
instead, similar to what the Bento box does (all the network
interfaces, even unused ones, have 82540EM specified as the adapter
type). This is especially true because of how hard the issue is to
figure out (it took me a while to realise what was happening).

An alternative would be to introduce udev rules that ensure the NAT
network adapter always gets the same name based on its MAC address.
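
As a rough illustration of what such a rule could look like, the Ruby
sketch below builds one rule line from a MAC address written without
separators (e.g. 080027ABCDEF). The method name udev_net_rule and the
example MAC are hypothetical, and the rule itself is untested:
whether udev can rename an interface to a name currently held by the
other adapter may depend on the udev version.

```ruby
# Build a udev rule line that pins an interface name to a MAC address.
# Accepts the MAC with or without ":" / "-" separators and emits it in
# the lowercase colon-separated form used by the sysfs "address"
# attribute.
def udev_net_rule(mac, name)
  hex = mac.delete(":-").downcase
  unless hex.match?(/\A\h{12}\z/)
    raise ArgumentError, "expected 12 hex digits, got #{mac.inspect}"
  end

  address = hex.scan(/../).join(":")
  %(SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="#{address}", NAME="#{name}")
end
```

The resulting line would go into a rules file under
/etc/udev/rules.d/ inside the box image.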


-- System Information:
Debian Release: 8.7
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

