
Bug#677475: Same problem with SunFire X4140 reproduced independently



Hey everyone, just wanted to mention that we reproduced this problem on all three of our SunFire X4140s as well.

We had Solaris on them, but decided to switch them to Debian 7.1.0 to run KVM, as SmartOS doesn't support AMD virtualization hardware (yet).

Here's how easy it was to reproduce:

1) Plug in an Ethernet cable with Internet access and DHCP into the first port (eth0).

2) Install Debian using a boot CD made from debian-7.1.0-amd64-netinst.iso with the default options.
# uname -a
Linux master 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux

3) Log in and install network bridging.
# apt-get install bridge-utils

4) Change the default /etc/network/interfaces file from:

----cut---------cut---------cut---------cut---------cut---------cut-----
# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

----cut---------cut---------cut---------cut---------cut---------cut-----

to:

----cut---------cut---------cut---------cut---------cut---------cut-----
# The primary network interface
auto eth0
iface eth0 inet manual

# The bridged network interface
auto br0
iface br0 inet static
        address 192.168.1.163
        network 192.168.1.0
        netmask 255.255.255.0
        broadcast 192.168.1.255
        gateway 192.168.1.1
        dns-nameservers 208.67.222.222 208.67.220.220
        bridge_ports eth0
        bridge_fd 0
        bridge_hello 2
        bridge_maxage 12
        bridge_stp off

----cut---------cut---------cut---------cut---------cut---------cut-----

5) Reboot the server
# reboot


As it boots, the system will reset when it tries to configure the network, and the BIOS will log "Hypertransport sync flood error".  Additional reboots do the same thing.  The only way to get the server up and running is to unplug the Ethernet cable from eth0, wait until the main console login comes up, and then plug the cable back in.  From then on the server works as expected.

I've used Salvatore's little trick of adding "pre-up /sbin/ifconfig eth0 up" right before the "bridge_ports eth0" line in the br0 section, and that allows the server to boot with the cable still in eth0, both from a warm boot and from a cold boot.  This makes me think that the problem involves some sort of timing issue.  So a big thanks to Salvatore for what appears to be a usable workaround!
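
For reference, this is roughly what the br0 section looks like with the workaround in place.  Only the pre-up line is new; everything else is unchanged from step 4, and the addresses are specific to my network, so adjust them for yours.

----cut---------cut---------cut---------cut---------cut---------cut-----
# The bridged network interface
auto br0
iface br0 inet static
        address 192.168.1.163
        network 192.168.1.0
        netmask 255.255.255.0
        broadcast 192.168.1.255
        gateway 192.168.1.1
        dns-nameservers 208.67.222.222 208.67.220.220
        # Workaround: bring eth0 up before it is added to the bridge
        pre-up /sbin/ifconfig eth0 up
        bridge_ports eth0
        bridge_fd 0
        bridge_hello 2
        bridge_maxage 12
        bridge_stp off

----cut---------cut---------cut---------cut---------cut---------cut-----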

Just FYI, according to the SunFire Server Diagnostics Guide, when the CPU detects one of the following errors, the machine reboots immediately, and on the next start the BIOS inspects the machine registers and logs "Hypertransport sync flood error":

1) The CPU detects an uncorrectable multi-bit DIMM error
2) CRC or link error on one of the Hypertransport links
3) System or parity error on a PCI bus

I would be willing to test any updates that attempt to fix this bug, as I understand not everyone has X4140s lying around (lol).

Thanks!

  Boyd

