Bug#795060: Latest Wheezy backport kernel prefers Infiniband mlx4_en over mlx4_ib, breaks existing installs



On Tue, 11 Aug 2015 20:00:41 +0200 Ben Hutchings wrote:

> On Tue, 2015-08-11 at 10:38 +0900, Christian Balzer wrote:
> > Hello Ben,
> > 
> > thanks for the quick and detailed reply.
> > 
> > On Mon, 10 Aug 2015 15:53:57 +0200 Ben Hutchings wrote:
> > 
> > > Control: severity -1 important
> > > Control: tag -1 upstream
> > > 
> > > On Mon, 2015-08-10 at 13:52 +0900, Christian Balzer wrote:
> > > [...]
> > > > I'm also not seeing this on several other machines we use for Ceph
> > > > with the current Jessie kernel, but to be fair they use slightly
> > > > different (QDR, not FDR) ConnectX-3 HBAs.
> > > 
> > > If SR-IOV is enabled on the adapter then the ports will always
> > > operate in Ethernet mode as it's apparently not supported for IB.
> > > Perhaps SR-IOV is enabled on some adapters but not others?
> > > 
> > I was wondering about that, but wasn't aware of the Ethernet only bit
> > of SR-IOV. 
> > Anyway, the previous cluster and one blade of this new one have
> > Mellanox firmware 2.30.8000, which doesn't offer the Flexboot Bios
> > menu and thus have no SR-IOV configuration option at boot time.
> > 
> > However the other blade (replacement mobo for a DoA one) in the new
> > server does have firmware 2.33.5100 and the Flexboot menu and had
> > SR-IOV enabled.
> > 
> > Alas disabling it (and taking out the fake-install) did result in the
> > same behavior, mlx4_en was auto-loaded before mlx4_ib.
> [...]
> > I added that "options mlx4_core port_type_array=1" (since there is only
> > one port) to /etc/modprobe.d/local.conf, depmod -a, update-initramfs
> > -u, but no joy.
> > The mlx4_en module gets auto-loaded before the IB one as well with this
> > setting.
> [...]
> 
> There was a deliberate change in mlx4_core in Linux 3.15 to load
> mlx4_en first if it finds any Ethernet port.  

Interesting. So this _could_ have bitten me earlier with any flavor of
3.16 kernel if there had been an "Ethernet port" around.
Again, the fact that a cluster with otherwise identical hardware doesn't do
this leads me to assume that the presence of that Ethernet port stems from
the 2.33.5100 firmware, regardless of whether SR-IOV is enabled.

> But that is separate from
> the decision of what types of port are configured.
> 
From where I'm standing it looks like it will use/configure mlx4_en no
matter what. And once mlx4_en is loaded, mlx4_ib is no longer capable
of creating IB ports.

In fact it will even tear down the remote IB port and load mlx4_en if just
one side changes from IB to EN.
To wit, I had both nodes up with running ib0: interfaces (mlx4_en disabled
via fake-install). 
I then commented out the fake-install on both and did a depmod -a.
On node mbx09 (the one with the newer firmware) I then rmmod'ed mlx4_ib and
mlx4_core.
Then I modprobe'd mlx4_core:
---
Aug 12 10:14:56 mbx09  mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
Aug 12 10:14:56 mbx09  mlx4_core: Initializing 0000:02:00.0
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: PCIe link width is x8, device supports x8
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 124 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 125 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 126 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 127 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 128 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 129 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 130 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 131 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 132 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 133 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 134 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 135 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 136 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 137 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 138 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 139 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 140 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 141 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 142 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 143 for MSI/MSI-X
Aug 12 10:15:01 mbx09  mlx4_core 0000:02:00.0: irq 144 for MSI/MSI-X
Aug 12 10:15:01 mbx09  <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)
Aug 12 10:15:01 mbx09  mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.2-1 (Feb 2014)
Aug 12 10:15:01 mbx09  mlx4_en 0000:02:00.0: registered PHC clock
Aug 12 10:15:01 mbx09  mlx4_en 0000:02:00.0: Activating port:1
Aug 12 10:15:01 mbx09  mlx4_en: 0000:02:00.0: Port 1: Using 192 TX rings
Aug 12 10:15:01 mbx09  mlx4_en: 0000:02:00.0: Port 1: Using 8 RX rings
Aug 12 10:15:01 mbx09  mlx4_en: 0000:02:00.0: Port 1:   frag:0 - size:1526 prefix:0 align:0 stride:1536
Aug 12 10:15:01 mbx09  mlx4_en: 0000:02:00.0: Port 1: Initializing port
Aug 12 10:15:01 mbx09  mlx4_en: eth2: Link Up
---

On node mbx10 the following happened. Note that these are NTP-synced
times, so the moment mlx4_ib and mlx4_core were unloaded on mbx09, mbx10
decided that it was Ethernet time (peanut butter jelly optional):
---
Aug 12 10:14:41 mbx10  mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.2-1 (Feb 2014)
Aug 12 10:14:41 mbx10  mlx4_en 0000:02:00.0: registered PHC clock
Aug 12 10:14:41 mbx10  mlx4_en 0000:02:00.0: Activating port:1
Aug 12 10:14:41 mbx10  mlx4_en: 0000:02:00.0: Port 1: Using 192 TX rings
Aug 12 10:14:41 mbx10  mlx4_en: 0000:02:00.0: Port 1: Using 8 RX rings
Aug 12 10:14:41 mbx10  mlx4_en: 0000:02:00.0: Port 1:   frag:0 - size:1526 prefix:0 align:0 stride:1536
Aug 12 10:14:41 mbx10  mlx4_en: 0000:02:00.0: Port 1: Initializing port
Aug 12 10:14:43 mbx10  mlx4_en: eth2: Link Up
Aug 12 10:14:56 mbx10  mlx4_en: eth2: Link Down
Aug 12 10:14:58 mbx10  mlx4_en: eth2: Link Up
---
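
For anyone wanting to repeat the reload I did on mbx09, here is a minimal
sketch of the sequence. It assumes the fake-install line for mlx4_en has
already been commented out. The helper names (run, reload_mlx4) and the RUN
variable are made up for illustration; by default the function only prints
the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the module reload described above.
# Dry run by default: commands are only printed unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

reload_mlx4() {
    run depmod -a            # refresh module dependencies after editing modprobe.d
    run rmmod mlx4_ib        # unload the IB driver first, it depends on mlx4_core
    run rmmod mlx4_core      # then the core driver
    run modprobe mlx4_core   # reload the core and watch what gets pulled in
}
```

Calling reload_mlx4 prints the four commands; RUN=1 reload_mlx4 as root
would actually execute them, after which "dmesg | grep mlx4" shows the
resulting driver messages.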

> What messages do the drivers log?  (dmesg | grep mlx4)
> 
See above.
During a normal boot the EN module gets auto-loaded 8 seconds in and the
IB one 2 seconds later, but as I mentioned, sequence doesn't matter, EN
trumps/suppresses IB.
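
To check the load order from a log without reading it by eye, something
like the following rough sketch could be used. The function name
first_seen_order is hypothetical; it just prints mlx4_en and mlx4_ib in the
order their first message shows up on standard input.

```shell
#!/bin/sh
# Print each mlx4 sub-driver (mlx4_en, mlx4_ib) once, in the order in which
# it first appears in the log text read from stdin.
first_seen_order() {
    grep -oE 'mlx4_(en|ib)' | awk '!seen[$0]++'
}
```

Usage would be "dmesg | first_seen_order". Fed the mbx09 excerpt above, it
prints mlx4_ib followed by mlx4_en, since the mlx4_ib banner is logged just
ahead of the mlx4_en one there.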

> What is the output of:
> 
>     find /sys/bus/pci/drivers/mlx4_core/0*/ -name port_type | xargs grep -H .
> 
There's no "port_type" anywhere in that tree...
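
A somewhat broader search might still turn something up. The sketch below
is based on an assumption on my part: that the per-port type attribute is
exposed under the PCI device as mlx4_port<N> rather than port_type. The
function name show_port_types is made up for illustration, and the default
path is the driver directory from the command above.

```shell
#!/bin/sh
# List candidate port-type attributes below the mlx4_core driver directory
# and print each file together with its content.
# Assumption: the attribute is named either port_type or mlx4_port<N>.
show_port_types() {
    root="${1:-/sys/bus/pci/drivers/mlx4_core}"
    # -L follows the symlinks from the driver directory to the bound PCI
    # devices; -maxdepth 2 keeps find from wandering through sysfs links.
    find -L "$root" -maxdepth 2 \
        \( -name 'port_type' -o -name 'mlx4_port*' \) -type f 2>/dev/null |
        sort |
        while IFS= read -r f; do
            printf '%s: %s\n' "$f" "$(cat "$f")"
        done
}
```

Run without arguments it looks at the live sysfs tree; passing a directory
as the first argument makes it search there instead. If it prints nothing,
neither attribute name exists at that level.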

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/

