
Bug#795060: Latest Wheezy backport kernel prefers Infiniband mlx4_en over mlx4_ib, breaks existing installs



Hello Ben,

Thanks for the quick and detailed reply.

On Mon, 10 Aug 2015 15:53:57 +0200 Ben Hutchings wrote:

> Control: severity -1 important
> Control: tag -1 upstream
> 
> On Mon, 2015-08-10 at 13:52 +0900, Christian Balzer wrote:
> [...]
> > I'm also not seeing this on several other machines we use for Ceph
> > with the current Jessie kernel, but to be fair they use slightly
> > different (QDR, not FDR) ConnectX-3 HBAs.
> 
> If SR-IOV is enabled on the adapter then the ports will always operate
> in Ethernet mode as it's apparently not supported for IB.  Perhaps
> SR-IOV is enabled on some adapters but not others?
>
I was wondering about that, but wasn't aware that SR-IOV is Ethernet-only.
Anyway, the previous cluster and one blade of this new one have Mellanox
firmware 2.30.8000, which doesn't offer the Flexboot BIOS menu, and thus
they have no SR-IOV configuration option at boot time.

However, the other blade (a replacement motherboard for a DOA one) in the
new server does have firmware 2.33.5100 with the Flexboot menu, and had
SR-IOV enabled.

Alas, disabling it (and taking out the fake-install) resulted in the same
behavior: mlx4_en was auto-loaded before mlx4_ib.

In all of the following tests I rebooted both nodes simultaneously, to
avoid one port in Ethernet mode forcing things on the other side.

Also, the newest QDR card in one of the Ceph cluster machines here has
that firmware too, but behaves properly (no mlx4_en auto-load) with the
latest Jessie kernel.
 
> If that's not the issue, it looks like you are supposed to set a module
> parameter in mlx4_core:
>     port_type_array:Array of port types: HW_DEFAULT (0) is default
>     1 for IB, 2 for Ethernet (array of int)
> e.g.:
>     options mlx4_core port_type_array=1,1
> 
I added "options mlx4_core port_type_array=1" (since there is only one
port) to /etc/modprobe.d/local.conf, then ran depmod -a and
update-initramfs -u, but no joy.
Even with this setting, the mlx4_en module is still auto-loaded before
the IB one.
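
One way to check whether the option actually reached the driver is to read
the values back from sysfs. This is only a minimal sketch: it assumes that
mlx4_core exposes port_type_array under /sys/module/mlx4_core/parameters
and a per-port mlx4_port<N> attribute under the PCI device directory. The
function name and its optional root argument are hypothetical and exist
only for illustration.

```shell
# Print the mlx4 port type settings as the running kernel sees them.
# The optional first argument (hypothetical, for testing) overrides the
# sysfs root, which defaults to /sys.
show_mlx4_port_types() {
    root="${1:-/sys}"
    param="$root/module/mlx4_core/parameters/port_type_array"

    # Module parameter as loaded; assumed to be readable through sysfs.
    if [ -r "$param" ]; then
        printf 'port_type_array=%s\n' "$(cat "$param")"
    else
        printf 'port_type_array not readable\n'
    fi

    # Per-port type attributes of every PCI device that has them.
    for f in "$root"/bus/pci/devices/*/mlx4_port*; do
        [ -e "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

Running show_mlx4_port_types with no argument after a reboot would show
whether the value from /etc/modprobe.d/local.conf was picked up at all.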

So ultimately only the fake-install of mlx4_en provides a workaround.
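
For reference, a fake-install entry of this kind generally looks like the
sketch below. The file name is hypothetical, and the exact line used on
these machines is not shown in this message. "install <module> <command>"
is the standard modprobe.d directive that runs the given command instead
of loading the module.

```
# /etc/modprobe.d/mlx4-local.conf (hypothetical file name)
# Run /bin/true instead of loading mlx4_en, so the Ethernet driver
# is never auto-loaded; mlx4_ib is not affected by this line.
install mlx4_en /bin/true
```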

If you have anything else you would like me to try, let me know; this
cluster will probably not go into production for another 2 weeks.

> I don't know what determines the hardware default.
> 
> [...]
> > Given that the previous version works as expected and that Jessie is
> > doing the "right" thing as well, I'd consider this a critical bug.
> 
> No, it is important (since it is a regression) but it is not critical.
> 
Fair enough.

> > Had I rebooted the older production cluster with 500,000 users on it
> > into this kernel, the results would not have been pretty.
> 
> And that's why you tested on one machine first, right?
> 
Of course, but it would still have a) broken things (replication stopped)
and b) taken me even more time to figure out what was going on and how to
work around it, as I can't reboot that cluster willy-nilly.

There simply is a very high expectation that a kernel update like this
won't leave you dead in the water.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/

