
Re: [Sid] InfiniBand (please apply the following patch)



On 2/20/22 01:14, Grzesiek wrote:
On 2/16/22 00:11, Grzesiek wrote:
Hi there,

I plan to use NFS over RDMA. I decided to buy two QLE7340 InfiniBand controllers and a DAC cable (no switch, two nodes only). Is the QLE7340 supported by Sid? Is the required software included in the Debian repositories?

It seems that the required software is present in Sid. Unfortunately, there is a problem with kernels 5.15 and 5.16. When trying to load ib_qib, you get:

[    4.481908] ib_qib 0000:02:00.0: qib0: Reserving QPNs from 0x656b78 to 0x656b78 for non-verbs use
[    4.482005] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.1/0000:02:00.0/infiniband/qib0/ports/1/linkcontrol'
[    4.482008] CPU: 4 PID: 471 Comm: systemd-udevd Not tainted 5.16.0-1-amd64 #1  Debian 5.16.7-2
[    4.482011] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P2.60 03/06/2018
[    4.482013] Call Trace:
[    4.482014]  <TASK>
[    4.482016]  dump_stack_lvl+0x48/0x5e
[    4.482021]  sysfs_warn_dup.cold+0x17/0x24
[    4.482024]  internal_create_group+0x365/0x380
[    4.482027]  internal_create_groups.part.0+0x3d/0xa0
[    4.482029]  setup_port+0x370/0x680 [ib_core]
[    4.482044]  ? kobject_add+0x7e/0xb0
[    4.482048]  ib_setup_port_attrs+0x98/0x240 [ib_core]
[    4.482058]  ib_register_device+0x57f/0x660 [ib_core]
[    4.482067]  ? vmalloc_node+0x47/0x50
[    4.482071]  rvt_register_device+0x10c/0x270 [rdmavt]
[    4.482076]  qib_register_ib_device+0x608/0x7d0 [ib_qib]
[    4.482089]  qib_init_one+0x17f/0x470 [ib_qib]
[    4.482099]  local_pci_probe+0x45/0x80
[    4.482102]  ? pci_match_device+0xd7/0x130
[    4.482104]  pci_device_probe+0xd2/0x1c0
[    4.482106]  really_probe+0x1f5/0x3f0
[    4.482109]  __driver_probe_device+0xfe/0x180
[    4.482111]  driver_probe_device+0x1e/0x90
[    4.482113]  __driver_attach+0xc0/0x1c0
[    4.482115]  ? __device_attach_driver+0xe0/0xe0
[    4.482117]  ? __device_attach_driver+0xe0/0xe0
[    4.482119]  bus_for_each_dev+0x78/0xc0
[    4.482122]  bus_add_driver+0x149/0x1e0
[    4.482124]  driver_register+0x8f/0xe0
[    4.482126]  ? qib_init_qibfs+0x11/0x11 [ib_qib]
[    4.482135]  qib_ib_init+0x3e/0xf62 [ib_qib]
[    4.482144]  do_one_initcall+0x44/0x200
[    4.482147]  ? kmem_cache_alloc_trace+0x175/0x3d0
[    4.482150]  do_init_module+0x5c/0x280
[    4.482153]  __do_sys_finit_module+0xae/0x110
[    4.482156]  do_syscall_64+0x3b/0xc0
[    4.482159]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    4.482162] RIP: 0033:0x7fe1bbc36f79
[    4.482164] Code: 48 8d 3d 9a a8 0d 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 5e 0d 00 f7 d8 64 89 01 48
[    4.482167] RSP: 002b:00007fff9e28e408 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    4.482170] RAX: ffffffffffffffda RBX: 0000556a935aedb0 RCX: 00007fe1bbc36f79
[    4.482171] RDX: 0000000000000000 RSI: 00007fe1bbde6eed RDI: 000000000000000f
[    4.482173] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000556a93570540
[    4.482174] R10: 000000000000000f R11: 0000000000000246 R12: 00007fe1bbde6eed
[    4.482175] R13: 0000000000000000 R14: 0000556a935af730 R15: 0000556a935aedb0
[    4.482178]  </TASK>
[    4.482195] infiniband qib0: Couldn't register device with driver model
[    4.482209] ib_qib 0000:02:00.0: qib0: Failed to register driver with ib core.
[    4.482229] ib_qib 0000:02:00.0: qib0: cannot register verbs: 17!
[    4.482762] ib_qib 0000:02:00.0: Disabling notifier on HCA 0 irq 49
[    4.482818] ib_qib 0000:02:00.0: Disabling notifier on HCA 0 irq 50
[    4.482841] ib_qib 0000:02:00.0: Disabling notifier on HCA 0 irq 51
[    4.482911] ib_qib 0000:02:00.0: Disabling notifier on HCA 0 irq 53
[    4.483485] ib_qib: probe of 0000:02:00.0 failed with error -17
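
A side note on reading the last log line: the trailing -17 is a negated errno value. On Linux, errno 17 is EEXIST ("File exists"), which matches the sysfs "cannot create duplicate filename" warning earlier in the log. The snippet below is only a minimal sketch for decoding such a line; the helper name decode_probe_error and the regular expression are illustrative assumptions, not part of any Debian or kernel tool.

```python
import errno
import os
import re

# Last line of the kernel log above, copied verbatim.
LOG_LINE = "[    4.483485] ib_qib: probe of 0000:02:00.0 failed with error -17"


def decode_probe_error(line):
    """Parse a 'probe of ... failed with error -N' line.

    Returns (driver, device, errno_name, message), or None if the line
    does not look like a probe failure. Hypothetical helper for illustration.
    """
    m = re.search(r"\]\s+(\S+): probe of (\S+) failed with error -(\d+)", line)
    if m is None:
        return None
    driver, device, code = m.group(1), m.group(2), int(m.group(3))
    # errno.errorcode maps the numeric value to its symbolic name;
    # os.strerror gives the platform's human-readable message.
    name = errno.errorcode.get(code, "UNKNOWN")
    return driver, device, name, os.strerror(code)


if __name__ == "__main__":
    print(decode_probe_error(LOG_LINE))
```

The symbolic name is the useful part here: EEXIST points at a duplicate registration (here, the sysfs group being created twice) rather than at missing firmware or hardware.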

After installing linux-image-5.10.0-10-amd64 from Bullseye, ib_qib loads without problems. Please fix.

The problem seems to be related to newer kernels.

By the way, is openibd needed to establish a direct connection between the two adapters? If so, how can it be installed on Sid?

This patch seems to fix the problem. Please apply.

https://lore.kernel.org/linux-rdma/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u

