
Bug#954272: slurmd: SLURM not working with OpenMPI




On 20/07/2020 14:52, Lars Veldscholte wrote:
Hi,

I believe I have found a solution.

I must confess that I still don't fully understand the difference between the various PMI APIs, and which ones are supported by OpenMPI, but I found that the recommended way is to use PMIx.

However, PMIx was not working on my system even though libpmix2 is installed:

# srun --mpi pmix ./a.out
srun: error: (null) [0] /mpi_pmix.c:133 [init] mpi/pmix: ERROR: pmi/pmix: can not load PMIx library
srun: error: Couldn't load specified plugin name for mpi/pmix: Plugin init() callback failed
srun: error: cannot create mpi context for mpi/pmix
srun: error: invalid MPI type 'pmix', --mpi=list for acceptable types

Running `strace srun --mpi=pmix ./a.out` revealed that SLURM is looking for the PMIx library at `/usr/lib/x86_64-linux-gnu/pmix/lib/libpmix.so`, which does not exist; only `libpmix.so.2` exists.
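
The finding above can be double-checked without strace. The following is a minimal sketch, assuming that both library names are expected in the directory reported by strace; the function name `check_pmix_names` is hypothetical and not part of SLURM or PMIx.

```shell
# Hypothetical helper (not part of SLURM or PMIx): report whether the
# unversioned and the versioned PMIx library names exist in a directory.
# The default directory is the one seen in the strace output; pass another
# directory as the first argument if your layout differs.
check_pmix_names() {
    dir="${1:-/usr/lib/x86_64-linux-gnu/pmix/lib}"
    for name in libpmix.so libpmix.so.2; do
        if [ -e "$dir/$name" ]; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}
```

Calling `check_pmix_names` with no argument inspects the default directory. On a system in the state described here, `libpmix.so` would be expected to show as missing and `libpmix.so.2` as present.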

Which code is trying to load libpmix.so? The compiled code should be loading libpmix.so.2 directly; libpmix.so should only be needed at build time.


Perhaps the problem is that libpmix-dev is not installed at compile time. I can add it as a dependency of libopenmpi-dev.


Installing the package `libpmix-dev` provides this file: it installs `libpmix.so` as a symlink to the same file that `libpmix.so.2` is symlinked to.
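
The symlink relationship described above can be verified with a short check. This is a sketch only, assuming GNU `readlink -f` is available (as it is on Debian); the function name `same_target` is hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) only when both paths exist
# and resolve, after following all symlinks, to the same real file.
same_target() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}
```

For example, `same_target /usr/lib/x86_64-linux-gnu/pmix/lib/libpmix.so /usr/lib/x86_64-linux-gnu/pmix/lib/libpmix.so.2 && echo same` should print `same` once `libpmix-dev` is installed, if both names do sit in that directory.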

Now, `srun --mpi=pmix ./a.out` is working!

I'm not 100% sure, but I think that the package `libpmix2` should also install the file `libpmix.so`. The dev package shouldn't be required for that, right?

Lars

Regards

Alastair

--
Alastair McKinstry, <alastair@sceal.ie>, <mckinstry@debian.org>, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.

