
Bug#954272: slurmd: SLURM not working with OpenMPI



Package: slurmd
Version: 19.05.3.2-2+b1
Severity: important

Dear Maintainer,

I am trying to get SLURM working on a single node. I have installed and configured slurmd and slurmctld.

A simple test like `srun hostname` works, even on multiple cores. However, as soon as I try to run an MPI program, it crashes with the following error message:

*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

This happens even in the simplest "Hello World" case, as long as the program is MPI-enabled.

I am trying to use OpenMPI (4.0.2) from the Debian repositories. `srun --mpi list` returns:

srun: MPI types are...
srun: openmpi
srun: pmi2
srun: none

I have tried all three MPI types, but the result is the same in every case.
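
For anyone who wants to repeat this check, here is a rough sketch of going through the listed MPI types one by one. It only parses the listing quoted above and prints one `srun` command line per type; it does not run anything. The program name `./hello_mpi` and the task count of 2 are placeholders I made up for illustration, not the exact values from my test.

```python
import shlex

# Output of `srun --mpi list` as quoted in this report.
SRUN_MPI_LIST = """\
srun: MPI types are...
srun: openmpi
srun: pmi2
srun: none
"""


def parse_mpi_types(listing):
    """Return the MPI type names found in `srun --mpi list` output."""
    types = []
    for line in listing.splitlines():
        # Each line looks like "srun: <text>"; keep only the part after the prefix.
        _, _, rest = line.partition("srun:")
        name = rest.strip()
        # Skip blank lines and the "MPI types are..." header line.
        if name and not name.startswith("MPI types"):
            types.append(name)
    return types


def build_commands(mpi_types, program="./hello_mpi", ntasks=2):
    """Build one srun command line per MPI type.

    `program` and `ntasks` are hypothetical placeholders.
    """
    return [
        shlex.join(["srun", f"--mpi={t}", "-n", str(ntasks), program])
        for t in mpi_types
    ]


if __name__ == "__main__":
    for cmd in build_commands(parse_mpi_types(SRUN_MPI_LIST)):
        print(cmd)
```

Each printed line can be pasted into a shell once `./hello_mpi` is replaced with the real MPI binary.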

Maybe this is user error, as this is my first time setting up SLURM, but I have not been able to find any possible causes/solutions and I am kind of stuck at this point.

Regards,

Lars

-- System Information:
Debian Release: bullseye/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 5.4.0-3-amd64 (SMP w/64 CPU cores)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages slurmd depends on:
ii  libc6                    2.30-2
ii  libhwloc15               2.1.0+dfsg-4
ii  liblz4-1                 1.9.2-2
ii  libnuma1                 2.0.12-1+b1
ii  libpam0g                 1.3.1-5
ii  lsb-base                 11.1.0
ii  munge                    0.5.13-2+b1
ii  openssl                  1.1.1d-2
ii  slurm-wlm-basic-plugins  19.05.3.2-2+b1
ii  ucf                      3.0038+nmu1
ii  zlib1g                   1:1.2.11.dfsg-2

slurmd recommends no packages.

slurmd suggests no packages.

-- no debconf information

