Bug#954272: slurmd: SLURM not working with OpenMPI
On 20/07/2020 14:52, Lars Veldscholte wrote:
srun: error: (null) [0] /mpi_pmix.c:133 [init] mpi/pmix: ERROR:
pmi/pmix: can not load PMIx library
srun: error: Couldn't load specified plugin name for mpi/pmix: Plugin
init() callback failed
srun: error: cannot create mpi context for mpi/pmix
srun: error: invalid MPI type 'pmix', --mpi=list for acceptable types
Running `strace srun --mpi=pmix ./a.out` revealed that SLURM is
looking for the pmix library at
`/usr/lib/x86_64-linux-gnu/pmix/lib/libpmix.so`, which does not exist;
only `libpmix.so.2` exists.
Installing the package `libpmix-dev` provides this file (as a symlink
to the same file that `libpmix.so.2` is symlinked to).
Now, `srun --mpi=pmix ./a.out` is working!
I'm not 100% sure, but I think that the package `libpmix2` should also
install the file `libpmix.so`. The dev package shouldn't be required
for that, right?
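For anyone wanting to check for the same condition without strace, here is a minimal Python sketch. It only looks at the directory named in the strace output above and reports whether the unversioned `libpmix.so` is present alongside any versioned files. The helper name `check_unversioned_lib` is made up for illustration, and the script does not reproduce how SLURM itself loads the library.

```python
import os


def check_unversioned_lib(libdir, name="libpmix.so"):
    """Return (unversioned_present, versioned_names) for libdir.

    Hypothetical helper for illustration; not part of SLURM or PMIx.
    """
    try:
        entries = sorted(os.listdir(libdir))
    except FileNotFoundError:
        # The directory itself is missing, so nothing can be loaded from it.
        return False, []
    # Versioned files look like libpmix.so.2 (name followed by ".<version>").
    versioned = [e for e in entries if e.startswith(name + ".")]
    # os.path.exists follows symlinks, so a dangling link counts as missing.
    present = os.path.exists(os.path.join(libdir, name))
    return present, versioned


if __name__ == "__main__":
    # Path taken from the strace output quoted in this report.
    libdir = "/usr/lib/x86_64-linux-gnu/pmix/lib"
    present, versioned = check_unversioned_lib(libdir)
    print("libpmix.so present:", present)
    print("versioned files found:", versioned)
```

If the first line prints False while versioned files are listed, the system is in the state described above: only the versioned library is installed.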
Lars
pmix is transitioning from pmix2 to pmix3 (at least in the bullseye
timeframe), so it was important that the modules in $libdir/pmix/lib/pmix
be versioned, and I renamed the directory to $libdir/pmix2/lib/pmix.
I had thought that only libpmix.so.2 accessed these modules, so the new
path would be fine, but looking at slurm-llnl's debian/rules,
it's clear that slurmd uses "--with-pmix=/usr/lib/x86_64-linux-gnu/pmix",
which needs to be updated.
libX.so files are normally development-only; I'll move libpmix.so into
libpmix2 from libpmix-dev to fix the above error.
Alastair
--
Alastair McKinstry, <alastair@sceal.ie>, <mckinstry@debian.org>, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.