Bug#910485: Confirm issue with libpsm2-2/11.2.68-1
Hi Mehdi,
The issue manifests itself as follows. If I have libpsm2-2/11.2.68-1
installed, any MPI program (whether a big program or a small MPI hello
world example) has a 15-second delay when I run it. For each rank, the
following error message is displayed:
cyberdyne.25917hfi_wait_for_device: The /dev/hfi1_0 device failed to
appear after 15.0 seconds: Connection timed out
(cyberdyne is the host name of my machine.)
If I downgrade libpsm2-2 to 10.3.58-2, the MPI programs run instantly
(without the 15-second delay) and no error message about /dev/hfi1_0 is
displayed.
I'm on Debian testing with libopenmpi3/3.1.2-6.
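In case it helps with reproducing this, here is a minimal sketch
(Python 3) that checks whether the device node named in the error
message exists and times an arbitrary command. The helper names and the
script name are only illustrative; they are not part of PSM2 or Open MPI,
and ./hello stands in for any MPI program.

```python
import os
import subprocess
import sys
import time


def hfi_device_present(path="/dev/hfi1_0"):
    """Return True if the device node named in the PSM2 message exists."""
    return os.path.exists(path)


def time_command(cmd):
    """Run cmd (a list of arguments); return (exit code, elapsed seconds)."""
    start = time.monotonic()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical invocation (script and program names are placeholders):
    #   python3 check_hfi.py mpirun -np 2 ./hello
    # With no arguments, a trivial no-op command is timed instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print("/dev/hfi1_0 present:", hfi_device_present())
    code, elapsed = time_command(cmd)
    print("exit code %d after %.1f s" % (code, elapsed))
```

Running this once with each libpsm2-2 version installed should make the
difference in run time easy to compare.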
Thanks for looking into this and please let me know if you need more
information.
Thanks,
Jonas
--
Jonas Lippuner, PhD
Scientist
Computational Physics and Methods, CCS-2
Center for Theoretical Astrophysics
Los Alamos National Laboratory
jlippuner@lanl.gov
505-667-1646
http://jonaslippuner.com
-----Original Message-----
From: Mehdi Dogguy <mehdi@dogguy.org>
To: "Lippuner, Jonas" <jlippuner@lanl.gov>, 910485@bugs.debian.org
Subject: Re: Bug#910485: Confirm issue with libpsm2-2/11.2.68-1
Date: Tue, 16 Oct 2018 10:10:04 +0200
Hi Jonas,
On 2018-10-15 19:54, Lippuner, Jonas wrote:
> I'm having the same issue with libpsm2-2 version 11.2.68-1.
> Downgrading to 10.3.58-2 fixes it for me.
Can you please explain how you experienced the bug? I've understood
Drew's case, but maybe yours is slightly different.