
Bug#910485: Confirm issue with libpsm2-2/11.2.68-1



Hi Mehdi,

The issue manifests itself as follows. With libpsm2-2/11.2.68-1
installed, every MPI program (whether a big program or a small MPI hello
world example) takes 15 seconds to start when I run it, and each rank
prints the following error message:

cyberdyne.25917hfi_wait_for_device: The /dev/hfi1_0 device failed to
appear after 15.0 seconds: Connection timed out

(cyberdyne is the host name of my machine.)

If I downgrade libpsm2-2 to 10.3.58-2, the MPI programs start instantly
(without the 15-second delay) and no error message about /dev/hfi1_0 is
displayed.
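
A minimal sketch for checking the symptom described above. It times an
arbitrary command and searches its combined output for the
hfi_wait_for_device message. The helper names (find_hfi_timeouts,
run_timed), the script name, and the example mpirun command line are
assumptions made for illustration and are not part of libpsm2 or Open MPI.

```python
import re
import subprocess
import sys
import time

# Pattern for the message quoted above. The host name and PID prefix vary,
# so only the part starting at "hfi_wait_for_device" is matched.
HFI_TIMEOUT = re.compile(
    r"hfi_wait_for_device: The (\S+) device failed to\s+appear after "
    r"([\d.]+) seconds"
)


def find_hfi_timeouts(output_text):
    """Return a (device, seconds) tuple for each timeout message found."""
    return [(dev, float(secs)) for dev, secs in HFI_TIMEOUT.findall(output_text)]


def run_timed(cmd):
    """Run cmd and return (elapsed_seconds, timeouts found in its output)."""
    start = time.monotonic()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # search stdout and stderr together
        universal_newlines=True,
    )
    elapsed = time.monotonic() - start
    return elapsed, find_hfi_timeouts(proc.stdout)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Hypothetical usage: python3 check_hfi.py mpirun -np 2 ./hello
        elapsed, timeouts = run_timed(sys.argv[1:])
        print("elapsed: %.1f s, hfi timeouts: %s" % (elapsed, timeouts))
    else:
        print("usage: check_hfi.py COMMAND [ARGS...]")
```

Running it once with each libpsm2-2 version installed would show whether
the delay and the message appear together.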

I'm on Debian testing with libopenmpi3/3.1.2-6.

Thanks for looking into this and please let me know if you need more
information.


Thanks,
Jonas


-- 
Jonas Lippuner, PhD
Scientist
Computational Physics and Methods, CCS-2
Center for Theoretical Astrophysics
Los Alamos National Laboratory
jlippuner@lanl.gov
505-667-1646
http://jonaslippuner.com


-----Original Message-----
From: Mehdi Dogguy <mehdi@dogguy.org>
To: "Lippuner, Jonas" <jlippuner@lanl.gov>, 910485@bugs.debian.org
Subject: Re: Bug#910485: Confirm issue with libpsm2-2/11.2.68-1
Date: Tue, 16 Oct 2018 10:10:04 +0200

Hi Jonas,

On 2018-10-15 19:54, Lippuner, Jonas wrote:
> I'm having the same issue with libpsm2-2 version 11.2.68-1.
> Downgrading to 10.3.58-2 fixes it for me.

Can you please explain how you experienced the bug? I've understood
Drew's case, but maybe yours is slightly different.

