
Re: Autopkgtest, MPI and network access (Was: Autopkgtest and MPI code)



On 04/10/2016 at 20:28, Alastair McKinstry wrote:
> Dear Thibaut,
> 
> Were these tests done under fakeroot? See the thread from earlier
> (yesterday and today) on debian-devel. Basically, openmpi breaks under
> fakeroot as it gets unexpected credentials back from getsockopt().
> 

Dear Alastair,

I have seen this thread. The manual tests were definitely not run under
fakeroot. I don't know for sure whether autopkgtest invokes fakeroot,
but I doubt it.
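
One way to settle this would be to have the test script report it. The
following is only a minimal sketch: it assumes that fakeroot exports
FAKEROOTKEY and preloads a library whose name contains "libfakeroot",
and the function name under_fakeroot is a hypothetical helper, not part
of autopkgtest or fakeroot.

```shell
#!/bin/sh
# Hypothetical helper: guess whether the current process runs under fakeroot.
# Assumption: fakeroot exports FAKEROOTKEY and adds a libfakeroot library
# to LD_PRELOAD; neither check is guaranteed to be exhaustive.
under_fakeroot() {
    # A non-empty FAKEROOTKEY is treated as a sign of a fakeroot session.
    [ -n "${FAKEROOTKEY:-}" ] && return 0
    # Otherwise look for the preloaded wrapper library.
    case "${LD_PRELOAD:-}" in
        *libfakeroot*) return 0 ;;
    esac
    return 1
}

if under_fakeroot; then
    echo "running under fakeroot"
else
    echo "not running under fakeroot"
fi
```

Calling this at the top of the test script would show in the autopkgtest
log which of the two cases applies.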

Kind regards, Thibaut.

> regards
> 
> Alastair
> 
> 
> 
> On 04/10/2016 17:09, Thibaut Paumard wrote:
>> Dear all,
>>
>> Last year I had trouble writing autopkgtest tests that would run
>> smoothly in a container. It seems that recent changes in openmpi have
>> broken it again.
>>
>> This is what worked last year:
>>
>> On 26/05/2015 at 10:07, Johannes Ring wrote:
>>> On Sat, May 23, 2015 at 7:10 PM, Thibaut Paumard <thibaut@debian.org> wrote:
>>>> What does work on my box is:
>>>> orterun --mca btl_tcp_if_include lo <job>
>>>>
>>>> This never crashes the machine, but it does not work in a chroot (for
>>>> lack of a loopback interface, I guess). I get this error message:
>>> It works for me in pbuilder if I set OMPI_MCA_orte_rsh_agent=/bin/false:
>>>
>>> (pbuild22309)root@debian-t420s:/# orterun --mca btl_tcp_if_include lo ls
>> This has been working fine until mid-September this year:
>>
>> https://ci.debian.net/packages/g/gyoto/unstable/amd64/
>>
>> I checked that I can still run gyoto with MPI parallelisation with
>> openmpi 2, which is reassuring, but:
>>
>>   1- gyoto with MPI parallelisation fails again if I turn off network
>> access (processes are unable to reach one another). To turn off network
>> access, I issue "ifdown eth0".
>>
>>   2- even with networking turned on, using "--mca btl_tcp_if_include lo"
>> makes my gyoto job fail, whatever the value of OMPI_MCA_orte_rsh_agent
>> (processes are unable to reach one another);
>>
>>   3- I removed the two variables from the test script. It runs fine when
>> started manually as long as the network is available, but if I let
>> autopkgtest run it with `null' as the virtualization server, the job
>> remains stuck, apparently at the step where gyoto tries to spawn workers
>> using MPI_Comm_spawn.
>>
>> Those tests were done in VirtualBox.
>>
>> Looking at the last successful run of autopkgtest on ci.debian.net and
>> the first failed one, the version of openmpi seems to be the same
>> (libopenmpi1.10).
>>
>> Does anyone have a clue what might be going on here? I have the
>> impression I am facing two issues, one of which is related to openmpi,
>> the other one possibly to autopkgtest.
>>
>> Regards, Thibaut.
>>
> 


-- 
* Dr Thibaut Paumard       | LESIA/CNRS - Table équatoriale (bât. 5)   *
* Tel: +33 1 45 07 78 60   | Observatoire de Paris - Section de Meudon *
* Fax: +33 1 45 07 79 17   | 5, Place Jules Janssen                    *
* thibaut.paumard@obspm.fr | 92195 MEUDON CEDEX (France)               *

Attachment: smime.p7s
Description: S/MIME cryptographic signature

