
Re: plan of deep learning team for next stable release?



Hi Christian,

On Sat, Nov 28, 2020 at 12:19:32PM +0100, Christian Kastner wrote:
> On 11/24/20 1:38 PM, Mo Zhou wrote:
> > I've just uploaded pytorch 1.7.0-2 to unstable, which should compile on
> > at least amd64, arm64, and ppc64el. No non-CPU hardware
> > acceleration is enabled yet.
> 
> official non-CPU support is probably going to be a larger issue, but I
> think the sooner we address this, the better.

Agreed. Now that the cpu-only version of pytorch has landed in the
archive, the -cuda version should not be hard to prepare.

> So, how would support for accelerated packages look? One would need
> software/configuration support in the package itself,

For the cuda version of pytorch, one important and mandatory piece of
(non-free) software is still missing from Debian: cuDNN.

> and one would need
> hardware to test it during build and/or autopkgtest.

In terms of CUDA, at least hardware is not a problem for me, as I use
many NVIDIA GPUs in my daily research work.
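
If GPU-backed test machines ever become available, wiring them into
autopkgtest could be fairly light-weight. A rough sketch (the test
name, the python3-torch-cuda package name, and the restrictions are
only my assumptions; none of this exists in the packaging yet):

  # debian/tests/control (hypothetical)
  Tests: cuda-smoke
  Depends: python3-torch-cuda
  Restrictions: isolation-machine, skippable

  # debian/tests/cuda-smoke (hypothetical)
  #!/bin/sh
  set -e
  # Exit 77 ("skip") on machines without a usable NVIDIA GPU.
  python3 -c 'import sys, torch; sys.exit(0 if torch.cuda.is_available() else 77)'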
 
> Our buildds currently don't have the necessary hardware. I believe we
> should strive to change that. The funds are there (as recently addressed
> by the DPL) and accelerated computing is developing into a key area that
> Debian cannot miss out on, in my opinion.

We could make good use of our funds, but I'm not sure whether running
non-free software on non-free hardware for our specific purpose would
be accepted by the community. Maybe I can carefully raise a question
on -devel once we have reached some kind of consensus.
 
> With regards to software/configuration, I think the way to go would be
> to request new build profiles (one per flavor), so that B-Ds only get
> installed where necessary, and packages only get built where possible.

It sounds like changes in apt/dpkg would be required. I'd suggest
looking at the solution used for caffe; see src:caffe and
src:caffe-cuda (removed from unstable).
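
To make the build-profile idea concrete, here is a minimal sketch of a
profile-restricted build-dependency, assuming a pkg.pytorch.cuda
profile (the profile and package names are placeholders, not settled
decisions):

  # debian/control (hypothetical excerpt)
  Source: pytorch
  Build-Depends: debhelper-compat (= 13),
                 cmake,
                 nvidia-cuda-toolkit <pkg.pytorch.cuda>
  # a cuDNN build-dependency would also go here once it is packaged

  # building the cuda flavour explicitly:
  $ dpkg-buildpackage --build-profiles=pkg.pytorch.cuda

Whether that already works with current apt/dpkg or really needs
changes there is something I have not verified.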
 
> Ideally, we'd also have some porter boxes. This sounds like going far by
> Debian standards, but we are talking about only a few hundred Euros per
> box, and as the DPL said, we should make more use of our funds.

An official porterbox running so much non-free stuff does not sound
sensible.

> The elephant in the room is, of course, CUDA. It's non-free so that will
> irk a lot of people, but it's also the de facto standard, and I don't
> see what alternative we have. People needing accelerated computing today
> will rather leave behind Debian than CUDA.

This is indeed a good question to raise again. Without CUDA
acceleration I will not be able to use the pytorch packages compiled
for Debian in my own research work, and the use case for the cpu-only
version is undoubtedly limited.
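
For reference, checking whether an installed pytorch build can really
use the GPU is a one-liner; with the current cpu-only package this
should simply print False:

  $ python3 -c 'import torch; print(torch.cuda.is_available())'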

> Thoughts?

The core question is whether we are willing to compromise on the CUDA
software stack in order to introduce something more useful and
practical.

