Hi,

M. Zhou, on 2023-03-21:
> On Tue, 2023-03-21 at 19:41 +0100, Christian Kastner wrote:
> > I agree that a split of some sort will probably be necessary in the
> > short-to-midterm, and your proposed solutions looks reasonable to me.
> >
> > One difficulty we will need to figure out one way or another is how to
> > actually bring the user to the right package. What do we do when the
> > user wants to `apt install pytorch-rocm`?
>
> If possible, I suggest we stick to only one single binary package that
> supports multiple selected architectures. The pytorch-rocm popcon
> will not likely to be very large to deserve such a special treatment.
> Neither does pytorch-cuda.

I'm somewhat concerned that the sheer size of the library might
propagate to bigger .deb binary packages, which in turn would cause
issues with the infrastructure. I can't recall when I read it, but I
do recall people mentioning problems once a .deb exceeds a gigabyte
or two. Splitting libraries might thus end up being needed if such a
scenario were to occur. Looking at the Xorg userland video drivers, I
even thought it might be possible to mimic the layout of the
xserver-xorg-video-* packages, which are pulled in by default by
xserver-xorg-video-all, which in turn gets all the drivers for all
the GPUs out there; people interested in just one GPU can remove the
-all package and all the other userland Xorg drivers.

On the other hand, the package compression algorithm seems to do a
good job of deduplicating the common segments of architecture-specific
code.
In the case of librocsparse, I see almost a factor of eighteen of
compression:

	$ du -sh librocsparse0_5.3.0+dfsg-3_amd64.deb
	114M	librocsparse0_5.3.0+dfsg-3_amd64.deb
	$ du -sh librocsparse0_5.3.0+dfsg-3_amd64/usr/lib/x86_64-linux-gnu/librocsparse.so.0.1
	2.0G	librocsparse0_5.3.0+dfsg-3_amd64/usr/lib/x86_64-linux-gnu/librocsparse.so.0.1

So perhaps this is a non-problem (at least regarding rocsparse, but
other components may prove to be more difficult if they are much
larger).

I'm not sure what to think. In the long term, upstream will need to
split the libraries as architectures add up, otherwise the model
will not scale due to the issues pointed out by Cory. In the short
term, the monolithic library is not good, but fair enough, and
splitting would introduce a number of issues pointed out by Mo Zhou.

I was hoping to produce a more useful message, but so be it…

-- 
Étienne Mollier <emollier@emlwks999.eu>
Fingerprint: 8f91 b227 c7d6 f2b1 948c 8236 793c f67e 8f0d 11da
Sent from /dev/pts/2, please excuse my verbosity.