
Re: Speedup when using qemu to cross-build packages.



Quoting Warlich, Christof (2022-09-06 16:44:58)
> I need to rebuild some several hundred Debian packages for amd64, arm64,
> armhf and mipsel. This works pretty well for all native (i.e. amd64) packages,
> but as cross-building seems to only function for less than half of the
> packages, I resorted to use qemu-user emulation within architecture-specific
> docker containers to fake a native build environment for the other
> architectures. But while this _does_ work quite reliably, it’s certainly dead
> slow: A clean build of e.g. gcc-10 for armhf takes almost a week!
> 
> To speed things up, I started to mess with the armhf container by replacing
> the “native” armhf GCC with its related cross-GCC (i.e.
> gcc-10-arm-linux-gnueabihf:amd64) with all its dependent libraries from
> amd64: Multiarch is my best friend here 😊. Doing the same for g++ and
> binutils gave quite a performance boost, but the build was still much slower
> than a real native build. Further investigation showed that e.g.
> configure-scripts are still damn slow, so I also replaced the armhf bash with
> the amd64 bash (and a few other binaries) …
> 
> Ok, coming to the point: As the build speed for the foreign architectures is
> still far from being optimal even with my hacks, I would rather like to do
> things the other way round, i.e. starting from an amd64 container, but making
> it “think” to be an armhf container with minimal changes, i.e. with as much
> as possible amd64 executables left over, to achieve optimal speed. GCC and
> the like would still need to be replaced by their cross-counterparts
> generation for armhf almost the same way as described above, but what else
> comes to mind that needs to be “fixed”? Certainly, apt must be replaced by
> apt:armhf to make e.g. “apt-get build-dep” install the build dependencies for
> armhf instead of amd64 … (note that even apt-get build-dep
> --host-architecture=armhf fails for quite a few packages!). But what else
> needs to be changed to armhf-ify the amd64 container? uname comes to mind,
> and probably dpkg?
> 
> Finally, I may certainly be entirely off the rocker with my idea, maybe just
> because cross-building packages _does_ always work, but I’m just doing it
> wrong? But _if_ cross-building _is_ a known issue and considering how far I
> already got with my initial speedup approach (e.g. by partly replacing armhf
> executables with amd64 executables), I would very much like to get some
> feedback from more experienced Debian experts. And after all, as reliable and
> timely cross-builds of all Debian packages does not seem to be an unusual
> desire to me, maybe there is something out there already that does the job
> but that I so far have totally missed?

If half the time people spend on hacks were spent on a proper solution, so
much more could've been done already... I know, everybody can spend their free
time on the stuff they find fun, but sometimes it's a bit frustrating...

This topic reminds me of a problem I was facing a few days ago. If you search
online for the problem of humongous (several hundred GB to TB) /var/log/lastlog
and /var/log/faillog, you see lots of people recommending to each other, and
lots of code implementing, the same workaround: just add --no-log-init to the
call to the useradd tool. This
hack is so prevalent that it even made it into the official "docker best
practices" guide on docker.com. So all these people have the same problem
(super large /var/log/lastlog) and they all find out about this hack and spend
time on working around the problem with this hack. The "proper" solution is
actually only four lines of code in useradd.c and was recently implemented by
David Kalnischkies. Starting from the next release of shadow, nobody will be
facing these super large log files anymore after calling useradd. Four lines
changed. Compare the effort to do this with the countless issues you find on
github.com, gitlab.com or stackoverflow. Think of all the time that could've
been saved if somebody before David had spent some of the time that went into
implementing the --no-log-init hack on solving the problem at its core.
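
To give a feeling for where the "several hundred GB to TB" come from, here is a
minimal sketch of the arithmetic. It assumes that /var/log/lastlog is a flat
array of fixed-size records indexed by UID, and that a record has the layout
glibc uses on Linux amd64 (32-bit timestamp, 32-byte line, 256-byte host). The
helper names and the example UIDs are made up for illustration, and the sketch
says nothing about what the fix in useradd.c actually changes.

```python
import struct

# Assumed on-disk layout of one lastlog record (glibc, Linux amd64):
# 32-bit login time, 32-byte tty name, 256-byte host name, no padding.
LASTLOG_RECORD = struct.Struct("=i32s256s")


def lastlog_offset(uid: int) -> int:
    """Byte offset of the record for `uid`, assuming records are indexed by UID."""
    return uid * LASTLOG_RECORD.size


def apparent_size(max_uid: int) -> int:
    """Apparent file size once a record for `max_uid` has been written."""
    return lastlog_offset(max_uid) + LASTLOG_RECORD.size


if __name__ == "__main__":
    # Illustrative UIDs: a regular user, "nobody", and a very large UID.
    for uid in (1000, 65534, 2**31 - 2):
        size_gib = apparent_size(uid) / 2**30
        print(f"uid {uid:>10}: {size_gib:10.3f} GiB apparent size")
```

On a filesystem with sparse file support most of that size is a hole, but any
tool that copies the file naively has to deal with the full apparent size.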

Cross-building is often the same. Every new architecture that comes along is
bootstrapped in a rush and everybody rewrites the same hacks over and over and
over again.

Kudos here to Helmut's rebootstrap which is absolutely the right way forward
even if the world thinks otherwise and rather repeats last year's dirty hacks
instead.

In an attempt to be at least a little bit constructive with this email: Your
best solution is probably to buy two new computers. You probably already have
an amd64 machine, so that problem is solved. A computer that can run arm64 can
also run armhf so you only need one machine to run both architectures natively.
The second new machine is one that can run mipsel. This will cost a bit of
money but if you buy the stuff second-hand it will not cost much and if you
really need to compile several hundred packages, the money might be worth the
time you will be saving because qemu indeed is very slow.

If instead, you want to continue with the hack you started, then it sounds like
what you want is the equivalent of what box64 does for amd64:
https://github.com/ptitSeb/box64 Essentially, this uses a mechanism called
dynamic recompilation, which lets a foreign-architecture binary use
native-architecture libraries:
https://box86.org/2021/07/inner-workings-a-high%e2%80%91level-view-of-box86-and-a-low%e2%80%91level-view-of-the-dynarec/
This is still hacky and it will still fail in many situations but there you go.
:)

Thanks!

cheers, josch
