
Re: Architecture variants for Debian / Ubuntu



Hi!

On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:
> Thanks for the considered response. And sorry for the very slow reply.

Idem! :)

> On Wed, 6 Sept 2023 at 21:27, Guillem Jover wrote:
> > I'm not sure I entirely agree with the requirements you set forth
> > though:
> >
> >  - I think such optimized builds might need to be done with "special
> >    toolchains" (these could simply be wrappers over the host compiler
> >    passing the appropriate flags via command-line or via specs or
> >    similar, not necessarily full blown toolchains), passing these via
> >    something like dpkg-buildflags seems currently unreliable, as I don't
> >    think we have full coverage in packages (neither for all compilers
> >    available)? Although it would be better as it would centralize the
> >    management. (For reference this is in part how rpm handles this:
> >     https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)
> >
> 
> I agree that it is not completely clear what the best approach here is: do we
> change the defaults of gcc or influence things via default buildflags?
> 
> I'm sure there are packages that do not respect dpkg-buildflags during
> build but the consequences of this do not seem all that great -- such
> packages would not be optimized for the variant / ISA but if someone
> manages to notice this, they can fix the bug.
> 
> OTOH, having the compiler default change may be a bit of a surprise for
> people who build binaries for deployment not via Debian packages. (Do our
> compilers in general target the same baseline as Debian does for a given
> architecture?).

Right, given that the failure mode would be just "no-optimized-builds",
and should not end up with those packages being broken, at most just
redundant with the baseline ones, then I guess controlling it either
way would seem fine, yes.

(Also, if the packages are reproducible and end up not being optimized,
this might be detectable because they would produce artifacts identical
to the baseline ones.)
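
As a rough illustration of the detection idea above, here is a minimal
sketch that compares the checksums of a baseline build and a variant
build. It assumes the two artifacts would be byte-identical when no
optimization took effect, which only holds if nothing variant-specific
(such as an ISA field or a different embedded name) ends up inside the
package. The function names are hypothetical and not part of any
existing tool.

```python
import hashlib


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_unoptimized(baseline_path, variant_path):
    """True if the variant artifact is byte-identical to the baseline one.

    Hypothetical helper: identical digests suggest the variant build
    ignored the requested flags and is redundant with the baseline.
    """
    return sha256_of(baseline_path) == sha256_of(variant_path)
```

If the variant name were recorded in the control file, the same
comparison would have to be done on the extracted payload instead of
the whole .deb.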

> >  - Perhaps that's a limitation from the archive software side, but
> >    requiring to place the binary packages in the same pool seems
> >    rather restrictive (it forces different filenames for example).

> We are considering supporting multiple variant/ISAs in the primary Ubuntu
> archive, so if we get that far then yes, we want to have all the binary
> packages in the same pool. The first steps don't have to support this I
> guess.

Ok. Just a note that even if served from the primary archive, there
could be multiple pools (like the multi-pool setup on debian-ports),
as the entry points are the (In)Release files. But, yes, the other
option would be to use the variant/ISA name as a "fake arch" just in
the binary package name.

> >  - I guess it might be nice for the ISA to be passed down to the
> >    dpkg tools, but I don't think this is strictly necessary? A
> >    frontend like apt could also decide based on metadata in say the
> >    Release file, although not having the actual installed package
> >    metadata on whether it was a different ISA build or not would make
> >    its job more inconvenient. In any case I don't have a big issue
> >    with recording this via dpkg-gencontrol or similar if necessary.

> I agree, I don't think it's /strictly/ required that the target ISA is
> recorded in the deb. But I think adding a field for it reduces scope for
> confusion later.

Yes, agreed.

> > On the specific implementation details:
> >
> >  - As covered in previous discussions, dpkg could (but I don't think
> >    it's necessary) check whether the .deb is runnable on the current
> >    hw, but that's tricky as chrootless installs need to be taken
> >    into account, etc. It should certainly not be part of dependency
> >    resolution.

> I'm sorry, what is a chrootless install? But I think I agree here too:
> tricky and just not really worth it.

https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap

This can be used among other things to set up foreign chroots, by
running the host tools, so disallowing installation could be
problematic. Even though I guess there could be a warning about this,
or maybe it could be controlled through a force option, although both
seem like they could be disruptive.

> >  - I'm not fond of having to change the binary package name format
> >    either for this (name_version_arch.deb) even if at least dpkg
> >    itself does not care (but I know other tools do care), and
> >    depending on the format I'd expect things to break (this goes
> >    back to the shared pool concern).
> 
> I don't think this is avoidable in the long run. I must admit I have
> generally thought of the presence of the architecture name in the .deb file
> name to be more a convention than part of the format (and the "real"
> indication of a binary package's architecture is in DEBIAN/control).

Yes and no, I guess. In theory the (canonical) information should be
extracted from DEBIAN/control inside the .deb; in practice I think some
tools might use heuristics based on just the filename, to avoid having
to open, uncompress and parse every .deb around, for performance
reasons.

If the only change in the package filename format is in the <arch> part
where we'd use a name which would otherwise be valid as an arch name (so,
no weird symbols, or «-» separators that are not intended to split <os>
and <cpu> or similar), then using a name for the variant/ISA would be
fine.
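
To make the filename concern concrete, here is a sketch of the kind of
heuristic such a tool might apply. The accepted character set for the
<arch> part is an assumption made for this example (lowercase letters
and digits, with «-» only as a separator between components), not
dpkg's actual validation rule, and the function name is hypothetical.

```python
import re

# Assumed shape of an arch-like token: lowercase alphanumeric components
# joined by "-" (for example "arm64", "arm64v9", "kfreebsd-amd64").
ARCH_TOKEN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def split_deb_filename(filename):
    """Split 'name_version_arch.deb' into (name, version, arch).

    Returns None when the filename does not follow the convention or
    when the arch part contains characters outside the assumed set.
    """
    if not filename.endswith(".deb"):
        return None
    parts = filename[: -len(".deb")].split("_")
    if len(parts) != 3:
        return None
    name, version, arch = parts
    if not ARCH_TOKEN.match(arch):
        return None
    return name, version, arch
```

With this heuristic «binpkg_1.0-1_arm64v9.deb» still splits cleanly,
while a name using other symbols in the <arch> part would be rejected.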

> >  - If dpkg-architecture needs to be aware of this, then this might need
> >    to be auto-detectable from just the current toolchain being used.

> So you are saying to configure a build environment for, say, x86-64-v3 you
> would configure gcc with --with-arch64=x86-64-v3 and then dpkg-architecture
> would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT
> appropriately? (modulo mistakes in details) Or do you mean something else
> entirely?

That would be one solution yes, which could give automatic bijective
mappings, although ideally with a machine-readable way to get at it,
which I'm not sure we have currently. For example code in dpkg-dev
already runs «$CC -dumpmachine» to infer the host architecture to use
during builds.
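
A sketch of what such probing could look like, given that there is no
machine-readable interface: it runs «$CC -dumpmachine» for the triplet
and scrapes the -march= line from «$CC -Q --help=target». The scraping
is an assumption about the human-oriented output layout, which is
exactly the fragility mentioned above, and the function names are
hypothetical.

```python
import subprocess


def parse_march(help_target_text):
    """Extract the default -march= value from '-Q --help=target' output.

    Assumes a line of the form '  -march=   <value>'; returns None if
    no such line with a value is found.
    """
    for line in help_target_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "-march=":
            return fields[1]
    return None


def probe_compiler(cc="gcc"):
    """Return (triplet, default_march) for the given compiler command."""
    triplet = subprocess.run(
        [cc, "-dumpmachine"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    help_text = subprocess.run(
        [cc, "-Q", "--help=target"],
        capture_output=True, text=True, check=True,
    ).stdout
    return triplet, parse_march(help_text)
```

Mapping the returned value back to a variant/ISA name would still need
a table somewhere, so this only covers the detection half.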

While using a triplet variation could be a way to do that, that would
require such triplet support for each variant/ISA, which tends to be
very painful to introduce if it's not there already, so I'd not
consider this specific way a viable option.

> > Some of the above problems could perhaps be avoided if we introduced
> > a concept of architecture aliases/ISAs (similar to what rpm has), which
> > would side-step the pool sharing issue, the binary package renaming,
> > etc. One big issue with this is that it requires for dpkg to have an
> > exhaustive table of all such aliases, and if there's ever a new alias
> > added, old dpkg versions need to be updated or they will not understand
> > what they match with. So this does not seem ideal either. So I guess this
> > is a variation over your proposal, but perhaps this could still be used
> > in specific contexts, say only at build-time (but not for dependency
> > relationships), for repo management (say binary-arm64v9/Packages.xz),
> > or binary package names where the field would specify the actual name
> > for the filename, say:
> >
> >   Architecture: arm64
> >   ArchitectureIsa: arm64v9
> >
> > or maybe better:
> >
> >   Architecture: arm64
> >   ArchitectureIsa: v9
> >
> > resulting in dpkg-deb generating:
> >
> >   binpkg_1.0-1_arm64v9.deb
> >
> > but targeting arm64.

> I'm not sure but I think you have talked yourself into suggesting something
> very similar to my proposal here?

Ah sorry, yeah, didn't mean to present it as a new idea, I was mostly
trying to walk over the issues, and refine upon your initial idea,
with those constraints applied. :)

> > On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
> > > Is there a better way of doing this?
> >
> > I think starting from 5, the rest are probably just details to hammer
> > out, but not insurmountable things.

> Great. The things I see as a bit vague at a base level currently are:
> 
> * Should the ISA influence the toolchain via toolchain defaults or
> dpkg-buildflags?
> * How is the default ISA for a buildd chroot selected?

So the clear downside of either modifying the default toolchain or
having to provide an additional one is that it seems pretty
heavyweight, also because people might want to build optimized variants
locally w/o having to mess with their already existing toolchains.
(I'm not sure whether something going along the lines of
<https://git.hadrons.org/cgit/debian/fakecross.git> could be an
option, although as mentioned above, if that would imply new triplets,
then probably not.)

So the easiest way might indeed be to control this via an envvar,
which dpkg-buildpackage could also set up internally via a new option,
say --arch-isa=amd64v3 or similar, to make this slightly more
discoverable. That would be easy to use from the buildds too, I guess.
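
To sketch how the envvar route could translate into flags: the
variable name DEB_HOST_ARCH_ISA, the variant names and the table below
are all hypothetical, and neither the variable nor an --arch-isa option
exists in dpkg today. The only real pieces are the GCC -march values
for the x86-64 micro-architecture levels.

```python
import os

# Hypothetical mapping from variant/ISA name to extra compiler flags.
ISA_FLAGS = {
    "amd64v2": "-march=x86-64-v2",
    "amd64v3": "-march=x86-64-v3",
    "amd64v4": "-march=x86-64-v4",
}


def isa_build_flags(environ=None):
    """Return the extra compiler flags for the ISA selected via envvar.

    Reads the hypothetical DEB_HOST_ARCH_ISA variable; an unset or
    empty value means a baseline build, so no flags are added.
    """
    env = os.environ if environ is None else environ
    isa = env.get("DEB_HOST_ARCH_ISA")
    if not isa:
        return []
    if isa not in ISA_FLAGS:
        raise ValueError(f"unknown ISA variant: {isa}")
    return [ISA_FLAGS[isa]]
```

A flag provider along these lines would append the result to the
default CFLAGS and friends, leaving packages that ignore the build
flags as plain baseline builds.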

> There is also the question of whether partial coverage of an ISA is handled
> by the package publisher or client side in apt but that's at least one
> level higher.

Yeah, that would be of no concern to dpkg, I think.

Thanks,
Guillem

