
Re: Architecture variants for Debian / Ubuntu



On Wed, 6 Sept 2023 at 21:27, Guillem Jover <guillem@debian.org> wrote:
Hi!

Hi!

Thanks for the considered response. And sorry for the very slow reply.
 
On Fri, 2023-09-01 at 08:43:55 +1200, Michael Hudson-Doyle wrote:
> Recently the topic of exploiting newer instructions without dropping
> support for older machines has come up several times inside Ubuntu
> engineering. I understand this topic has come up several times in the past
> for Debian as well, but nothing has really come of it to date.

I also had a chat about this with Matthias Klose (CCed) around 2022-05.

> I've spent a while thinking through the options and coming up with a design
> and wrote some notes into a wiki page:
> https://wiki.debian.org/ArchitectureVariants

I think we are already doing 1, 2 and 3. I agree 4 is just wrong. And
something like 5 is what I suggested to Matthias for Ubuntu when we
last discussed it as the best way to go about this.

OK, glad we agree up to this point.
 
I'm not sure I entirely agree with the requirements you set forth
though:

 - I think such optimized builds might need to be done with "special
   toolchains" (these could simply be wrappers over the host compiler
   passing the appropriate flags via command-line or via specs or
   similar, not necessarily full blown toolchains), passing these via
   something like dpkg-buildflags seems currently unreliable, as I don't
   think we have full coverage in packages (neither for all compilers
   available)? Although it would be better as it would centralize the
   management. (For reference this is in part how rpm handles this:
    https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in)

I agree that it is not completely clear what the best approach is here: do we change the defaults of gcc, or influence things via the default buildflags?

I'm sure there are packages that do not respect dpkg-buildflags during build, but the consequences of this do not seem all that serious -- such packages would not be optimized for the variant / ISA, but if someone notices this, they can fix the bug.

OTOH, changing the compiler default may be a bit of a surprise for people who build binaries that are deployed by means other than Debian packages. (Do our compilers in general target the same baseline as Debian does for a given architecture?)
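To make the buildflags option concrete, here is a minimal sketch of how a build driver might inject an ISA-specific flag through the environment rather than through compiler defaults. It relies on the DEB_CFLAGS_APPEND / DEB_CXXFLAGS_APPEND environment variables that dpkg-buildflags honours. The ISA_MARCH table and the buildflags_env helper are hypothetical names made up for illustration; nothing like them exists in dpkg today.

```python
import os

# Hypothetical mapping from an ISA name to the compiler flag selecting it.
ISA_MARCH = {"amd64v3": "-march=x86-64-v3"}


def buildflags_env(isa, base_env=None):
    """Return a copy of the environment with the ISA flag appended.

    dpkg-buildflags reads DEB_<flag>_APPEND, so packages that respect
    dpkg-buildflags pick the flag up; packages that ignore it are simply
    built for the baseline.
    """
    env = dict(os.environ if base_env is None else base_env)
    march = ISA_MARCH.get(isa)
    if march is None:
        return env  # unknown ISA: leave the environment untouched
    for flag in ("CFLAGS", "CXXFLAGS"):
        key = f"DEB_{flag}_APPEND"
        env[key] = (env.get(key, "") + " " + march).strip()
    return env
```

The returned mapping could be passed as the env argument when spawning the package build. A toolchain-default approach would need none of this, at the cost of also affecting builds outside of packaging.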
 
 - Perhaps that's a limitation from the archive software side, but
   requiring to place the binary packages in the same pool seems
   rather restrictive (it forces different filenames for example).

We are considering supporting multiple variant/ISAs in the primary Ubuntu archive, so if we get that far then yes, we want to have all the binary packages in the same pool. The first steps don't have to support this I guess.
 
 - I guess it might be nice for the ISA to be passed down to the
   dpkg tools, but I don't think this is strictly necessary? A
   frontend like apt could also decide based on metadata in say the
   Release file, although not having the actual installed package
   metadata on whether it was a different ISA build or not would make
   its job more inconvenient. In any case I don't have a big issue
   with recording this via dpkg-gencontrol or similar if necessary.

I agree, I don't think it's /strictly/ required that the target ISA is recorded in the deb. But I think adding a field for it reduces scope for confusion later.
 
On the specific implementation details:

 - Changing the Architecture format (as in adding colons there) seems
   like a non-starter, and I expect that would break lots of things
   (I mean it could be done but I'm not sure it's worth it for this).
   Recording this mostly as a hint than anything else, via another
   field (if necessary at all) I think would be best.

Agreed.
 
 - As covered in previous discussions, dpkg could (but I don't think
   it's necessary) check whether the .deb is runnable on the current
   hw, but that's tricky as chrootless installs need to be taken
   into account, etc. It should certainly not be part of dependency
   resolution.

I'm sorry, what is a chrootless install? But I think I agree here too: tricky and just not really worth it.
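For reference, a hardware check of this kind could be sketched as below, assuming a Linux host where /proc/cpuinfo is readable. The flag names are the Linux spellings as I understand them (for example "pni" for SSE3 and "abm" for LZCNT), and "xsave" is used as a stand-in for OSXSAVE; treat the exact sets as an assumption to verify against the x86-64 psABI levels. The helper names are hypothetical. Note that it inspects the machine running the check, which is precisely why it falls short when the install target is a different system.

```python
# CPU flags (Linux /proc/cpuinfo names) assumed to be needed on top of
# the baseline for each level.
X86_64_V2 = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
X86_64_V3 = X86_64_V2 | {
    "avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave",
}


def cpu_flags(cpuinfo_text):
    """Extract the flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            return set(value.split())
    return set()


def supports(required, flags):
    """True if every required flag is present."""
    return required <= flags
```

On a real system the text would come from reading /proc/cpuinfo, e.g. supports(X86_64_V3, cpu_flags(open("/proc/cpuinfo").read())).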
 
 - I'm not fond of having to change the binary package name format
   either for this (name_version_arch.deb) even if at least dpkg
   itself does not care (but I know other tools do care), and
   depending on the format I'd expect things to break (this goes
   back to the shared pool concern).

I don't think this is avoidable in the long run. I must admit I have generally thought of the presence of the architecture name in the .deb file name as more of a convention than part of the format (the "real" indication of a binary package's architecture being in DEBIAN/control).
 
 - If dpkg-architecture needs to be aware of this, then this might need
   to be auto-detectable from just the current toolchain being used.

So you are saying that to configure a build environment for, say, x86-64-v3, you would configure gcc with --with-arch64=x86-64-v3, and then dpkg-architecture would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT appropriately? (modulo mistakes in details) Or do you mean something else entirely?
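A rough sketch of that detection step, to show what I have in mind. It assumes that gcc -Q --help=target prints a line of the form "-march=" followed by whitespace and the default value; the sample text is an abridged approximation of that output, not a verbatim capture. The function names and the rule "anything other than the baseline is a variant" are hypothetical.

```python
import re
import subprocess

# Abridged approximation of `gcc -Q --help=target` output.
SAMPLE = (
    "The following options are target specific:\n"
    "  -m64                        \t\t[enabled]\n"
    "  -march=                     \t\tx86-64-v3\n"
    "  -mtune=                     \t\tgeneric\n"
)


def parse_march(help_target_output):
    """Return the value on the -march= line, or None if there is none."""
    for line in help_target_output.splitlines():
        m = re.match(r"\s*-march=\s+(\S+)", line)
        if m:
            return m.group(1)
    return None


def arch_variant(march, baseline="x86-64"):
    """Map the compiler default to a variant name; None means baseline."""
    return None if march in (None, baseline) else march


def detect_variant(cc="gcc"):
    """Ask the compiler for its defaults (requires gcc to be installed)."""
    out = subprocess.run(
        [cc, "-Q", "--help=target"],
        capture_output=True, text=True, check=True,
    ).stdout
    return arch_variant(parse_march(out))
```

With the sample text, arch_variant(parse_march(SAMPLE)) gives "x86-64-v3", which is the kind of value that would end up in DEB_HOST_ARCH_VARIANT.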
 
Some of the above problems could perhaps be avoided if we introduced
a concept of architecture aliases/ISAs (similar to what rpm has), which
would side-step the pool sharing issue, the binary package renaming,
etc. One big issue with this is that it requires for dpkg to have an
exhaustive table of all such aliases, and if there's ever a new alias
added, old dpkg versions need to be updated or they will not understand
what they match with. So this does not seem ideal either. So I guess this
is a variation over your proposal, but perhaps this could still be used
in specific contexts, say only at build-time (but not for dependency
relationships), for repo management (say binary-arm64v9/Packages.xz),
or binary package names where the field would specify the actual name
for the filename, say:

  Architecture: arm64
  ArchitectureIsa: arm64v9

or maybe better:

  Architecture: arm64
  ArchitectureIsa: v9

resulting in dpkg-deb generating:

  binpkg_1.0-1_arm64v9.deb

but targeting arm64.

I'm not sure but I think you have talked yourself into suggesting something very similar to my proposal here?
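To check we mean the same thing, here is the naming rule from the second form of the example (Architecture: arm64, ArchitectureIsa: v9) as a small sketch. The field name ArchitectureIsa is taken from the example above and is not an existing dpkg field, and both helper functions are hypothetical.

```python
def deb_filename(package, version, architecture, isa=None):
    """Build name_version_arch.deb, appending the ISA suffix if present.

    The package still targets `architecture`; the ISA only affects the
    file name, which keeps variant builds distinct in a shared pool.
    """
    suffix = architecture + (isa or "")
    return f"{package}_{version}_{suffix}.deb"


def packages_index_path(architecture, isa=None):
    """Per-ISA index directory, e.g. binary-arm64v9/Packages.xz."""
    return f"binary-{architecture}{isa or ''}/Packages.xz"
```

So deb_filename("binpkg", "1.0-1", "arm64", "v9") yields binpkg_1.0-1_arm64v9.deb, while omitting the ISA gives the familiar binpkg_1.0-1_arm64.deb.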
 
I also think I prefer naming this explicitly as ISA
variants, if you will, than just architecture variants as that gives
way too much room

Certainly I think all the interesting use cases are basically changing the set of instructions emitted by the toolchains by default. I suppose you could have a variant that changed the set of hardening flags or something but that doesn't seem an especially good idea. So I guess I'd be happy with s/ArchitectureVariant/ArchitectureISA/ everywhere.
 
(which perhaps we want, but then that has other
implications over compatibility), and for the field perhaps just Isa is
better, to avoid the implicit repetition of
ArchitectureInstructionSetArchitecture :), but that makes it less easy
to associate both as related.

In the end though, I think there are perhaps bigger constraints from
the infra side of things than the package tooling, stuff like archive
management software, or binary transition migration and similar.

I think I managed to convince myself that most things like britney and ben can and should treat each variant/ISA as a separate architecture. It depends a bit on how publication is done in the case where not all packages are built for all ISAs, but not in very interesting ways, I think. My intention is to start with amd64v3 and build everything for this ISA (as we have heaps of builder capacity on amd64 in Ubuntu), and so sidestep worrying about that for a little while.
 
> In terms of building consensus around this design, I thought it makes sense
> to start at the bottom of the stack and so here I am on this mailing list
> :-) I guess in due course this could become a DEP, and would certainly need
> to be discussed on debian-devel before getting too far.

I'm not sure there's ever been much of a wide interest in something
like this in Debian TBH. Due to deployment and increased infra
overhead at least?

Yes that's fair. And as I said somewhere, I myself am not proposing to support any additional ISAs in Debian at this time.
 
> What do you think? Have I missed any glaring implications?

No, I think the overall picture is about right, and captures most of the
things we have discussed at various times and places in the past. :)

I am very happy to read this!
 
> Is there a better way of doing this?

I think starting from 5, the rest are probably just details to hammer
out, but not insurmountable things.

Great. The things I see as a bit vague at a base level currently are:

* Should the ISA influence the toolchain via toolchain defaults or dpkg-buildflags?
* How is the default ISA for a buildd chroot selected?

There is also the question of whether partial coverage of an ISA is handled on the package publisher side or on the client side in apt, but that's at least one level higher up the stack.

Cheers,
mwh 
