
Re: Upstream dist tarball transparency (was Re: Validating tarballs against git repositories)



Hi Guillem,

On Wed, 3 Apr 2024 19:36:33 +0200, Guillem wrote:
> On Fri, 2024-03-29 at 23:29:01 -0700, Russ Allbery wrote:
> > On 2024-03-29 22:41, Guillem Jover wrote:
> > I think with my upstream hat on I'd rather ship a clear manifest (checked
> > into Git) that tells distributions which files in the distribution tarball
> > are build artifacts, and guarantee that if you delete all of those files,
> > the remaining tree should be byte-for-byte identical with the
> > corresponding signed Git tag.  (In other words, Guillem's suggestion.)
> > Then I can continue to ship only one release artifact.
>
> I've been pondering about this and I think I might have come up with a
> protocol that to me (!) seems safe, even against a malicious upstream. And
> does not require two tarballs which as you say seems cumbersome, and makes
> it harder to explain to users. But I'd like to run this through the list
> in case I've missed something obvious.

Does this cater for situations where part of the preparation of a source
tarball involves populating a directory with a list of filenames that
correspond to hostnames known to the source preparer?

If that set of hostnames changes, then regardless of the same source
VCS checkout being used, the resulting distribution source tarball could
differ.
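
To make the concern concrete, here is a minimal sketch in Python.  It is
purely illustrative: the hosts/ directory, the file names and the
tarball_digest() helper are all hypothetical, and a tree digest stands in
for a real tarball.  It only shows that an identical checkout can yield
different distributed trees when a preparer-local input feeds into it.

```python
import hashlib


def tarball_digest(checkout: dict[str, bytes], known_hosts: list[str]) -> str:
    """Digest of a hypothetical dist tree: the checkout plus hosts/<name> files."""
    tree = dict(checkout)
    # Assumption: the preparation step creates one empty file per known host.
    for host in known_hosts:
        tree[f"hosts/{host}"] = b""

    h = hashlib.sha256()
    # Hash paths in sorted order so the digest depends only on tree content.
    for path in sorted(tree):
        data = tree[path]
        h.update(path.encode() + b"\0")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


# The same VCS checkout is used for both preparations.
checkout = {
    "configure.ac": b"AC_INIT([demo], [1.0])\n",
    "src/main.c": b"int main(void) { return 0; }\n",
}

digest_a = tarball_digest(checkout, ["build1.example.org"])
digest_b = tarball_digest(checkout, ["build1.example.org", "build2.example.org"])
```

Here digest_a and digest_b differ even though the checkout is unchanged,
because the set of hostnames is an input that the VCS never recorded.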

Yes, it's a hypothetical example; but given enough time and patience, a
motivated attacker will attempt any available workaround.  In practice the
difference could be a directory of hostnames, or it could be a bit flag
within a macro that is only evaluated under certain nested conditions.

To take a leaf from the Reproducible Builds[1] project: to achieve a
one-to-one mapping between a set of inputs and an output, you need to
record all of the inputs; not only the source code, but also the build
environment.
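
As a rough sketch of what "recording all of the inputs" could look like
(in Python; the field names, version strings and helper functions below
are my own assumptions, not a format defined by Reproducible Builds or by
any existing tool):

```python
import hashlib
import json


def input_record(source_digest: str, toolchain: dict[str, str],
                 environment: dict[str, str]) -> dict:
    """Collect every declared input of the tarball preparation in one record."""
    return {
        "source": source_digest,     # e.g. the digest of the signed Git tag's tree
        "toolchain": toolchain,      # tool name -> version used to prepare the tarball
        "environment": environment,  # environment variables that influence the output
    }


def record_digest(record: dict) -> str:
    """Hash a canonical JSON encoding so that equal records give equal digests."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


# Hypothetical values, for illustration only.
base = input_record(
    "0123abcd",
    {"autoconf": "2.71", "automake": "1.16.5"},
    {"LC_ALL": "C", "SOURCE_DATE_EPOCH": "1711756800"},
)
changed = input_record(
    "0123abcd",
    {"autoconf": "2.72", "automake": "1.16.5"},
    {"LC_ALL": "C", "SOURCE_DATE_EPOCH": "1711756800"},
)

same_inputs = record_digest(base) == record_digest(dict(base))
tool_change_detected = record_digest(base) != record_digest(changed)
```

Two preparers can then compare record digests before comparing tarballs:
if the records differ, a differing tarball is expected rather than
suspicious.  This only covers inputs that someone thought to declare.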

I'm not yet convinced that source-as-was-written to distributed-source-tarball
is a problem that is any different to that of distributed-source-tarball to
built-package.  Changes to tooling do, in reality, affect the output of
build processes -- and that's usually good, because it allows for
performance optimizations.  But it also means that the toolchain and
environment have to be recorded alongside the source in order to produce
repeatable results.

Regards,
James

[1] - https://reproducible-builds.org/

