
Re: Validating tarballs against git repositories



Simon Josefsson <simon@josefsson.org> writes:
> Russ Allbery <rra@debian.org> writes:

>> I believe you're talking about two different things.  I think Sean is
>> talking about preimage resistance, which assumes that the known-good
>> repository is trusted, and I believe Simon is talking about
>> manufactured collisions where the attacker controls both the good and
>> the bad repository.

> Right.  I think the latter describes the xz scenario: someone could have
> pushed a maliciously crafted commit with a SHA1 collision commit id, so
> there are two different git repositories with that commit id, and a
> signed git tag on that commit id authenticates both trees, opening up
> for uncertainty about what was intended to be used.  Unless I'm missing
> some detail of how git signed tag verification works that would catch
> this.

This is also my understanding.

>> The dgit and tag2upload design probably (I'd have to think about it
>> some more, ideally while bouncing the problem off of someone else,
>> because I've recycled those brain cells for other things) only needs
>> preimage resistance, but the general case of a malicious upstream may
>> be vulnerable to manufactured collisions.

> It is not completely clear to me: How about if some malicious person
> pushed a commit to salsa, asked a DD to "please review this repository
> and sign a tag to make the upload"?  The DD would presumably sign a
> commit id that authenticates two different git trees, one with the
> exploit and one without it.

Oh, hm, yes, this is a good point.  I had forgotten that tag2upload was
intended to work by pushing a tag to Salsa.  This means an attacker can
potentially race Salsa CI to move that tag to the malicious tree before
the tree is fetched by tag from Salsa, or reuse the signed tag with a
different repository with the same SHA-1.

The first, most obvious step is that one has to make sure that a signed
tag is restricted to a specific package and version and not portable to a
different package and/or version that has the same SHA-1 hash due to
attacker construction.  There are several obvious ways that could be done;
the one that comes immediately to mind is to require the tag message be
the source package name and version number, which is good practice anyway.
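
A minimal sketch of that check, assuming a convention (hypothetical, not
something tag2upload specifies) where the first line of the tag message is
"<source> <version>".  The function name and the regular expression are my
own inventions for illustration:

```python
import re

# Hypothetical convention: first line of the tag message is "<source> <version>".
# The source-name pattern follows the usual Debian rule (lowercase letters,
# digits, and "+", "-", "."; at least two characters).
TAG_SUBJECT = re.compile(r"^(?P<source>[a-z0-9][a-z0-9+.-]+) (?P<version>\S+)$")


def tag_matches_upload(tag_message: str, source: str, version: str) -> bool:
    """Return True only if the tag message names exactly this package and version."""
    lines = tag_message.splitlines()
    if not lines:
        return False
    match = TAG_SUBJECT.match(lines[0].strip())
    if match is None:
        return False
    return match["source"] == source and match["version"] == version
```

The signature still has to be verified separately; this only ensures that
a valid signed tag cannot be replayed for a different package or version.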

I think any remaining issues could be addressed with a fairly simple
modification to the protocol: rather than pushing the signed tag to Salsa,
the DD reviewer should push the signed tag to a separate archive server
similar to that used by dgit today.  As long as the first time the signed
tag leaves the DD's system is in conjunction with a push of the
corresponding reviewed tree to secure project systems, this avoids the
substitution problem.  The tag could then be pushed back to Salsa, either
by the DD or by the service.

This unfortunately means that one couldn't use the Salsa CI service to do
the source package construction, and one has to know about this extra
server.  I think that restriction comes from the fact that we're worried
an attacker may be able to manipulate the Salsa Git repository (through
force pushes and tag replacements, for example), whereas the separate
dedicated archive server can be more restrictive and never allow force
pushes or tag moves, and reject any attempts to push a SHA-1 hash that has
already been seen.
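
To make the policy concrete, here is a rough sketch of the per-ref check
such a server might run on each push.  Everything here is an assumption
for illustration: the function name, the "seen_ids" set (every object id
any ref has ever pointed at), and the "is_ancestor" callback that the
caller would have to implement against the real repository.

```python
ZERO_ID = "0" * 40  # placeholder git uses for "this ref does not exist"


def check_ref_update(ref, old, new, seen_ids, is_ancestor):
    """Decide whether one ref update is acceptable.

    Returns (accepted, reason).  seen_ids is mutated on acceptance so that
    the same object id can never be pushed a second time.
    """
    if new == ZERO_ID:
        return False, "ref deletion is not allowed"
    if new in seen_ids:
        # A colliding object necessarily reuses an id we have already
        # recorded, so this is the check that stops the substitution.
        return False, "object id has already been seen"
    if old != ZERO_ID:
        if ref.startswith("refs/tags/"):
            return False, "existing tags may not be moved"
        if not is_ancestor(old, new):
            return False, "non-fast-forward (force) push"
    seen_ids.add(new)
    return True, "ok"
```

Note that the ancestry test alone cannot distinguish two colliding
objects, which is why the sketch also keeps the list of ids already seen.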

Another possible option would be to prevent force pushes and tag moves in
Salsa, since I think one of those operations would be required to pull off
this attack, but maybe I'm missing something.  One of the things I'm murky
on is exactly what Git operations are required to substitute the two trees
with identical SHA-1 hashes.  That property is going to break Git in weird
ways, and I'm not sure what that means for one's ability to manipulate a
Git repository over the protocols that Salsa exposes.

Obviously it would be ideal if Git used stronger hashes than SHA-1 for
tags, so that one need worry less about all of this.

Even if my analysis is wrong, I think there are some fairly obvious and
trivial additions to the tag2upload process that would prevent this
attack, such as building a Merkle tree of the reviewed source tree using a
SHA-256 hash and embedding the top hash of that tree in the body of the
signed tag where it can be verified by the archive infrastructure.  That
might be a good idea *anyway*, although it does have the unfortunate side
effect of requiring a local client to produce a correct tag rather than
using standard Git signed tags.  Uploading to Debian currently already
semi-requires a custom local client, so to me this isn't a big deal,
although I think there was some hope to avoid that.
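
For the sake of discussion, a minimal sketch of what that local client
could compute.  The encoding of tree entries, the decision to skip ".git",
and the "Source-Tree-SHA256:" line in the tag body are all assumptions of
mine, not an agreed format; a real design would also need to decide
whether file modes are covered, which this sketch ignores.

```python
import hashlib
import os


def merkle_hash(path: str) -> str:
    """Return a SHA-256 Merkle hash (hex) of a file, symlink, or directory."""
    if os.path.islink(path):
        # Hash the link target itself rather than following the link.
        target = os.fsencode(os.readlink(path))
        return hashlib.sha256(b"link\0" + target).hexdigest()
    if os.path.isdir(path):
        h = hashlib.sha256(b"tree\0")
        for name in sorted(os.listdir(path)):
            if name == ".git":
                continue  # repository metadata is not part of the source tree
            child = merkle_hash(os.path.join(path, name))
            # File names cannot contain NUL, so this encoding is unambiguous.
            h.update(os.fsencode(name) + b"\0" + child.encode("ascii") + b"\n")
        return h.hexdigest()
    h = hashlib.sha256(b"blob\0")
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def tag_body(source: str, version: str, root: str) -> str:
    """Build a tag message carrying the top hash (hypothetical format)."""
    return f"{source} {version}\n\nSource-Tree-SHA256: {merkle_hash(root)}\n"
```

The archive side would recompute merkle_hash() over the tree it fetched
and refuse the upload if the value differs from the one in the signed tag.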

(These variations unfortunately don't help with the upstream problem.)

-- 
Russ Allbery (rra@debian.org)              <https://www.eyrie.org/~eagle/>

