• Patrick Steinhardt's avatar
    checks: Fix revalidation of preexisting commits · d86514f1
    Patrick Steinhardt authored
    When pushing commits into a repository, then client and server negotiate
    a packfile containing all objects which are necessary to end up with a
    fully connected graph on the server-side. In general, this contains at
    least all objects which have been newly introduced between the set of
    all old and new references. But in some cases, it can be that Git will
    send packfiles which knowingly contain objects which existed on the
    server side already, e.g. when Git decides to reuse deltas from an
    existing packfile where the delta base is a preexisting commit. As a
    result, any well-formed packfile is a superset of objects required to
    satisfy the update.
    
    In v14.3 we have refactored access checks to use the quarantine
    directory to enumerate new commits directly. Because of above property,
    we may get too many objects from the object quarantine directory, which
    means that as a result we may perform access checks on commits which in
    fact aren't new in case the client decided to include these in the pack.
    While this is not a problem in most access checks (an object which is in
    the main repository but which we re-check is going to still pass the
    checks), other checks are more sensitive. Most importantly, push rules
    may require a commit to be created by the author who is currently
    performing the change. If we include preexisting commits of a different
    author in such a check, then it is totally expected that the access
    check will now fail. As a result, we must never include preexisting
    commits in push rule access checks.
    
    To determine new commits in push rules, we do an in-memory walk of
    commits returned from the quarantine directory, where we walk from the
    tip of each change until we are not able to satisfy the commit's parents
    anymore. And in this case, we happily traverse past commits which are
    known already inc ase those were returned from the quarantine directory.
    To fix this, we need to abort the walk as soon as we hit an already
    known object.
    
    The problem is though that we have no easy way to determine the already
    known object in the general case. But we can do so in limited cases:
    when the change we're processing has both an old and a new revision
    (that is, it is an "update"), then we simply skip adding oldrev to the
    result set. This doesn't work though for branch creations, where we
    ain't got no oldrev. We thus fall back to enumerating commits not via
    the quarantine directory in that case, but instead by using a revision
    walk with `--not --all`. This walk will not contain any objects which
    are referenced by any reference, and thus we can be sure that the
    in-memory walk will not traverse past any preexisting object.
    
    Implement this schema. Unfortunately this is going to be a lot less
    performant compared to using the quarantine directory in all cases. But
    better be less performant than wrong.
    
    Changelog: fixed
    d86514f1
To find the state of this project's repository at the time of any of these versions, check out the tags.
changes_access.rb 4.26 KB