• Patrick Steinhardt's avatar
    checks: Fix combinatorial explosion in `#commits_for()` · 755ed5e3
    Patrick Steinhardt authored
    The function `#commits_for()` has been introduced via 156ce433
    (checks: Implement infrastructure for bulk diff checks, 2021-07-29) in
    order to allow bulk-loading of commits. What this function does is given
    a set of new commits and a specific object ID, it will do a graph walk
    of these new commits starting from this ID in order to extract only
    those commits which have been newly introduced via this object ID.
    
    As it turns out though, the implementation has a bug which causes
    combinatorial explosion: if a commit is reachable via multiple commits,
    then it will be walked and returned multiple times. This can happen if
    there are merge commits, where we'll now repeatedly walk all common
    ancestors of both commits. In case these again contain merge commits,
    the common ancestors again get walked multiple times. The result is that
    commits get walked exponentially many times. For the following graph
    with criss-cross merges, this causes us to return 768 commits instead of
    the expected 18:
    
          o---o---o---o---o---o---o---o
         / \ / \ / \ / \ / \ / \ / \ / \
        o   X   X   X   X   X   X   X   o
        \  / \ / \ / \ / \ / \ / \ / \ /
          o---o---o---o---o---o---o---o
    
    Fix the bug by removing seen commit IDs from the hash tracking commits
    by their object ID. Furthermore, the pending queue is converted to a set
    such that we don't re-add IDs which we have already seen before such
    that it doesn't exhibit exponential growth.
    
    Changelog: fixed
    755ed5e3
To find the state of this project's repository at the time of any of these versions, check out the tags.
changes_access_spec.rb 5.36 KB