    bcachefs: Fix a journal deadlock in replay · 87b0d8d3
    Kent Overstreet authored
    Recently, journal pre-reservations were removed. They were for reserving
    space ahead of time in the journal for operations that are required for
    journal reclaim, e.g. btree key cache flushing and interior node btree
    updates.
    
    Instead we have watermarks - only operations for journal reclaim are
    allowed when the journal is low on space, and in general we're quite
    good about doing operations in the order that will free up space in the
    journal quickest when we're low on space. If we're doing a journal
    reclaim operation out of order, we usually do it in nonblocking mode if
    it's not freeing up space at the end of the journal.
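
As a rough illustration of the watermark idea, here is a minimal toy model in C. Every name in it (`toy_journal`, `WATERMARK_normal`, `toy_journal_res_get`, the `reclaim_reserve` field) is hypothetical and simplified for this sketch, not the bcachefs API; the real code has more watermark levels and a far more involved journal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified watermark levels; the real enum has more entries. */
enum watermark {
	WATERMARK_normal,
	WATERMARK_reclaim,
};

/* Toy journal: a fixed capacity with the last part held back for reclaim. */
struct toy_journal {
	size_t capacity;
	size_t used;
	size_t reclaim_reserve;	/* space only reclaim operations may use */
};

/* Space available to a caller at the given watermark. */
static size_t toy_journal_space(const struct toy_journal *j, enum watermark w)
{
	size_t free_space = j->capacity - j->used;

	if (w == WATERMARK_reclaim)
		return free_space;
	return free_space > j->reclaim_reserve ? free_space - j->reclaim_reserve : 0;
}

/* Try to reserve space; returns false instead of blocking (nonblocking mode). */
static bool toy_journal_res_get(struct toy_journal *j, enum watermark w, size_t bytes)
{
	assert(j->used <= j->capacity);
	if (toy_journal_space(j, w) < bytes)
		return false;
	j->used += bytes;
	return true;
}
```

With a capacity of 100 and a reserve of 20, a normal caller can take 80 units and is then refused, while a reclaim caller can still take from the remaining 20.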
    
    There's an exception though - once they've been started, interior btree
    node update operations have to be BCH_WATERMARK_reclaim, and they can't
    be nonblocking. Generally this is fine because they'll only be a very
    small fraction of transaction commits - but that assumption breaks
    during journal replay.
    
    Journal replay does many btree operations, but doesn't need to commit
    them to the journal since they're already in the journal. So removing
    pre-reservations, plus another change to make journal replay more
    efficient by initially doing the replay in sorted btree order, made it
    possible for the interior update operations that replay generates to
    fill and deadlock the journal.
    
    Fix this by introducing a new check on journal space at the _start_ of
    an interior update operation. This causes us to block if necessary in
    exactly the same way as we used to when interior updates took a journal
    pre-reservation, but without all the expensive accounting
    pre-reservations required.
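
The shape of the fix can be sketched with a toy model, assuming a journal that holds back a reserve for operations already in flight. All names here (`toy_journal`, `toy_interior_update`, `toy_update_start`, `toy_update_commit`) are hypothetical and do not come from bcachefs; returning `false` stands in for the point where the real code would block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from bcachefs. */
struct toy_journal {
	size_t capacity;
	size_t used;
	size_t reclaim_reserve;	/* held back for operations already in flight */
};

struct toy_interior_update {
	bool started;
};

/* Free space visible to callers that are not yet reclaim-critical. */
static size_t toy_space_before_reserve(const struct toy_journal *j)
{
	size_t free_space = j->capacity - j->used;

	return free_space > j->reclaim_reserve ? free_space - j->reclaim_reserve : 0;
}

/*
 * Start of an interior update: check (but do not take) journal space.
 * Returns false where the real code would block until space frees up.
 */
static bool toy_update_start(const struct toy_journal *j,
			     struct toy_interior_update *u, size_t bytes)
{
	if (toy_space_before_reserve(j) < bytes)
		return false;
	u->started = true;
	return true;
}

/*
 * Commit of a started update: may dip into the reserve and must not fail,
 * which is why the gate has to sit in toy_update_start().
 */
static void toy_update_commit(struct toy_journal *j,
			      struct toy_interior_update *u, size_t bytes)
{
	assert(u->started);
	assert(j->capacity - j->used >= bytes);
	j->used += bytes;
	u->started = false;
}
```

The check only looks at available space and records nothing, which mirrors the point above: the caller is throttled at the start of the operation without any per-operation reservation accounting.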
    Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
btree_update_interior.c 65.1 KB