Commits · b1cdc398ae36689300b4108ce9c90c58cac1ba34 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Make more btree_paths available · b1cdc398

Kent Overstreet authored 2 years ago

 - Don't decrease BTREE_ITER_MAX when building with CONFIG_LOCKDEP
   anymore. The lockdep table sizes are configurable now, we don't need
   this anymore.
 - btree_trans_too_many_iters() is less conservative now. Previously it
   was causing a transaction restart if we had used more than
   BTREE_ITER_MAX / 2 paths, change this to BTREE_ITER_MAX - 8.

This helps with excessive transaction restarts/livelocks in the bucket
allocator path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
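
A minimal sketch of the threshold change described in this commit. It assumes BTREE_ITER_MAX is 64, an illustrative value that is not taken from the commit, and the two function names are hypothetical; they model only the comparison, not the real btree_trans_too_many_iters().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value, for illustration only. */
#define BTREE_ITER_MAX 64

/* Old policy: ask for a restart once more than half of the paths are in use. */
static inline bool too_many_iters_old(unsigned nr_paths)
{
	return nr_paths > BTREE_ITER_MAX / 2;
}

/* New policy: ask for a restart only when fewer than 8 paths remain free. */
static inline bool too_many_iters_new(unsigned nr_paths)
{
	return nr_paths > BTREE_ITER_MAX - 8;
}
```

Under these assumptions, a transaction holding 40 paths was restarted by the old check but is left alone by the new one.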

b1cdc398

bcachefs: Track maximum transaction memory · 616928c3

Kent Overstreet authored 2 years ago

This patch
 - tracks maximum bch2_trans_kmalloc() memory used in btree_transaction_stats
 - makes it available in debugfs
 - switches bch2_trans_init() to using that for the amount of memory to
   preallocate, instead of the parameter passed in

This drastically reduces transaction restarts, and means we no longer
need to track this in the source code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
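
The bookkeeping described in this commit can be sketched as a per-function high-water mark. The struct and function names below are hypothetical stand-ins, not the real btree_transaction_stats layout or the bch2_trans_init() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-transaction-function statistics. */
struct trans_stats {
	size_t max_mem;	/* largest transaction memory usage seen so far */
};

/* When a transaction finishes: remember the high-water mark. */
static inline void stats_record_mem(struct trans_stats *s, size_t mem_used)
{
	if (mem_used > s->max_mem)
		s->max_mem = mem_used;
}

/* When a transaction starts: size the preallocation from history. */
static inline size_t stats_prealloc_bytes(const struct trans_stats *s)
{
	return s->max_mem;
}
```

Sizing the buffer from observed usage means callers no longer have to pass a hand-tuned estimate.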

616928c3

bcachefs: btree_locking.c · cd5afabe

Kent Overstreet authored 2 years ago

Start to centralize some of the locking code in a new file; more locking
code will be moving here in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

cd5afabe

bcachefs: Add assertions for unexpected transaction restarts · f0d2e9f2

Kent Overstreet authored 2 years ago

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

f0d2e9f2

bcachefs: Increment restart count in bch2_trans_begin() · c497df8b

Kent Overstreet authored 2 years ago

Instead of counting transaction restarts, count when the transaction is
restarted: if bch2_trans_begin() was called when the transaction wasn't
restarted we need to ensure restart_count is still incremented.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

c497df8b

bcachefs: Track the maximum btree_paths ever allocated by each transaction · 5c0bb66a

Kent Overstreet authored 2 years ago

We need a way to check if the machinery for handling btree_paths within
a transaction is behaving reasonably, as it often has not been - we've
had bugs with transaction path overflows caused by duplicate paths and
plenty of other things.

This patch tracks, per transaction fn, the most btree paths ever
allocated by that transaction and makes it available in debugfs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

5c0bb66a

bcachefs: Tracepoint improvements · 9f96568c

Kent Overstreet authored 2 years ago

Our types are exported to the tracepoint code, so it's not necessary to
break things out individually when passing them to tracepoints - we can
also call other functions from TP_fast_assign().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

9f96568c

bcachefs: Fix incorrectly freeing btree_path in alloc path · 17047fbc

Kent Overstreet authored 2 years ago

Clearing path->preserve means the path will be dropped in
bch2_trans_begin() - but on transaction restart, we're likely to need
that path again.

This fixes a livelock in the allocation path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

17047fbc

bcachefs: for_each_btree_key_reverse() · 4f84b7e3

Kent Overstreet authored 2 years ago

This adds a new macro, like for_each_btree_key2(), but for iterating in
reverse order.

Also, change for_each_btree_key2() to properly check the return value of
bch2_btree_iter_advance().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

4f84b7e3

bcachefs: EINTR -> BCH_ERR_transaction_restart · 549d173c

Kent Overstreet authored 2 years ago

Now that we have error codes, with subtypes, we can switch to our own
error code for transaction restarts - and even better, a distinct error
code for each transaction restart reason: clearer code and better
debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
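
One way to model error codes with subtypes is a parent table plus a walk up the hierarchy. This is a sketch under stated assumptions: the enum members, the table, and err_matches() are hypothetical names chosen for illustration, not the actual bcachefs error list or helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical error codes: one parent class and two restart reasons. */
enum err {
	ERR_NONE,
	ERR_transaction_restart,	/* parent class */
	ERR_restart_relock,		/* subtype: failed to retake a lock */
	ERR_restart_too_many_iters,	/* subtype: ran out of btree paths */
	ERR_NR,
};

/* Parent of each code; ERR_NONE marks a top-level code. */
static const enum err err_parent[ERR_NR] = {
	[ERR_restart_relock]		= ERR_transaction_restart,
	[ERR_restart_too_many_iters]	= ERR_transaction_restart,
};

/* True if err is cls itself or a descendant of cls. */
static inline bool err_matches(enum err err, enum err cls)
{
	while (err != ERR_NONE) {
		if (err == cls)
			return true;
		err = err_parent[err];
	}
	return false;
}
```

Callers that only care whether a restart happened test against the parent class, while tracepoints and debugging output can report the specific reason.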

549d173c

bcachefs: btree_trans_too_many_iters() is now a transaction restart · 0990efae

Kent Overstreet authored 2 years ago

All transaction restarts need a tracepoint - this is essential for
debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0990efae

bcachefs: Add a counter for btree_trans restarts · e941ae7d

Kent Overstreet authored 2 years ago

This will help us improve nested transactions - we need to add
assertions that whenever an inner transaction handles a restart, it
still returns -EINTR to the outer transaction.

This also adds nested_lockrestart_do() and nested_commit_do() which use
the new counters to correctly return -EINTR when the transaction was
restarted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
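
The idea behind the nested helpers can be sketched as follows: retry locally, then compare the restart counter against its value on entry and report -EINTR if it moved. The types and nested_retry() below are hypothetical stand-ins; the real nested_lockrestart_do() and nested_commit_do() are macros whose bodies are not shown in this log.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct btree_trans. */
struct trans {
	unsigned restart_count;
};

/* Hypothetical operation that asks for a restart a fixed number of times. */
struct flaky_op {
	struct trans *trans;
	int failures_left;
};

static inline int flaky_op_run(struct flaky_op *op)
{
	if (op->failures_left > 0) {
		op->failures_left--;
		op->trans->restart_count++;
		return -EINTR;
	}
	return 0;
}

/*
 * Retry op until it stops asking for a restart, then still report -EINTR
 * to the outer transaction if any restart happened along the way.
 */
static inline int nested_retry(struct flaky_op *op)
{
	unsigned old_count = op->trans->restart_count;
	int ret;

	do {
		ret = flaky_op_run(op);
	} while (ret == -EINTR);

	if (!ret && op->trans->restart_count != old_count)
		ret = -EINTR;
	return ret;
}
```

The inner work completes, but the outer transaction still learns that state it may depend on was invalidated.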

e941ae7d

bcachefs: lock time stats prep work. · 8bfe14e8

Daniel Hill authored 2 years ago

We need the caller name and a place to store our results; btree_trans provides both.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

8bfe14e8

bcachefs: for_each_btree_key2() · a1783320

Kent Overstreet authored 2 years ago

This introduces two new macros for iterating through the btree, with
transaction restart handling
 - for_each_btree_key2()
 - for_each_btree_key_commit()

Every iteration is now in an implicit transaction, and - as with
lockrestart_do() and commit_do() - returning -EINTR will cause the
transaction to be restarted, at the same key.

This patch converts a bunch of code that was open coding this to these
new macros, saving a substantial amount of code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
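
A simplified model of the restart-at-the-same-key behaviour, using a plain array in place of a btree. The callback type, for_each_key_retry(), and the example callback are hypothetical; the real macros operate on a btree_trans and btree_iter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: 0 on success, -EINTR to request a retry, else an error. */
typedef int (*key_fn)(int key, void *ctx);

/*
 * Visit keys[0..nr) in order. A callback returning -EINTR is re-run on the
 * same key; any other nonzero return stops the walk and is passed up.
 */
static inline int for_each_key_retry(const int *keys, size_t nr, key_fn fn, void *ctx)
{
	size_t i = 0;

	while (i < nr) {
		int ret = fn(keys[i], ctx);

		if (ret == -EINTR)
			continue;	/* restart: same key again */
		if (ret)
			return ret;
		i++;
	}
	return 0;
}

/* Example callback: sums keys, asking for one restart the first time it sees key 2. */
struct sum_ctx {
	int sum;
	int restarted;
};

static inline int sum_key(int key, void *p)
{
	struct sum_ctx *c = p;

	if (key == 2 && !c->restarted) {
		c->restarted = 1;
		return -EINTR;
	}
	c->sum += key;
	return 0;
}
```

Because the retry logic lives in the loop, the callback only has to return -EINTR and does not need its own retry code.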

a1783320

bcachefs: Improve an error message · 50b13bee

Kent Overstreet authored 2 years ago

When inserting a key type that's not valid for a given btree, we should
print out which btree we were inserting into.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

50b13bee

bcachefs: Fix journal_keys_search() overhead · 30525f68

Kent Overstreet authored 2 years ago

Previously, on every btree_iter_peek() operation we were searching the
journal keys, doing a full binary search - which was slow.

This patch fixes that by saving our position in the journal keys, so
that we only do a full binary search when moving our position backwards
or a large jump forwards.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
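
The saved-position idea can be sketched with a cursor over a sorted array: scan forward a few steps from the last result, and fall back to a binary search only when moving backwards or jumping far ahead. The struct, the function names, and the MAX_LINEAR_STEPS limit are all assumptions made for illustration, not the data structures used by journal_keys_search().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit on how far to scan linearly before giving up. */
#define MAX_LINEAR_STEPS 8

/* Hypothetical cursor over a sorted array of integer keys. */
struct key_cursor {
	const int *keys;
	size_t nr;
	size_t idx;		/* result of the previous search */
	int last_pos;		/* position searched for last time */
	int valid;		/* nonzero once idx and last_pos are set */
	unsigned full_searches;	/* number of binary searches, for illustration */
};

/* Binary search for the first index with keys[idx] >= pos. */
static inline size_t lower_bound(const int *keys, size_t nr, int pos)
{
	size_t l = 0, r = nr;

	while (l < r) {
		size_t m = l + (r - l) / 2;

		if (keys[m] < pos)
			l = m + 1;
		else
			r = m;
	}
	return l;
}

/* First index with keys[idx] >= pos, reusing the saved position when possible. */
static inline size_t cursor_search(struct key_cursor *c, int pos)
{
	if (c->valid && pos >= c->last_pos) {
		size_t i = c->idx;
		unsigned steps = 0;

		while (i < c->nr && c->keys[i] < pos && steps < MAX_LINEAR_STEPS) {
			i++;
			steps++;
		}
		if (i == c->nr || c->keys[i] >= pos) {
			c->idx = i;
			c->last_pos = pos;
			return i;
		}
	}

	/* Moving backwards, or too far forwards: do a full search. */
	c->full_searches++;
	c->idx = lower_bound(c->keys, c->nr, pos);
	c->last_pos = pos;
	c->valid = 1;
	return c->idx;
}
```

For an iterator that mostly moves forward in small steps, nearly every lookup is served by the short linear scan.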

30525f68

bcachefs: bch2_btree_iter_peek_all_levels() · b0babf2a

Kent Overstreet authored 2 years ago

This adds bch2_btree_iter_peek_all_levels(), which returns keys from
every level of the btree - interior nodes included - in monotonically
increasing order, soon to be used by the backpointers check & repair
code.

 - BTREE_ITER_ALL_LEVELS can now be passed to for_each_btree_key() to
   iterate thusly, much like BTREE_ITER_SLOTS

 - The existing algorithm in bch2_btree_iter_advance() doesn't work with
   peek_all_levels(): we have to defer the actual advancing until the
   next time we call peek, where we have the btree path traversed and
   uptodate. So, we add an advanced bit to btree_iter; when
   BTREE_ITER_ALL_LEVELS is set, bch2_btree_iter_advance() just marks
   the iterator as advanced.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b0babf2a

bcachefs: btree_path_make_mut() clears should_be_locked · d8648425

Kent Overstreet authored 2 years ago

This fixes a bug where __bch2_btree_node_update_key() wasn't clearing
should_be_locked, leading to bch2_btree_path_traverse() always failing -
all callers of btree_path_make_mut() want should_be_locked cleared, so
do it there.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

d8648425

bcachefs: bch2_trans_updates_to_text() · 8570d775

Kent Overstreet authored 2 years ago

This turns bch2_dump_trans_updates() into a to_text() method - this way
it can be used by debug tracing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

8570d775

bcachefs: bch2_trans_inconsistent() · 2158fe46

Kent Overstreet authored 2 years ago

Add a new error macro that also dumps transaction updates in addition to
doing an emergency shutdown - when a transaction update discovers or is
causing a fs inconsistency, it's helpful to see what updates it was
doing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

2158fe46

bcachefs: bch2_btree_iter_peek_upto() · 85d8cf16

Kent Overstreet authored 2 years ago

In BTREE_ITER_FILTER_SNAPSHOTS mode, we skip over keys in unrelated
snapshots. When we hit the end of an inode, if the next inode(s) are in
a different subvolume, we could potentially have to skip past many keys
before finding a key we can return to the caller, so they can terminate
the iteration.

This adds a peek_upto() variant to solve this problem, to be used when
we know the range we're searching within.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
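
A rough model of why an upper bound helps: with a known end of range, the scan can stop as soon as it leaves that range instead of wading through invisible keys. The key layout and this peek_upto() are hypothetical simplifications (keys sorted by inode number, with a precomputed visibility flag), not the bch2_btree_iter_peek_upto() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key: an inode number plus whether it is visible in our snapshot. */
struct key {
	int inode;
	int visible;
};

/*
 * Return the first visible key at or after index start whose inode is
 * <= end_inode, or NULL. *scanned counts the keys examined inside the range.
 */
static inline const struct key *peek_upto(const struct key *keys, size_t nr,
					  size_t start, int end_inode,
					  size_t *scanned)
{
	*scanned = 0;
	for (size_t i = start; i < nr; i++) {
		if (keys[i].inode > end_inode)
			return NULL;	/* past the range: stop scanning */
		(*scanned)++;
		if (keys[i].visible)
			return &keys[i];
	}
	return NULL;
}
```

With a tight end_inode, a search that starts at the end of one inode returns immediately rather than skipping every invisible key that follows.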

85d8cf16

bcachefs: BTREE_ITER_WITH_KEY_CACHE · f7b6ca23

Kent Overstreet authored 3 years ago

This is the start of cache coherency with the btree key cache - this
adds a btree iterator flag that causes lookups to also check the key
cache when we're iterating over the btree (not iterating over the key
cache).

Note that we could still race with another thread creating an item in
the key cache and updating it, since we aren't holding the key cache
locked if it wasn't found. The next patch for the update path will
address this by causing the transaction to restart if the key cache is
found to be dirty.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f7b6ca23

bcachefs: bch2_btree_path_set_pos() · ce91abd6

Kent Overstreet authored 3 years ago

bch2_btree_path_set_pos() is now available outside of btree_iter.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ce91abd6

bcachefs: iter->update_path · 1f2d9192

Kent Overstreet authored 3 years ago

With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the
path where the key was found, and the path for inserting into the
current snapshot. This adds a new field to struct btree_iter for saving
a path for the current snapshot, and plumbs it through
bch2_trans_update().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1f2d9192

bcachefs: Refactor bch2_btree_iter() · a1e82d35

Kent Overstreet authored 3 years ago

This splits bch2_btree_iter() up into two functions: an inner function
that handles BTREE_ITER_WITH_JOURNAL, BTREE_ITER_WITH_UPDATES, and
iterating across leaf nodes, and an outer one that implements
BTREE_ITER_FILTER_SNAPSHOTS.

This is prep work for remembering a btree_path at our update position in
BTREE_ITER_FILTER_SNAPSHOTS mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a1e82d35

bcachefs: Switch to __func__ for recording where btree_trans was initialized · 669f87a5

Kent Overstreet authored 3 years ago

Symbol decoding, via %ps, isn't supported in userspace - this will also
be faster when we're using trans->fn in the fast path, as with the new
BCH_JSET_ENTRY_log journal messages.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

669f87a5

bcachefs: BTREE_ITER_NOPRESERVE · f3e1f444

Kent Overstreet authored 3 years ago

This adds a flag to not mark the initial btree_path as preserve, for
paths that we expect to be cheap to reconstitute if necessary - this
solves a btree_path overflow caused by need_whiteout_for_snapshot().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

f3e1f444

bcachefs: Apply workaround for too many btree iters to read path · 084d42bb

Kent Overstreet authored 3 years ago

Reading from cached data, which calls bch2_bucket_io_time_reset(), is
leading to transaction iterator overflows - this standardizes the
workaround.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

084d42bb

bcachefs: bch2_assert_pos_locked() · 32b26e8c

Kent Overstreet authored 3 years ago

This adds a new assertion to be used by bch2_inode_update_after_write(),
which updates the VFS inode based on the update to the btree inode we
just did - we require that the btree inode still be locked when we do
that update.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

32b26e8c

bcachefs: path->should_be_locked fixes · 9a74f63c

Kent Overstreet authored 3 years ago

 - We should only be clearing should_be_locked in btree_path_set_pos() -
   it's the responsibility of the btree_path code, not the btree_iter
   code.

 - bch2_path_put() needs to pay attention to path->should_be_locked, to
   ensure we don't drop locks we're supposed to be keeping.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

9a74f63c

bcachefs: Fix upgrade_readers() · f527afea

Kent Overstreet authored 3 years ago

The bch2_btree_path_upgrade() call was failing and tripping an assert -
path->level + 1 is in this case not necessarily exactly what we want,
fix it by upgrading exactly the locks we want.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

f527afea

bcachefs: More general fix for transaction paths overflow · d1211725

Kent Overstreet authored 3 years ago

for_each_btree_key() now calls bch2_trans_begin() as needed; that means
we can also call it when we're in danger of overflowing transaction
paths.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

d1211725

bcachefs: Must check for errors from bch2_trans_cond_resched() · b0d1b70a

Kent Overstreet authored 3 years ago

But we don't need to call it from outside the btree iterator code
anymore, since it's called by bch2_trans_begin() and
bch2_btree_path_traverse().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

b0d1b70a

bcachefs: Fix restart handling in for_each_btree_key() · e5fa91d7

Kent Overstreet authored 3 years ago

Code that uses for_each_btree_key often wants transaction restarts to be
handled locally and not returned. Originally, we wouldn't return
transaction restarts if there was a single iterator in the transaction -
the reasoning being if there weren't other iterators being invalidated,
and the current iterator was being advanced/retraversed, there weren't
any locks or iterators we were required to preserve.

But with the btree_path conversion that approach doesn't work anymore -
even when we're using for_each_btree_key() with a single iterator there
will still be two paths in the transaction, since we now always preserve
the path at the pos the iterator was initialized at - the reason being
that on restart we often restart from the same place.

And it turns out there's now a lot of for_each_btree_key() uses that _do
not_ want transaction restarts handled locally, and should be returning
them.

This patch splits out for_each_btree_key_norestart() and
for_each_btree_key_continue_norestart(), and converts existing users as
appropriate. for_each_btree_key(), for_each_btree_key_continue(), and
for_each_btree_node() now handle transaction restarts themselves by
calling bch2_trans_begin() when necessary - and the old hack to not
return transaction restarts when there's a single path in the
transaction has been deleted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

e5fa91d7

bcachefs: bch2_trans_exit() no longer returns errors · 9a796fdb

Kent Overstreet authored 3 years ago

Now that peek_node()/next_node() are converted to return errors
directly, we don't need bch2_trans_exit() to return errors - it's
cleaner this way and wasn't used much anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

9a796fdb

bcachefs: for_each_btree_node() now returns errors directly · d355c6f4

Kent Overstreet authored 3 years ago

This changes for_each_btree_node() to work like for_each_btree_key(),
and to that end bch2_btree_iter_peek_node() and next_node() also return
error ptrs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

d355c6f4

bcachefs: BTREE_ITER_FILTER_SNAPSHOTS · c075ff70

Kent Overstreet authored 3 years ago

For snapshots, we need to implement btree lookups that return the first
key that's an ancestor of the snapshot ID the lookup is being done in -
and filter out keys in unrelated snapshots. This patch adds the btree
iterator flag BTREE_ITER_FILTER_SNAPSHOTS which does that filtering.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
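
The filtering rule can be sketched with a small parent table: a key is visible from a snapshot if the key's snapshot ID is that snapshot or one of its ancestors. The table contents, the struct, and both function names are hypothetical, and the sketch assumes the keys for a position are ordered so that a snapshot's own key comes before its ancestors' keys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot tree: parent of each snapshot ID, 0 means no parent. */
static const unsigned snapshot_parent[] = {
	[1] = 3,	/* snapshots 1 and 2 are children of 3 */
	[2] = 3,
	[3] = 0,
};

/* True if ancestor is id itself or reachable from id by following parents. */
static inline bool snapshot_is_ancestor(unsigned id, unsigned ancestor)
{
	while (id) {
		if (id == ancestor)
			return true;
		id = snapshot_parent[id];
	}
	return false;
}

struct snap_key {
	unsigned snapshot;
	int value;
};

/* Return the first key visible from snapshot id, skipping unrelated snapshots. */
static inline const struct snap_key *peek_filtered(const struct snap_key *keys,
						   size_t nr, unsigned id)
{
	for (size_t i = 0; i < nr; i++)
		if (snapshot_is_ancestor(id, keys[i].snapshot))
			return &keys[i];
	return NULL;
}
```

In this model, a lookup from snapshot 2 skips the key written in its sibling snapshot 1 and returns its own key, or the key inherited from snapshot 3 if it has none.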

c075ff70

bcachefs: Optimize btree lookups in write path · db92f2ea

Kent Overstreet authored 3 years ago

This patch significantly reduces the number of btree lookups required in
the extent update path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

db92f2ea

bcachefs: Tighten up btree locking invariants · 1d3ecd7e

Kent Overstreet authored 3 years ago

New rule is: if a btree path holds any locks it should be holding
precisely the locks wanted (according to path->level and
path->locks_want).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

1d3ecd7e

bcachefs: Add more assertions for locking btree iterators out of order · 068bcaa5

Kent Overstreet authored 3 years ago

btree_path_traverse_all() traverses btree iterators in sorted order, and
thus shouldn't see transaction restarts due to potential deadlocks - but
sometimes we do. This patch adds some more assertions and tracks some
more state to help track this down.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>

068bcaa5