- 22 Oct, 2023 40 commits
-
Kent Overstreet authored
With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the path where the key was found, and the path for inserting into the current snapshot. This adds a new field to struct btree_iter for saving a path for the current snapshot, and plumbs it through bch2_trans_update(). Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This splits bch2_btree_iter() up into two functions: an inner function that handles BTREE_ITER_WITH_JOURNAL, BTREE_ITER_WITH_UPDATES, and iterating across leaf nodes, and an outer one that implements BTREE_ITER_FILTER_SNAPSHOTS. This is prep work for remembering a btree_path at our update position in BTREE_ITER_FILTER_SNAPSHOTS mode. Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Symbol decoding, via %ps, isn't supported in userspace - this will also be faster when we're using trans->fn in the fast path, as with the new BCH_JSET_ENTRY_log journal messages. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This adds a flag to not mark the initial btree_path as preserve, for paths that we expect to be cheap to reconstitute if necessary - this solves a btree_path overflow caused by need_whiteout_for_snapshot(). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Reading from cached data, which calls bch2_bucket_io_time_reset(), is leading to transaction iterator overflows - this standardizes the workaround. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This adds a new assertion to be used by bch2_inode_update_after_write(), which updates the VFS inode based on the update to the btree inode we just did - we require that the btree inode still be locked when we do that update. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
- We should only be clearing should_be_locked in btree_path_set_pos() - it's the responsibility of the btree_path code, not the btree_iter code. - bch2_path_put() needs to pay attention to path->should_be_locked, to ensure we don't drop locks we're supposed to be keeping. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
The bch2_btree_path_upgrade() call was failing and tripping an assert - path->level + 1 is in this case not necessarily exactly what we want; fix it by upgrading exactly the locks we want. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
for_each_btree_key() now calls bch2_trans_begin() as needed; that means we can also call it when we're in danger of overflowing transaction paths. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
But we don't need to call it from outside the btree iterator code anymore, since it's called by bch2_trans_begin() and bch2_btree_path_traverse(). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Code that uses for_each_btree_key often wants transaction restarts to be handled locally and not returned. Originally, we wouldn't return transaction restarts if there was a single iterator in the transaction - the reasoning being if there weren't other iterators being invalidated, and the current iterator was being advanced/retraversed, there weren't any locks or iterators we were required to preserve. But with the btree_path conversion that approach doesn't work anymore - even when we're using for_each_btree_key() with a single iterator there will still be two paths in the transaction, since we now always preserve the path at the pos the iterator was initialized at - the reason being that on restart we often restart from the same place. And it turns out there are now a lot of for_each_btree_key() uses that _do not_ want transaction restarts handled locally, and should be returning them. This patch splits out for_each_btree_key_norestart() and for_each_btree_key_continue_norestart(), and converts existing users as appropriate. for_each_btree_key(), for_each_btree_key_continue(), and for_each_btree_node() now handle transaction restarts themselves by calling bch2_trans_begin() when necessary - and the old hack to not return transaction restarts when there's a single path in the transaction has been deleted. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
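The difference between the two loop flavours described in this commit can be illustrated with a small userspace sketch. Everything here (struct toy_trans, toy_peek(), TOY_RESTART and the two toy_sum_* helpers) is hypothetical and only models the control flow; it is not the bcachefs implementation, and the real macros and error codes differ.

```c
#include <errno.h>

/* Hypothetical stand-in for a btree transaction; not the bcachefs type. */
struct toy_trans {
	int restarts_pending;	/* how many upcoming lookups will "restart" */
	int begin_calls;	/* number of toy_trans_begin() calls so far */
};

#define TOY_RESTART (-EINTR)	/* assumed restart code, for illustration only */

static void toy_trans_begin(struct toy_trans *trans)
{
	trans->begin_calls++;
}

/* Look up keys[pos]; may fail with TOY_RESTART, or -ENOENT past the end. */
static int toy_peek(struct toy_trans *trans, const int *keys, int nr,
		    int pos, int *out)
{
	if (pos >= nr)
		return -ENOENT;
	if (trans->restarts_pending > 0) {
		trans->restarts_pending--;
		return TOY_RESTART;
	}
	*out = keys[pos];
	return 0;
}

/* "norestart" flavour: the first restart is returned to the caller. */
static int toy_sum_norestart(struct toy_trans *trans, const int *keys,
			     int nr, long *sum)
{
	*sum = 0;
	for (int pos = 0; pos < nr; pos++) {
		int v = 0;
		int ret = toy_peek(trans, keys, nr, pos, &v);

		if (ret)
			return ret;
		*sum += v;
	}
	return 0;
}

/* Restart-handling flavour: begin again and retry the same position. */
static int toy_sum_handle_restarts(struct toy_trans *trans, const int *keys,
				   int nr, long *sum)
{
	*sum = 0;
	for (int pos = 0; pos < nr; ) {
		int v = 0;
		int ret = toy_peek(trans, keys, nr, pos, &v);

		if (ret == TOY_RESTART) {
			toy_trans_begin(trans);
			continue;	/* retry without advancing */
		}
		if (ret)
			return ret;
		*sum += v;
		pos++;
	}
	return 0;
}
```

In this model, a caller that is part of a larger transaction would use the norestart flavour so the restart propagates upward, while a standalone scan can use the restart-handling flavour.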
-
Kent Overstreet authored
Now that peek_node()/next_node() are converted to return errors directly, we don't need bch2_trans_exit() to return errors - it's cleaner this way and wasn't used much anymore. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This changes for_each_btree_node() to work like for_each_btree_key(), and to that end bch2_btree_iter_peek_node() and next_node() also return error ptrs. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
For snapshots, we need to implement btree lookups that return the first key that's an ancestor of the snapshot ID the lookup is being done in - and filter out keys in unrelated snapshots. This patch adds the btree iterator flag BTREE_ITER_FILTER_SNAPSHOTS which does that filtering. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
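A rough sketch of the filtering rule described above, assuming a snapshot tree stored as a parent array and a flat, already sorted list of keys. The names toy_key, toy_is_ancestor() and toy_peek_filtered() are made up for illustration, and the ID layout here is an assumption that need not match how bcachefs numbers snapshots.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_PARENT 0u	/* ID 0 is reserved to mean "no parent" */

/* Hypothetical key: a position plus the snapshot it was written in. */
struct toy_key {
	unsigned long	offset;
	unsigned	snapshot;
};

/*
 * parent[id] is the parent snapshot of id, or TOY_NO_PARENT for a root.
 * A snapshot counts as its own ancestor. The parent array must be acyclic.
 */
static bool toy_is_ancestor(const unsigned *parent, unsigned id,
			    unsigned ancestor)
{
	while (id != TOY_NO_PARENT) {
		if (id == ancestor)
			return true;
		id = parent[id];
	}
	return false;
}

/*
 * Return the first key whose snapshot is an ancestor of the snapshot the
 * lookup is done in, skipping keys from unrelated snapshots.
 */
static const struct toy_key *toy_peek_filtered(const struct toy_key *keys,
					       size_t nr,
					       const unsigned *parent,
					       unsigned snapshot)
{
	for (size_t i = 0; i < nr; i++)
		if (toy_is_ancestor(parent, snapshot, keys[i].snapshot))
			return &keys[i];
	return NULL;
}
```

With snapshots 2 and 3 both children of snapshot 1, a lookup in snapshot 2 skips a key written in snapshot 3 but still sees a key written in snapshot 1.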
-
Kent Overstreet authored
This patch significantly reduces the number of btree lookups required in the extent update path. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
New rule is: if a btree path holds any locks it should be holding precisely the locks wanted (according to path->level and path->locks_want). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
btree_path_traverse_all() traverses btree iterators in sorted order, and thus shouldn't see transaction restarts due to potential deadlocks - but sometimes we do. This patch adds some more assertions and tracks some more state to help track this down. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This splits btree_iter into two components: btree_iter is now the externally visible component, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
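The sharing scheme described above is essentially copy-on-write over a reference-counted object. A minimal userspace sketch, assuming a path that carries only a position; toy_path, toy_iter and the helper functions are hypothetical names, not the bcachefs structures or API.

```c
#include <stdlib.h>

/* Hypothetical shared state: reference counted, possibly used by many iters. */
struct toy_path {
	unsigned	ref;
	unsigned long	pos;
};

/* Hypothetical externally visible handle: just points at a path. */
struct toy_iter {
	struct toy_path	*path;
};

static struct toy_path *toy_path_alloc(unsigned long pos)
{
	struct toy_path *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->ref = 1;
	p->pos = pos;
	return p;
}

static void toy_path_put(struct toy_path *p)
{
	if (p && !--p->ref)
		free(p);
}

/* Copying an iterator only takes another reference; nothing is cloned. */
static void toy_iter_copy(struct toy_iter *dst, const struct toy_iter *src)
{
	dst->path = src->path;
	dst->path->ref++;
}

/*
 * Moving an iterator clones the path only when it is shared, so other
 * iterators keep seeing the old position. Returns 0, or -1 if the clone
 * could not be allocated.
 */
static int toy_iter_set_pos(struct toy_iter *iter, unsigned long pos)
{
	if (iter->path->ref > 1) {
		struct toy_path *clone = toy_path_alloc(pos);

		if (!clone)
			return -1;
		iter->path->ref--;	/* drop our reference on the shared path */
		iter->path = clone;
	} else {
		iter->path->pos = pos;	/* sole owner: mutate in place */
	}
	return 0;
}
```

The point of the design is that the clone cost is paid only at the moment of mutation, and only when another iterator still references the same path.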
-
Kent Overstreet authored
This was used for an optimization that hasn't existed in quite a while - iter->uptodate will probably be going away as well. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
These utility functions are for managing btree node state within a btree_trans - rename them for consistency, and drop some unneeded arguments. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This is prep work for splitting btree_path out from btree_iter - btree_path will not have a pointer to btree_trans. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Disfavoured, and should go away. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This factors out bch2_dump_trans_iters_updates() from the iter alloc overflow path, and makes some small improvements to what it prints. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This will be used to make other operations on btree iterators within a transaction more efficient, and enable some other improvements to how we manage btree iterators. Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
It's now the caller's responsibility to call bch2_trans_begin. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Start tracking when btree transactions have been restarted - and assert that we're always calling bch2_trans_begin() immediately after transaction restart. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
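One way to picture the tracking described here is a flag that is set on restart and cleared only by beginning the transaction again. The sketch below uses hypothetical names (toy_restart_state and its helpers) and returns a bool where the kernel code would assert; it is not the actual bcachefs bookkeeping.

```c
#include <stdbool.h>

/* Hypothetical restart bookkeeping for one transaction. */
struct toy_restart_state {
	bool		restarted;	/* set on restart, cleared on begin */
	unsigned	restart_count;	/* total restarts seen so far */
};

/* Record that the transaction was restarted. */
static void toy_note_restart(struct toy_restart_state *s)
{
	s->restarted = true;
	s->restart_count++;
}

/* Beginning the transaction again is the only way to clear the flag. */
static void toy_begin(struct toy_restart_state *s)
{
	s->restarted = false;
}

/* New btree work is legal only if begin was called after the last restart. */
static bool toy_may_proceed(const struct toy_restart_state *s)
{
	return !s->restarted;
}
```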
-
Kent Overstreet authored
Btree node merging now happens prior to transaction commit, not after, so we don't need to pay attention to BTREE_INSERT_NOUNLOCK. Also, foreground_maybe_merge shouldn't be calling bch2_btree_iter_traverse_all() - this is becoming private to the btree iterator code and should only be called by bch2_trans_begin(). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This adds a new helper for btree_cache.c that does what we want where the iterator is still being traversed - and also eliminates some unnecessary transaction restarts. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This closes a significant hole (and last known hole) in our ability to verify metadata. Previously, since btree nodes are log structured, we couldn't detect lost btree writes that weren't the first write to a given node. Additionally, this seems to have led to some significant metadata corruption on multi-device filesystems with metadata replication: since a write may have made it to one device and not another, if we read that btree node back from the replica that did have that write and started appending after that point, the other replica would have a gap in the bset entries and reading from that replica wouldn't find the rest of the bsets. But, since updates to interior btree nodes are now journalled, we can close this hole by updating pointers to btree nodes after every write with the currently written number of sectors, without negatively affecting performance. This means we will always detect lost or corrupt metadata - it also means that our btree is now a curious hybrid of COW and non COW btrees, with all the benefits of both (excluding complexity). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
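The detection idea can be reduced to comparing what the parent's pointer says was written against what a replica actually returns. A deliberately simplified sketch, with hypothetical structures (toy_node_ptr, toy_replica) standing in for the on-disk pointer and the result of reading a node; the real check involves bset validation that is not modelled here.

```c
#include <stdbool.h>

/* Hypothetical pointer to a btree node, as recorded in its parent. */
struct toy_node_ptr {
	unsigned sectors_written;	/* updated after every write to the node */
};

/* Hypothetical result of reading the node back from one replica. */
struct toy_replica {
	unsigned sectors_found;		/* sectors of valid data found on read */
};

/*
 * A replica that holds fewer sectors than the pointer records must have
 * lost at least one write, even if that write was not the first one.
 */
static bool toy_replica_is_complete(const struct toy_node_ptr *ptr,
				    const struct toy_replica *replica)
{
	return replica->sectors_found >= ptr->sectors_written;
}
```

In the multi-device case from the commit message, the replica that missed a write would fail this comparison instead of silently presenting a truncated node.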
-
Kent Overstreet authored
We weren't correctly verifying that we had interior node intent locks - this patch also fixes bugs uncovered by the new assertions. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Dan Robertson authored
Add basic kernel docs for bch2_trans_reset and bch2_trans_begin. Signed-off-by:
Dan Robertson <dan@dlrobertson.com> Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Adding iter->should_be_locked introduced a regression where it ended up not being set on the iterator passed to bch2_btree_update_start(), which is definitely not what we want. This patch requires it to be set when calling bch2_trans_update(), and adds various fixups to make that happen. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
It's now been rolled into bch2_btree_iter_peek_slot(). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This drops bch2_btree_iter_peek_with_updates() and replaces it with a new flag, BTREE_ITER_WITH_UPDATES, and also reworks bch2_btree_iter_peek_slot() to respect it too. Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This adds the ability for btree iterators to own child iterators - to be used by an upcoming rework of bch2_btree_iter_peek_slot(), so we can scan forwards while maintaining our current position. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
Add a field to struct btree_iter for tracking whether it should be locked - this fixes spurious transaction restarts in bch2_trans_relock(). Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com>
-
Kent Overstreet authored
By changing it to upgrade iterators to intent locks to avoid lock restarts we can simplify __bch2_btree_node_lock() quite a bit - this fixes a probable bug where it could potentially drop a lock on an unrelated error but still succeed instead of causing a transaction restart. Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
gcc is emitting rep stos here, which is silly (and slow) for an 8 byte memset. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by:
Kent Overstreet <kent.overstreet@linux.dev>
-