Commits · 89fd25be70b4e3fc540d4cf591a02898470f1ef0 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Use x-macros for data types · 89fd25be
Kent Overstreet authored Jul 09, 2020
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
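
For readers unfamiliar with the technique named in this commit title, the following is a minimal sketch of the x-macro pattern in C. The list contents and every `EXAMPLE_`/`example_` name are assumptions made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical list of data types; each entry is x(name, numeric id). */
#define EXAMPLE_DATA_TYPES()	\
	x(none,    0)		\
	x(sb,      1)		\
	x(journal, 2)		\
	x(btree,   3)		\
	x(user,    4)

/* First expansion of the list: build the enum. */
enum example_data_type {
#define x(name, nr) EXAMPLE_DATA_##name = nr,
	EXAMPLE_DATA_TYPES()
#undef x
	EXAMPLE_DATA_NR
};

/* Second expansion of the same list: build a matching table of names. */
static const char * const example_data_type_names[] = {
#define x(name, nr) [nr] = #name,
	EXAMPLE_DATA_TYPES()
#undef x
};

/* The enum and the table cannot drift apart, since both come from one list. */
static_assert(sizeof(example_data_type_names) / sizeof(example_data_type_names[0])
	      == EXAMPLE_DATA_NR, "name table must cover every data type");

/* Look a type up by name; returns -1 if the name is unknown. */
static int example_data_type_from_str(const char *s)
{
	for (int i = 0; i < EXAMPLE_DATA_NR; i++)
		if (!strcmp(s, example_data_type_names[i]))
			return i;
	return -1;
}
```

Adding a new type then means adding a single `x(...)` line; the enum value and its printable name follow automatically.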

bcachefs: Fix short buffered writes · 912bdf17

Kent Overstreet authored Jul 09, 2020

In the buffered write path, we have to check for short writes that write
to the full page, where the page wasn't UpToDate; when this happens, the
page is partly garbage, so we have to zero it out and revert that part
of the write.

This check was wrong - we reverted total from copied, but didn't revert
the iov_iter, probably also leading to corrupted writes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
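
The fix described above can be modelled in userspace. This is only a sketch under stated assumptions: `example_iter` stands in for the kernel's `iov_iter`, the page is 16 bytes, and `fault_after` simulates a copy that stops early. None of these names come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 16u	/* tiny page so the example is easy to follow */

struct example_iter {		/* hypothetical stand-in for iov_iter */
	const char *buf;
	size_t pos, len;
};

struct example_page {
	char data[EXAMPLE_PAGE_SIZE];
	bool uptodate;
};

/* Copy up to `want` bytes; `fault_after` cuts the copy short. */
static size_t example_copy(struct example_page *pg, struct example_iter *it,
			   size_t want, size_t fault_after)
{
	size_t n = want < fault_after ? want : fault_after;

	assert(n <= EXAMPLE_PAGE_SIZE && n <= it->len - it->pos);
	memcpy(pg->data, it->buf + it->pos, n);
	it->pos += n;
	return n;
}

/* Returns the number of bytes that count as written to this page. */
static size_t example_write_page(struct example_page *pg, struct example_iter *it,
				 size_t fault_after)
{
	size_t copied = example_copy(pg, it, EXAMPLE_PAGE_SIZE, fault_after);

	if (copied < EXAMPLE_PAGE_SIZE && !pg->uptodate) {
		/* Page is part new data, part garbage: discard this write. */
		memset(pg->data, 0, sizeof(pg->data));
		it->pos -= copied;	/* revert the iterator, not just the count */
		copied = 0;
	}
	return copied;
}
```

The point of the sketch is the pairing: whenever the byte count is rolled back, the iterator position is rolled back with it, so a retry starts from the same source offset.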

bcachefs: Allow existing stripes to be updated with new data buckets · 0ba95acc

Kent Overstreet authored Jun 30, 2020

This solves internal fragmentation within stripes. We already have
copygc, which evacuates buckets that are partially or mostly empty, but
it's up to the ec code that manages stripes to deal with stripes that
have empty buckets in them.

This patch changes the path for creating new stripes to check whether
there are existing stripes with empty buckets and, if so, to update them
with new data buckets instead of creating new stripes.

TODO: improve the disk space accounting so that we only use this (more
expensive) path when we have too much fragmentation in existing stripes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
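
A rough sketch of the selection step described above, assuming a flat array of stripes and a "most empty buckets first" policy. Both the data layout and the policy are assumptions for illustration, not the patch's actual logic.

```c
#include <assert.h>
#include <stddef.h>

struct example_stripe {
	unsigned nr_blocks;	/* data buckets in the stripe */
	unsigned nr_empty;	/* data buckets that no longer hold live data */
};

/* Return the index of an existing stripe worth refilling, or -1 if none. */
static int example_find_reusable_stripe(const struct example_stripe *s, size_t nr)
{
	int best = -1;

	for (size_t i = 0; i < nr; i++) {
		assert(s[i].nr_empty <= s[i].nr_blocks);
		if (s[i].nr_empty &&
		    (best < 0 || s[i].nr_empty > s[best].nr_empty))
			best = (int) i;
	}
	return best;
}
```

A caller would fall back to creating a new stripe only when this returns -1.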

bcachefs: Refactor stripe creation · f6b94a3b

Kent Overstreet authored Jul 06, 2020

Prep work for the patch to update existing stripes with new data blocks.
This moves allocating new stripes into ec.c, and also sets up the data
structures so that we can handle allocating only some of the blocks in a
stripe.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Move stripe creation to workqueue · 703e2a43

Kent Overstreet authored Jul 06, 2020

This is mainly to solve a lock ordering issue, and also simplifies the
code a bit.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve stripe triggers/heap code · ba6dd1dd

Kent Overstreet authored Jul 06, 2020

Soon we'll be able to modify existing stripes - replacing empty blocks
with new blocks and new p/q blocks. This patch updates the trigger code
to handle pointers changing in an existing stripe; also, it
significantly improves how the stripes heap works, which means we can
get rid of the stripe creation/deletion lock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Rework triggers interface · e63534a2

Kent Overstreet authored Jul 06, 2020

The trigger for stripe keys is shortly going to need both the old and
the new key passed to the trigger - this patch does that rework.

For now, this just changes the in memory triggers, and this doesn't
change how extent triggers work.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
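
To illustrate why a trigger benefits from seeing both keys, here is a minimal sketch of an accounting trigger that takes the old and the new key in one call. The struct layouts and function name are hypothetical and far simpler than the real interface.

```c
#include <assert.h>
#include <stddef.h>

struct example_key {
	unsigned long sectors;	/* space the key accounts for */
};

struct example_usage {
	long sectors;
};

/* Either key may be NULL: no old key on insert, no new key on delete. */
static void example_trigger(struct example_usage *u,
			    const struct example_key *old_k,
			    const struct example_key *new_k)
{
	assert(old_k || new_k);
	if (old_k)
		u->sectors -= (long) old_k->sectors;
	if (new_k)
		u->sectors += (long) new_k->sectors;
}
```

With both keys available, an overwrite is handled as a single transition rather than as a separate delete followed by an insert.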

bcachefs: Kill BTREE_TRIGGER_NOOVERWRITES · 697e45b2

Kent Overstreet authored Jul 06, 2020

This is prep work for reworking the triggers machinery - we have
triggers that need to know both the old and the new key.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Mark btree nodes as needing rewrite when not all replicas are RW · fff899b1

Kent Overstreet authored Jul 03, 2020

This fixes a bug where recovery fails when one of the devices is read
only.

Also - consolidate the "must rewrite this node to insert it" behind a
new btree node flag.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use blk_status_to_str() · 306d40df

Kent Overstreet authored Jul 02, 2020

Improved error messages are always a good thing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't cap ios in dio write path at 2 MB · 52fbb7c8

Kent Overstreet authored Jun 30, 2020

It appears this was erroneous; a different bug was responsible.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Refactor dio write code to reinit bch_write_op · 042a1f26

Kent Overstreet authored Jun 29, 2020

This fixes a bug where the BCH_WRITE_SKIP_CLOSURE_PUT flag was set
incorrectly, causing the completion to be delivered multiple times.
Oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix bch2_extent_can_insert() not being called · 64f2a880

Kent Overstreet authored Jun 28, 2020

It's supposed to check whether we're splitting a compressed extent and
if so get a bigger disk reservation - hence this fixes a "disk usage
increased by x without a reservation" bug.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a null ptr deref in bch2_btree_iter_traverse_one() · c61b7e21

Kent Overstreet authored Jun 26, 2020

We use sentinel values that aren't NULL to indicate there's a btree node
at a higher level; occasionally, this may result in
btree_iter_up_until_good_node() stopping at one of those sentinel
values.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
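
The hazard with non-NULL sentinels is that an ordinary NULL check lets them through. A small sketch, with hypothetical sentinel names and values chosen only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct example_node { int level; };

/* Hypothetical non-NULL sentinels stored where a node pointer would go. */
#define EXAMPLE_NO_NODE_UP	((struct example_node *) (uintptr_t) 1)
#define EXAMPLE_NO_NODE_INIT	((struct example_node *) (uintptr_t) 2)
#define EXAMPLE_SENTINEL_MAX	((uintptr_t) 2)

static_assert(EXAMPLE_SENTINEL_MAX < 4096, "sentinels must stay below any real address");

/* True only for pointers that may safely be dereferenced. */
static bool example_is_real_node(const struct example_node *n)
{
	return (uintptr_t) n > EXAMPLE_SENTINEL_MAX;
}

static int example_node_level(const struct example_node *n)
{
	/* A plain `if (n)` would pass a sentinel through and fault on n->level. */
	return example_is_real_node(n) ? n->level : -1;
}
```

Any code path that can stop at a slot holding a sentinel needs the stricter check before touching the node.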

bcachefs: Track sectors of erasure coded data · 649a9b68

Kent Overstreet authored Jun 18, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use btree reserve when appropriate · 937f5036

Kent Overstreet authored Jun 18, 2020

Whenever we're doing an update that has pointers, that generally means
we need to do the update in order to release open bucket references - so
we should be using the btree open bucket reserve.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add a kthread_should_stop() check to allocator thread · eff508b4

Kent Overstreet authored Jun 17, 2020

Turns out it's possible during shutdown for the allocator to get stuck
spinning on bch2_invalidate_buckets() without hitting any of the other
checks.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
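
The shape of the problem can be shown with a userspace model. In the sketch below an atomic flag stands in for `kthread_should_stop()`, and `example_invalidate_one()` is a hypothetical unit of work that never makes progress during shutdown; neither is the kernel's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool example_stop_requested;	/* stand-in for kthread_should_stop() */

static bool example_should_stop(void)
{
	return atomic_load(&example_stop_requested);
}

/* Hypothetical work step that reports "nothing done" during shutdown. */
static bool example_invalidate_one(void)
{
	return false;
}

/* Returns the number of iterations run before the loop exits. */
static unsigned long example_allocator_loop(unsigned long stop_after)
{
	unsigned long iters = 0;

	assert(stop_after > 0);
	while (!example_should_stop()) {	/* without this check the loop spins forever */
		if (example_invalidate_one())
			break;
		if (++iters == stop_after)	/* simulate a shutdown request arriving */
			atomic_store(&example_stop_requested, true);
	}
	return iters;
}
```

Because the work step never succeeds, the stop check in the loop condition is the only exit.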

bcachefs: Change bch2_dump_bset() to also print key values · a34782a0

Kent Overstreet authored Jun 17, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a deadlock in the RO path · b9c3d139

Kent Overstreet authored Jun 17, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix incorrect gfp check · 47a5649a

Kent Overstreet authored Jun 15, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix lock ordering with new btree cache code · d211b408

Kent Overstreet authored Jun 15, 2020

The code that checks lock ordering was recently changed to go off of the
pos of the btree node, rather than the iterator, but the btree cache
code wasn't updated to handle iterators that point to cached bkeys. Oops.

Also, update various debug code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: delete a slightly faulty assertion · 1d186789

Kent Overstreet authored Jun 15, 2020

The state lock isn't held at startup.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Increase size of btree node reserve · 7dd1ebfa

Kent Overstreet authored Jun 15, 2020

Also tweak the allocator to be more aggressive about keeping it full.
The recent changes to make updates to interior nodes transactional (and
thus generate updates to the alloc btree) all put more stress on the
btree node reserves.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Give bkey_cached_key same attributes as bpos · e27b03b3

Kent Overstreet authored Jun 15, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use cached iterators for alloc btree · 5d20ba48

Kent Overstreet authored Oct 05, 2019

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Btree key cache · 2ca88e5a

Kent Overstreet authored Mar 07, 2019

This introduces a new kind of btree iterator, cached iterators, which
point to keys cached in a hash table. The cache also acts as a write
cache - in the update path, we journal the update but defer updating the
btree until the cached entry is flushed by journal reclaim.

Cache coherency is for now up to the users to handle, which isn't ideal
but should be good enough for now.

These new iterators will be used for updating inodes and alloc info (the
alloc and stripes btrees).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
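
The write-cache behaviour described above (journal now, update the btree later) can be sketched as follows. This is a toy model: the direct-mapped table, the integer "btree" array and all names are assumptions, and the real cache is a proper hash table with locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_SLOTS	8u
#define EXAMPLE_KEYS	64u

struct example_cached { unsigned key; int val; bool valid, dirty; };

static struct example_cached example_cache[EXAMPLE_SLOTS];	/* direct-mapped table */
static int example_btree[EXAMPLE_KEYS];			/* stand-in for the btree */
static unsigned example_journal_entries;

/* Write one cached entry back to the (simulated) btree. */
static void example_flush(struct example_cached *c)
{
	if (c->valid && c->dirty) {
		example_btree[c->key] = c->val;	/* the deferred btree update */
		c->dirty = false;
	}
}

/* Update path: journal the change and touch only the cached copy. */
static void example_update(unsigned key, int val)
{
	struct example_cached *c = &example_cache[key % EXAMPLE_SLOTS];

	assert(key < EXAMPLE_KEYS);
	if (c->valid && c->key != key)
		example_flush(c);		/* evict the previous occupant first */
	*c = (struct example_cached) { .key = key, .val = val,
				       .valid = true, .dirty = true };
	example_journal_entries++;
}

/* Called by (simulated) journal reclaim. */
static void example_flush_all(void)
{
	for (size_t i = 0; i < EXAMPLE_SLOTS; i++)
		example_flush(&example_cache[i]);
}
```

Until the flush runs, a reader that bypasses the cache sees the stale btree value, which is the coherency burden the commit message leaves to users of the cache.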

bcachefs: Implement a new gc that only recalcs oldest gen · 451570a5

Kent Overstreet authored Jun 15, 2020

Full mark and sweep gc doesn't (yet?) work with the new btree key cache
code, but it also blocks updates to interior btree nodes for the
duration and isn't really necessary in practice; we aren't currently
attempting to repair errors in allocation info at runtime.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Turn c->state_lock into an rwsem · 1ada1606

Kent Overstreet authored Jun 15, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add an internal option for reading entire journal · 7fffc85b

Kent Overstreet authored Jun 13, 2020

To be used by the debug tool that dumps the contents of the journal.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't deadlock when btree node reuse changes lock ordering · bd2bb273

Kent Overstreet authored Jun 12, 2020

Btree node lock ordering is based on the logical key. However, 'struct
btree' may be reused for a different btree node under memory pressure.
This patch uses the new six lock callback to check if a btree node is no
longer the node we wanted to lock before blocking.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
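
A simplified model of the "check before blocking" idea: the lock routine takes a callback and consults it before it would sleep. The types and names below are hypothetical and do not reproduce the six lock API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct example_node {
	unsigned key;	/* which btree node this struct currently represents */
	bool locked;	/* held by someone else */
};

enum example_lock_ret { EXAMPLE_LOCKED, EXAMPLE_WOULD_BLOCK, EXAMPLE_ABORTED };

/* Asked just before blocking: is this still the node we wanted? */
typedef bool (*example_should_sleep_fn)(const struct example_node *n, void *arg);

static enum example_lock_ret example_lock(struct example_node *n,
					  example_should_sleep_fn should_sleep,
					  void *arg)
{
	assert(n != NULL && should_sleep != NULL);
	if (!n->locked) {
		n->locked = true;
		return EXAMPLE_LOCKED;
	}
	if (!should_sleep(n, arg))
		return EXAMPLE_ABORTED;	/* struct was reused: do not wait on it */
	return EXAMPLE_WOULD_BLOCK;	/* a real lock would sleep here */
}

/* Example callback: compare the node's current key with the one we expect. */
static bool example_still_wanted(const struct example_node *n, void *arg)
{
	return n->key == *(const unsigned *) arg;
}
```

Aborting instead of sleeping matters because waiting on a reused struct could mean waiting on a node that sorts differently in the lock order.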

bcachefs: Fix a deadlock · 515282ac

Kent Overstreet authored Jun 12, 2020

__bch2_btree_node_lock() was incorrectly using iter->pos as a proxy for
btree node lock ordering; this caused an off-by-one error that was
triggered by bch2_btree_node_get_sibling() getting the previous node.

This refactors the code to compare against btree node keys directly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Refactor btree insert path · 4e8224ed

Kent Overstreet authored Jun 09, 2020

This splits out the journalling code from the btree update code; prep
work for the btree key cache.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Always give out journal pre-res if we already have one · 4efe71a6

Kent Overstreet authored Jun 09, 2020

This is better than skipping the journal pre-reservation if we already
have one - we should still account for the journal reservation we're
going to have to get.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: More open buckets · 374153c2

Kent Overstreet authored Jun 09, 2020

We need a larger open bucket reserve now that the btree interior update
path holds onto open bucket references; filesystems with many
high-throughput devices may need more open buckets now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't allocate memory under the btree cache lock · e38821f3

Kent Overstreet authored Jun 09, 2020

The btree cache lock is needed for reclaiming from the btree node cache,
and memory allocation can potentially spin and sleep (for 100 ms at a
time), so.. don't do that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a linked list bug · 966885ee

Kent Overstreet authored Jun 09, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Make open bucket reserves more conservative · 6b5f9b29

Kent Overstreet authored Jun 09, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: btree_update_nodes_written() requires alloc reserve · 40ca39b5

Kent Overstreet authored Jun 09, 2020

Also, in the btree_update_start() path, if we already have a journal
pre-reservation we don't want to take another - that's a deadlock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Check gfp_flags correctly in bch2_btree_cache_scan() · 8c9eef95

Kent Overstreet authored Jun 05, 2020

bch2_btree_node_mem_alloc() uses memalloc_nofs_save()/GFP_NOFS, but
GFP_NOFS does include __GFP_IO - oops. We used to use GFP_NOIO, but as
we're a filesystem now, GFP_NOFS makes more sense and is looser.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
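
The flag relationship behind this fix can be checked with a small model. The bit values below are illustrative only and are not the kernel's; what is kept is the composition, where the no-FS mask still contains the I/O bit. The two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's actual values differ. */
#define EX_GFP_RECLAIM	0x1u
#define EX_GFP_IO	0x2u
#define EX_GFP_FS	0x4u

#define EX_GFP_NOIO	(EX_GFP_RECLAIM)
#define EX_GFP_NOFS	(EX_GFP_RECLAIM | EX_GFP_IO)
#define EX_GFP_KERNEL	(EX_GFP_RECLAIM | EX_GFP_IO | EX_GFP_FS)

static_assert((EX_GFP_NOFS & EX_GFP_IO) != 0, "the no-FS mask still allows I/O");

/* Old-style check: testing the I/O bit does not detect a NOFS context. */
static bool example_may_reenter_fs_old(unsigned gfp)
{
	return (gfp & EX_GFP_IO) != 0;
}

/* Corrected check: test the FS bit, which NOFS allocations clear. */
static bool example_may_reenter_fs_new(unsigned gfp)
{
	return (gfp & EX_GFP_FS) != 0;
}
```

With a NOFS mask the old check answers "yes" while the corrected one answers "no", which is the mismatch the commit message calls out.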

bcachefs: Call bch2_btree_iter_traverse() if necessary in commit path · 8804ef1f
Kent Overstreet authored Jun 08, 2020
```
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```