Commits · ae1ede5893bd9b46f40cc9d1148321206369a9f2 · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: Don't embed btree iters in btree_trans · ae1ede58

Kent Overstreet authored Nov 02, 2020

These haven't been in use since reallocating iterators was disabled,
and getting rid of them saves us a lot of stack.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Split out debug_check_btree_accounting · 692d4031

Kent Overstreet authored Nov 02, 2020

This check is very expensive
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Drop sysfs interface to debug parameters · 29364f34

Kent Overstreet authored Nov 02, 2020

It's not used much anymore; the module parameter interface is better.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Minor journal reclaim improvement · 2f33ece9

Kent Overstreet authored Nov 02, 2020

With the btree key cache code, journal reclaim now has a lot more work
to do. It could be the case that after journal reclaim has finished one
iteration there's already more work to do, so put it in a loop to check
for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
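
A minimal userspace sketch of the "loop until there is no more work" idea described in this message. It is not the bcachefs code: `toy_journal`, `toy_reclaim_once()` and `toy_reclaim()` are hypothetical names, and the way new work arrives during a pass is simulated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the journal: just counters of queued work. */
struct toy_journal {
	size_t pending;	/* work queued right now */
	size_t refill;	/* work that shows up while a pass is running */
};

/* One reclaim pass: flush what is pending; more work may arrive meanwhile. */
static size_t toy_reclaim_once(struct toy_journal *j)
{
	size_t flushed = j->pending;

	j->pending = j->refill;	/* simulated: work queued during this pass */
	j->refill /= 2;
	return flushed;
}

/* Keep running passes until one finishes with nothing left to do. */
static unsigned toy_reclaim(struct toy_journal *j)
{
	unsigned passes = 0;

	assert(j);
	while (j->pending) {
		toy_reclaim_once(j);
		passes++;
	}
	return passes;
}
```

Starting from 8 pending items with 4 arriving during the first pass, a single pass would return with work still queued; the loop keeps going until `pending` reaches zero.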

bcachefs: Inode create optimization · 45e4dcba

Kent Overstreet authored Oct 27, 2020

On workloads that do a lot of multithreaded creates all at once, lock
contention on the inodes btree turns out to still be an issue.

This patch adds a small buffer of inode numbers that are known to be
free, so that we can avoid touching the btree on every create. Also,
this changes inode creates to update via the btree key cache for the
initial create.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
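
A rough sketch of the "small buffer of known-free inode numbers" idea, under loud assumptions: the inodes btree is replaced by a plain `used[]` array, there is no locking, and every name (`toy_fs`, `toy_refill()`, `toy_inode_create()`) is made up for illustration. The point is only that the expensive scan runs once per batch rather than once per create.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_INODES	64
#define TOY_HINT_BUF	8

struct toy_fs {
	bool	 used[TOY_NR_INODES];	 /* stands in for the inodes btree */
	uint64_t free_buf[TOY_HINT_BUF]; /* inode numbers known to be free */
	size_t	 nr_free;
	unsigned btree_scans;		 /* how often we touched the "btree" */
};

/* Expensive path: scan for free inode numbers and reserve a batch. */
static void toy_refill(struct toy_fs *fs)
{
	fs->btree_scans++;
	for (uint64_t i = 1; i < TOY_NR_INODES && fs->nr_free < TOY_HINT_BUF; i++)
		if (!fs->used[i]) {
			fs->used[i] = true;	/* reserve it for the buffer */
			fs->free_buf[fs->nr_free++] = i;
		}
}

/* Cheap path: hand out a buffered number; returns 0 if none are left. */
static uint64_t toy_inode_create(struct toy_fs *fs)
{
	assert(fs->nr_free <= TOY_HINT_BUF);
	if (!fs->nr_free)
		toy_refill(fs);
	if (!fs->nr_free)
		return 0;
	return fs->free_buf[--fs->nr_free];
}
```

With a buffer of 8, the first eight creates cost one scan in total; the ninth triggers the second.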

bcachefs: Improve check for when bios are physically contiguous · b16fa0ba

Kent Overstreet authored Oct 30, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix spurious transaction restarts · dcf141b9

Kent Overstreet authored Oct 28, 2020

The check for whether locking a btree node would deadlock was wrong - we
have to check that interior nodes are locked before descendants, but
this check was wrong when considering cached vs. non-cached iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improve tracing for transaction restarts · a301dc38

Kent Overstreet authored Oct 28, 2020

We have a bug where we can get stuck with a process spinning in
transaction restarts - need more information.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix stack corruption · 527087c7

Kent Overstreet authored Oct 27, 2020

A bkey_on_stack_realloc() call was in the wrong place, and broken for
indirect extents
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Use cached iterators for inode updates · 8cad3e2f

Kent Overstreet authored Sep 22, 2019

This switches inode updates to use cached btree iterators - which should
be a nice performance boost, since lock contention on the inodes btree
can be a bottleneck on multithreaded workloads.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fiemap fixes · e7b854b1

Kent Overstreet authored Oct 26, 2020

 - fiemap didn't know about inline extents, fixed
 - advancing to the next extent after we'd chased a pointer to the
   reflink btree was wrong, fixed
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix btree updates when mixing cached and non cached iterators · 645d72aa

Kent Overstreet authored Oct 26, 2020

There was a bug where bch2_trans_update() would incorrectly delete a
pending update where the new update did not actually overwrite the
existing update, because we were incorrectly using BTREE_ITER_TYPE when
sorting pending btree updates.

This affects the pending patch to use cached iterators for inode
updates.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add mode to bch2_inode_to_text · eb460979

Kent Overstreet authored Oct 26, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Always write a journal entry when stopping journal · 8be901d5

Kent Overstreet authored Oct 25, 2020

This is to fix a (harmless) bug where the read clock hand in the
superblock doesn't match the journal.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Drop alloc keys from journal when -o reconstruct_alloc · 33114c2d

Kent Overstreet authored Oct 24, 2020

This fixes a bug where we'd pop an assertion due to replaying a key for
an interior btree node when that node no longer exists.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Indirect inline data extents · 801a3de6

Kent Overstreet authored Oct 24, 2020

When inline data extents were added, reflink was forgotten about - we
need indirect inline data extents for reflink + inline data to work
correctly.

This patch adds them, and a new feature bit that's flipped when they're
used.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix rare use after free in read path · 13dcd4ab

Kent Overstreet authored Oct 24, 2020

If the bkey_on_stack_reassemble() call in __bch2_read_indirect_extent()
reallocates the buffer, k in bch2_read - which we pointed at the
bkey_on_stack buffer - will now point to a stale buffer. Whoops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
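
The class of bug described here (a pointer into a growable buffer going stale after a reallocation) can be sketched in plain userspace C. This is not the actual patch: `toy_buf`, `toy_buf_reassemble()` and `toy_read()` are hypothetical stand-ins, and the sketch only shows the safe ordering, which is to take the pointer after the call that may move the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, loosely in the spirit of bkey_on_stack. */
struct toy_buf {
	char	*data;
	size_t	 size;
};

/* Copy src into the buffer, growing it if needed. May move b->data. */
static int toy_buf_reassemble(struct toy_buf *b, const char *src, size_t len)
{
	assert(b && src);
	if (len > b->size) {
		char *n = realloc(b->data, len);

		if (!n)
			return -1;
		b->data = n;
		b->size = len;
	}
	memcpy(b->data, src, len);
	return 0;
}

/*
 * Safe ordering: do not keep a pointer taken before the reassemble call;
 * read b->data only after the buffer can no longer move.
 */
static const char *toy_read(struct toy_buf *b, const char *indirect, size_t len)
{
	if (toy_buf_reassemble(b, indirect, len))
		return NULL;
	return b->data;
}
```

A caller that saved `b->data` before calling `toy_buf_reassemble()` and used it afterwards would be reading freed memory whenever the buffer grew.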

bcachefs: Improve some error messages · e00711d2

Kent Overstreet authored Oct 24, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix for passing target= opts as mount opts · a10e677a

Kent Overstreet authored Oct 23, 2020

Some options can't be parsed until the filesystem is initialized;
previously, passing these options to mount or remount would cause mount
to fail.

This changes the mount path so that we parse the options passed in
twice, and just ignore any options that can't be parsed the first time.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
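
A small sketch of the two-pass parsing approach this message describes. Everything here is an assumption made for illustration: the option names, the `toy_opts` struct, and `toy_lookup_target()` (a made-up label table standing in for state that only exists once the filesystem is up). The first pass skips options it cannot resolve yet; the second pass, run after initialization, handles them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_opts {
	bool verbose;
	int  target;	/* -1 until resolved */
};

/* Hypothetical label lookup; only meaningful once the filesystem is up. */
static int toy_lookup_target(const char *label)
{
	static const char *const labels[] = { "ssd", "hdd" };

	for (size_t i = 0; i < sizeof(labels) / sizeof(labels[0]); i++)
		if (!strcmp(label, labels[i]))
			return (int) i;
	return -1;
}

/* Returns 0 on success, -1 on an unknown option or unknown target. */
static int toy_parse_opts(struct toy_opts *o, const char *const *opts,
			  size_t nr, bool fs_ready)
{
	assert(o && (opts || !nr));
	for (size_t i = 0; i < nr; i++) {
		if (!strcmp(opts[i], "verbose")) {
			o->verbose = true;
		} else if (!strncmp(opts[i], "target=", 7)) {
			if (!fs_ready)
				continue;	/* retried on the second pass */
			o->target = toy_lookup_target(opts[i] + 7);
			if (o->target < 0)
				return -1;
		} else {
			return -1;	/* unknown option */
		}
	}
	return 0;
}
```

Calling `toy_parse_opts()` with `fs_ready == false` accepts `target=hdd` without resolving it; calling it again with `fs_ready == true` fills in `target`.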

bcachefs: Fix bch2_mark_stripe() · 5b088c1d

Kent Overstreet authored Oct 23, 2020

There's no reason not to always recalculate these fields
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't drop replicas when copygcing ec data · b88e971e

Kent Overstreet authored Jul 22, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Account for stripe parity sectors separately · af4d05c4

Kent Overstreet authored Jul 09, 2020

Instead of trying to charge EC parity to the data within the stripe
(which is subject to rounding errors), let's charge it to the stripe
itself. It should also make -ENOSPC issues easier to deal with if we
charge for parity blocks up front, and means we can also make more fine
grained accounting available to the user.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
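
The rounding problem mentioned above can be illustrated with integer arithmetic. The numbers and function names below are invented for the example and do not come from bcachefs: a stripe with 3 data blocks and 1 parity block of 128 sectors each, with parity charged either per extent (proportionally, rounding down) or once per stripe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Charge parity to each extent in proportion to its size (rounds down). */
static uint64_t toy_parity_per_extent(const uint32_t *sizes, size_t nr,
				      unsigned nr_data, unsigned nr_parity)
{
	uint64_t sum = 0;

	assert(nr_data > 0);
	for (size_t i = 0; i < nr; i++)
		sum += (uint64_t) sizes[i] * nr_parity / nr_data;
	return sum;
}

/* Charge parity to the stripe itself: exact, no per-extent rounding. */
static uint64_t toy_parity_per_stripe(uint32_t sectors_per_block,
				      unsigned nr_parity)
{
	return (uint64_t) sectors_per_block * nr_parity;
}
```

For extents of 100, 100, 100 and 84 sectors (384 data sectors in total), the per-extent charges are 33 + 33 + 33 + 28 = 127 sectors, one short of the 128 parity sectors actually on disk; charging the stripe directly gives exactly 128.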

bcachefs: Fix for bad stripe pointers · 39283c71

Kent Overstreet authored Oct 19, 2020

The allocator usually doesn't increment bucket gens right away on
buckets that it's about to hand out (for reasons that need to be
documented), instead deferring that to whatever extent update first
references that bucket.

But stripe pointers reference buckets without changing bucket sector
counts, meaning we could end up with a pointer in a stripe with a gen
newer than the bucket it points to.

Fix this by adding a transactional trigger for KEY_TYPE_stripe that just
writes out the keys in the alloc btree for the buckets it points to.

Also - consolidate the code that checks pointer validity.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Start/stop io clock hands in read/write paths · 28998019

Kent Overstreet authored Oct 17, 2020

This fixes a bug where the clock hands in the journal and superblock
didn't match, because we were still incrementing the read clock hand
while read-only.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improvements to writing alloc info · 8d6b6222

Kent Overstreet authored Oct 16, 2020

Now that we've got transactional alloc info updates (and have for
a while), we don't need to write it out on shutdown, and we don't need to
write it out on startup except when GC found errors - this is a big
improvement to mount/unmount performance.

This patch also fixes a few bugs where we weren't writing out alloc
info (on new filesystems, and new devices) and should have been.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix assertion popping in transaction commit path · aa8889c0

Kent Overstreet authored Dec 16, 2020

We can't be holding read locks on btree nodes when we go to take write
locks: this would deadlock if another thread is holding an intent lock
on the node we have a read lock on, and it tries to commit and upgrade
to a write lock.

But instead of triggering an assertion, if this happens we can just
upgrade the read lock to an intent lock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Perf improvements for bch_alloc_read() · f3721e12

Kent Overstreet authored Oct 16, 2020

On large filesystems reading in the alloc info takes a significant
amount of time. But we don't need to be calling into the fully general
bch2_mark_key() path, just open code what we need in
bch2_alloc_read_fn().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix copygc dying on startup · 9f20ed15

Kent Overstreet authored Oct 15, 2020

The copygc thread errors out and makes the filesystem go RO if it ever
tries to run and discovers it has no reserve allocated - which is a
problem if it races with the allocator thread and its reserve hasn't
been filled yet.

The allocator thread doesn't start filling the copygc reserve until
after BCH_FS_STARTED has been set, so make sure to wake up the allocator
threads after setting that and before starting copygc.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix copygc of compressed data · 6ea873d1

Kent Overstreet authored Oct 15, 2020

The check for when we need to get a disk reservation was wrong.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix another lockdep splat · 97c0e195

Kent Overstreet authored Oct 15, 2020

vfree() can allocate memory, so we need to call memalloc_nofs_save().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix errors early in the fs init process · 505b7a4c

Kent Overstreet authored Oct 15, 2020

At some point bch2_fs_alloc() was changed to always call bch2_fs_free()
in the error path, which means we need c->cl to always be initialized.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Copy ptr->cached when migrating data · 922ae9f4

Kent Overstreet authored Jul 10, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix gc of stale ptr gens · c47c50f8

Kent Overstreet authored Oct 13, 2020

A while back, gcing of stale pointers was split out from full
mark-and-sweep gc - but the bit to actually drop those stale pointers
wasn't implemented. Whoops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix off-by-one error in ptr gen check · 9ee38f62

Kent Overstreet authored Oct 13, 2020

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix a lockdep splat · 5d0b7f90

Kent Overstreet authored Oct 11, 2020

We can't allocate memory with GFP_FS while holding the btree cache lock,
and vfree() can allocate memory.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix __bch2_truncate_page() · 9ba2eb25

Kent Overstreet authored Oct 09, 2020

__bch2_truncate_page() will mark some of the blocks in a page as
unallocated. But, if the page is mmapped (and writable), every block in
the page needs to be marked dirty, else those blocks won't be written by
__bch2_writepage().

The solution is to change those userspace mappings to RO, so that we
force bch2_page_mkwrite() to be called again.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix journal_seq_copy() · 61ce38b8

Kent Overstreet authored Oct 06, 2020

We also need to update the journal's bloom filter of inode numbers that
each journal write has updates for - in case the inode gets evicted
before it gets fsynced.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Fix unmount path · d5e4dcc2

Kent Overstreet authored Sep 08, 2020

There was a long-standing race in the mount/unmount code - the VFS
intends for mount/unmount synchronization to be handled by the list of
superblocks, but we were still holding devices open after tearing down
our superblock in the unmount path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Don't fail mount if device has been removed · 625104ea

Kent Overstreet authored Sep 06, 2020

Also - make sure to show the devices we actually have open in /proc
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Improvements to the journal read error paths · ca73852a

Kent Overstreet authored Aug 24, 2020

 - Print out more information in error messages
 - On checksum error, keep the journal entry but mark it bad so that we
   can prefer entries from other devices that don't have bad checksums
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
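
The second point, keeping an entry with a bad checksum but preferring copies from other devices, might look roughly like this in isolation. The struct and function names are hypothetical and the real journal read path is considerably more involved; this only sketches the selection rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one device's copy of a journal entry. */
struct toy_jset_copy {
	uint64_t seq;		/* journal sequence number */
	unsigned dev;		/* device the copy was read from */
	bool	 bad_csum;	/* kept, but marked bad */
};

/*
 * Pick the copy to use for a given seq: the first one with a good
 * checksum, else a bad-checksum copy if that is all we have, else NULL.
 */
static const struct toy_jset_copy *
toy_pick_copy(const struct toy_jset_copy *c, size_t nr, uint64_t seq)
{
	const struct toy_jset_copy *fallback = NULL;

	assert(c || !nr);
	for (size_t i = 0; i < nr; i++) {
		if (c[i].seq != seq)
			continue;
		if (!c[i].bad_csum)
			return &c[i];
		if (!fallback)
			fallback = &c[i];
	}
	return fallback;
}
```

Keeping the bad copy as a fallback means an entry is not lost outright when every device's copy fails its checksum; the caller can still decide what to do with it.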
