Commits · 497c982f057d3a20af4df313d997e81ca903ce4e · Kirill Smelkov / linux

08 May, 2024 15 commits

bcachefs: New assertion for writing to the journal after shutdown · 497c982f
Kent Overstreet authored Feb 20, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
497c982f

bcachefs: bch2_btree_path_to_text() · 00589cad

Kent Overstreet authored Apr 05, 2024

Long form version of bch2_btree_path_to_text() - useful in error
messages and tracepoints.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

00589cad

bcachefs: add btree_node_merging_disabled debug param · 55778814
Kent Overstreet authored Apr 05, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
55778814
bcachefs: bch2_hash_lookup() now returns bkey_s_c · ac01928b
Kent Overstreet authored Apr 07, 2024
```
small cleanup
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
ac01928b
bcachefs: bch2_journal_keys_dump() · 6ab71b4a
Kent Overstreet authored Apr 09, 2024
```
debug helper
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
6ab71b4a

bcachefs: bch2_btree_node_header_to_text() · 9089376f

Kent Overstreet authored Apr 10, 2024

better btree node read path error messages
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9089376f

bcachefs: prt_printf() now respects \r\n\t · 7423330e
Kent Overstreet authored Apr 10, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
7423330e
bcachefs: printbufs: prt_printf() now handles \t\r\n · 2dcb605e
Kent Overstreet authored Apr 10, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
2dcb605e

bcachefs: printbuf improvements · acce32a5

Kent Overstreet authored Apr 10, 2024

- fix assorted (harmless) off-by-one errors
- we were inconsistent on whether out->pos stays <= out->size on
  overflow; now it does, and printbuf.overflow exists to indicate if a
  printbuf has overflowed
- factor out printbuf_advance_pos()
- printbuf_nul_terminate_reserved(); use this to reduce the number of
  printbuf_make_room() calls
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

acce32a5
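
A minimal sketch of the invariant this commit describes (pos stays <= size on overflow, and a separate flag records that output was dropped). The struct and function names here are hypothetical and much simpler than the real printbuf code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature printbuf; not the kernel's struct printbuf. */
struct mini_printbuf {
	char	*buf;
	size_t	size;		/* capacity of buf, including the NUL byte */
	size_t	pos;		/* invariant: pos <= size, even after overflow */
	bool	overflow;	/* set once any output has been dropped */
};

/* Bytes still available for text, keeping one byte reserved for the NUL. */
static size_t mini_printbuf_remaining(const struct mini_printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

/* Append s, truncating (and flagging overflow) instead of moving pos past size. */
static void mini_printbuf_puts(struct mini_printbuf *out, const char *s)
{
	size_t len = strlen(s);
	size_t room = mini_printbuf_remaining(out);

	if (len > room) {
		len = room;
		out->overflow = true;
	}
	if (len)
		memcpy(out->buf + out->pos, s, len);
	out->pos += len;
	if (out->size)
		out->buf[out->pos] = '\0';
	assert(out->pos <= out->size);
}
```

Callers can then check the overflow flag once at the end instead of comparing pos against size after every append.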

bcachefs: Run upgrade/downgrade even in -o nochanges mode · 62606398

Kent Overstreet authored Apr 28, 2024

We need to be able to test these paths in dry run mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

62606398

bcachefs: Better write_super() error messages · 6d828691

Kent Overstreet authored May 03, 2024

When a superblock write is silently dropped or it's been modified by
another process we need to know which device it was.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

6d828691

bcachefs: Fix xattr_to_text() unsafety · 74768337
Kent Overstreet authored May 08, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
74768337

bcachefs: bch2_bkey_format_field_overflows() · 61692c78

Kent Overstreet authored May 08, 2024

Fix another shift-by-64 by factoring out a common helper for
bch2_bkey_format_invalid() and bformat_needs_redo() (where it was
already fixed).

Reported-by: syzbot+9833a1d29d4a44361e2c@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

61692c78
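
For context, shifting a 64-bit value by 64 is undefined behaviour in C, so a field that is 64 bits wide needs special handling. The sketch below shows one way to factor such a check into a shared helper; the names and the exact overflow rule are assumptions for illustration, not the logic of bch2_bkey_format_field_overflows():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest value that fits in `bits` bits, without ever shifting by 64. */
static uint64_t toy_field_max(unsigned bits)
{
	assert(bits <= 64);
	return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * Hypothetical overflow rule: a packed field overflows if its offset plus
 * the largest packed value wraps around a 64-bit integer.
 */
static bool toy_field_overflows(unsigned bits, uint64_t field_offset)
{
	uint64_t max = toy_field_max(bits);

	return field_offset + max < field_offset;	/* unsigned wraparound */
}
```

With a single helper, every caller gets the 64-bit case right instead of each repeating its own shift.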

bcachefs: Fix needs_whiteout BUG_ON() in bkey_sort() · 5dfd3746

Kent Overstreet authored May 08, 2024

Btree nodes are log structured; thus, we need to emit whiteouts when
we're deleting a key that's been written out to disk.

k->needs_whiteout tracks whether a key will need a whiteout when it's
deleted, and this requires some careful handling; e.g. the key we're
deleting may not have been written out to disk, but it may have
overwritten a key that was - thus we need to carry this flag around on
overwrites.

Invariants:
There may be multiple keys for the same position in a given node (because
of overwrites), but only one of them will be a live (non-deleted) key,
and only one key for a given position will have the needs_whiteout flag
set.

Additionally, we don't want to carry around whiteouts that need to be
written in the main searchable part of a btree node - btree_iter_peek()
will have to skip past them, and this can lead to O(n^2) behavior when
doing sequential deletions (e.g. inode rm/truncate). So there's a
separate region in the btree node buffer for unwritten whiteouts; these
are merge sorted with the rest of the keys we're writing in the btree
node write path.

The unwritten whiteouts area was a later optimization that bch2_sort_keys()
didn't take into account; the unwritten whiteouts area means that we
never have deleted keys with needs_whiteout set in the main searchable
part of a btree node.

That means we can simplify and optimize some sort paths, and eliminate
an assertion that syzbot found:

- Unless we're in the btree node write path, it's always ok to drop
  whiteouts when sorting
- When sorting for a btree node write, we drop the whiteout if it's not
  from the unwritten whiteouts area, or if it's overwritten by a real
  key at the same position.

This completely eliminates some tricky logic for propagating the
needs_whiteout flag: syzbot was able to hit the assertion that checked
that there shouldn't be more than one key at the same pos with
needs_whiteout set, likely due to a combination of flipping on
needs_whiteout on all written keys (they need whiteouts if overwritten),
combined with not always dropping unneeded whiteouts, and the tricky
logic in the sort path for preserving needs_whiteout that wasn't really
needed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

5dfd3746
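
The two rules in the commit message can be sketched as a filter over keys that are already sorted by position. This is a simplified illustration with hypothetical types and names; it ignores packing, bsets and the actual merge sort in the bcachefs sort paths:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified key. */
struct toy_key {
	uint64_t	pos;
	bool		deleted;		/* this key is a whiteout */
	bool		unwritten_whiteout;	/* came from the unwritten whiteouts area */
};

/* Is there a live (non-deleted) key at the same position as in[i]? */
static bool toy_pos_has_live_key(const struct toy_key *in, size_t nr, size_t i)
{
	for (size_t j = i; j-- > 0 && in[j].pos == in[i].pos;)
		if (!in[j].deleted)
			return true;
	for (size_t j = i + 1; j < nr && in[j].pos == in[i].pos; j++)
		if (!in[j].deleted)
			return true;
	return false;
}

/*
 * Copy keys from in[] (sorted by pos) to out[], dropping whiteouts:
 * - outside the write path, every whiteout is dropped;
 * - in the write path, a whiteout survives only if it came from the
 *   unwritten whiteouts area and no real key overwrites it.
 * Returns the number of keys written to out[].
 */
static size_t toy_filter_sorted(const struct toy_key *in, size_t nr,
				struct toy_key *out, bool for_write)
{
	size_t n = 0;

	for (size_t i = 0; i < nr; i++) {
		bool keep = !in[i].deleted ||
			(for_write &&
			 in[i].unwritten_whiteout &&
			 !toy_pos_has_live_key(in, nr, i));

		if (keep)
			out[n++] = in[i];
	}
	return n;
}
```

Note that nothing here needs to propagate a needs_whiteout flag between keys, which is the simplification the commit is after.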

bcachefs: Fix sb_clean_validate endianness conversion · 5ad1f33c
Kent Overstreet authored May 08, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
5ad1f33c

07 May, 2024 2 commits

bcachefs: Add missing sched_annotate_sleep() in bch2_journal_flush_seq_async() · 6e297a73
Kent Overstreet authored May 06, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
6e297a73

bcachefs: Fix race in bch2_write_super() · 54541c1f

Kent Overstreet authored May 06, 2024

bch2_write_super() was looping over online devices multiple times -
dropping and retaking io_ref each time.

This meant it could race with device removal; it could increment the
sequence number on a device but fail to write it - and then if the
device was re-added, it would get confused the next time around thinking
a superblock write was silently dropped.

Fix this by taking io_ref once, and stashing pointers to online devices
in a darray.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

54541c1f
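
The shape of the fix can be illustrated with a single-threaded toy model: take a reference on each online device once, stash the pointers, and run every pass over the stashed set. All names below are hypothetical, a plain array stands in for the darray, and the remove_midway parameter exists only to simulate a device going offline between passes:

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8

/* Hypothetical stand-in for a member device. */
struct toy_dev {
	bool		online;
	int		io_ref;		/* references pinning the device for I/O */
	unsigned	seq;		/* superblock sequence number */
	unsigned	written_seq;	/* sequence number last written out */
};

/* Take one reference on every online device and stash pointers to them. */
static size_t toy_get_online(struct toy_dev *devs, size_t nr,
			     struct toy_dev **stash)
{
	size_t n = 0;

	for (size_t i = 0; i < nr && n < TOY_MAX_DEVS; i++)
		if (devs[i].online) {
			devs[i].io_ref++;
			stash[n++] = &devs[i];
		}
	return n;
}

/* Every pass walks the same stashed set, so a bumped seq is always written. */
static void toy_write_super(struct toy_dev *devs, size_t nr,
			    struct toy_dev *remove_midway)
{
	struct toy_dev *stash[TOY_MAX_DEVS];
	size_t n = toy_get_online(devs, nr, stash);

	for (size_t i = 0; i < n; i++)
		stash[i]->seq++;

	if (remove_midway)
		remove_midway->online = false;	/* simulated device removal */

	for (size_t i = 0; i < n; i++)
		stash[i]->written_seq = stash[i]->seq;

	for (size_t i = 0; i < n; i++)
		stash[i]->io_ref--;
}
```

If each pass re-scanned for online devices instead, the removed device would have its seq incremented but never written, which is the confusion the commit message describes.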

06 May, 2024 18 commits

bcachefs: BCH_SB_LAYOUT_SIZE_BITS_MAX · 71dac248

Kent Overstreet authored May 06, 2024

Define a constant for the max superblock size, to avoid a too-large
shift.

Reported-by: syzbot+a8b0fb419355c91dda7f@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

71dac248

bcachefs: Add missing skcipher_request_set_callback() call · 88ab1018
Kent Overstreet authored May 06, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
88ab1018

bcachefs: Fix snapshot_t() usage in bch2_fs_quota_read_inode() · 8060bf1d

Kent Overstreet authored May 05, 2024

bch2_fs_quota_read_inode() wasn't entirely updated to the
bch2_snapshot_tree() helper, which takes rcu lock.

Reported-by: syzbot+a3a9a61224ed3b7f0010@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

8060bf1d

bcachefs: Fix shift-by-64 in bformat_needs_redo() · 0ec5b3b7

Kent Overstreet authored May 05, 2024

Ancient versions of bcachefs produced packed formats that could
represent keys that our in memory format cannot represent;
bformat_needs_redo() has some tricky shifts to check for this sort of
overflow.

Reported-by: syzbot+594427aebfefeebe91c6@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0ec5b3b7

bcachefs: Guard against unknown k.k->type in __bkey_invalid() · 2bb9600d

Kent Overstreet authored May 05, 2024

For forwards compatibility we have to allow unknown key types, and only
run the checks that make sense against them.

Fix a missing guard on k.k->type being known.

Reported-by: syzbot+ae4dc916da3ce51f284f@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2bb9600d
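
A minimal sketch of this kind of guard, assuming a table of per-type validators indexed by key type. The enum, table and function names are hypothetical and do not reproduce __bkey_invalid():

```c
#include <stddef.h>

enum toy_key_type { TOY_KEY_deleted, TOY_KEY_extent, TOY_KEY_NR };

typedef int (*toy_validate_fn)(size_t val_bytes);

/* Per-type checks; each returns nonzero if the value is invalid. */
static int toy_deleted_invalid(size_t val_bytes) { return val_bytes != 0; }
static int toy_extent_invalid(size_t val_bytes)  { return val_bytes == 0; }

static const toy_validate_fn toy_validators[TOY_KEY_NR] = {
	[TOY_KEY_deleted] = toy_deleted_invalid,
	[TOY_KEY_extent]  = toy_extent_invalid,
};

/* Returns nonzero if the key is invalid. */
static int toy_key_invalid(unsigned type, size_t val_bytes)
{
	/*
	 * Unknown (newer) types are allowed for forwards compatibility:
	 * skip the type-specific checks rather than index past the table.
	 */
	if (type >= TOY_KEY_NR)
		return 0;

	return toy_validators[type](val_bytes);
}
```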

bcachefs: Add missing validation for superblock section clean · f3905522

Kent Overstreet authored May 05, 2024

We were forgetting to check for jset entries that overrun the end of the
section - both in validate and to_text(); to_text() needs to be safe for
types that fail to validate.

Reported-by: syzbot+c48865e11e7e893ec4ab@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f3905522
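
The overrun check can be sketched as a bounds-checked walk over variable-length entries. The layout here is an assumption made for the example (one 64-bit header word whose low 16 bits give the payload length in 64-bit words), not the real jset entry format:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk entries packed into sec[0..sec_u64s). Each entry is a header word
 * followed by (header & 0xffff) payload words. Returns 0 and stores the
 * entry count in *nr, or -1 if an entry would run past the section end.
 */
static int toy_walk_entries(const uint64_t *sec, size_t sec_u64s, size_t *nr)
{
	size_t pos = 0;

	*nr = 0;
	while (pos < sec_u64s) {
		size_t payload = sec[pos] & 0xffff;

		/* pos < sec_u64s, so this subtraction cannot underflow. */
		if (payload > sec_u64s - pos - 1)
			return -1;

		pos += 1 + payload;
		(*nr)++;
	}
	return 0;
}
```

A to_text() style printer would use the same bound and stop at the first overrunning entry, since it may be called on data that failed validation.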

bcachefs: Fix assert in bch2_alloc_v4_invalid() · 6b8cbfc3

Kent Overstreet authored May 05, 2024

Reported-by: syzbot+10827fa6b176e1acf1d0@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

6b8cbfc3

bcachefs: fix overflow in fiemap · 9a0ec045

Reed Riley authored May 04, 2024

filefrag (and potentially other utilities that call fiemap) sometimes
pass ULONG_MAX as the length.  fiemap_prep clamps excessively large
lengths - but the calculation of end can overflow if it occurs before
calling fiemap_prep.  When this happens, filefrag assumes it has read to
the end and exits.
Signed-off-by: Reed Riley <reed@riley.engineer>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9a0ec045
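
The ordering problem is easy to reproduce in isolation. The sketch below uses hypothetical helper names and 64-bit types (where ULONG_MAX equals UINT64_MAX on common 64-bit platforms); it shows the wrapping sum and a variant that clamps the length first, in the spirit of the clamping the commit message attributes to fiemap_prep:

```c
#include <stdint.h>

/* Computing end before clamping: wraps around for very large len. */
static uint64_t toy_end_unclamped(uint64_t start, uint64_t len)
{
	return start + len;
}

/* Clamp len to the space left below max_bytes, then compute end. */
static uint64_t toy_end_clamped(uint64_t start, uint64_t len,
				uint64_t max_bytes)
{
	if (start >= max_bytes)
		return max_bytes;
	if (len > max_bytes - start)
		len = max_bytes - start;
	return start + len;
}
```

With start = 4096 and len = UINT64_MAX, the unclamped version yields an end below start, so a caller concludes there is nothing left to map.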

bcachefs: Add a better limit for maximum number of buckets · db42549d

Kent Overstreet authored May 04, 2024

The bucket_gens array is a single array allocation (one byte per
bucket), and kernel allocations are still limited to INT_MAX.

Check this limit to avoid failing the bucket_gens array allocation.

Reported-by: syzbot+b29f436493184ea42e2b@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

db42549d
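
As a userspace illustration of the idea, the bucket count can be validated against INT_MAX before a one-byte-per-bucket array is allocated. The function names are hypothetical, calloc() stands in for the kernel allocator, and the exact limit the kernel enforces may differ:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One byte per bucket, so the byte count equals the bucket count. */
static bool toy_nbuckets_valid(uint64_t nbuckets)
{
	return nbuckets > 0 && nbuckets <= (uint64_t) INT_MAX;
}

/* Returns a zeroed generation array, or NULL if nbuckets is out of range. */
static uint8_t *toy_alloc_bucket_gens(uint64_t nbuckets)
{
	if (!toy_nbuckets_valid(nbuckets))
		return NULL;	/* reject up front instead of attempting the allocation */

	return calloc((size_t) nbuckets, sizeof(uint8_t));
}
```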

bcachefs: Fix lifetime issue in device iterator helpers · 18b4abce

Kent Overstreet authored May 04, 2024

bch2_get_next_dev() and bch2_get_next_online_dev() iterate over devices,
dropping and taking refs as they go; we can't access the previous device
(for ca->dev_idx) after we've dropped our ref to it, unless we take
rcu_read_lock() first.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

18b4abce

bcachefs: Fix bch2_dev_lookup() refcounting · 3a2d0259

Kent Overstreet authored May 04, 2024

bch2_dev_lookup() is supposed to take a ref on the device it returns, but
while for_each_member_device() takes refs as it iterates,
for_each_member_device_rcu() does not.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

3a2d0259

bcachefs: Initialize bch_write_op->failed in inline data path · 1267df40

Kent Overstreet authored May 04, 2024

Normally this is initialized in __bch2_write(), which is executed in a
loop, but the inline data path skips this.

Reported-by: syzbot+fd3ccb331eb21f05d13b@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1267df40

bcachefs: Fix refcount put in sb_field_resize error path · feb077c1
Kent Overstreet authored May 03, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
feb077c1

bcachefs: Inodes need extra padding for varint_decode_fast() · 4a8521b6

Kent Overstreet authored May 03, 2024

Reported-by: syzbot+66b9b74f6520068596a9@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

4a8521b6

bcachefs: Fix early error path in bch2_fs_btree_key_cache_exit() · b30b70ad

Kent Overstreet authored May 03, 2024

Reported-by: syzbot+a35cdb62ec34d44fb062@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b30b70ad

bcachefs: bucket_pos_to_bp_noerror() · a2ddaf96

Kent Overstreet authored May 03, 2024

We don't want the assert when we're checking if the backpointer is
valid.

Reported-by: syzbot+bf7215c0525098e7747a@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a2ddaf96

bcachefs: don't free error pointers · 7ffec9cc

Kent Overstreet authored May 03, 2024

Reported-by: syzbot+3333603f569fc2ef258c@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

7ffec9cc

bcachefs: Fix a scheduler splat in __bch2_next_write_buffer_flush_journal_buf() · 72e71bf0

Kent Overstreet authored May 06, 2024

We're using mutex_lock() inside a wait_event() conditional -
prepare_to_wait() has already flipped task state, so potentially
blocking ops need annotation.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

72e71bf0

29 Apr, 2024 3 commits

bcachefs: fix integer conversion bug · c258c08a
Kent Overstreet authored Apr 25, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
c258c08a

bcachefs: btree node scan now fills in sectors_written · f7c3dc26
Kent Overstreet authored Apr 25, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
f7c3dc26

bcachefs: Remove accidental debug assert · ae927653
Kent Overstreet authored Apr 22, 2024
```
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
```
ae927653

28 Apr, 2024 2 commits

Linux 6.9-rc6 · e67572cd
Linus Torvalds authored Apr 28, 2024

e67572cd

Merge tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 245c8e81

Linus Torvalds authored Apr 28, 2024

Pull scheduler fixes from Ingo Molnar:

 - Fix EEVDF corner cases

 - Fix two nohz_full= related bugs that can cause boot crashes
   and warnings

* tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU
  sched/isolation: Prevent boot crash when the boot CPU is nohz_full
  sched/eevdf: Prevent vlag from going out of bounds in reweight_eevdf()
  sched/eevdf: Fix miscalculation in reweight_entity() when se is not curr
  sched/eevdf: Always update V if se->on_rq when reweighting

245c8e81