- 26 Sep, 2022 40 commits
-
-
Josef Bacik authored
We have two variants of lock/unlock extent, one set that takes a cached state, another that does not. This is slightly annoying, and generally speaking there are only a few places where we don't have a cached state. Simplify this by making lock_extent/unlock_extent the only variant and make it take a cached state, then convert all the callers appropriately. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
The only place that sets extent_changeset is set_record_extent_bits; everywhere else sets it to NULL. Drop this argument from set_extent_bit. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is only used for internal locking related helpers, everybody else just passes in NULL. I've changed set_extent_bit to __set_extent_bit and made it static, removed failed_start from set_extent_bit and have it call __set_extent_bit with a NULL failed_start, and I've moved some code down below the now static __set_extent_bit. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is only used in the case that we are clearing EXTENT_LOCKED, so infer this value from the bits passed in instead of taking it as an argument. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is only ever set if we have EXTENT_LOCKED set, so simply push this into the function itself and remove the function argument. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
These prototypes have nothing to do with the extent_io_tree helpers, move them to their appropriate header. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We use rb_next/rb_prev and then get the entry for the adjacent items in an extent io tree. We have helpers for this, so convert merge_state to use next_state/prev_state and simplify the code. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Instead of doing the rb_entry again once we return from this function, simply return the actual states themselves, and then clean up the only user of this helper to handle states instead of nodes. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We use this to search for an extent state, or return the nodes we need to insert a new extent state. This means we have the following pattern:

    node = tree_search_for_insert();
    if (!node) {
            /* alloc and insert. */
            goto again;
    }
    state = rb_entry(node, struct extent_state, rb_node);

We don't use the node for anything else. Making tree_search_for_insert() return the extent_state means we can drop the rb_node and clean this up by eliminating the rb_entry.

Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We have a consistent pattern of

    n = tree_search();
    if (!n)
            goto out;
    state = rb_entry(n, struct extent_state, rb_node);
    while (state) {
            /* do something. */
    }

which is a bit redundant. If we make tree_search return the state we can simply have

    state = tree_search();
    while (state) {
            /* do something. */
    }

which cleans up the code quite a bit.

Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We can simplify a lot of these functions where we have to cycle through extent_state's by simply using next_state() instead of rb_next(). In many spots this allows us to do things like

    while (state) {
            /* whatever */
            state = next_state(state);
    }

instead of

    while (1) {
            state = rb_entry(n, struct extent_state, rb_node);
            n = rb_next(n);
            if (!n)
                    break;
    }

Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
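The state-returning search and the next_state() loop described in these commits can be illustrated outside the kernel. The following is only a sketch: it assumes a sorted singly linked list in place of the rb tree, and `struct state`, `tree_search()`, `next_state()` and `bytes_in_states_from()` are simplified stand-ins with hypothetical signatures, not the btrfs implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct extent_state: a sorted list, not an rb tree. */
struct state {
	unsigned long start;
	unsigned long end;	/* inclusive */
	struct state *next;
};

/* Return the state that follows @s, or NULL at the end of the list. */
static struct state *next_state(struct state *s)
{
	return s->next;
}

/* Return the first state ending at or after @start, or NULL if none. */
static struct state *tree_search(struct state *head, unsigned long start)
{
	struct state *s = head;

	while (s && s->end < start)
		s = next_state(s);
	return s;
}

/* Sum the lengths of every state that ends at or after @start. */
static unsigned long bytes_in_states_from(struct state *head, unsigned long start)
{
	unsigned long total = 0;
	struct state *state = tree_search(head, start);

	/* No separate NULL check or rb_entry() step: the loop handles both. */
	while (state) {
		assert(state->start <= state->end);
		total += state->end - state->start + 1;
		state = next_state(state);
	}
	return total;
}
```

Because the search returns the object itself, the "not found" case and the "ran off the end" case collapse into the same `while (state)` condition.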
-
Josef Bacik authored
This existed when we overloaded the tree manipulation functions for both the extent_io_tree and the extent buffer tree. However the extent buffers are now stored in a radix tree, so we no longer need this abstraction. Remove struct tree_entry and use extent_state directly instead. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Now that we've moved everything we can unexport all the temporary exports, move the random helpers, and mark everything as static again. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We no longer need to export this as all users are in extent-io-tree.c, remove it from the header and put it into extent-io-tree.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is still huge, but unfortunately I cannot make it smaller without renaming tree_search() and changing all the callers to use the new name, then moving those chunks and then changing the name back. This feels like too much churn for code movement, so I've limited this to only things that called tree_search(). With this patch all of the extent_io_tree code is now in extent-io-tree.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
These are the last few helpers that do not rely on tree_search() and whose other helpers are exported and in extent-io-tree.c already. Move these across now in order to make the core move smaller. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
In order to avoid moving all of the related code at once temporarily export all of the extent state related helpers. Then move these helpers into extent-io-tree.c. We will clean up the exports and make them static in followup patches. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
A lot of the various internals of extent_io_tree call these two functions for insert or searching the rb tree for entries, so temporarily export them and then move them to extent-io-tree.c. We can't move tree_search() without renaming it, and I don't want to introduce a bunch of churn just to do that, so move these functions first and then we can move a few big functions and then the remaining users of tree_search(). Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This helper is used by a lot of the core extent_io_tree helpers, so temporarily export it and move it into extent-io-tree.c in order to make it straightforward to migrate the helpers in batches. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is used by the subpage code in addition to lock_extent_bits, so export it so we can move it out of extent_io.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
These are just variants and wrappers around the actual workhorses of the extent state. Extract these out of extent_io.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We only call these functions from the qgroup code which doesn't call with EXTENT_BIT_LOCKED. These are BUG_ON()'s that exist to keep us developers from using these functions with EXTENT_BIT_LOCKED, so convert them to ASSERT()'s. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Start cleaning up extent_io.c by moving the extent state code out of it. This patch starts with the extent state allocation code and the extent_io_tree init code. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We're going to move this code in stages, but while we're doing that we need to export these helpers so we can more easily move the code into the new file. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Currently we have the add/del functions generic so that we can use them for both extent buffers and extent states. We want to separate this code however, so separate these helpers into per-object helpers in anticipation of the split. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
In order to help separate the extent buffer from the extent io tree code we need to break up the init functions. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Currently we're using find_first_extent_bit_state to check if our state contains the given failrec range, however this is more of an internal extent_io_tree helper, and is technically unsafe to use because we're accessing the state outside of the extent_io_tree lock. Instead use the normal helper find_first_extent_bit which returns the range of the extent state we find in find_first_extent_bit_state and use that to do our sanity checking. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We still have this oddity of stashing the io_failure_record in the extent state for the io_failure_tree, which is leftover from when we used to stuff private pointers in extent_io_trees. However this doesn't make a lot of sense for the io failure records, we can simply use a normal rb_tree for this. This will allow us to further simplify the extent_io_tree code by removing the io_failure_rec pointer from the extent state. Convert the io_failure_tree to an rb tree + spinlock in the inode, and then use our rb tree simple helpers to insert and find failed records. This greatly cleans up this code and makes it easier to separate out the extent_io_tree code. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
These are internally used functions and are not used outside of extent_io.c. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is exported, so rename it to btrfs_clean_io_failure. Additionally we are passing in the io trees and such from the inode, so instead of doing all that simply pass in the inode itself and get all the components we need directly inside of btrfs_clean_io_failure. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
KCSAN reports that there's unlocked access mixed with locked access, which is technically correct but is not a bug. To avoid false alerts at least from KCSAN, add annotation and use a wrapper whenever ->full is accessed for read outside of lock. It is used as a fast check and only advisory. In the worst case the block reserve is found !full and becomes full in the meantime, but properly handled. Depending on the value of ->full, btrfs_block_rsv_release decides where to return the reservation, and block_rsv_release_bytes handles a NULL pointer for block_rsv and if it's not NULL then it double checks the full status under a lock. Link: https://lore.kernel.org/linux-btrfs/CAAwBoOJDjei5Hnem155N_cJwiEkVwJYvgN-tQrwWbZQGhFU=cA@mail.gmail.com/ Link: https://lore.kernel.org/linux-btrfs/YvHU/vsXd7uz5V6j@hungrycats.org Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Signed-off-by: David Sterba <dsterba@suse.com>
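The "advisory read outside the lock, authoritative check under the lock" pattern can be sketched in userspace. This is an analogue under stated assumptions, not the kernel code: it uses a pthread mutex and a relaxed C11 atomic load where the kernel uses its own spinlock and annotation, and `struct block_rsv`, `rsv_init()`, `rsv_full_hint()` and `rsv_add_bytes()` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified reserve; not the kernel's struct btrfs_block_rsv. */
struct block_rsv {
	pthread_mutex_t lock;
	unsigned long size;
	unsigned long reserved;
	atomic_bool full;	/* written under lock, may be read without it */
};

static void rsv_init(struct block_rsv *rsv, unsigned long size)
{
	pthread_mutex_init(&rsv->lock, NULL);
	rsv->size = size;
	rsv->reserved = 0;
	atomic_init(&rsv->full, size == 0);
}

/* Wrapper for the unlocked read: a hint only, it may be stale. */
static bool rsv_full_hint(struct block_rsv *rsv)
{
	return atomic_load_explicit(&rsv->full, memory_order_relaxed);
}

/* Add up to @bytes to the reserve and return how many were accepted. */
static unsigned long rsv_add_bytes(struct block_rsv *rsv, unsigned long bytes)
{
	unsigned long room, taken;

	if (rsv_full_hint(rsv))		/* fast path, advisory */
		return 0;

	pthread_mutex_lock(&rsv->lock);
	assert(rsv->reserved <= rsv->size);
	/* Recompute under the lock: the reserve may have filled meanwhile. */
	room = rsv->size - rsv->reserved;
	taken = bytes < room ? bytes : room;
	rsv->reserved += taken;
	atomic_store_explicit(&rsv->full, rsv->reserved == rsv->size,
			      memory_order_relaxed);
	pthread_mutex_unlock(&rsv->lock);
	return taken;
}
```

If the hint is read as "not full" just before another thread fills the reserve, the locked section simply finds no room and accepts zero bytes, which is the "properly handled" worst case the commit message refers to.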
-
Filipe Manana authored
At space-info.c:__reserve_bytes(), we increment the 'used' variable, but then we don't use the variable anymore, making the increment pointless. The increment became useless with commit 2e294c60 ("btrfs: simplify the logic in need_preemptive_flushing"), so just remove it. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Christoph Hellwig authored
btrfs_check_zoned_mode is really hard to follow, mostly due to the fact that a lot of the checks use duplicate conditions after support for zone emulation for conventional devices on file systems with the ZONED flag was added. Fix this by factoring out the check for host managed devices for !ZONED file systems into a separate helper and then simplifying the rest of the code. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
-
Christophe JAILLET authored
Add a missing 'r'. s/qgoup/qgroup/ . Codespell does not catch that for some reason. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Gaosheng Cui authored
btrfs_bit_radix_cachep has been removed since commit 45c06543 ("Btrfs: remove unused btrfs_bit_radix slab"), so remove it. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Btrfs qgroup has a long history of bringing performance penalty in btrfs_commit_transaction(). Although we tried our best to mitigate such impact, there is still an unsolved call site, btrfs_drop_snapshot(). This function will find the highest shared tree block and modify its extent ownership to do a subvolume/snapshot dropping. Such a change will affect the whole subtree, cause tons of qgroup dirty extents, and stall btrfs_commit_transaction(). To avoid this problem, introduce a new sysfs interface, /sys/fs/btrfs/<uuid>/qgroups/drop_subtree_threshold, to determine whether and at which level we should skip qgroup accounting for subtree dropping. The default value is BTRFS_MAX_LEVEL, thus every subtree drop will go through qgroup accounting, to ensure qgroup numbers are kept as consistent as possible. For performance sensitive cases, the value can be changed to something more reasonable like 3, so that dropping any subtree at or above level 3 marks qgroup inconsistent and skips the accounting. The cost is obvious: the qgroup numbers are no longer consistent, but at least performance is more reasonable, and users have the control. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
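The threshold decision described above comes down to one comparison. A minimal sketch, assuming a hypothetical `struct qgroup_ctl` and helper `should_account_subtree()` (neither is a kernel name), and taking BTRFS_MAX_LEVEL to be 8 so that valid levels are 0 to 7:

```c
#include <assert.h>
#include <stdbool.h>

#define BTRFS_MAX_LEVEL 8	/* assumed value; tree levels are 0..7 */

/* Hypothetical container for the tunable and the resulting status. */
struct qgroup_ctl {
	unsigned int drop_threshold;	/* value written through sysfs */
	bool inconsistent;
};

/*
 * Return true if dropping a subtree rooted at @level should go through
 * qgroup accounting. At or above the threshold the accounting is skipped
 * and the qgroup numbers are marked inconsistent instead.
 */
static bool should_account_subtree(struct qgroup_ctl *q, unsigned int level)
{
	assert(level < BTRFS_MAX_LEVEL);

	if (level >= q->drop_threshold) {
		q->inconsistent = true;
		return false;
	}
	return true;
}
```

With the default threshold of BTRFS_MAX_LEVEL no valid level can reach it, so every drop is accounted; lowering it to 3 trades consistency for speed on large subtrees.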
-
Qu Wenruo authored
The new flag will make btrfs qgroup skip all its time consuming qgroup accounting. The lifespan is the same as BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN: it only gets cleared after a new rescan. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Introduce a new runtime flag, BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN, which will inform qgroup rescan to cancel its work asynchronously. This is to address the window when an operation makes qgroup numbers inconsistent (like qgroup inheriting) while a qgroup rescan is running. In that case, the qgroup inconsistent flag will be cleared when qgroup rescan finishes. But we changed the ownership of some extents, which means the rescan is already meaningless, and the qgroup inconsistent flag should not be cleared. With the new flag, each time we set the INCONSISTENT flag, we also set this new flag to inform any running qgroup rescan to exit immediately, leaving the INCONSISTENT flag set. The new runtime flag can only be cleared when a new rescan is started. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
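The interaction between the INCONSISTENT flag and the cancel flag can be modelled in a few lines. This is a single-threaded sketch under assumptions: the flag names and bit positions are illustrative, `mark_inconsistent()` and `rescan()` are hypothetical helpers, and the `disturb_at` parameter merely simulates a concurrent operation that invalidates the numbers mid-rescan.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit positions, assumed for this sketch only. */
#define FLAG_INCONSISTENT		(1U << 2)
#define RUNTIME_FLAG_CANCEL_RESCAN	(1U << 3)

/* Whoever invalidates the numbers also asks any running rescan to stop. */
static void mark_inconsistent(unsigned int *flags)
{
	*flags |= FLAG_INCONSISTENT | RUNTIME_FLAG_CANCEL_RESCAN;
}

/*
 * Rescan @nr_items items. @disturb_at is the index at which a simulated
 * concurrent operation marks the qgroup inconsistent (negative for none).
 * Returns true if the rescan ran to completion.
 */
static bool rescan(unsigned int *flags, int nr_items, int disturb_at)
{
	assert(nr_items >= 0);

	/* Starting a new rescan is the only place the cancel flag is cleared. */
	*flags &= ~RUNTIME_FLAG_CANCEL_RESCAN;

	for (int i = 0; i < nr_items; i++) {
		if (i == disturb_at)
			mark_inconsistent(flags);
		if (*flags & RUNTIME_FLAG_CANCEL_RESCAN)
			return false;	/* exit early, INCONSISTENT stays set */
		/* ... account item i ... */
	}

	/* Only a rescan that was never cancelled may clear INCONSISTENT. */
	*flags &= ~FLAG_INCONSISTENT;
	return true;
}
```

A cancelled rescan returns without touching INCONSISTENT, so the stale numbers stay flagged until a later rescan completes undisturbed.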
-
Qu Wenruo authored
Currently we only have 3 qgroup flags:

  - BTRFS_QGROUP_STATUS_FLAG_ON
  - BTRFS_QGROUP_STATUS_FLAG_RESCAN
  - BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT

These flags match the on-disk flags used in btrfs_qgroup_status. But we're going to introduce extra runtime flags which will not reach disks. So here we introduce a new mask, BTRFS_QGROUP_STATUS_FLAGS_MASK, to make sure only those flags can reach disks.

Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
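Masking before writing to disk might look roughly like the sketch below. The three status flag names and the mask name come from the commit message; their bit positions, the runtime flag `EXAMPLE_RUNTIME_FLAG` and the helper `flags_for_disk()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions are assumed here for illustration. */
#define BTRFS_QGROUP_STATUS_FLAG_ON		(1ULL << 0)
#define BTRFS_QGROUP_STATUS_FLAG_RESCAN		(1ULL << 1)
#define BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT	(1ULL << 2)

#define BTRFS_QGROUP_STATUS_FLAGS_MASK		\
	(BTRFS_QGROUP_STATUS_FLAG_ON |		\
	 BTRFS_QGROUP_STATUS_FLAG_RESCAN |	\
	 BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT)

/* Hypothetical in-memory-only flag sharing the same word. */
#define EXAMPLE_RUNTIME_FLAG			(1ULL << 3)

/* A runtime flag must never alias an on-disk bit. */
static_assert((EXAMPLE_RUNTIME_FLAG & BTRFS_QGROUP_STATUS_FLAGS_MASK) == 0,
	      "runtime flag overlaps on-disk flags");

/* Strip everything that is not an on-disk status flag. */
static uint64_t flags_for_disk(uint64_t in_memory_flags)
{
	return in_memory_flags & BTRFS_QGROUP_STATUS_FLAGS_MASK;
}
```

Keeping runtime and on-disk flags in one word makes checks cheap, and the mask is the single point that keeps the runtime bits from leaking into the status item.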
-
Qu Wenruo authored
Although we already have an info kobject for each qgroup, we don't have global qgroup info attributes to show things like the enabled or inconsistent status flags. Add this qgroups attribute group, and the first member is qgroup_flags, which is a read-only attribute to show human readable qgroup flags. The paths are:

  /sys/fs/btrfs/<uuid>/qgroups/enabled
  /sys/fs/btrfs/<uuid>/qgroups/inconsistent

The output is simple, just 1 or 0.

Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-