- 29 Aug, 2024 9 commits
-
-
Jens Axboe authored
Merge tag 'md-6.12-20240829' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.12/block Pull MD updates from Song: "Major changes in this set are: 1. md-bitmap refactoring, by Yu Kuai; 2. raid5 performance optimization, by Artur Paszkiewicz; 3. Other small fixes, by Yu Kuai and Chen Ni." * tag 'md-6.12-20240829' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: (49 commits) md/raid5: rename wait_for_overlap to wait_for_reshape md/raid5: only add to wq if reshape is in progress md/raid5: use wait_on_bit() for R5_Overlap md: Remove flush handling md/md-bitmap: make in memory structure internal md/md-bitmap: merge md_bitmap_enabled() into bitmap_operations md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations md/md-bitmap: merge md_bitmap_free() into bitmap_operations md/md-bitmap: merge md_bitmap_set_pages() into struct bitmap_operations md/md-bitmap: merge md_bitmap_copy_from_slot() into struct bitmap_operation. md/md-bitmap: merge get_bitmap_from_slot() into bitmap_operations md/md-bitmap: merge md_bitmap_resize() into bitmap_operations md/md-bitmap: pass in mddev directly for md_bitmap_resize() md/md-bitmap: merge md_bitmap_daemon_work() into bitmap_operations md/md-bitmap: merge bitmap_unplug() into bitmap_operations md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug() md/md-bitmap: merge md_bitmap_sync_with_cluster() into bitmap_operations md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations ...
-
Song Liu authored
From Artur: The wait_for_overlap wait queue is currently used in two cases, which are not really related: - waiting for actual overlapping bios, which uses R5_Overlap bit, - waiting for events related to reshape. Handling every write request in raid5_make_request() involves adding to and removing from this wait queue, which uses a spinlock. With fast storage and multiple submitting threads the contention on this lock is noticeable. This patch series aims to resolve this by separating the two cases mentioned above and using this wait queue only when reshape is in progress. The results when testing 4k random writes on raid5 with null_blk (8 jobs, qd=64, group_thread_cnt=8): before: 463k IOPS after: 523k IOPS The improvement is not huge with this series alone but it is just one of the bottlenecks. When applied onto some other changes I'm working on, it allowed to go from 845k IOPS to 975k IOPS on the same test. * md-6.12-raid5-opt: md/raid5: rename wait_for_overlap to wait_for_reshape md/raid5: only add to wq if reshape is in progress md/raid5: use wait_on_bit() for R5_Overlap Signed-off-by: Song Liu <song@kernel.org>
-
Artur Paszkiewicz authored
The only remaining uses of wait_for_overlap are related to reshape so rename it accordingly. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Link: https://lore.kernel.org/r/20240827153536.6743-4-artur.paszkiewicz@intel.comSigned-off-by: Song Liu <song@kernel.org>
-
Artur Paszkiewicz authored
Now that actual overlaps are not handled on the wait_for_overlap wq anymore, the remaining cases when we wait on this wq are limited to reshape. If reshape is not in progress, don't add to the wq in raid5_make_request() because add_wait_queue() / remove_wait_queue() operations take a spinlock and cause noticeable contention when multiple threads are submitting requests to the mddev. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Link: https://lore.kernel.org/r/20240827153536.6743-3-artur.paszkiewicz@intel.comSigned-off-by: Song Liu <song@kernel.org>
-
Artur Paszkiewicz authored
Convert uses of wait_for_overlap wait queue with R5_Overlap bit to wait_on_bit() / wake_up_bit(). Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Link: https://lore.kernel.org/r/20240827153536.6743-2-artur.paszkiewicz@intel.comSigned-off-by: Song Liu <song@kernel.org>
-
Christoph Hellwig authored
bio_split_rw is designed to split read and write bios with a payload. Currently it is called by __bio_split_to_limits for all operations not explicitly list, which works because bio_may_need_split explicitly checks for bi_vcnt == 1 and thus skips the bypass if there is no payload and bio_for_each_bvec loop will never execute it's body if bi_size is 0. But all this is hard to understand, fragile and wasted pointless cycles. Switch __bio_split_to_limits to only call bio_split_rw for READ and WRITE command and don't attempt any kind split for operation that do not require splitting. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Link: https://lore.kernel.org/r/20240826173820.1690925-5-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Currently REQ_OP_ZONE_APPEND is handled by the bio_split_rw case in __bio_split_to_limits. This is harmful because REQ_OP_ZONE_APPEND bios do not adhere to the soft max_limits value but instead use their own capped version of max_hw_sectors, leading to incorrect splits that later blow up in bio_split. We still need the bio_split_rw logic to count nr_segs for blk-mq code, so add a new wrapper that passes in the right limit, and turns any bio that would need a split into an error as an additional debugging aid. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Link: https://lore.kernel.org/r/20240826173820.1690925-4-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
queue_limits_max_zone_append_sectors doesn't change the lim argument, so mark it as const. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Link: https://lore.kernel.org/r/20240826173820.1690925-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The current setup with bio_may_exceed_limit and __bio_split_to_limits is a bit of a mess. Change it so that __bio_split_to_limits does all the work and is just a variant of bio_split_to_limits that returns nr_segs. This is done by inlining it and instead have the various bio_split_* helpers directly submit the potentially split bios. To support btrfs, the rw version has a lower level helper split out that just returns the offset to split. This turns out to nicely clean up the btrfs flow as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David Sterba <dsterba@suse.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Link: https://lore.kernel.org/r/20240826173820.1690925-2-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
- 28 Aug, 2024 3 commits
-
-
Song Liu authored
From Yu Kuai (with minor changes by Song Liu): The background is that currently bitmap is using a global spin_lock, causing lock contention and huge IO performance degradation for all raid levels. However, it's impossible to implement a new lock free bitmap with current situation that md-bitmap exposes the internal implementation with lots of exported apis. Hence bitmap_operations is invented, to describe bitmap core implementation, and a new bitmap can be introduced with a new bitmap_operations, we only need to switch to the new one during initialization. And with this we can build bitmap as kernel module, but that's not our concern for now. This version was tested with mdadm tests and lvm2 tests. This set does not introduce new errors in these tests. * md-6.12-bitmap: (42 commits) md/md-bitmap: make in memory structure internal md/md-bitmap: merge md_bitmap_enabled() into bitmap_operations md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations md/md-bitmap: merge md_bitmap_free() into bitmap_operations md/md-bitmap: merge md_bitmap_set_pages() into struct bitmap_operations md/md-bitmap: merge md_bitmap_copy_from_slot() into struct bitmap_operation. md/md-bitmap: merge get_bitmap_from_slot() into bitmap_operations md/md-bitmap: merge md_bitmap_resize() into bitmap_operations md/md-bitmap: pass in mddev directly for md_bitmap_resize() md/md-bitmap: merge md_bitmap_daemon_work() into bitmap_operations md/md-bitmap: merge bitmap_unplug() into bitmap_operations md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug() md/md-bitmap: merge md_bitmap_sync_with_cluster() into bitmap_operations md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations md/md-bitmap: remove the parameter 'aborted' for md_bitmap_end_sync() md/md-bitmap: merge md_bitmap_start_sync() into bitmap_operations md/md-bitmap: merge md_bitmap_endwrite() into bitmap_operations md/md-bitmap: merge md_bitmap_startwrite() into bitmap_operations ... Signed-off-by: Song Liu <song@kernel.org>
-
Md Haris Iqbal authored
The bio->bi_iter.bi_size is updated when bio_add_page() is called. So we do not need to assign msg->bi_size again to it, since its redudant and can also be harmful. Instead we can use it to add a sanity check, which checks the locally calculated bi_size, with the one sent in msg. Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://lore.kernel.org/r/20240809135346.978320-1-haris.iqbal@ionos.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Yu Kuai authored
For flush request, md has a special flush handling to merge concurrent flush request into single one, however, the whole mechanism is based on a disk level spin_lock 'mddev->lock'. And fsync can be called quite often in some user cases, for consequence, spin lock from IO fast path can cause performance degradation. Fortunately, the block layer already has flush handling to merge concurrent flush request, and it only acquires hctx level spin lock. (see details in blk-flush.c) This patch removes the flush handling in md, and converts to use general block layer flush handling in underlying disks. Flush test for 4 nvme raid10: start 128 threads to do fsync 100000 times, on arm64, see how long it takes. Test script: void* thread_func(void* arg) { int fd = *(int*)arg; for (int i = 0; i < FSYNC_COUNT; i++) { fsync(fd); } return NULL; } int main() { int fd = open("/dev/md0", O_RDWR); if (fd < 0) { perror("open"); exit(1); } pthread_t threads[THREADS]; struct timeval start, end; gettimeofday(&start, NULL); for (int i = 0; i < THREADS; i++) { pthread_create(&threads[i], NULL, thread_func, &fd); } for (int i = 0; i < THREADS; i++) { pthread_join(threads[i], NULL); } gettimeofday(&end, NULL); close(fd); long long elapsed = (end.tv_sec - start.tv_sec) * 1000000LL + (end.tv_usec - start.tv_usec); printf("Elapsed time: %lld microseconds\n", elapsed); return 0; } Test result: about 10 times faster: Before this patch: 50943374 microseconds After this patch: 5096347 microseconds Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240827110616.3860190-1-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
- 27 Aug, 2024 28 commits
-
-
Yu Kuai authored
Now that struct bitmap_page and bitmap is not used externally anymore, move them from md-bitmap.h to md-bitmap.c (expect that dm-raid is still using define marco 'COUNTER_MAX'). Also fix some checkpatch warnings. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-43-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-42-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-41-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible o invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-40-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
o that the implementation won't be exposed, and it'll be possible o invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-39-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-38-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-37-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-36-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
And move the condition "if (mddev->bitmap)" into md_bitmap_resize() as well, on the one hand make code cleaner, on the other hand try not to access bitmap directly. Since we are here, also change the parameter 'init' from int to bool. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-35-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-34-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-33-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
Add a parameter 'bool sync' to distinguish them, and md_bitmap_unplug_async() won't be exported anymore, hence bitmap_operations only need one op to cover them. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-32-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-31-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-30-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-29-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-28-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
For internal callers, aborted are always set to false, while for external callers, aborted are always set to true. Hence there is no need to always pass in true for exported api. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-27-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. Also fix lots of code style. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-26-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. And change the type of 'success' and 'behind' from int to bool. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-25-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. And change the type of 'behind' from int to bool. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-24-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. And while we're here, also fix coding style for bitmap_store(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-23-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-22-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
md_bitmap_setallbits() is not used, hence can be removed. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-21-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-20-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-19-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
md_bitmap_print_sb() is only used inside md-bitmap.c, hence make it static, also rename it to bitmap_print_sb. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-18-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-17-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-
Yu Kuai authored
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-16-yukuai1@huaweicloud.comSigned-off-by: Song Liu <song@kernel.org>
-