1. 06 Mar, 2024 34 commits
  2. 05 Mar, 2024 6 commits
    • Merge branch 'dmraid-fix-6.9' into md-6.9 · 3a889fdc
      Song Liu authored
      This is the second half of fixes for dmraid. The first half is available
      at [1].
      
      This set contains fixes:
       - reshape can start unexpectedly and cause data corruption (patches 1, 5, 6);
       - deadlocks when reshape runs concurrently with IO (patch 8);
       - a lockdep warning (patch 9).
      
      For all the dmraid related tests in the lvm2 suite, there are no new
      regressions compared against 6.6 kernels (which are a good baseline
      from before the recent regressions).
      
      [1] https://lore.kernel.org/all/CAPhsuW7u1UKHCDOBDhD7DzOVtkGemDz_QnJ4DUq_kSN-Q3G66Q@mail.gmail.com/
      
      * dmraid-fix-6.9:
        dm-raid: fix lockdep warning in "pers->hot_add_disk"
        dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape
        dm-raid: add a new helper prepare_suspend() in md_personality
        md/dm-raid: don't call md_reap_sync_thread() directly
        dm-raid: really frozen sync_thread during suspend
        md: add a new helper reshape_interrupted()
        md: export helper md_is_rdwr()
        md: export helpers to stop sync_thread
        md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume
    • dm-raid: fix lockdep warning in "pers->hot_add_disk" · 95009ae9
      Yu Kuai authored
      The lockdep assert was added by commit a448af25 ("md/raid10: remove
      rcu protection to access rdev from conf") in print_conf(), and I didn't
      notice that dm-raid calls "pers->hot_add_disk" without holding
      'reconfig_mutex'.
      
      "pers->hot_add_disk" reads and writes many fields that are protected by
      'reconfig_mutex', and raid_resume() already grabs the lock in other
      contexts. Hence fix this problem by protecting "pers->hot_add_disk"
      with the lock.
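
      The locking rule above can be illustrated with a small userspace
      sketch. This is not the kernel code: struct toy_mddev and every toy_*
      function are hypothetical names, and a plain bool stands in for
      'reconfig_mutex'. The checked callee plays the role of the lockdep
      assert, reporting an error instead of printing a warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mddev; only what the sketch needs. */
struct toy_mddev {
    bool reconfig_locked;  /* models holding 'reconfig_mutex' */
    int raid_disks;        /* a field the lock is meant to protect */
};

/* Models a callee guarded by a lockdep assert: refuses to touch
 * protected state unless the caller holds the lock. */
static int toy_hot_add_disk(struct toy_mddev *m)
{
    assert(m != NULL);
    if (!m->reconfig_locked)
        return -1;         /* caller forgot the lock */
    m->raid_disks++;
    return 0;
}

/* The buggy shape: the resume path calls the callee without the lock. */
static int toy_resume_unlocked(struct toy_mddev *m)
{
    return toy_hot_add_disk(m);
}

/* The fixed shape: take the lock around the call, then release it. */
static int toy_resume_locked(struct toy_mddev *m)
{
    int ret;

    m->reconfig_locked = true;
    ret = toy_hot_add_disk(m);
    m->reconfig_locked = false;
    return ret;
}
```

      In this toy model the unlocked path fails and leaves raid_disks
      untouched, while the locked path succeeds.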
      
      Fixes: 9092c02d ("DM RAID: Add ability to restore transiently failed devices on resume")
      Fixes: a448af25 ("md/raid10: remove rcu protection to access rdev from conf")
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
    • dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape · 41425f96
      Yu Kuai authored
      For raid456, if reshape is still in progress, IO across the reshape
      position will wait for reshape to make progress. However, for dm-raid,
      reshape will never make progress in the following cases, hence IO will
      hang:

      1) the array is read-only;
      2) MD_RECOVERY_WAIT is set;
      3) MD_RECOVERY_FROZEN is set.
      
      After commit c467e97f ("md/raid6: use valid sector values to determine
      if an I/O should wait on the reshape") fixed the problem that IO across
      the reshape position doesn't wait for reshape, the dm-raid test
      shell/lvconvert-raid-reshape.sh started to hang:
      
      [root@fedora ~]# cat /proc/979/stack
      [<0>] wait_woken+0x7d/0x90
      [<0>] raid5_make_request+0x929/0x1d70 [raid456]
      [<0>] md_handle_request+0xc2/0x3b0 [md_mod]
      [<0>] raid_map+0x2c/0x50 [dm_raid]
      [<0>] __map_bio+0x251/0x380 [dm_mod]
      [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
      [<0>] __submit_bio+0xc2/0x1c0
      [<0>] submit_bio_noacct_nocheck+0x17f/0x450
      [<0>] submit_bio_noacct+0x2bc/0x780
      [<0>] submit_bio+0x70/0xc0
      [<0>] mpage_readahead+0x169/0x1f0
      [<0>] blkdev_readahead+0x18/0x30
      [<0>] read_pages+0x7c/0x3b0
      [<0>] page_cache_ra_unbounded+0x1ab/0x280
      [<0>] force_page_cache_ra+0x9e/0x130
      [<0>] page_cache_sync_ra+0x3b/0x110
      [<0>] filemap_get_pages+0x143/0xa30
      [<0>] filemap_read+0xdc/0x4b0
      [<0>] blkdev_read_iter+0x75/0x200
      [<0>] vfs_read+0x272/0x460
      [<0>] ksys_read+0x7a/0x170
      [<0>] __x64_sys_read+0x1c/0x30
      [<0>] do_syscall_64+0xc6/0x230
      [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74
      
      This is because reshape can't make progress.
      
      For md/raid, the problem doesn't exist because registering a new
      sync_thread no longer relies on the IO being done:

      1) If the array is read-only, it can be switched to read-write by
         ioctl/sysfs;
      2) md/raid never sets MD_RECOVERY_WAIT;
      3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
         'reconfig_mutex', hence the flag can be cleared and reshape can
         continue through the sysfs API 'sync_action'.
      
      However, I'm not sure yet how to avoid the problem in dm-raid. On the
      one hand, this patch makes sure raid_message() can't change the
      sync_thread after presuspend(); on the other hand, it detects the
      above 3 cases before waiting for IO to be done in dm_suspend(), and
      lets dm-raid requeue that IO.
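
      The decision described above can be sketched in userspace as follows.
      This is only an illustration under stated assumptions: struct
      toy_array, enum toy_io_action and the toy_* functions are hypothetical
      names, not the helpers added by this patch. The sketch only encodes
      the rule that IO crossing the reshape position is requeued rather than
      left waiting when any of the three cases holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome for one IO request. */
enum toy_io_action { TOY_IO_SUBMIT, TOY_IO_WAIT, TOY_IO_REQUEUE };

/* Hypothetical snapshot of the array state relevant to the decision. */
struct toy_array {
    bool read_only;        /* case 1: the array is read-only */
    bool recovery_wait;    /* case 2: models MD_RECOVERY_WAIT */
    bool recovery_frozen;  /* case 3: models MD_RECOVERY_FROZEN */
    bool reshape_active;   /* a reshape is still in progress */
};

/* True when a reshape is pending but can never make progress. */
static bool toy_reshape_stalled(const struct toy_array *a)
{
    assert(a != NULL);
    return a->reshape_active &&
           (a->read_only || a->recovery_wait || a->recovery_frozen);
}

/* Decide what to do with an IO that may cross the reshape position. */
static enum toy_io_action toy_classify_io(const struct toy_array *a,
                                          bool crosses_reshape_pos)
{
    if (!a->reshape_active || !crosses_reshape_pos)
        return TOY_IO_SUBMIT;   /* nothing to wait for */
    if (toy_reshape_stalled(a))
        return TOY_IO_REQUEUE;  /* waiting would hang forever */
    return TOY_IO_WAIT;         /* reshape will progress; wait for it */
}
```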
      
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
    • dm-raid: add a new helper prepare_suspend() in md_personality · 5625ff8b
      Yu Kuai authored
      There are no functional changes for now; this prepares to fix a
      deadlock for dm-raid456.
      
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-8-yukuai1@huaweicloud.com
    • md/dm-raid: don't call md_reap_sync_thread() directly · cd32b27a
      Yu Kuai authored
      Currently md_reap_sync_thread() is called from raid_message() directly
      without holding 'reconfig_mutex'. This is definitely unsafe because
      md_reap_sync_thread() can change many fields that are protected by
      'reconfig_mutex'.
      
      However, holding 'reconfig_mutex' here is still problematic because it
      will cause a deadlock; see, for example, commit 130443d6 ("md: refactor
      idle/frozen_sync_thread() to fix deadlock").
      
      Fix this problem by using stop_sync_thread() to unregister sync_thread,
      as md/raid does.
      
      Fixes: be83651f ("DM RAID: Add message/status support for changing sync action")
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-7-yukuai1@huaweicloud.com
    • dm-raid: really frozen sync_thread during suspend · 16c4770c
      Yu Kuai authored
      1) commit f52f5c71 ("md: fix stopping sync thread") removed
         MD_RECOVERY_FROZEN from __md_stop_writes() without realizing that
         dm-raid relies on __md_stop_writes() to freeze sync_thread
         indirectly. Fix this problem by adding MD_RECOVERY_FROZEN in
         md_stop_writes(), and since stop_sync_thread() is only used for
         dm-raid in this case, also move stop_sync_thread() to
         md_stop_writes().
      2) The flag MD_RECOVERY_FROZEN doesn't mean that the sync thread is
         frozen; it only prevents a new sync_thread from starting, and it
         can't stop the running sync thread. In order to freeze sync_thread,
         stop_sync_thread() should be used after setting the flag.
      3) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, so
         using it as the condition for md_stop_writes() in raid_postsuspend()
         doesn't look correct. Since a reentrant stop_sync_thread() does
         nothing, always call md_stop_writes() in raid_postsuspend().
      4) raid_message() can set/clear the flag MD_RECOVERY_FROZEN at any time,
         and if MD_RECOVERY_FROZEN is cleared while the array is suspended, a
         new sync_thread can start unexpectedly. Fix this by disallowing
         raid_message() from changing sync_thread status during suspend.
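
      Points 2) and 4) can be modelled with a toy state machine. This is a
      sketch under stated assumptions, not the kernel implementation: struct
      toy_md and all toy_* functions are hypothetical, the bool fields only
      mimic MD_RECOVERY_FROZEN and a running sync thread, and returning
      -EBUSY for a message received during suspend is an illustrative
      choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state discussed above. */
struct toy_md {
    bool frozen;        /* models MD_RECOVERY_FROZEN */
    bool sync_running;  /* a sync thread is currently running */
    bool suspended;     /* the device is suspended */
};

/* Setting the flag alone does not stop an already running thread. */
static void toy_set_frozen_only(struct toy_md *m)
{
    assert(m != NULL);
    m->frozen = true;
}

/* The flag only prevents a new sync thread from starting. */
static bool toy_try_start_sync(struct toy_md *m)
{
    if (m->frozen || m->sync_running)
        return false;
    m->sync_running = true;
    return true;
}

/* Really freezing: set the flag, then stop the running thread.
 * Calling this again when nothing is running changes nothing. */
static void toy_freeze_sync(struct toy_md *m)
{
    m->frozen = true;
    m->sync_running = false;  /* models stop_sync_thread() */
}

/* Message handler: refuse to change the flag while suspended, so a
 * new sync thread cannot start unexpectedly. */
static int toy_message_set_frozen(struct toy_md *m, bool frozen)
{
    if (m->suspended)
        return -EBUSY;        /* illustrative error code */
    m->frozen = frozen;
    return 0;
}
```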
      
      Note that after commit f52f5c71 ("md: fix stopping sync thread"), the
      test shell/lvconvert-raid-reshape.sh started to hang in
      stop_sync_thread(). With the previous fixes the test no longer hangs
      there, but it still fails and complains that ext4 is corrupted. With
      this patch, the test no longer hangs in stop_sync_thread() or fails
      due to ext4 corruption. However, there is still a deadlock related to
      dm-raid456 that will be fixed in the following patches.

      Reported-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/
      Fixes: 1af2048a ("dm raid: fix deadlock caused by premature md_stop_writes()")
      Fixes: 9dbd1aa3 ("dm raid: add reshaping support to the target")
      Fixes: f52f5c71 ("md: fix stopping sync thread")
      Cc: stable@vger.kernel.org # v6.7+
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20240305072306.2562024-6-yukuai1@huaweicloud.com