Commit 9164e4a5 authored by Song Liu's avatar Song Liu

Merge branch 'md-suspend-rewrite' into md-next

From Yu Kuai, written by Song Liu

Recent tests with raid10 revealed many issues in scenarios where the
following happen concurrently:

- adding or removing disks from the array
- issuing IO to the array

At first, we fixed each problem independently, accepting that IO can run
concurrently with array reconfiguration. However, as more issues kept
being reported, I am hoping to fix these problems thoroughly.

Consider how the block layer protects IO against queue reconfiguration
(for example, an elevator change):

blk_mq_freeze_queue
-> wait for all IO to be done, and prevent new IO from being dispatched
// reconfiguration
blk_mq_unfreeze_queue

I think we can do something similar to synchronize IO with array
reconfiguration.

The current synchronization works as follows. For a reconfiguration
operation:

1. Hold 'reconfig_mutex';
2. Check that the rdev can be added/removed; one condition is that there
   is no IO on it (for example, check nr_pending).
3. Do the actual operations to add/remove the rdev; one step is to
   set/clear a pointer to the rdev.
4. Check whether there is still no IO on this rdev; if IO has appeared,
   revert the change.

The IO path uses rcu_read_lock/unlock() to access rdevs. This scheme has
several problems:

- RCU is used incorrectly;
- there are many places where an old rdev can be read, and not all of
  them handle the old value correctly;
- between steps 3 and 4, if new IO is dispatched, NULL will be read for
  the rdev, and data will be lost if step 4 fails.
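The racy window between steps 3 and 4 can be sketched in plain userspace
C. All names below are illustrative stand-ins, not the kernel's symbols:
`nr_pending` plays the role of rdev->nr_pending, and `slot` the rdev
pointer that IO dereferences. A reader that runs between the clear and
the re-check observes NULL:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace sketch of the old check/modify/recheck scheme. */
static atomic_int nr_pending;	/* stand-in for rdev->nr_pending */
static void *slot;		/* stand-in for the rdev pointer IO reads */

static int old_remove_rdev(void *rdev)
{
	if (atomic_load(&nr_pending) != 0)	/* step 2: refuse while IO is pending */
		return -1;
	slot = NULL;				/* step 3: clear the pointer */
	if (atomic_load(&nr_pending) != 0) {	/* step 4: IO raced in, revert */
		slot = rdev;
		return -1;
	}
	return 0;				/* removal committed */
}
```

Any IO that reads `slot` between step 3 and the revert in step 4 sees
NULL and skips the device, which is the lost-write window described
above.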

The new synchronization is similar to blk_mq_freeze_queue(). To add or
remove a disk:

1. Suspend the array: stop new IO from being dispatched and wait for
   in-flight IO to finish.
2. Add or remove rdevs from the array.
3. Resume the array.

The IO path doesn't need to change for now, and all the RCU usage can
be removed.
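As a rough model of these freeze semantics, here is a userspace sketch
under the assumption that a single atomic counter can stand in for the
percpu ref 'active_io' (the real code sleeps on a wait queue instead of
spinning, and nests suspend calls):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal userspace model of suspend/resume freeze semantics. */
static atomic_int active_io;	/* stand-in for mddev->active_io */
static atomic_bool frozen;	/* stand-in for the killed percpu ref */

static bool io_start(void)	/* IO path: refuse while the array is suspended */
{
	if (atomic_load(&frozen))
		return false;
	atomic_fetch_add(&active_io, 1);
	return true;
}

static void io_end(void)
{
	atomic_fetch_sub(&active_io, 1);
}

static void suspend(void)	/* step 1: block new IO, then drain in-flight IO */
{
	atomic_store(&frozen, true);
	while (atomic_load(&active_io) > 0)
		;		/* the real code sleeps on mddev->sb_wait */
}

static void resume(void)	/* step 3: let IO flow again */
{
	atomic_store(&frozen, false);
}
```

Reconfiguration (step 2) then runs between suspend() and resume() with a
guarantee that no IO is in flight.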

The main work is divided into three steps.

First, make sure the new APIs to suspend the array are general:

- make sure suspending the array waits for IO to be done (done by [1]);
- make sure the array can be suspended for all personalities (done by [2]);
- make sure the array can be suspended at any time (done by [3]);
- make sure suspending the array doesn't rely on 'reconfig_mutex' (PATCH 3-5);

Second, replace the old APIs with the new ones (PATCH 6-16). Specifically,
the synchronization is changed from:

  lock reconfig_mutex
  suspend array
  make changes
  resume array
  unlock reconfig_mutex

to:

  suspend array
  lock reconfig_mutex
  make changes
  unlock reconfig_mutex
  resume array
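The reordered pattern maps onto the helper pairs named in the shortlog,
mddev_suspend_and_lock()/mddev_unlock_and_resume(). A hypothetical
userspace sketch of that pairing (`suspend_depth` stands in for
'mddev->suspended'; the real suspend also drains IO):

```c
#include <pthread.h>

/* Userspace sketch: suspend strictly before taking reconfig_mutex, so
 * waiting for IO can never deadlock against a context that needs the
 * mutex to update the superblock. */
static pthread_mutex_t reconfig_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t suspend_mutex = PTHREAD_MUTEX_INITIALIZER;
static int suspend_depth;	/* stand-in for mddev->suspended */

static void mddev_suspend_sketch(void)
{
	pthread_mutex_lock(&suspend_mutex);
	suspend_depth++;	/* the first caller would also drain IO here */
	pthread_mutex_unlock(&suspend_mutex);
}

static void mddev_resume_sketch(void)
{
	pthread_mutex_lock(&suspend_mutex);
	suspend_depth--;
	pthread_mutex_unlock(&suspend_mutex);
}

static void suspend_and_lock(void)
{
	mddev_suspend_sketch();			/* may sleep, mutex not held */
	pthread_mutex_lock(&reconfig_mutex);
}

static void unlock_and_resume(void)
{
	pthread_mutex_unlock(&reconfig_mutex);
	mddev_resume_sketch();
}
```

Note the symmetry: the unlock helper releases in the reverse order of
the lock helper, so reconfig_mutex is never held while waiting for IO.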

Finally, for the remaining paths that involve reconfiguration, suspend
the array first (PATCH 11,12, [4] and PATCH 17).

Preparatory work:
[1] https://lore.kernel.org/all/20230621165110.1498313-1-yukuai1@huaweicloud.com/
[2] https://lore.kernel.org/all/20230628012931.88911-2-yukuai1@huaweicloud.com/
[3] https://lore.kernel.org/all/20230825030956.1527023-1-yukuai1@huaweicloud.com/
[4] https://lore.kernel.org/all/20230825031622.1530464-1-yukuai1@huaweicloud.com/

* md-suspend-rewrite:
  md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
  md: remove old apis to suspend the array
  md: suspend array in md_start_sync() if array need reconfiguration
  md/raid5: replace suspend with quiesce() callback
  md/md-linear: cleanup linear_add()
  md: cleanup mddev_create/destroy_serial_pool()
  md: use new apis to suspend array before mddev_create/destroy_serial_pool
  md: use new apis to suspend array for ioctls involed array reconfiguration
  md: use new apis to suspend array for adding/removing rdev from state_store()
  md: use new apis to suspend array for sysfs apis
  md/raid5: use new apis to suspend array
  md/raid5-cache: use new apis to suspend array
  md/md-bitmap: use new apis to suspend array for location_store()
  md/dm-raid: use new apis to suspend array
  md: add new helpers to suspend/resume and lock/unlock array
  md: add new helpers to suspend/resume array
  md: replace is_md_suspended() with 'mddev->suspended' in md_check_recovery()
  md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
  md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
parents 9e55a22f 2b16a525
@@ -3244,7 +3244,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);

 	/* Has to be held on running the array */
-	mddev_lock_nointr(&rs->md);
+	mddev_suspend_and_lock_nointr(&rs->md);
 	r = md_run(&rs->md);
 	rs->md.in_sync = 0; /* Assume already marked dirty */
 	if (r) {
@@ -3268,7 +3268,6 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		}
 	}

-	mddev_suspend(&rs->md);
 	set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags);

 	/* Try to adjust the raid4/5/6 stripe cache size to the stripe size */
@@ -3798,9 +3797,7 @@ static void raid_postsuspend(struct dm_target *ti)
 		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
 			md_stop_writes(&rs->md);

-		mddev_lock_nointr(&rs->md);
-		mddev_suspend(&rs->md);
-		mddev_unlock(&rs->md);
+		mddev_suspend(&rs->md, false);
 	}
 }
@@ -4059,8 +4056,7 @@ static void raid_resume(struct dm_target *ti)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		mddev->ro = 0;
 		mddev->in_sync = 0;
-		mddev_resume(mddev);
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 	}
 }
...
@@ -175,7 +175,7 @@ static void __init md_setup_drive(struct md_setup_args *args)
 		return;
 	}

-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err) {
 		pr_err("md: failed to lock array %s\n", name);
 		goto out_mddev_put;
@@ -221,7 +221,7 @@ static void __init md_setup_drive(struct md_setup_args *args)
 	if (err)
 		pr_warn("md: starting %s failed\n", name);
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 out_mddev_put:
 	mddev_put(mddev);
 }
...
@@ -1861,7 +1861,7 @@ void md_bitmap_destroy(struct mddev *mddev)
 	md_bitmap_wait_behind_writes(mddev);
 	if (!mddev->serialize_policy)
-		mddev_destroy_serial_pool(mddev, NULL, true);
+		mddev_destroy_serial_pool(mddev, NULL);

 	mutex_lock(&mddev->bitmap_info.mutex);
 	spin_lock(&mddev->lock);
@@ -1977,7 +1977,7 @@ int md_bitmap_load(struct mddev *mddev)
 		goto out;

 	rdev_for_each(rdev, mddev)
-		mddev_create_serial_pool(mddev, rdev, true);
+		mddev_create_serial_pool(mddev, rdev);

 	if (mddev_is_clustered(mddev))
 		md_cluster_ops->load_bitmaps(mddev, mddev->bitmap_info.nodes);
@@ -2348,11 +2348,10 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 {
 	int rv;

-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;

-	mddev_suspend(mddev);
 	if (mddev->pers) {
 		if (mddev->recovery || mddev->sync_thread) {
 			rv = -EBUSY;
@@ -2429,8 +2428,7 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 	}
 	rv = 0;
 out:
-	mddev_resume(mddev);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	if (rv)
 		return rv;
 	return len;
@@ -2539,7 +2537,7 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
 	if (backlog > COUNTER_MAX)
 		return -EINVAL;

-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
@@ -2564,16 +2562,16 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
 	if (!backlog && mddev->serial_info_pool) {
 		/* serial_info_pool is not needed if backlog is zero */
 		if (!mddev->serialize_policy)
-			mddev_destroy_serial_pool(mddev, NULL, false);
+			mddev_destroy_serial_pool(mddev, NULL);
 	} else if (backlog && !mddev->serial_info_pool) {
 		/* serial_info_pool is needed since backlog is not zero */
 		rdev_for_each(rdev, mddev)
-			mddev_create_serial_pool(mddev, rdev, false);
+			mddev_create_serial_pool(mddev, rdev);
 	}
 	if (old_mwb != backlog)
 		md_bitmap_update_sb(mddev->bitmap);

-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return len;
 }
...
@@ -183,7 +183,6 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
 	 * in linear_congested(), therefore kfree_rcu() is used to free
 	 * oldconf until no one uses it anymore.
 	 */
-	mddev_suspend(mddev);
 	oldconf = rcu_dereference_protected(mddev->private,
 			lockdep_is_held(&mddev->reconfig_mutex));
 	mddev->raid_disks++;
@@ -192,7 +191,6 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
 	rcu_assign_pointer(mddev->private, newconf);
 	md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
 	set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
-	mddev_resume(mddev);
 	kfree_rcu(oldconf, rcu);
 	return 0;
 }
...
@@ -206,8 +206,7 @@ static int rdev_need_serial(struct md_rdev *rdev)
  * 1. rdev is the first device which return true from rdev_enable_serial.
  * 2. rdev is NULL, means we want to enable serialization for all rdevs.
  */
-void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-			      bool is_suspend)
+void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev)
 {
 	int ret = 0;
@@ -215,15 +214,12 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 	    !test_bit(CollisionCheck, &rdev->flags))
 		return;

-	if (!is_suspend)
-		mddev_suspend(mddev);
-
 	if (!rdev)
 		ret = rdevs_init_serial(mddev);
 	else
 		ret = rdev_init_serial(rdev);
 	if (ret)
-		goto abort;
+		return;

 	if (mddev->serial_info_pool == NULL) {
 		/*
@@ -238,10 +234,6 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 			pr_err("can't alloc memory pool for serialization\n");
 		}
 	}
-
-abort:
-	if (!is_suspend)
-		mddev_resume(mddev);
 }

 /*
@@ -250,8 +242,7 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
  * 2. when bitmap is destroyed while policy is not enabled.
  * 3. for disable policy, the pool is destroyed only when no rdev needs it.
  */
-void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-			       bool is_suspend)
+void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev)
 {
 	if (rdev && !test_bit(CollisionCheck, &rdev->flags))
 		return;
@@ -260,8 +251,6 @@ void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 		struct md_rdev *temp;
 		int num = 0; /* used to track if other rdevs need the pool */

-		if (!is_suspend)
-			mddev_suspend(mddev);
 		rdev_for_each(temp, mddev) {
 			if (!rdev) {
 				if (!mddev->serialize_policy ||
@@ -283,8 +272,6 @@ void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 			mempool_destroy(mddev->serial_info_pool);
 			mddev->serial_info_pool = NULL;
 		}
-		if (!is_suspend)
-			mddev_resume(mddev);
 	}
 }
@@ -359,11 +346,11 @@ static bool is_suspended(struct mddev *mddev, struct bio *bio)
 		return true;
 	if (bio_data_dir(bio) != WRITE)
 		return false;
-	if (mddev->suspend_lo >= mddev->suspend_hi)
+	if (READ_ONCE(mddev->suspend_lo) >= READ_ONCE(mddev->suspend_hi))
 		return false;
-	if (bio->bi_iter.bi_sector >= mddev->suspend_hi)
+	if (bio->bi_iter.bi_sector >= READ_ONCE(mddev->suspend_hi))
 		return false;
-	if (bio_end_sector(bio) < mddev->suspend_lo)
+	if (bio_end_sector(bio) < READ_ONCE(mddev->suspend_lo))
 		return false;
 	return true;
 }
@@ -431,42 +418,73 @@ static void md_submit_bio(struct bio *bio)
 	md_handle_request(mddev, bio);
 }

-/* mddev_suspend makes sure no new requests are submitted
- * to the device, and that any requests that have been submitted
- * are completely handled.
- * Once mddev_detach() is called and completes, the module will be
- * completely unused.
+/*
+ * Make sure no new requests are submitted to the device, and any requests that
+ * have been submitted are completely handled.
 */
-void mddev_suspend(struct mddev *mddev)
+int mddev_suspend(struct mddev *mddev, bool interruptible)
 {
-	struct md_thread *thread = rcu_dereference_protected(mddev->thread,
-			lockdep_is_held(&mddev->reconfig_mutex));
-
-	WARN_ON_ONCE(thread && current == thread->tsk);
-	if (mddev->suspended++)
-		return;
-	wake_up(&mddev->sb_wait);
-	set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
-	percpu_ref_kill(&mddev->active_io);
+	int err = 0;
+
+	/*
+	 * hold reconfig_mutex to wait for normal io will deadlock, because
+	 * other context can't update super_block, and normal io can rely on
+	 * updating super_block.
+	 */
+	lockdep_assert_not_held(&mddev->reconfig_mutex);
+
+	if (interruptible)
+		err = mutex_lock_interruptible(&mddev->suspend_mutex);
+	else
+		mutex_lock(&mddev->suspend_mutex);
+	if (err)
+		return err;
+
+	if (mddev->suspended) {
+		WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
+		mutex_unlock(&mddev->suspend_mutex);
+		return 0;
+	}

-	if (mddev->pers && mddev->pers->prepare_suspend)
-		mddev->pers->prepare_suspend(mddev);
+	percpu_ref_kill(&mddev->active_io);
+	if (interruptible)
+		err = wait_event_interruptible(mddev->sb_wait,
+				percpu_ref_is_zero(&mddev->active_io));
+	else
+		wait_event(mddev->sb_wait,
+				percpu_ref_is_zero(&mddev->active_io));
+	if (err) {
+		percpu_ref_resurrect(&mddev->active_io);
+		mutex_unlock(&mddev->suspend_mutex);
+		return err;
+	}

-	wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
-	clear_bit_unlock(MD_ALLOW_SB_UPDATE, &mddev->flags);
-	wait_event(mddev->sb_wait, !test_bit(MD_UPDATING_SB, &mddev->flags));
+	/*
+	 * For raid456, io might be waiting for reshape to make progress,
+	 * allow new reshape to start while waiting for io to be done to
+	 * prevent deadlock.
+	 */
+	WRITE_ONCE(mddev->suspended, mddev->suspended + 1);

 	del_timer_sync(&mddev->safemode_timer);
 	/* restrict memory reclaim I/O during raid array is suspend */
 	mddev->noio_flag = memalloc_noio_save();
+
+	mutex_unlock(&mddev->suspend_mutex);
+	return 0;
 }
 EXPORT_SYMBOL_GPL(mddev_suspend);

 void mddev_resume(struct mddev *mddev)
 {
-	lockdep_assert_held(&mddev->reconfig_mutex);
-	if (--mddev->suspended)
+	lockdep_assert_not_held(&mddev->reconfig_mutex);
+
+	mutex_lock(&mddev->suspend_mutex);
+	WRITE_ONCE(mddev->suspended, mddev->suspended - 1);
+	if (mddev->suspended) {
+		mutex_unlock(&mddev->suspend_mutex);
 		return;
+	}

 	/* entred the memalloc scope from mddev_suspend() */
 	memalloc_noio_restore(mddev->noio_flag);
@@ -477,6 +495,8 @@ void mddev_resume(struct mddev *mddev)
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
 	md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
+
+	mutex_unlock(&mddev->suspend_mutex);
 }
 EXPORT_SYMBOL_GPL(mddev_resume);
@@ -672,6 +692,7 @@ int mddev_init(struct mddev *mddev)
 	mutex_init(&mddev->open_mutex);
 	mutex_init(&mddev->reconfig_mutex);
 	mutex_init(&mddev->sync_mutex);
+	mutex_init(&mddev->suspend_mutex);
 	mutex_init(&mddev->bitmap_info.mutex);
 	INIT_LIST_HEAD(&mddev->disks);
 	INIT_LIST_HEAD(&mddev->all_mddevs);
@@ -2459,7 +2480,7 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 	pr_debug("md: bind<%s>\n", b);

 	if (mddev->raid_disks)
-		mddev_create_serial_pool(mddev, rdev, false);
+		mddev_create_serial_pool(mddev, rdev);

 	if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
 		goto fail;
@@ -2512,7 +2533,7 @@ static void md_kick_rdev_from_array(struct md_rdev *rdev)
 	bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk);
 	list_del_rcu(&rdev->same_set);
 	pr_debug("md: unbind<%pg>\n", rdev->bdev);
-	mddev_destroy_serial_pool(rdev->mddev, rdev, false);
+	mddev_destroy_serial_pool(rdev->mddev, rdev);
 	rdev->mddev = NULL;
 	sysfs_remove_link(&rdev->kobj, "block");
 	sysfs_put(rdev->sysfs_state);
@@ -2842,11 +2863,7 @@ static int add_bound_rdev(struct md_rdev *rdev)
 		 */
 		super_types[mddev->major_version].
 			validate_super(mddev, rdev);
-		if (add_journal)
-			mddev_suspend(mddev);
 		err = mddev->pers->hot_add_disk(mddev, rdev);
-		if (add_journal)
-			mddev_resume(mddev);
 		if (err) {
 			md_kick_rdev_from_array(rdev);
 			return err;
@@ -2983,11 +3000,11 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 		}
 	} else if (cmd_match(buf, "writemostly")) {
 		set_bit(WriteMostly, &rdev->flags);
-		mddev_create_serial_pool(rdev->mddev, rdev, false);
+		mddev_create_serial_pool(rdev->mddev, rdev);
 		need_update_sb = true;
 		err = 0;
 	} else if (cmd_match(buf, "-writemostly")) {
-		mddev_destroy_serial_pool(rdev->mddev, rdev, false);
+		mddev_destroy_serial_pool(rdev->mddev, rdev);
 		clear_bit(WriteMostly, &rdev->flags);
 		need_update_sb = true;
 		err = 0;
@@ -3599,6 +3616,7 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 	struct rdev_sysfs_entry *entry = container_of(attr, struct rdev_sysfs_entry, attr);
 	struct md_rdev *rdev = container_of(kobj, struct md_rdev, kobj);
 	struct kernfs_node *kn = NULL;
+	bool suspend = false;
 	ssize_t rv;
 	struct mddev *mddev = rdev->mddev;
@@ -3606,17 +3624,25 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 		return -EIO;
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
+	if (!mddev)
+		return -ENODEV;

-	if (entry->store == state_store && cmd_match(page, "remove"))
-		kn = sysfs_break_active_protection(kobj, attr);
+	if (entry->store == state_store) {
+		if (cmd_match(page, "remove"))
+			kn = sysfs_break_active_protection(kobj, attr);
+		if (cmd_match(page, "remove") || cmd_match(page, "re-add") ||
+		    cmd_match(page, "writemostly") ||
+		    cmd_match(page, "-writemostly"))
+			suspend = true;
+	}

-	rv = mddev ? mddev_lock(mddev) : -ENODEV;
+	rv = suspend ? mddev_suspend_and_lock(mddev) : mddev_lock(mddev);
 	if (!rv) {
 		if (rdev->mddev == NULL)
 			rv = -ENODEV;
 		else
 			rv = entry->store(rdev, page, length);
-		mddev_unlock(mddev);
+		suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
 	}

 	if (kn)
@@ -3921,7 +3947,7 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	if (slen == 0 || slen >= sizeof(clevel))
 		return -EINVAL;

-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
@@ -4014,7 +4040,6 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	}

 	/* Looks like we have a winner */
-	mddev_suspend(mddev);
 	mddev_detach(mddev);

 	spin_lock(&mddev->lock);
@@ -4100,14 +4125,13 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 		blk_set_stacking_limits(&mddev->queue->limits);
 	pers->run(mddev);
 	set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
-	mddev_resume(mddev);
 	if (!mddev->thread)
 		md_update_sb(mddev, 1);
 	sysfs_notify_dirent_safe(mddev->sysfs_level);
 	md_new_event();
 	rv = len;
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return rv;
 }
@@ -4585,7 +4609,7 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
 	    minor != MINOR(dev))
 		return -EOVERFLOW;

-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	if (mddev->persistent) {
@@ -4606,14 +4630,14 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
 		rdev = md_import_device(dev, -1, -1);

 	if (IS_ERR(rdev)) {
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 		return PTR_ERR(rdev);
 	}
 	err = bind_rdev_to_array(rdev, mddev);
 out:
 	if (err)
 		export_rdev(rdev, mddev);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	if (!err)
 		md_new_event();
 	return err ? err : len;
@@ -5179,7 +5203,8 @@ __ATTR(sync_max, S_IRUGO|S_IWUSR, max_sync_show, max_sync_store);
 static ssize_t
 suspend_lo_show(struct mddev *mddev, char *page)
 {
-	return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_lo);
+	return sprintf(page, "%llu\n",
+		       (unsigned long long)READ_ONCE(mddev->suspend_lo));
 }

 static ssize_t
@@ -5194,15 +5219,13 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;

-	err = mddev_lock(mddev);
+	err = mddev_suspend(mddev, true);
 	if (err)
 		return err;

-	mddev_suspend(mddev);
-	mddev->suspend_lo = new;
+	WRITE_ONCE(mddev->suspend_lo, new);
 	mddev_resume(mddev);
-	mddev_unlock(mddev);

 	return len;
 }
 static struct md_sysfs_entry md_suspend_lo =
@@ -5211,7 +5234,8 @@ __ATTR(suspend_lo, S_IRUGO|S_IWUSR, suspend_lo_show, suspend_lo_store);
 static ssize_t
 suspend_hi_show(struct mddev *mddev, char *page)
 {
-	return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_hi);
+	return sprintf(page, "%llu\n",
+		       (unsigned long long)READ_ONCE(mddev->suspend_hi));
 }

 static ssize_t
@@ -5226,15 +5250,13 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;

-	err = mddev_lock(mddev);
+	err = mddev_suspend(mddev, true);
 	if (err)
 		return err;

-	mddev_suspend(mddev);
-	mddev->suspend_hi = new;
+	WRITE_ONCE(mddev->suspend_hi, new);
 	mddev_resume(mddev);
-	mddev_unlock(mddev);

 	return len;
 }
 static struct md_sysfs_entry md_suspend_hi =
@@ -5482,7 +5504,7 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 	if (value == mddev->serialize_policy)
 		return len;

-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	if (mddev->pers == NULL || (mddev->pers->level != 1)) {
@@ -5491,15 +5513,13 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 		goto unlock;
 	}

-	mddev_suspend(mddev);
 	if (value)
-		mddev_create_serial_pool(mddev, NULL, true);
+		mddev_create_serial_pool(mddev, NULL);
 	else
-		mddev_destroy_serial_pool(mddev, NULL, true);
+		mddev_destroy_serial_pool(mddev, NULL);
 	mddev->serialize_policy = value;
-	mddev_resume(mddev);
 unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
@@ -6262,7 +6282,7 @@ static void __md_stop_writes(struct mddev *mddev)
 	}

 	/* disable policy to guarantee rdevs free resources for serialization */
 	mddev->serialize_policy = 0;
-	mddev_destroy_serial_pool(mddev, NULL, true);
+	mddev_destroy_serial_pool(mddev, NULL);
 }

 void md_stop_writes(struct mddev *mddev)
@@ -6554,13 +6574,13 @@ static void autorun_devices(int part)
 		if (IS_ERR(mddev))
 			break;

-		if (mddev_lock(mddev))
+		if (mddev_suspend_and_lock(mddev))
 			pr_warn("md: %s locked, cannot run\n", mdname(mddev));
 		else if (mddev->raid_disks || mddev->major_version
 			 || !list_empty(&mddev->disks)) {
 			pr_warn("md: %s already running, cannot run %pg\n",
 				mdname(mddev), rdev0->bdev);
-			mddev_unlock(mddev);
+			mddev_unlock_and_resume(mddev);
 		} else {
 			pr_debug("md: created %s\n", mdname(mddev));
 			mddev->persistent = 1;
@@ -6570,7 +6590,7 @@ static void autorun_devices(int part)
 					export_rdev(rdev, mddev);
 			}
 			autorun_array(mddev);
-			mddev_unlock(mddev);
+			mddev_unlock_and_resume(mddev);
 		}
 		/* on success, candidates will be empty, on error
 		 * it won't...
@@ -7120,7 +7140,6 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 			struct bitmap *bitmap;

 			bitmap = md_bitmap_create(mddev, -1);
-			mddev_suspend(mddev);
 			if (!IS_ERR(bitmap)) {
 				mddev->bitmap = bitmap;
 				err = md_bitmap_load(mddev);
@@ -7130,11 +7149,8 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 				md_bitmap_destroy(mddev);
 				fd = -1;
 			}
-			mddev_resume(mddev);
 		} else if (fd < 0) {
-			mddev_suspend(mddev);
 			md_bitmap_destroy(mddev);
-			mddev_resume(mddev);
 		}
 	}
 	if (fd < 0) {
@@ -7423,7 +7439,6 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 			mddev->bitmap_info.space =
 				mddev->bitmap_info.default_space;
 			bitmap = md_bitmap_create(mddev, -1);
-			mddev_suspend(mddev);
 			if (!IS_ERR(bitmap)) {
 				mddev->bitmap = bitmap;
 				rv = md_bitmap_load(mddev);
@@ -7431,7 +7446,6 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 				rv = PTR_ERR(bitmap);
 			if (rv)
 				md_bitmap_destroy(mddev);
-			mddev_resume(mddev);
 		} else {
 			/* remove the bitmap */
 			if (!mddev->bitmap) {
@@ -7456,9 +7470,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
module_put(md_cluster_mod); module_put(md_cluster_mod);
mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY; mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY;
} }
mddev_suspend(mddev);
md_bitmap_destroy(mddev); md_bitmap_destroy(mddev);
mddev_resume(mddev);
mddev->bitmap_info.offset = 0; mddev->bitmap_info.offset = 0;
} }
} }
...@@ -7529,6 +7541,20 @@ static inline bool md_ioctl_valid(unsigned int cmd) ...@@ -7529,6 +7541,20 @@ static inline bool md_ioctl_valid(unsigned int cmd)
} }
} }
static bool md_ioctl_need_suspend(unsigned int cmd)
{
switch (cmd) {
case ADD_NEW_DISK:
case HOT_ADD_DISK:
case HOT_REMOVE_DISK:
case SET_BITMAP_FILE:
case SET_ARRAY_INFO:
return true;
default:
return false;
}
}
static int __md_set_array_info(struct mddev *mddev, void __user *argp) static int __md_set_array_info(struct mddev *mddev, void __user *argp)
{ {
mdu_array_info_t info; mdu_array_info_t info;
...@@ -7661,7 +7687,8 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode, ...@@ -7661,7 +7687,8 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
if (!md_is_rdwr(mddev)) if (!md_is_rdwr(mddev))
flush_work(&mddev->sync_work); flush_work(&mddev->sync_work);
err = mddev_lock(mddev); err = md_ioctl_need_suspend(cmd) ? mddev_suspend_and_lock(mddev) :
mddev_lock(mddev);
if (err) { if (err) {
pr_debug("md: ioctl lock interrupted, reason %d, cmd %d\n", pr_debug("md: ioctl lock interrupted, reason %d, cmd %d\n",
err, cmd); err, cmd);
...@@ -7789,7 +7816,10 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode, ...@@ -7789,7 +7816,10 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
if (mddev->hold_active == UNTIL_IOCTL && if (mddev->hold_active == UNTIL_IOCTL &&
err != -EINVAL) err != -EINVAL)
mddev->hold_active = 0; mddev->hold_active = 0;
md_ioctl_need_suspend(cmd) ? mddev_unlock_and_resume(mddev) :
mddev_unlock(mddev); mddev_unlock(mddev);
out: out:
if(did_set_md_closing) if(did_set_md_closing)
clear_bit(MD_CLOSING, &mddev->flags); clear_bit(MD_CLOSING, &mddev->flags);
...@@ -9323,7 +9353,12 @@ static void md_start_sync(struct work_struct *ws) ...@@ -9323,7 +9353,12 @@ static void md_start_sync(struct work_struct *ws)
{ {
struct mddev *mddev = container_of(ws, struct mddev, sync_work); struct mddev *mddev = container_of(ws, struct mddev, sync_work);
int spares = 0; int spares = 0;
bool suspend = false;
if (md_spares_need_change(mddev))
suspend = true;
suspend ? mddev_suspend_and_lock_nointr(mddev) :
mddev_lock_nointr(mddev); mddev_lock_nointr(mddev);
if (!md_is_rdwr(mddev)) { if (!md_is_rdwr(mddev)) {
...@@ -9360,7 +9395,7 @@ static void md_start_sync(struct work_struct *ws) ...@@ -9360,7 +9395,7 @@ static void md_start_sync(struct work_struct *ws)
goto not_running; goto not_running;
} }
mddev_unlock(mddev); suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
md_wakeup_thread(mddev->sync_thread); md_wakeup_thread(mddev->sync_thread);
sysfs_notify_dirent_safe(mddev->sysfs_action); sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event(); md_new_event();
...@@ -9372,7 +9407,7 @@ static void md_start_sync(struct work_struct *ws) ...@@ -9372,7 +9407,7 @@ static void md_start_sync(struct work_struct *ws)
clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery); clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery); clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
mddev_unlock(mddev); suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
wake_up(&resync_wait); wake_up(&resync_wait);
if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
...@@ -9404,19 +9439,7 @@ static void md_start_sync(struct work_struct *ws) ...@@ -9404,19 +9439,7 @@ static void md_start_sync(struct work_struct *ws)
*/ */
void md_check_recovery(struct mddev *mddev) void md_check_recovery(struct mddev *mddev)
{ {
if (test_bit(MD_ALLOW_SB_UPDATE, &mddev->flags) && mddev->sb_flags) { if (READ_ONCE(mddev->suspended))
/* Write superblock - thread that called mddev_suspend()
* holds reconfig_mutex for us.
*/
set_bit(MD_UPDATING_SB, &mddev->flags);
smp_mb__after_atomic();
if (test_bit(MD_ALLOW_SB_UPDATE, &mddev->flags))
md_update_sb(mddev, 0);
clear_bit_unlock(MD_UPDATING_SB, &mddev->flags);
wake_up(&mddev->sb_wait);
}
if (is_md_suspended(mddev))
return; return;
if (mddev->bitmap) if (mddev->bitmap)
......
@@ -248,10 +248,6 @@ struct md_cluster_info;
  *			become failed.
  * @MD_HAS_PPL:		The raid array has PPL feature set.
  * @MD_HAS_MULTIPLE_PPLS: The raid array has multiple PPLs feature set.
- * @MD_ALLOW_SB_UPDATE: md_check_recovery is allowed to update the metadata
- *			without taking reconfig_mutex.
- * @MD_UPDATING_SB: md_check_recovery is updating the metadata without
- *			explicitly holding reconfig_mutex.
  * @MD_NOT_READY: do_md_run() is active, so 'array_state' must not report that
  *		   array is ready yet.
  * @MD_BROKEN: This is used to stop writes and mark array as failed.
@@ -268,8 +264,6 @@ enum mddev_flags {
 	MD_FAILFAST_SUPPORTED,
 	MD_HAS_PPL,
 	MD_HAS_MULTIPLE_PPLS,
-	MD_ALLOW_SB_UPDATE,
-	MD_UPDATING_SB,
 	MD_NOT_READY,
 	MD_BROKEN,
 	MD_DELETED,
@@ -316,6 +310,7 @@ struct mddev {
 	unsigned long			sb_flags;
 
 	int				suspended;
+	struct mutex			suspend_mutex;
 	struct percpu_ref		active_io;
 	int				ro;
 	int				sysfs_active; /* set when sysfs deletes
@@ -809,15 +804,14 @@ extern int md_rdev_init(struct md_rdev *rdev);
 extern void md_rdev_clear(struct md_rdev *rdev);
 
 extern void md_handle_request(struct mddev *mddev, struct bio *bio);
-extern void mddev_suspend(struct mddev *mddev);
+extern int mddev_suspend(struct mddev *mddev, bool interruptible);
 extern void mddev_resume(struct mddev *mddev);
 
 extern void md_reload_sb(struct mddev *mddev, int raid_disk);
 extern void md_update_sb(struct mddev *mddev, int force);
-extern void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-				     bool is_suspend);
-extern void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-				      bool is_suspend);
+extern void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev);
+extern void mddev_destroy_serial_pool(struct mddev *mddev,
+				      struct md_rdev *rdev);
 struct md_rdev *md_find_rdev_nr_rcu(struct mddev *mddev, int nr);
 struct md_rdev *md_find_rdev_rcu(struct mddev *mddev, dev_t dev);
 
@@ -855,6 +849,33 @@ static inline void mddev_check_write_zeroes(struct mddev *mddev, struct bio *bio
 		mddev->queue->limits.max_write_zeroes_sectors = 0;
 }
 
+static inline int mddev_suspend_and_lock(struct mddev *mddev)
+{
+	int ret;
+
+	ret = mddev_suspend(mddev, true);
+	if (ret)
+		return ret;
+
+	ret = mddev_lock(mddev);
+	if (ret)
+		mddev_resume(mddev);
+
+	return ret;
+}
+
+static inline void mddev_suspend_and_lock_nointr(struct mddev *mddev)
+{
+	mddev_suspend(mddev, false);
+	mutex_lock(&mddev->reconfig_mutex);
+}
+
+static inline void mddev_unlock_and_resume(struct mddev *mddev)
+{
+	mddev_unlock(mddev);
+	mddev_resume(mddev);
+}
+
 struct mdu_array_info_s;
 struct mdu_disk_info_s;
......
@@ -327,8 +327,9 @@ void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
 void r5c_check_stripe_cache_usage(struct r5conf *conf)
 {
 	int total_cached;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
-	if (!r5c_is_writeback(conf->log))
+	if (!r5c_is_writeback(log))
 		return;
 
 	total_cached = atomic_read(&conf->r5c_cached_partial_stripes) +
@@ -344,7 +345,7 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
 	 */
 	if (total_cached > conf->min_nr_stripes * 1 / 2 ||
 	    atomic_read(&conf->empty_inactive_list_nr) > 0)
-		r5l_wake_reclaim(conf->log, 0);
+		r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -353,7 +354,9 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
  */
 void r5c_check_cached_full_stripe(struct r5conf *conf)
 {
-	if (!r5c_is_writeback(conf->log))
+	struct r5l_log *log = READ_ONCE(conf->log);
+
+	if (!r5c_is_writeback(log))
 		return;
 
 	/*
@@ -363,7 +366,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
 	if (atomic_read(&conf->r5c_cached_full_stripes) >=
 	    min(R5C_FULL_STRIPE_FLUSH_BATCH(conf),
 		conf->chunk_sectors >> RAID5_STRIPE_SHIFT(conf)))
-		r5l_wake_reclaim(conf->log, 0);
+		r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -396,7 +399,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
  */
 static sector_t r5c_log_required_to_flush_cache(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!r5c_is_writeback(log))
 		return 0;
@@ -449,7 +452,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
 void r5c_make_stripe_write_out(struct stripe_head *sh)
 {
 	struct r5conf *conf = sh->raid_conf;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	BUG_ON(!r5c_is_writeback(log));
 
@@ -491,7 +494,7 @@ static void r5c_handle_parity_cached(struct stripe_head *sh)
  */
 static void r5c_finish_cache_stripe(struct stripe_head *sh)
 {
-	struct r5l_log *log = sh->raid_conf->log;
+	struct r5l_log *log = READ_ONCE(sh->raid_conf->log);
 
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
 		BUG_ON(test_bit(STRIPE_R5C_CACHING, &sh->state));
@@ -683,7 +686,6 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 					   disable_writeback_work);
 	struct mddev *mddev = log->rdev->mddev;
 	struct r5conf *conf = mddev->private;
-	int locked = 0;
 
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH)
 		return;
@@ -692,14 +694,14 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 
 	/* wait superblock change before suspend */
 	wait_event(mddev->sb_wait,
-		   conf->log == NULL ||
-		   (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
-		    (locked = mddev_trylock(mddev))));
-	if (locked) {
-		mddev_suspend(mddev);
+		   !READ_ONCE(conf->log) ||
+		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
+
+	log = READ_ONCE(conf->log);
+	if (log) {
+		mddev_suspend(mddev, false);
 		log->r5c_journal_mode = R5C_JOURNAL_MODE_WRITE_THROUGH;
 		mddev_resume(mddev);
-		mddev_unlock(mddev);
 	}
 }
 
@@ -1151,7 +1153,7 @@ static void r5l_run_no_space_stripes(struct r5l_log *log)
 static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 {
 	struct stripe_head *sh;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	sector_t new_cp;
 	unsigned long flags;
 
@@ -1159,12 +1161,12 @@ static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 		return log->next_checkpoint;
 
 	spin_lock_irqsave(&log->stripe_in_journal_lock, flags);
-	if (list_empty(&conf->log->stripe_in_journal_list)) {
+	if (list_empty(&log->stripe_in_journal_list)) {
 		/* all stripes flushed */
 		spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
 		return log->next_checkpoint;
 	}
-	sh = list_first_entry(&conf->log->stripe_in_journal_list,
+	sh = list_first_entry(&log->stripe_in_journal_list,
 			      struct stripe_head, r5c);
 	new_cp = sh->log_start;
 	spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
@@ -1399,7 +1401,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 	struct stripe_head *sh, *next;
 
 	lockdep_assert_held(&conf->device_lock);
-	if (!conf->log)
+	if (!READ_ONCE(conf->log))
 		return;
 
 	count = 0;
@@ -1420,7 +1422,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 
 static void r5c_do_reclaim(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	struct stripe_head *sh;
 	int count = 0;
 	unsigned long flags;
@@ -1549,7 +1551,7 @@ static void r5l_reclaim_thread(struct md_thread *thread)
 {
 	struct mddev *mddev = thread->mddev;
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!log)
 		return;
@@ -1591,7 +1593,7 @@ void r5l_quiesce(struct r5l_log *log, int quiesce)
 
 bool r5l_log_disk_error(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	/* don't allow write if journal disk is missing */
 	if (!log)
@@ -2583,9 +2585,7 @@ int r5c_journal_mode_set(struct mddev *mddev, int mode)
 	    mode == R5C_JOURNAL_MODE_WRITE_BACK)
 		return -EINVAL;
 
-	mddev_suspend(mddev);
 	conf->log->r5c_journal_mode = mode;
-	mddev_resume(mddev);
 
 	pr_debug("md/raid:%s: setting r5c cache mode to %d: %s\n",
 		 mdname(mddev), mode, r5c_journal_mode_str[mode]);
@@ -2610,11 +2610,11 @@ static ssize_t r5c_journal_mode_store(struct mddev *mddev,
 		if (strlen(r5c_journal_mode_str[mode]) == len &&
 		    !strncmp(page, r5c_journal_mode_str[mode], len))
 			break;
-	ret = mddev_lock(mddev);
+	ret = mddev_suspend_and_lock(mddev);
 	if (ret)
 		return ret;
 	ret = r5c_journal_mode_set(mddev, mode);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return ret ?: length;
 }
 
@@ -2635,7 +2635,7 @@ int r5c_try_caching_write(struct r5conf *conf,
 			  struct stripe_head_state *s,
 			  int disks)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
@@ -2802,7 +2802,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 				 struct stripe_head *sh,
 				 struct stripe_head_state *s)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	int i;
 	int do_wakeup = 0;
 	sector_t tree_index;
@@ -2941,7 +2941,7 @@ int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh)
 /* check whether this big stripe is in write back cache. */
 bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	sector_t tree_index;
 	void *slot;
 
@@ -3049,14 +3049,14 @@ int r5l_start(struct r5l_log *log)
 void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev)
 {
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!log)
 		return;
 
 	if ((raid5_calc_degraded(conf) > 0 ||
 	     test_bit(Journal, &rdev->flags)) &&
-	    conf->log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
+	    log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
 		schedule_work(&log->disable_writeback_work);
 }
 
@@ -3145,7 +3145,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 	spin_lock_init(&log->stripe_in_journal_lock);
 	atomic_set(&log->stripe_in_journal_count, 0);
 
-	conf->log = log;
+	WRITE_ONCE(conf->log, log);
 	set_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
 	return 0;
 
@@ -3173,7 +3173,7 @@ void r5l_exit_log(struct r5conf *conf)
 	 * 'reconfig_mutex' is held by caller, set 'conf->log' to NULL to
 	 * ensure disable_writeback_work wakes up and exits.
 	 */
-	conf->log = NULL;
+	WRITE_ONCE(conf->log, NULL);
 	wake_up(&conf->mddev->sb_wait);
 	flush_work(&log->disable_writeback_work);
......
@@ -70,6 +70,8 @@ MODULE_PARM_DESC(devices_handle_discard_safely,
 		 "Set to Y if all devices in each array reliably return zeroes on reads from discarded regions");
 static struct workqueue_struct *raid5_wq;
 
+static void raid5_quiesce(struct mddev *mddev, int quiesce);
+
 static inline struct hlist_head *stripe_hash(struct r5conf *conf, sector_t sect)
 {
 	int hash = (sect >> RAID5_STRIPE_SHIFT(conf)) & HASH_MASK;
@@ -2492,15 +2494,12 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
 	unsigned long cpu;
 	int err = 0;
 
-	/*
-	 * Never shrink. And mddev_suspend() could deadlock if this is called
-	 * from raid5d. In that case, scribble_disks and scribble_sectors
-	 * should equal to new_disks and new_sectors
-	 */
+	/* Never shrink. */
 	if (conf->scribble_disks >= new_disks &&
 	    conf->scribble_sectors >= new_sectors)
 		return 0;
-	mddev_suspend(conf->mddev);
+
+	raid5_quiesce(conf->mddev, true);
 	cpus_read_lock();
 
 	for_each_present_cpu(cpu) {
@@ -2514,7 +2513,8 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
 	}
 
 	cpus_read_unlock();
-	mddev_resume(conf->mddev);
+	raid5_quiesce(conf->mddev, false);
+
 	if (!err) {
 		conf->scribble_disks = new_disks;
 		conf->scribble_sectors = new_sectors;
@@ -7025,7 +7025,7 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
 	    new != roundup_pow_of_two(new))
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 
@@ -7049,7 +7049,6 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
 		goto out_unlock;
 	}
 
-	mddev_suspend(mddev);
 	mutex_lock(&conf->cache_size_mutex);
 	size = conf->max_nr_stripes;
 
@@ -7064,10 +7063,9 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
 		err = -ENOMEM;
 	}
 	mutex_unlock(&conf->cache_size_mutex);
-	mddev_resume(mddev);
 
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
 
@@ -7153,7 +7151,7 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
 		return -EINVAL;
 	new = !!new;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	conf = mddev->private;
@@ -7162,15 +7160,13 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
 	else if (new != conf->skip_copy) {
 		struct request_queue *q = mddev->queue;
 
-		mddev_suspend(mddev);
 		conf->skip_copy = new;
 		if (new)
 			blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, q);
 		else
 			blk_queue_flag_clear(QUEUE_FLAG_STABLE_WRITES, q);
-		mddev_resume(mddev);
 	}
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
 
@@ -7225,15 +7221,13 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
 	if (new > 8192)
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	conf = mddev->private;
 	if (!conf)
 		err = -ENODEV;
 	else if (new != conf->worker_cnt_per_group) {
-		mddev_suspend(mddev);
-
 		old_groups = conf->worker_groups;
 		if (old_groups)
 			flush_workqueue(raid5_wq);
@@ -7250,9 +7244,8 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
 			kfree(old_groups[0].workers);
 			kfree(old_groups);
 		}
-		mddev_resume(mddev);
 	}
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 
 	return err ?: len;
 }
@@ -8558,8 +8551,8 @@ static int raid5_start_reshape(struct mddev *mddev)
 	 * the reshape wasn't running - like Discard or Read - have
 	 * completed.
 	 */
-	mddev_suspend(mddev);
-	mddev_resume(mddev);
+	raid5_quiesce(mddev, true);
+	raid5_quiesce(mddev, false);
 
 	/* Add some new drives, as many as will fit.
 	 * We know there are enough to make the newly sized array work.
@@ -8974,12 +8967,12 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 	struct r5conf *conf;
 	int err;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	conf = mddev->private;
 	if (!conf) {
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 		return -ENODEV;
 	}
 
@@ -8989,19 +8982,14 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 			err = log_init(conf, NULL, true);
 			if (!err) {
 				err = resize_stripes(conf, conf->pool_size);
-				if (err) {
-					mddev_suspend(mddev);
+				if (err)
 					log_exit(conf);
-					mddev_resume(mddev);
-				}
 			}
 		} else
 			err = -EINVAL;
 	} else if (strncmp(buf, "resync", 6) == 0) {
 		if (raid5_has_ppl(conf)) {
-			mddev_suspend(mddev);
 			log_exit(conf);
-			mddev_resume(mddev);
 			err = resize_stripes(conf, conf->pool_size);
 		} else if (test_bit(MD_HAS_JOURNAL, &conf->mddev->flags) &&
 			   r5l_log_disk_error(conf)) {
@@ -9014,11 +9002,9 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 					break;
 				}
 
-			if (!journal_dev_exists) {
-				mddev_suspend(mddev);
+			if (!journal_dev_exists)
 				clear_bit(MD_HAS_JOURNAL, &mddev->flags);
-				mddev_resume(mddev);
-			} else  /* need remove journal device first */
+			else  /* need remove journal device first */
 				err = -EBUSY;
 		} else
 			err = -EINVAL;
@@ -9029,7 +9015,7 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 
 	if (!err)
 		md_update_sb(mddev, 1);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 
 	return err;
 }
......