Commit f0703040 authored by Takahiro Yasui's avatar Takahiro Yasui Committed by Alasdair G Kergon

dm raid1: fix deadlock when suspending failed device

To prevent deadlock, bios in the hold list should be flushed before
dm_rh_stop_recovery() is called in mirror_suspend().

The recovery can't start because there are pending bios and therefore
dm_rh_stop_recovery deadlocks.

When there are pending bios in the hold list, the recovery waits for
the completion of the bios after recovery_count is acquired.
The recovery_count is released when the recovery finished, however,
the bios in the hold list are processed after dm_rh_stop_recovery() in
mirror_presuspend(). dm_rh_stop_recovery() also acquires recovery_count,
then deadlock occurs.
Signed-off-by: default avatarTakahiro Yasui <tyasui@redhat.com>
Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
Reviewed-by: default avatarMikulas Patocka <mpatocka@redhat.com>
parent 924e600d
...@@ -465,9 +465,17 @@ static void map_region(struct dm_io_region *io, struct mirror *m, ...@@ -465,9 +465,17 @@ static void map_region(struct dm_io_region *io, struct mirror *m,
static void hold_bio(struct mirror_set *ms, struct bio *bio) static void hold_bio(struct mirror_set *ms, struct bio *bio)
{ {
/* /*
* If device is suspended, complete the bio. * Lock is required to avoid race condition during suspend
* process.
*/ */
spin_lock_irq(&ms->lock);
if (atomic_read(&ms->suspend)) { if (atomic_read(&ms->suspend)) {
spin_unlock_irq(&ms->lock);
/*
* If device is suspended, complete the bio.
*/
if (dm_noflush_suspending(ms->ti)) if (dm_noflush_suspending(ms->ti))
bio_endio(bio, DM_ENDIO_REQUEUE); bio_endio(bio, DM_ENDIO_REQUEUE);
else else
...@@ -478,7 +486,6 @@ static void hold_bio(struct mirror_set *ms, struct bio *bio) ...@@ -478,7 +486,6 @@ static void hold_bio(struct mirror_set *ms, struct bio *bio)
/* /*
* Hold bio until the suspend is complete. * Hold bio until the suspend is complete.
*/ */
spin_lock_irq(&ms->lock);
bio_list_add(&ms->holds, bio); bio_list_add(&ms->holds, bio);
spin_unlock_irq(&ms->lock); spin_unlock_irq(&ms->lock);
} }
...@@ -1260,6 +1267,20 @@ static void mirror_presuspend(struct dm_target *ti) ...@@ -1260,6 +1267,20 @@ static void mirror_presuspend(struct dm_target *ti)
atomic_set(&ms->suspend, 1); atomic_set(&ms->suspend, 1);
/*
* Process bios in the hold list to start recovery waiting
* for bios in the hold list. After the process, no bio has
* a chance to be added in the hold list because ms->suspend
* is set.
*/
spin_lock_irq(&ms->lock);
holds = ms->holds;
bio_list_init(&ms->holds);
spin_unlock_irq(&ms->lock);
while ((bio = bio_list_pop(&holds)))
hold_bio(ms, bio);
/* /*
* We must finish up all the work that we've * We must finish up all the work that we've
* generated (i.e. recovery work). * generated (i.e. recovery work).
...@@ -1280,22 +1301,6 @@ static void mirror_presuspend(struct dm_target *ti) ...@@ -1280,22 +1301,6 @@ static void mirror_presuspend(struct dm_target *ti)
* we know that all of our I/O has been pushed. * we know that all of our I/O has been pushed.
*/ */
flush_workqueue(ms->kmirrord_wq); flush_workqueue(ms->kmirrord_wq);
/*
* Now set ms->suspend is set and the workqueue flushed, no more
* entries can be added to ms->hold list, so process it.
*
* Bios can still arrive concurrently with or after this
* presuspend function, but they cannot join the hold list
* because ms->suspend is set.
*/
spin_lock_irq(&ms->lock);
holds = ms->holds;
bio_list_init(&ms->holds);
spin_unlock_irq(&ms->lock);
while ((bio = bio_list_pop(&holds)))
hold_bio(ms, bio);
} }
static void mirror_postsuspend(struct dm_target *ti) static void mirror_postsuspend(struct dm_target *ti)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment