md/raid10: fix handling of error on last working device in array.
If we get a read error on the last working device in a RAID10 which
contains the target block, then we don't fail the device (which is
good) but we also don't abort the retries, which is wrong.
We end up in an infinite loop retrying the read on that one device.
This patch fixes the problem in two places:
1/ in raid10_end_read_request we don't even ask for a retry if this
was the last usable device. This is efficient but a little racy
and will sometimes retry when it should not.
2/ in handle_read_error we are careful to exclude any device from
retry which we tried to mark as faulty (that might have failed if
it was the last device). This is race-free but less efficient.
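
As a rough illustration of the second point only, the following is a
stand-alone userspace sketch, not the kernel code. All names in it
(struct dev, try_mark_faulty, pick_dev, read_block, MAX_DEVS) are
hypothetical and chosen for the example. It assumes a simplified model in
which marking a device faulty is refused when it is the last working one,
and shows that excluding the device from retry regardless of whether the
marking succeeded is what guarantees the retry loop terminates.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEVS 8	/* hypothetical limit for this sketch */

/* Simplified stand-in for a mirror device; not the md data structure. */
struct dev {
	bool faulty;	/* device has been failed out of the array */
	bool read_ok;	/* whether a read of the target block succeeds */
};

/*
 * Try to fail a device.  As described above, the last working device is
 * never failed, so this can return false and leave the device in place.
 */
static bool try_mark_faulty(struct dev *devs, int n, int idx)
{
	int working = 0;

	for (int i = 0; i < n; i++)
		if (!devs[i].faulty)
			working++;
	if (working <= 1)
		return false;
	devs[idx].faulty = true;
	return true;
}

/* Pick the first device that is neither faulty nor excluded from retry. */
static int pick_dev(const struct dev *devs, int n, const bool *excluded)
{
	for (int i = 0; i < n; i++)
		if (!devs[i].faulty && !excluded[i])
			return i;
	return -1;
}

/*
 * Read with retry.  Returns the index of the device that satisfied the
 * read, or -1 if no device could.  A device that returned an error is
 * excluded even when try_mark_faulty() refused to fail it, so each
 * device is tried at most once and the loop cannot spin forever.
 */
static int read_block(struct dev *devs, int n)
{
	bool excluded[MAX_DEVS] = { false };
	int idx;

	assert(n <= MAX_DEVS);
	while ((idx = pick_dev(devs, n, excluded)) >= 0) {
		if (devs[idx].read_ok)
			return idx;
		try_mark_faulty(devs, n, idx);	/* may fail: last device */
		excluded[idx] = true;		/* never retry it either way */
	}
	return -1;
}
```

Without the `excluded[idx] = true` line, a read error on the last
working device would leave it eligible for selection again, which is the
infinite loop described above.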
Signed-off-by: NeilBrown <neilb@suse.de>