• NeilBrown's avatar
    md: make it easier to wait for bad blocks to be acknowledged. · de393cde
    NeilBrown authored
    It is only safe to choose not to write to a bad block if that bad
    block is safely recorded in metadata - i.e. if it has been
    'acknowledged'.
    
    If it hasn't we need to wait for the acknowledgement.
    
    We support that using rdev->blocked wait and
    md_wait_for_blocked_rdev by introducing a new device flag
    'BlockedBadBlock'.
    
    This flag is only advisory.
    It is cleared whenever we acknowledge a bad block, so that a waiter
    can re-check the particular bad blocks that it is interested it.
    
    It should be set by a caller when they find they need to wait.
    This (set after test) is inherently racy, but as
    md_wait_for_blocked_rdev already has a timeout, losing the race will
    have minimal impact.
    
    When we clear "Blocked" was also clear "BlockedBadBlocks" incase it
    was set incorrectly (see above race).
    
    We also modify the way we manage 'Blocked' to fit better with the new
    handling of 'BlockedBadBlocks' and to make it consistent between
    externally managed and internally managed metadata.   This requires
    that each raidXd loop checks if the metadata needs to be written and
    triggers a write (md_check_recovery) if needed.  Otherwise a queued
    write request might cause raidXd to wait for the metadata to write,
    and only that thread can write it.
    
    Before writing metadata, we set FaultRecorded for all devices that
    are Faulty, then after writing the metadata we clear Blocked for any
    device for which the Fault was certainly Recorded.
    
    The 'faulty' device flag now appears in sysfs if the device is faulty
    *or* it has unacknowledged bad blocks.  So user-space which does not
    understand bad blocks can continue to function correctly.
    User space which does, should not assume a device is faulty until it
    sees the 'faulty' flag, and then sees the list of unacknowledged bad
    blocks is empty.
    Signed-off-by: default avatarNeilBrown <neilb@suse.de>
    de393cde
raid10.c 65.9 KB