• Qu Wenruo's avatar
    btrfs: do not use replace target device as an extra mirror · 5f50fa91
    Qu Wenruo authored
    [BUG]
    Currently btrfs can use dev-replace device as an extra mirror for
    read-repair.  But it can lead to NODATASUM corruption in the following
    case:
    
     There is a RAID1 data chunk, and dev-replace is running from
     dev2 to dev0.
    
     |//| = Replaced data
              X       X+1MB     X+2MB
      Dev 2:  |       |         |           <- Source dev
      Dev 0:  |///////|         |           <- Target dev
    
    Then a read on dev 2 X+2MB happens.
    And something wrong happened inside devid 2, causing an -EIO.
    
    In that case, read-repair would try the next mirror, and since we can
    use target device as an extra mirror, we will use that mirror instead.
    
    But unfortunately since the read is beyond the current replace cursor,
    we should not trust it at all, what we get would be just uninitialized
    garbage.
    
    But if this read is for NODATASUM range, then we just trust them and
    cause data corruption.
    
    [CAUSE]
    We used to have some checks to make sure we only return such extra
    mirror when the range is before our left cursor.
    
    The first commit introducing this behavior is ad6d620e ("Btrfs:
    allow repair code to include target disk when searching mirrors").
    
    But later a fix, 22ab04e8 ("Btrfs: fix race between device replace
    and chunk allocation") changed the behavior, to always let
    btrfs_map_block() include the extra mirror to address a race in
    dev-replace which can cause missing writes to target device.
    
    This means, we lose the tracking of cursor for the extra mirror, thus
    can lead to above corruption.
    
    [FIX]
    The extra mirror is never a reliable one, at the beginning of
    dev-replace, the reliability is zero, while only at the end of the
    replace it's a fully reliable mirror.
    
    We either do the complex tracking, or never trust it.
    
    IMHO it's much easier to maintain if we don't trust it at all, and the
    extra mirror can only benefit for a limited period of time (during
    replace).
    
    Thus this patch would completely remove the ability to use target device
    as an extra mirror.
    Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    5f50fa91
volumes.c 215 KB