    md: report 'write_pending' state when array in sync · 16f88949
    Tomasz Majchrzak authored
    If a disk has a bad block and a recovery is performed from that disk,
    the same bad block is reported for the new disk. This sets the
    MD_CHANGE_PENDING flag in rdev_set_badblocks. With external metadata the
    flag is not cleared, because the array state is reported as 'clean'. A
    read request to the bad block in a RAID5 array then gets stuck waiting
    for the flag to be cleared, as per commit c3cce6cd ("md/raid5: ensure
    device failure recorded before write request returns.").
    
    The meaning of the MD_CHANGE_PENDING and MD_CHANGE_CLEAN flags was
    clarified in commit 070dc6dd ("md: resolve confusion of
    MD_CHANGE_CLEAN"). Since then, however, MD_CHANGE_PENDING has also been
    used in personality error handlers, which does not fully match its
    original purpose. It was meant to signal that a write request is about
    to start, but it is now also used to request a metadata update.
    Originally (in md_allow_write and md_write_start) MD_CHANGE_PENDING was
    set and in_sync was set to 0 at the same time. The error handlers set
    only the flag and leave in_sync unchanged. The sysfs array state is a
    single value, so it reports 'clean' when MD_CHANGE_PENDING is set and
    in_sync is 1. Userspace has no way to tell that it is expected to take
    some action.
    
    Swap the order in which the array state is checked so that
    'write_pending' is reported ahead of 'clean' ('write_pending' is a
    misleading name, but it is too late to rename it now).
    
    Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
    Signed-off-by: Shaohua Li <shli@fb.com>
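
The effect of swapping the check order can be sketched with a small model. This is not the code from md.c: `struct array_flags`, `state_old_order`, `state_new_order`, and `demo` are hypothetical names introduced only for illustration, the two booleans stand in for the MD_CHANGE_PENDING flag and in_sync, and any other array states the real code reports are collapsed into "active".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the state discussed above. */
struct array_flags {
    bool change_pending; /* models the MD_CHANGE_PENDING flag being set */
    bool in_sync;        /* models in_sync == 1 */
};

/* Old order: in_sync is checked first, so a pending change is hidden. */
static const char *state_old_order(const struct array_flags *a)
{
    if (a->in_sync)
        return "clean";
    if (a->change_pending)
        return "write_pending";
    return "active"; /* simplification: other states are not modelled */
}

/* New order: the pending flag wins over in_sync. */
static const char *state_new_order(const struct array_flags *a)
{
    if (a->change_pending)
        return "write_pending";
    if (a->in_sync)
        return "clean";
    return "active"; /* simplification: other states are not modelled */
}

/* The error-handler case: flag set while in_sync is still 1. */
static void demo(void)
{
    struct array_flags after_error = { .change_pending = true, .in_sync = true };

    assert(strcmp(state_old_order(&after_error), "clean") == 0);
    assert(strcmp(state_new_order(&after_error), "write_pending") == 0);
}
```

In this model the only input where the two orderings disagree is the one the error handlers produce (flag set, in_sync still 1), which is the case where userspace previously saw 'clean' and took no action.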
md.c 232 KB