    md: fix prexor vs sync_request race · e0a115e5
    Dan Williams authored
    During initial array synchronization there is a window, between when a
    prexor operation is scheduled on a stripe and when it completes, in which
    a sync_request can be scheduled on the same stripe.  When this happens the
    prexor completes and the stripe is unconditionally marked "in_sync",
    effectively canceling the sync_request for the stripe.  Prior to 2.6.23
    this was not a problem because the prexor operation was done under
    sh->lock.  In older kernels the prexor would still erroneously mark the
    stripe "in_sync", but sync_request would be held off and would then
    re-mark the stripe as "!in_sync".
    
    Change the write completion logic to not mark the stripe "in_sync" if a
    prexor was performed.  The effect of the change is to sometimes not set
    STRIPE_INSYNC.  The worst this can do is cause the resync to stall waiting
    for STRIPE_INSYNC to be set.  If this were happening, then STRIPE_SYNCING
    would be set and handle_issuing_new_read_requests would cause all
    available blocks to eventually be read, at which point prexor would never
    be used on that stripe any more and STRIPE_INSYNC would eventually be set.
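    
    The intended flag behavior can be modeled in a few lines of standalone C.
    This is only a sketch under stated assumptions, not the patch itself: the
    sketch_* names, the struct, and the bit values are hypothetical stand-ins
    for the real stripe state in raid5.c, and locking and I/O are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the stripe state bits (not the kernel's values). */
enum {
    SKETCH_STRIPE_SYNCING = 1u << 0,
    SKETCH_STRIPE_INSYNC  = 1u << 1,
};

/* Hypothetical, minimal model of a stripe: only the state word is kept. */
struct sketch_stripe {
    unsigned int state;
};

/* Model of a sync_request reaching the stripe: it must be resynced. */
static void sketch_sync_request(struct sketch_stripe *sh)
{
    sh->state |= SKETCH_STRIPE_SYNCING;
    sh->state &= ~SKETCH_STRIPE_INSYNC;
}

/*
 * Model of write completion after the change: the stripe is marked in sync
 * only when the write did not go through a prexor.
 */
static void sketch_complete_write(struct sketch_stripe *sh, bool prexor_performed)
{
    if (!prexor_performed)
        sh->state |= SKETCH_STRIPE_INSYNC;
}

/* True while a requested resync of this stripe is still outstanding. */
static bool sketch_sync_pending(const struct sketch_stripe *sh)
{
    return (sh->state & SKETCH_STRIPE_SYNCING) &&
           !(sh->state & SKETCH_STRIPE_INSYNC);
}

/* Usage: a sync_request that races with a prexor write is not canceled. */
static void sketch_demo(void)
{
    struct sketch_stripe sh = { 0 };

    sketch_sync_request(&sh);          /* arrives while the prexor is in flight */
    sketch_complete_write(&sh, true);  /* prexor write completes */
    assert(sketch_sync_pending(&sh));  /* resync is still outstanding */

    sketch_complete_write(&sh, false); /* a later write without prexor */
    assert(!sketch_sync_pending(&sh));
}
```

    In this model the only cost of the change is that a stripe may stay
    "!in_sync" longer than strictly necessary, which matches the stall
    described above.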
    
    echo repair > /sys/block/mdN/md/sync_action will correct arrays that may
    have lost this race.
    
    Cc: <stable@kernel.org>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Neil Brown <neilb@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
raid5.c 138 KB