• Lars Ellenberg's avatar
    drbd: if there is no good data accessible, writes should be IO errors · 0ead5cca
    Lars Ellenberg authored
    If DRBD lost all path to good data,
    and the on-no-data-accessible policy is OND_SUSPEND_IO,
    all pending and new IO requests are suspended (will block).
    
    If that setting is OND_IO_ERROR, IO will still be completed.
    READ to "clean" areas (e.g. on an D_INCONSISTENT device,
    and bitmap indicates a block is already in sync) will succeed.
    READ to "unclean" areas (bitmap indicates block is out-of-sync),
    will return EIO.
    
    If we are already D_DISKLESS (or D_FAILED), we also return EIO.
    
    Unfortunately, on a former R_PRIMARY C_SYNC_TARGET D_INCONSISTENT,
    after replication link loss, new WRITE requests still went through OK.
    
    The would also set the "out-of-sync" bit on their way, so READ after
    WRITE would still return EIO. Also, the data generation UUIDs had not
    been bumped, we would cause data divergence, without being able to
    detect it on the next sync handshake, given the right sequence of events
    in a multiple error scenario and "improper" order of recovery actions.
    
    The right thing to do is to return EIO for all new writes,
    unless we have access to good, current, D_UP_TO_DATE data.
    
    The "established best practices" way to avoid these situations in the
    first place is to set OND_SUSPEND_IO, or even do a hard-reset from
    the pri-on-incon-degr policy helper hook.
    Signed-off-by: default avatarPhilipp Reisner <philipp.reisner@linbit.com>
    Signed-off-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
    Signed-off-by: default avatarJens Axboe <axboe@fb.com>
    0ead5cca
drbd_req.c 54.8 KB