    scsi: sd: Handle read/write CDL timeout failures · 390e2d1a
    Niklas Cassel authored
    Commands using a duration limit descriptor that has limit policies set to a
    value other than 0x0 may be failed by the device if one of the limits is
    exceeded. For such commands, since the failure is the result of the user
    duration limit configuration and workload, the commands should not be
    retried and should be terminated immediately. Furthermore, to allow the
    user to differentiate these "soft" failures from hard errors due to a
    hardware problem, an error code other than EIO should be returned.
    
    There are 2 cases to consider:
    
    (1) The failure is due to a limit policy failing the command with a CHECK
    CONDITION status, that is, any limit policy other than 0xD.  For this
    case, scsi_check_sense() is modified to detect failures with the ABORTED
    COMMAND sense key and the COMMAND TIMEOUT BEFORE PROCESSING or COMMAND
    TIMEOUT DURING PROCESSING or COMMAND TIMEOUT DURING PROCESSING DUE TO ERROR
    RECOVERY additional sense code. For these failures, a SUCCESS disposition
    is returned so that scsi_finish_command() is called to terminate the
    command.
    
    (2) The failure is due to a limit policy set to 0xD, which results in the
    command being terminated with a GOOD status, COMPLETED sense key, and DATA
    CURRENTLY UNAVAILABLE additional sense code. To handle this case,
    scsi_check_sense() is modified to return a SUCCESS disposition so that
    scsi_finish_command() is called to terminate the command.  In addition,
    scsi_decide_disposition() has to be modified to check whether a command
    terminated with GOOD status has sense data.  This behavior is defined in
    SCSI Primary Commands - 6 (SPC-6), so the change conforms to the
    specification, even though sense data was not previously checked for
    commands completing with GOOD status.
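
The two cases above can be sketched as a small standalone classifier. This is a simplified model, not the kernel code: struct sense_hdr and is_cdl_timeout() are hypothetical names introduced for illustration, and the numeric sense key and ASC/ASCQ values are assumed from the SPC code assignments and should be checked against the standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sense key values, assumed from SPC. */
#define SK_ABORTED_COMMAND 0x0b
#define SK_COMPLETED       0x0f

/* Hypothetical, simplified stand-in for decoded sense data. */
struct sense_hdr {
    uint8_t sense_key;
    uint8_t asc;   /* additional sense code */
    uint8_t ascq;  /* additional sense code qualifier */
};

/*
 * Return true when the sense data describes a command duration limit
 * timeout, for either of the two cases described in the text.
 */
bool is_cdl_timeout(const struct sense_hdr *s)
{
    assert(s != NULL);

    switch (s->sense_key) {
    case SK_ABORTED_COMMAND:
        /*
         * Case (1), assumed ASC 2Eh with ASCQ 01h..03h:
         * COMMAND TIMEOUT BEFORE PROCESSING, DURING PROCESSING,
         * or DURING PROCESSING DUE TO ERROR RECOVERY.
         */
        return s->asc == 0x2e && s->ascq >= 0x01 && s->ascq <= 0x03;
    case SK_COMPLETED:
        /*
         * Case (2), assumed ASC/ASCQ 55h/0Ah:
         * DATA CURRENTLY UNAVAILABLE, reported with GOOD status.
         */
        return s->asc == 0x55 && s->ascq == 0x0a;
    default:
        return false;
    }
}
```

In this model, a true result corresponds to the point where the commit returns a SUCCESS disposition so that the command is finished rather than retried.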
    
    If scsi_check_sense() detects sense data indicating a duration limit
    timeout, it sets the newly introduced SCSI ML byte
    SCSIML_STAT_DL_TIMEOUT. This SCSI ML byte is checked in scsi_noretry_cmd(),
    so that a command that failed because of a CDL timeout is not
    retried. The SCSI ML byte is also checked in scsi_result_to_blk_status() to
    complete the command request with the BLK_STS_DURATION_LIMIT status, which
    results in the user seeing ETIME errors for the failed commands.
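
The flow from the ML byte to the error seen by the user can be modeled roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: every type and function name below (struct cmd_model, model_noretry(), model_blk_status(), model_errno()) is hypothetical, and only the mapping described in the text (CDL timeout, no retry, duration limit status, ETIME) is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SCSI ML byte and block status. */
enum model_ml_byte { MODEL_ML_NONE, MODEL_ML_DL_TIMEOUT };
enum model_blk_sts { MODEL_BLK_OK, MODEL_BLK_IOERR, MODEL_BLK_DURATION_LIMIT };

/* Hypothetical, minimal view of a completed command. */
struct cmd_model {
    enum model_ml_byte ml_byte; /* set when sense data shows a CDL timeout */
    bool device_error;          /* any other failure reported by the device */
};

/* Models the no-retry check: a CDL timeout is never retried. */
bool model_noretry(const struct cmd_model *c)
{
    assert(c != NULL);
    return c->ml_byte == MODEL_ML_DL_TIMEOUT;
}

/* Models the result-to-block-status step; the ML byte takes precedence. */
enum model_blk_sts model_blk_status(const struct cmd_model *c)
{
    assert(c != NULL);
    if (c->ml_byte == MODEL_ML_DL_TIMEOUT)
        return MODEL_BLK_DURATION_LIMIT;
    return c->device_error ? MODEL_BLK_IOERR : MODEL_BLK_OK;
}

/* Models the errno the user observes for each block status. */
int model_errno(enum model_blk_sts sts)
{
    switch (sts) {
    case MODEL_BLK_DURATION_LIMIT:
        return -ETIME;
    case MODEL_BLK_IOERR:
        return -EIO;
    default:
        return 0;
    }
}
```

Checking the ML byte before the device error flag reflects the 0xD policy case, where the device reports GOOD status and the ML byte is the only signal that the command missed its limit.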

    Co-developed-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
    Link: https://lore.kernel.org/r/20230511011356.227789-12-nks@flawful.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_error.c 71.6 KB