    nvme: check IO start time when deciding to defer KA · 774a9636
    Uday Shankar authored
    When a command completes, we set a flag (comp_seen) that makes the
    next run of nvme_keep_alive_work skip sending a keep alive when TBKAS
    is on. However, if the command was submitted long ago, the controller
    may also have restarted its keep alive timer (as a result of
    receiving the command) long ago. The following trace demonstrates
    the issue, assuming TBKAS is on and KATO = 8 seconds for simplicity:
    
    1. t = 0: submit I/O commands A, B, C, D, E
    2. t = 0.5: commands A, B, C, D, E reach controller, restart its keep
                alive timer
    3. t = 1: A completes
    4. t = 2: run nvme_keep_alive_work, see recent completion, do nothing
    5. t = 3: B completes
    6. t = 4: run nvme_keep_alive_work, see recent completion, do nothing
    7. t = 5: C completes
    8. t = 6: run nvme_keep_alive_work, see recent completion, do nothing
    9. t = 7: D completes
    10. t = 8: run nvme_keep_alive_work, see recent completion, do nothing
    11. t = 9: E completes
    
    At this point, 8.5 seconds have passed without restarting the
    controller's keep alive timer, so the controller will detect a keep
    alive timeout.
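
The trace above can be replayed with a small user-space model. This is only an illustrative sketch, not kernel code: `struct trace_event`, `max_gap_old_rule` and `commit_trace_max_gap_old` are hypothetical names, times are in milliseconds, and the model assumes a keep alive restarts the controller's timer the instant it is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one trace step; times are in milliseconds. */
struct trace_event {
    long t_ms;       /* when the event happens */
    bool is_ka_work; /* true: nvme_keep_alive_work runs; false: an I/O completes */
};

/*
 * Replay events under the old rule: every completion sets comp_seen, and
 * a keep alive work run that sees comp_seen sends nothing.
 * Returns the longest time the controller's timer ran without a restart.
 */
long max_gap_old_rule(const struct trace_event *ev, size_t n,
                      long last_restart_ms)
{
    bool comp_seen = false;
    long max_gap = 0;

    assert(ev != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        long gap = ev[i].t_ms - last_restart_ms;

        if (gap > max_gap)
            max_gap = gap;

        if (!ev[i].is_ka_work)
            comp_seen = true;             /* old rule: any completion counts */
        else if (comp_seen)
            comp_seen = false;            /* defer: no keep alive is sent */
        else
            last_restart_ms = ev[i].t_ms; /* keep alive sent (latency ignored) */
    }
    return max_gap;
}

/* The trace from the commit message: A..E reach the controller at t = 0.5 s. */
long commit_trace_max_gap_old(void)
{
    static const struct trace_event trace[] = {
        { 1000, false }, { 2000, true }, /* A completes, work defers */
        { 3000, false }, { 4000, true }, /* B completes, work defers */
        { 5000, false }, { 6000, true }, /* C completes, work defers */
        { 7000, false }, { 8000, true }, /* D completes, work defers */
        { 9000, false },                 /* E completes */
    };

    return max_gap_old_rule(trace, sizeof(trace) / sizeof(trace[0]), 500);
}
```

In this model `commit_trace_max_gap_old()` returns 8500, which exceeds the 8000 ms KATO used in the trace.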
    
    Fix this by checking the IO start time when deciding to defer sending a
    keep alive command. Only set comp_seen if the command started after the
    most recent run of nvme_keep_alive_work. With this change, the
    completions of B, C, and D will not set comp_seen and the run of
    nvme_keep_alive_work at t = 4 will send a keep alive.
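
The new rule can be sketched with the same kind of user-space model. This is an assumption-laden illustration rather than the patch itself: `struct ka_event`, `first_ka_sent_new_rule` and `commit_trace_first_ka` are hypothetical names, times are in milliseconds, and the model assumes the previous run of nvme_keep_alive_work happened at t = -2 s, before commands A through E were submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one trace step; times are in milliseconds. */
struct ka_event {
    long t_ms;        /* when the event happens */
    bool is_ka_work;  /* true: nvme_keep_alive_work runs; false: an I/O completes */
    long io_start_ms; /* completions only: when that I/O was submitted */
};

/*
 * Replay events under the new rule: a completion sets comp_seen only if
 * its I/O started after the most recent keep alive work run.
 * Returns the time of the first keep alive sent, or -1 if none is sent.
 */
long first_ka_sent_new_rule(const struct ka_event *ev, size_t n,
                            long last_check_ms)
{
    bool comp_seen = false;

    assert(ev != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!ev[i].is_ka_work) {
            /* Only completions of recently started I/O defer the keep alive. */
            if (ev[i].io_start_ms > last_check_ms)
                comp_seen = true;
            continue;
        }

        last_check_ms = ev[i].t_ms; /* remember when the work last ran */
        if (!comp_seen)
            return ev[i].t_ms;      /* nothing recent seen: send keep alive */
        comp_seen = false;          /* defer once, then re-evaluate next run */
    }
    return -1;
}

/* The trace from the commit message; all five commands start at t = 0. */
long commit_trace_first_ka(void)
{
    static const struct ka_event trace[] = {
        { 1000, false, 0 }, { 2000, true, 0 }, /* A counts, work defers */
        { 3000, false, 0 }, { 4000, true, 0 }, /* B is stale, work sends */
        { 5000, false, 0 }, { 6000, true, 0 },
        { 7000, false, 0 }, { 8000, true, 0 },
        { 9000, false, 0 },
    };

    /* Assumed: the previous work run was at t = -2 s. */
    return first_ka_sent_new_rule(trace, sizeof(trace) / sizeof(trace[0]),
                                  -2000);
}
```

In this model `commit_trace_first_ka()` returns 4000: A's completion still defers the run at t = 2, but B started before that run, so the run at t = 4 sends a keep alive.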
    
    Reported-by: Costa Sapuntzakis <costa@purestorage.com>
    Reported-by: Randy Jennings <randyj@purestorage.com>
    Signed-off-by: Uday Shankar <ushankar@purestorage.com>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Keith Busch <kbusch@kernel.org>