• Sagi Grimberg's avatar
    nvme: make fabrics command run on a separate request queue · e7832cb4
    Sagi Grimberg authored
    We have a fundamental issue that fabric commands use the admin_q.
    The reason is, that admin-connect, register reads and writes and
    admin commands cannot be guaranteed ordering while we are running
    controller resets.
    
    For example, when we reset a controller we perform:
    1. disable the controller
    2. teardown the admin queue
    3. re-establish the admin queue
    4. enable the controller
    
    In order to perform (3), we need to unquiesce the admin queue, however
    we may have some admin commands that are already pending on the
    quiesced admin_q and will immediate execute when we unquiesce it before
    we execute (4). The host must not send admin commands to the controller
    before enabling the controller.
    
    To fix this, we have the fabric commands (admin connect and property
    get/set, but not I/O queue connect) use a separate fabrics_q and make
    sure to quiesce the admin_q before we disable the controller, and
    unquiesce it only after we enable the controller.
    
    This fixes the error prints from nvmet in a controller reset storm test:
    kernel: nvmet: got cmd 6 while CC.EN == 0 on qid = 0
    Which indicate that the host is sending an admin command when the
    controller is not enabled.
    Reviewed-by: default avatarJames Smart <james.smart@broadcom.com>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
    e7832cb4
rdma.c 54 KB