    nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event · f1ed3df2
    Michal Wnukowski authored
    On many architectures loads may be reordered with older stores to
    different locations.  In the nvme driver the following two operations
    could be reordered:
    
     - Write shadow doorbell (dbbuf_db) into memory.
     - Read EventIdx (dbbuf_ei) from memory.
    
    This can result in a race between the driver and the VM host
    processing requests (if the given virtual NVMe controller supports
    shadow doorbells).  If that occurs, the NVMe controller may decide to
    wait for an MMIO doorbell from the guest operating system, while the
    guest driver may decide not to issue an MMIO doorbell for any
    subsequent command.
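
The ordering requirement described above can be sketched in userspace C11.
This is only an illustration, not the kernel code: the function names are
shortened, and the C11 fences are an assumed stand-in for the kernel's
wmb()/mb() barriers. The point it shows is that a full fence sits between
the shadow doorbell store and the EventIdx load, so the load cannot be
satisfied before the store becomes visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if event_idx lies in the window [old, new_idx), modulo 2^16. */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/*
 * Illustrative analogue of nvme_dbbuf_update_and_check_event().
 * Returns true if the caller still has to ring the MMIO doorbell.
 */
static bool update_and_check_event(uint16_t value,
                                   _Atomic uint32_t *dbbuf_db,
                                   _Atomic uint32_t *dbbuf_ei)
{
    uint16_t old_value;
    uint32_t event_idx;

    if (!dbbuf_db)
        return true;            /* no shadow doorbell: always use MMIO */
    assert(dbbuf_ei != NULL);

    /* Queue entries must be visible before the shadow doorbell moves. */
    atomic_thread_fence(memory_order_release);

    old_value = (uint16_t)atomic_load_explicit(dbbuf_db, memory_order_relaxed);
    atomic_store_explicit(dbbuf_db, value, memory_order_relaxed);

    /*
     * Full fence (assumed stand-in for mb()): the doorbell store above
     * must not be reordered with the EventIdx load below.
     */
    atomic_thread_fence(memory_order_seq_cst);

    event_idx = atomic_load_explicit(dbbuf_ei, memory_order_relaxed);
    return need_event((uint16_t)event_idx, value, old_value);
}
```

Without the full fence, the CPU may read a stale EventIdx, conclude that no
MMIO doorbell is needed, and leave the controller waiting for one.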
    
    This issue is purely timing-dependent, so there is no easy way to
    reproduce it. Currently the easiest known approach is to run "Oracle IO
    Numbers" (orion), which is shipped with Oracle DB:
    
    orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
    	concat -write 40 -duration 120 -matrix row -testname nvme_test
    
    Where nvme_test is a .lun file that contains a list of NVMe block
    devices to run the test against. Limiting the number of vCPUs assigned
    to a given VM instance seems to increase the chances of this bug
    occurring. In a test environment with a VM that had 4 NVMe drives and
    1 vCPU assigned, the virtual NVMe controller hang could be observed
    within 10-20 minutes. That corresponds to about 400-500k IO operations
    processed (or about 100GB of IO reads/writes).
    
    The Orion tool was used for validation and set to run in a loop for 36
    hours with the patch applied (the equivalent of pushing 550M IO
    operations). No issues were observed, which suggests that the patch
    fixes the issue.
    
    Fixes: f9f38e33 ("nvme: improve performance for virtual NVMe devices")
    Signed-off-by: Michal Wnukowski <wnukowski@google.com>
    Reviewed-by: Keith Busch <keith.busch@intel.com>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    [hch: updated changelog and comment a bit]
    Signed-off-by: Christoph Hellwig <hch@lst.de>