1. 21 Apr, 2017 8 commits
    • nvme_fcloop: split job struct from transport for req_release · ce79bfc2
      James Smart authored
      Current design has the fcloop job struct, used for both initiator and
      target processing, allocated as part of the initiator request structure.
      On aborts, the initiator side (based on the request) may terminate, yet
      the target side wants to continue processing. The target side can't do
      that if the initiator side goes away.
      
      Revise fcloop to allocate an independent target side structure when it
      starts an io from the initiator.
      
      Added a lock to the request struct as well to synchronize pointer updates
      on abort calls.
      
      Modified target downcalls to recognize conditions where the initiator
      has aborted the io (and thus nulled the pointer between job structs), so
      that they avoid referencing sgl lists which are gone and no longer make
      upcalls to the initiator.
      
      In conditions where the targetport is no longer connected, have the
      initiator return an access failure rather than simulating a command
      completion.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
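      The split described above can be sketched in userspace C. This is a minimal, single-threaded illustration and not the fcloop source: every name (`tgt_job`, `ini_req`, `start_io`, `ini_abort`, `tgt_complete`) is hypothetical, the lock is a toy spinlock placed in the target-side job as an assumption, and real lifetime handling would need more care.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* All names are hypothetical; this is not the fcloop source. */
struct ini_req;

/* Target-side job: allocated on its own so it can outlive the initiator request. */
struct tgt_job {
    atomic_flag lock;     /* toy spinlock guarding 'ini' */
    struct ini_req *ini;  /* NULL once the initiator has aborted */
};

/* Initiator-side request: holds only a pointer to the target job. */
struct ini_req {
    struct tgt_job *job;
    int completions;      /* upcalls delivered to the initiator */
};

static void job_lock(struct tgt_job *j)
{
    while (atomic_flag_test_and_set_explicit(&j->lock, memory_order_acquire))
        ; /* spin */
}

static void job_unlock(struct tgt_job *j)
{
    atomic_flag_clear_explicit(&j->lock, memory_order_release);
}

/* Initiator starts an io: allocate an independent target-side job. */
struct tgt_job *start_io(struct ini_req *req)
{
    struct tgt_job *j = malloc(sizeof(*j));

    if (!j)
        return NULL;
    atomic_flag_clear(&j->lock);
    j->ini = req;
    req->job = j;
    req->completions = 0;
    return j;
}

/* Initiator abort: break the link under the lock; the job itself survives. */
void ini_abort(struct ini_req *req)
{
    struct tgt_job *j = req->job;

    if (!j)
        return;
    job_lock(j);
    j->ini = NULL;
    job_unlock(j);
    req->job = NULL;
}

/* Target completes: make the upcall only if the initiator is still attached. */
bool tgt_complete(struct tgt_job *j)
{
    bool delivered = false;

    job_lock(j);
    if (j->ini) {
        j->ini->completions++;
        j->ini->job = NULL;
        delivered = true;
    }
    job_unlock(j);
    free(j);
    return delivered;
}
```

      After `ini_abort()`, `tgt_complete()` still runs to the end but skips the upcall, which is the behaviour the commit message describes for aborted ios.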
    • nvmet_fc: add req_release to lldd api · 19b58d94
      James Smart authored
      With the advent of the opdone calls changing context, the lldd can no
      longer assume that, once the op->done call returns for RSP operations,
      the request struct is no longer being accessed.
      
      As such, revise the lldd api for a req_release callback that the
      transport will call when the job is complete. This will also be used
      with abort cases.
      
      Fixed text in api header for change in io complete semantics.
      
      Revised lpfc to support the new req_release api.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
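      The ownership rule behind req_release might look roughly like the sketch below. It is a simplified userspace model: `lldd_req`, `lldd_ops`, `lldd_rsp_done` and `transport_job_complete` are hypothetical stand-ins, not the signatures from the kernel's API header.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the lldd request and the transport template. */
struct lldd_req {
    bool in_use;   /* true while the transport may still touch the request */
    bool rsp_done; /* set when the RSP operation's done callback has run */
};

struct lldd_ops {
    /* Called by the transport once it will never touch 'req' again. */
    void (*req_release)(struct lldd_req *req);
};

/* lldd side: recycle the request only on release, not on op->done. */
void lldd_req_release(struct lldd_req *req)
{
    req->in_use = false;
}

/* lldd signals that the RSP operation finished. The request is still owned
 * by the transport, which may keep processing it in another context. */
void lldd_rsp_done(struct lldd_req *req)
{
    req->rsp_done = true;
}

/* Transport side: the job is complete (or aborted), so hand the request back. */
void transport_job_complete(const struct lldd_ops *ops, struct lldd_req *req)
{
    ops->req_release(req);
}
```

      The point of the sketch is only the ordering: the request stays in use after the done call returns and becomes reusable when the transport invokes the release callback.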
    • nvmet_fc: add target feature flags for upcall isr contexts · 39498fae
      James Smart authored
      Two new feature flags were added to control whether upcalls to the
      transport result in context switches or stay in the calling context.
      
      NVMET_FCTGTFEAT_CMD_IN_ISR:
        By default, if the flag is not set, the transport assumes the
        lldd is in a non-isr context and in the cpu context it should be
        for the io queue. As such, the cmd handler is called directly in the
        calling context.
        If the flag is set, indicating the upcall is an isr context, the
        transport mandates a transition to a workqueue. The workqueue assigned
        to the queue is used for the context.
      NVMET_FCTGTFEAT_OPDONE_IN_ISR:
        By default, if the flag is not set, the transport assumes the
        lldd is in a non-isr context and in the cpu context it should be
        for the io queue. As such, the fcp operation done callback is called
        directly in the calling context.
        If the flag is set, indicating the upcall is an isr context, the
        transport mandates a transition to a workqueue. The workqueue assigned
        to the queue is used for the context.
      
      Updated lpfc for flags
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
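      The dispatch rule that both flags share can be modelled as follows. This is a sketch under stated assumptions: the bit values are placeholders (the commit message does not give them), and `workqueue`, `dispatch_upcall`, `workqueue_run` and `count_upcall` are hypothetical userspace stand-ins for the kernel workqueue and the transport handlers.

```c
#include <stddef.h>

/* Placeholder bit values; the real values are not given in the commit message. */
#define FEAT_CMD_IN_ISR    (1u << 0)
#define FEAT_OPDONE_IN_ISR (1u << 1)

typedef void (*upcall_fn)(void *arg);

#define WQ_DEPTH 8

/* Toy work queue standing in for the workqueue assigned to the io queue. */
struct workqueue {
    struct {
        upcall_fn fn;
        void *arg;
    } items[WQ_DEPTH];
    size_t count;
};

/*
 * Flag clear: call the handler directly in the calling context (returns 1).
 * Flag set:   the upcall is in isr context, so defer it (returns 0).
 * Returns -1 if the toy queue is full.
 */
int dispatch_upcall(unsigned features, unsigned isr_flag,
                    struct workqueue *wq, upcall_fn fn, void *arg)
{
    if (!(features & isr_flag)) {
        fn(arg);
        return 1;
    }
    if (wq->count == WQ_DEPTH)
        return -1;
    wq->items[wq->count].fn = fn;
    wq->items[wq->count].arg = arg;
    wq->count++;
    return 0;
}

/* Run everything that was deferred; returns the number of handlers run. */
size_t workqueue_run(struct workqueue *wq)
{
    size_t ran = 0;

    for (size_t i = 0; i < wq->count; i++) {
        wq->items[i].fn(wq->items[i].arg);
        ran++;
    }
    wq->count = 0;
    return ran;
}

/* Example handler: counts how many times it was invoked. */
void count_upcall(void *arg)
{
    ++*(int *)arg;
}
```

      With only `FEAT_CMD_IN_ISR` set, a command upcall is queued and runs later from `workqueue_run()`, while an op-done upcall runs immediately in the caller's context.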
    • nvmet: convert from kmap to nvmet_copy_from_sgl · 1c05cf90
      Logan Gunthorpe authored
      This is safer as it doesn't rely on the data being stored in
      a single page in an sgl.
      
      It also aids our effort to start phasing out users of sg_page. See [1].
      
      For this we kmalloc some memory, copy to it and free at the end. Note:
      we can't allocate this memory on the stack as the kbuild test robot
      reports some frame size overflows on i386.
      
      [1] https://lwn.net/Articles/720053/
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
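      To illustrate why copying is safer than mapping a single page, here is a userspace sketch of gathering bytes that span several scatter-gather segments into one contiguous buffer. `struct seg` and `copy_from_segs` are hypothetical; the sketch does not reproduce `nvmet_copy_from_sgl` or the kernel scatterlist API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatter-gather entry. */
struct seg {
    const unsigned char *addr;
    size_t len;
};

/*
 * Copy up to 'len' bytes, starting 'off' bytes into the segment list,
 * into the contiguous buffer 'dst'. Returns the number of bytes copied.
 */
size_t copy_from_segs(const struct seg *segs, size_t nsegs, size_t off,
                      void *dst, size_t len)
{
    unsigned char *out = dst;
    size_t copied = 0;

    for (size_t i = 0; i < nsegs && copied < len; i++) {
        /* Skip segments that lie entirely before the offset. */
        if (off >= segs[i].len) {
            off -= segs[i].len;
            continue;
        }

        size_t n = segs[i].len - off;
        if (n > len - copied)
            n = len - copied;

        memcpy(out + copied, segs[i].addr + off, n);
        copied += n;
        off = 0; /* later segments are read from their start */
    }
    return copied;
}
```

      The caller gets the same bytes whether they sit in one segment or straddle several, which is the property the single-page kmap approach could not guarantee.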
    • nvme: improve performance for virtual NVMe devices · f9f38e33
      Helen Koike authored
      This change provides a mechanism to reduce the number of MMIO doorbell
      writes for the NVMe driver. When running in a virtualized environment
      like QEMU, the cost of an MMIO is quite hefty here. The main idea for
      the patch is to provide the device with two memory locations:
       1) to store the doorbell values so they can be looked up without the doorbell
          MMIO write
       2) to store an event index.
      I believe the doorbell value is obvious; the event index not so much.
      Similar to the virtio specification, the virtual device can tell the
      driver (guest OS) not to write the MMIO doorbell unless it is writing
      past this value.
      
      FYI: doorbell values are written by the nvme driver (guest OS) and the
      event index is written by the virtual device (host OS).
      
      The patch implements a new admin command that will communicate where
      these two memory locations reside. If the command fails, the nvme
      driver will work as before without any optimizations.
      
      Contributions:
        Eric Northup <digitaleric@google.com>
        Frank Swiderski <fes@google.com>
        Ted Tso <tytso@mit.edu>
        Keith Busch <keith.busch@intel.com>
      
      Just to give an idea of the performance boost with the vendor
      extension: running fio [1] with a stock NVMe driver, I get about 200K
      read IOPS; with my vendor patch, I get about 1000K read IOPS. This was
      running with a null device, i.e. the backing device simply returned
      success on every read IO request.
      
      [1] Running on a 4 core machine:
        fio --time_based --name=benchmark --runtime=30
        --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32
        --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4
        --rw=randread --blocksize=4k --randrepeat=false
      Signed-off-by: Rob Nelson <rlnelson@google.com>
      [mlin: port for upstream]
      Signed-off-by: Ming Lin <mlin@kernel.org>
      [koike: updated for upstream]
      Signed-off-by: Helen Koike <helen.koike@collabora.co.uk>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
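      The event-index check can be sketched as below. The comparison is modelled on virtio's `vring_need_event()`, which the commit message names as the inspiration; `need_event` and `update_doorbell` are hypothetical names, and the memory barriers a real driver would need between the two shared locations are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if moving the doorbell from old_idx to new_idx steps past event_idx,
 * i.e. the device asked to be notified somewhere in that range.
 * The uint16_t casts make the comparison safe across index wraparound.
 */
bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/*
 * Store the new doorbell value in the shared shadow location, then decide
 * whether the (expensive) MMIO doorbell write is still required.
 * 'shadow_db' is written by the driver, 'event_idx' by the virtual device.
 * Assumption: barriers are left out to keep the sketch short.
 */
bool update_doorbell(uint32_t *shadow_db, const uint32_t *event_idx,
                     uint16_t value)
{
    uint16_t old_value = (uint16_t)*shadow_db;

    *shadow_db = value;
    return need_event((uint16_t)*event_idx, value, old_value);
}
```

      When `update_doorbell()` returns false the driver can skip the MMIO write entirely, because the device will pick up the new value from the shadow location on its own.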
    • nvme/pci: Don't set reserved SQ create flags · 81c1cd98
      Keith Busch authored
      The QPRIO field is only valid if weighted round robin arbitration is used,
      and this driver doesn't enable that controller configuration option.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    • blk-stat: kill blk_stat_rq_ddir() · 99c749a4
      Jens Axboe authored
      No point in providing and exporting this helper. There's just
      one (real) user of it, just use rq_data_dir().
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • nbd: set the max segments to USHRT_MAX · 1cc1f17a
      Josef Bacik authored
      I lacked a basic understanding of what segments mean, so we were being
      limited to 512 KiB requests even with higher max_sectors sizes set.
      Setting the maximum number of segments to USHRT_MAX, effectively
      unlimited, allows us to actually have arbitrarily large IOs go through NBD.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
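      One way to read the 512 KiB figure is as a segment-count cap rather than a sector cap. The arithmetic below assumes a default of 128 segments per request and a worst case of one 4 KiB page per segment; neither number is stated in the commit message, and `max_request_bytes` is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Largest request, in bytes, permitted by both limits:
 * max_sectors counts 512-byte sectors, max_segments counts segments of
 * at most seg_bytes each (worst case: one page per segment).
 */
uint64_t max_request_bytes(uint64_t max_sectors, uint64_t max_segments,
                           uint64_t seg_bytes)
{
    uint64_t by_sectors = max_sectors * 512;
    uint64_t by_segments = max_segments * seg_bytes;

    return by_sectors < by_segments ? by_sectors : by_segments;
}
```

      Under those assumptions, 128 segments of 4 KiB cap a request at 512 KiB no matter how large max_sectors is, while 65535 segments (USHRT_MAX on common platforms) leave max_sectors as the effective limit.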
  2. 20 Apr, 2017 32 commits