1. 22 Jun, 2015 1 commit
    • drivers: xen-blkfront: only talk_to_blkback() when in XenbusStateInitialising · a9b54bb9
      Bob Liu authored
      Patch 69b91ede
      "drivers: xen-blkback: delay pending_req allocation to connect_ring"
      exposed a problem in Xen blkfront. There is a race between
      XenStored and the drivers such that we can see two:
      
      vbd vbd-268440320: blkfront:blkback_changed to state 2.
      vbd vbd-268440320: blkfront:blkback_changed to state 2.
      vbd vbd-268440320: blkfront:blkback_changed to state 4.
      
      state changes to XenbusStateInitWait ('2'). The end result is that
      blkback_changed() receives the notification twice and calls
      setup_blkring() twice.
      
      Meanwhile the backend driver may only see the results of the first
      setup_blkring(), and so reads stale ring-ref values (or reads them
      while they are being updated with the new ones).
      
      Either way, the ring ends up incorrectly set up.
      
      Other drivers in the tree already have this check in place.
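      
      A minimal sketch of the guard this patch adds, under assumed
      simplifications (the shapes follow drivers/block/xen-blkfront.c, but
      this is illustrative rather than the verbatim patch):
      
        static void blkback_changed(struct xenbus_device *dev,
                                    enum xenbus_state backend_state)
        {
                struct blkfront_info *info = dev_get_drvdata(&dev->dev);
      
                switch (backend_state) {
                case XenbusStateInitWait:
                        /* Ignore a duplicate notification: once we have
                         * left XenbusStateInitialising, setup_blkring()
                         * has already run and must not run again. */
                        if (dev->state != XenbusStateInitialising)
                                break;
                        talk_to_blkback(dev, info);
                        break;
                /* ... other states unchanged ... */
                }
        }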
      Reported-and-Tested-by: Robert Butera <robert.butera@oracle.com>
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      a9b54bb9
  2. 06 Jun, 2015 3 commits
    • xen/block: add multi-page ring support · 86839c56
      Bob Liu authored
      Extend xen/block to support multi-page rings, so that more requests
      can be issued by using more than one page as the request ring between
      blkfront and the backend. As a result, performance can improve
      significantly.
      
      We saw impressive improvements on our high-end iSCSI storage cluster
      backend: with a 64-page ring, IOPS increased about 15 times in the
      throughput test and more than doubled in the latency test.
      
      The reason is that a one-page ring limits outstanding requests to 32;
      in our case the iSCSI LUN was spread across about 100 physical drives,
      and 32 in-flight requests were nowhere near enough to keep them busy.
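      
      To make the 32-entry limit concrete, here is a small self-contained
      sketch of the sizing arithmetic, mirroring __CONST_RING_SIZE() from
      xen/interface/io/ring.h; the header and entry sizes are assumptions
      for illustration, not values taken from this patch:
      
        #include <stdio.h>
      
        #define RING_PAGE_SIZE 4096u  /* one 4k grant per ring page       */
        #define RING_HDR_SIZE    64u  /* shared-ring bookkeeping, assumed */
        #define RING_ENTRY_SIZE 112u  /* union blkif_sring_entry, assumed */
      
        /* Entries available with 2^order pages, rounded down to a power
         * of two as the ring macros require. */
        static unsigned int ring_entries(unsigned int order)
        {
                unsigned int bytes = (RING_PAGE_SIZE << order) - RING_HDR_SIZE;
                unsigned int n = bytes / RING_ENTRY_SIZE, pow2 = 1;
      
                while (pow2 * 2 <= n)
                        pow2 *= 2;
                return pow2;
        }
      
        int main(void)
        {
                printf("order 0: %u entries\n", ring_entries(0)); /* 32   */
                printf("order 6: %u entries\n", ring_entries(6)); /* 2048 */
                return 0;
        }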
      
      Changes in v2:
       - Rebased to 4.0-rc6.
       - Documented how the multi-page ring feature works in linux
         io/blkif.h.
      
      Changes in v3:
       - Removed the changes to linux io/blkif.h and followed the protocol
         defined in io/blkif.h of the XEN tree.
       - Rebased to 4.1-rc3.
      
      Changes in v4:
       - Switched to 'ring-page-order' and 'max-ring-page-order' (see the
         sketch after this changelog).
       - Addressed a few comments from Roger.
      
      Changes in v5:
       - Clarified the 4k granularity in the comment.
       - Addressed more comments from Roger.
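      
      A sketch of the frontend side of that negotiation, under assumed
      simplifications (the xenstore key names follow the blkif protocol;
      the clamping and the max_ring_page_order parameter are illustrative):
      
        unsigned int max_order = 0, order;
      
        /* Backends that predate the feature do not publish
         * "max-ring-page-order"; fall back to a single page. */
        if (xenbus_scanf(XBT_NIL, info->xbdev->otherend,
                         "max-ring-page-order", "%u", &max_order) != 1)
                max_order = 0;
      
        /* Use the smaller of what the backend allows and what this
         * frontend was configured for, then publish the choice. */
        order = min(max_ring_page_order, max_order);
        info->nr_ring_pages = 1 << order;
        xenbus_printf(XBT_NIL, dev->nodename,
                      "ring-page-order", "%u", order);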
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      86839c56
    • driver: xen-blkfront: move talk_to_blkback to a more suitable place · 8ab0144a
      Bob Liu authored
      The major responsibility of talk_to_blkback() is to allocate and
      initialize the request ring and write the ring info to xenstore.
      But this work should be done only after the backend has entered
      'XenbusStateInitWait', as defined in the protocol file.
      See xen/include/public/io/blkif.h in XEN git tree:
      Front                                Back
      =================================    =====================================
      XenbusStateInitialising              XenbusStateInitialising
       o Query virtual device               o Query backend device identification
         properties.                          data.
       o Setup OS device instance.          o Open and validate backend device.
                                            o Publish backend features and
                                              transport parameters.
                                                           |
                                                           |
                                                           V
                                           XenbusStateInitWait
      
      o Query backend features and
        transport parameters.
      o Allocate and initialize the
        request ring.
      
      There is no visible problem from this yet, but it is a violation of
      the design, and furthermore it would not allow the frontend and
      backend to negotiate the 'multi-page' and 'multi-queue' features.
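      
      A sketch of the resulting ordering, with assumed simplifications
      (the function split is illustrative, not the verbatim patch): ring
      setup moves out of the probe path and into the handler arm that runs
      once the backend reaches XenbusStateInitWait.
      
        static int blkfront_probe(struct xenbus_device *dev,
                                  const struct xenbus_device_id *id)
        {
                /* Query virtual device properties and set up the OS
                 * device instance, but do not touch the ring here. */
                return 0;
        }
      
        static void blkback_changed(struct xenbus_device *dev,
                                    enum xenbus_state backend_state)
        {
                switch (backend_state) {
                case XenbusStateInitWait:
                        /* The backend has published its features and
                         * transport parameters; only now allocate the
                         * ring and write its details to xenstore. */
                        talk_to_blkback(dev, dev_get_drvdata(&dev->dev));
                        break;
                /* ... */
                }
        }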
      
      Changes in v2:
       - Rewrote the commit message to be clearer.
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Acked-by: Roger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8ab0144a
    • drivers: xen-blkback: delay pending_req allocation to connect_ring · 69b91ede
      Bob Liu authored
      This is a preparatory patch for the multi-page ring feature.
      In connect_ring() we know exactly how many pages are used for the
      shared ring, so delay the pending_req allocation until then to avoid
      wasting memory.
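      
      A sketch of the idea, with assumed names (REQS_PER_RING_PAGE and
      pending_reqs are stand-ins for the driver's actual identifiers):
      
        /* Only in connect_ring() is nr_ring_pages known, so size the
         * request pool to the real ring depth instead of allocating a
         * worst-case pool at probe time. */
        unsigned int ring_size = REQS_PER_RING_PAGE * nr_ring_pages;
      
        blkif->pending_reqs = kcalloc(ring_size,
                                      sizeof(blkif->pending_reqs[0]),
                                      GFP_KERNEL);
        if (!blkif->pending_reqs)
                return -ENOMEM;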
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      69b91ede
  3. 05 Jun, 2015 5 commits
  4. 02 Jun, 2015 2 commits
    • null_blk: restart request processing on completion handler · 8b70f45e
      Akinobu Mita authored
      When irqmode=2 (IRQ completion handler is timer) and queue_mode=1
      (block interface to use is rq), the completion handler should restart
      request handling for any pending requests on the queue, because
      request processing stops once more commands are queued than
      hw_queue_depth allows (null_rq_prep_fn returns BLKPREP_DEFER).
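      
      A sketch of the restart in the completion path, under assumed
      simplifications (the locking reflects the rule that blk_start_queue()
      must be called with the queue lock held; this is illustrative, not
      the verbatim patch):
      
        static void end_cmd(struct nullb_cmd *cmd)
        {
                struct request_queue *q = cmd->rq->q;
                unsigned long flags;
      
                blk_end_request_all(cmd->rq, 0);
      
                /* A tag was just freed: kick the queue so requests that
                 * were deferred with BLKPREP_DEFER get prepped again. */
                spin_lock_irqsave(q->queue_lock, flags);
                if (blk_queue_stopped(q))
                        blk_start_queue(q);
                spin_unlock_irqrestore(q->queue_lock, flags);
        }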
      
      Without this change, the following command cannot finish.
      
      	# modprobe null_blk irqmode=2 queue_mode=1 hw_queue_depth=1
      	# fio --name=t --rw=read --size=1g --direct=1 \
      	  --ioengine=libaio --iodepth=64 --filename=/dev/nullb0
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      8b70f45e
    • null_blk: prevent timer handler running on a different CPU where started · 419c21a3
      Akinobu Mita authored
      When irqmode=2 (IRQ completion handler is timer), the timer handler
      should be called on the same CPU where the timer was started.
      
      Since completion_queues are per-cpu and the completion handler only
      touches the completion_queue of the local CPU, we need to prevent the
      handler from running on a CPU other than the one that started the
      timer. Otherwise, the IO cannot complete until another completion
      handler happens to run on that CPU.
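      
      A sketch of the fix, assuming the per-cpu completion queue arms an
      hrtimer (as null_blk does): starting it in pinned mode keeps the
      callback on the CPU that armed it.
      
        /* HRTIMER_MODE_REL_PINNED ties the timer, and therefore its
         * handler, to the CPU that called hrtimer_start(). */
        hrtimer_start(&cq->timer, kt, HRTIMER_MODE_REL_PINNED);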
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      419c21a3
  5. 01 Jun, 2015 3 commits
    • NVMe: Remove hctx reliance for multi-namespace · 42483228
      Keith Busch authored
      The driver needs to track shared tags to support multiple namespaces
      that may be dynamically allocated or deleted. Relying on the first
      request_queue's hctxs is not appropriate: we cannot clear outstanding
      tags for all namespaces through that handle, nor can the driver
      easily track every request_queue's hctxs as namespaces are attached
      and detached. Instead, this patch obtains the shared tag resources
      from the nvme_dev's tagset rather than through a request_queue hctx.
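      
      A sketch of the shape of the change, with assumed names (the iterator
      is the one introduced by the blk-mq patch below; nvme_cancel_queue_ios
      stands in for the driver's per-request cancel callback):
      
        /* Walk outstanding requests through the device-wide tagset
         * rather than through any one namespace's request_queue hctx. */
        blk_mq_all_tag_busy_iter(dev->tagset.tags[qid],
                                 nvme_cancel_queue_ios, nvmeq);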
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      42483228
    • 843e8ddb
      Jens Axboe authored
    • blk-mq: Shared tag enhancements · f26cdc85
      Keith Busch authored
      Storage controllers may expose multiple block devices that share
      hardware resources managed by blk-mq. This patch enhances the shared
      tags so a low-level driver can access the shared resources that are
      not tied to the unshared h/w contexts. This way the LLD can
      dynamically add and delete disks and request queues without having
      to track all the request_queue hctxs in order to iterate outstanding
      tags.
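      
      A sketch of the kind of interface this enables (the signatures are
      assumptions for illustration, not a verbatim excerpt):
      
        /* Per-request callback; @reserved indicates whether the tag
         * came from the reserved pool. */
        typedef void (busy_tag_iter_fn)(struct request *rq, void *priv,
                                        bool reserved);
      
        /* Iterate every busy tag in a shared tagset, independent of
         * any particular request_queue's hardware contexts. */
        void blk_mq_all_tag_busy_iter(struct blk_mq_tags *tags,
                                      busy_tag_iter_fn *fn, void *priv);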
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      f26cdc85
  6. 29 May, 2015 4 commits
  7. 26 May, 2015 1 commit
  8. 22 May, 2015 12 commits
  9. 20 May, 2015 9 commits