1. 10 Aug, 2017 4 commits
    • nvme-pci: fix CMB sysfs file removal in reset path · 1c78f773
      Max Gurtovoy authored
      Currently we create the sysfs entry even if we fail to map the
      CMB. In that case, unmapping will not remove the created sysfs
      file. There is no good reason to create a sysfs entry for a
      non-working CMB and show its characteristics.
      
      Fixes: f63572df ("nvme: unmap CMB and remove sysfs file in reset path")
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Stephen Bates <sbates@raithlin.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • lpfc: support nvmet_fc defer_rcv callback · 50738420
      James Smart authored
      Currently, calls to nvmet_fc_rcv_fcp_req() always copy the
      FC-NVME cmd iu to a temporary buffer before returning, allowing
      the driver to immediately repost the buffer to the hardware.
      
      To address timing conditions on queue element structures vs async
      command reception, the nvmet_fc transport may occasionally need to
      hold on to the command iu buffer for a short period. In these cases,
      nvmet_fc_rcv_fcp_req() returns a special return code (-EOVERFLOW),
      and the LLDD must wait until the new defer_rcv lldd callback is
      called before recycling the buffer back to the hw.
      
      This patch adds support for the new nvmet_fc transport defer_rcv
      callback and recognition of the new error code when passing commands
      to the transport.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet_fc: add defer_req callback for deferment of cmd buffer return · 0fb228d3
      James Smart authored
      At queue creation, the transport allocates a local job struct
      (struct nvmet_fc_fcp_iod) for each possible element of the queue.
      When a new CMD is received from the wire, a job struct is allocated
      from the queue and then used for the duration of the command.
      The job struct contains buffer space for the wire command iu. Thus,
      upon allocation of the job struct, the cmd iu buffer is copied to
      the job struct and the LLDD may immediately free/reuse the CMD IU
      buffer passed in the call.
      
      However, in some circumstances, due to the packetized nature of FC
      and the API of the FC LLDD, the LLDD may issue a hw command to send
      the wire response but not get the hw completion for that command
      and upcall the nvmet_fc layer before a new command is
      asynchronously received on the wire. In other words, it's possible
      for the initiator to get the response from the wire, thus believe a
      command slot is free, and send a new command iu. The new command iu
      may be received by the LLDD and passed to the transport before the
      LLDD has serviced the hw completion and made the teardown calls for
      the original job struct. As such, there is no job struct available
      for the new io. I.e., it appears as if the host sent more queue
      elements than the queue size. It didn't, based on its
      understanding.
      
      Rather than treat this as a hard connection failure, queue the new
      request until a job struct frees up. Because the buffer isn't
      copied while there's no job struct, a special return value must be
      returned to the LLDD to tell it to hold off on recycling the cmd
      iu buffer. Later, when a job struct is allocated and the buffer
      copied, a new LLDD callback notifies the LLDD and allows it to
      recycle its command iu buffer.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
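
The deferral handshake described in the two commits above can be sketched in userspace C. This is only an illustration under stated assumptions: `mock_transport`, `mock_rcv_fcp_req()`, `mock_job_done()`, `lldd_handle_cmd()` and `lldd_defer_rcv()` are hypothetical stand-ins, not the kernel's nvmet_fc or lpfc code, and the sketch tracks a single deferred buffer for brevity.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a command iu buffer owned by the LLDD. */
enum buf_state { BUF_POSTED, BUF_HELD };

struct cmd_buf {
    enum buf_state state;
};

/* Hypothetical stand-in for the transport's queue state. */
struct mock_transport {
    int free_jobs;            /* job structs currently available */
    struct cmd_buf *deferred; /* buffer the transport still references */
};

/* Stand-in for the receive call: copy on success, defer when no job is free. */
int mock_rcv_fcp_req(struct mock_transport *t, struct cmd_buf *buf)
{
    if (t->free_jobs > 0) {
        t->free_jobs--;       /* cmd iu copied into the job struct */
        return 0;
    }
    if (t->deferred)
        return -ENOENT;       /* this sketch defers only one command */
    t->deferred = buf;        /* buffer not copied yet */
    return -EOVERFLOW;
}

/* LLDD receive path: recycle the buffer unless the transport deferred it. */
int lldd_handle_cmd(struct mock_transport *t, struct cmd_buf *buf)
{
    int ret = mock_rcv_fcp_req(t, buf);

    if (ret == -EOVERFLOW) {
        buf->state = BUF_HELD; /* wait for the defer callback */
        return 0;
    }
    buf->state = BUF_POSTED;   /* safe to repost to the hardware */
    return ret;
}

/* LLDD callback: the transport has copied the buffer, so repost it. */
void lldd_defer_rcv(struct cmd_buf *buf)
{
    buf->state = BUF_POSTED;
}

/* Transport side: a job struct was torn down and can serve a deferred cmd. */
void mock_job_done(struct mock_transport *t)
{
    if (t->deferred) {
        struct cmd_buf *buf = t->deferred;

        t->deferred = NULL;    /* job struct reused for the deferred cmd */
        lldd_defer_rcv(buf);
    } else {
        t->free_jobs++;
    }
}
```

With one free job struct, the first command is copied and its buffer is reposted at once; a second command arriving before teardown leaves its buffer held until `mock_job_done()` triggers the callback.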
    • nvme: strip trailing 0-bytes in wwid_show · 758f3735
      Martin Wilck authored
      Some broken controllers (such as earlier Linux targets) pad model or
      serial fields with 0-bytes rather than spaces. The NVMe spec disallows
      0 bytes in "ASCII" fields.  Thus strip trailing 0-bytes, too. Also make
      sure that we get no underflow for pathological input.
      Signed-off-by: Martin Wilck <mwilck@suse.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
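
A minimal sketch of the trimming idea from the commit above, assuming a fixed-width field passed as a pointer and a length. `trimmed_len()` is a hypothetical helper name, not the function used in the kernel's wwid_show.

```c
#include <stddef.h>

/*
 * Return the length of buf[0..len) once trailing spaces and 0-bytes are
 * dropped. Checking len > 0 before indexing buf[len - 1] avoids the
 * underflow that a field made up entirely of padding would otherwise cause.
 */
size_t trimmed_len(const char *buf, size_t len)
{
    while (len > 0 && (buf[len - 1] == ' ' || buf[len - 1] == '\0'))
        len--;
    return len;
}
```

A field that is all padding yields a length of 0 rather than wrapping around, which is the pathological input the commit message mentions.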
  2. 26 Jul, 2017 1 commit
    • nvme: validate admin queue before unquiesce · 7dd1ab16
      Scott Bauer authored
      With a misbehaving controller it's possible we'll never
      enter the live state and create an admin queue. When we
      fail out of reset work, we may have failed early enough
      that the admin queue was never set up. We tear down
      queues after a failed reset, but need to do some more
      validation.
      
      Fixes: 443bd90f ("nvme: host: unquiesce queue in nvme_kill_queues()")
      
      [  189.650995] nvme nvme1: pci function 0000:0b:00.0
      [  317.680055] nvme nvme0: Device not ready; aborting reset
      [  317.680183] nvme nvme0: Removing after probe failure status: -19
      [  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  317.681397] general protection fault: 0000 [#1] SMP KASAN
      [  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
      [  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
      [  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
      [  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
      [  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
      [  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
      [  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
      [  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
      [  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
      [  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
      [  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
      [  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
      [  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
      [  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  317.685018] Call Trace:
      [  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
      [  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
      [  317.685289]  process_one_work+0x771/0x1170
      [  317.685372]  worker_thread+0xde/0x11e0
      [  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
      [  317.685550]  kthread+0x2d3/0x3d0
      [  317.685617]  ? process_one_work+0x1170/0x1170
      [  317.685704]  ? kthread_create_on_node+0xc0/0xc0
      [  317.685785]  ret_from_fork+0x25/0x30
      [  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
      [  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
      [  317.685908] ---[ end trace a3f8704150b1e8b4 ]---
      Signed-off-by: Scott Bauer <scott.bauer@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  3. 25 Jul, 2017 5 commits
    • nvme-pci: fix HMB size calculation · 50cdb7c6
      Christoph Hellwig authored
      It's possible the preferred HMB size may not be a multiple of the
      chunk_size. This patch moves len to function scope and uses that in
      the for loop increment so the last iteration doesn't cause the total
      size to exceed the allocated HMB size.
      
      Based on an earlier patch from Keith Busch.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Fixes: 87ad72a5 ("nvme-pci: implement host memory buffer support")
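
The loop shape described in the commit above can be sketched as follows. `plan_chunks()` and its parameters are hypothetical names for illustration, and the sketch only records chunk lengths instead of allocating DMA memory as the driver does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split `preferred` bytes into chunks of at most `chunk_size` bytes.
 * `len` lives at function scope and drives the loop increment, so the
 * final, shorter chunk cannot push the total past `preferred`.
 * Returns the number of chunk lengths written to lens[].
 */
size_t plan_chunks(uint64_t preferred, uint64_t chunk_size,
                   uint64_t *lens, size_t max_chunks)
{
    uint64_t size, len;
    size_t n = 0;

    if (chunk_size == 0)
        return 0;

    for (size = 0; size < preferred && n < max_chunks; size += len) {
        uint64_t remaining = preferred - size;

        len = remaining < chunk_size ? remaining : chunk_size;
        lens[n++] = len;
    }
    return n;
}
```

For a preferred size of 10 and a chunk size of 4, the chunks are 4, 4 and 2, so the total matches the preferred size instead of rounding up to 12.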
    • nvme-fc: revise TRADDR parsing · 9c5358e1
      James Smart authored
      The FC-NVME spec hasn't locked down the format string for TRADDR.
      Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>"
      where the WWNs are hex values but not prefixed by 0x.
      
      Most implementations so far expect a string format of
      "nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport
      uses the match_u64 parser which requires a leading 0x prefix to set
      the base properly. If it's not there, a match will either fail or return
      a base 10 value.
      
      The resolution in T11 is being pushed out. Therefore, to fix things
      now and to cover any eventuality and any implementations already in
      the field, this patch adds support for both formats.
      
      The change consists of replacing the token matching routine with a
      routine that validates the fixed string format, and then builds
      a local copy of the hex name with a 0x prefix before calling
      the system parser.
      
      Note: the same parser routine exists in both the initiator and target
      transports. Given this is about the only "shared" item, we chose to
      replicate rather than create an interdependency on some shared code.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
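
A userspace sketch of a parser that accepts both TRADDR formats from the commit above. `parse_traddr()` and `parse_wwn()` are hypothetical names, and the sketch calls `strtoull()` with an explicit base of 16 rather than building a 0x-prefixed copy for the kernel parser as the commit describes.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse exactly 16 hex digits starting at s; return 0 on success. */
static int parse_wwn(const char *s, uint64_t *val)
{
    char name[17];
    int i;

    for (i = 0; i < 16; i++) {
        if (!isxdigit((unsigned char)s[i]))
            return -1;
        name[i] = s[i];
    }
    name[16] = '\0';
    *val = strtoull(name, NULL, 16);
    return 0;
}

/*
 * Accept "nn-0x<16hex>:pn-0x<16hex>" (43 chars) or
 * "nn-<16hex>:pn-<16hex>" (39 chars). Validating the fixed layout first
 * keeps every later index inside the string.
 */
int parse_traddr(const char *buf, uint64_t *nn, uint64_t *pn)
{
    size_t len = strlen(buf);
    size_t nn_off, pn_off;

    if (len == 43 && !strncmp(buf, "nn-0x", 5) &&
        !strncmp(buf + 21, ":pn-0x", 6)) {
        nn_off = 5;
        pn_off = 27;
    } else if (len == 39 && !strncmp(buf, "nn-", 3) &&
               !strncmp(buf + 19, ":pn-", 4)) {
        nn_off = 3;
        pn_off = 23;
    } else {
        return -1;
    }

    if (parse_wwn(buf + nn_off, nn) || parse_wwn(buf + pn_off, pn))
        return -1;
    return 0;
}
```

Both spellings of the same address then yield identical 64-bit names, and anything that does not match one of the two fixed layouts is rejected.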
    • nvme-fc: address target disconnect race conditions in fcp io submit · 8b25f351
      James Smart authored
      There are cases where threads are in the process of submitting new
      io when the LLDD calls in to remove the remote port. In some cases,
      the next io actually goes to the LLDD, which knows the remoteport
      isn't present and rejects it. To properly recover/restart these
      I/Os, we don't want to hard fail them; we want to treat them as
      temporary resource errors for which a delayed retry will work.
      
      Add a couple more checks on remoteport connectivity and commonize the
      busy response handling when it's seen.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: fabrics commands should use the fctype field for data direction · 2fd4167f
      Jon Derrick authored
      Fabrics commands with opcode 0x7F use the fctype field to indicate data
      direction.
      Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
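
A small sketch of the direction check implied by the commit above. It assumes that the low bit of the relevant field marks a host-to-controller (write) transfer; that convention and the `cmd_is_write()` name are assumptions for illustration, since the commit message only says that fctype carries the direction for opcode 0x7F.

```c
#include <stdbool.h>
#include <stdint.h>

#define FABRICS_OPCODE 0x7f /* opcode shared by all fabrics commands */

/*
 * Decide the data direction of a command. For fabrics commands the
 * opcode is always 0x7f, so its low bit says nothing about direction;
 * the fctype field is consulted instead (assumed convention: bit 0 set
 * means the host sends data).
 */
bool cmd_is_write(uint8_t opcode, uint8_t fctype)
{
    if (opcode == FABRICS_OPCODE)
        return fctype & 1;
    return opcode & 1;
}
```

Because 0x7f has its low bit set, testing the opcode alone would classify every fabrics command as a write regardless of its fctype.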
    • nvme: also provide a UUID in the WWID sysfs attribute · 6484f5d1
      Johannes Thumshirn authored
      The WWID sysfs attribute can provide multiple means of a World Wide ID
      for an NVMe device. It can either be an NGUID, an EUI-64 or a
      concatenation of VID, Serial Number, Model and the Namespace ID, in
      this order of preference.
      
      If the target also sends us a UUID, use the UUID for identification
      and give it the highest priority.
      
      This eases generation of /dev/disk/by-* symlinks.
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
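
The preference order from the commit above can be sketched as a simple selection function. The enum, `pick_wwid_source()`, and the rule that an all-zero identifier counts as "not provided" are assumptions made for this illustration; the sketch does not reproduce the sysfs output format.

```c
#include <stddef.h>
#include <stdint.h>

enum wwid_source {
    WWID_UUID,                  /* highest priority */
    WWID_NGUID,
    WWID_EUI64,
    WWID_VID_SERIAL_MODEL_NSID  /* fallback concatenation */
};

/* Assumption: an identifier whose bytes are all zero was not reported. */
static int id_is_set(const uint8_t *id, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (id[i])
            return 1;
    return 0;
}

/* Pick the identifier to expose: UUID, then NGUID, then EUI-64. */
enum wwid_source pick_wwid_source(const uint8_t uuid[16],
                                  const uint8_t nguid[16],
                                  const uint8_t eui64[8])
{
    if (id_is_set(uuid, 16))
        return WWID_UUID;
    if (id_is_set(nguid, 16))
        return WWID_NGUID;
    if (id_is_set(eui64, 8))
        return WWID_EUI64;
    return WWID_VID_SERIAL_MODEL_NSID;
}
```

A device that reports none of the three identifiers falls through to the VID, Serial Number, Model and Namespace ID concatenation.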
  4. 24 Jul, 2017 3 commits
    • blk-mq: map queues to all present CPUs · 76451d79
      Christoph Hellwig authored
      We already do this for PCI mappings, and the higher level code now
      expects that CPU on/offlining doesn't have an effect on the queue
      mappings.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: disable runtime-pm for blk-mq · 765e40b6
      Christoph Hellwig authored
      The blk-mq code lacks support for looking at the rpm_status field, tracking
      active requests and the RQF_PM flag.
      
      Because SCSI now defaults to blk-mq, people are starting to run into
      suspend/resume issues as a result, so make sure we disable the runtime
      PM functionality until it is properly implemented.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • xen-blkfront: Fix handling of non-supported operations · 31c4ccc3
      Bart Van Assche authored
      This patch fixes the following sparse warnings:
      
      drivers/block/xen-blkfront.c:916:45: warning: incorrect type in argument 2 (different base types)
      drivers/block/xen-blkfront.c:916:45:    expected restricted blk_status_t [usertype] error
      drivers/block/xen-blkfront.c:916:45:    got int [signed] error
      drivers/block/xen-blkfront.c:1599:47: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1599:47:    expected int [signed] error
      drivers/block/xen-blkfront.c:1599:47:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1607:55: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1607:55:    expected int [signed] error
      drivers/block/xen-blkfront.c:1607:55:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1625:55: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1625:55:    expected int [signed] error
      drivers/block/xen-blkfront.c:1625:55:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1628:62: warning: restricted blk_status_t degrades to integer
      
      Compile-tested only.
      
      Fixes: 2a842aca ("block: introduce new block status code type")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Roger Pau Monné <roger.pau@citrix.com>
      Cc: <xen-devel@lists.xenproject.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 22 Jul, 2017 4 commits
  6. 21 Jul, 2017 23 commits