  1. 10 Aug, 2023 2 commits
    • vhost-scsi: Rename vhost_scsi_iov_to_sgl · c5ace19e
      Mike Christie authored
      Rename vhost_scsi_iov_to_sgl to vhost_scsi_map_iov_to_sgl so it matches
      the naming style used for vhost_scsi_copy_iov_to_sgl.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230709202859.138387-3-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
    • vhost-scsi: Fix alignment handling with windows · 5ced58bf
      Mike Christie authored
      The Linux block layer requires bios/requests to have lengths with a 512
      byte alignment. Some drivers/layers like dm-crypt and the direct IO code
      will test for it and just fail. Other drivers like SCSI just assume the
      requirement is met and will end up in infinite retry loops. The problem
      for drivers like SCSI is that they use functions like blk_rq_cur_sectors
      and blk_rq_sectors, which divide the request's length by 512. If there
      are leftovers then they just get dropped. But other code in the block/scsi
      layer may use blk_rq_bytes/blk_rq_cur_bytes and end up thinking there is
      still data left and try to retry the cmd. We can then end up getting
      stuck in retry loops where part of the block/scsi layer thinks there is
      data left, but other parts think we want to do IOs of zero length.
      
      Linux will always check for alignment, but Windows will not. When
      vhost-scsi then translates the iovec it gets from a Windows guest to a
      scatterlist, we can end up with sg items where the sg->length is not
      divisible by 512 due to the misaligned offset:
      
      sg[0].offset = 255;
      sg[0].length = 3841;
      sg...
      sg[N].offset = 0;
      sg[N].length = 255;
      
      When the lio backends then convert the SG to bios or other iovecs, we
      end up sending them with the same misaligned values and can hit the
      issues above.
      
      This just has us drop down to allocating a temp page and copying the data
      when we detect a misaligned buffer and the IO is large enough that it
      will get split into multiple bad IOs.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230709202859.138387-2-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
  2. 03 Jul, 2023 4 commits
    • vhost_scsi: add support for worker ioctls · d74b55e6
      Mike Christie authored
      This has vhost-scsi support the worker ioctls by calling the
      vhost_worker_ioctl helper.
      
      With a single worker, the single thread becomes a bottleneck when trying
      to use 3 or more virtqueues like:
      
      fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
      --ioengine=libaio --iodepth=128  --numjobs=3
      
      With the patches and doing a worker per vq, we can scale to at least
      16 vCPUs/vqs (that's my system limit) with the same fio command
      above with numjobs=16:
      
      fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
      --ioengine=libaio --iodepth=64  --numjobs=16
      
      which gives around 2002K IOPs.
      
      Note that for testing I dropped depth to 64 above because the vhost/virt
      layer supports only 1024 total commands per device. And the only tuning I
      did was set LIO's emulate_pr to 0 to avoid LIO's PR lock in the main IO
      path which becomes an issue at around 12 jobs/virtqueues.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-17-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: flush IO vqs then send TMF rsp · 0a3eac52
      Mike Christie authored
      With one worker we will always send the scsi cmd responses then send the
      TMF rsp, because LIO will always complete the scsi cmds first then call
      into us to send the TMF response.
      
      With multiple workers, the IO vq workers could be running while the
      TMF/ctl vq worker is running, so this has us do a flush before completing
      the TMF to make sure cmds are completed when its work is later queued
      and run.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-12-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: convert to vhost_vq_work_queue · 78af31cc
      Mike Christie authored
      Convert from vhost_work_queue to vhost_vq_work_queue so we can
      remove vhost_work_queue.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-11-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: make SCSI cmd completion per vq · 48ae70dd
      Mike Christie authored
      This patch separates the scsi cmd completion code paths so we can complete
      cmds based on their vq instead of having all cmds complete on the same
      worker/CPU. This will be useful with the next patches that allow us to
      create multiple worker threads and bind them to different vqs, so we can
      have completions running on different threads/CPUs.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-Id: <20230626232307.97930-10-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  3. 21 Apr, 2023 5 commits
    • vhost-scsi: Reduce vhost_scsi_mutex use · bea273c7
      Mike Christie authored
      We no longer need to hold the vhost_scsi_mutex the entire time we
      set/clear the endpoint. The tv_tpg_mutex handles tpg accesses not related
      to the tpg list, the port link/unlink functions use the tv_tpg_mutex while
      accessing the tpg->vhost_scsi pointer, vhost_scsi_do_plug will no longer
      queue events after the virtqueue's backend has been cleared and flushed,
      and we don't drop our refcount to the tpg until after we have stopped
      cmds and waited for outstanding cmds to complete.
      
      This moves the vhost_scsi_mutex use to its documented purpose of
      protecting access to the tpg list. We then don't need to hold it while a
      flush is being performed, which would force other devices'
      vhost_scsi_set_endpoint and vhost_scsi_make_tpg/vhost_scsi_drop_tpg
      calls to wait on a flaky device.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-8-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost-scsi: Drop vhost_scsi_mutex use in port callouts · f5ed6f9e
      Mike Christie authored
      We are using the vhost_scsi_mutex to make sure vhost_scsi_port_link and
      vhost_scsi_port_unlink see if vhost_scsi_clear_endpoint has cleared
      tpg->vhost_scsi, and that it can't be freed while they are using it.
      
      However, we currently set the tpg->vhost_scsi pointer while holding
      tv_tpg_mutex. So, we can just hold that while calling
      vhost_scsi_hotplug/hotunplug. We then don't need to hold the
      vhost_scsi_mutex while vhost_scsi_clear_endpoint is holding it and doing
      a flush which could cause the LUN map/unmap to have to wait on another
      device's flush.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-7-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost-scsi: Check for a cleared backend before queueing an event · ced9eb37
      Mike Christie authored
      We currently hold the vhost_scsi_mutex while clearing the endpoint and
      while performing vhost_scsi_do_plug, so tpg->vhost_scsi can't be freed
      from under us, and to make sure anything queued is handled by the
      full call in vhost_scsi_clear_endpoint.
      
      This patch removes the need for the vhost_scsi_mutex for the latter
      case. In the next patches, we won't hold the vhost_scsi_mutex while
      flushing so this patch adds a check for the clearing of the virtqueue
      from vhost_scsi_clear_endpoint. We then know that once
      vhost_scsi_clear_endpoint has cleared the backend that no new events
      will be queued, and the flush after the vhost_vq_set_backend(vq, NULL)
      call will see everything that's been queued to that point. So the flush
      will then handle all events without the need for the vhost_scsi_mutex.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-6-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost-scsi: Drop device mutex use in vhost_scsi_do_plug · eb1b2914
      Mike Christie authored
      We don't need the device mutex in vhost_scsi_do_plug because:
      1. we have the vhost_scsi_mutex so the tpg->vhost_scsi pointer will not
      change on us and the vhost_scsi can't be freed from under us if it was
      set.
      2. vhost_scsi_clear_endpoint will stop the virtqueues and flush them while
      holding the vhost_scsi_mutex so we know once vhost_scsi_clear_endpoint
      has completed that vhost_scsi_do_plug can't send new events and any
      queued ones have completed.
      
      So this patch drops the device mutex use in vhost_scsi_do_plug.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-5-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost-scsi: Delay releasing our refcount on the tpg · 9a10cb4d
      Mike Christie authored
      We currently hold the vhost_scsi_mutex the entire time we are running
      vhost_scsi_clear_endpoint. One of the reasons for this is that it prevents
      userspace from being able to free the se_tpg from under us after we have
      called target_undepend_item. However, it forces management operations for
      other devices to have to wait on a flaky device's:
      
      vhost_scsi_clear_endpoint -> vhost_scsi_flush()
      
      call, which can take a long time.
      
      This moves the target_undepend_item call and the tpg unsetup code to after
      we have stopped new IO from starting up and after we have waited on
      running IO. We can then release our refcount on the tpg and session
      knowing our device is no longer accessing them. We can then drop the
      vhost_scsi_mutex use during the flush call in later patches in this set,
      when we have removed other reasons for holding it.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-4-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  4. 04 Apr, 2023 2 commits
    • vhost-scsi: Fix crash during LUN unmapping · 4c363c81
      Mike Christie authored
      We normally clear the endpoint then unmap LUNs, so the devices are fully
      shut down when the LUN is unmapped, but it's legal to unmap before
      clearing. If the user does that while TMFs are running then we can end
      up crashing.
      
      vhost_scsi_port_unlink assumes that the LUN's tmf struct will always be on
      the tmf_queue list. However, if a TMF is running then it will have been
      removed while it's executing. If we do a LUN unmap at this time, then
      we assume the entry is on the list and just start accessing it and free
      it.
      
      This fixes the bug by just allocating the vhost_scsi_tmf struct when it's
      needed, as is done with the se_tmr struct that's needed when we submit
      the TMF. In this path perf is not an issue and we can use GFP_KERNEL
      since it won't swing directly back on us, so we don't need to preallocate
      the struct.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-3-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost-scsi: Fix vhost_scsi struct use after free · e508efc3
      Mike Christie authored
      If vhost_scsi_setup_vq_cmds fails we leave the tpg->vhost_scsi pointer
      set. If the device is freed and then the user unmaps the LUN, the call to
      vhost_scsi_port_unlink -> vhost_scsi_hotunplug will see that
      tpg->vhost_scsi is still set and try to use it.
      
      This has us clear the vhost_scsi pointer in the failure path. It also
      has us take tv_tpg_mutex in this failure path, because tv_tpg_vhost_count
      is accessed under this mutex in vhost_scsi_drop_nexus and in the future
      we will want to serialize access to tpg->vhost_scsi with that mutex
      instead of the vhost_scsi_mutex.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230321020624.13323-2-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  5. 30 Mar, 2023 1 commit
    • iov_iter: add iter_iovec() helper · de4f5fed
      Jens Axboe authored
      This returns a pointer to the current iovec entry in the iterator. Only
      useful with ITER_IOVEC right now, but it prepares us to treat ITER_UBUF
      and ITER_IOVEC identically for the first segment.
      
      Rename struct iov_iter->iov to iov_iter->__iov to find any potentially
      troublesome spots, and also to prevent anyone from adding new code that
      accesses iter->iov directly.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 17 Mar, 2023 1 commit
  7. 21 Feb, 2023 1 commit
  8. 27 Jan, 2023 1 commit
    • vhost-scsi: unbreak any layout for response · 6dd88fd5
      Jason Wang authored
      Al Viro said:
      
      """
      Since "vhost/scsi: fix reuse of &vq->iov[out] in response"
      we have this:
                      cmd->tvc_resp_iov = vq->iov[vc.out];
                      cmd->tvc_in_iovs = vc.in;
      combined with
                      iov_iter_init(&iov_iter, ITER_DEST, &cmd->tvc_resp_iov,
                                    cmd->tvc_in_iovs, sizeof(v_rsp));
      in vhost_scsi_complete_cmd_work().  We used to have ->tvc_resp_iov
      _pointing_ to vq->iov[vc.out]; back then iov_iter_init() asked to
      set an iovec-backed iov_iter over the tail of vq->iov[], with
      length being the amount of iovecs in the tail.
      
      Now we have a copy of one element of that array.  Fortunately, the members
      following it in the containing structure are two non-NULL kernel pointers,
      so copy_to_iter() will not copy anything beyond the first iovec - kernel
      pointer is not (on the majority of architectures) going to be accepted by
      access_ok() in copyout() and it won't be skipped since the "length" (in
      reality - another non-NULL kernel pointer) won't be zero.
      
      So it's not going to give a guest-to-qemu escalation, but it's definitely
      a bug.  Frankly, my preference would be to verify that the very first iovec
      is long enough to hold rsp_size.  Due to the above, any users that try to
      give us vq->iov[vc.out].iov_len < sizeof(struct virtio_scsi_cmd_resp)
      would currently get a failure in vhost_scsi_complete_cmd_work()
      anyway.
      """
      
      However, the spec doesn't say anything about the legacy descriptor
      layout for the response. So this patch tries to not assume the response
      resides in a single separate descriptor, which is what commit
      79c14141 ("vhost/scsi: Convert completion path to use") tries to
      achieve towards ANY_LAYOUT.
      
      This is done by allocating and using a dedicated resp iov in the
      command. To be safe, start with UIO_MAXIOV to be consistent with the
      limitation that we advertise to vhost_get_vq_desc().
      
      Tested with a hacked virtio-scsi driver that uses 1 descriptor for 1
      byte in the response.
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin Coddington <bcodding@redhat.com>
      Cc: Nicholas Bellinger <nab@linux-iscsi.org>
      Fixes: a77ec83a ("vhost/scsi: fix reuse of &vq->iov[out] in response")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Message-Id: <20230119073647.76467-1-jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  9. 25 Nov, 2022 1 commit
    • use less confusing names for iov_iter direction initializers · de4eda9d
      Al Viro authored
      READ/WRITE proved to be actively confusing - the meanings are
      "data destination, as used with read(2)" and "data source, as
      used with write(2)", but people keep interpreting those as
      "we read data from it" and "we write data to it", i.e. exactly
      the wrong way.
      
      Call them ITER_DEST and ITER_SOURCE - at least that is harder
      to misinterpret...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  10. 11 Aug, 2022 2 commits
  11. 09 Aug, 2022 1 commit
  12. 31 May, 2022 2 commits
  13. 05 Sep, 2021 1 commit
  14. 03 Jul, 2021 3 commits
  15. 04 Mar, 2021 5 commits
  16. 23 Feb, 2021 1 commit
  17. 18 Dec, 2020 1 commit
  18. 25 Nov, 2020 1 commit
  19. 15 Nov, 2020 4 commits
  20. 29 Jul, 2020 1 commit