Commit cac85e46 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'vfio-v6.3-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Remove redundant resource check in vfio-platform (Angus Chen)

 - Use GFP_KERNEL_ACCOUNT for persistent userspace allocations, allowing
   removal of arbitrary kernel limits in favor of cgroup control (Yishai
   Hadas)

 - mdev tidy-ups, including removing the module-only build restriction
   for sample drivers, Kconfig changes to select mdev support,
   documentation movement to keep sample driver usage instructions with
   sample drivers rather than with API docs, remove references to
   out-of-tree drivers in docs (Christoph Hellwig)

 - Fix collateral breakages from mdev Kconfig changes (Arnd Bergmann)

 - Make mlx5 migration support match device support, improve source and
   target flows to improve pre-copy support and reduce downtime (Yishai
   Hadas)

 - Convert additional mdev sysfs case to use sysfs_emit() (Bo Liu)

 - Resolve copy-paste error in mdev mbochs sample driver Kconfig (Ye
   Xingchen)

 - Avoid propagating missing reset error in vfio-platform if reset
   requirement is relaxed by module option (Tomasz Duszynski)

 - Range size fixes in mlx5 variant driver for missed last byte and
   stricter range calculation (Yishai Hadas)

 - Fixes to suspended vaddr support and locked_vm accounting, excluding
   mdev configurations from the former due to potential to indefinitely
   block kernel threads, fix underflow and restore locked_vm on new mm
   (Steve Sistare)

 - Update outdated vfio documentation due to new IOMMUFD interfaces in
   recent kernels (Yi Liu)

 - Resolve deadlock between group_lock and kvm_lock, finally (Matthew
   Rosato)

 - Fix NULL pointer in group initialization error path with IOMMUFD (Yan
   Zhao)

* tag 'vfio-v6.3-rc1' of https://github.com/awilliam/linux-vfio: (32 commits)
  vfio: Fix NULL pointer dereference caused by uninitialized group->iommufd
  docs: vfio: Update vfio.rst per latest interfaces
  vfio: Update the kdoc for vfio_device_ops
  vfio/mlx5: Fix range size calculation upon tracker creation
  vfio: no need to pass kvm pointer during device open
  vfio: fix deadlock between group lock and kvm lock
  vfio: revert "iommu driver notify callback"
  vfio/type1: revert "implement notify callback"
  vfio/type1: revert "block on invalid vaddr"
  vfio/type1: restore locked_vm
  vfio/type1: track locked_vm per dma
  vfio/type1: prevent underflow of locked_vm via exec()
  vfio/type1: exclude mdevs from VFIO_UPDATE_VADDR
  vfio: platform: ignore missing reset if disabled at module init
  vfio/mlx5: Improve the target side flow to reduce downtime
  vfio/mlx5: Improve the source side flow upon pre_copy
  vfio/mlx5: Check whether VF is migratable
  samples: fix the prompt about SAMPLE_VFIO_MDEV_MBOCHS
  vfio/mdev: Use sysfs_emit() to instead of sprintf()
  vfio-mdev: add back CONFIG_VFIO dependency
  ...
parents 84cc6674 d649c34c
......@@ -60,7 +60,7 @@ devices as examples, as these devices are the first devices to use this module::
| mdev.ko |
| +-----------+ | mdev_register_parent() +--------------+
| | | +<------------------------+ |
| | | | | nvidia.ko |<-> physical
| | | | | ccw_device.ko|<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | Physical | |
......@@ -69,12 +69,6 @@ devices as examples, as these devices are the first devices to use this module::
| | | | | i915.ko |<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | | |
| | | | mdev_register_parent() +--------------+
| | | +<------------------------+ |
| | | | | ccw_device.ko|<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| +-----------+ |
+---------------+
......@@ -270,106 +264,6 @@ these callbacks are supported in the TYPE1 IOMMU module. To enable them for
other IOMMU backend modules, such as PPC64 sPAPR module, they need to provide
these two callback functions.
Using the Sample Code
=====================
mtty.c in samples/vfio-mdev/ directory is a sample driver program to
demonstrate how to use the mediated device framework.
The sample driver creates an mdev device that simulates a serial port over a PCI
card.
1. Build and load the mtty.ko module.
This step creates a dummy device, /sys/devices/virtual/mtty/mtty/
Files in this device directory in sysfs are similar to the following::
# tree /sys/devices/virtual/mtty/mtty/
/sys/devices/virtual/mtty/mtty/
|-- mdev_supported_types
| |-- mtty-1
| | |-- available_instances
| | |-- create
| | |-- device_api
| | |-- devices
| | `-- name
| `-- mtty-2
| |-- available_instances
| |-- create
| |-- device_api
| |-- devices
| `-- name
|-- mtty_dev
| `-- sample_mtty_dev
|-- power
| |-- autosuspend_delay_ms
| |-- control
| |-- runtime_active_time
| |-- runtime_status
| `-- runtime_suspended_time
|-- subsystem -> ../../../../class/mtty
`-- uevent
2. Create a mediated device by using the dummy device that you created in the
previous step::
# echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
/sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
3. Add parameters to qemu-kvm::
-device vfio-pci,\
sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
4. Boot the VM.
In the Linux guest VM, with no hardware on the host, the device appears
as follows::
# lspci -s 00:05.0 -xxvv
00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
Subsystem: Device 4348:3253
Physical Slot: 5
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c150 [size=8]
Region 1: I/O ports at c158 [size=8]
Kernel driver in use: serial
00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00
In the Linux guest VM, dmesg output for the device is as follows:
serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A
5. In the Linux guest VM, check the serial ports::
# setserial -g /dev/ttyS*
/dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
/dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
/dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10
6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
/dev/ttyS2 with hardware flow control disabled.
7. Type data on the minicom terminal or send data to the terminal emulation
program and read the data.
Data is loop backed from hosts mtty driver.
8. Destroy the mediated device that you created::
# echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
References
==========
......
......@@ -249,19 +249,21 @@ VFIO bus driver API
VFIO bus drivers, such as vfio-pci make use of only a few interfaces
into VFIO core. When devices are bound and unbound to the driver,
the driver should call vfio_register_group_dev() and
vfio_unregister_group_dev() respectively::
Following interfaces are called when devices are bound to and
unbound from the driver::
void vfio_init_group_dev(struct vfio_device *device,
struct device *dev,
const struct vfio_device_ops *ops);
void vfio_uninit_group_dev(struct vfio_device *device);
int vfio_register_group_dev(struct vfio_device *device);
int vfio_register_emulated_iommu_dev(struct vfio_device *device);
void vfio_unregister_group_dev(struct vfio_device *device);
The driver should embed the vfio_device in its own structure and call
vfio_init_group_dev() to pre-configure it before going to registration
and call vfio_uninit_group_dev() after completing the un-registration.
The driver should embed the vfio_device in its own structure and use
vfio_alloc_device() to allocate the structure, and can register
@init/@release callbacks to manage any private state wrapping the
vfio_device::
vfio_alloc_device(dev_struct, member, dev, ops);
void vfio_put_device(struct vfio_device *device);
vfio_register_group_dev() indicates to the core to begin tracking the
iommu_group of the specified dev and register the dev as owned by a VFIO bus
driver. Once vfio_register_group_dev() returns it is possible for userspace to
......@@ -270,28 +272,64 @@ ready before calling it. The driver provides an ops structure for callbacks
similar to a file operations structure::
struct vfio_device_ops {
int (*open)(struct vfio_device *vdev);
char *name;
int (*init)(struct vfio_device *vdev);
void (*release)(struct vfio_device *vdev);
int (*bind_iommufd)(struct vfio_device *vdev,
struct iommufd_ctx *ictx, u32 *out_device_id);
void (*unbind_iommufd)(struct vfio_device *vdev);
int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
int (*open_device)(struct vfio_device *vdev);
void (*close_device)(struct vfio_device *vdev);
ssize_t (*read)(struct vfio_device *vdev, char __user *buf,
size_t count, loff_t *ppos);
ssize_t (*write)(struct vfio_device *vdev,
const char __user *buf,
size_t size, loff_t *ppos);
ssize_t (*write)(struct vfio_device *vdev, const char __user *buf,
size_t count, loff_t *size);
long (*ioctl)(struct vfio_device *vdev, unsigned int cmd,
unsigned long arg);
int (*mmap)(struct vfio_device *vdev,
struct vm_area_struct *vma);
int (*mmap)(struct vfio_device *vdev, struct vm_area_struct *vma);
void (*request)(struct vfio_device *vdev, unsigned int count);
int (*match)(struct vfio_device *vdev, char *buf);
void (*dma_unmap)(struct vfio_device *vdev, u64 iova, u64 length);
int (*device_feature)(struct vfio_device *device, u32 flags,
void __user *arg, size_t argsz);
};
Each function is passed the vdev that was originally registered
in the vfio_register_group_dev() call above. This allows the bus driver
to obtain its private data using container_of(). The open/release
callbacks are issued when a new file descriptor is created for a
device (via VFIO_GROUP_GET_DEVICE_FD). The ioctl interface provides
a direct pass through for VFIO_DEVICE_* ioctls. The read/write/mmap
interfaces implement the device region access defined by the device's
own VFIO_DEVICE_GET_REGION_INFO ioctl.
in the vfio_register_group_dev() or vfio_register_emulated_iommu_dev()
call above. This allows the bus driver to obtain its private data using
container_of().
::
- The init/release callbacks are issued when vfio_device is initialized
and released.
- The open/close device callbacks are issued when the first
instance of a file descriptor for the device is created (eg.
via VFIO_GROUP_GET_DEVICE_FD) for a user session.
- The ioctl callback provides a direct pass through for some VFIO_DEVICE_*
ioctls.
- The [un]bind_iommufd callbacks are issued when the device is bound to
and unbound from iommufd.
- The attach_ioas callback is issued when the device is attached to an
IOAS managed by the bound iommufd. The attached IOAS is automatically
detached when the device is unbound from iommufd.
- The read/write/mmap callbacks implement the device region access defined
by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl.
- The request callback is issued when device is going to be unregistered,
such as when trying to unbind the device from the vfio bus driver.
- The dma_unmap callback is issued when a range of iovas are unmapped
in the container or IOAS attached by the device. Drivers which make
use of the vfio page pinning interface must implement this callback in
order to unpin pages within the dma_unmap range. Drivers must tolerate
this callback even before calls to open_device().
PPC64 sPAPR implementation note
-------------------------------
......
......@@ -553,7 +553,6 @@ These are the steps:
* ZCRYPT
* S390_AP_IOMMU
* VFIO
* VFIO_MDEV
* KVM
If using make menuconfig select the following to build the vfio_ap module::
......
......@@ -21882,7 +21882,6 @@ F: tools/testing/selftests/filesystems/fat/
VFIO DRIVER
M: Alex Williamson <alex.williamson@redhat.com>
R: Cornelia Huck <cohuck@redhat.com>
L: kvm@vger.kernel.org
S: Maintained
T: git https://github.com/awilliam/linux-vfio.git
......
......@@ -714,7 +714,9 @@ config EADM_SCH
config VFIO_CCW
def_tristate n
prompt "Support for VFIO-CCW subchannels"
depends on S390_CCW_IOMMU && VFIO_MDEV
depends on S390_CCW_IOMMU
depends on VFIO
select VFIO_MDEV
help
This driver allows usage of I/O subchannels via VFIO-CCW.
......@@ -724,8 +726,10 @@ config VFIO_CCW
config VFIO_AP
def_tristate n
prompt "VFIO support for AP devices"
depends on S390_AP_IOMMU && VFIO_MDEV && KVM
depends on S390_AP_IOMMU && KVM
depends on VFIO
depends on ZCRYPT
select VFIO_MDEV
help
This driver grants access to Adjunct Processor (AP) devices
via the VFIO mediated device interface.
......
......@@ -594,7 +594,6 @@ CONFIG_SYNC_FILE=y
CONFIG_VFIO=m
CONFIG_VFIO_PCI=m
CONFIG_MLX5_VFIO_PCI=m
CONFIG_VFIO_MDEV=m
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_INPUT=y
......
......@@ -583,7 +583,6 @@ CONFIG_SYNC_FILE=y
CONFIG_VFIO=m
CONFIG_VFIO_PCI=m
CONFIG_MLX5_VFIO_PCI=m
CONFIG_VFIO_MDEV=m
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_INPUT=y
......
......@@ -127,9 +127,10 @@ config DRM_I915_GVT_KVMGT
depends on X86
depends on 64BIT
depends on KVM
depends on VFIO_MDEV
depends on VFIO
select DRM_I915_GVT
select KVM_EXTERNAL_WRITE_TRACKING
select VFIO_MDEV
help
Choose this option if you want to enable Intel GVT-g graphics
......
......@@ -360,7 +360,7 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
{
struct vfio_container *container;
container = kzalloc(sizeof(*container), GFP_KERNEL);
container = kzalloc(sizeof(*container), GFP_KERNEL_ACCOUNT);
if (!container)
return -ENOMEM;
......@@ -376,11 +376,6 @@ static int vfio_fops_open(struct inode *inode, struct file *filep)
static int vfio_fops_release(struct inode *inode, struct file *filep)
{
struct vfio_container *container = filep->private_data;
struct vfio_iommu_driver *driver = container->iommu_driver;
if (driver && driver->ops->notify)
driver->ops->notify(container->iommu_data,
VFIO_IOMMU_CONTAINER_CLOSE);
filep->private_data = NULL;
......
......@@ -28,7 +28,7 @@ static int vfio_fsl_mc_open_device(struct vfio_device *core_vdev)
int i;
vdev->regions = kcalloc(count, sizeof(struct vfio_fsl_mc_region),
GFP_KERNEL);
GFP_KERNEL_ACCOUNT);
if (!vdev->regions)
return -ENOMEM;
......
......@@ -29,7 +29,7 @@ static int vfio_fsl_mc_irqs_allocate(struct vfio_fsl_mc_device *vdev)
irq_count = mc_dev->obj_desc.irq_count;
mc_irq = kcalloc(irq_count, sizeof(*mc_irq), GFP_KERNEL);
mc_irq = kcalloc(irq_count, sizeof(*mc_irq), GFP_KERNEL_ACCOUNT);
if (!mc_irq)
return -ENOMEM;
......@@ -77,7 +77,7 @@ static int vfio_set_trigger(struct vfio_fsl_mc_device *vdev,
if (fd < 0) /* Disable only */
return 0;
irq->name = kasprintf(GFP_KERNEL, "vfio-irq[%d](%s)",
irq->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-irq[%d](%s)",
hwirq, dev_name(&vdev->mc_dev->dev));
if (!irq->name)
return -ENOMEM;
......
......@@ -140,7 +140,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
ret = iommufd_vfio_compat_ioas_create(iommufd);
if (ret) {
iommufd_ctx_put(group->iommufd);
iommufd_ctx_put(iommufd);
goto out_unlock;
}
......@@ -157,6 +157,18 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
return ret;
}
static void vfio_device_group_get_kvm_safe(struct vfio_device *device)
{
spin_lock(&device->group->kvm_ref_lock);
if (!device->group->kvm)
goto unlock;
_vfio_device_get_kvm_safe(device, device->group->kvm);
unlock:
spin_unlock(&device->group->kvm_ref_lock);
}
static int vfio_device_group_open(struct vfio_device *device)
{
int ret;
......@@ -167,13 +179,23 @@ static int vfio_device_group_open(struct vfio_device *device)
goto out_unlock;
}
mutex_lock(&device->dev_set->lock);
/*
* Here we pass the KVM pointer with the group under the lock. If the
* device driver will use it, it must obtain a reference and release it
* during close_device.
* Before the first device open, get the KVM pointer currently
* associated with the group (if there is one) and obtain a reference
* now that will be held until the open_count reaches 0 again. Save
* the pointer in the device for use by drivers.
*/
ret = vfio_device_open(device, device->group->iommufd,
device->group->kvm);
if (device->open_count == 0)
vfio_device_group_get_kvm_safe(device);
ret = vfio_device_open(device, device->group->iommufd);
if (device->open_count == 0)
vfio_device_put_kvm(device);
mutex_unlock(&device->dev_set->lock);
out_unlock:
mutex_unlock(&device->group->group_lock);
......@@ -183,7 +205,14 @@ static int vfio_device_group_open(struct vfio_device *device)
void vfio_device_group_close(struct vfio_device *device)
{
mutex_lock(&device->group->group_lock);
mutex_lock(&device->dev_set->lock);
vfio_device_close(device, device->group->iommufd);
if (device->open_count == 0)
vfio_device_put_kvm(device);
mutex_unlock(&device->dev_set->lock);
mutex_unlock(&device->group->group_lock);
}
......@@ -453,6 +482,7 @@ static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
refcount_set(&group->drivers, 1);
mutex_init(&group->group_lock);
spin_lock_init(&group->kvm_ref_lock);
INIT_LIST_HEAD(&group->device_list);
mutex_init(&group->device_lock);
group->iommu_group = iommu_group;
......@@ -806,9 +836,9 @@ void vfio_file_set_kvm(struct file *file, struct kvm *kvm)
if (!vfio_file_is_group(file))
return;
mutex_lock(&group->group_lock);
spin_lock(&group->kvm_ref_lock);
group->kvm = kvm;
mutex_unlock(&group->group_lock);
spin_unlock(&group->kvm_ref_lock);
}
EXPORT_SYMBOL_GPL(vfio_file_set_kvm);
......
# SPDX-License-Identifier: GPL-2.0-only
config VFIO_MDEV
tristate "Mediated device driver framework"
default n
help
Provides a framework to virtualize devices.
See Documentation/driver-api/vfio-mediated-device.rst for more details.
If you don't know what do here, say N.
tristate
......@@ -96,7 +96,7 @@ static MDEV_TYPE_ATTR_RO(device_api);
static ssize_t name_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "%s\n",
return sysfs_emit(buf, "%s\n",
mtype->pretty_name ? mtype->pretty_name : mtype->sysfs_name);
}
......
......@@ -744,7 +744,7 @@ hisi_acc_vf_pci_resume(struct hisi_acc_vf_core_device *hisi_acc_vdev)
{
struct hisi_acc_vf_migration_file *migf;
migf = kzalloc(sizeof(*migf), GFP_KERNEL);
migf = kzalloc(sizeof(*migf), GFP_KERNEL_ACCOUNT);
if (!migf)
return ERR_PTR(-ENOMEM);
......@@ -863,7 +863,7 @@ hisi_acc_open_saving_migf(struct hisi_acc_vf_core_device *hisi_acc_vdev)
struct hisi_acc_vf_migration_file *migf;
int ret;
migf = kzalloc(sizeof(*migf), GFP_KERNEL);
migf = kzalloc(sizeof(*migf), GFP_KERNEL_ACCOUNT);
if (!migf)
return ERR_PTR(-ENOMEM);
......
......@@ -7,6 +7,29 @@
enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 };
static int mlx5vf_is_migratable(struct mlx5_core_dev *mdev, u16 func_id)
{
int query_sz = MLX5_ST_SZ_BYTES(query_hca_cap_out);
void *query_cap = NULL, *cap;
int ret;
query_cap = kzalloc(query_sz, GFP_KERNEL);
if (!query_cap)
return -ENOMEM;
ret = mlx5_vport_get_other_func_cap(mdev, func_id, query_cap,
MLX5_CAP_GENERAL_2);
if (ret)
goto out;
cap = MLX5_ADDR_OF(query_hca_cap_out, query_cap, capability);
if (!MLX5_GET(cmd_hca_cap_2, cap, migratable))
ret = -EOPNOTSUPP;
out:
kfree(query_cap);
return ret;
}
static int mlx5vf_cmd_get_vhca_id(struct mlx5_core_dev *mdev, u16 function_id,
u16 *vhca_id);
static void
......@@ -195,6 +218,10 @@ void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
if (mvdev->vf_id < 0)
goto end;
ret = mlx5vf_is_migratable(mvdev->mdev, mvdev->vf_id + 1);
if (ret)
goto end;
if (mlx5vf_cmd_get_vhca_id(mvdev->mdev, mvdev->vf_id + 1,
&mvdev->vhca_id))
goto end;
......@@ -373,7 +400,7 @@ mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf,
struct mlx5_vhca_data_buffer *buf;
int ret;
buf = kzalloc(sizeof(*buf), GFP_KERNEL);
buf = kzalloc(sizeof(*buf), GFP_KERNEL_ACCOUNT);
if (!buf)
return ERR_PTR(-ENOMEM);
......@@ -473,7 +500,7 @@ void mlx5vf_mig_file_cleanup_cb(struct work_struct *_work)
}
static int add_buf_header(struct mlx5_vhca_data_buffer *header_buf,
size_t image_size)
size_t image_size, bool initial_pre_copy)
{
struct mlx5_vf_migration_file *migf = header_buf->migf;
struct mlx5_vf_migration_header header = {};
......@@ -481,7 +508,9 @@ static int add_buf_header(struct mlx5_vhca_data_buffer *header_buf,
struct page *page;
u8 *to_buff;
header.image_size = cpu_to_le64(image_size);
header.record_size = cpu_to_le64(image_size);
header.flags = cpu_to_le32(MLX5_MIGF_HEADER_FLAGS_TAG_MANDATORY);
header.tag = cpu_to_le32(MLX5_MIGF_HEADER_TAG_FW_DATA);
page = mlx5vf_get_migration_page(header_buf, 0);
if (!page)
return -EINVAL;
......@@ -489,12 +518,13 @@ static int add_buf_header(struct mlx5_vhca_data_buffer *header_buf,
memcpy(to_buff, &header, sizeof(header));
kunmap_local(to_buff);
header_buf->length = sizeof(header);
header_buf->header_image_size = image_size;
header_buf->start_pos = header_buf->migf->max_pos;
migf->max_pos += header_buf->length;
spin_lock_irqsave(&migf->list_lock, flags);
list_add_tail(&header_buf->buf_elm, &migf->buf_list);
spin_unlock_irqrestore(&migf->list_lock, flags);
if (initial_pre_copy)
migf->pre_copy_initial_bytes += sizeof(header);
return 0;
}
......@@ -508,11 +538,14 @@ static void mlx5vf_save_callback(int status, struct mlx5_async_work *context)
if (!status) {
size_t image_size;
unsigned long flags;
bool initial_pre_copy = migf->state != MLX5_MIGF_STATE_PRE_COPY &&
!async_data->last_chunk;
image_size = MLX5_GET(save_vhca_state_out, async_data->out,
actual_image_size);
if (async_data->header_buf) {
status = add_buf_header(async_data->header_buf, image_size);
status = add_buf_header(async_data->header_buf, image_size,
initial_pre_copy);
if (status)
goto err;
}
......@@ -522,6 +555,8 @@ static void mlx5vf_save_callback(int status, struct mlx5_async_work *context)
spin_lock_irqsave(&migf->list_lock, flags);
list_add_tail(&async_data->buf->buf_elm, &migf->buf_list);
spin_unlock_irqrestore(&migf->list_lock, flags);
if (initial_pre_copy)
migf->pre_copy_initial_bytes += image_size;
migf->state = async_data->last_chunk ?
MLX5_MIGF_STATE_COMPLETE : MLX5_MIGF_STATE_PRE_COPY;
wake_up_interruptible(&migf->poll_wait);
......@@ -583,11 +618,16 @@ int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
}
if (MLX5VF_PRE_COPY_SUPP(mvdev)) {
header_buf = mlx5vf_get_data_buffer(migf,
sizeof(struct mlx5_vf_migration_header), DMA_NONE);
if (IS_ERR(header_buf)) {
err = PTR_ERR(header_buf);
goto err_free;
if (async_data->last_chunk && migf->buf_header) {
header_buf = migf->buf_header;
migf->buf_header = NULL;
} else {
header_buf = mlx5vf_get_data_buffer(migf,
sizeof(struct mlx5_vf_migration_header), DMA_NONE);
if (IS_ERR(header_buf)) {
err = PTR_ERR(header_buf);
goto err_free;
}
}
}
......@@ -790,7 +830,7 @@ static int mlx5vf_create_tracker(struct mlx5_core_dev *mdev,
node = interval_tree_iter_first(ranges, 0, ULONG_MAX);
for (i = 0; i < num_ranges; i++) {
void *addr_range_i_base = range_list_ptr + record_size * i;
unsigned long length = node->last - node->start;
unsigned long length = node->last - node->start + 1;
MLX5_SET64(page_track_range, addr_range_i_base, start_address,
node->start);
......@@ -800,7 +840,7 @@ static int mlx5vf_create_tracker(struct mlx5_core_dev *mdev,
}
WARN_ON(node);
log_addr_space_size = ilog2(total_ranges_len);
log_addr_space_size = ilog2(roundup_pow_of_two(total_ranges_len));
if (log_addr_space_size <
(MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_log_min_addr_space)) ||
log_addr_space_size >
......@@ -1032,18 +1072,18 @@ mlx5vf_create_rc_qp(struct mlx5_core_dev *mdev,
void *in;
int err;
qp = kzalloc(sizeof(*qp), GFP_KERNEL);
qp = kzalloc(sizeof(*qp), GFP_KERNEL_ACCOUNT);
if (!qp)
return ERR_PTR(-ENOMEM);
qp->rq.wqe_cnt = roundup_pow_of_two(max_recv_wr);
log_rq_stride = ilog2(MLX5_SEND_WQE_DS);
log_rq_sz = ilog2(qp->rq.wqe_cnt);
err = mlx5_db_alloc_node(mdev, &qp->db, mdev->priv.numa_node);
if (err)
goto err_free;
if (max_recv_wr) {
qp->rq.wqe_cnt = roundup_pow_of_two(max_recv_wr);
log_rq_stride = ilog2(MLX5_SEND_WQE_DS);
log_rq_sz = ilog2(qp->rq.wqe_cnt);
err = mlx5_frag_buf_alloc_node(mdev,
wq_get_byte_sz(log_rq_sz, log_rq_stride),
&qp->buf, mdev->priv.numa_node);
......@@ -1213,12 +1253,13 @@ static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf,
int i;
recv_buf->page_list = kvcalloc(npages, sizeof(*recv_buf->page_list),
GFP_KERNEL);
GFP_KERNEL_ACCOUNT);
if (!recv_buf->page_list)
return -ENOMEM;
for (;;) {
filled = alloc_pages_bulk_array(GFP_KERNEL, npages - done,
filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT,
npages - done,
recv_buf->page_list + done);
if (!filled)
goto err;
......@@ -1248,7 +1289,7 @@ static int register_dma_recv_pages(struct mlx5_core_dev *mdev,
recv_buf->dma_addrs = kvcalloc(recv_buf->npages,
sizeof(*recv_buf->dma_addrs),
GFP_KERNEL);
GFP_KERNEL_ACCOUNT);
if (!recv_buf->dma_addrs)
return -ENOMEM;
......
......@@ -9,6 +9,7 @@
#include <linux/kernel.h>
#include <linux/vfio_pci_core.h>
#include <linux/mlx5/driver.h>
#include <linux/mlx5/vport.h>
#include <linux/mlx5/cq.h>
#include <linux/mlx5/qp.h>
......@@ -26,15 +27,33 @@ enum mlx5_vf_migf_state {
enum mlx5_vf_load_state {
MLX5_VF_LOAD_STATE_READ_IMAGE_NO_HEADER,
MLX5_VF_LOAD_STATE_READ_HEADER,
MLX5_VF_LOAD_STATE_PREP_HEADER_DATA,
MLX5_VF_LOAD_STATE_READ_HEADER_DATA,
MLX5_VF_LOAD_STATE_PREP_IMAGE,
MLX5_VF_LOAD_STATE_READ_IMAGE,
MLX5_VF_LOAD_STATE_LOAD_IMAGE,
};
struct mlx5_vf_migration_tag_stop_copy_data {
__le64 stop_copy_size;
};
enum mlx5_vf_migf_header_flags {
MLX5_MIGF_HEADER_FLAGS_TAG_MANDATORY = 0,
MLX5_MIGF_HEADER_FLAGS_TAG_OPTIONAL = 1 << 0,
};
enum mlx5_vf_migf_header_tag {
MLX5_MIGF_HEADER_TAG_FW_DATA = 0,
MLX5_MIGF_HEADER_TAG_STOP_COPY_SIZE = 1 << 0,
};
struct mlx5_vf_migration_header {
__le64 image_size;
__le64 record_size;
/* For future use in case we may need to change the kernel protocol */
__le64 flags;
__le32 flags; /* Use mlx5_vf_migf_header_flags */
__le32 tag; /* Use mlx5_vf_migf_header_tag */
__u8 data[]; /* Its size is given in the record_size */
};
struct mlx5_vhca_data_buffer {
......@@ -42,7 +61,6 @@ struct mlx5_vhca_data_buffer {
loff_t start_pos;
u64 length;
u64 allocated_length;
u64 header_image_size;
u32 mkey;
enum dma_data_direction dma_dir;
u8 dmaed:1;
......@@ -72,6 +90,10 @@ struct mlx5_vf_migration_file {
enum mlx5_vf_load_state load_state;
u32 pdn;
loff_t max_pos;
u64 record_size;
u32 record_tag;
u64 stop_copy_prep_size;
u64 pre_copy_initial_bytes;
struct mlx5_vhca_data_buffer *buf;
struct mlx5_vhca_data_buffer *buf_header;
spinlock_t list_lock;
......
This diff is collapsed.
......@@ -1244,7 +1244,7 @@ static int vfio_msi_cap_len(struct vfio_pci_core_device *vdev, u8 pos)
if (vdev->msi_perm)
return len;
vdev->msi_perm = kmalloc(sizeof(struct perm_bits), GFP_KERNEL);
vdev->msi_perm = kmalloc(sizeof(struct perm_bits), GFP_KERNEL_ACCOUNT);
if (!vdev->msi_perm)
return -ENOMEM;
......@@ -1731,11 +1731,11 @@ int vfio_config_init(struct vfio_pci_core_device *vdev)
* no requirements on the length of a capability, so the gap between
* capabilities needs byte granularity.
*/
map = kmalloc(pdev->cfg_size, GFP_KERNEL);
map = kmalloc(pdev->cfg_size, GFP_KERNEL_ACCOUNT);
if (!map)
return -ENOMEM;
vconfig = kmalloc(pdev->cfg_size, GFP_KERNEL);
vconfig = kmalloc(pdev->cfg_size, GFP_KERNEL_ACCOUNT);
if (!vconfig) {
kfree(map);
return -ENOMEM;
......
......@@ -144,7 +144,8 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev)
* of the exclusive page in case that hot-add
* device's bar is assigned into it.
*/
dummy_res = kzalloc(sizeof(*dummy_res), GFP_KERNEL);
dummy_res =
kzalloc(sizeof(*dummy_res), GFP_KERNEL_ACCOUNT);
if (dummy_res == NULL)
goto no_mmap;
......@@ -863,7 +864,7 @@ int vfio_pci_core_register_dev_region(struct vfio_pci_core_device *vdev,
region = krealloc(vdev->region,
(vdev->num_regions + 1) * sizeof(*region),
GFP_KERNEL);
GFP_KERNEL_ACCOUNT);
if (!region)
return -ENOMEM;
......@@ -1644,7 +1645,7 @@ static int __vfio_pci_add_vma(struct vfio_pci_core_device *vdev,
{
struct vfio_pci_mmap_vma *mmap_vma;
mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL);
mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL_ACCOUNT);
if (!mmap_vma)
return -ENOMEM;
......
......@@ -180,7 +180,7 @@ static int vfio_pci_igd_opregion_init(struct vfio_pci_core_device *vdev)
if (!addr || !(~addr))
return -ENODEV;
opregionvbt = kzalloc(sizeof(*opregionvbt), GFP_KERNEL);
opregionvbt = kzalloc(sizeof(*opregionvbt), GFP_KERNEL_ACCOUNT);
if (!opregionvbt)
return -ENOMEM;
......
......@@ -177,7 +177,7 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev)
if (!vdev->pdev->irq)
return -ENODEV;
vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL);
vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL_ACCOUNT);
if (!vdev->ctx)
return -ENOMEM;
......@@ -216,7 +216,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd)
if (fd < 0) /* Disable only */
return 0;
vdev->ctx[0].name = kasprintf(GFP_KERNEL, "vfio-intx(%s)",
vdev->ctx[0].name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)",
pci_name(pdev));
if (!vdev->ctx[0].name)
return -ENOMEM;
......@@ -284,7 +284,8 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
if (!is_irq_none(vdev))
return -EINVAL;
vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL);
vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx),
GFP_KERNEL_ACCOUNT);
if (!vdev->ctx)
return -ENOMEM;
......@@ -343,7 +344,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
if (fd < 0)
return 0;
vdev->ctx[vector].name = kasprintf(GFP_KERNEL, "vfio-msi%s[%d](%s)",
vdev->ctx[vector].name = kasprintf(GFP_KERNEL_ACCOUNT,
"vfio-msi%s[%d](%s)",
msix ? "x" : "", vector,
pci_name(pdev));
if (!vdev->ctx[vector].name)
......
......@@ -470,7 +470,7 @@ int vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset,
goto out_unlock;
}
ioeventfd = kzalloc(sizeof(*ioeventfd), GFP_KERNEL);
ioeventfd = kzalloc(sizeof(*ioeventfd), GFP_KERNEL_ACCOUNT);
if (!ioeventfd) {
ret = -ENOMEM;
goto out_unlock;
......
......@@ -142,7 +142,7 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
cnt++;
vdev->regions = kcalloc(cnt, sizeof(struct vfio_platform_region),
GFP_KERNEL);
GFP_KERNEL_ACCOUNT);
if (!vdev->regions)
return -ENOMEM;
......@@ -150,9 +150,6 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
struct resource *res =
vdev->get_resource(vdev, i);
if (!res)
goto err;
vdev->regions[i].addr = res->start;
vdev->regions[i].size = resource_size(res);
vdev->regions[i].flags = 0;
......@@ -650,10 +647,13 @@ int vfio_platform_init_common(struct vfio_platform_device *vdev)
mutex_init(&vdev->igate);
ret = vfio_platform_get_reset(vdev);
if (ret && vdev->reset_required)
if (ret && vdev->reset_required) {
dev_err(dev, "No reset function found for device %s\n",
vdev->name);
return ret;
return ret;
}
return 0;
}
EXPORT_SYMBOL_GPL(vfio_platform_init_common);
......
......@@ -186,9 +186,8 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
if (fd < 0) /* Disable only */
return 0;
irq->name = kasprintf(GFP_KERNEL, "vfio-irq[%d](%s)",
irq->hwirq, vdev->name);
irq->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-irq[%d](%s)",
irq->hwirq, vdev->name);
if (!irq->name)
return -ENOMEM;
......@@ -286,7 +285,8 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev)
while (vdev->get_irq(vdev, cnt) >= 0)
cnt++;
vdev->irqs = kcalloc(cnt, sizeof(struct vfio_platform_irq), GFP_KERNEL);
vdev->irqs = kcalloc(cnt, sizeof(struct vfio_platform_irq),
GFP_KERNEL_ACCOUNT);
if (!vdev->irqs)
return -ENOMEM;
......
......@@ -18,8 +18,7 @@ struct vfio_container;
void vfio_device_put_registration(struct vfio_device *device);
bool vfio_device_try_get_registration(struct vfio_device *device);
int vfio_device_open(struct vfio_device *device,
struct iommufd_ctx *iommufd, struct kvm *kvm);
int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd);
void vfio_device_close(struct vfio_device *device,
struct iommufd_ctx *iommufd);
......@@ -74,6 +73,7 @@ struct vfio_group {
struct file *opened_file;
struct blocking_notifier_head notifier;
struct iommufd_ctx *iommufd;
spinlock_t kvm_ref_lock;
};
int vfio_device_set_group(struct vfio_device *device,
......@@ -95,11 +95,6 @@ static inline bool vfio_device_is_noiommu(struct vfio_device *vdev)
}
#if IS_ENABLED(CONFIG_VFIO_CONTAINER)
/* events for the backend driver notify callback */
enum vfio_iommu_notify_type {
VFIO_IOMMU_CONTAINER_CLOSE = 0,
};
/**
* struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
*/
......@@ -130,8 +125,6 @@ struct vfio_iommu_driver_ops {
void *data, size_t count, bool write);
struct iommu_domain *(*group_iommu_domain)(void *iommu_data,
struct iommu_group *group);
void (*notify)(void *iommu_data,
enum vfio_iommu_notify_type event);
};
struct vfio_iommu_driver {
......@@ -257,4 +250,18 @@ extern bool vfio_noiommu __read_mostly;
enum { vfio_noiommu = false };
#endif
#ifdef CONFIG_HAVE_KVM
void _vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm);
void vfio_device_put_kvm(struct vfio_device *device);
#else
static inline void _vfio_device_get_kvm_safe(struct vfio_device *device,
struct kvm *kvm)
{
}
static inline void vfio_device_put_kvm(struct vfio_device *device)
{
}
#endif
#endif
This diff is collapsed.
......@@ -16,6 +16,9 @@
#include <linux/fs.h>
#include <linux/idr.h>
#include <linux/iommu.h>
#ifdef CONFIG_HAVE_KVM
#include <linux/kvm_host.h>
#endif
#include <linux/list.h>
#include <linux/miscdevice.h>
#include <linux/module.h>
......@@ -345,6 +348,55 @@ void vfio_unregister_group_dev(struct vfio_device *device)
}
EXPORT_SYMBOL_GPL(vfio_unregister_group_dev);
#ifdef CONFIG_HAVE_KVM
void _vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm)
{
void (*pfn)(struct kvm *kvm);
bool (*fn)(struct kvm *kvm);
bool ret;
lockdep_assert_held(&device->dev_set->lock);
pfn = symbol_get(kvm_put_kvm);
if (WARN_ON(!pfn))
return;
fn = symbol_get(kvm_get_kvm_safe);
if (WARN_ON(!fn)) {
symbol_put(kvm_put_kvm);
return;
}
ret = fn(kvm);
symbol_put(kvm_get_kvm_safe);
if (!ret) {
symbol_put(kvm_put_kvm);
return;
}
device->put_kvm = pfn;
device->kvm = kvm;
}
void vfio_device_put_kvm(struct vfio_device *device)
{
lockdep_assert_held(&device->dev_set->lock);
if (!device->kvm)
return;
if (WARN_ON(!device->put_kvm))
goto clear;
device->put_kvm(device->kvm);
device->put_kvm = NULL;
symbol_put(kvm_put_kvm);
clear:
device->kvm = NULL;
}
#endif
/* true if the vfio_device has open_device() called but not close_device() */
static bool vfio_assert_device_open(struct vfio_device *device)
{
......@@ -352,7 +404,7 @@ static bool vfio_assert_device_open(struct vfio_device *device)
}
static int vfio_device_first_open(struct vfio_device *device,
struct iommufd_ctx *iommufd, struct kvm *kvm)
struct iommufd_ctx *iommufd)
{
int ret;
......@@ -368,7 +420,6 @@ static int vfio_device_first_open(struct vfio_device *device,
if (ret)
goto err_module_put;
device->kvm = kvm;
if (device->ops->open_device) {
ret = device->ops->open_device(device);
if (ret)
......@@ -377,7 +428,6 @@ static int vfio_device_first_open(struct vfio_device *device,
return 0;
err_unuse_iommu:
device->kvm = NULL;
if (iommufd)
vfio_iommufd_unbind(device);
else
......@@ -394,7 +444,6 @@ static void vfio_device_last_close(struct vfio_device *device,
if (device->ops->close_device)
device->ops->close_device(device);
device->kvm = NULL;
if (iommufd)
vfio_iommufd_unbind(device);
else
......@@ -402,19 +451,18 @@ static void vfio_device_last_close(struct vfio_device *device,
module_put(device->dev->driver->owner);
}
int vfio_device_open(struct vfio_device *device,
struct iommufd_ctx *iommufd, struct kvm *kvm)
int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd)
{
int ret = 0;
mutex_lock(&device->dev_set->lock);
lockdep_assert_held(&device->dev_set->lock);
device->open_count++;
if (device->open_count == 1) {
ret = vfio_device_first_open(device, iommufd, kvm);
ret = vfio_device_first_open(device, iommufd);
if (ret)
device->open_count--;
}
mutex_unlock(&device->dev_set->lock);
return ret;
}
......@@ -422,12 +470,12 @@ int vfio_device_open(struct vfio_device *device,
void vfio_device_close(struct vfio_device *device,
struct iommufd_ctx *iommufd)
{
mutex_lock(&device->dev_set->lock);
lockdep_assert_held(&device->dev_set->lock);
vfio_assert_device_open(device);
if (device->open_count == 1)
vfio_device_last_close(device, iommufd);
device->open_count--;
mutex_unlock(&device->dev_set->lock);
}
/*
......
......@@ -112,7 +112,7 @@ int vfio_virqfd_enable(void *opaque,
int ret = 0;
__poll_t events;
virqfd = kzalloc(sizeof(*virqfd), GFP_KERNEL);
virqfd = kzalloc(sizeof(*virqfd), GFP_KERNEL_ACCOUNT);
if (!virqfd)
return -ENOMEM;
......
......@@ -46,7 +46,6 @@ struct vfio_device {
struct vfio_device_set *dev_set;
struct list_head dev_set_list;
unsigned int migration_flags;
/* Driver must reference the kvm during open_device or never touch it */
struct kvm *kvm;
/* Members below here are private, not for driver use */
......@@ -58,6 +57,7 @@ struct vfio_device {
struct list_head group_next;
struct list_head iommu_entry;
struct iommufd_access *iommufd_access;
void (*put_kvm)(struct kvm *kvm);
#if IS_ENABLED(CONFIG_IOMMUFD)
struct iommufd_device *iommufd_device;
struct iommufd_ctx *iommufd_ictx;
......@@ -70,6 +70,10 @@ struct vfio_device {
*
* @init: initialize private fields in device structure
* @release: Reclaim private fields in device structure
* @bind_iommufd: Called when binding the device to an iommufd
* @unbind_iommufd: Opposite of bind_iommufd
* @attach_ioas: Called when attaching device to an IOAS/HWPT managed by the
* bound iommufd. Undo in unbind_iommufd.
* @open_device: Called when the first file descriptor is opened for this device
* @close_device: Opposite of open_device
* @read: Perform read(2) on device file descriptor
......
......@@ -49,7 +49,11 @@
/* Supports VFIO_DMA_UNMAP_FLAG_ALL */
#define VFIO_UNMAP_ALL 9
/* Supports the vaddr flag for DMA map and unmap */
/*
* Supports the vaddr flag for DMA map and unmap. Not supported for mediated
* devices, so this capability is subject to change as groups are added or
* removed.
*/
#define VFIO_UPDATE_VADDR 10
/*
......@@ -1343,8 +1347,7 @@ struct vfio_iommu_type1_info_dma_avail {
* Map process virtual addresses to IO virtual addresses using the
* provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
*
* If flags & VFIO_DMA_MAP_FLAG_VADDR, update the base vaddr for iova, and
* unblock translation of host virtual addresses in the iova range. The vaddr
* If flags & VFIO_DMA_MAP_FLAG_VADDR, update the base vaddr for iova. The vaddr
* must have previously been invalidated with VFIO_DMA_UNMAP_FLAG_VADDR. To
* maintain memory consistency within the user application, the updated vaddr
* must address the same memory object as originally mapped. Failure to do so
......@@ -1395,9 +1398,9 @@ struct vfio_bitmap {
* must be 0. This cannot be combined with the get-dirty-bitmap flag.
*
* If flags & VFIO_DMA_UNMAP_FLAG_VADDR, do not unmap, but invalidate host
* virtual addresses in the iova range. Tasks that attempt to translate an
* iova's vaddr will block. DMA to already-mapped pages continues. This
* cannot be combined with the get-dirty-bitmap flag.
* virtual addresses in the iova range. DMA to already-mapped pages continues.
* Groups may not be added to the container while any addresses are invalid.
* This cannot be combined with the get-dirty-bitmap flag.
*/
struct vfio_iommu_type1_dma_unmap {
__u32 argsz;
......
......@@ -191,23 +191,25 @@ config SAMPLE_UHID
Build UHID sample program.
config SAMPLE_VFIO_MDEV_MTTY
tristate "Build VFIO mtty example mediated device sample code -- loadable modules only"
depends on VFIO_MDEV && m
tristate "Build VFIO mtty example mediated device sample code"
depends on VFIO
select VFIO_MDEV
help
Build a virtual tty sample driver for use as a VFIO
mediated device
config SAMPLE_VFIO_MDEV_MDPY
tristate "Build VFIO mdpy example mediated device sample code -- loadable modules only"
depends on VFIO_MDEV && m
tristate "Build VFIO mdpy example mediated device sample code"
depends on VFIO
select VFIO_MDEV
help
Build a virtual display sample driver for use as a VFIO
mediated device. It is a simple framebuffer and supports
the region display interface (VFIO_GFX_PLANE_TYPE_REGION).
config SAMPLE_VFIO_MDEV_MDPY_FB
tristate "Build VFIO mdpy example guest fbdev driver -- loadable module only"
depends on FB && m
tristate "Build VFIO mdpy example guest fbdev driver"
depends on FB
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
select FB_CFB_IMAGEBLIT
......@@ -215,8 +217,9 @@ config SAMPLE_VFIO_MDEV_MDPY_FB
Guest fbdev driver for the virtual display sample driver.
config SAMPLE_VFIO_MDEV_MBOCHS
tristate "Build VFIO mdpy example mediated device sample code -- loadable modules only"
depends on VFIO_MDEV && m
tristate "Build VFIO mbochs example mediated device sample code"
depends on VFIO
select VFIO_MDEV
select DMA_SHARED_BUFFER
help
Build a virtual display sample driver for use as a VFIO
......
Using the mtty vfio-mdev sample code
====================================
mtty is a sample vfio-mdev driver that demonstrates how to use the mediated
device framework.
The sample driver creates an mdev device that simulates a serial port over a PCI
card.
1. Build and load the mtty.ko module.
This step creates a dummy device, /sys/devices/virtual/mtty/mtty/
Files in this device directory in sysfs are similar to the following::
# tree /sys/devices/virtual/mtty/mtty/
/sys/devices/virtual/mtty/mtty/
|-- mdev_supported_types
| |-- mtty-1
| | |-- available_instances
| | |-- create
| | |-- device_api
| | |-- devices
| | `-- name
| `-- mtty-2
| |-- available_instances
| |-- create
| |-- device_api
| |-- devices
| `-- name
|-- mtty_dev
| `-- sample_mtty_dev
|-- power
| |-- autosuspend_delay_ms
| |-- control
| |-- runtime_active_time
| |-- runtime_status
| `-- runtime_suspended_time
|-- subsystem -> ../../../../class/mtty
`-- uevent
2. Create a mediated device by using the dummy device that you created in the
previous step::
# echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
/sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
3. Add parameters to qemu-kvm::
-device vfio-pci,\
sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
4. Boot the VM.
In the Linux guest VM, with no hardware on the host, the device appears
as follows::
# lspci -s 00:05.0 -xxvv
00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
Subsystem: Device 4348:3253
Physical Slot: 5
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c150 [size=8]
Region 1: I/O ports at c158 [size=8]
Kernel driver in use: serial
00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00
In the Linux guest VM, dmesg output for the device is as follows:
serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A
5. In the Linux guest VM, check the serial ports::
# setserial -g /dev/ttyS*
/dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
/dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
/dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10
6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
/dev/ttyS2 with hardware flow control disabled.
7. Type data on the minicom terminal or send data to the terminal emulation
program and read the data.
Data is loop backed from hosts mtty driver.
8. Destroy the mediated device that you created::
# echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment