Commit d72e31c9 authored by Alex Williamson's avatar Alex Williamson Committed by Joerg Roedel

iommu: IOMMU Groups

IOMMU device groups are currently a rather vague associative notion
with assembly required by the user or user level driver provider to
do anything useful.  This patch intends to grow the IOMMU group concept
into something a bit more consumable.

To do this, we first create an object representing the group, struct
iommu_group.  This structure is allocated (iommu_group_alloc) and
filled (iommu_group_add_device) by the iommu driver.  The iommu driver
is free to add devices to the group using it's own set of policies.
This allows inclusion of devices based on physical hardware or topology
limitations of the platform, as well as soft requirements, such as
multi-function trust levels or peer-to-peer protection of the
interconnects.  Each device may only belong to a single iommu group,
which is linked from struct device.iommu_group.  IOMMU groups are
maintained using kobject reference counting, allowing for automatic
removal of empty, unreferenced groups.  It is the responsibility of
the iommu driver to remove devices from the group
(iommu_group_remove_device).

IOMMU groups also include a userspace representation in sysfs under
/sys/kernel/iommu_groups.  When allocated, each group is given a
dynamically assign ID (int).  The ID is managed by the core IOMMU group
code to support multiple heterogeneous iommu drivers, which could
potentially collide in group naming/numbering.  This also keeps group
IDs to small, easily managed values.  A directory is created under
/sys/kernel/iommu_groups for each group.  A further subdirectory named
"devices" contains links to each device within the group.  The iommu_group
file in the device's sysfs directory, which formerly contained a group
number when read, is now a link to the iommu group.  Example:

$ ls -l /sys/kernel/iommu_groups/26/devices/
total 0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:00:1e.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.1 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1

$ ls -l  /sys/kernel/iommu_groups/26/devices/*/iommu_group
[truncating perms/owner/timestamp]
/sys/kernel/iommu_groups/26/devices/0000:00:1e.0/iommu_group ->
					../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.0/iommu_group ->
					../../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.1/iommu_group ->
					../../../../kernel/iommu_groups/26

Groups also include several exported functions for use by user level
driver providers, for example VFIO.  These include:

iommu_group_get(): Acquires a reference to a group from a device
iommu_group_put(): Releases reference
iommu_group_for_each_dev(): Iterates over group devices using callback
iommu_group_[un]register_notifier(): Allows notification of device add
        and remove operations relevant to the group
iommu_group_id(): Return the group number

This patch also extends the IOMMU API to allow attaching groups to
domains.  This is currently a simple wrapper for iterating through
devices within a group, but it's expected that the IOMMU API may
eventually make groups a more integral part of domains.

Groups intentionally do not try to manage group ownership.  A user
level driver provider must independently acquire ownership for each
device within a group before making use of the group as a whole.
This may change in the future if group usage becomes more pervasive
across both DMA and IOMMU ops.

Groups intentionally do not provide a mechanism for driver locking
or otherwise manipulating driver matching/probing of devices within
the group.  Such interfaces are generic to devices and beyond the
scope of IOMMU groups.  If implemented, user level providers have
ready access via iommu_group_for_each_dev and group notifiers.

iommu_device_group() is removed here as it has no users.  The
replacement is:

	group = iommu_group_get(dev);
	id = iommu_group_id(group);
	iommu_group_put(group);

AMD-Vi & Intel VT-d support re-added in following patches.
Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: default avatarJoerg Roedel <joerg.roedel@amd.com>
parent 74416e1e
What: /sys/kernel/iommu_groups/
Date: May 2012
KernelVersion: v3.5
Contact: Alex Williamson <alex.williamson@redhat.com>
Description: /sys/kernel/iommu_groups/ contains a number of sub-
directories, each representing an IOMMU group. The
name of the sub-directory matches the iommu_group_id()
for the group, which is an integer value. Within each
subdirectory is another directory named "devices" with
links to the sysfs devices contained in this group.
The group directory also optionally contains a "name"
file if the IOMMU driver has chosen to register a more
common name for the group.
Users:
......@@ -3227,26 +3227,6 @@ static int amd_iommu_domain_has_cap(struct iommu_domain *domain,
return 0;
}
static int amd_iommu_device_group(struct device *dev, unsigned int *groupid)
{
struct iommu_dev_data *dev_data = dev->archdata.iommu;
struct pci_dev *pdev = to_pci_dev(dev);
u16 devid;
if (!dev_data)
return -ENODEV;
if (pdev->is_virtfn || !iommu_group_mf)
devid = dev_data->devid;
else
devid = calc_devid(pdev->bus->number,
PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));
*groupid = amd_iommu_alias_table[devid];
return 0;
}
static struct iommu_ops amd_iommu_ops = {
.domain_init = amd_iommu_domain_init,
.domain_destroy = amd_iommu_domain_destroy,
......@@ -3256,7 +3236,6 @@ static struct iommu_ops amd_iommu_ops = {
.unmap = amd_iommu_unmap,
.iova_to_phys = amd_iommu_iova_to_phys,
.domain_has_cap = amd_iommu_domain_has_cap,
.device_group = amd_iommu_device_group,
.pgsize_bitmap = AMD_IOMMU_PGSIZES,
};
......
......@@ -4090,54 +4090,6 @@ static int intel_iommu_domain_has_cap(struct iommu_domain *domain,
return 0;
}
/*
* Group numbers are arbitrary. Device with the same group number
* indicate the iommu cannot differentiate between them. To avoid
* tracking used groups we just use the seg|bus|devfn of the lowest
* level we're able to differentiate devices
*/
static int intel_iommu_device_group(struct device *dev, unsigned int *groupid)
{
struct pci_dev *pdev = to_pci_dev(dev);
struct pci_dev *bridge;
union {
struct {
u8 devfn;
u8 bus;
u16 segment;
} pci;
u32 group;
} id;
if (iommu_no_mapping(dev))
return -ENODEV;
id.pci.segment = pci_domain_nr(pdev->bus);
id.pci.bus = pdev->bus->number;
id.pci.devfn = pdev->devfn;
if (!device_to_iommu(id.pci.segment, id.pci.bus, id.pci.devfn))
return -ENODEV;
bridge = pci_find_upstream_pcie_bridge(pdev);
if (bridge) {
if (pci_is_pcie(bridge)) {
id.pci.bus = bridge->subordinate->number;
id.pci.devfn = 0;
} else {
id.pci.bus = bridge->bus->number;
id.pci.devfn = bridge->devfn;
}
}
if (!pdev->is_virtfn && iommu_group_mf)
id.pci.devfn = PCI_DEVFN(PCI_SLOT(id.pci.devfn), 0);
*groupid = id.group;
return 0;
}
static struct iommu_ops intel_iommu_ops = {
.domain_init = intel_iommu_domain_init,
.domain_destroy = intel_iommu_domain_destroy,
......@@ -4147,7 +4099,6 @@ static struct iommu_ops intel_iommu_ops = {
.unmap = intel_iommu_unmap,
.iova_to_phys = intel_iommu_iova_to_phys,
.domain_has_cap = intel_iommu_domain_has_cap,
.device_group = intel_iommu_device_group,
.pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};
......
This diff is collapsed.
......@@ -26,6 +26,7 @@
#define IOMMU_CACHE (4) /* DMA cache coherency */
struct iommu_ops;
struct iommu_group;
struct bus_type;
struct device;
struct iommu_domain;
......@@ -60,6 +61,8 @@ struct iommu_domain {
* @iova_to_phys: translate iova to physical address
* @domain_has_cap: domain capabilities query
* @commit: commit iommu domain
* @add_device: add device to iommu grouping
* @remove_device: remove device from iommu grouping
* @pgsize_bitmap: bitmap of supported page sizes
*/
struct iommu_ops {
......@@ -75,10 +78,18 @@ struct iommu_ops {
unsigned long iova);
int (*domain_has_cap)(struct iommu_domain *domain,
unsigned long cap);
int (*device_group)(struct device *dev, unsigned int *groupid);
int (*add_device)(struct device *dev);
void (*remove_device)(struct device *dev);
unsigned long pgsize_bitmap;
};
#define IOMMU_GROUP_NOTIFY_ADD_DEVICE 1 /* Device added */
#define IOMMU_GROUP_NOTIFY_DEL_DEVICE 2 /* Pre Device removed */
#define IOMMU_GROUP_NOTIFY_BIND_DRIVER 3 /* Pre Driver bind */
#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER 4 /* Post Driver bind */
#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER 5 /* Pre Driver unbind */
#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER 6 /* Post Driver unbind */
extern int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops);
extern bool iommu_present(struct bus_type *bus);
extern struct iommu_domain *iommu_domain_alloc(struct bus_type *bus);
......@@ -97,7 +108,29 @@ extern int iommu_domain_has_cap(struct iommu_domain *domain,
unsigned long cap);
extern void iommu_set_fault_handler(struct iommu_domain *domain,
iommu_fault_handler_t handler, void *token);
extern int iommu_device_group(struct device *dev, unsigned int *groupid);
extern int iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
extern void iommu_detach_group(struct iommu_domain *domain,
struct iommu_group *group);
extern struct iommu_group *iommu_group_alloc(void);
extern void *iommu_group_get_iommudata(struct iommu_group *group);
extern void iommu_group_set_iommudata(struct iommu_group *group,
void *iommu_data,
void (*release)(void *iommu_data));
extern int iommu_group_set_name(struct iommu_group *group, const char *name);
extern int iommu_group_add_device(struct iommu_group *group,
struct device *dev);
extern void iommu_group_remove_device(struct device *dev);
extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
int (*fn)(struct device *, void *));
extern struct iommu_group *iommu_group_get(struct device *dev);
extern void iommu_group_put(struct iommu_group *group);
extern int iommu_group_register_notifier(struct iommu_group *group,
struct notifier_block *nb);
extern int iommu_group_unregister_notifier(struct iommu_group *group,
struct notifier_block *nb);
extern int iommu_group_id(struct iommu_group *group);
/**
* report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
......@@ -142,6 +175,7 @@ static inline int report_iommu_fault(struct iommu_domain *domain,
#else /* CONFIG_IOMMU_API */
struct iommu_ops {};
struct iommu_group {};
static inline bool iommu_present(struct bus_type *bus)
{
......@@ -197,11 +231,75 @@ static inline void iommu_set_fault_handler(struct iommu_domain *domain,
{
}
static inline int iommu_device_group(struct device *dev, unsigned int *groupid)
int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
{
return -ENODEV;
}
void iommu_detach_group(struct iommu_domain *domain, struct iommu_group *group)
{
}
struct iommu_group *iommu_group_alloc(void)
{
return ERR_PTR(-ENODEV);
}
void *iommu_group_get_iommudata(struct iommu_group *group)
{
return NULL;
}
void iommu_group_set_iommudata(struct iommu_group *group, void *iommu_data,
void (*release)(void *iommu_data))
{
}
int iommu_group_set_name(struct iommu_group *group, const char *name)
{
return -ENODEV;
}
int iommu_group_add_device(struct iommu_group *group, struct device *dev)
{
return -ENODEV;
}
void iommu_group_remove_device(struct device *dev)
{
}
int iommu_group_for_each_dev(struct iommu_group *group, void *data,
int (*fn)(struct device *, void *))
{
return -ENODEV;
}
struct iommu_group *iommu_group_get(struct device *dev)
{
return NULL;
}
void iommu_group_put(struct iommu_group *group)
{
}
int iommu_group_register_notifier(struct iommu_group *group,
struct notifier_block *nb)
{
return -ENODEV;
}
int iommu_group_unregister_notifier(struct iommu_group *group,
struct notifier_block *nb)
{
return 0;
}
int iommu_group_id(struct iommu_group *group)
{
return -ENODEV;
}
#endif /* CONFIG_IOMMU_API */
#endif /* __LINUX_IOMMU_H */
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment