Commits · a11dda723c6493bb1853bbc61c093377f96e2d47 · Kirill Smelkov / linux

10 Jul, 2024 1 commit

iommufd: Require drivers to supply the cache_invalidate_user ops · a11dda72

Jason Gunthorpe authored Jun 28, 2024

If drivers don't do this then iommufd will oops invalidation ioctls with
something like:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
  Mem abort info:
    ESR = 0x0000000086000004
    EC = 0x21: IABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x04: level 0 translation fault
  user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101059000
  [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
  Internal error: Oops: 0000000086000004 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 2 PID: 371 Comm: qemu-system-aar Not tainted 6.8.0-rc7-gde77230ac23a #9
  Hardware name: linux,dummy-virt (DT)
  pstate: 81400809 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=-c)
  pc : 0x0
  lr : iommufd_hwpt_invalidate+0xa4/0x204
  sp : ffff800080f3bcc0
  x29: ffff800080f3bcf0 x28: ffff0000c369b300 x27: 0000000000000000
  x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
  x23: 0000000000000000 x22: 00000000c1e334a0 x21: ffff0000c1e334a0
  x20: ffff800080f3bd38 x19: ffff800080f3bd58 x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8240d6d8
  x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
  x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
  x8 : 0000001000000002 x7 : 0000fffeac1ec950 x6 : 0000000000000000
  x5 : ffff800080f3bd78 x4 : 0000000000000003 x3 : 0000000000000002
  x2 : 0000000000000000 x1 : ffff800080f3bcc8 x0 : ffff0000c6034d80
  Call trace:
   0x0
   iommufd_fops_ioctl+0x154/0x274
   __arm64_sys_ioctl+0xac/0xf0
   invoke_syscall+0x48/0x110
   el0_svc_common.constprop.0+0x40/0xe0
   do_el0_svc+0x1c/0x28
   el0_svc+0x34/0xb4
   el0t_64_sync_handler+0x120/0x12c
   el0t_64_sync+0x190/0x194

All existing drivers implement this op for nesting, so this is mostly a
bisection aid.

Fixes: 8c6eabae ("iommufd: Add IOMMU_HWPT_INVALIDATE")
Link: https://lore.kernel.org/r/0-v1-e153859bd707+61-iommufd_check_ops_jgg@nvidia.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

a11dda72
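The oops above (pc at 0x0, called from iommufd_hwpt_invalidate) is what calling through a missing op looks like. A minimal Python model of the idea behind the fix, assuming a hypothetical `MockDomainOps` table and hypothetical helper names; it does not show where the kernel actually places its check, and the error code is an assumption:

```python
import errno
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MockDomainOps:
    # Hypothetical stand-in for a driver's domain ops table.
    cache_invalidate_user: Optional[Callable[[bytes], int]] = None


def check_nested_ops(ops: MockDomainOps) -> int:
    """Return 0 if the ops table can serve invalidation, else -EINVAL."""
    if ops.cache_invalidate_user is None:
        # Reject up front instead of jumping through a missing op later.
        return -errno.EINVAL
    return 0


def invalidate(ops: MockDomainOps, request: bytes) -> int:
    """Dispatch an invalidation request only after the ops table is checked."""
    ret = check_nested_ops(ops)
    if ret:
        return ret
    return ops.cache_invalidate_user(request)
```

In this model a driver that omits the op gets a clean error from `invalidate()` rather than a jump to address 0.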

09 Jul, 2024 7 commits

Merge branch 'iommufd_pri' into iommufd for-next · 18dcca24

Jason Gunthorpe authored Jul 09, 2024

Lu Baolu says:

====================
This series implements the functionality of delivering IO page faults to
user space through the IOMMUFD framework. One feasible use case is the
nested translation. Nested translation is a hardware feature that supports
two-stage translation tables for IOMMU. The second-stage translation table
is managed by the host VMM, while the first-stage translation table is
owned by user space. This allows user space to control the IOMMU mappings
for its devices.

When an IO page fault occurs on the first-stage translation table, the
IOMMU hardware can deliver the page fault to user space through the
IOMMUFD framework. User space can then handle the page fault and respond
to the device top-down through the IOMMUFD. This allows user space to
implement its own IO page fault handling policies.

A user space application that is capable of handling IO page faults should
allocate a fault object, and bind the fault object to any domain for which
it is willing to handle the generated faults. On a successful return
of fault object allocation, the user can retrieve and respond to page
faults by reading or writing to the file descriptor (FD) returned.

The iommu selftest framework has been updated to test the IO page fault
delivery and response functionality.
====================

* iommufd_pri:
  iommufd/selftest: Add coverage for IOPF test
  iommufd/selftest: Add IOPF support for mock device
  iommufd: Associate fault object with iommufd_hw_pgtable
  iommufd: Fault-capable hwpt attach/detach/replace
  iommufd: Add iommufd fault object
  iommufd: Add fault and response message definitions
  iommu: Extend domain attach group with handle support
  iommu: Add attach handle to struct iopf_group
  iommu: Remove sva handle list
  iommu: Introduce domain attachment handle

Link: https://lore.kernel.org/all/20240702063444.105814-1-baolu.lu@linux.intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

18dcca24

iommufd/selftest: Add coverage for IOPF test · d1211768

Lu Baolu authored Jul 02, 2024

Extend the selftest tool to add coverage of testing IOPF handling. This
would include the following tests:

- Allocating and destroying an iommufd fault object.
- Allocating and destroying an IOPF-capable HWPT.
- Attaching/detaching/replacing an IOPF-capable HWPT on a device.
- Triggering an IOPF on the mock device.
- Retrieving and responding to the IOPF through the file interface.

Link: https://lore.kernel.org/r/20240702063444.105814-11-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

d1211768

iommufd/selftest: Add IOPF support for mock device · ddee1997

Lu Baolu authored Jul 02, 2024

Extend the selftest mock device to support generating and responding to
an IOPF. Also add an ioctl interface to userspace applications to trigger
the IOPF on the mock device. This would allow userspace applications to
test the IOMMUFD's handling of IOPFs without having to rely on any real
hardware.

Link: https://lore.kernel.org/r/20240702063444.105814-10-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

ddee1997

iommufd: Associate fault object with iommufd_hw_pgtable · 34765cbc

Lu Baolu authored Jul 02, 2024

When allocating a user iommufd_hw_pagetable, the user space is allowed to
associate a fault object with the hw_pagetable by specifying the fault
object ID in the page table allocation data and setting the
IOMMU_HWPT_FAULT_ID_VALID flag bit.

On a successful return of hwpt allocation, the user can retrieve and
respond to page faults by reading and writing the file interface of the
fault object.

Once a fault object has been associated with a hwpt, the hwpt is
iopf-capable, indicated by hwpt->fault being non-NULL. Attaching,
detaching, or replacing an iopf-capable hwpt to an RID or PASID will
differ from those that are not iopf-capable.

Link: https://lore.kernel.org/r/20240702063444.105814-9-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

34765cbc

iommufd: Fault-capable hwpt attach/detach/replace · b7d88336

Lu Baolu authored Jul 02, 2024

Add iopf-capable hw page table attach/detach/replace helpers. The pointer
to iommufd_device is stored in the domain attachment handle, so that it
can be echo'ed back in the iopf_group.

The iopf-capable hw page tables can only be attached to devices that
support the IOMMU_DEV_FEAT_IOPF feature. On the first attachment of an
iopf-capable hw_pagetable to the device, the IOPF feature is enabled on
the device. Similarly, after the last iopf-capable hwpt is detached from
the device, the IOPF feature is disabled on the device.

The current implementation allows a replacement between iopf-capable and
non-iopf-capable hw page tables. This matches the nested translation use
case, where a parent domain is attached by default and can then be
replaced with a nested user domain with iopf support.

Link: https://lore.kernel.org/r/20240702063444.105814-8-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

b7d88336
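The enable-on-first-attach, disable-on-last-detach rule described in b7d88336 amounts to a reference count. A rough Python model, assuming a hypothetical `MockDevice` class; the names and the exceptions are illustrative and not taken from the kernel code:

```python
class MockDevice:
    """Hypothetical device tracking attached iopf-capable page tables."""

    def __init__(self, supports_iopf: bool) -> None:
        self.supports_iopf = supports_iopf
        self.iopf_enabled = False
        self._iopf_attach_count = 0

    def attach(self, hwpt_is_iopf_capable: bool) -> None:
        if not hwpt_is_iopf_capable:
            return  # non-iopf-capable attachments do not touch the feature
        if not self.supports_iopf:
            raise ValueError("device does not support IOPF")
        if self._iopf_attach_count == 0:
            self.iopf_enabled = True  # first iopf-capable attachment
        self._iopf_attach_count += 1

    def detach(self, hwpt_is_iopf_capable: bool) -> None:
        if not hwpt_is_iopf_capable:
            return
        if self._iopf_attach_count == 0:
            raise RuntimeError("no iopf-capable page table attached")
        self._iopf_attach_count -= 1
        if self._iopf_attach_count == 0:
            self.iopf_enabled = False  # last iopf-capable detachment
```

Replacing a non-iopf-capable page table with an iopf-capable one corresponds here to a plain `detach(False)` followed by `attach(True)`.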

iommufd: Add iommufd fault object · 07838f7f

Lu Baolu authored Jul 02, 2024

An iommufd fault object provides an interface for delivering I/O page
faults to user space. These objects are created and destroyed by user
space, and they can be associated with or dissociated from hardware page
table objects during page table allocation or destruction.

User space interacts with the fault object through a file interface. This
interface offers a straightforward and efficient way for user space to
handle page faults. It allows user space to read fault messages
sequentially and respond to them by writing to the same file. The file
interface supports reading messages in poll mode, so it's recommended that
user space applications use io_uring to enhance read and write efficiency.

A fault object can be associated with any iopf-capable iommufd_hw_pgtable
during the pgtable's allocation. All I/O page faults triggered by devices
when accessing the I/O addresses of an iommufd_hw_pgtable are routed
through the fault object to user space. Similarly, user space's responses
to these page faults are routed back to the iommu device driver through
the same fault object.

Link: https://lore.kernel.org/r/20240702063444.105814-7-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

07838f7f

iommufd: Add fault and response message definitions · c714f158

Lu Baolu authored Jul 02, 2024

iommu_hwpt_pgfaults represent fault messages that the userspace can
retrieve. Multiple iommu_hwpt_pgfaults might be put in an iopf group,
with the IOMMU_PGFAULT_FLAGS_LAST_PAGE flag set only for the last
iommu_hwpt_pgfault.

An iommu_hwpt_page_response is a response message that the userspace
should send to the kernel after finishing handling a group of fault
messages. The @dev_id, @pasid, and @grpid fields in the message
identify an outstanding iopf group for a device. The @cookie field,
which matches the cookie field of the last fault in the group, will
be used by the kernel to look up the pending message.

Link: https://lore.kernel.org/r/20240702063444.105814-6-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

c714f158
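The grouping rule in c714f158 (a group ends at the fault carrying the LAST_PAGE flag, and the response reuses that fault's cookie) can be modelled as below. This is only a sketch: the `Fault` class, the `LAST_PAGE` bit value and the `code` field of the response are assumptions for illustration, not the UAPI definitions.

```python
from dataclasses import dataclass

LAST_PAGE = 1 << 0  # illustrative bit value, not taken from the UAPI header


@dataclass
class Fault:
    dev_id: int
    pasid: int
    grpid: int
    flags: int
    cookie: int


def split_groups(faults):
    """Yield lists of faults, closing each group at the LAST_PAGE flag."""
    group = []
    for fault in faults:
        group.append(fault)
        if fault.flags & LAST_PAGE:
            yield group
            group = []


def build_response(group, code):
    """Build one response per group, keyed by the last fault's cookie."""
    last = group[-1]
    return {
        "dev_id": last.dev_id,
        "pasid": last.pasid,
        "grpid": last.grpid,
        "cookie": last.cookie,
        "code": code,  # hypothetical result field
    }
```

A reader loop would call `split_groups()` on the messages it has read and write one `build_response()` result back per completed group.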

04 Jul, 2024 4 commits

iommu: Extend domain attach group with handle support · 8519e689

Lu Baolu authored Jul 02, 2024

In the SVA case, each PASID of a device has an SVA domain attached to
it, and the I/O page faults are handled by the fault handler of that
SVA domain. In contrast, the I/O page faults for a user page table might
be handled by the domain attached to RID or the domain attached to
the PASID, depending on whether the PASID table is managed by user
space or kernel. As a result, there is a need for the domain attach
group interfaces to have attach handle support. The attach handle
will be forwarded to the fault handler of the user domain.

Add some variants of the domain attaching group interfaces so that they
could support the attach handle and export them for use in IOMMUFD.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240702063444.105814-5-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>

8519e689

iommu: Add attach handle to struct iopf_group · 06cdcc32

Lu Baolu authored Jul 02, 2024

Previously, the domain that a page fault targets is stored in an
iopf_group, which represents a minimal set of page faults. With the
introduction of attach handle, replace the domain with the handle
so that the fault handler can obtain more information as needed
when handling the faults.

iommu_report_device_fault() is currently used for SVA page faults,
which handles the page fault in an internal cycle. The domain is retrieved
with iommu_get_domain_for_dev_pasid() if the pasid in the fault message
is valid. This doesn't work in IOMMUFD case, where if the pasid table of
a device is wholly managed by user space, there is no domain attached to
the PASID of the device, and all page faults are forwarded through a
NESTING domain attaching to RID.

Add a static flag in iommu ops, which indicates if the IOMMU driver
supports user-managed PASID tables. In the iopf deliver path, if no
attach handle is found for the iopf PASID, fall back to the RID domain when
the IOMMU driver supports this capability.

iommu_get_domain_for_dev_pasid() is no longer used and can be removed.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240702063444.105814-4-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>

06cdcc32
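The lookup order described in 06cdcc32 (PASID handle first, then the RID handle only when the driver advertises user-managed PASID tables) can be sketched as follows. The function and parameter names are hypothetical, and a plain dict stands in for the group's pasid array:

```python
def find_fault_handle(pasid_handles, rid_handle, pasid, user_pasid_table):
    """Pick the attach handle that should receive a fault for `pasid`.

    pasid_handles:    dict mapping PASID -> attach handle (model only)
    rid_handle:       handle of the domain attached to the RID, or None
    user_pasid_table: True if the driver supports user-managed PASID tables
    """
    handle = pasid_handles.get(pasid)
    if handle is None and user_pasid_table:
        # No domain is attached to this PASID; the fault is forwarded
        # through the domain attached to the RID.
        return rid_handle
    return handle
```

When the function returns `None`, the model has no handler for the fault; how the kernel reports that case is outside what the message describes.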

iommu: Remove sva handle list · 3e7f57d1

Lu Baolu authored Jul 02, 2024

The struct iommu_sva represents an association between an SVA domain and
a PASID of a device. It's stored in the iommu group's pasid array and also
tracked by a list in the per-mm data structure. Remove the duplicate
tracking of iommu_sva by eliminating the list.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240702063444.105814-3-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>

3e7f57d1

iommu: Introduce domain attachment handle · 14678219

Lu Baolu authored Jul 02, 2024

Currently, when attaching a domain to a device or its PASID, the domain is
stored within the iommu group. It could be retrieved for use during the
window between attachment and detachment.

With new features introduced, there's a need to store more information
than just a domain pointer. This information essentially represents the
association between a domain and a device. For example, the SVA code
already has a custom struct iommu_sva which represents a bond between
sva domain and a PASID of a device. Looking forward, the IOMMUFD needs
a place to store the iommufd_device pointer in the core, so that the
device object ID could be quickly retrieved in the critical fault handling
path.

Introduce domain attachment handle that explicitly represents the
attachment relationship between a domain and a device or its PASID.

Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240702063444.105814-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>

14678219

28 Jun, 2024 11 commits

iommufd/iova_bitmap: Remove iterator logic · 53e6b656

Joao Martins authored Jun 27, 2024

The newly introduced dynamic pinning/windowing greatly simplifies the code
and there's no obvious performance advantage that has been identified that
justifies maintaining both schemes.

Remove the iterator logic and have iova_bitmap_for_each() just invoke the
callback with the total iova/length.

Fixes: 2780025e ("iommufd/iova_bitmap: Handle recording beyond the mapped pages")
Link: https://lore.kernel.org/r/20240627110105.62325-12-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

53e6b656

iommufd/iova_bitmap: Dynamic pinning on iova_bitmap_set() · 7a7bba16

Joao Martins authored Jun 27, 2024

Today zerocopy iova bitmaps use a static iteration scheme where it walks
the bitmap data in a max iteration size of 2M of bitmap data at a time.
That translates to a fixed window of IOVA space that can span up to 64G
(e.g. base pages, x86). Here 'window' refers to the IOVA space represented
by the bitmap data it is iterating. This static scheme is the ideal one
where the reported page-size is the same as the one behind the dirty
tracker.

However, problems start to appear when the dirty tracker may
dirty in many PTE sizes beyond or unaligned at the boundaries of the
iteration window. Such is the case for the IOMMU and
commit 2780025e ("iommufd/iova_bitmap: Handle recording beyond the mapped pages")
tried to fix the problem by handling the PTEs that get dirty which
surpass the end of the iteration. But the fix was incomplete and it
didn't handle all the data structure issues, namely:

1) when there's nothing to dirty but the end of the iteration IOVA range is
an IOMMU hugepage PTE that crosses iterations, the next iteration finds the
other end of said hugepage but doesn't account for having already checked
that IOPTE. The iommu driver then walks the IOVA space as if it were a new
page, without accounting for being past the start of a bigger page, which
ends up setting (future) dirty bits slightly offset. Note that the partial
ranges here are self-induced, a result of the fixed 'window' scheme being
unaligned to this hugepage IOPTE.

2) along the same line of thinking, between pinning pages of different
iterations, DMA could mark PTEs as dirty on the second part of
this previously mentioned partial hugepage. This leads to marking part of
the hugepage as dirty but still clearing the IOPTE, leading to missed dirty
data.

So to fix these problems more fundamentally and avoid future ones: instead
of iterating the whole bitmap in fixed chunks, only pin the bitmap
pages when there are dirty bits to set. The logic is simple in
iova_bitmap_set(): check whether the current iova range to be marked as
dirty is pinned and, if it's not, pin the bitmap pages where the
to-be-recorded @iova starts. If it's partially mapped out of the whole set,
continue pinning it and set bits until the whole dirty-size is covered. The
latter is more relevant with AMD iommu pgtable v1 format where you can have
up to 64G/128G/256G page sizes and thus you can set 64G at a time. Code also
gets simpler and easier to follow.

Fixing this without changing this iteration scheme means changing iommu
drivers to ignore any partial pages and not clear dirty bits, which is a
bit hacky. Though getting to walk only part of an IOMMU hugepage is
self-induced by this iteration scheme, as it doesn't (and can't) align the
iteration boundary to the huge IOPTE at the end. Thus it can't know what
hugepage size the iteration should align to until it walks the begin/end.

Dynamically pinning adds some comparisons inside iova_bitmap_set() to check
if something needs to be pinned if the IOVA range is out of range. Though
it has the benefit that non-dirty IOVA ranges only walk page tables without
needing to pin any bitmap pages. This dynamic scheme should be better for IOMMUs
where upper layers don't need or know what PTE sizes IOVAs map into (and there
could be more than one PTE size[*]) until they walk the IOMMU page tables.

A follow-up change will remove the iteration logic.

[*] Especially on AMD v1 iommu pgtable format where most powers of two are
supported as page-size.

Link: https://lore.kernel.org/linux-iommu/6b90f949-48da-4cb3-ad9a-ed54f1351a9a@oracle.com/
Fixes: 2780025e ("iommufd/iova_bitmap: Handle recording beyond the mapped pages")
Link: https://lore.kernel.org/r/20240627110105.62325-11-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

7a7bba16
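The 64G window quoted in 7a7bba16 follows from one bitmap bit per base page, and the boundary problem arises whenever a huge IOPTE straddles a window edge. A small Python sketch of both calculations, assuming hypothetical helper names and that `pte_base` is not below `start`:

```python
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30


def window_span(bitmap_bytes: int, page_size: int) -> int:
    """IOVA bytes covered by one window: one bitmap bit per page."""
    return bitmap_bytes * 8 * page_size


def crosses_window(start: int, span: int, pte_base: int, pte_size: int) -> bool:
    """True if the PTE at pte_base straddles a window boundary.

    Windows are [start + k*span, start + (k+1)*span) for k = 0, 1, ...
    Assumes pte_base >= start.
    """
    first = (pte_base - start) // span
    last = (pte_base + pte_size - 1 - start) // span
    return first != last
```

With 4K base pages, `window_span(2 * MIB, 4 * KIB)` equals `64 * GIB`. A 1G IOPTE at IOVA 64G is split across two windows when the walk starts at an IOVA that is not 1G-aligned (for example 2M), which is the partial-hugepage situation the message describes.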

iommufd/iova_bitmap: Consolidate iova_bitmap_set exit conditionals · 00fa1a89

Joao Martins authored Jun 27, 2024

There's no need to have two conditionals when they are closely tied
together. Move the setting of bitmap::set_ahead_length after it checks for
::pages array out of bounds access.

Link: https://lore.kernel.org/r/20240627110105.62325-10-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

00fa1a89

iommufd/iova_bitmap: Move initial pinning to iova_bitmap_for_each() · 781bc087

Joao Martins authored Jun 27, 2024

The pinned pages are only relevant when it starts iterating the bitmap so
defer that into iova_bitmap_for_each().

Link: https://lore.kernel.org/r/20240627110105.62325-9-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

781bc087

iommufd/iova_bitmap: Cache mapped length in iova_bitmap_map struct · a84c690e

Joao Martins authored Jun 27, 2024

The amount of IOVA mapped will be used more often in iova_bitmap_set() in
preparation to dynamically iterate the bitmap. Cache said length to avoid
having to calculate it all the time.

Link: https://lore.kernel.org/r/20240627110105.62325-8-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

a84c690e

iommufd/iova_bitmap: Check iova_bitmap_done() after set ahead · 79258365

Joao Martins authored Jun 27, 2024

After iova_bitmap_set_ahead() returns it may be at the end of the range.
Move iova_bitmap_set_ahead() earlier to avoid unnecessary attempt in
trying to pin the next pages by reusing iova_bitmap_done() check.

Fixes: 2780025e ("iommufd/iova_bitmap: Handle recording beyond the mapped pages")
Link: https://lore.kernel.org/r/20240627110105.62325-7-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

79258365

iommufd/selftest: Do not record head iova to better match iommu drivers · dceb5304

Joao Martins authored Jun 27, 2024

Do not set a hugepage-aligned IOVA for incrementing an IOVA, to better
match current IOMMU driver implementations. Keep the logic of clearing all
IOPTE dirty bits for a whole hugepage, even if the range being dirtied
starts from part of the hugepage. This is also similar to AMD driver (iommu
v1 format) where IOMMU uses various subpage PTE data for dirty tracking
(for non-standard page sizes).

Link: https://lore.kernel.org/r/20240627110105.62325-6-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

dceb5304

iommufd/selftest: Fix tests to use MOCK_PAGE_SIZE based buffer sizes · ffa3c799

Joao Martins authored Jun 27, 2024

commit a9af47e3 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
added tests covering edge cases in the boundaries of iova bitmap. However,
it used buffer sizes thinking in PAGE_SIZE (4K) as opposed to the
MOCK_PAGE_SIZE (2K) that is used in iommufd mock selftests. This meant that
it isn't correctly exercising everything, specifically the u32 and 4K bitmap
test cases. Fix selftests buffer sizes to be based on mock page size.

Link: https://lore.kernel.org/r/20240627110105.62325-5-joao.m.martins@oracle.com
Reported-by: Kevin Tian <kevin.tian@intel.com>
Closes: https://lore.kernel.org/linux-iommu/96efb6cf-a41c-420f-9673-2f0b682cac8c@oracle.com/
Fixes: a9af47e3 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

ffa3c799

iommufd/selftest: Add tests for <= u8 bitmap sizes · 33335584

Joao Martins authored Jun 27, 2024

Add more tests for bitmaps smaller than or equal to a u8, though skip the
tests if the IOVA buffer size is smaller than the mock page size.

Link: https://lore.kernel.org/r/20240627110105.62325-4-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

33335584

iommufd/selftest: Fix iommufd_test_dirty() to handle <u8 bitmaps · 9560393b

Joao Martins authored Jun 27, 2024

The calculation returns 0 if it sets less than the number of bits per
byte. For calculating memory allocation from bits, let's round it up to
one byte.

Link: https://lore.kernel.org/r/20240627110105.62325-3-joao.m.martins@oracle.com
Reported-by: Matt Ochs <mochs@nvidia.com>
Fixes: a9af47e3 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

9560393b

iommufd/selftest: Fix dirty bitmap tests with u8 bitmaps · ec61f820

Joao Martins authored Jun 27, 2024

With 64k base pages, the first 128k iova length test requires less than a
byte for a bitmap, exposing a bug in the tests that assume that bitmaps are
at least a byte.

Rather than dealing with bytes, have _test_mock_dirty_bitmaps() pass the
number of bits. The caller functions are adjusted to use bits as well,
converting to bytes when clearing, allocating and freeing the bitmap.

Link: https://lore.kernel.org/r/20240627110105.62325-2-joao.m.martins@oracle.com
Reported-by: Matt Ochs <mochs@nvidia.com>
Fixes: a9af47e3 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Matt Ochs <mochs@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

ec61f820
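The sub-byte case behind 9560393b and ec61f820 is plain integer arithmetic: 128K of IOVA with 64K pages needs 2 bits, and truncating division turns that into 0 bytes. A short sketch with hypothetical helper names (the selftests themselves are written in C):

```python
BITS_PER_BYTE = 8


def bitmap_bits(iova_length: int, page_size: int) -> int:
    """Number of dirty bits needed: one per page in the IOVA range."""
    return iova_length // page_size


def bitmap_bytes_truncating(nbits: int) -> int:
    """Buggy size calculation: truncates, so fewer than 8 bits gives 0."""
    return nbits // BITS_PER_BYTE


def bitmap_bytes(nbits: int) -> int:
    """Allocation size in bytes, rounded up so a partial byte still counts."""
    return (nbits + BITS_PER_BYTE - 1) // BITS_PER_BYTE
```

For the 64K-page case, `bitmap_bits(128 * 1024, 64 * 1024)` is 2, the truncating variant yields 0 bytes, and the rounded-up variant yields 1.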

23 Jun, 2024 8 commits

Linux 6.10-rc5 · f2661062
Linus Torvalds authored Jun 23, 2024

f2661062

Merge tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 7c16f0a4

Linus Torvalds authored Jun 23, 2024

Pull i2c fixes from Wolfram Sang:
 "The core gains placeholders for recently added functions when
  CONFIG_I2C is not defined as well as documentation fixes to start using
  inclusive terminology.

  The drivers get paths in DT bindings fixed as well as proper interrupt
  handling for the ocores driver"

* tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' pairs
  docs: i2c: summary: document 'local' and 'remote' targets
  docs: i2c: summary: document use of inclusive language
  docs: i2c: summary: update speed mode description
  docs: i2c: summary: update I2C specification link
  docs: i2c: summary: start sentences consistently.
  i2c: Add nop fwnode operations
  i2c: ocores: set IACK bit after core is enabled
  dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema
  dt-bindings: i2c: atmel,at91sam: correct path to i2c-controller schema

7c16f0a4

Merge tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · d14f2780

Linus Torvalds authored Jun 23, 2024

Pull smb client fixes from Steve French:
 "Five smb3 client fixes

   - three netfs/folios cifs fixes

   - fix typo in module parameters description

   - fix incorrect swap warning"

* tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Move the 'pid' from the subreq to the req
  cifs: Only pick a channel once per read request
  cifs: Defer read completion
  cifs: fix typo in module parameter enable_gcm_256
  cifs: drop the incorrect assertion in cifs_swap_rw()

d14f2780

Merge tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock · 0971e82e

Linus Torvalds authored Jun 23, 2024

Pull memblock fix from Mike Rapoport:
 "Fix fragility in checks for unset node ID.

  Use numa_valid_node() function to verify that nid is a valid node
  ID instead of inconsistent comparisons with either NUMA_NO_NODE or
  MAX_NUMNODES"

* tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: use numa_valid_node() helper to check for invalid node ID

0971e82e
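The point of the memblock fix is that one range check replaces separate comparisons against two sentinels. A minimal model, assuming the helper accepts exactly the IDs in [0, MAX_NUMNODES); the `MAX_NUMNODES` value here is an arbitrary illustration, since the real one depends on the kernel configuration:

```python
NUMA_NO_NODE = -1
MAX_NUMNODES = 64  # assumption for illustration; config dependent in the kernel


def numa_valid_node(nid: int) -> bool:
    """Model of a single validity check for a node ID."""
    return 0 <= nid < MAX_NUMNODES
```

Both `NUMA_NO_NODE` and `MAX_NUMNODES` fail this check, so callers no longer need to know which of the two an "unset" node ID happens to carry.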

Merge tag 'mips-fixes_6.10_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · b67eeff7

Linus Torvalds authored Jun 23, 2024

Pull MIPS fixes from Thomas Bogendoerfer:

 - fix lseek in o32 compat mode

 - fix for microMIPS MT ASE helpers

* tag 'mips-fixes_6.10_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: fix compat_sys_lseek syscall
  MIPS: mipsmtregs: Fix target register for MFTC0

b67eeff7

Merge tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b9e6612d

Linus Torvalds authored Jun 23, 2024

Pull x86 fixes from Borislav Petkov:

 - An ARM-relevant fix to not free default RMIDs of a resource control
   group

 - A randconfig build fix for the VMware virtual GPU driver

* tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Don't try to free nonexistent RMIDs
  drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependency

b9e6612d

Merge tag 'powerpc-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d1505b5c

Linus Torvalds authored Jun 23, 2024

Pull powerpc fixes from Michael Ellerman:

 - Prevent use-after-free in 64-bit KVM VFIO

 - Add generated Power8 crypto asm to .gitignore

Thanks to Al Viro and Nathan Lynch.

* tag 'powerpc-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  KVM: PPC: Book3S HV: Prevent UAF in kvm_spapr_tce_attach_iommu_group()
  powerpc/crypto: Add generated P8 asm to .gitignore

d1505b5c

Merge tag 'i2c-host-fixes-6.10-rc5' of... · 2c50f892

Wolfram Sang authored Jun 23, 2024

Merge tag 'i2c-host-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current

This pull request fixes the paths of the dt-schema to their
complete locations for the ChromeOS EC tunnel driver and the
Atmel at91sam drivers.

Additionally, the OpenCores driver receives a fix for an issue
that dates back to version 2.6.18. Specifically, the interrupts
need to be acknowledged (clearing all pending interrupts) after
enabling the core.

2c50f892

22 Jun, 2024 9 commits

Merge tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux · 5f583a31

Linus Torvalds authored Jun 22, 2024

Pull rust fix from Miguel Ojeda:

 - Avoid unused import warning in 'rusttest'.

* tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux:
  rust: avoid unused import warning in `rusttest`

5f583a31

Merge tag 'regulator-fix-v6.10-rc4' of... · 2765de94

Linus Torvalds authored Jun 22, 2024

Merge tag 'regulator-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A few driver specific fixes for incorrect device descriptions, plus a
  fix for a missing symbol export which causes build failures for some
  newly added drivers in other trees"

* tag 'regulator-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: axp20x: AXP717: fix LDO supply rails and off-by-ones
  regulator: bd71815: fix ramp values
  regulator: core: Fix modpost error "regulator_get_regmap" undefined
  regulator: tps6594-regulator: Fix the number of irqs for TPS65224 and TPS6594

2765de94

Merge tag 'spi-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · e24638af

Linus Torvalds authored Jun 22, 2024

Pull spi fixes from Mark Brown:
 "A number of fixes that have built up for SPI, a bunch of driver
  specific ones including an unfortunate revert of an optimisation for
  the i.MX driver which was causing issues with some configurations,
  plus a couple of core fixes for the rarely used octal mode and for a
  bad interaction between multi-CS support and target mode"

* tag 'spi-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-imx: imx51: revert burst length calculation back to bits_per_word
  spi: Fix SPI slave probe failure
  spi: Fix OCTAL mode support
  spi: stm32: qspi: Clamp stm32_qspi_get_mode() output to CCR_BUSWIDTH_4
  spi: stm32: qspi: Fix dual flash mode sanity test in stm32_qspi_setup()
  spi: cs42l43: Drop cs35l56 SPI speed down to 11MHz
  spi: cs42l43: Correct SPI root clock speed

Merge tag 'nfsd-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · c2fc9462

Linus Torvalds authored Jun 22, 2024

Pull nfsd fixes from Chuck Lever:

 - Fix crashes triggered by administrative operations on the server

* tag 'nfsd-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: grab nfsd_mutex in nfsd_nl_rpc_status_get_dumpit()
  nfsd: fix oops when reading pool_stats before server is started

Merge tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 563a5067

Linus Torvalds authored Jun 22, 2024

Pull xfs fix from Chandan Babu:

 - Fix assertion failure due to a race between unlink and cluster buffer
   instantiation.

* tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix unlink vs cluster buffer instantiation race

Merge tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs · c3de9b57

Linus Torvalds authored Jun 22, 2024

Pull bcachefs fixes from Kent Overstreet:
 "Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.

  The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
  for the LRU position (we would prefer 64), so wraparound is possible
  for the cached data LRUs on a filesystem that has done sufficient
  (petabytes) reads; this is now handled.

  One notable user-reported bugfix, where we were forgetting to
  correctly set the bucket data type, which should have been
  BCH_DATA_need_gc_gens instead of BCH_DATA_free; this was causing us to
  go emergency read-only on a filesystem that had seen heavy enough use
  to see bucket gen wraparound.

  We're now starting to fix simple (safe) errors without requiring user
  intervention - i.e. a small incremental step towards full self
  healing.

  This is currently limited to just certain allocation information
  counters, and the error is still logged in the superblock; see that
  patch for more information. ("bcachefs: Fix safe errors by default")"

* tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs: (22 commits)
  bcachefs: Move the ei_flags setting to after initialization
  bcachefs: Fix a UAF after write_super()
  bcachefs: Use bch2_print_string_as_lines for long err
  bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()
  bcachefs: Replace bare EEXIST with private error codes
  bcachefs: Fix missing alloc_data_type_set()
  closures: Change BUG_ON() to WARN_ON()
  bcachefs: fix alignment of VMA for memory mapped files on THP
  bcachefs: Fix safe errors by default
  bcachefs: Fix bch2_trans_put()
  bcachefs: set_worker_desc() for delete_dead_snapshots
  bcachefs: Fix bch2_sb_downgrade_update()
  bcachefs: Handle cached data LRU wraparound
  bcachefs: Guard against overflowing LRU_TIME_BITS
  bcachefs: delete_dead_snapshots() doesn't need to go RW
  bcachefs: Fix early init error path in journal code
  bcachefs: Check for invalid btree IDs
  bcachefs: Fix btree ID bitmasks
  bcachefs: Fix shift overflow in read_one_super()
  bcachefs: Fix a locking bug in the do_discard_fast() path
  ...
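
To make the 48-bit LRU position limit from the message above concrete, here is a minimal sketch of how a truncated counter wraps and how a wrap-aware comparison can still order two nearby positions. It is a generic modular-arithmetic illustration, not the approach taken by the bcachefs patches. Only `LRU_TIME_BITS` comes from the message; the value 48 is taken from its "48 bits" remark, and `LRU_TIME_MAX`, `lru_time_truncate()` and `lru_time_before()` are names assumed for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LRU_TIME_BITS 48
#define LRU_TIME_MAX  ((1ULL << LRU_TIME_BITS) - 1)

/* Keep only the low 48 bits, as a 48-bit field would. */
static inline uint64_t lru_time_truncate(uint64_t t)
{
    return t & LRU_TIME_MAX;
}

/*
 * Wrap-aware "a is older than b" for two truncated positions.
 * Assumption: the true positions are less than 2^47 apart.
 */
static inline bool lru_time_before(uint64_t a, uint64_t b)
{
    assert(a <= LRU_TIME_MAX && b <= LRU_TIME_MAX);
    uint64_t diff = (b - a) & LRU_TIME_MAX;   /* distance modulo 2^48 */
    return diff != 0 && diff < (1ULL << (LRU_TIME_BITS - 1));
}
```

A plain `a < b` on the truncated values gives the wrong answer once the counter has wrapped, whereas the modular distance still orders the two positions as long as the stated assumption holds.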

Merge tag 'ata-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux · da3b6ef1

Linus Torvalds authored Jun 22, 2024

Pull ata fix from Niklas Cassel:

 - We currently enable DIPM (device initiated power management) in the
   device (using a SET FEATURES call to the device), regardless of
   whether the HBA supports any LPM states. It seems counterintuitive,
   and potentially dangerous, to enable a device-side feature when the
   HBA does not have the corresponding support. Thus, make sure that we
   do not enable DIPM if the HBA does not support any LPM states.

* tag 'ata-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: ahci: Do not enable LPM if no LPM states are supported by the HBA
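
The rule stated in the ata fix above can be sketched as a single predicate. This is an illustration only and not the libata code: the `HBA_LPM_*` bits and `should_enable_dipm()` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits for the LPM states an HBA might advertise. */
#define HBA_LPM_PARTIAL (1u << 0)
#define HBA_LPM_SLUMBER (1u << 1)
#define HBA_LPM_DEVSLP  (1u << 2)
#define HBA_LPM_ALL     (HBA_LPM_PARTIAL | HBA_LPM_SLUMBER | HBA_LPM_DEVSLP)

/*
 * Decide whether the device-side feature (DIPM) should be switched on.
 * It is only enabled when the host side supports at least one LPM state.
 */
bool should_enable_dipm(uint32_t hba_lpm_caps, bool dev_supports_dipm)
{
    assert((hba_lpm_caps & ~HBA_LPM_ALL) == 0);  /* only known bits */

    if (hba_lpm_caps == 0)
        return false;       /* HBA has no LPM support: leave DIPM off */

    return dev_supports_dipm;
}
```

The host-side check comes first, so a device that advertises DIPM is still left alone when the HBA reports no LPM states.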

Merge tag 'pwm/for-6.10-rc5-fixes-take2' of... · 1f5c5371

Linus Torvalds authored Jun 22, 2024

Merge tag 'pwm/for-6.10-rc5-fixes-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux

Pull pwm fixes from Uwe Kleine-König:
 "Three fixes for the pwm-stm32 driver.

  The first patch prevents an integer wrap-around for small periods. The
  second patch fixes the calculation of the prescaler, which previously
  resulted in values for the ARR register that don't fit into the
  corresponding register bit field. The last commit improves an error
  message that was wrongly copied from another error path"

* tag 'pwm/for-6.10-rc5-fixes-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
  pwm: stm32: Fix error message to not describe the previous error path
  pwm: stm32: Fix calculation of prescaler
  pwm: stm32: Refuse too small period requests
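
The first two problems described above (a wrap-around for tiny periods and a prescaler that lets the reload value overflow its bit field) can be illustrated with a generic timer calculation. This is a sketch under stated assumptions, not the pwm-stm32 code: `struct pwm_timing`, `pwm_compute_timing()` and its parameters are hypothetical, and the timer is assumed to count `arr + 1` ticks per period at `clk_hz / (psc + 1)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical result: prescaler and auto-reload values to program. */
struct pwm_timing {
    uint32_t psc;
    uint32_t arr;
};

/* Returns 0 on success, -1 if the period cannot be represented. */
int pwm_compute_timing(uint64_t period_ns, uint64_t clk_hz,
                       uint32_t max_psc, uint32_t max_arr,
                       struct pwm_timing *out)
{
    assert(out != NULL);

    /* Guard the 64-bit multiplication below against overflow. */
    if (clk_hz == 0 || period_ns > UINT64_MAX / clk_hz)
        return -1;

    uint64_t ticks = period_ns * clk_hz / NSEC_PER_SEC;

    /* Too small a period: "ticks - 1" would wrap to a huge value. */
    if (ticks == 0)
        return -1;

    /* Pick a prescaler so that ticks / (psc + 1) fits in max_arr + 1. */
    uint64_t max_prd = (uint64_t)max_arr + 1;
    uint64_t psc = (ticks - 1) / max_prd;
    if (psc > max_psc)
        return -1;

    uint64_t prd = ticks / (psc + 1);   /* 1 <= prd <= max_prd */

    out->psc = (uint32_t)psc;
    out->arr = (uint32_t)(prd - 1);
    return 0;
}
```

Rejecting `ticks == 0` up front is what prevents the wrap-around, and deriving the prescaler from `max_arr + 1` keeps the reload value inside its field. The prescaler chosen this way is valid but not necessarily the smallest possible one.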

Merge tag 'arm-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 56bf7334

Linus Torvalds authored Jun 22, 2024

Pull SoC fixes from Arnd Bergmann:
 "There are seven oneline patches that each address a distinct problem
  on the NXP i.MX platform, mostly the popular i.MX8M variant.

  The only other two fixes are for error handling on the psci firmware
  driver and SD card support on the milkv duo riscv board"

* tag 'arm-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  firmware: psci: Fix return value from psci_system_suspend()
  riscv: dts: sophgo: disable write-protection for milkv duo
  arm64: dts: imx8qm-mek: fix gpio number for reg_usdhc2_vmmc
  arm64: dts: freescale: imx8mm-verdin: enable hysteresis on slow input pin
  arm64: dts: imx93-11x11-evk: Remove the 'no-sdio' property
  arm64: dts: freescale: imx8mp-venice-gw73xx-2x: fix BT shutdown GPIO
  arm: dts: imx53-qsb-hdmi: Disable panel instead of deleting node
  arm64: dts: imx8mp: Fix TC9595 input clock on DH i.MX8M Plus DHCOM SoM
  arm64: dts: freescale: imx8mm-verdin: Fix GPU speed
