Commit c5541ba3 authored by David Hildenbrand, committed by Andrew Morton

mm: follow_pte() improvements

follow_pte() is now our main function to look up PTEs in VM_PFNMAP/VM_IO
VMAs.  Let's perform some more sanity checks to make this exported
function harder to abuse.

Further, extend the doc a bit: it still focuses on the KVM use case with
MMU notifiers.  Drop the KVM+follow_pfn() comment; follow_pfn() is no
more, and we have other users nowadays.

Also extend the doc regarding refcounted pages and the interaction with
MMU notifiers.

KVM is one example that uses MMU notifiers and can deal with refcounted
pages properly.  VFIO is one example that doesn't use MMU notifiers, and
to prevent use-after-free, rejects refcounted pages: pfn_valid(pfn) &&
!PageReserved(pfn_to_page(pfn)).  Protection changes are less of a concern
for users like VFIO: the behavior is similar to longterm-pinning a page,
and getting the PTE protection changed afterwards.

The primary concern with refcounted pages is use-after-free, which callers
should be aware of.
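
As a minimal sketch of the check described above (this is an
illustration, not VFIO's actual code, and the helper name
pfn_is_refcounted() is hypothetical): a PFN is treated as refcounted
when it has a valid struct page that is not marked reserved.

	/*
	 * Hedged sketch of the "reject refcounted pages" pattern:
	 * assumes only pfn_valid(), pfn_to_page() and PageReserved().
	 */
	static bool pfn_is_refcounted(unsigned long pfn)
	{
		return pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn));
	}

Users without MMU-notifier support would refuse to map such PFNs,
sidestepping the use-after-free concern entirely.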

Link: https://lkml.kernel.org/r/20240410155527.474777-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Fei Li <fei1.li@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Yonghua Huang <yonghua.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 29ae7d96
@@ -5933,15 +5933,21 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
  *
  * On a successful return, the pointer to the PTE is stored in @ptepp;
  * the corresponding lock is taken and its location is stored in @ptlp.
- * The contents of the PTE are only stable until @ptlp is released;
- * any further use, if any, must be protected against invalidation
- * with MMU notifiers.
+ *
+ * The contents of the PTE are only stable until @ptlp is released using
+ * pte_unmap_unlock(). This function will fail if the PTE is non-present.
+ * Present PTEs may include PTEs that map refcounted pages, such as
+ * anonymous folios in COW mappings.
+ *
+ * Callers must be careful when relying on PTE content after
+ * pte_unmap_unlock(). Especially if the PTE maps a refcounted page,
+ * callers must protect against invalidation with MMU notifiers; otherwise
+ * access to the PFN at a later point in time can trigger use-after-free.
  *
  * Only IO mappings and raw PFN mappings are allowed.  The mmap semaphore
  * should be taken for read.
  *
- * KVM uses this function.  While it is arguably less bad than the historic
- * ``follow_pfn``, it is not a good general-purpose API.
+ * This function must not be used to modify PTE content.
  *
  * Return: zero on success, -ve otherwise.
  */
@@ -5955,6 +5961,10 @@ int follow_pte(struct vm_area_struct *vma, unsigned long address,
 	pmd_t *pmd;
 	pte_t *ptep;
 
+	mmap_assert_locked(mm);
+	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
+		goto out;
+
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		goto out;
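
To make the documented contract concrete, here is a hedged usage
sketch, not code from this commit: @vma and @addr are assumed to be
caller-provided, with the mmap lock already held for read, and the
error handling is illustrative only.

	pte_t *ptep;
	spinlock_t *ptl;
	unsigned long pfn;

	if (follow_pte(vma, addr, &ptep, &ptl))
		return -EINVAL;
	/* The PTE content is only stable while @ptl is held. */
	pfn = pte_pfn(ptep_get(ptep));
	pte_unmap_unlock(ptep, ptl);
	/*
	 * From here on, @pfn may become stale. If it refers to a
	 * refcounted page, protect against invalidation with MMU
	 * notifiers (as KVM does) or reject such pages up front
	 * (as VFIO does) to avoid use-after-free.
	 */

Note how the new mmap_assert_locked() and range check above would catch
callers that pass an unlocked mm or an address outside the VMA.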