    userfaultfd: add UFFDIO_CONTINUE ioctl · f6191471
    Axel Rasmussen authored
    This ioctl is how userspace ought to resolve "minor" userfaults.  The
    idea is, userspace is notified that a minor fault has occurred.  It
    might change the contents of the page using its second non-UFFD mapping,
    or not.  Then, it calls UFFDIO_CONTINUE to tell the kernel "I have
    ensured the page contents are correct, carry on setting up the mapping".
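
    The userspace side of that last step can be sketched as below.  This is
    a minimal illustration, not code from this patch: the helper name
    resolve_minor_fault() and its page_size parameter are assumptions, and
    it presumes uapi headers recent enough to define UFFDIO_CONTINUE and
    struct uffdio_continue.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: tell the kernel that the page backing fault_addr
 * now has correct contents, so it can finish setting up the mapping.
 * page_size is the (huge) page size of the registered range and must be
 * a power of two.  Returns 0 on success or a negative errno value.
 */
static int resolve_minor_fault(int uffd, uint64_t fault_addr,
                               uint64_t page_size)
{
    struct uffdio_continue cont;

    memset(&cont, 0, sizeof(cont));
    /* Round the faulting address down to the start of its page. */
    cont.range.start = fault_addr & ~(page_size - 1);
    cont.range.len = page_size;
    cont.mode = 0; /* default: wake the faulting thread afterwards */

    if (ioctl(uffd, UFFDIO_CONTINUE, &cont) == -1)
        return -errno;
    return 0;
}
```

    A fault-handling thread would call this after it has (optionally)
    updated the page through the second, non-UFFD mapping.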
    
    Note that it doesn't make much sense to use UFFDIO_{COPY,ZEROPAGE} for
    MINOR registered VMAs.  ZEROPAGE maps the VMA to the zero page; but in
    the minor fault case, we already have some pre-existing underlying page.
    Likewise, UFFDIO_COPY isn't useful if we have a second non-UFFD mapping.
    We'd just use memcpy() or similar instead.
    
    It turns out hugetlb_mcopy_atomic_pte() already does something very close
    to what we want, if an existing page is provided via `struct page **pagep`.  We
    already special-case the behavior a bit for the UFFDIO_ZEROPAGE case, so
    just extend that design: add an enum for the three modes of operation,
    and make the small adjustments needed for the MCOPY_ATOMIC_CONTINUE
    case.  (Basically, look up the existing page, and avoid adding the
    existing page to the page cache or calling set_page_huge_active() on
    it.)
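
    As a rough model of that design (a standalone sketch, not the kernel
    code itself): only MCOPY_ATOMIC_CONTINUE is named above, so the other
    two enumerator names and both helper functions here are assumptions
    made for illustration.

```c
#include <stdbool.h>

/* The three modes of operation described above. */
enum mcopy_atomic_mode {
    MCOPY_ATOMIC_NORMAL,   /* assumed name: the UFFDIO_COPY case */
    MCOPY_ATOMIC_ZEROPAGE, /* assumed name: the UFFDIO_ZEROPAGE case */
    MCOPY_ATOMIC_CONTINUE, /* install a mapping to the existing page */
};

/* Hypothetical helper: CONTINUE looks up the page that already exists. */
static bool mode_uses_existing_page(enum mcopy_atomic_mode mode)
{
    return mode == MCOPY_ATOMIC_CONTINUE;
}

/*
 * Hypothetical helper: for an existing page, the page cache insertion
 * and the set_page_huge_active() call are both skipped.
 */
static bool mode_skips_page_setup(enum mcopy_atomic_mode mode)
{
    return mode_uses_existing_page(mode);
}
```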
    
    Link: https://lkml.kernel.org/r/20210301222728.176417-5-axelrasmussen@google.com
    Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Cc: Adam Ruprecht <ruprecht@google.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Cannon Matthews <cannonmatthews@google.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Chinwen Chang <chinwen.chang@mediatek.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jerome Glisse <jglisse@redhat.com>
    Cc: Kirill A. Shutemov <kirill@shutemov.name>
    Cc: Lokesh Gidra <lokeshgidra@google.com>
    Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: "Michal Koutn" <mkoutny@suse.com>
    Cc: Michel Lespinasse <walken@google.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
    Cc: Mina Almasry <almasrymina@google.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Oliver Upton <oupton@google.com>
    Cc: Shaohua Li <shli@fb.com>
    Cc: Shawn Anastasio <shawn@anastas.io>
    Cc: Steven Price <steven.price@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>