Commit 57e5d4f2 authored by Martin Cracauer's avatar Martin Cracauer Committed by Linus Torvalds

userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update

Add documentation about the write protection support.

[peterx@redhat.com: rewrite in rst format; fixups here and there]
Signed-off-by: default avatarMartin Cracauer <cracauer@cons.org>
Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Reviewed-by: default avatarJerome Glisse <jglisse@redhat.com>
Reviewed-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Link: http://lkml.kernel.org/r/20200220163112.11409-17-peterx@redhat.comSigned-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent 23080e27
......@@ -108,6 +108,57 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
half copied page since it'll keep userfaulting until the copy has
finished.
Notes:
- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
you must provide some kind of page in your thread after reading from
the uffd. You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
The normal behavior of the OS automatically providing a zero page on
an annonymous mmaping is not in place.
- None of the page-delivering ioctls default to the range that you
registered with. You must fill in all fields for the appropriate
ioctl struct including the range.
- You get the address of the access that triggered the missing page
event out of a struct uffd_msg that you read in the thread from the
uffd. You can supply as many pages as you want with UFFDIO_COPY or
UFFDIO_ZEROPAGE. Keep in mind that unless you used DONTWAKE then
the first of any of those IOCTLs wakes up the faulting thread.
- Be sure to test for all errors including (pollfd[0].revents &
POLLERR). This can happen, e.g. when ranges supplied were
incorrect.
Write Protect Notifications
---------------------------
This is equivalent to (but faster than) using mprotect and a SIGSEGV
signal handler.
Firstly you need to register a range with UFFDIO_REGISTER_MODE_WP.
Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
struct *uffdio_writeprotect) while mode = UFFDIO_WRITEPROTECT_MODE_WP
in the struct passed in. The range does not default to and does not
have to be identical to the range you registered with. You can write
protect as many ranges as you like (inside the registered range).
Then, in the thread reading from uffd the struct will have
msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set. Now you send
ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again
while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
This wakes up the thread which will continue to run with writes. This
allows you to do the bookkeeping about the write in the uffd reading
thread before the ioctl.
If you registered with both UFFDIO_REGISTER_MODE_MISSING and
UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
which you supply a page and undo write protect. Note that there is a
difference between writes into a WP area and into a !WP area. The
former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
UFFD_PAGEFAULT_FLAG_WRITE. The latter did not fail on protection but
you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was
used.
QEMU/KVM
========
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment