Commit e71e2ace authored by Peter Collingbourne, committed by Linus Torvalds

userfaultfd: do not untag user pointers

Patch series "userfaultfd: do not untag user pointers", v5.

If a user program uses userfaultfd on ranges of heap memory, it may end
up passing a tagged pointer to the kernel in the range.start field of
the UFFDIO_REGISTER ioctl.  This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].

When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg.  However, from the application's perspective, the tagged
address *is* the memory address, so if the application is unaware of
memory tags, it may get confused by receiving an address that is, from
its point of view, outside of the bounds of the allocation.  We observed
this behavior in the kselftest for userfaultfd [2] but other
applications could have the same problem.

Address this by not untagging pointers passed to the userfaultfd ioctls.
Instead, let the system call fail.  Also change the kselftest to use
mmap so that it doesn't encounter this problem.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c
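The mismatch described above can be illustrated with a small user-space model. This is only a sketch: it assumes the arm64 layout in which the tag occupies bits 63:56 of a pointer, and the helpers `strip_tag()` and `in_range()` are hypothetical names introduced for illustration, not kernel or libc functions. The kernel's own untagging macro is more general; clearing the top byte matches it only for user-space addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: arm64 top-byte tagging, with the tag in bits 63:56. */
#define TAG_SHIFT 56
#define TAG_MASK  (0xffULL << TAG_SHIFT)

/*
 * Hypothetical helper: clear the tag byte of a user address. This models
 * the untagged fault address that the application reads back from the
 * userfaultfd file descriptor.
 */
static uint64_t strip_tag(uint64_t addr)
{
	return addr & ~TAG_MASK;
}

/*
 * Hypothetical helper: the bounds check a tag-unaware application might
 * apply to a reported fault address, using the (possibly tagged) start
 * address it originally registered.
 */
static bool in_range(uint64_t addr, uint64_t start, uint64_t len)
{
	return addr >= start && addr - start < len;
}
```

With a registered start of `(0x3cULL << 56) | 0x7f0000001000ULL` and a fault 16 bytes into the range, `in_range(strip_tag(start + 0x10), start, 0x1000)` is false, because the untagged fault address compares below the tagged start. The same check succeeds once the start address is untagged as well.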

This patch (of 2):

Do not untag pointers passed to the userfaultfd ioctls.  Instead, let
the system call fail.  This will provide an early indication of problems
with tag-unaware userspace code instead of letting the code get confused
later, and is consistent with how we decided to handle brk/mmap/mremap
in commit dcde2373 ("mm: Avoid creating virtual address aliases in
brk()/mmap()/mremap()"), as well as being consistent with the existing
tagged address ABI documentation relating to how ioctl arguments are
handled.

The code change is a revert of commit 7d032574 ("userfaultfd: untag
user pointers") plus some fixups to some additional calls to
validate_range that have appeared since then.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c

Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com
Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b
Fixes: 63f0c603 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Delva <adelva@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William McVicker <willmcvicker@google.com>
Cc: <stable@vger.kernel.org>	[5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 704f4cba
@@ -45,14 +45,24 @@ how the user addresses are used by the kernel:
 
 1. User addresses not accessed by the kernel but used for address space
    management (e.g. ``mprotect()``, ``madvise()``). The use of valid
-   tagged pointers in this context is allowed with the exception of
-   ``brk()``, ``mmap()`` and the ``new_address`` argument to
-   ``mremap()`` as these have the potential to alias with existing
-   user addresses.
+   tagged pointers in this context is allowed with these exceptions:
 
-   NOTE: This behaviour changed in v5.6 and so some earlier kernels may
-   incorrectly accept valid tagged pointers for the ``brk()``,
-   ``mmap()`` and ``mremap()`` system calls.
+   - ``brk()``, ``mmap()`` and the ``new_address`` argument to
+     ``mremap()`` as these have the potential to alias with existing
+     user addresses.
+
+     NOTE: This behaviour changed in v5.6 and so some earlier kernels may
+     incorrectly accept valid tagged pointers for the ``brk()``,
+     ``mmap()`` and ``mremap()`` system calls.
+
+   - The ``range.start``, ``start`` and ``dst`` arguments to the
+     ``UFFDIO_*`` ``ioctl()``s used on a file descriptor obtained from
+     ``userfaultfd()``, as fault addresses subsequently obtained by reading
+     the file descriptor will be untagged, which may otherwise confuse
+     tag-unaware programs.
+
+     NOTE: This behaviour changed in v5.14 and so some earlier kernels may
+     incorrectly accept valid tagged pointers for this system call.
 
 2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
    relaxation is disabled by default and the application thread needs to
@@ -1236,23 +1236,21 @@ static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx,
 }
 
 static __always_inline int validate_range(struct mm_struct *mm,
-					  __u64 *start, __u64 len)
+					  __u64 start, __u64 len)
 {
 	__u64 task_size = mm->task_size;
 
-	*start = untagged_addr(*start);
-
-	if (*start & ~PAGE_MASK)
+	if (start & ~PAGE_MASK)
 		return -EINVAL;
 	if (len & ~PAGE_MASK)
 		return -EINVAL;
 	if (!len)
 		return -EINVAL;
-	if (*start < mmap_min_addr)
+	if (start < mmap_min_addr)
 		return -EINVAL;
-	if (*start >= task_size)
+	if (start >= task_size)
 		return -EINVAL;
-	if (len > task_size - *start)
+	if (len > task_size - start)
 		return -EINVAL;
 	return 0;
 }
@@ -1316,7 +1314,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		vm_flags |= VM_UFFD_MINOR;
 	}
 
-	ret = validate_range(mm, &uffdio_register.range.start,
+	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
 	if (ret)
 		goto out;
@@ -1522,7 +1520,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
 		goto out;
 
-	ret = validate_range(mm, &uffdio_unregister.start,
+	ret = validate_range(mm, uffdio_unregister.start,
 			     uffdio_unregister.len);
 	if (ret)
 		goto out;
@@ -1671,7 +1669,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx,
 	if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake)))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len);
+	ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
 	if (ret)
 		goto out;
 
@@ -1711,7 +1709,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
 			   sizeof(uffdio_copy)-sizeof(__s64)))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len);
+	ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
 	if (ret)
 		goto out;
 	/*
@@ -1768,7 +1766,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
 			   sizeof(uffdio_zeropage)-sizeof(__s64)))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_zeropage.range.start,
+	ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
 			     uffdio_zeropage.range.len);
 	if (ret)
 		goto out;
@@ -1818,7 +1816,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
 			   sizeof(struct uffdio_writeprotect)))
 		return -EFAULT;
 
-	ret = validate_range(ctx->mm, &uffdio_wp.range.start,
+	ret = validate_range(ctx->mm, uffdio_wp.range.start,
 			     uffdio_wp.range.len);
 	if (ret)
 		return ret;
@@ -1866,7 +1864,7 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
 			   sizeof(uffdio_continue) - (sizeof(__s64))))
 		goto out;
 
-	ret = validate_range(ctx->mm, &uffdio_continue.range.start,
+	ret = validate_range(ctx->mm, uffdio_continue.range.start,
 			     uffdio_continue.range.len);
 	if (ret)
 		goto out;
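
To see why a tagged address is now rejected rather than silently untagged, the patched checks can be mirrored in user space. The following is a minimal sketch, not the kernel code: `model_validate_range()` is a hypothetical stand-in for validate_range, and the page size, task size and minimum mmap address are assumed constants chosen for illustration (the kernel takes them from `PAGE_MASK`, `mm->task_size` and `mmap_min_addr`).

```c
#include <errno.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel derives these itself. */
#define MODEL_PAGE_SIZE     4096ULL
#define MODEL_PAGE_MASK     (~(MODEL_PAGE_SIZE - 1))
#define MODEL_TASK_SIZE     (1ULL << 48)
#define MODEL_MMAP_MIN_ADDR 4096ULL

/*
 * Hypothetical user-space mirror of the patched checks. start is taken
 * by value and is never untagged, so a non-zero tag byte leaves it at or
 * above the task size and the range is rejected.
 */
static int model_validate_range(uint64_t start, uint64_t len)
{
	if (start & ~MODEL_PAGE_MASK)
		return -EINVAL;	/* start is not page aligned */
	if (len & ~MODEL_PAGE_MASK)
		return -EINVAL;	/* length is not a multiple of the page size */
	if (!len)
		return -EINVAL;	/* empty range */
	if (start < MODEL_MMAP_MIN_ADDR)
		return -EINVAL;	/* below the lowest mappable address */
	if (start >= MODEL_TASK_SIZE)
		return -EINVAL;	/* outside the user address space */
	if (len > MODEL_TASK_SIZE - start)
		return -EINVAL;	/* range runs past the user address space */
	return 0;
}
```

Under these assumptions, an untagged, page-aligned start such as `0x7f0000001000` passes, while the same address with a tag in the top byte, for example `(0x3cULL << 56) | 0x7f0000001000ULL`, fails the task-size comparison and yields `-EINVAL`.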