    mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW · 5535be30
    David Hildenbrand authored
    Ever since the Dirty COW (CVE-2016-5195) security issue happened, we know
    that FOLL_FORCE can be dangerous, especially if there are races that can
    be exploited by user space.
    
    Right now, it would be sufficient to have some code that sets a PTE of a
    R/O-mapped shared page dirty, in order for it to erroneously become
    writable by FOLL_FORCE.  The implications of setting a write-protected PTE
    dirty might not be immediately obvious to everyone.
    
    And in fact ever since commit 9ae0f87d ("mm/shmem: unconditionally set
    pte dirty in mfill_atomic_install_pte"), we can use UFFDIO_CONTINUE to map
    a shmem page R/O while marking the pte dirty.  This can be used by
    unprivileged user space to modify tmpfs/shmem file content even if the
    user does not have write permissions to the file, and to bypass memfd
    write sealing -- Dirty COW restricted to tmpfs/shmem (CVE-2022-2590).
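
    To make the problem concrete, here is a minimal userspace model of the
    dirty-bit based check that this patch removes.  It is a sketch, not
    kernel code: struct model_pte, the MODEL_* flag values and the function
    name are made up for illustration.  It only shows that, once FOLL_COW is
    set, any dirty write-protected PTE is accepted for a forced write, no
    matter which kind of page it maps.

```c
#include <stdbool.h>

/* Hypothetical flag values for this model; the kernel's real values differ. */
#define MODEL_FOLL_FORCE 0x1u
#define MODEL_FOLL_COW   0x2u

/* Simplified stand-in for the two PTE bits the old check looked at. */
struct model_pte {
    bool write;  /* PTE is mapped writable */
    bool dirty;  /* PTE dirty bit is set */
};

/*
 * Model of the old decision: a write-protected PTE is treated as
 * "COW already broken" if we are retrying (FOLL_COW) and the PTE is dirty.
 * Nothing here checks what kind of page the PTE actually maps.
 */
static bool old_can_follow_write_pte(struct model_pte pte, unsigned int flags)
{
    return pte.write ||
           ((flags & MODEL_FOLL_FORCE) && (flags & MODEL_FOLL_COW) && pte.dirty);
}
```

    In this model, a R/O PTE that some other code path marked dirty passes
    the check during the retry, which is the situation described above.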
    
    To fix such security issues for good, the insight is that we really only
    need that fancy retry logic (FOLL_COW) for COW mappings that are not
    writable (!VM_WRITE).  And in a COW mapping, COW was really only broken if
    we have an exclusive anonymous page mapped.  If we have something else
    mapped, or the mapped anonymous page might be shared (!PageAnonExclusive),
    we have to trigger a write fault to break COW.  If we don't find an
    exclusive anonymous page when we retry, we have to trigger COW breaking
    once again because something intervened.
    
    Let's move away from this mandatory-retry + dirty handling and rely on our
    PageAnonExclusive() flag for making a similar decision, to use the same
    COW logic as in other kernel parts here as well.  In case we stumble over
    a PTE in a COW mapping that does not map an exclusive anonymous page, COW
    was not properly broken and we have to trigger a fake write-fault to break
    COW.
    
    Just like we do in can_change_pte_writable() added via commit 64fe24a3
    ("mm/mprotect: try avoiding write faults for exclusive anonymous pages
    when changing protection") and commit 76aefad6 ("mm/mprotect: fix
    soft-dirty check in can_change_pte_writable()"), take care of softdirty
    and uffd-wp manually.
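
    The decision described above can be sketched as a small userspace model.
    This is an illustration under stated assumptions, not the kernel
    implementation: the model_* structs, the MODEL_* flag values and the
    function name are hypothetical, and the real check in mm/gup.c may
    differ in detail.  Note that the PTE dirty bit does not appear at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; the kernel's real values differ. */
#define MODEL_FOLL_FORCE  0x1u
#define MODEL_VM_WRITE    0x1u
#define MODEL_VM_MAYWRITE 0x2u
#define MODEL_VM_SHARED   0x4u
#define MODEL_VM_MAYSHARE 0x8u

/* Simplified stand-ins for the state the check consults. */
struct model_pte  { bool write, soft_dirty, uffd_wp; };
struct model_page { bool anon, anon_exclusive; };
struct model_vma  { unsigned int flags; bool soft_dirty_enabled; };

static bool model_can_follow_write_pte(struct model_pte pte,
                                       const struct model_page *page,
                                       struct model_vma vma,
                                       unsigned int gup_flags)
{
    /* A writable PTE can always be written through. */
    if (pte.write)
        return true;
    /* Without FOLL_FORCE, a write-protected PTE needs a write fault. */
    if (!(gup_flags & MODEL_FOLL_FORCE))
        return false;
    /* FOLL_FORCE only applies to COW mappings, never to shared ones. */
    if (vma.flags & (MODEL_VM_SHARED | MODEL_VM_MAYSHARE))
        return false;
    if (!(vma.flags & MODEL_VM_MAYWRITE))
        return false;
    /* Writable mappings just take a regular write fault. */
    if (vma.flags & MODEL_VM_WRITE)
        return false;
    /* COW was only broken if an exclusive anonymous page is mapped. */
    if (!page || !page->anon || !page->anon_exclusive)
        return false;
    /* A write fault is still required for softdirty and uffd-wp tracking. */
    if (vma.soft_dirty_enabled && !pte.soft_dirty)
        return false;
    return !pte.uffd_wp;
}
```

    In this model, a write-protected PTE that maps a page cache page (for
    example, shmem) is never accepted, whatever other bits are set in the
    PTE.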
    
    For example, a write() via /proc/self/mem to a uffd-wp-protected range has
    to fail instead of silently granting write access and bypassing the
    userspace fault handler.  Note that FOLL_FORCE is not only used for debug
    access, but also triggered by applications without debug intentions, for
    example, when pinning pages via RDMA.
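
    As a rough illustration of the /proc/self/mem path, the following
    userspace sketch writes one byte into a private anonymous mapping that
    is mapped PROT_READ.  On typical Linux configurations the write is
    expected to succeed, because /proc/self/mem access uses FOLL_FORCE and
    the kernel breaks COW for the private mapping.  The helper name and its
    return codes are made up for this sketch, and it does not exercise
    uffd-wp or the race fixed here.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if the write through /proc/self/mem landed in the R/O private
 * mapping, 0 if the kernel refused it, and -1 if the environment lacks
 * what the demo needs (no /proc, mmap failure, ...).
 */
static int force_write_private_ro(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* Private anonymous mapping without write permission (a COW mapping). */
    volatile unsigned char *p = mmap(NULL, (size_t)page, PROT_READ,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int fd = open("/proc/self/mem", O_RDWR);
    if (fd < 0) {
        munmap((void *)p, (size_t)page);
        return -1;
    }

    /* The file offset in /proc/self/mem is the virtual address. */
    const unsigned char byte = 0x5a;
    ssize_t n = pwrite(fd, &byte, 1, (off_t)(uintptr_t)p);

    /* Read back through the (still read-only) mapping. */
    int result = (n == 1 && p[0] == byte) ? 1 : 0;

    close(fd);
    munmap((void *)p, (size_t)page);
    return result;
}
```

    A direct store through the pointer would fault with SIGSEGV instead;
    only the forced access via /proc/self/mem goes through.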
    
    This fixes CVE-2022-2590. Note that only x86_64 and aarch64 are
    affected, because only those support CONFIG_HAVE_ARCH_USERFAULTFD_MINOR.
    
    Fortunately, FOLL_COW is no longer required to handle FOLL_FORCE. So
    let's just get rid of it.
    
    Thanks to Nadav Amit for pointing out that the pte_dirty() check in
    FOLL_FORCE code is problematic and might be exploitable.
    
    Note 1: We don't check for the PTE being dirty because it doesn't matter
    	for making a "was COWed" decision anymore, and whoever modifies the
    	page has to set the page dirty either way.
    
    Note 2: Kernels before extended uffd-wp support and before
    	PageAnonExclusive (< 5.19) can simply revert the problematic
    	commit instead and be safe regarding UFFDIO_CONTINUE. A backport to
    	v5.19 requires minor adjustments due to lack of
    	vma_soft_dirty_enabled().
    
    Link: https://lkml.kernel.org/r/20220809205640.70916-1-david@redhat.com
    Fixes: 9ae0f87d ("mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte")
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Axel Rasmussen <axelrasmussen@google.com>
    Cc: Nadav Amit <nadav.amit@gmail.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: John Hubbard <jhubbard@nvidia.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: David Laight <David.Laight@ACULAB.COM>
    Cc: <stable@vger.kernel.org>	[5.16]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>