    mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings · 8ac26843
    Lorenzo Stoakes authored
    Writing to file-backed mappings which require folio dirty tracking using
    GUP is a fundamentally broken operation, as kernel write access to GUP
    mappings does not adhere to the semantics expected by a file system.
    
    A GUP caller uses the direct mapping to access the folio, which does not
    cause write notify to trigger, nor does it enforce that the caller marks
    the folio dirty.
    
    The problem arises when, after an initial write to the folio, writeback
    results in the folio being cleaned and then the caller, via the GUP
    interface, writes to the folio again.
    
    Because this secondary, direct mapping to the folio is used, no write
    notify will occur, and if the caller does mark the folio dirty, it will
    do so unexpectedly.
    
    For example, consider the following scenario:-
    
    1. A folio is written to via GUP which write-faults the memory, notifying
       the file system and dirtying the folio.
    2. Later, writeback is triggered, resulting in the folio being cleaned and
       the PTE being marked read-only.
    3. The GUP caller writes to the folio, as it is mapped read/write via the
       direct mapping.
    4. The GUP caller, now done with the page, unpins it and sets it dirty
       (though it does not have to).
    
    This results in both data being written to a folio without write notify,
    and the folio being dirtied unexpectedly (if the caller decides to do so).
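
    The four steps above can be sketched as a small userspace model.  This is
    an illustration only, not kernel code: struct folio_model and every
    function name below are hypothetical, and the model tracks just the state
    the scenario mentions (the dirty flag, PTE write permission and whether
    the file system was notified).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one file-backed folio; not kernel code. */
struct folio_model {
	bool dirty;           /* folio dirty flag */
	bool pte_writable;    /* userspace PTE currently permits writes */
	int fs_notified;      /* write faults seen by the file system */
	int untracked_writes; /* writes that landed on a clean folio unseen */
};

/* Step 1: a write fault notifies the file system and dirties the folio. */
static void write_fault(struct folio_model *f)
{
	if (!f->pte_writable) {
		f->fs_notified++;
		f->pte_writable = true;
	}
	f->dirty = true;
}

/* Step 2: writeback cleans the folio and write-protects the PTE. */
static void writeback(struct folio_model *f)
{
	f->dirty = false;
	f->pte_writable = false;
}

/* Step 3: a write through the direct mapping never consults the PTE. */
static void write_via_direct_map(struct folio_model *f)
{
	if (!f->dirty)
		f->untracked_writes++;
}

/* Step 4: unpinning may set the folio dirty without any notification. */
static void unpin(struct folio_model *f, bool mark_dirty)
{
	if (mark_dirty)
		f->dirty = true;
}

/* Runs steps 1-4 in order and returns the final state. */
static struct folio_model run_scenario(void)
{
	struct folio_model f = { 0 };

	write_fault(&f);
	assert(f.dirty && f.fs_notified == 1);
	writeback(&f);
	assert(!f.dirty && !f.pte_writable);
	write_via_direct_map(&f);
	unpin(&f, true);
	return f;
}
```

    In this model the final state has one untracked write and a dirty folio,
    while the file system was notified only once, before writeback.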
    
    This issue was first reported by Jan Kara [1] in 2018, where the problem
    resulted in file system crashes.
    
    This is only relevant when the mappings are file-backed and the underlying
    file system requires folio dirty tracking.  File systems which do not,
    such as shmem or hugetlb, are not at risk and therefore can be written to
    without issue.
    
    Unfortunately this limitation of GUP has been present for some time and
    requires future rework of the GUP API in order to provide correct write
    access to such mappings.
    
    However, for the time being we introduce this check to prevent the most
    egregious case of this occurring: use of the FOLL_LONGTERM pin.
    
    These mappings are considerably more likely to be written to after folios
    are cleaned and thus simply must not be permitted to do so.
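
    The rule described here can be sketched as a userspace predicate.  This
    is a minimal model under stated assumptions, not the code in gup.c: the
    MODEL_FOLL_* values, struct mapping_model and longterm_write_allowed()
    are hypothetical names chosen for illustration, and the real check may
    consider more conditions than the ones named in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the kernel's FOLL_* values may differ. */
#define MODEL_FOLL_WRITE    (1u << 0)
#define MODEL_FOLL_LONGTERM (1u << 1)

/* Hypothetical description of the mapping being pinned. */
struct mapping_model {
	bool file_backed;
	bool needs_dirty_tracking; /* false for e.g. shmem or hugetlb */
};

/*
 * Returns false only for the case this message singles out: a long-term
 * write pin on a file-backed mapping whose file system tracks dirty folios.
 */
static bool longterm_write_allowed(const struct mapping_model *m,
				   unsigned int flags)
{
	if (!(flags & MODEL_FOLL_WRITE))
		return true;	/* read-only access is unaffected */
	if (!(flags & MODEL_FOLL_LONGTERM))
		return true;	/* only the long-term case is rejected */
	if (!m->file_backed)
		return true;	/* anonymous memory is unaffected */
	return !m->needs_dirty_tracking;
}

/* Example calls showing which combinations the model rejects. */
static void longterm_write_examples(void)
{
	struct mapping_model tracked = { .file_backed = true,
					 .needs_dirty_tracking = true };

	assert(!longterm_write_allowed(&tracked,
			MODEL_FOLL_WRITE | MODEL_FOLL_LONGTERM));
	assert(longterm_write_allowed(&tracked, MODEL_FOLL_WRITE));
}
```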
    
    This patch changes only the slow-path GUP functions; a following patch
    adapts the GUP-fast path along similar lines.
    
    [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz/
    
    Link: https://lkml.kernel.org/r/7282506742d2390c125949c2f9894722750bb68a.1683235180.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
    Reviewed-by: John Hubbard <jhubbard@nvidia.com>
    Reviewed-by: Mika Penttilä <mpenttil@redhat.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: Kirill A . Shutemov <kirill@shutemov.name>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>