    mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite · 6a56ccbc
    David Hildenbrand authored
    Commit b191f9b1 ("mm: numa: preserve PTE write permissions across a
    NUMA hinting fault") started remembering write permissions using ordinary
    pte_write() for PROT_NONE mapped pages, to avoid write faults when
    remapping the page !PROT_NONE on NUMA hinting faults.
    
    That commit noted:
    
        The patch looks hacky but the alternatives looked worse. The tidest was
        to rewalk the page tables after a hinting fault but it was more complex
        than this approach and the performance was worse. It's not generally
        safe to just mark the page writable during the fault if it's a write
        fault as it may have been read-only for COW so that approach was
        discarded.
    
    Later, commit 288bc549 ("mm/autonuma: let architecture override how
    the write bit should be stashed in a protnone pte.") introduced a family
    of savedwrite PTE functions that didn't necessarily improve the whole
    situation.
    
    One confusing thing is that nowadays, if a page is pte_protnone()
    and pte_savedwrite(), then pte_write() is also true. Another source of
    confusion is that there is only a single pte_mk_savedwrite() call in the
    kernel. All other write-protection code seems to silently rely on
    pte_wrprotect().
    
    Ever since PageAnonExclusive was introduced and we started using it in
    mprotect context via commit 64fe24a3 ("mm/mprotect: try avoiding write
    faults for exclusive anonymous pages when changing protection"), we do
    have machinery in place to avoid write faults when changing protection,
    which is exactly what we want to do here.
    
    Let's similarly do what ordinary mprotect() does nowadays when upgrading
    write permissions and reuse can_change_pte_writable() and
    can_change_pmd_writable() to detect if we can upgrade PTE permissions to be
    writable.
    
    For anonymous pages there should be absolutely no change: if an
    anonymous page is not exclusive, it could not have been mapped writable --
    because only exclusive anonymous pages can be mapped writable.
    
    However, there *might* be a change for writable shared mappings that
    require writenotify: if they are not dirty, we cannot map them writable.
    While it might not matter in practice, we'd need a different way to
    identify whether writenotify is actually required -- and ordinary mprotect
    would benefit from that as well.
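    
    The two cases above can be sketched as a small userspace model. This is
    not the kernel implementation of can_change_pte_writable(): struct
    model_mapping and model_can_upgrade_to_writable() are hypothetical names,
    and the model only covers the rules stated in this message (exclusive
    anonymous pages in private mappings, dirty PTEs in shared mappings).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel state; not kernel types. */
struct model_mapping {
	bool vma_writable;   /* models VM_WRITE being set on the VMA */
	bool vma_shared;     /* models VM_SHARED being set on the VMA */
	bool anon_exclusive; /* models an anonymous page marked exclusive */
	bool pte_dirty;      /* models a dirty PTE */
};

/*
 * Decide whether a PTE may be upgraded to writable without taking a
 * write fault, following only the rules described in the text above.
 */
static bool model_can_upgrade_to_writable(const struct model_mapping *m)
{
	/* A mapping that does not allow writes is never upgraded. */
	if (!m->vma_writable)
		return false;

	/* Private mapping: only exclusive anonymous pages qualify. */
	if (!m->vma_shared)
		return m->anon_exclusive;

	/* Shared mapping: a clean PTE may still need writenotify. */
	return m->pte_dirty;
}
```

    In this model, a private mapping of a non-exclusive anonymous page and a
    clean shared mapping both yield false, so the next write still faults.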
    
    Note that we don't optimize for the actual migration case:
    (1) When migration succeeds the new PTE will not be writable because the
        source PTE was not writable (protnone); in the future we
        might just optimize that case similarly by reusing
        can_change_pte_writable()/can_change_pmd_writable() when removing
        migration PTEs.
    (2) When migration fails, we'd have to recalculate the "writable" flag
        because we temporarily dropped the PT lock; for now keep it simple and
        set "writable=false".
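    
    The handling of the "writable" flag across the possible outcomes of a
    hinting fault can be modeled as follows. This is a userspace sketch with
    hypothetical names (enum model_outcome, model_final_writable()), not the
    fault handler itself; it assumes that the flag computed under the PT lock
    is used only when no migration is attempted.

```c
#include <stdbool.h>

/* Hypothetical outcomes of a NUMA hinting fault in this model. */
enum model_outcome {
	MODEL_NO_MIGRATION,     /* page stays where it is, PT lock still held */
	MODEL_MIGRATION_OK,     /* page was migrated to the target node */
	MODEL_MIGRATION_FAILED, /* migration was attempted but failed */
};

/*
 * Return whether the restored PTE ends up writable, given the flag that
 * was computed while holding the PT lock.
 */
static bool model_final_writable(bool writable_under_lock,
				 enum model_outcome outcome)
{
	switch (outcome) {
	case MODEL_NO_MIGRATION:
		/* Flag is still valid: the PT lock was never dropped. */
		return writable_under_lock;
	case MODEL_MIGRATION_OK:
		/* Case (1): the source PTE was protnone, so not writable. */
		return false;
	case MODEL_MIGRATION_FAILED:
		/* Case (2): flag is stale after dropping the PT lock. */
		return false;
	}
	return false;
}
```

    Both migration outcomes fall back to a non-writable PTE, so the first
    write afterwards takes an ordinary write fault.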
    
    We'll remove all savedwrite leftovers next.
    
    Link: https://lkml.kernel.org/r/20221108174652.198904-6-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nadav Amit <namit@vmware.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>