    mm/memory: optimize unmap/zap with PTE-mapped THP · 10ebac4f
    David Hildenbrand authored
    Similar to how we optimized fork(), let's implement PTE batching when
    consecutive (present) PTEs map consecutive pages of the same large folio.
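
    As an illustration of the batching condition above, here is a minimal
    user-space sketch.  It is not kernel code: map_pte_t, toy_pte_batch() and
    their fields are hypothetical names made up for this example, and a real
    implementation has to consider more PTE bits than just "present".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy page table entry; not the kernel's pte_t. */
typedef struct {
	uint64_t pfn;	/* page frame number this entry maps */
	bool present;	/* entry maps a page */
} map_pte_t;

/*
 * Return how many entries, starting at ptep[0], are present and map
 * consecutive pages of the folio covering [folio_pfn, folio_pfn + folio_nr).
 * Returns 0 if ptep[0] does not map a page of that folio.
 */
static unsigned int toy_pte_batch(const map_pte_t *ptep, unsigned int max_nr,
				  uint64_t folio_pfn, unsigned int folio_nr)
{
	uint64_t folio_end = folio_pfn + folio_nr;
	uint64_t expected;
	unsigned int nr = 1;

	if (max_nr == 0 || !ptep[0].present)
		return 0;
	if (ptep[0].pfn < folio_pfn || ptep[0].pfn >= folio_end)
		return 0;

	expected = ptep[0].pfn + 1;
	/* Stop at the first gap, non-present entry, or the end of the folio. */
	while (nr < max_nr && expected < folio_end &&
	       ptep[nr].present && ptep[nr].pfn == expected) {
		nr++;
		expected++;
	}
	return nr;
}
```

    A caller would then process the returned number of entries in one go
    instead of handling them one PTE at a time.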
    
    Most infrastructure we need for batching (mmu gather, rmap) is already
    there.  We only have to add get_and_clear_full_ptes() and
    clear_full_ptes().  Similarly, extend zap_install_uffd_wp_if_needed() to
    process a PTE range.
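
    The semantics of the two new helpers can be modeled in user space roughly
    as follows.  This is a sketch, not the kernel implementation: toy_pte_t
    and the toy_* functions are hypothetical, and it assumes (this message
    does not state it) that the returned PTE is the first one of the batch
    with the dirty and accessed bits of all cleared PTEs merged in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy page table entry; not the kernel's pte_t. */
typedef struct {
	uint64_t pfn;
	bool present;
	bool dirty;
	bool young;	/* "accessed" */
} toy_pte_t;

/* Clear a single entry and return its previous value. */
static toy_pte_t toy_get_and_clear(toy_pte_t *ptep)
{
	toy_pte_t old = *ptep;

	*ptep = (toy_pte_t){ 0 };
	return old;
}

/*
 * Clear nr (>= 1) consecutive entries.  Return the first entry, with the
 * dirty/young bits of all cleared entries merged in (assumed behavior).
 */
static toy_pte_t toy_get_and_clear_full_ptes(toy_pte_t *ptep, unsigned int nr)
{
	toy_pte_t pte = toy_get_and_clear(ptep);

	while (--nr) {
		toy_pte_t tmp;

		ptep++;
		tmp = toy_get_and_clear(ptep);
		if (tmp.dirty)
			pte.dirty = true;
		if (tmp.young)
			pte.young = true;
	}
	return pte;
}

/* Clear nr (>= 1) consecutive entries when the old values are not needed. */
static void toy_clear_full_ptes(toy_pte_t *ptep, unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++)
		toy_get_and_clear(ptep + i);
}
```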
    
    We won't bother sanity-checking the mapcount of all subpages, but only
    check the mapcount of the first subpage we process.  If there is a real
    problem hiding somewhere, we can trigger it simply by using small folios,
    or when we zap single pages of a large folio.  Ideally, we would have that
    check in rmap code (including for delayed rmap), but then we could not
    print the PTE.  Let's keep it simple for now.  If we ever have a cheap
    folio_mapcount(), we might just want to check for underflows there.
    
    To keep small folios as fast as possible, force inlining of a specialized
    variant with nr=1 using __always_inline.
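
    The inlining pattern can be sketched in user space as below.  All names
    are hypothetical, and TOY_ALWAYS_INLINE is a stand-in for the kernel's
    __always_inline that assumes a GCC- or Clang-compatible compiler.  The
    point is only that a force-inlined common body, called with the constant
    nr=1, lets the compiler drop the loop for the single-entry case.

```c
#include <stddef.h>

/* Stand-in for the kernel's __always_inline (GCC/Clang attribute). */
#define TOY_ALWAYS_INLINE inline __attribute__((__always_inline__))

/*
 * Common body: clear nr slots and return how many of them were in use.
 * Force-inlined so that each caller gets its own specialized copy.
 */
static TOY_ALWAYS_INLINE unsigned int toy_zap_common(unsigned long *slots,
						     unsigned int nr)
{
	unsigned int cleared = 0;

	for (unsigned int i = 0; i < nr; i++) {
		if (slots[i])
			cleared++;
		slots[i] = 0;
	}
	return cleared;
}

/* Single-entry variant: nr is the compile-time constant 1. */
static unsigned int toy_zap_one(unsigned long *slot)
{
	return toy_zap_common(slot, 1);
}

/* Batched variant: nr is only known at run time. */
static unsigned int toy_zap_batch(unsigned long *slots, unsigned int nr)
{
	return toy_zap_common(slots, nr);
}
```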
    
    Link: https://lkml.kernel.org/r/20240214204435.167852-11-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Alexander Gordeev <agordeev@linux.ibm.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Sven Schnelle <svens@linux.ibm.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: Yin Fengwei <fengwei.yin@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>