    mm/memory: optimize fork() with PTE-mapped THP · f8d93776
    David Hildenbrand authored
    Let's implement PTE batching when consecutive (present) PTEs map
    consecutive pages of the same large folio, and all other PTE bits besides
    the PFNs are equal.
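
    To illustrate the batching condition, here is a minimal userspace sketch.
    It is not the kernel's folio_pte_batch(): the PTE layout, the toy_* names
    and the folio_end_pfn parameter are assumptions made up for this example.
    An entry joins the batch only if it equals the previous entry with the PFN
    advanced by one and every other bit unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy PTE layout (an assumption, not any real architecture):
 * bits 12 and up hold the PFN, bits 0..11 hold flag bits.
 */
#define TOY_PFN_SHIFT  12
#define TOY_FLAGS_MASK ((UINT64_C(1) << TOY_PFN_SHIFT) - 1)
#define TOY_PRESENT    UINT64_C(1)

/* Build a toy PTE from a PFN and flag bits. */
static uint64_t toy_mk_pte(uint64_t pfn, uint64_t flags)
{
	return (pfn << TOY_PFN_SHIFT) | (flags & TOY_FLAGS_MASK);
}

/* Extract the PFN from a toy PTE. */
static uint64_t toy_pte_pfn(uint64_t pte)
{
	return pte >> TOY_PFN_SHIFT;
}

/*
 * Count how many entries, starting at ptes[0], map consecutive PFNs of the
 * same folio with identical flag bits. folio_end_pfn is one past the last
 * PFN of the (hypothetical) folio; max_nr caps the batch length.
 */
static size_t toy_pte_batch(const uint64_t *ptes, size_t max_nr,
			    uint64_t folio_end_pfn)
{
	uint64_t expected = ptes[0];
	size_t nr = 1;

	assert(max_nr >= 1);
	assert(ptes[0] & TOY_PRESENT);

	while (nr < max_nr) {
		/* Next PFN, all flag bits unchanged. */
		expected += UINT64_C(1) << TOY_PFN_SHIFT;
		if (toy_pte_pfn(expected) >= folio_end_pfn)
			break;	/* would leave the folio */
		if (ptes[nr] != expected)
			break;	/* PFN gap or differing flag bits */
		nr++;
	}
	return nr;
}
```

    Because the first entry is present and all flag bits must match, every
    entry in the batch is necessarily present as well.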
    
    We will optimize folio_pte_batch() separately, to ignore selected PTE
    bits.  This patch is based on work by Ryan Roberts.
    
    Use __always_inline for __copy_present_ptes() and keep the handling for
    single PTEs completely separate from the multi-PTE case: we really want
    the compiler to optimize for the single-PTE case with small folios, to not
    degrade performance.
    
    Note that PTE batching will never exceed a single page table and will
    always stay within VMA boundaries.
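
    The bound described above can be sketched as a small userspace helper.
    This is an illustration under stated assumptions, not the kernel code:
    4 KiB pages, 512 entries per page table, a page-aligned vma_end, and the
    toy_* names are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_SIZE    (UINT64_C(1) << TOY_PAGE_SHIFT)
#define TOY_PTRS_PER_PTE 512	/* assumption: entries per page table */
#define TOY_PMD_SIZE     (TOY_PAGE_SIZE * TOY_PTRS_PER_PTE)

/*
 * Largest batch that may start at addr: it must end before the current
 * page table ends and before the VMA ends, whichever comes first.
 */
static uint64_t toy_max_batch(uint64_t addr, uint64_t vma_end)
{
	uint64_t table_end, end;

	assert(addr % TOY_PAGE_SIZE == 0 && addr < vma_end);

	/* End of the range covered by the page table that maps addr. */
	table_end = (addr & ~(TOY_PMD_SIZE - 1)) + TOY_PMD_SIZE;
	end = vma_end < table_end ? vma_end : table_end;

	return (end - addr) >> TOY_PAGE_SHIFT;
}
```

    With these assumptions, an address four pages before a 2 MiB boundary
    yields a limit of 4 even if the VMA extends further, and a VMA that ends
    three pages after addr yields a limit of 3.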
    
    Further, processing PTE-mapped THP that may be pinned and have
    PageAnonExclusive set on at least one subpage should work as expected, but
    there is room for improvement: we will repeatedly (1) detect a PTE batch,
    (2) detect that we have to copy a page, and (3) fall back and allocate a
    single page to copy a single page.  For now we won't care, as pinned pages
    are a corner case, and we should rather look into maintaining only a single
    PageAnonExclusive bit for large folios.
    
    Link: https://lkml.kernel.org/r/20240129124649.189745-14-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
    Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Albert Ou <aou@eecs.berkeley.edu>
    Cc: Alexander Gordeev <agordeev@linux.ibm.com>
    Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Dinh Nguyen <dinguyen@kernel.org>
    Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Palmer Dabbelt <palmer@dabbelt.com>
    Cc: Paul Walmsley <paul.walmsley@sifive.com>
    Cc: Russell King (Oracle) <linux@armlinux.org.uk>
    Cc: Sven Schnelle <svens@linux.ibm.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>