    mm: remove remaining references to NUMA hinting bits and helpers · 21d9ee3e
    Mel Gorman authored
    This patch removes the NUMA PTE bits and associated helpers.  As a
    side-effect it increases the maximum possible swap space on x86-64.
    
    One potential source of problems is races between the marking of PTEs
    PROT_NONE, NUMA hinting faults and migration.  It must be guaranteed that
    a PTE being protected is not faulted in parallel, seen as pte_none, and
    memory corrupted as a result.  The base case is safe but transhuge has
    had problems in the past due to a different migration mechanism and a
    dependence on page lock to serialise migrations, so it warrants a closer
    look.
    
    task_work hinting update			parallel fault
    ------------------------			--------------
    change_pmd_range
      change_huge_pmd
        __pmd_trans_huge_lock
          pmdp_get_and_clear
    						__handle_mm_fault
    						pmd_none
    						  do_huge_pmd_anonymous_page
    						  read? pmd_lock blocks until hinting complete, fail !pmd_none test
    						  write? __do_huge_pmd_anonymous_page acquires pmd_lock, checks pmd_none
          pmd_modify
          set_pmd_at
    
    task_work hinting update			parallel migration
    ------------------------			------------------
    change_pmd_range
      change_huge_pmd
        __pmd_trans_huge_lock
          pmdp_get_and_clear
    						__handle_mm_fault
    						  do_huge_pmd_numa_page
    						    migrate_misplaced_transhuge_page
    						    pmd_lock waits for updates to complete, recheck pmd_same
          pmd_modify
          set_pmd_at
    
    Both of those are safe and the case where a transhuge page is inserted
    during a protection update is unchanged.  The case where two processes try
    migrating at the same time is unchanged by this series so should still be
    ok.  I could not find a case where we are accidentally depending on the
    PTE not being cleared and flushed.  If one is missed, it'll manifest as
    corruption problems that start triggering shortly after this series is
    merged and only happen when NUMA balancing is enabled.
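
    The locking pattern both diagrams rely on can be sketched in userspace
    C.  This is a toy model under stated assumptions, not kernel code:
    struct toy_pmd, TOY_PROT_NONE and the toy_* helpers are hypothetical
    names, a spinlock stands in for pmd_lock, and a plain integer compare
    stands in for pmd_same.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a huge PMD entry and the lock guarding it. */
struct toy_pmd {
    atomic_flag lock;   /* plays the role of pmd_lock */
    uint64_t val;       /* 0 models pmd_none */
};

/* Arbitrary bit chosen for the sketch; not the real x86 layout. */
#define TOY_PROT_NONE (UINT64_C(1) << 8)

static void toy_lock(struct toy_pmd *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;   /* spin until the current holder releases the lock */
}

static void toy_unlock(struct toy_pmd *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/*
 * Hinting update: clear, modify and re-set the entry while holding the
 * lock, so the transiently empty value is never visible to anyone who
 * also takes the lock before looking.
 */
static void toy_change_prot(struct toy_pmd *p)
{
    toy_lock(p);
    uint64_t old = p->val;
    assert(old != 0);               /* sketch only models a populated entry */
    p->val = 0;                     /* models pmdp_get_and_clear */
    p->val = old | TOY_PROT_NONE;   /* models pmd_modify + set_pmd_at */
    toy_unlock(p);
}

/*
 * Migration side: the caller sampled the entry earlier without the lock.
 * Take the lock, recheck that the entry is unchanged, and only then
 * install the new value.  Returns false if the snapshot went stale.
 */
static bool toy_try_migrate(struct toy_pmd *p, uint64_t snapshot,
                            uint64_t new_val)
{
    bool ok = false;

    toy_lock(p);
    if (p->val == snapshot) {       /* models the pmd_same recheck */
        p->val = new_val;
        ok = true;
    }
    toy_unlock(p);
    return ok;
}
```

    A caller that samples the entry, loses the race to toy_change_prot()
    and then calls toy_try_migrate() with the stale snapshot gets false
    back and leaves the entry untouched, which is the behaviour the second
    diagram depends on.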

    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Tested-by: Sasha Levin <sasha.levin@oracle.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Dave Jones <davej@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>