    mm: use paravirt friendly ops for NUMA hinting ptes · 29c77870
    Mel Gorman authored
    David Vrabel identified a regression when using automatic NUMA balancing
    under Xen whereby page table entries were getting corrupted due to the
    use of native PTE operations.  Quoting him:
    
    	Xen PV guest page tables require that their entries use machine
    	addresses if the present bit (_PAGE_PRESENT) is set, and (for
    	successful migration) non-present PTEs must use pseudo-physical
    	addresses.  This is because on migration MFNs in present PTEs are
    	translated to PFNs (canonicalised) so they may be translated back
    	to the new MFN in the destination domain (uncanonicalised).
    
    	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
    	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
    	pte_clear_flags(), etc.
    
    	In a Xen PV guest, these functions must translate MFNs to PFNs
    	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
    	_PAGE_PRESENT.
    
    His suggested fix converted p[te|md]_[set|clear]_flags to use
    paravirt-friendly ops, but this is overkill.  He suggested an alternative
    of using p[te|md]_modify in the NUMA page table operations, but this
    does more work than necessary and would require looking up a VMA for
    protections.
    
    This patch modifies the NUMA page table operations to use paravirt
    friendly operations to set/clear the flags of interest.  Unfortunately
    this will take a performance hit when updating the PTEs on
    CONFIG_PARAVIRT but I do not see a way around it that does not break
    Xen.
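
The difference between the native and the paravirt-friendly flag updates can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from the patch: the `toy_*` helpers, the fixed `TOY_P2M_OFFSET` mapping, and the `native_`/`pv_` function names are hypothetical stand-ins, and the bit position chosen for `_PAGE_NUMA` is an arbitrary value for the model. `toy_pte_val()` and `toy_make_pte()` play the role of paravirt-aware accessors that translate between MFNs and PFNs whenever `_PAGE_PRESENT` is set.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;      /* raw value as stored in the page table */

#define PAGE_SHIFT      12
#define FLAGS_MASK      (((pteval_t)1 << PAGE_SHIFT) - 1)
#define _PAGE_PRESENT   ((pteval_t)1 << 0)
#define _PAGE_ACCESSED  ((pteval_t)1 << 5)
#define _PAGE_NUMA      ((pteval_t)1 << 8)   /* assumed bit position for this model */

/* Hypothetical p2m/m2p mapping: every PFN maps to PFN + TOY_P2M_OFFSET. */
#define TOY_P2M_OFFSET  ((pteval_t)0x1000)

static pteval_t toy_pfn_to_mfn(pteval_t pfn) { return pfn + TOY_P2M_OFFSET; }

static pteval_t toy_mfn_to_pfn(pteval_t mfn)
{
    assert(mfn >= TOY_P2M_OFFSET);           /* a valid MFN in this model */
    return mfn - TOY_P2M_OFFSET;
}

/* Frame number exactly as stored, with no translation. */
static pteval_t raw_frame(pte_t pte) { return pte.pte >> PAGE_SHIFT; }

/* Model of a paravirt-aware read: present entries are shown with a PFN. */
static pteval_t toy_pte_val(pte_t pte)
{
    pteval_t raw = pte.pte;

    if (raw & _PAGE_PRESENT)
        raw = (toy_mfn_to_pfn(raw >> PAGE_SHIFT) << PAGE_SHIFT) | (raw & FLAGS_MASK);
    return raw;
}

/* Model of a paravirt-aware constructor: present entries are stored with an MFN. */
static pte_t toy_make_pte(pteval_t val)
{
    pte_t pte;

    if (val & _PAGE_PRESENT)
        val = (toy_pfn_to_mfn(val >> PAGE_SHIFT) << PAGE_SHIFT) | (val & FLAGS_MASK);
    pte.pte = val;
    return pte;
}

/* Native-style update: flips bits on the raw value, so the MFN is left behind. */
static pte_t native_pte_mknuma(pte_t pte)
{
    pte.pte |= _PAGE_NUMA;
    pte.pte &= ~_PAGE_PRESENT;
    return pte;
}

/* Paravirt-friendly update: read through the accessor, edit flags, rebuild. */
static pte_t pv_pte_mknuma(pte_t pte)
{
    pteval_t val = toy_pte_val(pte);

    val &= ~_PAGE_PRESENT;
    val |= _PAGE_NUMA;
    return toy_make_pte(val);
}

static pte_t pv_pte_mknonnuma(pte_t pte)
{
    pteval_t val = toy_pte_val(pte);

    val &= ~_PAGE_NUMA;
    val |= _PAGE_PRESENT | _PAGE_ACCESSED;
    return toy_make_pte(val);
}
```

In this model, a present entry for PFN 0x42 is stored with MFN 0x1042. The native-style helper clears `_PAGE_PRESENT` but leaves 0x1042 in the entry, which violates the rule quoted above, while the paravirt-friendly helper stores 0x42 in the non-present entry and restores 0x1042 when the entry is made present again. The extra translation on every update is also where the performance cost mentioned above comes from.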
    
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Acked-by: David Vrabel <david.vrabel@citrix.com>
    Tested-by: David Vrabel <david.vrabel@citrix.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Peter Anvin <hpa@zytor.com>
    Cc: Fengguang Wu <fengguang.wu@intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Steven Noonan <steven@uplinklabs.net>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    Cc: Cyrill Gorcunov <gorcunov@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pgtable.h 21.3 KB