• Ingo Molnar's avatar
    mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable · 4fc3f1d6
    Ingo Molnar authored
    rmap_walk_anon() and try_to_unmap_anon() appears to be too
    careful about locking the anon vma: while it needs protection
    against anon vma list modifications, it does not need exclusive
    access to the list itself.
    
    Transforming this exclusive lock to a read-locked rwsem removes
    a global lock from the hot path of page-migration intense
    threaded workloads which can cause pathological performance like
    this:
    
        96.43%        process 0  [kernel.kallsyms]  [k] perf_trace_sched_switch
                      |
                      --- perf_trace_sched_switch
                          __schedule
                          schedule
                          schedule_preempt_disabled
                          __mutex_lock_common.isra.6
                          __mutex_lock_slowpath
                          mutex_lock
                         |
                         |--50.61%-- rmap_walk
                         |          move_to_new_page
                         |          migrate_pages
                         |          migrate_misplaced_page
                         |          __do_numa_page.isra.69
                         |          handle_pte_fault
                         |          handle_mm_fault
                         |          __do_page_fault
                         |          do_page_fault
                         |          page_fault
                         |          __memset_sse2
                         |          |
                         |           --100.00%-- worker_thread
                         |                     |
                         |                      --100.00%-- start_thread
                         |
                          --49.39%-- page_lock_anon_vma
                                    try_to_unmap_anon
                                    try_to_unmap
                                    migrate_pages
                                    migrate_misplaced_page
                                    __do_numa_page.isra.69
                                    handle_pte_fault
                                    handle_mm_fault
                                    __do_page_fault
                                    do_page_fault
                                    page_fault
                                    __memset_sse2
                                    |
                                     --100.00%-- worker_thread
                                               start_thread
    
    With this change applied the profile is now nicely flat
    and there's no anon-vma related scheduling/blocking.
    
    Rename anon_vma_[un]lock() => anon_vma_[un]lock_write(),
    to make it clearer that it's an exclusive write-lock in
    that case - suggested by Rik van Riel.
    Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Paul Turner <pjt@google.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Hugh Dickins <hughd@google.com>
    Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
    4fc3f1d6
huge_memory.c 65.7 KB