    mlock: downgrade mmap sem while populating mlocked regions · 8edb08ca
    Lee Schermerhorn authored
    We need to hold the mmap_sem for write to initiate mlock()/munlock()
    because we may need to merge/split vmas.  However, this can lead to very
    long lock hold times while faulting in a large memory region to mlock
    it into memory, especially if we need to reclaim memory to lock down the
    region.  This can hold off other faults against the mm [multithreaded
    tasks] and other scans of the mm, such as via /proc.  To alleviate this,
    downgrade the mmap_sem to read mode during the population of the region
    for locking.  We [probably?] don't need to do
    this for unlocking as all of the pages should be resident--they're already
    mlocked.
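
    The lock-mode change described above can be sketched in userspace C.
    This is a toy model under stated assumptions, not the patch itself: the
    toy_* names, struct toy_sem and populate_mode are hypothetical, and the
    "semaphore" only records which mode a single task holds, so it provides
    no real mutual exclusion.  The point is only that the long-running
    population step executes while the lock is in read mode.

```c
#include <assert.h>

/* Toy stand-in for the mmap_sem: records the mode held by one task.
 * Hypothetical; it is not a real lock. */
enum sem_mode { SEM_UNLOCKED, SEM_READ, SEM_WRITE };

struct toy_sem {
    enum sem_mode mode;
};

void toy_down_write(struct toy_sem *s)
{
    assert(s->mode == SEM_UNLOCKED);
    s->mode = SEM_WRITE;
}

/* Write -> read without ever dropping the lock in between. */
void toy_downgrade_write(struct toy_sem *s)
{
    assert(s->mode == SEM_WRITE);
    s->mode = SEM_READ;
}

/* Mode observed by the most recent population step. */
enum sem_mode populate_mode = SEM_UNLOCKED;

/* Hypothetical stand-in for faulting in the pages of the region. */
void toy_populate(struct toy_sem *s)
{
    populate_mode = s->mode;
}

/* Entered with the semaphore held for write (vma merge/split is done);
 * the slow population step then runs with it held for read only. */
void toy_populate_downgraded(struct toy_sem *s)
{
    assert(s->mode == SEM_WRITE);
    toy_downgrade_write(s);   /* other readers may now make progress */
    toy_populate(s);          /* long-running work happens in read mode */
}
```

    In this model the function returns with the lock still in read mode;
    how write mode is restored for the callers is a separate step.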
    
    Now, the callers of the mlock functions [mlock_fixup() and
    mlock_vma_pages_range()] expect the mmap_sem to be returned in write mode.
    Changing all callers appears to be way too much effort at this point.
    So, restore write mode before returning.  Note that this opens a window
    where the mmap list could change in a multithreaded process.  So, at least
    for mlock_fixup(), where we could be called in a loop over multiple vmas,
    we check that a vma still exists at the start address and that vma still
    covers the page range [start,end).  If not, we return an error, -EAGAIN,
    and let the caller deal with it.
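
    The revalidation after write mode is retaken can be sketched as below.
    This is a minimal userspace illustration, assuming a simplified,
    address-sorted list of regions; struct toy_vma, toy_find_vma() and
    toy_revalidate() are hypothetical names and not the kernel's data
    structures or the code in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified region descriptor covering [vm_start, vm_end). */
struct toy_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    struct toy_vma *next;    /* list is sorted by address */
};

/* Return the first region whose end lies above addr, or NULL.
 * The region returned does not necessarily contain addr. */
struct toy_vma *toy_find_vma(struct toy_vma *head, unsigned long addr)
{
    for (struct toy_vma *v = head; v != NULL; v = v->next)
        if (addr < v->vm_end)
            return v;
    return NULL;
}

/* To be run once write mode is held again: return 0 if a single region
 * still covers [start, end), otherwise -EAGAIN so the caller can decide
 * what to do. */
int toy_revalidate(struct toy_vma *head, unsigned long start,
                   unsigned long end)
{
    assert(start < end);

    struct toy_vma *v = toy_find_vma(head, start);

    if (v == NULL || v->vm_start > start || end > v->vm_end)
        return -EAGAIN;
    return 0;
}
```

    The check fails both when nothing is mapped at 'start' any more and
    when the region found there has shrunk or been split so that it ends
    before 'end'.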
    
    Return -EAGAIN from mlock_vma_pages_range() and mlock_fixup() if
    the vma at 'start' disappears or changes so that the page range
    [start,end) is no longer contained in the vma.  Again, let the caller deal
    with it.  Looks like only sys_remap_file_pages() [via mmap_region()]
    should actually care.
    
    With this patch, I no longer see processes like ps(1) blocked for seconds
    or minutes at a time waiting for a large [multiple gigabyte] region to be
    locked down.  However, I occasionally see delays while unlocking or
    unmapping a large mlocked region.  Should we also downgrade the mmap_sem
    for the unlock path?
    Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
    Signed-off-by: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>