    [PATCH] Make nonlinear mappings fully pageable · e1513512
    Andrew Morton authored
    This patch requires arch support.  I have patches for ia32, ppc64 and x86_64.
    Other architectures will break.  It is a five-minute fix.  See
    
    	http://mail.nl.linux.org/linux-mm/2003-03/msg00174.html
    
    for implementation details.
    
    
    Patch from: Ingo Molnar <mingo@elte.hu>
    
    the attached patch, against BK-curr, is a preparation to make
    remap_file_pages() usable on swappable vmas as well.  When 'swapping out'
    shared-named mappings the page offset is written into the pte.
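The idea of storing a file page offset in a non-present pte can be sketched as follows. This is only an illustration: the bit layout, the `sketch_` names and the 29-bit offset width are assumptions made for the example, not the layout of any real architecture or the helpers added by the patch.

```c
#include <stdint.h>

/*
 * Hypothetical 32-bit pte layout, for illustration only:
 *   bit 0      present
 *   bit 1      "file" flag (the one bit taken from the swap-type bits)
 *   bit 2      unused in this sketch
 *   bits 3-31  file page offset (29 bits)
 */
#define SKETCH_PAGE_PRESENT 0x1u
#define SKETCH_PAGE_FILE    0x2u
#define SKETCH_PGOFF_SHIFT  3
#define SKETCH_PGOFF_BITS   29
#define SKETCH_PGOFF_MASK   ((1u << SKETCH_PGOFF_BITS) - 1u)

typedef uint32_t sketch_pte_t;

/* Build a non-present pte that remembers which file page was mapped here. */
static sketch_pte_t sketch_pgoff_to_pte(uint32_t pgoff)
{
    return ((pgoff & SKETCH_PGOFF_MASK) << SKETCH_PGOFF_SHIFT) | SKETCH_PAGE_FILE;
}

/* A pte is a "file pte" if it is not present and carries the file flag. */
static int sketch_pte_file(sketch_pte_t pte)
{
    return !(pte & SKETCH_PAGE_PRESENT) && (pte & SKETCH_PAGE_FILE);
}

/* Recover the file page offset stored by sketch_pgoff_to_pte(). */
static uint32_t sketch_pte_to_pgoff(sketch_pte_t pte)
{
    return pte >> SKETCH_PGOFF_SHIFT;
}
```

On a fault, a handler in this scheme would test the pte with `sketch_pte_file()` and, if it matches, read the page at `sketch_pte_to_pgoff()` from the file instead of assuming the linear offset.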
    
    it takes one bit from the swap-type bits, otherwise it does not change the
    pte layout - so it should be easy to adapt any other architecture to this
    change as well.  (this patch does not introduce the protection-bits-in-pte
    approach used in my previous patch.)
    
    On 32-bit pte sizes with an effective usable pte range of 29 bits, this
    limits mmap()-able file size to 4096 * 2^29 == 2 TB.  If the usable range is
    smaller, then the maximum mmap() size is reduced as well.  The worst-case i
    found (PPC) was 2 hw-reserved bits in the swap-case, which limits us to 1 TB
    filesize.  Is there any other hw that has an even worse ratio of sw-usable
    pte bits?
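The size limit above is just the page size multiplied by the number of distinct page offsets a pte can hold. A minimal sketch of that arithmetic, with a hypothetical helper name and the page size passed in as a parameter:

```c
#include <stdint.h>

/*
 * Largest file size (in bytes) addressable when a pte can store
 * `usable_bits` bits of page offset. Hypothetical helper for
 * illustration; assumes usable_bits is small enough that the result
 * fits in 64 bits.
 */
static uint64_t sketch_max_file_bytes(uint64_t page_size, unsigned usable_bits)
{
    return page_size << usable_bits;
}
```

With 4096-byte pages, 29 usable bits give 2^41 bytes (2 TB), and each bit lost to the hardware halves that, so 28 bits give 2^40 bytes (1 TB).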
    
    this mmap() limit can be eliminated by simply not converting the swapped out
    pte to a file-pte, but clearing it and falling back to the linear mapping
    upon swapin.  This puts the limit into remap_file_pages() alone, but i really
    hope no-one wants to use remap_file_pages() on a 32-bit platform, on a larger
    than 1-2 TB file.
    
    sys_remap_file_pages() is now enforcing the 'prot' parameter to be zero.
    This restriction might be lifted in the future - i really hope we can have
    more flexible remapping once 64-bit platforms are commonplace - eg.  things
    like memory debuggers could just use the permission bits directly, instead of
    creating many small vmas.
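The 'prot' restriction amounts to rejecting any non-zero value up front. The sketch below is a stand-in for such an argument check, not the kernel's code: the function name is hypothetical, and returning -EINVAL is an assumption about the error code.

```c
#include <errno.h>

/*
 * Hypothetical argument check: only prot == 0 is accepted for now.
 * The -EINVAL return value is an assumption made for this sketch.
 */
static long sketch_check_prot(unsigned long prot)
{
    return prot ? -EINVAL : 0;
}
```

Callers would therefore pass 0 for 'prot' and rely on the protection the vma already has.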
    
    i've tested swappable nonlinear ptes and they are swapped out/in
    correctly.
    
    some other changes in -A0 relative to 2.5.63-BK:
    
     - slightly smarter TLB flushing in install_page(). This is still only a
       stupid helper function - a more efficient 'walk the pagecache tree
       and pagetable at once and use TLB-gather' implementation is preferred.
    
     - cleanup: pass on pgprot_t instead of unsigned long prot.
    
     - some sanity checks to make sure file_pte() rules are followed.
    
     - do not reduce the vma's default protection to PROT_NONE when using
       remap_file_pages() on it. With swappable ptes this is now safe.
swapfile.c 37.5 KB