[PATCH] Make nonlinear mappings fully pageable
This patch requires arch support. I have patches for ia32, ppc64 and x86_64. Other architectures will break. It is a five-minute fix.

See http://mail.nl.linux.org/linux-mm/2003-03/msg00174.html for implementation details.

Patch from: Ingo Molnar <mingo@elte.hu>

the attached patch, against BK-curr, is a preparation to make remap_file_pages() usable on swappable vmas as well. When 'swapping out' shared-named mappings the page offset is written into the pte.

it takes one bit from the swap-type bits, otherwise it does not change the pte layout - so it should be easy to adapt any other architecture to this change as well. (this patch does not introduce the protection-bits-in-pte approach used in my previous patch.)

On 32-bit pte sizes with an effective usable pte range of 29 bits, this limits mmap()-able file size to 4096 * 2^29 == 2 TBs. If the usable range is smaller, then the maximum mmap() size is reduced as well. The worst-case i found (PPC) was 2 hw-reserved bits in the swap-case, which limits us to 1 TB filesize. Is there any other hw that has an even worse ratio of sw-usable pte bits?

this mmap() limit can be eliminated by simply not converting the swapped out pte to a file-pte, but clearing it and falling back to the linear mapping upon swapin. This puts the limit into remap_file_pages() alone, but i really hope no-one wants to use remap_file_pages() on a 32-bit platform, on a larger than 1-2 TB file.

sys_remap_file_pages() now enforces that the 'prot' parameter is zero. This restriction might be lifted in the future - i really hope we can have more flexible remapping once 64-bit platforms are commonplace - eg. things like memory debuggers could just use the permission bits directly, instead of creating many small vmas.

i've tested swappable nonlinear ptes and they are swapped out/in correctly.

some other changes in -A0 relative to 2.5.63-BK:

 - slightly smarter TLB flushing in install_page(). This is still only a stupid helper function - a more efficient 'walk the pagecache tree and pagetable at once and use TLB-gather' implementation is preferred.

 - cleanup: pass on pgprot_t instead of unsigned long prot.

 - some sanity checks to make sure file_pte() rules are followed.

 - do not reduce the vma's default protection to PROT_NONE when using remap_file_pages() on it. With swappable ptes this is now safe.
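
For illustration, the file-pte idea and the size limit described above can be sketched in plain userspace C. This is not the code from the patch: the bit positions, the 3 reserved low bits and every sketch_* name are assumptions made up for this example, and real layouts differ per architecture. It only shows how a page offset can be packed into a non-present 32-bit pte and why 29 usable bits with 4096-byte pages gives a 2 TB ceiling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only (not any real architecture). */
#define SKETCH_PAGE_SHIFT   12u        /* 4096-byte pages, as in the text      */
#define SKETCH_PTE_PRESENT  (1u << 0)  /* assumed: bit 0 marks a present pte   */
#define SKETCH_PTE_FILE     (1u << 1)  /* assumed: bit 1 marks a file pte      */
#define SKETCH_RESERVED     3u         /* assumed: low bits not usable by sw   */
#define SKETCH_OFF_BITS     (32u - SKETCH_RESERVED)  /* 29 usable offset bits  */

typedef uint32_t sketch_pte_t;

/* Pack a file page offset into a non-present pte. */
static sketch_pte_t sketch_pgoff_to_pte(uint32_t pgoff)
{
    /* Sanity check: the offset must fit into the usable bits. */
    assert(pgoff < (1u << SKETCH_OFF_BITS));
    return (pgoff << SKETCH_RESERVED) | SKETCH_PTE_FILE;
}

/* A file pte is not present and carries the file marker bit. */
static int sketch_pte_file(sketch_pte_t pte)
{
    return !(pte & SKETCH_PTE_PRESENT) && (pte & SKETCH_PTE_FILE);
}

/* Recover the page offset from a file pte. */
static uint32_t sketch_pte_to_pgoff(sketch_pte_t pte)
{
    assert(sketch_pte_file(pte));
    return pte >> SKETCH_RESERVED;
}

/* Largest mappable file, in bytes, for a given number of offset bits. */
static uint64_t sketch_max_file_bytes(unsigned off_bits)
{
    return (uint64_t)1 << (SKETCH_PAGE_SHIFT + off_bits);
}
```

With these assumptions, sketch_max_file_bytes(29) is 2^41 bytes (2 TB), and every usable bit that is lost halves the limit, so 28 bits gives 1 TB.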