    [PATCH] re-slabify i386 pgd's and pmd's · 6beadb3b
    Andrew Morton authored
    From: William Lee Irwin III <wli@holomorphy.com>
    
    The original pgd/pmd slabification patches had a critical bug on
    non-PAE: modifications of pgd entries were not propagated to cached
    pgd's. This affected both directions: removing the pagetables attached
    for non-PSE mappings to return an entry to a PSE state, and attaching
    pagetables to bring PSE mappings into a non-PSE state. PAE was immune
    to it owing to the shared kernel pmd.
    
    The following patch vs. 2.5.69 restores the slabification done to cache
    preconstructed pagetables with the proper propagation of conversions
    to and from PSE mappings to cached pgd's for the non-PAE case.
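
    To make the propagation concrete, here is a minimal userspace sketch
    in C; it is not the kernel code from this patch. It assumes the
    default 3GB/1GB split (768 user entries out of 1024 on non-PAE).
    Every name in it (cached_pgd, pgd_construct, pgd_destroy,
    set_kernel_pgd_entry, master_pgd) is hypothetical, and a C11
    atomic_flag spinlock stands in for whatever locking the patch
    actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PTRS_PER_PGD  1024  /* non-PAE i386: 1024 entries per pgd */
#define USER_PGD_PTRS  768  /* assumption: default 3GB/1GB user/kernel split */

typedef unsigned long pgd_entry_t;  /* hypothetical stand-in for a pgd entry */

struct cached_pgd {
    pgd_entry_t entry[PTRS_PER_PGD];
    struct cached_pgd *next;        /* link in the list of all constructed pgd's */
};

static pgd_entry_t master_pgd[PTRS_PER_PGD];  /* models the kernel's reference pgd */
static struct cached_pgd *pgd_list_head;
static atomic_flag pgd_lock = ATOMIC_FLAG_INIT;

static void lock_pgds(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&pgd_lock, memory_order_acquire))
        ;
}

static void unlock_pgds(void)
{
    atomic_flag_clear_explicit(&pgd_lock, memory_order_release);
}

/* Build a pgd once: zero the user half, copy the kernel half, and
 * register it so later kernel-entry changes can reach it. */
struct cached_pgd *pgd_construct(void)
{
    struct cached_pgd *pgd = calloc(1, sizeof(*pgd));
    if (!pgd)
        return NULL;
    lock_pgds();
    memcpy(&pgd->entry[USER_PGD_PTRS], &master_pgd[USER_PGD_PTRS],
           (PTRS_PER_PGD - USER_PGD_PTRS) * sizeof(pgd_entry_t));
    pgd->next = pgd_list_head;
    pgd_list_head = pgd;
    unlock_pgds();
    return pgd;
}

/* Unlink a pgd from the list before freeing it. */
void pgd_destroy(struct cached_pgd *pgd)
{
    lock_pgds();
    for (struct cached_pgd **pp = &pgd_list_head; *pp; pp = &(*pp)->next) {
        if (*pp == pgd) {
            *pp = pgd->next;
            break;
        }
    }
    unlock_pgds();
    free(pgd);
}

/* Change one kernel-half entry (for example a PSE <-> non-PSE
 * conversion) and propagate it to every constructed pgd. */
void set_kernel_pgd_entry(size_t index, pgd_entry_t value)
{
    assert(index >= USER_PGD_PTRS && index < PTRS_PER_PGD);
    lock_pgds();
    master_pgd[index] = value;
    for (struct cached_pgd *pgd = pgd_list_head; pgd; pgd = pgd->next)
        pgd->entry[index] = value;
    unlock_pgds();
}
```

    The point of the sketch is the loop in set_kernel_pgd_entry(): a
    preconstructed pgd holds its own copy of the kernel entries, so a
    change made only to the reference copy would leave every cached
    pgd stale, which is the shape of the bug described above.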
    
    This is an optimization to reduce the bitblitting overhead for spawning
    small tasks (for larger ones, bottom-level pagetable copies dominate)
    primarily on non-PAE; the PAE code change is largely to remove #ifdefs
    and to treat the two cases uniformly, though some positive but small
    performance improvement has been observed for PAE in one of mbligh's
    posts. The non-PAE performance improvement has been observed on a box
    running a script-heavy end-user workload as a large long-term profile
    hit count reduction for pgd_alloc() and relatives thereof.
    
    I would very much appreciate outside testers. Even though I've been
    able to verify this boots and runs properly and survives several cycles
    of restarting X on my non-PAE Thinkpad T21, that environment has never
    been able to reproduce the bug. Those with the proper graphics hardware
    to prod the affected codepaths into action are the ones best suited to
    verify proper functionality. There is also some locking introduced.
    Performance verification on non-PAE SMP i386 targets that also have
    the proper hardware (my SMP targets unfortunately all require PAE due
    to arch code dependencies) would help determine whether the
    alternative locking schemes that competed against the one shown here
    are preferable (in particular, the ticket-based scheme mentioned in
    the comments).