    powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
    Aneesh Kumar K.V authored
    updatepp can get called for a nohpte fault when we find from the linux
    page table that the translation was hashed before. In that case
    we are sure that there is no existing translation, hence we can
    avoid doing tlbie.
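
The idea above can be sketched in plain C as a flag check in front of the TLB invalidate. This is a minimal userspace illustration, not the kernel code: the names sketch_updatepp, sketch_tlbie, SKETCH_LOCAL_UPDATE and SKETCH_NOHPTE_UPDATE, and the flag values, are hypothetical stand-ins chosen for this example.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_LOCAL_UPDATE  0x1UL
#define SKETCH_NOHPTE_UPDATE 0x2UL

/* Counts how many times the stand-in invalidate was issued. */
static unsigned long tlbie_count;

/* Stand-in for the real TLB invalidate; it only records that it ran. */
static void sketch_tlbie(unsigned long vpn, bool local)
{
    (void)vpn;
    (void)local;
    tlbie_count++;
}

/*
 * Model of an updatepp call. Returns 0 if the hash PTE was found and
 * updated, -1 otherwise. The invalidate is skipped when the caller
 * says the fault was a NO HPTE fault, because in that case no
 * translation existed that could have been cached in the TLB.
 */
static int sketch_updatepp(bool hpte_found, unsigned long vpn,
                           unsigned long flags)
{
    int ret = hpte_found ? 0 : -1;
    bool local = (flags & SKETCH_LOCAL_UPDATE) != 0;

    if (!(flags & SKETCH_NOHPTE_UPDATE))
        sketch_tlbie(vpn, local);

    return ret;
}
```

Calling sketch_updatepp(true, vpn, 0) issues one invalidate, while sketch_updatepp(true, vpn, SKETCH_NOHPTE_UPDATE) issues none.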
    
    We could possibly race with a parallel fault filling the TLB. But
    that should be ok because updatepp is only ever relaxing permissions.
    We also look at linux pte permission bits when filling hash pte
    permission bits. We also hold the linux pte busy bits while
    inserting/updating a hashpte entry, hence a parallel update of
    linux pte is not possible. On the other hand mprotect involves
    ptep_modify_prot_start which causes a hpte invalidate and not updatepp.
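
The race argument above can be illustrated with an abstract permission model. This is only a sketch under simplifying assumptions: permissions are treated as a plain bit set (real HPTE PP bits are encoded differently), and the names PERM_READ, PERM_WRITE, PERM_EXEC, is_relaxation and stale_entry_is_safe are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical abstract permission bits, not the hardware encoding. */
#define PERM_READ  0x1u
#define PERM_WRITE 0x2u
#define PERM_EXEC  0x4u

/*
 * True if new_perms allows everything old_perms allowed, i.e. the
 * update only relaxes permissions and never revokes any.
 */
static bool is_relaxation(unsigned int old_perms, unsigned int new_perms)
{
    return (old_perms & ~new_perms) == 0;
}

/*
 * A stale TLB entry left behind by a racing fault is harmless when it
 * grants nothing that the current hash PTE denies. In this model the
 * worst case is an extra fault on the stale entry, never an access
 * that should have been refused.
 */
static bool stale_entry_is_safe(unsigned int stale_tlb_perms,
                                unsigned int hpte_perms)
{
    return is_relaxation(stale_tlb_perms, hpte_perms);
}
```

In this model a stale read-only entry is safe against a read-write hash PTE, whereas a stale read-write entry against a read-only hash PTE is not, which is why a path that tightens permissions has to invalidate.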
    
    Performance numbers:
    We use random_access_bench written by Anton.
    
    Kernel with THP disabled and smaller hash page table size.
    
        86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
         2.10%  random_access_b  random_access_bench              [.] doit
         1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
         1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
         1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
         1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
         0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
         0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
         0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
         0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
         0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
    
    With Fix:
    
        27.54%  random_access_b  random_access_bench              [.] doit
        22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
         5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
         5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
         5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
         4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
         3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
         1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>