    bpf, arm64: Implement bpf_arch_text_poke() for arm64 · b2ad54e1
    Xu Kuohai authored
    Implement bpf_arch_text_poke() for arm64, so that a bpf prog or bpf
    trampoline can be patched with it.
    
    When the target address is NULL, the original instruction is patched to
    a NOP.
    
    When the target address and the source address are within the branch
    range, the original instruction is patched to a bl instruction to the
    target address directly.
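    
    As an illustration only, the two instruction words involved can be
    computed as below. This is a userspace sketch, not the kernel code:
    encode_bl() and patchsite_insn() are hypothetical helper names, and the
    sketch assumes the offset is already known to be 4-byte aligned and
    within the bl range.

```c
#include <stdint.h>

#define A64_NOP     0xd503201fu /* A64 NOP instruction word */
#define A64_BL_BASE 0x94000000u /* A64 BL opcode with imm26 = 0 */

/*
 * Encode "bl target" for an instruction located at pc.
 * Assumption: target - pc is a multiple of 4 and fits in the bl range.
 */
static uint32_t encode_bl(uint64_t pc, uint64_t target)
{
	uint64_t offset = target - pc; /* two's complement byte offset */
	uint32_t imm26 = (uint32_t)((offset >> 2) & 0x03ffffffu);

	return A64_BL_BASE | imm26;
}

/* Instruction word for the patchsite: a NOP when target is NULL (0). */
static uint32_t patchsite_insn(uint64_t pc, uint64_t target)
{
	return target ? encode_bl(pc, target) : A64_NOP;
}
```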
    
    To support attaching a bpf trampoline to both regular kernel functions
    and bpf progs, we follow the ftrace patchsite approach for bpf progs.
    That is, two instructions are inserted at the beginning of the bpf prog:
    the first one saves the return address to x9, and the second is a nop
    which will be patched to a bl instruction when a bpf trampoline is
    attached.
    
    However, when a bpf trampoline is attached to a bpf prog, the distance
    between the target address and the source address may exceed 128MB, the
    maximum branch range of a bl instruction, because the bpf trampoline and
    the bpf prog are allocated separately with vmalloc. So long jumps must
    be handled.
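    
    A minimal sketch of the range check, for illustration. The helper name
    is_long_jump() and the treatment of a NULL target as "not a long jump"
    are assumptions of this sketch; the bounds follow from the signed
    26-bit, 4-byte-scaled immediate of bl.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_128M (128LL * 1024 * 1024)

/*
 * Return true when a bl at pc cannot reach target directly.
 * A bl covers offsets in [-128MB, 128MB). A NULL (0) target means the
 * patchsite becomes a nop, so no jump is needed at all.
 */
static bool is_long_jump(uint64_t pc, uint64_t target)
{
	int64_t offset;

	if (!target)
		return false;

	offset = (int64_t)(target - pc);
	return offset < -SZ_128M || offset >= SZ_128M;
}
```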
    
    When a bpf prog is constructed, a plt pointing to the empty trampoline
    dummy_tramp is placed at the end:
    
            bpf_prog:
                    mov x9, lr
                    nop // patchsite
                    ...
                    ret
    
            plt:
                    ldr x10, target
                    br x10
            target:
                    .quad dummy_tramp // plt target
    
    This is also the state when no trampoline is attached.
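    
    For illustration, the plt above can be modelled in C as two instruction
    words followed by a 64-bit literal. The names plt_model and make_plt()
    are hypothetical, and the sketch assumes the literal sits 8 bytes after
    the ldr, as in the listing:

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the plt: two instructions followed by the 64-bit target. */
struct plt_model {
	uint32_t insn_ldr; /* ldr x10, target */
	uint32_t insn_br;  /* br x10 */
	uint64_t target;   /* .quad <address> */
};

/* The ldr literal is PC-relative, so the layout must be exact. */
_Static_assert(offsetof(struct plt_model, target) == 8,
	       "target must sit 8 bytes after the ldr");

/* Build a plt whose target is the given address (e.g. dummy_tramp). */
static struct plt_model make_plt(uint64_t target_addr)
{
	struct plt_model plt;

	/* LDR (literal, 64-bit): 0x58000000 | imm19 << 5 | Rt, imm19 = 8 / 4 */
	plt.insn_ldr = 0x58000000u | (2u << 5) | 10u;
	/* BR Xn: 0xd61f0000 | Rn << 5 */
	plt.insn_br = 0xd61f0000u | (10u << 5);
	plt.target = target_addr;
	return plt;
}
```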
    
    When a short-jump bpf trampoline is attached, the patchsite is patched to
    a bl instruction to the trampoline directly:
    
            bpf_prog:
                    mov x9, lr
                    bl <short-jump bpf trampoline address> // patchsite
                    ...
                    ret
    
            plt:
                    ldr x10, target
                    br x10
            target:
                    .quad dummy_tramp // plt target
    
    When a long-jump bpf trampoline is attached, the plt target is filled with
    the trampoline address and the patchsite is patched to a bl instruction to
    the plt:
    
            bpf_prog:
                    mov x9, lr
                    bl plt // patchsite
                    ...
                    ret
    
            plt:
                    ldr x10, target
                    br x10
            target:
                    .quad <long-jump bpf trampoline address>
    
    dummy_tramp is used to prevent another CPU from jumping to an unknown
    location during the patching process, making the patching process easier.
    
    The patching process is as follows:
    
    1. when neither the old address nor the new address is a long jump, the
       patchsite is replaced with a bl to the new address, or a nop if the
       new address is NULL;
    
    2. when the old address is not a long jump but the new one is, the
       branch target address is written to the plt first, then the patchsite
       is replaced with a bl instruction to the plt;
    
    3. when the old address is a long jump but the new one is not, the
       address of dummy_tramp is written to the plt first, then the patchsite
       is replaced with a bl to the new address, or a nop if the new address
       is NULL;
    
    4. when both the old address and the new address are long jumps, the
       new address is written to the plt and the patchsite is not changed.
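    
    The four cases above can be sketched as a small state model. This is a
    simplified userspace illustration, not the kernel implementation:
    prog_model, poke() and the DUMMY_TRAMP placeholder value are all
    hypothetical, and real patching also has to verify the old instruction
    and write the instruction memory safely.

```c
#include <stdbool.h>
#include <stdint.h>

#define DUMMY_TRAMP 0xd0d0d0d0ull /* placeholder for &dummy_tramp */

enum site { SITE_NOP, SITE_BL_DIRECT, SITE_BL_PLT };

struct prog_model {
	enum site patchsite;    /* what the patchsite currently holds */
	uint64_t direct_target; /* bl target when SITE_BL_DIRECT, else 0 */
	uint64_t plt_target;    /* the .quad at the end of the plt */
};

static void poke(struct prog_model *p, bool old_long, bool new_long,
		 uint64_t new_addr)
{
	if (new_long) {
		/* cases 2 and 4: the plt target is written first */
		p->plt_target = new_addr;
		if (!old_long) {
			/* case 2: then point the patchsite at the plt */
			p->patchsite = SITE_BL_PLT;
			p->direct_target = 0;
		}
		/* case 4: patchsite already branches to the plt */
		return;
	}

	/* case 3: restore dummy_tramp before touching the patchsite */
	if (old_long)
		p->plt_target = DUMMY_TRAMP;

	/* cases 1 and 3: direct bl, or nop when the new address is NULL */
	p->patchsite = new_addr ? SITE_BL_DIRECT : SITE_NOP;
	p->direct_target = new_addr;
}
```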
    
    Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Reviewed-by: KP Singh <kpsingh@kernel.org>
    Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20220711150823.2128542-4-xukuohai@huawei.com
bpf_jit_comp.c 47.8 KB