1. 17 May, 2023 28 commits
  2. 16 May, 2023 7 commits
  3. 15 May, 2023 5 commits
    • Yafang Shao's avatar
      bpf: Fix memleak due to fentry attach failure · 108598c3
      Yafang Shao authored
      If it fails to attach fentry, the allocated bpf trampoline image will be
      left in the system. That can be verified by checking /proc/kallsyms.
      
      This meamleak can be verified by a simple bpf program as follows:
      
        SEC("fentry/trap_init")
        int fentry_run()
        {
            return 0;
        }
      
      It will fail to attach trap_init because this function is freed after
      kernel init, and then we can find the trampoline image is left in the
      system by checking /proc/kallsyms.
      
        $ tail /proc/kallsyms
        ffffffffc0613000 t bpf_trampoline_6442453466_1  [bpf]
        ffffffffc06c3000 t bpf_trampoline_6442453466_1  [bpf]
      
        $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep "FUNC 'trap_init'"
        [2522] FUNC 'trap_init' type_id=119 linkage=static
      
        $ echo $((6442453466 & 0x7fffffff))
        2522
      
      Note that there are two left bpf trampoline images, that is because the
      libbpf will fallback to raw tracepoint if -EINVAL is returned.
      
      Fixes: e21aa341 ("bpf: Fix fexit trampoline.")
      Signed-off-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <olsajiri@gmail.com>
      Link: https://lore.kernel.org/bpf/20230515130849.57502-2-laoar.shao@gmail.com
      108598c3
    • Yafang Shao's avatar
      bpf: Remove bpf trampoline selector · 47e79cbe
      Yafang Shao authored
      After commit e21aa341 ("bpf: Fix fexit trampoline."), the selector is only
      used to indicate how many times the bpf trampoline image are updated and been
      displayed in the trampoline ksym name. After the trampoline is freed, the
      selector will start from 0 again. So the selector is a useless value to the
      user. We can remove it.
      
      If the user want to check whether the bpf trampoline image has been updated
      or not, the user can compare the address. Each time the trampoline image is
      updated, the address will change consequently. Jiri also pointed out another
      issue that perf is still using the old name "bpf_trampoline_%lu", so this
      change can fix the issue in perf.
      
      Fixes: e21aa341 ("bpf: Fix fexit trampoline.")
      Signed-off-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <olsajiri@gmail.com>
      Link: https://lore.kernel.org/bpf/ZFvOOlrmHiY9AgXE@krava
      Link: https://lore.kernel.org/bpf/20230515130849.57502-3-laoar.shao@gmail.com
      47e79cbe
    • Florent Revest's avatar
      bpf, arm64: Support struct arguments in the BPF trampoline · 90564f1e
      Florent Revest authored
      This extends the BPF trampoline JIT to support attachment to functions
      that take small structures (up to 128bit) as argument. This is trivially
      achieved by saving/restoring a number of "argument registers" rather
      than a number of arguments.
      
      The AAPCS64 section 6.8.2 describes the parameter passing ABI.
      "Composite types" (like C structs) below 16 bytes (as enforced by the
      BPF verifier) are provided as part of the 8 argument registers as
      explained in the section C.12.
      Signed-off-by: default avatarFlorent Revest <revest@chromium.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Acked-by: default avatarXu Kuohai <xukuohai@huawei.com>
      Link: https://lore.kernel.org/bpf/20230511140507.514888-1-revest@chromium.org
      90564f1e
    • Alan Maguire's avatar
      bpftool: JIT limited misreported as negative value on aarch64 · 04cb8453
      Alan Maguire authored
      On aarch64, "bpftool feature" reports an incorrect BPF JIT limit:
      
      $ sudo /sbin/bpftool feature
      Scanning system configuration...
      bpf() syscall restricted to privileged users
      JIT compiler is enabled
      JIT compiler hardening is disabled
      JIT compiler kallsyms exports are enabled for root
      skipping kernel config, can't open file: No such file or directory
      Global memory limit for JIT compiler for unprivileged users is -201326592 bytes
      
      This is because /proc/sys/net/core/bpf_jit_limit reports
      
      $ sudo cat /proc/sys/net/core/bpf_jit_limit
      68169519595520
      
      ...and an int is assumed in read_procfs().  Change read_procfs()
      to return a long to avoid negative value reporting.
      
      Fixes: 7a4522bb ("tools: bpftool: add probes for /proc/ eBPF parameters")
      Reported-by: default avatarNicky Veitch <nicky.veitch@oracle.com>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarQuentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20230512113134.58996-1-alan.maguire@oracle.com
      04cb8453
    • Andrii Nakryiko's avatar
      bpf: fix calculation of subseq_idx during precision backtracking · d84b1a67
      Andrii Nakryiko authored
      Subsequent instruction index (subseq_idx) is an index of an instruction
      that was verified/executed by verifier after the currently processed
      instruction. It is maintained during precision backtracking processing
      and is used to detect various subprog calling conditions.
      
      This patch fixes the bug with incorrectly resetting subseq_idx to -1
      when going from child state to parent state during backtracking. If we
      don't maintain correct subseq_idx we can misidentify subprog calls
      leading to precision tracking bugs.
      
      One such case was triggered by test_global_funcs/global_func9 test where
      global subprog call happened to be the very last instruction in parent
      state, leading to subseq_idx==-1, triggering WARN_ONCE:
      
        [   36.045754] verifier backtracking bug
        [   36.045764] WARNING: CPU: 13 PID: 2073 at kernel/bpf/verifier.c:3503 __mark_chain_precision+0xcc6/0xde0
        [   36.046819] Modules linked in: aesni_intel(E) crypto_simd(E) cryptd(E) kvm_intel(E) kvm(E) irqbypass(E) i2c_piix4(E) serio_raw(E) i2c_core(E) crc32c_intel)
        [   36.048040] CPU: 13 PID: 2073 Comm: test_progs Tainted: G        W  OE      6.3.0-07976-g4d585f48-dirty #972
        [   36.048783] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
        [   36.049648] RIP: 0010:__mark_chain_precision+0xcc6/0xde0
        [   36.050038] Code: 3d 82 c6 05 bb 35 32 02 01 e8 66 21 ec ff 0f 0b b8 f2 ff ff ff e9 30 f5 ff ff 48 c7 c7 f3 61 3d 82 4c 89 0c 24 e8 4a 21 ec ff <0f> 0b 4c0
      
      With the fix precision tracking across multiple states works correctly now:
      
      mark_precise: frame0: last_idx 45 first_idx 38 subseq_idx -1
      mark_precise: frame0: regs=r8 stack= before 44: (61) r7 = *(u32 *)(r10 -4)
      mark_precise: frame0: regs=r8 stack= before 43: (85) call pc+41
      mark_precise: frame0: regs=r8 stack= before 42: (07) r1 += -48
      mark_precise: frame0: regs=r8 stack= before 41: (bf) r1 = r10
      mark_precise: frame0: regs=r8 stack= before 40: (63) *(u32 *)(r10 -48) = r1
      mark_precise: frame0: regs=r8 stack= before 39: (b4) w1 = 0
      mark_precise: frame0: regs=r8 stack= before 38: (85) call pc+38
      mark_precise: frame0: parent state regs=r8 stack=:  R0_w=scalar() R1_w=map_value(off=4,ks=4,vs=8,imm=0) R6=1 R7_w=scalar() R8_r=P0 R10=fpm
      mark_precise: frame0: last_idx 36 first_idx 28 subseq_idx 38
      mark_precise: frame0: regs=r8 stack= before 36: (18) r1 = 0xffff888104f2ed14
      mark_precise: frame0: regs=r8 stack= before 35: (85) call pc+33
      mark_precise: frame0: regs=r8 stack= before 33: (18) r1 = 0xffff888104f2ed10
      mark_precise: frame0: regs=r8 stack= before 32: (85) call pc+36
      mark_precise: frame0: regs=r8 stack= before 31: (07) r1 += -4
      mark_precise: frame0: regs=r8 stack= before 30: (bf) r1 = r10
      mark_precise: frame0: regs=r8 stack= before 29: (63) *(u32 *)(r10 -4) = r7
      mark_precise: frame0: regs=r8 stack= before 28: (4c) w7 |= w0
      mark_precise: frame0: parent state regs=r8 stack=:  R0_rw=scalar() R6=1 R7_rw=scalar() R8_rw=P0 R10=fp0 fp-48_r=mmmmmmmm
      mark_precise: frame0: last_idx 27 first_idx 16 subseq_idx 28
      mark_precise: frame0: regs=r8 stack= before 27: (85) call pc+31
      mark_precise: frame0: regs=r8 stack= before 26: (b7) r1 = 0
      mark_precise: frame0: regs=r8 stack= before 25: (b7) r8 = 0
      
      Note how subseq_idx starts out as -1, then is preserved as 38 and then 28 as we
      go up the parent state chain.
      Reported-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Fixes: fde2a388 ("bpf: support precision propagation in the presence of subprogs")
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230515180710.1535018-1-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d84b1a67