- 25 Mar, 2024 4 commits
-
-
Puranjay Mohan authored
Implement a helper function to check if an instruction is addr_space_cast from as(0) to as(1). Use this helper in the x86 JIT. Other JITs can use this helper when they add support for this instruction. Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Link: https://lore.kernel.org/r/20240324183226.29674-1-puranjay12@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
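A minimal sketch of what such a helper could look like, assuming the cast is encoded as BPF_ALU64 | BPF_MOV | BPF_X with off == BPF_ADDR_SPACE_CAST and the address spaces packed into imm; the helper name and the exact imm check are illustrative, not necessarily the committed code:

  /* Sketch: true if insn is addr_space_cast from as(0) to as(1),
   * i.e. (assumption) imm carries dst_as = 1 in its upper 16 bits
   * and src_as = 0 in its lower 16 bits. */
  static inline bool insn_is_cast_user(const struct bpf_insn *insn)
  {
          return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
                 insn->off == BPF_ADDR_SPACE_CAST &&
                 insn->imm == 1U << 16;
  }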
-
Andrii Nakryiko authored
get_kernel_nofault() (or, rather, the underlying copy_from_kernel_nofault()) is not free and it does pop up in performance profiles when kprobes are heavily utilized with CONFIG_X86_KERNEL_IBT=y config. Let's avoid using it if we know that fentry_ip - 4 can't cross a page boundary. We do that by masking the lowest 12 bits and checking that the resulting page offset is at least 4: in that case fentry_ip - 4 stays within the same page and can be read directly. Another benefit (and actually what caused a closer look at this part of code) is that now the LBR record is (typically) not wasted on the copy_from_kernel_nofault() call and code, which helps tools like retsnoop that grab LBR records from inside BPF code in kretprobes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20240319212013.1046779-1-andrii@kernel.org
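A sketch of the fast path this describes, assuming the x86 IBT helpers is_endbr() and ENDBR_INSN_SIZE (== 4) from asm/ibt.h; the exact committed code may differ:

  static unsigned long get_entry_ip(unsigned long fentry_ip)
  {
          u32 instr;

          /* If the page offset of fentry_ip is < 4, fentry_ip - 4 lands on
           * the previous (possibly unmapped) page: use the safe slow copy.
           * Otherwise the read stays within one mapped page and a plain
           * dereference is fine. */
          if ((fentry_ip & ~PAGE_MASK) < ENDBR_INSN_SIZE) {
                  if (copy_from_kernel_nofault(&instr,
                          (u32 *)(fentry_ip - ENDBR_INSN_SIZE), sizeof(instr)))
                          return fentry_ip;
          } else {
                  instr = *(u32 *)(fentry_ip - ENDBR_INSN_SIZE);
          }
          if (is_endbr(instr))
                  fentry_ip -= ENDBR_INSN_SIZE;
          return fentry_ip;
  }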
-
Geliang Tang authored
To simplify the code, use BPF selftests helper start_server() in bpf_tcp_ca.c instead of open-coding it. This helper is defined in network_helpers.c, and exported in network_helpers.h, which is already included in bpf_tcp_ca.c. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/9926a79118db27dd6d91c4854db011c599cabd0e.1711331517.git.tanggeliang@kylinos.cn
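For illustration, the open-coded socket()/bind()/listen() sequence collapses into a single call; a sketch assuming the start_server() signature from network_helpers.h (the address family and port here are illustrative):

  #include "network_helpers.h"

  /* Start a listening TCP server on an ephemeral port; returns the fd. */
  int lfd = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0);
  if (!ASSERT_GE(lfd, 0, "start_server"))
          return;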
-
Yonghong Song authored
There is a difference between kernel uapi bpf.h and tools uapi bpf.h. There is no functionality difference, but let us sync them properly to make later bpf.h updates easy. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240325033842.1693553-1-yonghong.song@linux.dev
-
- 22 Mar, 2024 3 commits
-
-
Yonghong Song authored
The new sec_def specifies sk_skb program type with BPF_SK_SKB_VERDICT attachment type. This way, libbpf will set expected_attach_type properly for the program. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240319175412.2941149-1-yonghong.song@linux.dev
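With the new sec_def, a program can simply declare the section and let libbpf set the expected attach type; a minimal sketch:

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  /* libbpf infers prog type sk_skb and expected_attach_type
   * BPF_SK_SKB_VERDICT from the section name. */
  SEC("sk_skb/verdict")
  int prog_skb_verdict(struct __sk_buff *skb)
  {
          return SK_PASS;
  }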
-
Jiri Olsa authored
Some distros seem to enable -fcf-protection=branch by default, which places an endbr64 instruction at the first instruction of our uprobe trigger functions and breaks our setup. Mark them with the nocf_check attribute to skip that. Also ignore the unknown-attribute warning in gcc for bench objects, because nocf_check can be used only when -fcf-protection=branch is enabled; otherwise we get a warning and break compilation. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240322134936.1075395-1-jolsa@kernel.org
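A sketch of the annotation (the function name is illustrative):

  /* nocf_check suppresses the endbr64 at function entry, so the uprobe
   * can still be attached at the very first instruction. */
  __attribute__((nocf_check)) __weak void uprobe_target_nop(void)
  {
          asm volatile ("");
  }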
-
Alan Maguire authored
With glibc 2.28, selftests compilation fails for benchs/bench_trigger.c:

  benchs/bench_trigger.c: In function ‘inc_counter’:
  benchs/bench_trigger.c:25:23: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration]
     25 |         tid = gettid();
        |               ^~~~~~
        |               getgid
  cc1: all warnings being treated as errors

It appears support for the gettid() wrapper is variable across glibc versions, so it may be safer to use syscall(SYS_gettid) instead. Fixes: 520fad2e ("selftests/bpf: scale benchmark counting by using per-CPU counters") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240322095728.95671-1-alan.maguire@oracle.com
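A sketch of the portable replacement (wrapper name illustrative; glibc only gained a gettid() wrapper in 2.30, while the raw syscall works everywhere):

  #include <unistd.h>
  #include <sys/syscall.h>

  static inline pid_t sys_gettid(void)
  {
          return (pid_t)syscall(SYS_gettid);
  }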
-
- 21 Mar, 2024 2 commits
-
-
Harishankar Vishwanathan authored
In case of GE/GT/SGE/SGT instructions, regs_refine_cond_op() reuses the logic that does analysis of LE/LT/SLE/SLT instructions. This commit avoids the use of a goto to perform the reuse. Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240321002955.808604-1-harishankar.vishwanathan@gmail.com
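One way to express this reuse without a goto is to swap the operands and flip the opcode so the GE/GT cases fall through into the LE/LT analysis; a sketch of the switch body (not necessarily the exact shape of the committed code; flip_opcode() and swap() exist in the verifier):

  switch (opcode) {
  case BPF_JGE:
  case BPF_JGT:
  case BPF_JSGE:
  case BPF_JSGT:
          /* Swap the register states and flip the comparison so the
           * LE/LT/SLE/SLT logic below handles this case too. */
          swap(reg1, reg2);
          opcode = flip_opcode(opcode);
          fallthrough;
  case BPF_JLE:
  case BPF_JLT:
  case BPF_JSLE:
  case BPF_JSLT:
          /* ... range-refinement logic ... */
          break;
  }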
-
Quentin Monnet authored
Bpftool's Makefile uses $(HOST_CFLAGS) to build the bootstrap version of bpftool, in order to pick the flags for the host (where we run the bootstrap version) and not for the target system (where we plan to run the full bpftool binary). But we pass too much information through this variable. In particular, we set HOST_CFLAGS by copying most of the $(CFLAGS); but we do this after the feature detection for bpftool, which means that $(CFLAGS), hence $(HOST_CFLAGS), contain all macro definitions for using the different optional features. For example, -DHAVE_LLVM_SUPPORT may be passed to the $(HOST_CFLAGS), even though the LLVM disassembler is not used in the bootstrap version, and the related library may even be missing for the host architecture. A similar thing happens with the $(LDFLAGS), which we use unchanged for linking the bootstrap version even though they may contain flags to link against additional libraries. To address the $(HOST_CFLAGS) issue, we move the definition of $(HOST_CFLAGS) earlier in the Makefile, before the $(CFLAGS) update resulting from the feature probing - none of which is relevant to the bootstrap version. To clean up the $(LDFLAGS) for the bootstrap version, we introduce a dedicated $(HOST_LDFLAGS) variable that we base on $(LDFLAGS), before the feature probing as well. On my setup, the following macros and libraries are removed from the compiler invocation to build bpftool after this patch:

  -DUSE_LIBCAP
  -DHAVE_LLVM_SUPPORT
  -I/usr/lib/llvm-17/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
  -lLLVM-17
  -L/usr/lib/llvm-17/lib

Another advantage of cleaning up these flags is that displaying available features with "bpftool version" becomes more accurate for the bootstrap bpftool, and no longer reflects the features detected (and available only) for the final binary. Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Quentin Monnet <qmo@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Message-ID: <20240320014103.45641-1-qmo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 20 Mar, 2024 9 commits
-
-
Andrii Nakryiko authored
When benchmarking with multiple threads (-pN, where N>1), we start contending on a single atomic counter that both BPF trigger benchmarks are using, as well as the "baseline" tests in user space (trig-base and trig-uprobe-base benchmarks). As such, we start bottlenecking on something completely irrelevant to the benchmark at hand. Scale counting up by using per-CPU counters on the BPF side. On the user space side we do the next best thing: hash the thread ID to approximate per-CPU behavior. It seems to work quite well in practice (see the sketch after the results below). To demonstrate the difference, I ran three benchmarks with 1, 2, 4, 8, 16, and 32 threads:

  - trig-uprobe-base (no syscalls, pure tight counting loop in user-space);
  - trig-base (get_pgid() syscall, atomic counter in user-space);
  - trig-fentry (syscall to trigger fentry program, atomic uncontended per-CPU counter on BPF side).

Command used:

  for b in uprobe-base base fentry; do \
    for p in 1 2 4 8 16 32; do \
      printf "%-11s %2d: %s\n" $b $p \
        "$(sudo ./bench -w2 -d5 -a -p$p trig-$b | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-)"; \
    done; \
  done

Before these changes, aggregate throughput across all threads doesn't scale well with the number of threads; it actually even falls sharply for uprobe-base due to very high contention:

  uprobe-base  1:  138.998 ± 0.650M/s
  uprobe-base  2:   70.526 ± 1.147M/s
  uprobe-base  4:   63.114 ± 0.302M/s
  uprobe-base  8:   54.177 ± 0.138M/s
  uprobe-base 16:   45.439 ± 0.057M/s
  uprobe-base 32:   37.163 ± 0.242M/s
  base         1:   16.940 ± 0.182M/s
  base         2:   19.231 ± 0.105M/s
  base         4:   21.479 ± 0.038M/s
  base         8:   23.030 ± 0.037M/s
  base        16:   22.034 ± 0.004M/s
  base        32:   18.152 ± 0.013M/s
  fentry       1:   14.794 ± 0.054M/s
  fentry       2:   17.341 ± 0.055M/s
  fentry       4:   23.792 ± 0.024M/s
  fentry       8:   21.557 ± 0.047M/s
  fentry      16:   21.121 ± 0.004M/s
  fentry      32:   17.067 ± 0.023M/s

After these changes, we see almost perfect linear scaling, as expected. The sub-linear scaling when going from 8 to 16 threads is interesting and consistent on my test machine, but I haven't investigated what is causing this peculiar slowdown (it shows across all benchmarks; could be due to hyperthreading effects, not sure).

  uprobe-base  1:  139.980 ± 0.648M/s
  uprobe-base  2:  270.244 ± 0.379M/s
  uprobe-base  4:  532.044 ± 1.519M/s
  uprobe-base  8: 1004.571 ± 3.174M/s
  uprobe-base 16: 1720.098 ± 0.744M/s
  uprobe-base 32: 3506.659 ± 8.549M/s
  base         1:   16.869 ± 0.071M/s
  base         2:   33.007 ± 0.092M/s
  base         4:   64.670 ± 0.203M/s
  base         8:  121.969 ± 0.210M/s
  base        16:  207.832 ± 0.112M/s
  base        32:  424.227 ± 1.477M/s
  fentry       1:   14.777 ± 0.087M/s
  fentry       2:   28.575 ± 0.146M/s
  fentry       4:   56.234 ± 0.176M/s
  fentry       8:  106.095 ± 0.385M/s
  fentry      16:  181.440 ± 0.032M/s
  fentry      32:  369.131 ± 0.693M/s

Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240315213329.1161589-1-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
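A sketch of the user-space side of this (the per-CPU approximation via thread-ID hashing; names and sizes are illustrative, not the exact bench code):

  #include <unistd.h>
  #include <sys/syscall.h>

  /* One counter per slot, padded to its own cache line so threads that
   * hash to different slots never false-share. */
  static struct counter {
          long value;
  } __attribute__((aligned(128))) counters[256];

  static inline void inc_counter(void)
  {
          int slot = (int)(syscall(SYS_gettid) % 256);

          __atomic_add_fetch(&counters[slot].value, 1, __ATOMIC_RELAXED);
  }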
-
Quentin Monnet authored
Commit d510296d ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.") added new files to the list of objects to compile in order to build the bootstrap version of bpftool. As far as I can tell, these objects are unnecessary and were added by mistake; maybe a draft version intended to add support for loading loader programs from the bootstrap version. Anyway, we can remove these object files from the list to make the bootstrap bpftool binary a tad smaller and faster to build. Fixes: d510296d ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.") Signed-off-by: Quentin Monnet <qmo@kernel.org> Message-ID: <20240320013457.44808-1-qmo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
When trying to load the pid_iter BPF program used to iterate over the PIDs of the processes holding file descriptors to BPF links, we would unconditionally silence libbpf in order to keep the output clean if the kernel does not support iterators and loading fails. Although this is the desirable behaviour in most cases, this may hide bugs in the pid_iter program that prevent it from loading, and it makes it hard to debug such load failures, even in "debug" mode. Instead, it makes more sense to print libbpf's logs when we pass the -d|--debug flag to bpftool, so that users get the logs to investigate failures without having to edit bpftool's source code. Signed-off-by: Quentin Monnet <qmo@kernel.org> Message-ID: <20240320012241.42991-1-qmo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
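A sketch of the behaviour described, reusing the identifiers visible in pids.c (the debug-flag name is illustrative):

  libbpf_print_fn_t default_print;

  /* Keep output clean unless the user asked for -d/--debug, in which
   * case let libbpf's logs through to help diagnose load failures. */
  if (!debug_mode)
          default_print = libbpf_set_print(libbpf_print_none);
  err = pid_iter_bpf__load(skel);
  if (!debug_mode)
          libbpf_set_print(default_print);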
-
Alexei Starovoitov authored
Andrii Nakryiko says: ==================== BPF raw tracepoint support for BPF cookie

Add the ability to specify and retrieve a BPF cookie for raw tracepoint programs. Both BTF-aware (SEC("tp_btf")) and non-BTF-aware (SEC("raw_tp")) programs are supported, as they are exactly the same at runtime. This issue recently came up in production use cases, where customers tried to switch from slower classic tracepoints to raw tracepoints and ran into this limitation. Luckily, it's not that hard to support this for raw_tp programs.

v2->v3:
  - s/bpf_raw_tp_open/bpf_raw_tracepoint_open_opts/ (Alexei, Eduard);

v1->v2:
  - fixed type definition for stubs of bpf_probe_{register,unregister};
  - added __u32 :32 and aligned raw_tp fields (Jiri);
  - added Stanislav's ack.
====================

Link: https://lore.kernel.org/r/20240319233852.1977493-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Andrii Nakryiko authored
Add a test validating that a BPF cookie can be passed during raw_tp/tp_btf attachment and can be retrieved at runtime with the bpf_get_attach_cookie() helper. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-6-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
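On the BPF side, reading the cookie back looks roughly like this (program and section names are illustrative; bpf_get_attach_cookie() is the existing helper):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  SEC("raw_tp/sys_enter")
  int handle_raw_tp(void *ctx)
  {
          __u64 cookie = bpf_get_attach_cookie(ctx);

          /* e.g. compare against the value set at attach time */
          bpf_printk("raw_tp cookie: %llu", cookie);
          return 0;
  }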
-
Andrii Nakryiko authored
Wire up BPF cookie passing for raw_tp and tp_btf programs, both in low-level and high-level APIs. Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-5-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
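A sketch of low-level usage, assuming the opts-based entry point named in the cover letter's changelog (bpf_raw_tracepoint_open_opts); the exact opts struct name and field layout here are assumptions:

  /* Assumed opts layout: tracepoint name plus cookie. */
  LIBBPF_OPTS(bpf_raw_tp_opts, opts,
          .tp_name = "sys_enter",
          .cookie = 0x1234ULL,
  );
  int link_fd = bpf_raw_tracepoint_open_opts(bpf_program__fd(prog), &opts);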
-
Andrii Nakryiko authored
Wire up BPF cookie for raw tracepoint programs (both BTF and non-BTF aware variants). This brings them up to par w.r.t. BPF cookie usage with classic tracepoint and fentry/fexit programs. Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-4-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Andrii Nakryiko authored
Instead of passing prog as an argument to bpf_trace_runX() helpers, that are called from tracepoint triggering calls, store the BPF link itself (struct bpf_raw_tp_link for raw tracepoints). This will allow passing extra information like the BPF cookie into raw tracepoint registration. Instead of replacing `struct bpf_prog *prog = __data;` with a corresponding `struct bpf_raw_tp_link *link = __data;` assignment in `__bpf_trace_##call`, I just passed `__data` through into the underlying bpf_trace_runX() call. This works well because we implicitly cast `void *`, and it also avoids naming clashes with arguments coming from the tracepoint's "proto" list. We could have run into the same problem with "prog", we just happened to not have a tracepoint that has a "prog" input argument. We are less lucky with "link", as there are tracepoints using the "link" argument name already. So instead of trying to avoid naming conflicts, let's just remove the intermediate local variable. It doesn't hurt readability; it's either way a bit of a maze of calls and macros that requires careful reading. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-3-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
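The resulting macro shape, a sketch reconstructed from include/trace/bpf_probe.h (treat the details as an approximation):

  /* Before, a named local was introduced:
   *     struct bpf_prog *prog = __data;
   * After, __data (now a struct bpf_raw_tp_link *) goes straight
   * through, avoiding clashes with names from the "proto" list. */
  #define __BPF_DECLARE_TRACE(call, proto, args)                          \
  static notrace void                                                     \
  __bpf_trace_##call(void *__data, proto)                                 \
  {                                                                       \
          CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args)); \
  }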
-
Andrii Nakryiko authored
bpf_probe_register() and __bpf_probe_register() have identical signatures, and bpf_probe_register() just redirects to __bpf_probe_register(). So get rid of this extra function call step to simplify following the source code. It makes no difference at runtime due to inlining, of course. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-2-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 19 Mar, 2024 8 commits
-
-
Alessandro Carminati (Red Hat) authored
On some systems, the netcat server can incur a delay before it starts listening. When this happens, the test can randomly fail at various points. This is an example error message:

  # ip gre none gso
  # encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
  # test basic connectivity
  # Ncat: Connection refused.

The issue stems from a race condition between the netcat client and server. The test author had addressed this problem with a sleep, which this patch removes. Instead, this patch introduces a function that sleeps for up to two seconds, but terminates the waiting period early if the port is reported to be listening. Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gmail.com
-
Andrii Nakryiko authored
Yonghong Song says: ==================== current_pid_tgid() for all prog types

Currently bpf_get_current_pid_tgid() is allowed in tracing, cgroup and sk_msg progs, while bpf_get_ns_current_pid_tgid() is only allowed in tracing progs. We have an internal use case where, for an application running in a container (with pid namespace), the user wants to get the pid associated with the pid namespace in a cgroup bpf program. Besides cgroup, the only prog type supporting bpf_get_current_pid_tgid() but not bpf_get_ns_current_pid_tgid() is sk_msg. But actually both bpf_get_current_pid_tgid() and bpf_get_ns_current_pid_tgid() helpers do not reveal kernel internal data, and there is no reason that they cannot be used in other program types. This patch set does just that and enables these two helpers for all program types. Patch 1 added the kernel support and patches 2-5 added the tests for cgroup and sk_msg.

Change logs:
  v1 -> v2:
    - allow bpf_get_[ns_]current_pid_tgid() for all prog types.
    - for network related selftests, use netns.
====================

Link: https://lore.kernel.org/r/20240315184849.2974556-1-yonghong.song@linux.dev Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Yonghong Song authored
Add a sk_msg bpf program test where the program is running in a pid namespace. The test is successful: #165/4 ns_current_pid_tgid/new_ns_sk_msg:OK Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240315184915.2976718-1-yonghong.song@linux.dev
-
Yonghong Song authored
Add a cgroup bpf program test where the bpf program is running in a pid namespace. The test is successful: #165/3 ns_current_pid_tgid/new_ns_cgrp:OK Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240315184910.2976522-1-yonghong.song@linux.dev
-
Yonghong Song authored
Refactor some functions in both the user space code and the bpf program, as these functions are used by later cgroup/sk_msg tests. Another change is to make loading of the tp program optional, since later patches will use optional loading as well, as they have quite different attachment and testing logic. There is no functionality change. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240315184904.2976123-1-yonghong.song@linux.dev
-
Yonghong Song authored
Replace CHECK in selftest ns_current_pid_tgid with the recommended ASSERT_* style. I also shortened the subtest names, as the prefix of the subtest name is covered by the test name already. This patch does fix a testing issue: currently, even if bss->user_{pid,tgid} is not correct, the test still passes since the clone func returns 0. I fixed it to return a non-zero value if bss->user_{pid,tgid} is incorrect. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240315184859.2975543-1-yonghong.song@linux.dev
-
Yonghong Song authored
Currently bpf_get_current_pid_tgid() is allowed in tracing, cgroup and sk_msg progs, while bpf_get_ns_current_pid_tgid() is only allowed in tracing progs. We have an internal use case where, for an application running in a container (with pid namespace), the user wants to get the pid associated with the pid namespace in a cgroup bpf program. Currently, cgroup bpf progs already allow bpf_get_current_pid_tgid(). Let us allow bpf_get_ns_current_pid_tgid() as well. Auditing the code shows bpf_get_current_pid_tgid() is also used by the sk_msg prog. But there are no side effects to exposing these two helpers to all prog types, since they do not reveal any kernel-specific data. The detailed discussion is in [1]. So with this patch, both bpf_get_current_pid_tgid() and bpf_get_ns_current_pid_tgid() are put in bpf_base_func_proto(), making them available to all program types. [1] https://lore.kernel.org/bpf/20240307232659.1115872-1-yonghong.song@linux.dev/ Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240315184854.2975190-1-yonghong.song@linux.dev
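The change effectively amounts to adding the two helpers' cases to bpf_base_func_proto(); a sketch (the proto symbols exist in the kernel, exact placement is illustrative):

  /* In bpf_base_func_proto(): */
  case BPF_FUNC_get_current_pid_tgid:
          return &bpf_get_current_pid_tgid_proto;
  case BPF_FUNC_get_ns_current_pid_tgid:
          return &bpf_get_ns_current_pid_tgid_proto;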
-
Jesper Dangaard Brouer authored
The BPF map type LPM (Longest Prefix Match) is used heavily in production by multiple products that have BPF components. Perf data shows trie_lookup_elem() and longest_prefix_match() being part of the kernel's perf top. For every level in the LPM tree, trie_lookup_elem() calls out to longest_prefix_match(). The compiler is free to inline this call, but chooses not to, because other slowpath callers (that can be invoked via syscall) exist, like trie_update_elem(), trie_delete_elem() or trie_get_next_key().

  bcc/tools/funccount -Ti 1 'trie_lookup_elem|longest_prefix_match.isra.0'
  FUNC                                  COUNT
  trie_lookup_elem                     664945
  longest_prefix_match.isra.0         8101507

Observation on a single random machine shows a factor of 12 between the two functions, matching an average of 12 levels in the trie being searched. This patch force-inlines longest_prefix_match(), but only for the lookup fastpath, to balance object instruction size. In production with AMD CPUs, measuring the function latency of 'trie_lookup_elem' (bcc/tools/funclatency), we are seeing a function latency reduction of 7-8% with this patch applied (to production kernels 6.6 and 6.1). Analyzing perf data, we can explain this rather large improvement by the reduced overhead of the AMD side-channel mitigation SRSO (Speculative Return Stack Overflow). Fixes: fb3bd914 ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/171076828575.2141737.18370644069389889027.stgit@firesoul
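A sketch of the split described: keep an __always_inline copy for the lookup fastpath and an outlined wrapper for the syscall-path callers (names and signatures are illustrative, body elided):

  /* Inlined into trie_lookup_elem(): no call/return on the fastpath,
   * which also means no SRSO-mitigated return. */
  static __always_inline size_t __longest_prefix_match(const struct lpm_trie *trie,
                                                       const struct lpm_trie_node *node,
                                                       const struct bpf_lpm_trie_key_u8 *key)
  {
          /* ... compare prefix bits ... */
  }

  /* Outlined wrapper for trie_update_elem() and friends, so their
   * object code doesn't grow. */
  static size_t longest_prefix_match(const struct lpm_trie *trie,
                                     const struct lpm_trie_node *node,
                                     const struct bpf_lpm_trie_key_u8 *key)
  {
          return __longest_prefix_match(trie, node, key);
  }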
-
- 18 Mar, 2024 4 commits
-
-
Christophe Leroy authored
arch_protect_bpf_trampoline() and alloc_new_pack() call set_memory_rox(), which can fail, leaving memory unprotected. Take the return value of set_memory_rox() into account and add the __must_check flag to arch_protect_bpf_trampoline(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/fe1c163c83767fde5cab31d209a4a6be3ddb3a73.1710574353.git.christophe.leroy@csgroup.eu Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
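A sketch under the assumption that the prototype now returns int and the generic implementation stays a weak default:

  /* Declaration: callers must now check the result. */
  int __must_check arch_protect_bpf_trampoline(void *image, unsigned int size);

  /* Weak default: propagate a set_memory_rox() failure to the caller
   * instead of silently running from unprotected memory. */
  int __weak arch_protect_bpf_trampoline(void *image, unsigned int size)
  {
          return set_memory_rox((long)image, 1);
  }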
-
Christophe Leroy authored
Last user of arch_unprotect_bpf_trampoline() was removed by commit 187e2af0 ("bpf: struct_ops supports more than one page for trampolines."). Remove arch_unprotect_bpf_trampoline(). Reported-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: 187e2af0 ("bpf: struct_ops supports more than one page for trampolines.") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/42c635bb54d3af91db0f9b85d724c7c290069f67.1710574353.git.christophe.leroy@csgroup.eu Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Mykyta Yatsenko authored
libbpf creates bpf_program/bpf_map structs for each program/map that the user defines, but it allows disabling the creation/loading of those objects in the kernel, in which case they won't have an associated file descriptor (fd < 0). Such functionality is used for backward compatibility with some older kernels. Nothing prevents users from passing these maps or programs with no kernel counterpart to libbpf APIs. This change introduces explicit checks for kernel object existence, aiming to improve the visibility of those edge cases and provide meaningful warnings to users. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240318131808.95959-1-yatsenko@meta.com
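A sketch of the kind of check described, as it could look inside libbpf.c (helper name and message are illustrative, not necessarily the committed ones):

  static bool map_is_created(const struct bpf_map *map)
  {
          return map->fd >= 0;
  }

  /* e.g. at the top of an API that needs the kernel object: */
  if (!map_is_created(map)) {
          pr_warn("map '%s': can't use BPF map without FD (was it created?)\n",
                  map->name);
          return libbpf_err(-EINVAL);
  }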
-
Martin KaFai Lau authored
There is a "if (err)" check earlier, so the "if (err < 0)" check that this patch removing is unnecessary. It was my overlook when making adjustments to the bpf_struct_ops_prepare_trampoline() such that the caller does not have to worry about the new page when the function returns error. Fixes: 187e2af0 ("bpf: struct_ops supports more than one page for trampolines.") Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20240315192112.2825039-1-martin.lau@linux.dev
-
- 15 Mar, 2024 4 commits
-
-
Colin Ian King authored
There are statements with two semicolons. Remove the second one; it is redundant. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240315092654.2431062-1-colin.i.king@gmail.com
-
Christophe Leroy authored
set_memory_rox() can fail, leaving memory unprotected. Check the return value and bail out when bpf_jit_binary_lock_ro() returns an error. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> # LoongArch Reviewed-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> # MIPS part Message-ID: <036b6393f23a2032ce75a1c92220b2afcb798d5d.1709850515.git.christophe.leroy@csgroup.eu> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
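A sketch of the call-site pattern this implies in a JIT, assuming bpf_jit_binary_lock_ro() now returns an error code (label and variable names are illustrative):

  if (bpf_jit_binary_lock_ro(header)) {
          /* Memory could not be protected: don't run from it. */
          bpf_jit_binary_free(header);
          prog = orig_prog;
          goto out;
  }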
-
Christophe Leroy authored
set_memory_ro() can fail, leaving memory unprotected. Check its return value and treat a failure as an error. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Message-ID: <286def78955e04382b227cb3e4b6ba272a7442e3.1709850515.git.christophe.leroy@csgroup.eu> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Andrii Nakryiko authored
Copy over main program's sleepable bit into subprog's info. This might be important for, e.g., freplace cases. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Message-ID: <20240314000127.3881569-1-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 14 Mar, 2024 6 commits
-
-
Andrii Nakryiko authored
Kui-Feng Lee says: ==================== Ignore additional fields in the struct_ops maps in an updated version.

According to an offline discussion, it would be beneficial to implement a backward-compatible method for struct_ops types with additional fields that are not present in older kernels. This patchset accepts additional fields of a struct_ops map with all zero values even if these fields are not in the corresponding type in the kernel. This provides a way to be backward compatible. User space programs can use the same map on a machine running an old kernel by clearing fields that do not exist in the kernel. For example, in a test case, it adds an additional field "zeroed" that doesn't exist in struct bpf_testmod_ops of the kernel.

  struct bpf_testmod_ops___zeroed {
          int (*test_1)(void);
          void (*test_2)(int a, int b);
          int (*test_maybe_null)(int dummy, struct task_struct *task);
          int zeroed;
  };

  SEC(".struct_ops.link")
  struct bpf_testmod_ops___zeroed testmod_zeroed = {
          .test_1 = (void *)test_1,
          .test_2 = (void *)test_2_v2,
  };

Here, it doesn't assign a value to "zeroed" of testmod_zeroed, and by default the value of this field will be zero. So, the map will be accepted by libbpf, but libbpf will skip the "zeroed" field. However, if the "zeroed" field is assigned any value other than "0", libbpf will refuse to load this map.

---

Changes from v1:
  - Fix the issue about function pointer fields.
  - Change a warning message, and add an info message for skipping fields.
  - Add a small demo of additional arguments that are not in the function pointer prototype in the kernel.

v1: https://lore.kernel.org/all/20240312183245.341141-1-thinker.li@gmail.com/

Kui-Feng Lee (3):
  libbpf: Skip zeroed or null fields if not found in the kernel type.
  selftests/bpf: Ensure libbpf skip all-zeros fields of struct_ops maps.
  selftests/bpf: Accept extra arguments if they are not used.

 tools/lib/bpf/libbpf.c                              |  24 +++-
 .../bpf/prog_tests/test_struct_ops_module.c         | 103 ++++++++++++++++++
 .../bpf/progs/struct_ops_extra_arg.c                |  49 +++++++
 .../selftests/bpf/progs/struct_ops_module.c         |  16 ++-
 4 files changed, 186 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_extra_arg.c
====================

Link: https://lore.kernel.org/r/20240313214139.685112-1-thinker.li@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Kui-Feng Lee authored
A new version of a type may have additional fields that do not exist in older versions. Previously, libbpf would reject struct_ops maps with a new version containing extra fields when running on a machine with an old kernel. However, we have updated libbpf to ignore these fields if their values are all zeros or null in order to provide backward compatibility. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240313214139.685112-3-thinker.li@gmail.com
-
Kui-Feng Lee authored
Accept additional fields of a struct_ops type with all zero values even if these fields are not in the corresponding type in the kernel. This provides a way to be backward compatible. User space programs can use the same map on a machine running an old kernel by clearing fields that do not exist in the kernel. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240313214139.685112-2-thinker.li@gmail.com
-
Quentin Monnet authored
In bpf_object_load_prog(), there's no guarantee that obj->btf is non-NULL when passing it to btf__fd(), and this function does not perform any check before dereferencing its argument (as bpf_object__btf_fd() used to do). As a consequence, we get segmentation faults in bpftool (for example) when trying to load programs that come without BTF information. v2: Keep btf__fd() in the fix instead of reverting to bpf_object__btf_fd(). Fixes: df7c3f7d ("libbpf: make uniform use of btf__fd() accessor inside libbpf") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240314150438.232462-1-qmo@kernel.org
-
Yonghong Song authored
Current 'bpftool link' command does not show pids, e.g.,

  $ tools/build/bpftool/bpftool link
  ...
  4: tracing  prog 23
          prog_type lsm  attach_type lsm_mac
          target_obj_id 1  target_btf_id 31320

Hack the following change to enable normal libbpf debug output:

  --- a/tools/bpf/bpftool/pids.c
  +++ b/tools/bpf/bpftool/pids.c
  @@ -121,9 +121,9 @@ int build_obj_refs_table(struct hashmap **map, enum bpf_obj_type type)
          /* we don't want output polluted with libbpf errors if bpf_iter is not
           * supported
           */
  -       default_print = libbpf_set_print(libbpf_print_none);
  +       /* default_print = libbpf_set_print(libbpf_print_none); */
          err = pid_iter_bpf__load(skel);
  -       libbpf_set_print(default_print);
  +       /* libbpf_set_print(default_print); */

Rerun the above bpftool command:

  $ tools/build/bpftool/bpftool link
  libbpf: prog 'iter': BPF program load failed: Permission denied
  libbpf: prog 'iter': -- BEGIN PROG LOAD LOG --
  0: R1=ctx() R10=fp0
  ; struct task_struct *task = ctx->task; @ pid_iter.bpf.c:69
  0: (79) r6 = *(u64 *)(r1 +8)          ; R1=ctx() R6_w=ptr_or_null_task_struct(id=1)
  ; struct file *file = ctx->file; @ pid_iter.bpf.c:68
  ...
  ; struct bpf_link *link = (struct bpf_link *) file->private_data; @ pid_iter.bpf.c:103
  80: (79) r3 = *(u64 *)(r8 +432)       ; R3_w=scalar() R8=ptr_file()
  ; if (link->type == bpf_core_enum_value(enum bpf_link_type___local, @ pid_iter.bpf.c:105
  81: (61) r1 = *(u32 *)(r3 +12)
  R3 invalid mem access 'scalar'
  processed 39 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2
  -- END PROG LOAD LOG --
  libbpf: prog 'iter': failed to load: -13
  ...

'file->private_data' has 'void *' type, and this caused the subsequent 'link->type' access (insn #81) to fail verification. To fix the issue, restore the previous BPF_CORE_READ so old kernels can also work. With this patch, 'bpftool link' runs successfully with 'pids':

  $ tools/build/bpftool/bpftool link
  ...
  4: tracing  prog 23
          prog_type lsm  attach_type lsm_mac
          target_obj_id 1  target_btf_id 31320
          pids systemd(1)

Fixes: 44ba7b30 ("bpftool: Use a local copy of BPF_LINK_TYPE_PERF_EVENT in pid_iter.bpf.c") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Quentin Monnet <quentin@isovalent.com> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20240312023249.3776718-1-yonghong.song@linux.dev
-
Kui-Feng Lee authored
According to a report, skeletons fail to assign shadow pointers when being compiled with C++ programs. Unlike C, which implicitly converts void pointers, C++ requires an explicit cast. To support C++, do an explicit cast for each shadow pointer. Also add struct_ops_module.skel.h to test_cpp to validate C++ compilation as part of BPF selftests. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20240312013726.1780720-1-thinker.li@gmail.com
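For illustration, the generated shadow-pointer assignment changes shape roughly like this (a sketch, not the exact skeleton output; bpf_map__initial_value() is the real libbpf accessor, struct/field names are illustrative):

  size_t size;

  /* Before (valid C, rejected by a C++ compiler): implicit conversion
   * from the void* returned by bpf_map__initial_value(). */
  obj->struct_ops.testmod_1 = bpf_map__initial_value(obj->maps.testmod_1, &size);

  /* After: an explicit cast keeps the skeleton usable from C++ too. */
  obj->struct_ops.testmod_1 = (struct bpf_testmod_ops *)
          bpf_map__initial_value(obj->maps.testmod_1, &size);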
-