- 02 Jul, 2020 2 commits
-
-
Martin KaFai Lau authored
It is common for networking tests creating its netns and making its own setting under this new netns (e.g. changing tcp sysctl). If the test forgot to restore to the original netns, it would affect the result of other tests. This patch saves the original netns at the beginning and then restores it after every test. Since the restore "setns()" is not expensive, it does it on all tests without tracking if a test has created a new netns or not. The new restore_netns() could also be done in test__end_subtest() such that each subtest will get an automatic netns reset. However, the individual test would lose flexibility to have total control on netns for its own subtests. In some cases, forcing a test to do unnecessary netns re-configure for each subtest is time consuming. e.g. In my vm, forcing netns re-configure on each subtest in sk_assign.c increased the runtime from 1s to 8s. On top of that, test_progs.c is also doing per-test (instead of per-subtest) cleanup for cgroup. Thus, this patch also does per-test restore_netns(). The only existing per-subtest cleanup is reset_affinity() and no test is depending on this. Thus, it is removed from test__end_subtest() to give a consistent expectation to the individual tests. test_progs.c only ensures any affinity/netns/cgroup change made by an earlier test does not affect the following tests. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200702004858.2103728-1-kafai@fb.com
-
Martin KaFai Lau authored
This patch makes a few changes to the network_helpers.c 1) Enforce SO_RCVTIMEO and SO_SNDTIMEO This patch enforces timeout to the network fds through setsockopt SO_RCVTIMEO and SO_SNDTIMEO. It will remove the need for SOCK_NONBLOCK that requires a more demanding timeout logic with epoll/select, e.g. epoll_create, epoll_ctrl, and then epoll_wait for timeout. That removes the need for connect_wait() from the cgroup_skb_sk_lookup.c. The needed change is made in cgroup_skb_sk_lookup.c. 2) start_server(): Add optional addr_str and port to start_server(). That removes the need of the start_server_with_port(). The caller can pass addr_str==NULL and/or port==0. I have a future tcp-hdr-opt test that will pass a non-NULL addr_str and it is in general useful for other future tests. "int timeout_ms" is also added to control the timeout on the "accept(listen_fd)". 3) connect_to_fd(): Fully use the server_fd. The server sock address has already been obtained from getsockname(server_fd). The sockaddr includes the family, so the "int family" arg is redundant. Since the server address is obtained from server_fd, there is little reason not to get the server's socket type from the server_fd also. getsockopt(server_fd) can be used to do that, so "int type" arg is also removed. "int timeout_ms" is added. 4) connect_fd_to_fd(): "int timeout_ms" is added. Some code is also refactored to connect_fd_to_addr() which is shared with connect_to_fd(). 5) Preserve errno: Some callers need to check errno, e.g. cgroup_skb_sk_lookup.c. Make changes to do it more consistently in save_errno_close() and log_err(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200702004852.2103003-1-kafai@fb.com
-
- 01 Jul, 2020 15 commits
-
-
Alexei Starovoitov authored
Jesper Dangaard Brouer says: ==================== V3: Reorder patches to cause less code churn. The BPF selftest 'test_progs' contains many tests, that cover all the different areas of the kernel where BPF is used. The CI system sees this as one test, which is impractical for identifying what team/engineer is responsible for debugging the problem. This patchset add some options that makes it easier to create a shell for-loop that invoke each (top-level) test avail in test_progs. Then each test FAIL/PASS result can be presented the CI system to have a separate bullet. (For Red Hat use-case in Beaker https://beaker-project.org/) Created a public script[1] that uses these features in an advanced way. Demonstrating howto reduce the number of (top-level) tests by grouping tests together via using the existing test pattern selection feature, and then using the new --list feature combined with exclude (-b) to get a list of remaining test names that was not part of the groups. [1] https://github.com/netoptimizer/prototype-kernel/blob/master/scripts/bpf_selftests_grouping.sh ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jesper Dangaard Brouer authored
The program test_progs have some very useful ability to specify a list of test name substrings for selecting which tests to run. This patch add the ability to list the selected test names without running them. This is practical for seeing which tests gets selected with given select arguments (which can also contain a exclude list via --name-blacklist). This output can also be used by shell-scripts in a for-loop: for N in $(./test_progs --list -t xdp); do \ ./test_progs -t $N 2>&1 > result_test_${N}.log & \ done ; wait This features can also be used for looking up a test number and returning a testname. If the selection was empty then a shell EXIT_FAILURE is returned. This is useful for scripting. e.g. like this: n=1; while [ $(./test_progs --list -n $n) ] ; do \ ./test_progs -n $n ; n=$(( n+1 )); \ done Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159363985751.930467.9610992940793316982.stgit@firesoul
-
Jesper Dangaard Brouer authored
It can be practial to get the number of tests that test_progs contain. This could for example be used to create a shell for-loop construct that runs the individual tests. Like: for N in $(seq 1 $(./test_progs -c)); do ./test_progs -n $N 2>&1 > result_test_${N}.log & done ; wait V2: Add the ability to return the count for the selected tests. This is useful for getting a count e.g. after excluding some tests with option -b. The current beakers test script like to report the max test count upfront. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159363985244.930467.12617117873058936829.stgit@firesoul
-
Jesper Dangaard Brouer authored
When a user selects a non-existing test the summary is printed with indication 0 for all info types, and shell "success" (EXIT_SUCCESS) is indicated. This can be understood by a human end-user, but for shell scripting is it useful to indicate a shell failure (EXIT_FAILURE). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159363984736.930467.17956007131403952343.stgit@firesoul
-
Andrii Nakryiko authored
Turn off -Wnested-externs to avoid annoying warnings in BUILD_BUG_ON macro when compiling bpftool: In file included from /data/users/andriin/linux/tools/include/linux/build_bug.h:5, from /data/users/andriin/linux/tools/include/linux/kernel.h:8, from /data/users/andriin/linux/kernel/bpf/disasm.h:10, from /data/users/andriin/linux/kernel/bpf/disasm.c:8: /data/users/andriin/linux/kernel/bpf/disasm.c: In function ‘__func_get_name’: /data/users/andriin/linux/tools/include/linux/compiler.h:37:38: warning: nested extern declaration of ‘__compiletime_assert_0’ [-Wnested-externs] _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~~~~~~~~~~~~~ /data/users/andriin/linux/tools/include/linux/compiler.h:16:15: note: in definition of macro ‘__compiletime_assert’ extern void prefix ## suffix(void) __compiletime_error(msg); \ ^~~~~~ /data/users/andriin/linux/tools/include/linux/compiler.h:37:2: note: in expansion of macro ‘_compiletime_assert’ _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^~~~~~~~~~~~~~~~~~~ /data/users/andriin/linux/tools/include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’ #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^~~~~~~~~~~~~~~~~~ /data/users/andriin/linux/tools/include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’ BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^~~~~~~~~~~~~~~~ /data/users/andriin/linux/kernel/bpf/disasm.c:20:2: note: in expansion of macro ‘BUILD_BUG_ON’ BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID); ^~~~~~~~~~~~ Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200701212816.2072340-1-andriin@fb.com
-
Hao Luo authored
The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing programs. But in a kernel built by clang, which performs more aggresive inlining, that function gets inlined into its caller SyS_nanosleep. Therefore, even though fentry and kprobe do hook on the function, they aren't triggered by the call to nanosleep in the test. A possible fix is switching to use a function that is less likely to be inlined, such as hrtimer_range_start_ns. The EXPORT_SYMBOL functions shouldn't be inlined based on the description of [1], therefore safe to use for this test. Also the arguments of this function include the duration of sleep, therefore suitable for test verification. [1] af3b5628 time: don't inline EXPORT_SYMBOL functions Tested: In a clang build kernel, before this change, the test fails: test_vmlinux:PASS:skel_open 0 nsec test_vmlinux:PASS:skel_attach 0 nsec test_vmlinux:PASS:tp 0 nsec test_vmlinux:PASS:raw_tp 0 nsec test_vmlinux:PASS:tp_btf 0 nsec test_vmlinux:FAIL:kprobe not called test_vmlinux:FAIL:fentry not called After switching to hrtimer_range_start_ns, the test passes: test_vmlinux:PASS:skel_open 0 nsec test_vmlinux:PASS:skel_attach 0 nsec test_vmlinux:PASS:tp 0 nsec test_vmlinux:PASS:raw_tp 0 nsec test_vmlinux:PASS:tp_btf 0 nsec test_vmlinux:PASS:kprobe 0 nsec test_vmlinux:PASS:fentry 0 nsec Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200701175315.1161242-1-haoluo@google.com
-
Randy Dunlap authored
Fix build errors when CONFIG_INET is not set/enabled. (.text+0x2b1b): undefined reference to `tcp_prot' (.text+0x2b3b): undefined reference to `tcp_prot' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/b1a858ec-7e04-56bc-248a-62cb9bbee726@infradead.org
-
Alexei Starovoitov authored
Song Liu says: ==================== This set introduces a new helper bpf_get_task_stack(). The primary use case is to dump all /proc/*/stack to seq_file via bpf_iter__task. A few different approaches have been explored and compared: 1. A simple wrapper around stack_trace_save_tsk(), as v1 [1]. This approach introduces new syntax, which is different to existing helper bpf_get_stack(). Therefore, this is not ideal. 2. Extend get_perf_callchain() to support "task" as argument. This approach reuses most of bpf_get_stack(). However, extending get_perf_callchain() requires non-trivial changes to architecture specific code. Which is error prone. 3. Current (v2) approach, leverages most of existing bpf_get_stack(), and uses stack_trace_save_tsk() to handle architecture specific logic. [1] https://lore.kernel.org/netdev/20200623070802.2310018-1-songliubraving@fb.com/ Changes v4 => v5: 1. Rebase and work around git-am issue. (Alexei) 2. Update commit log for 4/4. (Yonghong) Changes v3 => v4: 1. Simplify the selftests with bpf_iter.h. (Yonghong) 2. Add example output to commit log of 4/4. (Yonghong) Changes v2 => v3: 1. Rebase on top of bpf-next. (Yonghong) 2. Sanitize get_callchain_entry(). (Peter) 3. Use has_callchain_buf for bpf_get_task_stack. (Andrii) 4. Other small clean up. (Yonghong, Andrii). Changes v1 => v2: 1. Reuse most of bpf_get_stack() logic. (Andrii) 2. Fix unsigned long vs. u64 mismatch for 32-bit systems. (Yonghong) 3. Add %pB support in bpf_trace_printk(). (Daniel) 4. Fix buffer size to bytes. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Song Liu authored
The new test is similar to other bpf_iter tests. It dumps all /proc/<pid>/stack to a seq_file. Here is some example output: pid: 2873 num_entries: 3 [<0>] worker_thread+0xc6/0x380 [<0>] kthread+0x135/0x150 [<0>] ret_from_fork+0x22/0x30 pid: 2874 num_entries: 9 [<0>] __bpf_get_stack+0x15e/0x250 [<0>] bpf_prog_22a400774977bb30_dump_task_stack+0x4a/0xb3c [<0>] bpf_iter_run_prog+0x81/0x170 [<0>] __task_seq_show+0x58/0x80 [<0>] bpf_seq_read+0x1c3/0x3b0 [<0>] vfs_read+0x9e/0x170 [<0>] ksys_read+0xa7/0xe0 [<0>] do_syscall_64+0x4c/0xa0 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Note: bpf_iter test as-is doesn't print the contents of the seq_file. To see the example above, it is necessary to add printf() to do_dummy_read. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-5-songliubraving@fb.com
-
Song Liu authored
This makes it easy to dump stack trace in text. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-4-songliubraving@fb.com
-
Song Liu authored
Introduce helper bpf_get_task_stack(), which dumps stack trace of given task. This is different to bpf_get_stack(), which gets stack track of current task. One potential use case of bpf_get_task_stack() is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file. bpf_get_task_stack() uses stack_trace_save_tsk() instead of get_perf_callchain() for kernel stack. The benefit of this choice is that stack_trace_save_tsk() doesn't require changes in arch/. The downside of using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the stack trace to unsigned long array. For 32-bit systems, we need to translate it to u64 array. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
-
Song Liu authored
Sanitize and expose get/put_callchain_entry(). This would be used by bpf stack map. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-2-songliubraving@fb.com
-
Alexei Starovoitov authored
bpf_free_used_maps() or close(map_fd) will trigger map_free callback. bpf_free_used_maps() is called after bpf prog is no longer executing: bpf_prog_put->call_rcu->bpf_prog_free->bpf_free_used_maps. Hence there is no need to call synchronize_rcu() to protect map elements. Note that hash_of_maps and array_of_maps update/delete inner maps via sys_bpf() that calls maybe_wait_bpf_programs() and synchronize_rcu(). Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/bpf/20200630043343.53195-2-alexei.starovoitov@gmail.com
-
Andrii Nakryiko authored
Add simple selftest validating byte swap built-ins and compile-time macros. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630152125.3631920-3-andriin@fb.com
-
Andrii Nakryiko authored
Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from users wanting to use bpf_endian.h in their BPF applications using CO-RE and vmlinux.h. To achieve that, re-implement byte swap macros and drop all the header includes. This way it can be used both with linux header includes, as well as with a vmlinux.h. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com
-
- 30 Jun, 2020 2 commits
-
-
Andrii Nakryiko authored
Similarly to bpftool Makefile, allow to specify custom location of vmlinux.h to be used during the build. This allows simpler testing setups with checked-in pre-generated vmlinux.h. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630004759.521530-2-andriin@fb.com
-
Andrii Nakryiko authored
In some build contexts (e.g., Travis CI build for outdated kernel), vmlinux.h, generated from available kernel, doesn't contain all the types necessary for BPF program compilation. For such set up, the most maintainable way to deal with this problem is to keep pre-generated (almost up-to-date) vmlinux.h checked in and use it for compilation purposes. bpftool after that can deal with kernel missing some of the features in runtime with no problems. To that effect, allow to specify path to custom vmlinux.h to bpftool's Makefile with VMLINUX_H variable. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630004759.521530-1-andriin@fb.com
-
- 28 Jun, 2020 3 commits
-
-
Alexei Starovoitov authored
Andrii Nakryiko says: ==================== Add ability to turn off default auto-loading of each BPF program by libbpf on BPF object load. This is the feature that allows BPF applications to have optional functionality, which is only excercised on kernel that support necessary features, while falling back to reduced/less performant functionality, if kernel is outdated. ==================== Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Andrii Nakryiko authored
Validate that BPF object with broken (in multiple ways) BPF program can still be successfully loaded, if that broken BPF program is disabled. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200625232629.3444003-3-andriin@fb.com
-
Andrii Nakryiko authored
Currently, bpf_object__load() (and by induction skeleton's load), will always attempt to prepare, relocate, and load into kernel every single BPF program found inside the BPF object file. This is often convenient and the right thing to do and what users expect. But there are plenty of cases (especially with BPF development constantly picking up the pace), where BPF application is intended to work with old kernels, with potentially reduced set of features. But on kernels supporting extra features, it would like to take a full advantage of them, by employing extra BPF program. This could be a choice of using fentry/fexit over kprobe/kretprobe, if kernel is recent enough and is built with BTF. Or BPF program might be providing optimized bpf_iter-based solution that user-space might want to use, whenever available. And so on. With libbpf and BPF CO-RE in particular, it's advantageous to not have to maintain two separate BPF object files to achieve this. So to enable such use cases, this patch adds ability to request not auto-loading chosen BPF programs. In such case, libbpf won't attempt to perform relocations (which might fail due to old kernel), won't try to resolve BTF types for BTF-aware (tp_btf/fentry/fexit/etc) program types, because BTF might not be present, and so on. Skeleton will also automatically skip auto-attachment step for such not loaded BPF programs. Overall, this feature allows to simplify development and deployment of real-world BPF applications with complicated compatibility requirements. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200625232629.3444003-2-andriin@fb.com
-
- 25 Jun, 2020 18 commits
-
-
Tobias Klauser authored
Define attach_type_name in common.c instead of main.h so it is only defined once. This leads to a slight decrease in the binary size of bpftool. Before: text data bss dec hex filename 399024 11168 1573160 1983352 1e4378 bpftool After: text data bss dec hex filename 398256 10880 1573160 1982296 1e3f58 bpftool Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200624143154.13145-1-tklauser@distanz.ch
-
Tobias Klauser authored
Define prog_type_name in prog.c instead of main.h so it is only defined once. This leads to a slight decrease in the binary size of bpftool. Before: text data bss dec hex filename 401032 11936 1573160 1986128 1e4e50 bpftool After: text data bss dec hex filename 399024 11168 1573160 1983352 1e4378 bpftool Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200624143124.12914-1-tklauser@distanz.ch
-
Alexei Starovoitov authored
Yonghong Song says: ==================== bpf iterator implments traversal of kernel data structures and these data structures are passed to a bpf program for processing. This gives great flexibility for users to examine kernel data structure without using e.g. /proc/net which has limited and fixed format. Commit 138d0be3 ("net: bpf: Add netlink and ipv6_route bpf_iter targets") implemented bpf iterators for netlink and ipv6_route. This patch set intends to implement bpf iterators for tcp and udp. Currently, /proc/net/tcp is used to print tcp4 stats and /proc/net/tcp6 is used to print tcp6 stats. /proc/net/udp[6] have similar usage model. In contrast, only one tcp iterator is implemented and it is bpf program resposibility to filter based on socket family. The same is for udp. This will avoid another unnecessary traversal pass if users want to check both tcp4 and tcp6. Several helpers are also implemented in this patch bpf_skc_to_{tcp, tcp6, tcp_timewait, tcp_request, udp6}_sock The argument for these helpers is not a fixed btf_id. For example, bpf_skc_to_tcp(struct sock_common *), or bpf_skc_to_tcp(struct sock *), or bpf_skc_to_tcp(struct inet_sock *), ... are all valid. At runtime, the helper will check whether pointer cast is legal or not. Please see Patch #5 for details. Since btf_id's for both arguments and return value are known at build time, the btf_id's are pre-computed once vmlinux btf becomes valid. Jiri's "adding d_path helper" patch set https://lore.kernel.org/bpf/20200616100512.2168860-1-jolsa@kernel.org/T/ provides a way to pre-compute btf id during vmlinux build time. This can be applied here as well. A followup patch can convert to build time btf id computation after Jiri's patch landed. Changelogs: v4 -> v5: - fix bpf_skc_to_udp6_sock helper as besides sk_protocol, sk_family, sk_type == SOCK_DGRAM is also needed to differentiate from SOCK_RAW (Eric) v3 -> v4: - fix bpf_skc_to_{tcp_timewait, tcp_request}_sock helper implementation as just checking sk->sk_state is not enough (Martin) - fix a few kernel test robot reported failures - move bpf_tracing_net.h from libbpf to selftests (Andrii) - remove __weak attribute from selftests CONFIG_HZ variables (Andrii) v2 -> v3: - change sock_cast*/SOCK_CAST* names to btf_sock* names for generality (Martin) - change gpl_license to false (Martin) - fix helper to cast to tcp timewait/request socket. (Martin) v1 -> v2: - guard init_sock_cast_types() defination properly with CONFIG_NET (Martin) - reuse the btf_ids, computed for new helper argument, for return values (Martin) - using BTF_TYPE_EMIT to express intent of btf type generation (Andrii) - abstract out common net macros into bpf_tracing_net.h (Andrii) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Added tcp{4,6} and udp{4,6} bpf programs into test_progs selftest so that they at least can load successfully. $ ./test_progs -n 3 ... #3/7 tcp4:OK #3/8 tcp6:OK #3/9 udp4:OK #3/10 udp6:OK ... #3 bpf_iter:OK Summary: 1/16 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230823.3989372-1-yhs@fb.com
-
Yonghong Song authored
On my VM, I got identical results between /proc/net/udp[6] and the udp{4,6} bpf iterator. For udp6: $ cat /sys/fs/bpf/p1 sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops 1405: 000080FE00000000FF7CC4D0D9EFE4FE:0222 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000 193 0 19183 2 0000000029eab111 0 $ cat /proc/net/udp6 sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops 1405: 000080FE00000000FF7CC4D0D9EFE4FE:0222 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000 193 0 19183 2 0000000029eab111 0 For udp4: $ cat /sys/fs/bpf/p4 sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops 2007: 00000000:1F90 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 72540 2 000000004ede477a 0 $ cat /proc/net/udp sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops 2007: 00000000:1F90 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 72540 2 000000004ede477a 0 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230822.3989299-1-yhs@fb.com
-
Yonghong Song authored
In my VM, I got identical result compared to /proc/net/{tcp,tcp6}. For tcp6: $ cat /proc/net/tcp6 sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode 0: 00000000000000000000000000000000:0016 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000001 00000000 0 0 17955 1 000000003eb3102e 100 0 0 10 0 $ cat /sys/fs/bpf/p1 sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode 0: 00000000000000000000000000000000:0016 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 17955 1 000000003eb3102e 100 0 0 10 0 For tcp: $ cat /proc/net/tcp sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode 0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 2666 1 000000007152e43f 100 0 0 10 0 $ cat /sys/fs/bpf/p2 sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode 1: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 2666 1 000000007152e43f 100 0 0 10 0 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230820.3989165-1-yhs@fb.com
-
Yonghong Song authored
These newly added macros will be used in subsequent bpf iterator tcp{4,6} and udp{4,6} programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230819.3989050-1-yhs@fb.com
-
Yonghong Song authored
Refactor bpf_iter_ipv6_route.c and bpf_iter_netlink.c so net macros, originally from various include/linux header files, are moved to a new header file bpf_tracing_net.h. The goal is to improve reuse so networking tracing programs do not need to copy these macros every time they use them. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230817.3988962-1-yhs@fb.com
-
Yonghong Song authored
Commit b9f4c01f ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h") and Commit dda18a5c ("selftests/bpf: Convert bpf_iter_test_kern{3, 4}.c to define own bpf_iter_meta") redefined newly introduced types in bpf programs so the bpf program can still compile properly with old kernels although loading may fail. Since this patch set introduced new types and the same workaround is needed, so let us move the workaround to a separate header file so they do not clutter bpf programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230816.3988656-1-yhs@fb.com
-
Yonghong Song authored
The helper is used in tracing programs to cast a socket pointer to a udp6_sock pointer. The return value could be NULL if the casting is illegal. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com
-
Yonghong Song authored
The bpf iterator for udp is implemented. Both udp4 and udp6 sockets will be traversed. It is up to bpf program to filter for udp4 or udp6 only, or both families of sockets. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230813.3988404-1-yhs@fb.com
-
Yonghong Song authored
Similar to tcp_iter_state, a new field bpf_seq_afinfo is added to udp_iter_state to provide bpf udp iterator afinfo. This does not change /proc/net/{udp, udp6} behavior. But it enables bpf iterator to avoid get afinfo from PDE_DATA and iterate through all udp and udp6 sockets in one pass. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230812.3988347-1-yhs@fb.com
-
Yonghong Song authored
Three more helpers are added to cast a sock_common pointer to an tcp_sock, tcp_timewait_sock or a tcp_request_sock for tracing programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com
-
Yonghong Song authored
The helper is used in tracing programs to cast a socket pointer to a tcp6_sock pointer. The return value could be NULL if the casting is illegal. A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added so the verifier is able to deduce proper return types for the helper. Different from the previous BTF_ID based helpers, the bpf_skc_to_tcp6_sock() argument can be several possible btf_ids. More specifically, all possible socket data structures with sock_common appearing in the first in the memory layout. This patch only added socket types related to tcp and udp. All possible argument btf_id and return value btf_id for helper bpf_skc_to_tcp6_sock() are pre-calculcated and cached. In the future, it is even possible to precompute these btf_id's at kernel build time. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
-
Yonghong Song authored
/proc/net/tcp{4,6} uses jiffies for various computations. Let us add bpf_jiffies64() helper to tracing program so bpf_iter and other programs can use it. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230808.3988073-1-yhs@fb.com
-
Yonghong Song authored
'X' tells kernel to print hex with upper case letters. /proc/net/tcp{4,6} seq_file show() used this, and supports it in bpf_seq_printf() helper too. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230807.3988014-1-yhs@fb.com
-
Yonghong Song authored
The bpf iterator for tcp is implemented. Both tcp4 and tcp6 sockets will be traversed. It is up to bpf program to filter for tcp4 or tcp6 only, or both families of sockets. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230805.3987959-1-yhs@fb.com
-
Yonghong Song authored
A new field bpf_seq_afinfo is added to tcp_iter_state to provide bpf tcp iterator afinfo. There are two reasons on why we did this. First, the current way to get afinfo from PDE_DATA does not work for bpf iterator as its seq_file inode does not conform to /proc/net/{tcp,tcp6} inode structures. More specifically, anonymous bpf iterator will use an anonymous inode which is shared in the system and we cannot change inode private data structure at all. Second, bpf iterator for tcp/tcp6 wants to traverse all tcp and tcp6 sockets in one pass and bpf program can control whether they want to skip one sk_family or not. Having a different afinfo with family AF_UNSPEC make it easier to understand in the code. This patch does not change /proc/net/{tcp,tcp6} behavior as the bpf_seq_afinfo will be NULL for these two proc files. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230804.3987829-1-yhs@fb.com
-