- 25 Aug, 2022 1 commit
-
-
Kumar Kartikeya Dwivedi authored
Currently, verifier verifies callback functions (sync and async) as if they will be executed once, (i.e. it explores execution state as if the function was being called once). The next insn to explore is set to start of subprog and the exit from nested frame is handled using curframe > 0 and prepare_func_exit. In case of async callback it uses a customized variant of push_stack simulating a kind of branch to set up custom state and execution context for the async callback. While this approach is simple and works when callback really will be executed only once, it is unsafe for all of our current helpers which are for_each style, i.e. they execute the callback multiple times. A callback releasing acquired references of the caller may do so multiple times, but currently verifier sees it as one call inside the frame, which then returns to caller. Hence, it thinks it released some reference that the cb e.g. got access through callback_ctx (register filled inside cb from spilled typed register on stack). Similarly, it may see that an acquire call is unpaired inside the callback, so the caller will copy the reference state of callback and then will have to release the register with new ref_obj_ids. But again, the callback may execute multiple times, but the verifier will only account for acquired references for a single symbolic execution of the callback, which will cause leaks. Note that for async callback case, things are different. While currently we have bpf_timer_set_callback which only executes it once, even for multiple executions it would be safe, as reference state is NULL and check_reference_leak would force program to release state before BPF_EXIT. The state is also unaffected by analysis for the caller frame. Hence async callback is safe. Since we want the reference state to be accessible, e.g. for pointers loaded from stack through callback_ctx's PTR_TO_STACK, we still have to copy caller's reference_state to callback's bpf_func_state, but we enforce that whatever references it adds to that reference_state has been released before it hits BPF_EXIT. This requires introducing a new callback_ref member in the reference state to distinguish between caller vs callee references. Hence, check_reference_leak now errors out if it sees we are in callback_fn and we have not released callback_ref refs. Since there can be multiple nested callbacks, like frame 0 -> cb1 -> cb2 etc. we need to also distinguish between whether this particular ref belongs to this callback frame or parent, and only error for our own, so we store state->frameno (which is always non-zero for callbacks). In short, callbacks can read parent reference_state, but cannot mutate it, to be able to use pointers acquired by the caller. They must only undo their changes (by releasing their own acquired_refs before BPF_EXIT) on top of caller reference_state before returning (at which point the caller and callback state will match anyway, so no need to copy it back to caller). Fixes: 69c087ba ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220823013125.24938-1-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 23 Aug, 2022 13 commits
-
-
Kumar Kartikeya Dwivedi authored
They would require func_info which needs prog BTF anyway. Loading BTF and setting the prog btf_fd while loading the prog indirectly requires CAP_BPF, so just to reduce confusion, move both these helpers taking callback under bpf_capable() protection as well, since they cannot be used without CAP_BPF. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220823013117.24916-1-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Stanislav Fomichev says: ==================== Apparently, only a small subset of cgroup hooks actually falls back to cgroup_base_func_proto. This leads to unexpected result where not all cgroup helpers have bpf_{g,s}et_retval. It's getting harder and harder to manage which helpers are exported to which hooks. We now have the following call chains: - cg_skb_func_proto - sk_filter_func_proto - bpf_sk_base_func_proto - bpf_base_func_proto So by looking at cg_skb_func_proto it's pretty hard to understand what's going on. For cgroup helpers, I'm proposing we do the following instead: func_proto = cgroup_common_func_proto(); if (func_proto) return func_proto; /* optional, if hook has 'current' */ func_proto = cgroup_current_func_proto(); if (func_proto) return func_proto; ... switch (func_id) { /* hook specific helpers */ case BPF_FUNC_hook_specific_helper: return &xyz; default: /* always fall back to plain bpf_base_func_proto */ bpf_base_func_proto(func_id); } If this turns out more workable, we can follow up with converting the rest to the same pattern. v5: - remove net/cls_cgroup.h include from patch 1/5 (Martin) - move endif changes from patch 1/5 to 3/5 (Martin) - don't define __weak protos, the ones in core.c suffice (Martin) v4: - don't touch existing helper.c helpers (Martin) - drop unneeded CONFIG_CGROUP_BPF in bpf_lsm_func_proto (Martin) v3: - expose strtol/strtoul everywhere (Martin) - move helpers declarations from bpf.h to bpf-cgroup.h (Martin) - revise bpf_{g,s}et_retval documentation (Martin) - don't expose bpf_{g,s}et_retval to cg_skb hooks (Martin) v2: - move everything into kernel/bpf/cgroup.c instead (Martin) - use cgroup_common_func_proto in lsm (Martin) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Stanislav Fomichev authored
For each hook, have a simple bpf_set_retval(bpf_get_retval) program and make sure it loads for the hooks we want. The exceptions are the hooks which don't propagate the error to the callers: - sockops - recvmsg - getpeername - getsockname - cg_skb ingress and egress Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220823222555.523590-6-sdf@google.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Stanislav Fomichev authored
* replace 'syscall' with 'upper layers', still mention that it's being exported via syscall errno * describe what happens in set_retval(-EPERM) + return 1 * describe what happens with bind's 'return 3' Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220823222555.523590-5-sdf@google.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Stanislav Fomichev authored
bpf_strncmp is already exposed everywhere. The motivation is to keep those helpers in kernel/bpf/helpers.c. Otherwise it's tempting to move them under kernel/bpf/cgroup.c because they are currently only used by sysctl prog types. Suggested-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220823222555.523590-4-sdf@google.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Stanislav Fomichev authored
The following hooks are per-cgroup hooks but they are not using cgroup_{common,current}_func_proto, fix it: * BPF_PROG_TYPE_CGROUP_SKB (cg_skb) * BPF_PROG_TYPE_CGROUP_SOCK_ADDR (cg_sock_addr) * BPF_PROG_TYPE_CGROUP_SOCK (cg_sock) * BPF_PROG_TYPE_LSM+BPF_LSM_CGROUP Also: * move common func_proto's into cgroup func_proto handlers * make sure bpf_{g,s}et_retval are not accessible from recvmsg, getpeername and getsockname (return/errno is ignored in these places) * as a side effect, expose get_current_pid_tgid, get_current_comm_proto, get_current_ancestor_cgroup_id, get_cgroup_classid to more cgroup hooks Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220823222555.523590-3-sdf@google.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Stanislav Fomichev authored
Split cgroup_base_func_proto into the following: * cgroup_common_func_proto - common helpers for all cgroup hooks * cgroup_current_func_proto - common helpers for all cgroup hooks running in the process context (== have meaningful 'current'). Move bpf_{g,s}et_retval and other cgroup-related helpers into kernel/bpf/cgroup.c so they closer to where they are being used. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220823222555.523590-2-sdf@google.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
The bpf-helpers(7) manual page shipped in the man-pages project is generated from the documentation contained in the BPF UAPI header, in the Linux repository, parsed by script/bpf_doc.py and then fed to rst2man. The man page should contain the date of last modification of the documentation. This commit adds the relevant date when generating the page. Before: $ ./scripts/bpf_doc.py helpers | rst2man | grep '\.TH' .TH BPF-HELPERS 7 "" "Linux v5.19-14022-g30d2a4d74e11" "" After: $ ./scripts/bpf_doc.py helpers | rst2man | grep '\.TH' .TH BPF-HELPERS 7 "2022-08-15" "Linux v5.19-14022-g30d2a4d74e11" "" We get the version by using "git log" to look for the commit date of the latest change to the section of the BPF header containing the documentation. If the command fails, we just skip the date field. and keep generating the page. Reported-by: Alejandro Colomar <alx.manpages@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com> Link: https://lore.kernel.org/bpf/20220823155327.98888-2-quentin@isovalent.com
-
Quentin Monnet authored
The bpf-helpers(7) manual page shipped in the man-pages project is generated from the documentation contained in the BPF UAPI header, in the Linux repository, parsed by script/bpf_doc.py and then fed to rst2man. After a recent update of that page [0], Alejandro reported that the linter used to validate the man pages complains about the generated document [1]. The header for the page is supposed to contain some attributes that we do not set correctly with the script. This commit updates the "project and version" field. We discussed the format of those fields in [1] and [2]. Before: $ ./scripts/bpf_doc.py helpers | rst2man | grep '\.TH' .TH BPF-HELPERS 7 "" "" "" After: $ ./scripts/bpf_doc.py helpers | rst2man | grep '\.TH' .TH BPF-HELPERS 7 "" "Linux v5.19-14022-g30d2a4d74e11" "" We get the version from "git describe", but if unavailable, we fall back on "make kernelversion". If none works, for example because neither git nore make are installed, we just set the field to "Linux" and keep generating the page. [0] https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/man7/bpf-helpers.7?id=19c7f78393f2b038e76099f87335ddf43a87f039 [1] https://lore.kernel.org/all/20220823084719.13613-1-quentin@isovalent.com/t/#m58a418a318642c6428e14ce9bb84eba5183b06e8 [2] https://lore.kernel.org/all/20220721110821.8240-1-alx.manpages@gmail.com/t/#m8e689a822e03f6e2530a0d6de9d128401916c5deReported-by: Alejandro Colomar <alx.manpages@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com> Link: https://lore.kernel.org/bpf/20220823155327.98888-1-quentin@isovalent.com
-
Shmulik Ladkani authored
The dissector program returns BPF_FLOW_DISSECTOR_CONTINUE (and avoids setting skb->flow_keys or last_dissection map) in case it encounters IP packets whose (outer) source address is 127.0.0.127. Additional test is added to prog_tests/flow_dissector.c which sets this address as test's pkk.iph.saddr, with the expected retval of BPF_FLOW_DISSECTOR_CONTINUE. Also, legacy test_flow_dissector.sh was similarly augmented. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20220821113519.116765-5-shmulik.ladkani@gmail.com
-
Shmulik Ladkani authored
Formerly, a boolean denoting whether bpf_flow_dissect returned BPF_OK was set into 'bpf_attr.test.retval'. Augment this, so users can check the actual return code of the dissector program under test. Existing prog_tests/flow_dissector*.c tests were correspondingly changed to check against each test's expected retval. Also, tests' resulting 'flow_keys' are verified only in case the expected retval is BPF_OK. This allows adding new tests that expect non BPF_OK. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20220821113519.116765-4-shmulik.ladkani@gmail.com
-
Shmulik Ladkani authored
Currently, attaching BPF_PROG_TYPE_FLOW_DISSECTOR programs completely replaces the flow-dissector logic with custom dissection logic. This forces implementors to write programs that handle dissection for any flows expected in the namespace. It makes sense for flow-dissector BPF programs to just augment the dissector with custom logic (e.g. dissecting certain flows or custom protocols), while enjoying the broad capabilities of the standard dissector for any other traffic. Introduce BPF_FLOW_DISSECTOR_CONTINUE retcode. Flow-dissector BPF programs may return this to indicate no dissection was made, and fallback to the standard dissector is requested. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20220821113519.116765-3-shmulik.ladkani@gmail.com
-
Shmulik Ladkani authored
Let 'bpf_flow_dissect' callers know the BPF program's retcode and act accordingly. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20220821113519.116765-2-shmulik.ladkani@gmail.com
-
- 19 Aug, 2022 18 commits
-
-
Martin KaFai Lau authored
Trampoline is not supported in s390. Fixes: 31123c03 ("selftests/bpf: bpf_setsockopt tests") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220819192155.91713-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Colin Ian King authored
There is a spelling mistake in an ASSERT_OK literal string. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Mykola Lysenko <mykolal@fb.com> Link: https://lore.kernel.org/r/20220817213242.101277-1-colin.i.king@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Martin KaFai Lau says: ==================== The code in bpf_setsockopt() is mostly a copy-and-paste from the sock_setsockopt(), do_tcp_setsockopt(), do_ipv6_setsockopt(), and do_ip_setsockopt(). As the allowed optnames in bpf_setsockopt() grows, so are the duplicated code. The code between the copies also slowly drifted. This set is an effort to clean this up and reuse the existing {sock,do_tcp,do_ipv6,do_ip}_setsockopt() as much as possible. After the clean up, this set also adds a few allowed optnames that we need to the bpf_setsockopt(). The initial attempt was to clean up both bpf_setsockopt() and bpf_getsockopt() together. However, the patch set was getting too long. It is beneficial to leave the bpf_getsockopt() out for another patch set. Thus, this set is focusing on the bpf_setsockopt(). v4: - This set now depends on the commit f574f7f8 ("net: bpf: Use the protocol's set_rcvlowat behavior if there is one") in the net-next tree. The commit calls a specific protocol's set_rcvlowat and it changed the bpf_setsockopt which this set has also changed. Because of this, patch 9 of this set has also adjusted and a 'sock' NULL check is added to the sk_setsockopt() because some of the bpf hooks have a NULL sk->sk_socket. This removes more dup code from the bpf_setsockopt() side. - Avoid mentioning specific prog types in the comment of the has_current_bpf_ctx(). (Andrii) - Replace signed with unsigned int bitfield in the patch 15 selftest. (Daniel) v3: - s/in_bpf/has_current_bpf_ctx/ (Andrii) - Add comment to has_current_bpf_ctx() and sockopt_lock_sock() (Stanislav) - Use vmlinux.h in selftest and add defines to bpf_tracing_net.h (Stanislav) - Use bpf_getsockopt(SO_MARK) in selftest (Stanislav) - Use BPF_CORE_READ_BITFIELD in selftest (Yonghong) v2: - A major change is to use in_bpf() to test if a setsockopt() is called by a bpf prog and use in_bpf() to skip capable check. Suggested by Stanislav. - Instead of passing is_locked through sockptr_t or through an extra argument to sk_setsockopt, v2 uses in_bpf() to skip the lock_sock() also because bpf prog has the lock acquired. - No change to the current sockptr_t in this revision - s/codes/code/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch adds tests to exercise optnames that are allowed in bpf_setsockopt(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061847.4182339-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch adds a few optnames for bpf_setsockopt: SO_REUSEADDR, IPV6_AUTOFLOWLABEL, TCP_MAXSEG, TCP_NODELAY, and TCP_THIN_LINEAR_TIMEOUTS. Thanks to the previous patches of this set, all additions can reuse the sk_setsockopt(), do_ipv6_setsockopt(), and do_tcp_setsockopt(). The only change here is to allow them in bpf_setsockopt. The bpf prog has been able to read all members of a sk by using PTR_TO_BTF_ID of a sk. The optname additions here can also be read by the same approach. Meaning there is a way to read the values back. These optnames can also be added to bpf_getsockopt() later with another patch set that makes the bpf_getsockopt() to reuse the sock_getsockopt(), tcp_getsockopt(), and ip[v6]_getsockopt(). Thus, this patch does not add more duplicated code to bpf_getsockopt() now. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061841.4181642-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
After the prep work in the previous patches, this patch removes the dup code from bpf_setsockopt(SOL_IPV6) and reuses the implementation in do_ipv6_setsockopt(). ipv6 could be compiled as a module. Like how other code solved it with stubs in ipv6_stubs.h, this patch adds the do_ipv6_setsockopt to the ipv6_bpf_stub. The current bpf_setsockopt(IPV6_TCLASS) does not take the INET_ECN_MASK into the account for tcp. The do_ipv6_setsockopt(IPV6_TCLASS) will handle it correctly. The existing optname white-list is refactored into a new function sol_ipv6_setsockopt(). After this last SOL_IPV6 dup code removal, the __bpf_setsockopt() is simplified enough that the extra "{ }" around the if statement can be removed. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061834.4181198-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
After the prep work in the previous patches, this patch removes the dup code from bpf_setsockopt(SOL_IP) and reuses the implementation in do_ip_setsockopt(). The existing optname white-list is refactored into a new function sol_ip_setsockopt(). NOTE, the current bpf_setsockopt(IP_TOS) is quite different from the the do_ip_setsockopt(IP_TOS). For example, it does not take the INET_ECN_MASK into the account for tcp and also does not adjust sk->sk_priority. It looks like the current bpf_setsockopt(IP_TOS) was referencing the IPV6_TCLASS implementation instead of IP_TOS. This patch tries to rectify that by using the do_ip_setsockopt(IP_TOS). While this is a behavior change, the do_ip_setsockopt(IP_TOS) behavior is arguably what the user is expecting. At least, the INET_ECN_MASK bits should be masked out for tcp. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061826.4180990-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
After the prep work in the previous patches, this patch removes all the dup code from bpf_setsockopt(SOL_TCP) and reuses the do_tcp_setsockopt(). The existing optname white-list is refactored into a new function sol_tcp_setsockopt(). The sol_tcp_setsockopt() also calls the bpf_sol_tcp_setsockopt() to handle the TCP_BPF_XXX specific optnames. bpf_setsockopt(TCP_SAVE_SYN) now also allows a value 2 to save the eth header also and it comes for free from do_tcp_setsockopt(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061819.4180146-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
The patch moves all bpf specific tcp optnames (TCP_BPF_XXX) to a new function bpf_sol_tcp_setsockopt(). This will make the next patch easier to follow. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061812.4179645-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
After the prep work in the previous patches, this patch removes most of the dup code from bpf_setsockopt(SOL_SOCKET) and reuses them from sk_setsockopt(). The sock ptr test is added to the SO_RCVLOWAT because the sk->sk_socket could be NULL in some of the bpf hooks. The existing optname white-list is refactored into a new function sol_socket_setsockopt(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061804.4178920-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch moves the "#ifdef CONFIG_XXX" check into the "if/else" statement itself. The change is done for the bpf_setsockopt() function only. It will make the latter patches easier to follow without the surrounding ifdef macro. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061758.4178374-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
The bpf-iter-prog for tcp and unix sk can do bpf_setsockopt() which needs has_current_bpf_ctx() to decide if it is called by a bpf prog. This patch initializes the bpf_run_ctx in bpf_iter_run_prog() for the has_current_bpf_ctx() to use. Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061751.4177657-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
Similar to the earlier patch that avoids sk_setsockopt() from taking sk lock and doing capable test when called by bpf. This patch changes do_ipv6_setsockopt() to use the sockopt_{lock,release}_sock() and sockopt_[ns_]capable(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061744.4176893-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
Similar to the earlier patch that avoids sk_setsockopt() from taking sk lock and doing capable test when called by bpf. This patch changes do_ip_setsockopt() to use the sockopt_{lock,release}_sock() and sockopt_[ns_]capable(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061737.4176402-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
Similar to the earlier patch that avoids sk_setsockopt() from taking sk lock and doing capable test when called by bpf. This patch changes do_tcp_setsockopt() to use the sockopt_{lock,release}_sock() and sockopt_[ns_]capable(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061730.4176021-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
When bpf program calling bpf_setsockopt(SOL_SOCKET), it could be run in softirq and doesn't make sense to do the capable check. There was a similar situation in bpf_setsockopt(TCP_CONGESTION). In commit 8d650cde ("tcp: fix tcp_set_congestion_control() use from bpf hook"), tcp_set_congestion_control(..., cap_net_admin) was added to skip the cap check for bpf prog. This patch adds sockopt_ns_capable() and sockopt_capable() for the sk_setsockopt() to use. They will consider the has_current_bpf_ctx() before doing the ns_capable() and capable() test. They are in EXPORT_SYMBOL for the ipv6 module to use in a latter patch. Suggested-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061723.4175820-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
Most of the code in bpf_setsockopt(SOL_SOCKET) are duplicated from the sk_setsockopt(). The number of supported optnames are increasing ever and so as the duplicated code. One issue in reusing sk_setsockopt() is that the bpf prog has already acquired the sk lock. This patch adds a has_current_bpf_ctx() to tell if the sk_setsockopt() is called from a bpf prog. The bpf prog calling bpf_setsockopt() is either running in_task() or in_serving_softirq(). Both cases have the current->bpf_ctx initialized. Thus, the has_current_bpf_ctx() only needs to test !!current->bpf_ctx. This patch also adds sockopt_{lock,release}_sock() helpers for sk_setsockopt() to use. These helpers will test has_current_bpf_ctx() before acquiring/releasing the lock. They are in EXPORT_SYMBOL for the ipv6 module to use in a latter patch. Note on the change in sock_setbindtodevice(). sockopt_lock_sock() is done in sock_setbindtodevice() instead of doing the lock_sock in sock_bindtoindex(..., lock_sk = true). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061717.4175589-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
A latter patch refactors bpf_setsockopt(SOL_SOCKET) with the sock_setsockopt() to avoid code duplication and code drift between the two duplicates. The current sock_setsockopt() takes sock ptr as the argument. The very first thing of this function is to get back the sk ptr by 'sk = sock->sk'. bpf_setsockopt() could be called when the sk does not have the sock ptr created. Meaning sk->sk_socket is NULL. For example, when a passive tcp connection has just been established but has yet been accept()-ed. Thus, it cannot use the sock_setsockopt(sk->sk_socket) or else it will pass a NULL ptr. This patch moves all sock_setsockopt implementation to the newly added sk_setsockopt(). The new sk_setsockopt() takes a sk ptr and immediately gets the sock ptr by 'sock = sk->sk_socket' The existing sock_setsockopt(sock) is changed to call sk_setsockopt(sock->sk). All existing callers have both sock->sk and sk->sk_socket pointer. The latter patch will make bpf_setsockopt(SOL_SOCKET) call sk_setsockopt(sk) directly. The bpf_setsockopt(SOL_SOCKET) does not use the optnames that require sk->sk_socket, so it will be safe. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061711.4175048-1-kafai@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 18 Aug, 2022 3 commits
-
-
Maxime Chevallier authored
Add the ethtool_op_get_ts_info() callback to ethtool ops, so that we can at least use software timestamping. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://lore.kernel.org/r/20220817095725.97444-1-maxime.chevallier@bootlin.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Wong Vee Khee authored
The 'has_crossts' flag was not used anywhere in the stmmac driver, removing it from both header file and dwmac-intel driver. Signed-off-by: Wong Vee Khee <veekhee@apple.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://lore.kernel.org/r/20220817064324.10025-1-veekhee@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski authored
Andrii Nakryiko says: ==================== bpf-next 2022-08-17 We've added 45 non-merge commits during the last 14 day(s) which contain a total of 61 files changed, 986 insertions(+), 372 deletions(-). The main changes are: 1) New bpf_ktime_get_tai_ns() BPF helper to access CLOCK_TAI, from Kurt Kanzenbach and Jesper Dangaard Brouer. 2) Few clean ups and improvements for libbpf 1.0, from Andrii Nakryiko. 3) Expose crash_kexec() as kfunc for BPF programs, from Artem Savkov. 4) Add ability to define sleepable-only kfuncs, from Benjamin Tissoires. 5) Teach libbpf's bpf_prog_load() and bpf_map_create() to gracefully handle unsupported names on old kernels, from Hangbin Liu. 6) Allow opting out from auto-attaching BPF programs by libbpf's BPF skeleton, from Hao Luo. 7) Relax libbpf's requirement for shared libs to be marked executable, from Henqgi Chen. 8) Improve bpf_iter internals handling of error returns, from Hao Luo. 9) Few accommodations in libbpf to support GCC-BPF quirks, from James Hilliard. 10) Fix BPF verifier logic around tracking dynptr ref_obj_id, from Joanne Koong. 11) bpftool improvements to handle full BPF program names better, from Manu Bretelle. 12) bpftool fixes around libcap use, from Quentin Monnet. 13) BPF map internals clean ups and improvements around memory allocations, from Yafang Shao. 14) Allow to use cgroup_get_from_file() on cgroupv1, allowing BPF cgroup iterator to work on cgroupv1, from Yosry Ahmed. 15) BPF verifier internal clean ups, from Dave Marchevsky and Joanne Koong. 16) Various fixes and clean ups for selftests/bpf and vmtest.sh, from Daniel Xu, Artem Savkov, Joanne Koong, Andrii Nakryiko, Shibin Koikkara Reeny. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits) selftests/bpf: Few fixes for selftests/bpf built in release mode libbpf: Clean up deprecated and legacy aliases libbpf: Streamline bpf_attr and perf_event_attr initialization libbpf: Fix potential NULL dereference when parsing ELF selftests/bpf: Tests libbpf autoattach APIs libbpf: Allows disabling auto attach selftests/bpf: Fix attach point for non-x86 arches in test_progs/lsm libbpf: Making bpf_prog_load() ignore name if kernel doesn't support selftests/bpf: Update CI kconfig selftests/bpf: Add connmark read test selftests/bpf: Add existing connection bpf_*_ct_lookup() test bpftool: Clear errno after libcap's checks bpf: Clear up confusion in bpf_skb_adjust_room()'s documentation bpftool: Fix a typo in a comment libbpf: Add names for auxiliary maps bpf: Use bpf_map_area_alloc consistently on bpf map creation bpf: Make __GFP_NOWARN consistent in bpf map creation bpf: Use bpf_map_area_free instread of kvfree bpf: Remove unneeded memset in queue_stack_map creation libbpf: preserve errno across pr_warn/pr_info/pr_debug ... ==================== Link: https://lore.kernel.org/r/20220817215656.1180215-1-andrii@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 17 Aug, 2022 5 commits
-
-
Andrii Nakryiko authored
Fix few issues found when building and running test_progs in release mode. First, potentially uninitialized idx variable in xskxceiver, force-initialize to zero to satisfy compiler. Few instances of defining uprobe trigger functions break in release mode unless marked as noinline, due to being static. Add noinline to make sure everything works. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-5-andrii@kernel.org
-
Andrii Nakryiko authored
Remove three missed deprecated APIs that were aliased to new APIs: bpf_object__unload, bpf_prog_attach_xattr and btf__load. Also move legacy API libbpf_find_kernel_btf (aliased to btf__load_vmlinux_btf) into libbpf_legacy.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-4-andrii@kernel.org
-
Andrii Nakryiko authored
Make sure that entire libbpf code base is initializing bpf_attr and perf_event_attr with memset(0). Also for bpf_attr make sure we clear and pass to kernel only relevant parts of bpf_attr. bpf_attr is a huge union of independent sub-command attributes, so there is no need to clear and pass entire union bpf_attr, which over time grows quite a lot and for most commands this growth is completely irrelevant. Few cases where we were relying on compiler initialization of BPF UAPI structs (like bpf_prog_info, bpf_map_info, etc) with `= {};` were switched to memset(0) pattern for future-proofing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-3-andrii@kernel.org
-
Andrii Nakryiko authored
Fix if condition filtering empty ELF sections to prevent NULL dereference. Fixes: 47ea7417 ("libbpf: Skip empty sections in bpf_object__init_global_data_maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-2-andrii@kernel.org
-
Jakub Kicinski authored
Florian Fainelli says: ==================== net: dsa: bcm_sf2: Utilize PHYLINK for all ports This patch series has the bcm_sf2 driver utilize PHYLINK to configure the CPU port link parameters to unify the configuration and pave the way for DSA to utilize PHYLINK for all ports in the future. Tested on BCM7445 and BCM7278 ==================== Link: https://lore.kernel.org/r/20220815175009.2681932-1-f.fainelli@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-