Commits · 35df0155e68a1fb20646f34247f19131170693bd · Kirill Smelkov / linux

22 Mar, 2022 2 commits
- Revert "powerpc: Add rethook support" · 35df0155
  Alexei Starovoitov authored Mar 21, 2022
```
This reverts commit 02752bd9.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
```
  35df0155
- Revert "ARM: rethook: Add rethook arm implementation" · ecaed3b9
  Alexei Starovoitov authored Mar 21, 2022
```
This reverts commit 515a4917.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
```
  ecaed3b9
21 Mar, 2022 23 commits

bpftool: Fix a bug in subskeleton code generation · f97b8b9b

Yonghong Song authored Mar 19, 2022

When compiling with clang by adding LLVM=1 to both the kernel and the
selftests/bpf builds, I hit the following compilation error:

In file included from /.../tools/testing/selftests/bpf/prog_tests/subskeleton.c:6:
  ./test_subskeleton_lib.subskel.h:168:6: error: variable 'err' is used uninitialized whenever
      'if' condition is true [-Werror,-Wsometimes-uninitialized]
          if (!s->progs)
              ^~~~~~~~~
  ./test_subskeleton_lib.subskel.h:181:11: note: uninitialized use occurs here
          errno = -err;
                   ^~~
  ./test_subskeleton_lib.subskel.h:168:2: note: remove the 'if' if its condition is always false
          if (!s->progs)
          ^~~~~~~~~~~~~~

The compilation error is triggered by the following code:
        ...
        int err;

        obj = (struct test_subskeleton_lib *)calloc(1, sizeof(*obj));
        if (!obj) {
                errno = ENOMEM;
                goto err;
        }
        ...

  err:
        test_subskeleton_lib__destroy(obj);
        errno = -err;
        ...
in test_subskeleton_lib__open(). The 'err' is not initialized, yet it
is used in 'errno = -err' later.

The fix is to remove 'errno = -err' since errno has been set properly
in all incoming branches.
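
The pattern described above can be sketched outside of bpftool. This is a minimal illustration, not the generated subskeleton code: demo_lib, demo_lib__open() and the fail_progs switch are hypothetical names invented for the example. Each failing branch sets errno before jumping to the shared label, so the label itself has nothing to assign.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_lib {
	int *progs;	/* stand-in for the real program table */
};

static void demo_lib__destroy(struct demo_lib *obj)
{
	if (!obj)
		return;
	free(obj->progs);
	free(obj);
}

/*
 * Hypothetical open routine. fail_progs forces the second allocation to
 * "fail" so that the shared error path can be exercised.
 */
static struct demo_lib *demo_lib__open(int fail_progs)
{
	struct demo_lib *obj;

	obj = calloc(1, sizeof(*obj));
	if (!obj) {
		errno = ENOMEM;
		goto err;
	}

	obj->progs = fail_progs ? NULL : calloc(4, sizeof(*obj->progs));
	if (!obj->progs) {
		errno = ENOMEM;
		goto err;
	}
	return obj;

err:
	/* errno was set by whichever branch jumped here; no local 'err' needed */
	demo_lib__destroy(obj);
	return NULL;
}
```

With no local error variable left, there is nothing for -Wsometimes-uninitialized to flag.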

Fixes: 00389c58 ("bpftool: Add support for subskeletons")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220320032009.3106133-1-yhs@fb.com

f97b8b9b

bpf: Fix bpf_prog_pack when PMD_SIZE is not defined · e5810941

Song Liu authored Mar 21, 2022

PMD_SIZE is not available in some special config, e.g. ARCH=arm with
CONFIG_MMU=n. Use bpf_prog_pack of PAGE_SIZE in these cases.

Fixes: ef078600 ("bpf: Select proper size for bpf_prog_pack")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220321180009.1944482-3-song@kernel.org

e5810941

bpf: Fix bpf_prog_pack for multi-node setup · 96805674

Song Liu authored Mar 21, 2022

module_alloc requires num_online_nodes * PMD_SIZE to allocate huge pages.
bpf_prog_pack uses pack of size num_online_nodes * PMD_SIZE.
OTOH, module_alloc returns addresses that are PMD_SIZE aligned (instead of
num_online_nodes * PMD_SIZE aligned). Therefore, PMD_MASK should be used
to calculate pack_ptr in bpf_prog_pack_free().
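
The alignment arithmetic behind this can be sketched in userspace C. The 2 MiB value for DEMO_PMD_SIZE is an assumption (the PMD size on x86-64 with 4 KiB pages), and both helper names are hypothetical. The sketch deliberately looks only at an address inside the first PMD_SIZE chunk of a pack; how addresses further into a multi-PMD pack are handled is not something the commit message describes.

```c
#include <stdint.h>

/* Assumed value: 2 MiB, the PMD size on x86-64 with 4 KiB pages. */
#define DEMO_PMD_SIZE	(UINT64_C(1) << 21)
#define DEMO_PMD_MASK	(~(DEMO_PMD_SIZE - 1))

/* Round addr down using the alignment the allocator actually guarantees. */
static uint64_t demo_pack_base(uint64_t addr)
{
	return addr & DEMO_PMD_MASK;
}

/* The mistaken variant: round down using a mask derived from the pack size. */
static uint64_t demo_pack_base_by_size(uint64_t addr, uint64_t pack_size)
{
	return addr & ~(pack_size - 1);
}
```

For a two-node pack starting at 3 * PMD_SIZE, the size-derived mask rounds down to 2 * PMD_SIZE, which is not the start of any pack, while the PMD mask recovers the real base.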

Fixes: ef078600 ("bpf: Select proper size for bpf_prog_pack")
Reported-by: syzbot+c946805b5ce6ab87df0b@syzkaller.appspotmail.com
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220321180009.1944482-2-song@kernel.org

96805674

bpf: Fix warning for cast from restricted gfp_t in verifier · d56c9fe6

Joanne Koong authored Mar 21, 2022

This fixes the sparse warning reported by the kernel test robot:

kernel/bpf/verifier.c:13499:47: sparse: warning: cast from restricted gfp_t
kernel/bpf/verifier.c:13501:47: sparse: warning: cast from restricted gfp_t

This fix can be verified locally by running:
1) wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O make.cross

2) chmod +x make.cross

3) COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 ./make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

Fixes: b00fa38a ("bpf: Enable non-atomic allocations in local storage")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220321185802.824223-1-joannekoong@fb.com

d56c9fe6

bpf, arm: Fix various typos in comments · d8dc09a4

Julia Lawall authored Mar 18, 2022

Various spelling mistakes in comments. Detected with the help of
Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220318103729.157574-9-Julia.Lawall@inria.fr

d8dc09a4

libbpf: Close fd in bpf_object__reuse_map · d0f325c3

Hengqi Chen authored Mar 19, 2022

pin_fd is dup-ed and assigned in bpf_map__reuse_fd. Close it
in bpf_object__reuse_map after reuse.
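
A minimal sketch of the ownership rule, using plain POSIX calls rather than libbpf itself. demo_map, demo_map_reuse_fd() and demo_reuse_pinned() are hypothetical stand-ins; the only point is that a callee which dup()s the descriptor leaves the caller responsible for closing the original.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

struct demo_map {
	int fd;
};

/* Hypothetical reuse helper: keeps its own duplicate of fd. */
static int demo_map_reuse_fd(struct demo_map *map, int fd)
{
	int new_fd = dup(fd);

	if (new_fd < 0)
		return -errno;
	map->fd = new_fd;
	return 0;
}

/* Hypothetical caller: opens a pinned path and hands it to the helper. */
static int demo_reuse_pinned(struct demo_map *map, const char *path)
{
	int pin_fd = open(path, O_RDONLY);
	int err;

	if (pin_fd < 0)
		return -errno;

	err = demo_map_reuse_fd(map, pin_fd);
	/* The map holds a duplicate, so the original must be closed here. */
	close(pin_fd);
	return err;
}
```

Without the close() call, every successful reuse would leak one descriptor.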
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220319030533.3132250-1-hengqi.chen@gmail.com

d0f325c3

bpftool: Fix print error when show bpf map · 1824d8ea

Yafang Shao authored Mar 20, 2022

If a map has neither btf_id nor frozen set, bpftool does not show the pids,
even though the pids do not depend on either of them.

Below is the result after this change:

  $ ./bpftool map show
  2: lpm_trie  flags 0x1
	key 8B  value 8B  max_entries 1  memlock 4096B
	pids systemd(1)
  3: lpm_trie  flags 0x1
	key 20B  value 8B  max_entries 1  memlock 4096B
	pids systemd(1)

Before this change, the 'pids systemd(1)' line was not displayed.

Fixes: 9330986c ("bpf: Add bloom filter map implementation")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220320060815.7716-1-laoar.shao@gmail.com

1824d8ea

bpf: Fix kprobe_multi return probe backtrace · f7098690

Jiri Olsa authored Mar 21, 2022

Andrii reported that backtraces from kprobe_multi programs attached
as return probes are not complete and show just the initial entry [1].

This is caused by changing the registers to have the original function ip
address as the instruction pointer even for return probes, which breaks
backtraces from return probes.

This change keeps the registers intact and stores the original entry ip and
link address on the stack in the bpf_kprobe_multi_run_ctx struct, where the
bpf_get_func_ip and bpf_get_attach_cookie helpers for kprobe_multi
programs can find them.

[1] https://lore.kernel.org/bpf/CAEf4BzZDDqK24rSKwXNp7XL3ErGD4bZa1M6c_c4EvDSt3jrZcg@mail.gmail.com/T/#m8d1301c0ea0892ddf9dc6fba57a57b8cf11b8c51

Fixes: ca74823c ("bpf: Add cookie support to programs attached with kprobe multi link")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220321070113.1449167-3-jolsa@kernel.org

f7098690

Revert "bpf: Add support to inline bpf_get_func_ip helper on x86" · f705ec76

Jiri Olsa authored Mar 21, 2022

This reverts commit 97ee4d20.

The following change adds more complexity to the bpf_get_func_ip
helper for kprobe_multi programs, which can't be inlined easily.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220321070113.1449167-2-jolsa@kernel.org

f705ec76

bpf: Simplify check in btf_parse_hdr() · 583669ab

Yuntao Wang authored Mar 20, 2022

Replace offsetof(hdr_len) + sizeof(hdr_len) with offsetofend(hdr_len) to
simplify the check for correctness of btf_data_size in btf_parse_hdr().
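
As an illustration of the equivalence, here is a userspace sketch. demo_offsetofend() is a local macro written for this example to mirror what the kernel's offsetofend() computes, and struct demo_hdr is a hypothetical layout, not the real BTF header.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past a member: offsetof plus the member's size. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical header; only the position of hdr_len matters here. */
struct demo_hdr {
	uint16_t magic;
	uint8_t version;
	uint8_t flags;
	uint32_t hdr_len;
	uint32_t payload_off;
};

/* Returns 1 when data_size is large enough to contain hdr_len itself. */
static int demo_has_hdr_len(size_t data_size)
{
	return data_size >= demo_offsetofend(struct demo_hdr, hdr_len);
}
```

The single macro says "up to and including this field" directly, which is what a size check of this kind is trying to express.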
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220320075240.1001728-1-ytcoode@gmail.com

583669ab

selftests/bpf/test_lirc_mode2.sh: Exit with proper code · ec80906b

Hangbin Liu authored Mar 21, 2022

When executing test_lirc_mode2_user fails, the test reports a failure but
still exits with 0. Fix it by exiting with an error code.

Another issue is the LIRCDEV check. With the -n test in bash, we need to
quote the variable, or the test will always be true. So if
test_lirc_mode2_user was not run, just exit with the skip code.

Fixes: 6bdd533c ("bpf: add selftest for lirc_mode2 type program")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220321024149.157861-1-liuhangbin@gmail.com

ec80906b

bpf: Check for NULL return from bpf_get_btf_vmlinux · 7ada3787

Kumar Kartikeya Dwivedi authored Mar 20, 2022

When CONFIG_DEBUG_INFO_BTF is disabled, bpf_get_btf_vmlinux can return a
NULL pointer. Check for it in btf_get_module_btf to prevent a NULL pointer
dereference.

While kernel test robot only complained about this specific case, let's
also check for NULL in other call sites of bpf_get_btf_vmlinux.

Fixes: 9492450f ("bpf: Always raise reference in btf_get_module_btf")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220320143003.589540-1-memxor@gmail.com

7ada3787

selftests/bpf: Test skipping stacktrace · e1cc1f39

Namhyung Kim authored Mar 14, 2022

Add a test case for stacktrace with skip > 0 using a small sized
buffer.  Previously the helper didn't support skipping a number of
entries greater than or equal to the size of the buffer, and filled the
skipped part with 0.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220314182042.71025-2-namhyung@kernel.org

e1cc1f39

bpf: Adjust BPF stack helper functions to accommodate skip > 0 · ee2a0988

Namhyung Kim authored Mar 14, 2022

Let's say that the caller has storage for num_elem stack frames.  Then,
the BPF stack helper functions walk the stack for only num_elem frames.
This means that if skip > 0, one keeps only 'num_elem - skip' frames.

This is because it sets init_nr in the perf_callchain_entry to the end
of the buffer to save num_elem entries only.  I believe it was because
the perf callchain code unwound the stack frames until it reached the
global max size (sysctl_perf_event_max_stack).

However it now has perf_callchain_entry_ctx.max_stack to limit the
iteration locally.  This simplifies the code to handle init_nr in the
BPF callstack entries and removes the confusion with the perf_event's
__PERF_SAMPLE_CALLCHAIN_EARLY which sets init_nr to 0.

Also change the comment on bpf_get_stack() in the header file to be
more explicit about what the return value means.
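
The effect on the number of returned frames can be modelled with a small userspace function. This is only a sketch of the arithmetic: demo_get_stack() and its walk_limit parameter are hypothetical, with walk_limit standing in for how many frames the unwinder is allowed to collect.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to num_elem frames into dst, starting after the first `skip`
 * frames of a stack that is `depth` frames deep. walk_limit models how
 * many frames the unwinder may collect before copying starts.
 */
static size_t demo_get_stack(const unsigned long *stack, size_t depth,
			     size_t skip, size_t walk_limit,
			     unsigned long *dst, size_t num_elem)
{
	size_t walked = depth < walk_limit ? depth : walk_limit;
	size_t kept;

	if (walked <= skip)
		return 0;

	kept = walked - skip;
	if (kept > num_elem)
		kept = num_elem;

	memcpy(dst, stack + skip, kept * sizeof(*dst));
	return kept;
}
```

Passing walk_limit = num_elem reproduces the old behaviour (only num_elem - skip frames survive), while walk_limit = skip + num_elem fills the caller's buffer.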

Fixes: c195651e ("bpf: add bpf_get_stack helper")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/30a7b5d5-6726-1cc2-eaee-8da2828a9a9c@oracle.com
Link: https://lore.kernel.org/bpf/20220314182042.71025-1-namhyung@kernel.org
Based-on-patch-by: Eugene Loh <eugene.loh@oracle.com>

ee2a0988

bpf: Select proper size for bpf_prog_pack · ef078600

Song Liu authored Mar 11, 2022

Using HPAGE_PMD_SIZE as the size for bpf_prog_pack is not ideal in some
cases. Specifically, for NUMA systems, __vmalloc_node_range requires
PMD_SIZE * num_online_nodes() to allocate huge pages. Also, if the system
does not support huge pages (i.e., with cmdline option nohugevmalloc), it
is better to use PAGE_SIZE packs.

Add logic to select the proper size for bpf_prog_pack. This solution is not
ideal, as it makes assumptions about the behavior of module_alloc and
__vmalloc_node_range. However, it appears to be the easiest solution as
it doesn't require changes in module_alloc and vmalloc code.

Fixes: 57631054 ("bpf: Introduce bpf_prog_pack allocator")
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220311201135.3573610-1-song@kernel.org

ef078600

Merge branch 'Make 2-byte access to bpf_sk_lookup->remote_port endian-agnostic' · 46e9244b

Alexei Starovoitov authored Mar 20, 2022

Jakub Sitnicki says:

====================

This patch set is a result of a discussion we had around the RFC patchset from
Ilya [1]. The fix for the narrow loads from the RFC series is still relevant,
but this series does not depend on it. Nor is it required to unbreak sk_lookup
tests on BE, if this series gets applied.

To summarize the takeaways from [1]:

 1) we want to make 2-byte load from ctx->remote_port portable across LE and BE,
 2) we keep the 4-byte load from ctx->remote_port as it is today - the result
    varies with the endianness of the platform.

[1] https://lore.kernel.org/bpf/20220222182559.2865596-2-iii@linux.ibm.com/

v1 -> v2:
- Remove needless check that 4-byte load is from &ctx->remote_port offset
  (Martin)

[v1]: https://lore.kernel.org/bpf/20220317165826.1099418-1-jakub@cloudflare.com/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

46e9244b

selftests/bpf: Fix test for 4-byte load from remote_port on big-endian · ce523680

Jakub Sitnicki authored Mar 19, 2022

The context access converter rewrites the 4-byte load from
bpf_sk_lookup->remote_port to a 2-byte load from bpf_sk_lookup_kern
structure.

It means that we cannot treat the destination register contents as a 32-bit
value, or the code will not be portable across big- and little-endian
architectures.

This is exactly the same case as with 4-byte loads from bpf_sock->dst_port
so follow the approach outlined in [1] and treat the register contents as a
16-bit value in the test.

[1]: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com/

Fixes: 2ed0dc59 ("selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-4-jakub@cloudflare.com

ce523680

selftests/bpf: Fix u8 narrow load checks for bpf_sk_lookup remote_port · 3c69611b

Jakub Sitnicki authored Mar 19, 2022

In commit 9a69e2b3 ("bpf: Make remote_port field in struct
bpf_sk_lookup 16-bit wide") ->remote_port field changed from __u32 to
__be16.

However, narrow load tests which exercise 1-byte sized loads from
offsetof(struct bpf_sk_lookup, remote_port) were not adapted to reflect the
change.

As a result, on little-endian we continue testing loads from addresses:

 - (__u8 *)&ctx->remote_port + 3
 - (__u8 *)&ctx->remote_port + 4

which map to the zero padding following the remote_port field, and don't
break the tests because there is no observable change.

While on big-endian, we observe breakage because tests expect to see zeros
for values loaded from:

 - (__u8 *)&ctx->remote_port - 1
 - (__u8 *)&ctx->remote_port - 2

Above addresses map to ->remote_ip6 field, which precedes ->remote_port,
and are populated during the bpf_sk_lookup IPv6 tests.

Unsurprisingly, on s390x we observe:

  #136/38 sk_lookup/narrow access to ctx v4:OK
  #136/39 sk_lookup/narrow access to ctx v6:FAIL

Fix it by removing the checks for 1-byte loads from offsets outside of the
->remote_port field.

Fixes: 9a69e2b3 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-3-jakub@cloudflare.com

3c69611b

bpf: Treat bpf_sk_lookup remote_port as a 2-byte field · 058ec4a7

Jakub Sitnicki authored Mar 19, 2022

In commit 9a69e2b3 ("bpf: Make remote_port field in struct
bpf_sk_lookup 16-bit wide") the remote_port field has been split up and
re-declared from u32 to be16.

However, the accompanying changes to the context access converter have not
been well thought through when it comes to big-endian platforms.

Today 2-byte wide loads from offsetof(struct bpf_sk_lookup, remote_port)
are handled as narrow loads from a 4-byte wide field.

This by itself is not enough to create a problem, but when we combine

 1. 32-bit wide access to ->remote_port backed by a 16-bit wide load, with
 2. inherent difference between little- and big-endian in how narrow loads
    have to be handled (see bpf_ctx_narrow_access_offset),

we get inconsistent results for 2-byte loads from &ctx->remote_port on LE
and BE architectures. This in turn makes BPF C code for the common case of
a 2-byte load from ctx->remote_port not portable.

To rectify it, inform the context access converter that remote_port is a
2-byte wide field, and only 1-byte loads need to be treated as narrow
loads.

At the same time, we special-case the 4-byte load from &ctx->remote_port to
continue handling it the same way as we do today, in order to keep the
existing BPF programs working.
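
The endianness dependence of narrow loads can be sketched as follows. This is a simplified model of the kind of offset adjustment that bpf_ctx_narrow_access_offset performs, written from an understanding of its purpose rather than copied from the kernel; the function name and the explicit big_endian parameter are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Translate a narrow load's offset inside a field of size_default bytes.
 * Little-endian keeps the offset; big-endian mirrors it, because the
 * low-order bytes sit at the end of the field.
 */
static uint32_t demo_narrow_access_offset(uint32_t off, uint32_t size,
					  uint32_t size_default, bool big_endian)
{
	uint32_t access_off = off & (size_default - 1);

	if (!big_endian)
		return access_off;
	return size_default - (access_off + size);
}
```

In this model a 2-byte load at offset 0 of a 4-byte field lands at offset 0 on little-endian but at offset 2 on big-endian. Once the field itself is declared 2 bytes wide, the same load is no longer narrow and both byte orders agree on offset 0.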

Fixes: 9a69e2b3 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-2-jakub@cloudflare.com

058ec4a7

Merge branch 'Enable non-atomic allocations in local storage' · 30630e44

Alexei Starovoitov authored Mar 20, 2022

Joanne Koong says:

====================

From: Joanne Koong <joannelkoong@gmail.com>

Currently, local storage memory can only be allocated atomically
(GFP_ATOMIC). This restriction is too strict for sleepable bpf
programs.

In this patchset, sleepable programs can allocate memory in local
storage using GFP_KERNEL, while non-sleepable programs always default to
GFP_ATOMIC.

v3 <- v2:
* Add extra case to local_storage.c selftest to test associating multiple
elements with the local storage, which triggers a GFP_KERNEL allocation in
local_storage_update().
* Cast gfp_t to __s32 in verifier to fix the sparse warnings

v2 <- v1:
* Allocate the memory before/after the raw_spin_lock_irqsave, depending
on the gfp flags
* Rename mem_flags to gfp_flags
* Reword the comment "*mem_flags* is set by the bpf verifier" to
"*gfp_flags* is a hidden argument provided by the verifier"
* Add a sentence to the commit message about existing local storage
selftests covering both the GFP_ATOMIC and GFP_KERNEL paths in
bpf_local_storage_update.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

30630e44

selftests/bpf: Test for associating multiple elements with the local storage · 0e790cbb

Joanne Koong authored Mar 17, 2022

This patch adds a few calls to the existing local storage selftest to
test that we can associate multiple elements with the local storage.

The sleepable program's call to bpf_sk_storage_get with sk_storage_map2
will lead to an allocation of a new selem under the GFP_KERNEL flag.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220318045553.3091807-3-joannekoong@fb.com

0e790cbb

bpf: Enable non-atomic allocations in local storage · b00fa38a

Joanne Koong authored Mar 17, 2022

Currently, local storage memory can only be allocated atomically
(GFP_ATOMIC). This restriction is too strict for sleepable bpf
programs.

In this patch, the verifier detects whether the program is sleepable,
and passes the corresponding GFP_KERNEL or GFP_ATOMIC flag as a
5th argument to bpf_task/sk/inode_storage_get. This flag will propagate
down to the local storage functions that allocate memory.

Please note that bpf_task/sk/inode_storage_update_elem functions are
invoked by userspace applications through syscalls. Preemption is
disabled before bpf_task/sk/inode_storage_update_elem is called, which
means they will always have to allocate memory atomically.
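
The selection rule described above reduces to a small decision function. The sketch below uses hypothetical enums (demo_gfp, demo_caller) rather than the kernel's gfp_t values, and it models only the choice of flag, not the allocation itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel's gfp_t values. */
enum demo_gfp {
	DEMO_GFP_ATOMIC,	/* allocation must not sleep */
	DEMO_GFP_KERNEL,	/* allocation may sleep */
};

enum demo_caller {
	DEMO_CALLER_BPF_PROG,	/* storage_get helper called from a BPF program */
	DEMO_CALLER_SYSCALL,	/* update_elem invoked from userspace */
};

static enum demo_gfp demo_storage_gfp(enum demo_caller caller, bool sleepable)
{
	/* The syscall path runs with preemption disabled, so it cannot sleep. */
	if (caller == DEMO_CALLER_SYSCALL)
		return DEMO_GFP_ATOMIC;

	return sleepable ? DEMO_GFP_KERNEL : DEMO_GFP_ATOMIC;
}
```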
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220318045553.3091807-2-joannekoong@fb.com

b00fa38a

libbpf: Avoid NULL deref when initializing map BTF info · a8fee962

Andrii Nakryiko authored Mar 19, 2022

If a BPF object doesn't have BTF info, don't attempt to search for BTF
types describing the BPF map key or value layout.

Fixes: 262cfb74 ("libbpf: Init btf_{key,value}_type_id on internal map open")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220320001911.3640917-1-andrii@kernel.org

a8fee962

19 Mar, 2022 1 commit

bpf: Always raise reference in btf_get_module_btf · 9492450f

Kumar Kartikeya Dwivedi authored Mar 17, 2022

Align it with helpers like bpf_find_btf_id, so all functions returning
BTF in out parameter follow the same rule of raising reference
consistently, regardless of module or vmlinux BTF.

Adjust existing callers to handle the change accordingly.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220317115957.3193097-10-memxor@gmail.com

9492450f

18 Mar, 2022 14 commits

bpf: Factor out fd returning from bpf_btf_find_by_name_kind · edc3ec09

Kumar Kartikeya Dwivedi authored Mar 17, 2022

In the next few patches, we need a helper that searches all kernel BTFs
(vmlinux and module BTFs), and finds the type denoted by 'name' and
'kind'. Turns out bpf_btf_find_by_name_kind already does the same thing,
but it instead returns a BTF ID and optionally fd (if module BTF). This
is used for relocating ksyms in BPF loader code (bpftool gen skel -L).

We extract the core code out into a new helper bpf_find_btf_id, which
returns the BTF ID in the return value, and BTF pointer in an out
parameter. The reference for the returned BTF pointer is always raised,
hence user must either transfer it (e.g. to a fd), or release it after
use.
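
The calling convention can be sketched with a toy reference-counted object. Everything here is hypothetical (demo_btf, demo_find_btf_id(), the one-type-per-object simplification); the sketch only shows the contract that the ID comes back as the return value, the owning object comes back through an out parameter with its reference raised, and the caller must drop that reference.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_btf {
	int refcnt;
	const char *type_name;	/* one type per object, for brevity */
	int type_id;
};

static void demo_btf_get(struct demo_btf *btf) { btf->refcnt++; }
static void demo_btf_put(struct demo_btf *btf) { btf->refcnt--; }

/*
 * Search every object for `name`. On success return the type ID and store
 * the owning object in *btf_p with its reference count raised.
 */
static int demo_find_btf_id(struct demo_btf *objs, size_t n,
			    const char *name, struct demo_btf **btf_p)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(objs[i].type_name, name) == 0) {
			demo_btf_get(&objs[i]);
			*btf_p = &objs[i];
			return objs[i].type_id;
		}
	}
	return -ENOENT;
}
```

Because the reference is raised in every successful case, callers need only one cleanup rule: call the put function once they are done with the object, or hand the reference over to whatever takes ownership.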
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220317115957.3193097-2-memxor@gmail.com

edc3ec09

bpftool: Add BPF_TRACE_KPROBE_MULTI to attach type names table · 08063b4b

Andrii Nakryiko authored Mar 18, 2022

BPF_TRACE_KPROBE_MULTI is a new attach type name, add it to bpftool's
table. This fixes a currently failing CI bpftool check.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220318150106.2933343-1-andrii@kernel.org

08063b4b

Merge branch 'bpf-fix-sock-field-tests' · 63cc8e20

Daniel Borkmann authored Mar 18, 2022

Jakub Sitnicki says:

====================
I think we have reached a consensus [1] on how the test for the 4-byte load from
bpf_sock->dst_port and bpf_sk_lookup->remote_port should look, so here goes v3.

I will submit a separate set of patches for bpf_sk_lookup->remote_port tests.

This series has been tested on x86_64 and s390 on top of recent bpf-next -
ad13baf4 ("selftests/bpf: Test subprog jit when toggle bpf_jit_harden
repeatedly").

  [1] https://lore.kernel.org/bpf/87k0cwxkzs.fsf@cloudflare.com/

v2 -> v3:
- Split what was previously patch 2 which was doing two things
- Use BPF_TCP_* constants (Martin)
- Treat the result of 4-byte load from dst_port as a 16-bit value (Martin)
- Typo fixup and some rewording in patch 4 description
v1 -> v2:
- Limit read_sk_dst_port only to client traffic (patch 2)
- Make read_sk_dst_port pass on little- and big-endian (patch 3)

v1: https://lore.kernel.org/bpf/20220225184130.483208-1-jakub@cloudflare.com/
v2: https://lore.kernel.org/bpf/20220227202757.519015-1-jakub@cloudflare.com/
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

63cc8e20

selftests/bpf: Fix test for 4-byte load from dst_port on big-endian · deb59400

Jakub Sitnicki authored Mar 17, 2022

The check for 4-byte load from dst_port offset into bpf_sock is failing on
big-endian architecture - s390. The bpf access converter rewrites the
4-byte load to a 2-byte load from sock_common at skc_dport offset, as shown
below.

  * s390 / llvm-objdump -S --no-show-raw-insn

  00000000000002a0 <sk_dst_port__load_word>:
        84:       r1 = *(u32 *)(r1 + 48)
        85:       w0 = 1
        86:       if w1 == 51966 goto +1 <LBB5_2>
        87:       w0 = 0
  00000000000002c0 <LBB5_2>:
        88:       exit

  * s390 / bpftool prog dump xlated

  _Bool sk_dst_port__load_word(struct bpf_sock * sk):
    35: (69) r1 = *(u16 *)(r1 +12)
    36: (bc) w1 = w1
    37: (b4) w0 = 1
    38: (16) if w1 == 0xcafe goto pc+1
    39: (b4) w0 = 0
    40: (95) exit

  * x86_64 / llvm-objdump -S --no-show-raw-insn

  00000000000002a0 <sk_dst_port__load_word>:
        84:       r1 = *(u32 *)(r1 + 48)
        85:       w0 = 1
        86:       if w1 == 65226 goto +1 <LBB5_2>
        87:       w0 = 0
  00000000000002c0 <LBB5_2>:
        88:       exit

  * x86_64 / bpftool prog dump xlated

  _Bool sk_dst_port__load_word(struct bpf_sock * sk):
    33: (69) r1 = *(u16 *)(r1 +12)
    34: (b4) w0 = 1
    35: (16) if w1 == 0xfeca goto pc+1
    36: (b4) w0 = 0
    37: (95) exit

This leads to surprises if we treat the destination register contents as a
32-bit value, ignoring the fact that in reality it contains a 16-bit value.

On little-endian the register contents reflect the bpf_sock struct
definition, where the lower 16-bits contain the port number:

	struct bpf_sock {
		...
		__be16 dst_port;	/* offset 48 */
		__u16 :16;
		...
	};

However, on big-endian the register contents suggest that the field layout
of the bpf_sock struct is as follows:

	struct bpf_sock {
		...
		__u16 :16;		/* offset 48 */
		__be16 dst_port;
		...
	};

Account for this quirky access conversion in the test case exercising the
4-byte load by treating the result as 16-bit wide.
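
The approach can be sketched in portable userspace C. This is a model, not the selftest: demo_load_dst_port_word() imitates the rewritten access (only the two port bytes are loaded and zero-extended), and both function names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of the rewritten access: although the program asked for 4 bytes,
 * only the 2 bytes of the port are loaded and zero-extended.
 */
static uint32_t demo_load_dst_port_word(const uint8_t port_bytes[2])
{
	uint16_t half;

	memcpy(&half, port_bytes, sizeof(half));
	return half;
}

static bool demo_port_matches(uint32_t reg, uint16_t host_port)
{
	/* Compare as a 16-bit network-order value, never as 32 bits. */
	return (uint16_t)reg == htons(host_port);
}
```

Truncating the register to 16 bits and comparing against htons() of the expected port gives the same answer on both byte orders, which matches the 0xcafe and 0xfeca immediates visible in the two xlated dumps.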
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com

deb59400

selftests/bpf: Use constants for socket states in sock_fields test · e06b5bbc

Jakub Sitnicki authored Mar 17, 2022

Replace magic numbers in BPF code with constants from bpf.h, so that they
don't require an explanation in the comments.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-4-jakub@cloudflare.com

e06b5bbc

selftests/bpf: Check dst_port only on the client socket · 2d2202ba

Jakub Sitnicki authored Mar 17, 2022

cgroup_skb/egress programs which sock_fields test installs process packets
flying in both directions, from the client to the server, and in reverse
direction.

Recently added dst_port check relies on the fact that destination
port (remote peer port) of the socket which sends the packet is known ahead
of time. This holds true only for the client socket, which connects to the
known server port.

Filter out any traffic that is not egressing from the client socket in the
BPF program that tests reading the dst_port.

Fixes: 8f50f16f ("selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-3-jakub@cloudflare.com

2d2202ba

selftests/bpf: Fix error reporting from sock_fields programs · a4c9fe0e

Jakub Sitnicki authored Mar 17, 2022

The helper macro that records an error in BPF programs that exercise sock
fields access has been inadvertently broken by adaptation work that
happened in commit b18c1f0a ("bpf: selftest: Adapt sock_fields test to
use skel and global variables").

BPF_NOEXIST flag cannot be used to update BPF_MAP_TYPE_ARRAY. The operation
always fails with -EEXIST, which in turn means the error never gets
recorded, and the checks for errors always pass.
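
A toy model of the update semantics makes the failure mode concrete. The flag names are renamed copies of the BPF update flags with the same numeric values, while demo_array_map and demo_array_update() are hypothetical and greatly simplified compared with the kernel's array map.

```c
#include <errno.h>
#include <stddef.h>

/* Same numeric values as the BPF update flags, renamed for the sketch. */
enum { DEMO_ANY = 0, DEMO_NOEXIST = 1, DEMO_EXIST = 2 };

#define DEMO_ENTRIES 4

struct demo_array_map {
	int values[DEMO_ENTRIES];
};

static int demo_array_update(struct demo_array_map *map, size_t idx,
			     int value, int flags)
{
	if (idx >= DEMO_ENTRIES)
		return -E2BIG;

	/* Every slot of an array exists from creation, so this never succeeds. */
	if (flags == DEMO_NOEXIST)
		return -EEXIST;

	map->values[idx] = value;
	return 0;
}
```

An error recorder that writes through the NOEXIST flag therefore never stores anything, and a checker reading the map afterwards sees only the initial zeros.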

Revert the change in update flags.

Fixes: b18c1f0a ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-2-jakub@cloudflare.com

a4c9fe0e

Merge branch 'Subskeleton support for BPF libraries' · 60911970

Andrii Nakryiko authored Mar 17, 2022

Delyan Kratunov says:

====================

In the quest for ever more modularity, a new need has arisen - the ability to
access data associated with a BPF library from a corresponding userspace library.
The catch is that we don't want the userspace library to know about the structure of the
final BPF object that the BPF library is linked into.

In pursuit of this modularity, this patch series introduces *subskeletons.*
Subskeletons are similar in use and design to skeletons, with a few differences:

1. The generated storage types do not rely on contiguous storage for the library's
variables because they may be interspersed randomly throughout the final BPF object's sections.

2. Subskeletons do not own objects and instead require a loaded bpf_object* to
be passed at runtime in order to be initialized. By extension, symbols are resolved at
runtime by parsing the final object's BTF.

3. Subskeletons allow access to all global variables, programs, and custom maps. They also expose
the internal maps *of the final object*. This allows bpf_var_skeleton objects to contain a bpf_map**
instead of a section name.

Changes since v3:
 - Re-add key/value type lookup for legacy user maps (fixing btf test)
 - Minor cleanups (missed sanitize_identifier call, error messages, formatting)

Changes since v2:
 - Reuse SEC_NAME strict mode flag
 - Init bpf_map->btf_value_type_id on open for internal maps *and* user BTF maps
 - Test custom section names (.data.foo) and overlapping kconfig externs between the final object and the library
 - Minor review comments in gen.c & libbpf.c

Changes since v1:
 - Introduced new strict mode knob for single-routine-in-.text compatibility behavior, which
   disproportionately affects library objects. bpftool works in 1.0 mode so subskeleton generation
   doesn't have to worry about this now.
 - Made bpf_map_btf_value_type_id available earlier and used it wherever applicable.
 - Refactoring in bpftool gen.c per review comments.
 - Subskels now use typeof() for array and func proto globals to avoid the need for runtime split btf.
 - Expanded the subskeleton test to include arrays, custom maps, extern maps, weak symbols, and kconfigs.
 - selftests/bpf/Makefile now generates a subskel.h for every skel.h it would make.

For reference, here is a shortened subskeleton header:

#ifndef __TEST_SUBSKELETON_LIB_SUBSKEL_H__
#define __TEST_SUBSKELETON_LIB_SUBSKEL_H__

struct test_subskeleton_lib {
	struct bpf_object *obj;
	struct bpf_object_subskeleton *subskel;
	struct {
		struct bpf_map *map2;
		struct bpf_map *map1;
		struct bpf_map *data;
		struct bpf_map *rodata;
		struct bpf_map *bss;
		struct bpf_map *kconfig;
	} maps;
	struct {
		struct bpf_program *lib_perf_handler;
	} progs;
	struct test_subskeleton_lib__data {
		int *var6;
		int *var2;
		int *var5;
	} data;
	struct test_subskeleton_lib__rodata {
		int *var1;
	} rodata;
	struct test_subskeleton_lib__bss {
		struct {
			int var3_1;
			__s64 var3_2;
		} *var3;
		int *libout1;
		typeof(int[4]) *var4;
		typeof(int (*)()) *fn_ptr;
	} bss;
	struct test_subskeleton_lib__kconfig {
		_Bool *CONFIG_BPF_SYSCALL;
	} kconfig;
};

static inline struct test_subskeleton_lib *
test_subskeleton_lib__open(const struct bpf_object *src)
{
	struct test_subskeleton_lib *obj;
	struct bpf_object_subskeleton *s;
	int err;

	...
	s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));
	...

	s->var_cnt = 9;
	...

	s->vars[0].name = "var6";
	s->vars[0].map = &obj->maps.data;
	s->vars[0].addr = (void**) &obj->data.var6;
  ...

	/* maps */
	...

	/* programs */
	s->prog_cnt = 1;
	...

	err = bpf_object__open_subskeleton(s);
  ...
	return obj;
}
#endif /* __TEST_SUBSKELETON_LIB_SUBSKEL_H__ */
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

60911970

selftests/bpf: Test subskeleton functionality · 3cccbaa0

Delyan Kratunov authored Mar 16, 2022

This patch changes the selftests/bpf Makefile to also generate
a subskel.h for every skel.h it would have normally generated.

Separately, it also introduces a new subskeleton test which tests
library objects, externs, weak symbols, kconfigs, and user maps.
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1bd24956940bbbfe169bb34f7f87b11df52ef011.1647473511.git.delyank@fb.com

3cccbaa0

bpftool: Add support for subskeletons · 00389c58

Delyan Kratunov authored Mar 16, 2022

Subskeletons are headers which require an already loaded program to
operate.

For example, when a BPF library is linked into a larger BPF object file,
the library userspace needs a way to access its own global variables
without requiring knowledge about the larger program at build time.

As a result, subskeletons require a loaded bpf_object to open().
Further, they find their own symbols in the larger program by
walking BTF type data at run time.

At this time, programs, maps, and globals are supported through
non-owning pointers.
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/ca8a48b4841c72d285ecce82371bef4a899756cb.1647473511.git.delyank@fb.com

00389c58
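
The non-owning pointer scheme described in this commit can be illustrated without libbpf. What follows is a minimal self-contained model, not the libbpf implementation: `toy_map`, `toy_var`, `toy_lib`, `toy_resolve_vars` and `toy_demo` are hypothetical names, and the variable offsets are hard-coded here, whereas a real subskeleton finds them by walking BTF at run time. It only shows how a field such as `data.var6` ends up pointing into memory owned by the larger object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a map whose value memory is owned by the parent object. */
struct toy_map {
	const char *name;
	unsigned char *mmaped;  /* backing storage, owned elsewhere */
	size_t size;
};

/* Hypothetical variable descriptor, loosely mirroring s->vars[i] in the shortened header. */
struct toy_var {
	const char *name;
	struct toy_map **map;   /* which map holds the variable */
	size_t offset;          /* assumed known here; looked up via BTF in reality */
	void *addr;             /* address of the pointer field to fill in */
};

/* Library-side view: only pointers, no ownership of the underlying memory. */
struct toy_lib {
	struct { struct toy_map *data; } maps;
	struct { int *var6; int *var2; } data;
};

/* Point every described field into its map's storage. Returns 0, or -1 on a bad offset. */
static int toy_resolve_vars(struct toy_var *vars, size_t cnt)
{
	for (size_t i = 0; i < cnt; i++) {
		struct toy_map *m = *vars[i].map;
		void *p;

		assert(m != NULL);
		if (vars[i].offset >= m->size)
			return -1;
		p = m->mmaped + vars[i].offset;
		/* memcpy avoids type-punning an int ** through a void ** */
		memcpy(vars[i].addr, &p, sizeof(p));
	}
	return 0;
}

/* Writes through the library view and returns what the parent's storage now holds. */
static int toy_demo(void)
{
	static int storage[4];  /* stands in for the parent's .data section */
	struct toy_map data_map = { ".data", (unsigned char *)storage, sizeof(storage) };
	struct toy_lib lib = { .maps = { &data_map } };
	struct toy_var vars[] = {
		{ "var6", &lib.maps.data, 2 * sizeof(int), &lib.data.var6 },
		{ "var2", &lib.maps.data, 0,               &lib.data.var2 },
	};

	memset(storage, 0, sizeof(storage));
	if (toy_resolve_vars(vars, 2))
		return -1;
	*lib.data.var6 = 42;    /* write through the non-owning pointer */
	return storage[2];      /* the parent's memory changed */
}
```

Because the library view holds only pointers, destroying it in this model would not free `storage`; the memory stays with whoever owns the parent object.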

libbpf: Add subskeleton scaffolding · 430025e5

Delyan Kratunov authored Mar 16, 2022

In symmetry with bpf_object__open_skeleton(),
bpf_object__open_subskeleton() performs the actual walking and linking
of maps, progs, and globals described by bpf_*_skeleton objects.
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com

430025e5

libbpf: Init btf_{key,value}_type_id on internal map open · 262cfb74

Delyan Kratunov authored Mar 16, 2022

For internal and user maps, look up the key and value BTF
types on open() rather than load(), so that `bpf_map_btf_value_type_id`
is usable in `bpftool gen`.
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com

262cfb74

libbpf: .text routines are subprograms in strict mode · bc380eb9

Delyan Kratunov authored Mar 16, 2022

Currently, libbpf considers a single routine in .text to be a program. This
is particularly confusing when it comes to library objects - a single routine
meant to be used as an extern will instead be considered a bpf_program.

This patch hides this compatibility behavior behind the pre-existing
SEC_NAME strict mode flag.
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com

bc380eb9
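
The rule this commit describes can be sketched as a small predicate. This is a simplified model under stated assumptions, not libbpf source: `toy_func` and `toy_is_subprog` are hypothetical names, and the legacy condition is modeled as "the function is the only one in the object".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one function found in a BPF object file. */
struct toy_func {
	bool in_text;   /* function lives in the .text section */
};

/*
 * Decide whether a function should be treated as a subprogram.
 * nr_funcs is the total number of functions in the object (assumed >= 1).
 * Legacy mode: a lone .text function is promoted to a program.
 * Strict mode: every .text function is a subprogram.
 */
static bool toy_is_subprog(const struct toy_func *f, int nr_funcs, bool strict_sec_name)
{
	assert(nr_funcs >= 1);
	if (!f->in_text)
		return false;
	if (strict_sec_name)
		return true;
	return nr_funcs > 1;
}
```

In this model, a library object with a single helper in .text yields a program in legacy mode and a subprogram in strict mode. With libbpf itself, the strict behavior is opted into via `libbpf_set_strict_mode(LIBBPF_STRICT_SEC_NAME)`.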

Merge branch 'bpf: Add kprobe multi link' · 5a5c11ee

Alexei Starovoitov authored Mar 17, 2022

Jiri Olsa says:

====================

hi,
this patchset adds a new link type, BPF_TRACE_KPROBE_MULTI, that attaches
a kprobe program through the fprobe API [1] introduced by Masami.

The fprobe API can attach a probe to multiple functions at once very
quickly, because it works on top of ftrace. On the other hand, this limits
the probe point to the function entry or return.

With bpftrace support I see the following attach speed:

  # perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
  Attaching 2 probes...
  Attaching 3342 functions
  ...

  1.4960 +- 0.0285 seconds time elapsed  ( +-  1.91% )

v3 changes:
  - based on latest fprobe post from Masami [2]
  - add acks
  - add extra comment to kprobe_multi_link_handler wrt entry ip setup [Masami]
  - keep swap_words_64 static and swap values directly in
    bpf_kprobe_multi_cookie_swap [Andrii]
  - rearrange locking/migrate setup in kprobe_multi_link_prog_run [Andrii]
  - move uapi fields [Andrii]
  - add bpf_program__attach_kprobe_multi_opts function [Andrii]
  - many small test changes [Andrii]
  - added tests for bpf_program__attach_kprobe_multi_opts
  - make kallsyms_lookup_name check for empty string [Andrii]
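
The cookie swap item in the list above is about keeping each cookie paired with its function address while the address array is sorted. Below is a minimal userspace sketch of that idea, assuming a sort routine that passes a user pointer to its swap callback. All names (`toy_multi`, `toy_cookie_swap`, `toy_sort_r`, `toy_sort_with_cookies`) are hypothetical, and the insertion sort merely stands in for a sort_r()-style routine; this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container: addrs[i] and cookies[i] belong together. */
struct toy_multi {
	uint64_t *addrs;
	uint64_t *cookies;
	size_t cnt;
};

static void toy_swap_u64(uint64_t *a, uint64_t *b)
{
	uint64_t tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Swap two elements of addrs and the cookies stored at the same indexes. */
static void toy_cookie_swap(uint64_t *a, uint64_t *b, void *priv)
{
	struct toy_multi *m = priv;
	size_t ia = (size_t)(a - m->addrs);
	size_t ib = (size_t)(b - m->addrs);

	assert(ia < m->cnt && ib < m->cnt);
	toy_swap_u64(a, b);
	toy_swap_u64(&m->cookies[ia], &m->cookies[ib]);
}

/* Tiny insertion sort with a user-supplied swap callback and private pointer. */
static void toy_sort_r(uint64_t *base, size_t cnt,
		       void (*swap)(uint64_t *, uint64_t *, void *), void *priv)
{
	for (size_t i = 1; i < cnt; i++)
		for (size_t j = i; j > 0 && base[j - 1] > base[j]; j--)
			swap(&base[j - 1], &base[j], priv);
}

/* Sort addresses in ascending order; cookies follow their addresses. */
static void toy_sort_with_cookies(struct toy_multi *m)
{
	toy_sort_r(m->addrs, m->cnt, toy_cookie_swap, m);
}
```

Deriving the cookie index from the element's position in `addrs` is what lets a single sorted array serve both lookups, at the cost of requiring that the swap callback only ever receives pointers into that array.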

v2 changes:
  - based on latest fprobe changes [1]
  - renaming the uapi interface to kprobe multi
  - adding support for sort_r to pass user pointer for swap functions
    and using that in cookie support to keep just single functions array
  - moving new link to kernel/trace/bpf_trace.c file
  - using single fprobe callback function for entry and exit
  - using kvzalloc, libbpf_ensure_mem functions
  - adding new k[ret]probe.multi sections instead of using current kprobe
  - used glob_match from test_progs.c, added '?' matching
  - move bpf_get_func_ip verifier inline change to a separate change
  - couple of other minor fixes
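
For the glob matching item in the list above, a matcher that supports '*' and '?' can be sketched as follows. This is an illustrative recursive version with a hypothetical name (`toy_glob_match`), not the glob_match code from test_progs.c or libbpf.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if str matches pat, where '*' matches any run of characters
 * (including none) and '?' matches exactly one character.
 */
static bool toy_glob_match(const char *str, const char *pat)
{
	assert(str && pat);

	/* Consume literal and '?' characters up to the first '*'. */
	while (*str && *pat && *pat != '*') {
		if (*pat != '?' && *pat != *str)
			return false;
		str++;
		pat++;
	}

	if (*pat == '*') {
		/* Collapse consecutive '*' characters. */
		while (*pat == '*')
			pat++;
		/* A trailing '*' matches whatever is left. */
		if (!*pat)
			return true;
		/* Try every remaining suffix of str against the rest of the pattern. */
		for (; *str; str++)
			if (toy_glob_match(str, pat))
				return true;
		return false;
	}

	/* No '*' left: both strings must be exhausted together. */
	return !*str && !*pat;
}
```

With this sketch, a pattern such as `x*` matches every symbol starting with `x`, and `bpf_fentry_test?` matches `bpf_fentry_test1` but not `bpf_fentry_test`.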

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/kprobe_multi

thanks,
jirka

[1] https://lore.kernel.org/bpf/164458044634.586276.3261555265565111183.stgit@devnote2/
[2] https://lore.kernel.org/bpf/164735281449.1084943.12438881786173547153.stgit@devnote2/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

5a5c11ee