Commits · cc9b22dfa735800980e7362f02aff6f1c2280996 · Kirill Smelkov / linux

21 Mar, 2024 1 commit

bpftool: Clean up HOST_CFLAGS, HOST_LDFLAGS for bootstrap bpftool · cc9b22df

Quentin Monnet authored Mar 20, 2024

Bpftool's Makefile uses $(HOST_CFLAGS) to build the bootstrap version of
bpftool, in order to pick the flags for the host (where we run the
bootstrap version) and not for the target system (where we plan to run
the full bpftool binary). But we pass too much information through this
variable.

In particular, we set HOST_CFLAGS by copying most of the $(CFLAGS); but
we do this after the feature detection for bpftool, which means that
$(CFLAGS), hence $(HOST_CFLAGS), contain all macro definitions for using
the different optional features. For example, -DHAVE_LLVM_SUPPORT may be
passed to the $(HOST_CFLAGS), even though the LLVM disassembler is not
used in the bootstrap version, and the related library may even be
missing for the host architecture.

A similar thing happens with the $(LDFLAGS), that we use unchanged for
linking the bootstrap version even though they may contains flags to
link against additional libraries.

To address the $(HOST_CFLAGS) issue, we move the definition of
$(HOST_CFLAGS) earlier in the Makefile, before the $(CFLAGS) update
resulting from the feature probing - none of which being relevant to the
bootstrap version. To clean up the $(LDFLAGS) for the bootstrap version,
we introduce a dedicated $(HOST_LDFLAGS) variable that we base on
$(LDFLAGS), before the feature probing as well.

On my setup, the following macro and libraries are removed from the
compiler invocation to build bpftool after this patch:

  -DUSE_LIBCAP
  -DHAVE_LLVM_SUPPORT
  -I/usr/lib/llvm-17/include
  -D_GNU_SOURCE
  -D__STDC_CONSTANT_MACROS
  -D__STDC_FORMAT_MACROS
  -D__STDC_LIMIT_MACROS
  -lLLVM-17
  -L/usr/lib/llvm-17/lib

Another advantage of cleaning up these flags is that displaying
available features with "bpftool version" becomes more accurate for the
bootstrap bpftool, and no longer reflects the features detected (and
available only) for the final binary.

Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Message-ID: <20240320014103.45641-1-qmo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

cc9b22df

20 Mar, 2024 9 commits

selftests/bpf: scale benchmark counting by using per-CPU counters · 520fad2e

Andrii Nakryiko authored Mar 15, 2024

When benchmarking with multiple threads (-pN, where N>1), we start
contending on single atomic counter that both BPF trigger benchmarks are
using, as well as "baseline" tests in user space (trig-base and
trig-uprobe-base benchmarks). As such, we start bottlenecking on
something completely irrelevant to benchmark at hand.

Scale counting up by using per-CPU counters on BPF side. On use space
side we do the next best thing: hash thread ID to approximate per-CPU
behavior. It seems to work quite well in practice.

To demonstrate the difference, I ran three benchmarks with 1, 2, 4, 8,
16, and 32 threads:
  - trig-uprobe-base (no syscalls, pure tight counting loop in user-space);
  - trig-base (get_pgid() syscall, atomic counter in user-space);
  - trig-fentry (syscall to trigger fentry program, atomic uncontended per-CPU
    counter on BPF side).

Command used:

  for b in uprobe-base base fentry; do \
    for p in 1 2 4 8 16 32; do \
      printf "%-11s %2d: %s\n" $b $p \
        "$(sudo ./bench -w2 -d5 -a -p$p trig-$b | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-)"; \
    done; \
  done

Before these changes, aggregate throughput across all threads doesn't
scale well with number of threads, it actually even falls sharply for
uprobe-base due to a very high contention:

  uprobe-base  1:  138.998 ± 0.650M/s
  uprobe-base  2:   70.526 ± 1.147M/s
  uprobe-base  4:   63.114 ± 0.302M/s
  uprobe-base  8:   54.177 ± 0.138M/s
  uprobe-base 16:   45.439 ± 0.057M/s
  uprobe-base 32:   37.163 ± 0.242M/s
  base         1:   16.940 ± 0.182M/s
  base         2:   19.231 ± 0.105M/s
  base         4:   21.479 ± 0.038M/s
  base         8:   23.030 ± 0.037M/s
  base        16:   22.034 ± 0.004M/s
  base        32:   18.152 ± 0.013M/s
  fentry       1:   14.794 ± 0.054M/s
  fentry       2:   17.341 ± 0.055M/s
  fentry       4:   23.792 ± 0.024M/s
  fentry       8:   21.557 ± 0.047M/s
  fentry      16:   21.121 ± 0.004M/s
  fentry      32:   17.067 ± 0.023M/s

After these changes, we see almost perfect linear scaling, as expected.
The sub-linear scaling when going from 8 to 16 threads is interesting
and consistent on my test machine, but I haven't investigated what is
causing it this peculiar slowdown (across all benchmarks, could be due
to hyperthreading effects, not sure).

  uprobe-base  1:  139.980 ± 0.648M/s
  uprobe-base  2:  270.244 ± 0.379M/s
  uprobe-base  4:  532.044 ± 1.519M/s
  uprobe-base  8: 1004.571 ± 3.174M/s
  uprobe-base 16: 1720.098 ± 0.744M/s
  uprobe-base 32: 3506.659 ± 8.549M/s
  base         1:   16.869 ± 0.071M/s
  base         2:   33.007 ± 0.092M/s
  base         4:   64.670 ± 0.203M/s
  base         8:  121.969 ± 0.210M/s
  base        16:  207.832 ± 0.112M/s
  base        32:  424.227 ± 1.477M/s
  fentry       1:   14.777 ± 0.087M/s
  fentry       2:   28.575 ± 0.146M/s
  fentry       4:   56.234 ± 0.176M/s
  fentry       8:  106.095 ± 0.385M/s
  fentry      16:  181.440 ± 0.032M/s
  fentry      32:  369.131 ± 0.693M/s
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240315213329.1161589-1-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

520fad2e

bpftool: Remove unnecessary source files from bootstrap version · e9a826dd

Quentin Monnet authored Mar 20, 2024

Commit d510296d ("bpftool: Use syscall/loader program in "prog load"
and "gen skeleton" command.") added new files to the list of objects to
compile in order to build the bootstrap version of bpftool. As far as I
can tell, these objects are unnecessary and were added by mistake; maybe
a draft version intended to add support for loading loader programs from
the bootstrap version. Anyway, we can remove these object files from the
list to make the bootstrap bpftool binary a tad smaller and faster to
build.

Fixes: d510296d ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.")
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Message-ID: <20240320013457.44808-1-qmo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

e9a826dd

bpftool: Enable libbpf logs when loading pid_iter in debug mode · be24a895

Quentin Monnet authored Mar 20, 2024

When trying to load the pid_iter BPF program used to iterate over the
PIDs of the processes holding file descriptors to BPF links, we would
unconditionally silence libbpf in order to keep the output clean if the
kernel does not support iterators and loading fails.

Although this is the desirable behaviour in most cases, this may hide
bugs in the pid_iter program that prevent it from loading, and it makes
it hard to debug such load failures, even in "debug" mode. Instead, it
makes more sense to print libbpf's logs when we pass the -d|--debug flag
to bpftool, so that users get the logs to investigate failures without
having to edit bpftool's source code.
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Message-ID: <20240320012241.42991-1-qmo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

be24a895

Merge branch 'bpf-raw-tracepoint-support-for-bpf-cookie' · 2e244a72

Alexei Starovoitov authored Mar 19, 2024

Andrii Nakryiko says:

====================
BPF raw tracepoint support for BPF cookie

Add ability to specify and retrieve BPF cookie for raw tracepoint programs.
Both BTF-aware (SEC("tp_btf")) and non-BTF-aware (SEC("raw_tp")) are
supported, as they are exactly the same at runtime.

This issue recently came up in production use cases, where custom tried to
switch from slower classic tracepoints to raw tracepoints and ran into this
limitation. Luckily, it's not that hard to support this for raw_tp programs.

v2->v3:
  - s/bpf_raw_tp_open/bpf_raw_tracepoint_open_opts/ (Alexei, Eduard);
v1->v2:
  - fixed type definition for stubs of bpf_probe_{register,unregister};
  - added __u32 :u32 and aligned raw_tp fields (Jiri);
  - added Stanislav's ack.
====================

Link: https://lore.kernel.org/r/20240319233852.1977493-1-andrii@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

2e244a72

selftests/bpf: add raw_tp/tp_btf BPF cookie subtests · 51146ff0

Andrii Nakryiko authored Mar 19, 2024

Add test validating BPF cookie can be passed during raw_tp/tp_btf
attachment and can be retried at runtime with bpf_get_attach_cookie()
helper.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-6-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

51146ff0

libbpf: add support for BPF cookie for raw_tp/tp_btf programs · 36ffb202

Andrii Nakryiko authored Mar 19, 2024

Wire up BPF cookie passing or raw_tp and tp_btf programs, both in
low-level and high-level APIs.
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-5-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

36ffb202

bpf: support BPF cookie in raw tracepoint (raw_tp, tp_btf) programs · 68ca5d4e

Andrii Nakryiko authored Mar 19, 2024

Wire up BPF cookie for raw tracepoint programs (both BTF and non-BTF
aware variants). This brings them up to part w.r.t. BPF cookie usage
with classic tracepoint and fentry/fexit programs.
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-4-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

68ca5d4e

bpf: pass whole link instead of prog when triggering raw tracepoint · d4dfc570

Andrii Nakryiko authored Mar 19, 2024

Instead of passing prog as an argument to bpf_trace_runX() helpers, that
are called from tracepoint triggering calls, store BPF link itself
(struct bpf_raw_tp_link for raw tracepoints). This will allow to pass
extra information like BPF cookie into raw tracepoint registration.

Instead of replacing `struct bpf_prog *prog = __data;` with
corresponding `struct bpf_raw_tp_link *link = __data;` assignment in
`__bpf_trace_##call` I just passed `__data` through into underlying
bpf_trace_runX() call. This works well because we implicitly cast `void *`,
and it also avoids naming clashes with arguments coming from
tracepoint's "proto" list. We could have run into the same problem with
"prog", we just happened to not have a tracepoint that has "prog" input
argument. We are less lucky with "link", as there are tracepoints using
"link" argument name already. So instead of trying to avoid naming
conflicts, let's just remove intermediate local variable. It doesn't
hurt readibility, it's either way a bit of a maze of calls and macros,
that requires careful reading.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-3-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

d4dfc570

bpf: flatten bpf_probe_register call chain · 6b9c2950

Andrii Nakryiko authored Mar 19, 2024

bpf_probe_register() and __bpf_probe_register() have identical
signatures and bpf_probe_register() just redirect to
__bpf_probe_register(). So get rid of this extra function call step to
simplify following the source code.

It has no difference at runtime due to inlining, of course.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-2-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

6b9c2950

19 Mar, 2024 8 commits

selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh · f803bcf9

Alessandro Carminati (Red Hat) authored Mar 14, 2024

In some systems, the netcat server can incur in delay to start listening.
When this happens, the test can randomly fail in various points.
This is an example error message:

   # ip gre none gso
   # encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
   # test basic connectivity
   # Ncat: Connection refused.

The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gmail.com

f803bcf9

Merge branch 'current_pid_tgid-for-all-prog-types' · 437ffcb0

Andrii Nakryiko authored Mar 19, 2024

Yonghong Song says:

====================
current_pid_tgid() for all prog types

Currently bpf_get_current_pid_tgid() is allowed in tracing, cgroup
and sk_msg progs while bpf_get_ns_current_pid_tgid() is only allowed
in tracing progs.

We have an internal use case where for an application running
in a container (with pid namespace), user wants to get
the pid associated with the pid namespace in a cgroup bpf
program. Besides cgroup, the only prog type, supporting
bpf_get_current_pid_tgid() but not bpf_get_ns_current_pid_tgid(),
is sk_msg.

But actually both bpf_get_current_pid_tgid() and
bpf_get_ns_current_pid_tgid() helpers do not reveal kernel internal
data and there is no reason that they cannot be used in other
program types. This patch just did this and enabled these
two helpers for all program types.

Patch 1 added the kernel support and patches 2-5 added
the test for cgroup and sk_msg.

Change logs:
  v1 -> v2:
    - allow bpf_get_[ns_]current_pid_tgid() for all prog types.
    - for network related selftests, using netns.
====================

Link: https://lore.kernel.org/r/20240315184849.2974556-1-yonghong.song@linux.devSigned-off-by: Andrii Nakryiko <andrii@kernel.org>

437ffcb0

selftests/bpf: Add a sk_msg prog bpf_get_ns_current_pid_tgid() test · 4c195ee4

Yonghong Song authored Mar 15, 2024

Add a sk_msg bpf program test where the program is running in a pid
namespace. The test is successful:
  #165/4   ns_current_pid_tgid/new_ns_sk_msg:OK
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184915.2976718-1-yonghong.song@linux.dev

4c195ee4

selftests/bpf: Add a cgroup prog bpf_get_ns_current_pid_tgid() test · 87ade6cd

Yonghong Song authored Mar 15, 2024

Add a cgroup bpf program test where the bpf program is running
in a pid namespace. The test is successfully:
  #165/3   ns_current_pid_tgid/new_ns_cgrp:OK
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184910.2976522-1-yonghong.song@linux.dev

87ade6cd

selftests/bpf: Refactor out some functions in ns_current_pid_tgid test · 4d4bd29e

Yonghong Song authored Mar 15, 2024

Refactor some functions in both user space code and bpf program
as these functions are used by later cgroup/sk_msg tests.
Another change is to mark tp program optional loading as later
patches will use optional loading as well since they have quite
different attachment and testing logic.

There is no functionality change.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184904.2976123-1-yonghong.song@linux.dev

4d4bd29e

selftests/bpf: Replace CHECK with ASSERT_* in ns_current_pid_tgid test · 84239a24

Yonghong Song authored Mar 15, 2024

Replace CHECK in selftest ns_current_pid_tgid with recommended ASSERT_* style.
I also shortened subtest name as the prefix of subtest name is covered
by the test name already.

This patch does fix a testing issue. Currently even if bss->user_{pid,tgid}
is not correct, the test still passed since the clone func returns 0.
I fixed it to return a non-zero value if bss->user_{pid,tgid} is incorrect.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184859.2975543-1-yonghong.song@linux.dev

84239a24

bpf: Allow helper bpf_get_[ns_]current_pid_tgid() for all prog types · eb166e52

Yonghong Song authored Mar 15, 2024

Currently bpf_get_current_pid_tgid() is allowed in tracing, cgroup
and sk_msg progs while bpf_get_ns_current_pid_tgid() is only allowed
in tracing progs.

We have an internal use case where for an application running
in a container (with pid namespace), user wants to get
the pid associated with the pid namespace in a cgroup bpf
program. Currently, cgroup bpf progs already allow
bpf_get_current_pid_tgid(). Let us allow bpf_get_ns_current_pid_tgid()
as well.

With auditing the code, bpf_get_current_pid_tgid() is also used
by sk_msg prog. But there are no side effect to expose these two
helpers to all prog types since they do not reveal any kernel specific
data. The detailed discussion is in [1].

So with this patch, both bpf_get_current_pid_tgid() and bpf_get_ns_current_pid_tgid()
are put in bpf_base_func_proto(), making them available to all
program types.

  [1] https://lore.kernel.org/bpf/20240307232659.1115872-1-yonghong.song@linux.dev/Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240315184854.2975190-1-yonghong.song@linux.dev

eb166e52

bpf/lpm_trie: Inline longest_prefix_match for fastpath · 1a4a0cb7

Jesper Dangaard Brouer authored Mar 18, 2024

The BPF map type LPM (Longest Prefix Match) is used heavily
in production by multiple products that have BPF components.
Perf data shows trie_lookup_elem() and longest_prefix_match()
being part of kernels perf top.

For every level in the LPM tree trie_lookup_elem() calls out
to longest_prefix_match().  The compiler is free to inline this
call, but chooses not to inline, because other slowpath callers
(that can be invoked via syscall) exists like trie_update_elem(),
trie_delete_elem() or trie_get_next_key().

 bcc/tools/funccount -Ti 1 'trie_lookup_elem|longest_prefix_match.isra.0'
 FUNC                                    COUNT
 trie_lookup_elem                       664945
 longest_prefix_match.isra.0           8101507

Observation on a single random machine shows a factor 12 between
the two functions. Given an average of 12 levels in the trie being
searched.

This patch force inlining longest_prefix_match(), but only for
the lookup fastpath to balance object instruction size.

In production with AMD CPUs, measuring the function latency of
'trie_lookup_elem' (bcc/tools/funclatency) we are seeing an improvement
function latency reduction 7-8% with this patch applied (to production
kernels 6.6 and 6.1). Analyzing perf data, we can explain this rather
large improvement due to reducing the overhead for AMD side-channel
mitigation SRSO (Speculative Return Stack Overflow).

Fixes: fb3bd914 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/171076828575.2141737.18370644069389889027.stgit@firesoul

1a4a0cb7

18 Mar, 2024 4 commits

bpf: Check return from set_memory_rox() · c733239f

Christophe Leroy authored Mar 16, 2024

arch_protect_bpf_trampoline() and alloc_new_pack() call
set_memory_rox() which can fail, leading to unprotected memory.

Take into account return from set_memory_rox() function and add
__must_check flag to arch_protect_bpf_trampoline().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/fe1c163c83767fde5cab31d209a4a6be3ddb3a73.1710574353.git.christophe.leroy@csgroup.euSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

c733239f

bpf: Remove arch_unprotect_bpf_trampoline() · e3362acd

Christophe Leroy authored Mar 16, 2024

Last user of arch_unprotect_bpf_trampoline() was removed by
commit 187e2af0 ("bpf: struct_ops supports more than one page for
trampolines.")

Remove arch_unprotect_bpf_trampoline()
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 187e2af0 ("bpf: struct_ops supports more than one page for trampolines.")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/r/42c635bb54d3af91db0f9b85d724c7c290069f67.1710574353.git.christophe.leroy@csgroup.euSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

e3362acd

libbpbpf: Check bpf_map/bpf_program fd validity · 7b30c296

Mykyta Yatsenko authored Mar 18, 2024

libbpf creates bpf_program/bpf_map structs for each program/map that
user defines, but it allows to disable creating/loading those objects in
kernel, in that case they won't have associated file descriptor
(fd < 0). Such functionality is used for backward compatibility
with some older kernels.

Nothing prevents users from passing these maps or programs with no
kernel counterpart to libbpf APIs. This change introduces explicit
checks for kernel objects existence, aiming to improve visibility of
those edge cases and provide meaningful warnings to users.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240318131808.95959-1-yatsenko@meta.com

7b30c296

bpf: Remove unnecessary err < 0 check in bpf_struct_ops_map_update_elem · 7f3edd0c

Martin KaFai Lau authored Mar 15, 2024

There is a "if (err)" check earlier, so the "if (err < 0)"
check that this patch removing is unnecessary. It was my overlook
when making adjustments to the bpf_struct_ops_prepare_trampoline()
such that the caller does not have to worry about the new page when
the function returns error.

Fixes: 187e2af0 ("bpf: struct_ops supports more than one page for trampolines.")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20240315192112.2825039-1-martin.lau@linux.dev

7f3edd0c

15 Mar, 2024 4 commits

selftests/bpf: Remove second semicolon · 4c8644f8

Colin Ian King authored Mar 15, 2024

There are statements with two semicolons. Remove the second one, it
is redundant.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240315092654.2431062-1-colin.i.king@gmail.com

4c8644f8

bpf: Take return from set_memory_rox() into account with bpf_jit_binary_lock_ro() · e60adf51

Christophe Leroy authored Mar 08, 2024

set_memory_rox() can fail, leaving memory unprotected.

Check return and bail out when bpf_jit_binary_lock_ro() returns
an error.

Link: https://github.com/KSPP/linux/issues/7Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> # LoongArch
Reviewed-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> # MIPS Part
Message-ID: <036b6393f23a2032ce75a1c92220b2afcb798d5d.1709850515.git.christophe.leroy@csgroup.eu>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

e60adf51

bpf: Take return from set_memory_ro() into account with bpf_prog_lock_ro() · 7d2cc63e

Christophe Leroy authored Mar 08, 2024

set_memory_ro() can fail, leaving memory unprotected.

Check its return and take it into account as an error.

Link: https://github.com/KSPP/linux/issues/7Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Message-ID: <286def78955e04382b227cb3e4b6ba272a7442e3.1709850515.git.christophe.leroy@csgroup.eu>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

7d2cc63e

bpf: preserve sleepable bit in subprog info · 4d8926a0

Andrii Nakryiko authored Mar 13, 2024

Copy over main program's sleepable bit into subprog's info. This might
be important for, e.g., freplace cases.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Message-ID: <20240314000127.3881569-1-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

4d8926a0

14 Mar, 2024 6 commits

Merge branch 'ignore-additional-fields-in-the-struct_ops-maps-in-an-updated-version' · 6cda7e17

Andrii Nakryiko authored Mar 14, 2024

Kui-Feng Lee says:

====================
Ignore additional fields in the struct_ops maps in an updated version.

According to an offline discussion, it would be beneficial to
implement a backward-compatible method for struct_ops types with
additional fields that are not present in older kernels.

This patchset accepts additional fields of a struct_ops map with all
zero values even if these fields are not in the corresponding type in
the kernel. This provides a way to be backward compatible. User space
programs can use the same map on a machine running an old kernel by
clearing fields that do not exist in the kernel.

For example, in a test case, it adds an additional field "zeroed" that
doesn't exist in struct bpf_testmod_ops of the kernel.

    struct bpf_testmod_ops___zeroed {
    	int (*test_1)(void);
    	void (*test_2)(int a, int b);
    	int (*test_maybe_null)(int dummy, struct task_struct *task);
    	int zeroed;
    };

    SEC(".struct_ops.link")
    struct bpf_testmod_ops___zeroed testmod_zeroed = {
    	.test_1 = (void *)test_1,
    	.test_2 = (void *)test_2_v2,
    };

Here, it doesn't assign a value to "zeroed" of testmod_zeroed, and by
default the value of this field will be zero. So, the map will be
accepted by libbpf, but libbpf will skip the "zeroed" field. However,
if the "zeroed" field is assigned to any value other than "0", libbpf
will reject to load this map.
---
Changes from v1:

 - Fix the issue about function pointer fields.

 - Change a warning message, and add an info message for skipping
   fields.

 - Add a small demo of additional arguments that are not in the
   function pointer prototype in the kernel.

v1: https://lore.kernel.org/all/20240312183245.341141-1-thinker.li@gmail.com/

Kui-Feng Lee (3):
  libbpf: Skip zeroed or null fields if not found in the kernel type.
  selftests/bpf: Ensure libbpf skip all-zeros fields of struct_ops maps.
  selftests/bpf: Accept extra arguments if they are not used.

 tools/lib/bpf/libbpf.c                        |  24 +++-
 .../bpf/prog_tests/test_struct_ops_module.c   | 103 ++++++++++++++++++
 .../bpf/progs/struct_ops_extra_arg.c          |  49 +++++++++
 .../selftests/bpf/progs/struct_ops_module.c   |  16 ++-
 4 files changed, 186 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_extra_arg.c
====================

Link: https://lore.kernel.org/r/20240313214139.685112-1-thinker.li@gmail.comSigned-off-by: Andrii Nakryiko <andrii@kernel.org>

6cda7e17

selftests/bpf: Ensure libbpf skip all-zeros fields of struct_ops maps. · 26a7cf2b

Kui-Feng Lee authored Mar 13, 2024

A new version of a type may have additional fields that do not exist in
older versions. Previously, libbpf would reject struct_ops maps with a new
version containing extra fields when running on a machine with an old
kernel. However, we have updated libbpf to ignore these fields if their
values are all zeros or null in order to provide backward compatibility.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240313214139.685112-3-thinker.li@gmail.com

26a7cf2b

libbpf: Skip zeroed or null fields if not found in the kernel type. · c911fc61

Kui-Feng Lee authored Mar 13, 2024

Accept additional fields of a struct_ops type with all zero values even if
these fields are not in the corresponding type in the kernel. This provides
a way to be backward compatible. User space programs can use the same map
on a machine running an old kernel by clearing fields that do not exist in
the kernel.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240313214139.685112-2-thinker.li@gmail.com

c911fc61

libbpf: Prevent null-pointer dereference when prog to load has no BTF · 9bf48fa1

Quentin Monnet authored Mar 14, 2024

In bpf_objec_load_prog(), there's no guarantee that obj->btf is non-NULL
when passing it to btf__fd(), and this function does not perform any
check before dereferencing its argument (as bpf_object__btf_fd() used to
do). As a consequence, we get segmentation fault errors in bpftool (for
example) when trying to load programs that come without BTF information.

v2: Keep btf__fd() in the fix instead of reverting to bpf_object__btf_fd().

Fixes: df7c3f7d ("libbpf: make uniform use of btf__fd() accessor inside libbpf")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240314150438.232462-1-qmo@kernel.org

9bf48fa1

bpftool: Fix missing pids during link show · fe879bb4

Yonghong Song authored Mar 11, 2024

Current 'bpftool link' command does not show pids, e.g.,
  $ tools/build/bpftool/bpftool link
  ...
  4: tracing  prog 23
        prog_type lsm  attach_type lsm_mac
        target_obj_id 1  target_btf_id 31320

Hack the following change to enable normal libbpf debug output,
  --- a/tools/bpf/bpftool/pids.c
  +++ b/tools/bpf/bpftool/pids.c
  @@ -121,9 +121,9 @@ int build_obj_refs_table(struct hashmap **map, enum bpf_obj_type type)
          /* we don't want output polluted with libbpf errors if bpf_iter is not
           * supported
           */
  -       default_print = libbpf_set_print(libbpf_print_none);
  +       /* default_print = libbpf_set_print(libbpf_print_none); */
          err = pid_iter_bpf__load(skel);
  -       libbpf_set_print(default_print);
  +       /* libbpf_set_print(default_print); */

Rerun the above bpftool command:
  $ tools/build/bpftool/bpftool link
  libbpf: prog 'iter': BPF program load failed: Permission denied
  libbpf: prog 'iter': -- BEGIN PROG LOAD LOG --
  0: R1=ctx() R10=fp0
  ; struct task_struct *task = ctx->task; @ pid_iter.bpf.c:69
  0: (79) r6 = *(u64 *)(r1 +8)          ; R1=ctx() R6_w=ptr_or_null_task_struct(id=1)
  ; struct file *file = ctx->file; @ pid_iter.bpf.c:68
  ...
  ; struct bpf_link *link = (struct bpf_link *) file->private_data; @ pid_iter.bpf.c:103
  80: (79) r3 = *(u64 *)(r8 +432)       ; R3_w=scalar() R8=ptr_file()
  ; if (link->type == bpf_core_enum_value(enum bpf_link_type___local, @ pid_iter.bpf.c:105
  81: (61) r1 = *(u32 *)(r3 +12)
  R3 invalid mem access 'scalar'
  processed 39 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2
  -- END PROG LOAD LOG --
  libbpf: prog 'iter': failed to load: -13
  ...

The 'file->private_data' returns a 'void' type and this caused subsequent 'link->type'
(insn #81) failed in verification.

To fix the issue, restore the previous BPF_CORE_READ so old kernels can also work.
With this patch, the 'bpftool link' runs successfully with 'pids'.
  $ tools/build/bpftool/bpftool link
  ...
  4: tracing  prog 23
        prog_type lsm  attach_type lsm_mac
        target_obj_id 1  target_btf_id 31320
        pids systemd(1)

Fixes: 44ba7b30 ("bpftool: Use a local copy of BPF_LINK_TYPE_PERF_EVENT in pid_iter.bpf.c")
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Quentin Monnet <quentin@isovalent.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20240312023249.3776718-1-yonghong.song@linux.dev

fe879bb4

bpftool: Cast pointers for shadow types explicitly. · c2a0257c

Kui-Feng Lee authored Mar 11, 2024

According to a report, skeletons fail to assign shadow pointers when being
compiled with C++ programs. Unlike C doing implicit casting for void
pointers, C++ requires an explicit casting.

To support C++, we do explicit casting for each shadow pointer.

Also add struct_ops_module.skel.h to test_cpp to validate C++
compilation as part of BPF selftests.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20240312013726.1780720-1-thinker.li@gmail.com

c2a0257c

13 Mar, 2024 1 commit

Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 9187210e

Linus Torvalds authored Mar 12, 2024

Pull networking updates from Jakub Kicinski:
"Core & protocols:

- Large effort by Eric to lower rtnl_lock pressure and remove locks:

- Make commonly used parts of rtnetlink (address, route dumps
etc) lockless, protected by RCU instead of rtnl_lock.

- Add a netns exit callback which already holds rtnl_lock,
allowing netns exit to take rtnl_lock once in the core instead
of once for each driver / callback.

- Remove locks / serialization in the socket diag interface.

- Remove 6 calls to synchronize_rcu() while holding rtnl_lock.

- Remove the dev_base_lock, depend on RCU where necessary.

- Support busy polling on a per-epoll context basis. Poll length and
budget parameters can be set independently of system defaults.

- Introduce struct net_hotdata, to make sure read-mostly global
config variables fit in as few cache lines as possible.

- Add optional per-nexthop statistics to ease monitoring / debug of
ECMP imbalance problems.

- Support TCP_NOTSENT_LOWAT in MPTCP.

- Ensure that IPv6 temporary addresses' preferred lifetimes are long
enough, compared to other configured lifetimes, and at least 2 sec.

- Support forwarding of ICMP Error messages in IPSec, per RFC 4301.

- Add support for the independent control state machine for bonding
per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
control state machine.

- Add "network ID" to MCTP socket APIs to support hosts with multiple
disjoint MCTP networks.

- Re-use the mono_delivery_time skbuff bit for packets which user
space wants to be sent at a specified time. Maintain the timing
information while traversing veth links, bridge etc.

- Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.

- Simplify many places iterating over netdevs by using an xarray
instead of a hash table walk (hash table remains in place, for use
on fastpaths).

- Speed up scanning for expired routes by keeping a dedicated list.

- Speed up "generic" XDP by trying harder to avoid large allocations.

- Support attaching arbitrary metadata to netconsole messages.

Things we sprinkled into general kernel code:

- Enforce VM_IOREMAP flag and range in ioremap_page_range and
introduce VM_SPARSE kind and vm_area_[un]map_pages (used by
bpf_arena).

- Rework selftest harness to enable the use of the full range of ksft
exit code (pass, fail, skip, xfail, xpass).

Netfilter:

- Allow userspace to define a table that is exclusively owned by a
daemon (via netlink socket aliveness) without auto-removing this
table when the userspace program exits. Such table gets marked as
orphaned and a restarting management daemon can re-attach/regain
ownership.

- Speed up element insertions to nftables' concatenated-ranges set
type. Compact a few related data structures.

BPF:

- Add BPF token support for delegating a subset of BPF subsystem
functionality from privileged system-wide daemons such as systemd
through special mount options for userns-bound BPF fs to a trusted
& unprivileged application.

- Introduce bpf_arena which is sparse shared memory region between
BPF program and user space where structures inside the arena can
have pointers to other areas of the arena, and pointers work
seamlessly for both user-space programs and BPF programs.

- Introduce may_goto instruction that is a contract between the
verifier and the program. The verifier allows the program to loop
assuming it's behaving well, but reserves the right to terminate
it.

- Extend the BPF verifier to enable static subprog calls in spin lock
critical sections.

- Support registration of struct_ops types from modules which helps
projects like fuse-bpf that seeks to implement a new struct_ops
type.

- Add support for retrieval of cookies for perf/kprobe multi links.

- Support arbitrary TCP SYN cookie generation / validation in the TC
layer with BPF to allow creating SYN flood handling in BPF
firewalls.

- Add code generation to inline the bpf_kptr_xchg() helper which
improves performance when stashing/popping the allocated BPF
objects.

Wireless:

- Add SPP (signaling and payload protected) AMSDU support.

- Support wider bandwidth OFDMA, as required for EHT operation.

Driver API:

- Major overhaul of the Energy Efficient Ethernet internals to
support new link modes (2.5GE, 5GE), share more code between
drivers (especially those using phylib), and encourage more
uniform behavior. Convert and clean up drivers.

- Define an API for querying per netdev queue statistics from
drivers.

- IPSec: account in global stats for fully offloaded sessions.

- Create a concept of Ethernet PHY Packages at the Device Tree level,
to allow parameterizing the existing PHY package code.

- Enable Rx hashing (RSS) on GTP protocol fields.

Misc:

- Improvements and refactoring all over networking selftests.

- Create uniform module aliases for TC classifiers, actions, and
packet schedulers to simplify creating modprobe policies.

- Address all missing MODULE_DESCRIPTION() warnings in networking.

- Extend the Netlink descriptions in YAML to cover message
encapsulation or "Netlink polymorphism", where interpretation of
nested attributes depends on link type, classifier type or some
other "class type".

Drivers:

- Ethernet high-speed NICs:
- Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
- Intel (100G, ice, idpf):
- support E825-C devices
- nVidia/Mellanox:
- support devices with one port and multiple PCIe links
- Broadcom (bnxt):
- support n-tuple filters
- support configuring the RSS key
- Wangxun (ngbe/txgbe):
- implement irq_domain for TXGBE's sub-interrupts
- Pensando/AMD:
- support XDP
- optimize queue submission and wakeup handling (+17% bps)
- optimize struct layout, saving 28% of memory on queues

- Ethernet NICs embedded and virtual:
- Google cloud vNIC:
- refactor driver to perform memory allocations for new queue
config before stopping and freeing the old queue memory
- Synopsys (stmmac):
- obey queueMaxSDU and implement counters required by 802.1Qbv
- Renesas (ravb):
- support packet checksum offload
- suspend to RAM and runtime PM support

- Ethernet switches:
- nVidia/Mellanox:
- support for nexthop group statistics
- Microchip:
- ksz8: implement PHY loopback
- add support for KSZ8567, a 7-port 10/100Mbps switch

- PTP:
- New driver for RENESAS FemtoClock3 Wireless clock generator.
- Support OCP PTP cards designed and built by Adva.

- CAN:
- Support recvmsg() flags for own, local and remote traffic on CAN
BCM sockets.
- Support for esd GmbH PCIe/402 CAN device family.
- m_can:
- Rx/Tx submission coalescing
- wake on frame Rx

- WiFi:
- Intel (iwlwifi):
- enable signaling and payload protected A-MSDUs
- support wider-bandwidth OFDMA
- support for new devices
- bump FW API to 89 for AX devices; 90 for BZ/SC devices
- MediaTek (mt76):
- mt7915: newer ADIE version support
- mt7925: radio temperature sensor support
- Qualcomm (ath11k):
- support 6 GHz station power modes: Low Power Indoor (LPI),
Standard Power) SP and Very Low Power (VLP)
- QCA6390 & WCN6855: support 2 concurrent station interfaces
- QCA2066 support
- Qualcomm (ath12k):
- refactoring in preparation for Multi-Link Operation (MLO)
support
- 1024 Block Ack window size support
- firmware-2.bin support
- support having multiple identical PCI devices (firmware needs
to have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
- QCN9274: support split-PHY devices
- WCN7850: enable Power Save Mode in station mode
- WCN7850: P2P support
- RealTek:
- rtw88: support for more rtw8811cu and rtw8821cu devices
- rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
- rtlwifi: speed up USB firmware initialization
- rtwl8xxxu:
- RTL8188F: concurrent interface support
- Channel Switch Announcement (CSA) support in AP mode
- Broadcom (brcmfmac):
- per-vendor feature support
- per-vendor SAE password setup
- DMI nvram filename quirk for ACEPC W5 Pro"

* tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits)
nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y
nexthop: Fix out-of-bounds access during attribute validation
nexthop: Only parse NHA_OP_FLAGS for dump messages that require it
nexthop: Only parse NHA_OP_FLAGS for get messages that require it
bpf: move sleepable flag from bpf_prog_aux to bpf_prog
bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes()
selftests/bpf: Add kprobe multi triggering benchmarks
ptp: Move from simple ida to xarray
vxlan: Remove generic .ndo_get_stats64
vxlan: Do not alloc tstats manually
devlink: Add comments to use netlink gen tool
nfp: flower: handle acti_netdevs allocation failure
net/packet: Add getsockopt support for PACKET_COPY_THRESH
net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID
selftests/bpf: Add bpf_arena_htab test.
selftests/bpf: Add bpf_arena_list test.
selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
bpf: Add helper macro bpf_addr_space_cast()
libbpf: Recognize __arena global variables.
bpftool: Recognize arena map type
...

9187210e

12 Mar, 2024 7 commits

Merge tag 'docs-6.9' of git://git.lwn.net/linux · 1f440397

Linus Torvalds authored Mar 12, 2024

Pull documentation updates from Jonathan Corbet:
 "A moderatly busy cycle for development this time around.

   - Some cleanup of the main index page for easier navigation

   - Rework some of the other top-level pages for better readability
     and, with luck, fewer merge conflicts in the future.

   - Submit-checklist improvements, hopefully the first of many.

   - New Italian translations

   - A fair number of kernel-doc fixes and improvements. We have also
     dropped the recommendation to use an old version of Sphinx.

   - A new document from Thorsten on bisection

  ... and lots of fixes and updates"

* tag 'docs-6.9' of git://git.lwn.net/linux: (54 commits)
  docs: verify/bisect: fixes, finetuning, and support for Arch
  docs: Makefile: Add dependency to $(YNL_INDEX) for targets other than htmldocs
  docs: Move ja_JP/howto.rst to ja_JP/process/howto.rst
  docs: submit-checklist: use subheadings
  docs: submit-checklist: structure by category
  docs: new text on bisecting which also covers bug validation
  docs: drop the version constraints for sphinx and dependencies
  docs: kerneldoc-preamble.sty: Remove code for Sphinx <2.4
  docs: Restore "smart quotes" for quotes
  docs/zh_CN: accurate translation of "function"
  docs: Include simplified link titles in main index
  docs: Correct formatting of title in admin-guide/index.rst
  docs: kernel_feat.py: fix build error for missing files
  MAINTAINERS: Set the field name for subsystem profile section
  kasan: Add documentation for CONFIG_KASAN_EXTRA_INFO
  Fixed case issue with 'fault-injection' in documentation
  kernel-doc: handle #if in enums as well
  Documentation: update mailing list addresses
  doc: kerneldoc.py: fix indentation
  scripts/kernel-doc: simplify signature printing
  ...

1f440397

Merge tag 'audit-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · 3749bda2

Linus Torvalds authored Mar 12, 2024

Pull audit updates from Paul Moore:
 "Two small audit patches:

   - Use the KMEM_CACHE() macro instead of kmem_cache_create()

     The guidance appears to be to use the KMEM_CACHE() macro when
     possible and there is no reason why we can't use the macro, so
     let's use it.

   - Remove an unnecessary assignment in audit_dupe_lsm_field()

     A return value variable was assigned a value in its declaration,
     but the declaration value is overwritten before the return value
     variable is ever referenced; drop the assignment at declaration
     time"

* tag 'audit-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: use KMEM_CACHE() instead of kmem_cache_create()
  audit: remove unnecessary assignment in audit_dupe_lsm_field()

3749bda2

Merge tag 'Smack-for-6.9' of https://github.com/cschaufler/smack-next · 681ba318

Linus Torvalds authored Mar 12, 2024

Pull smack updates from Casey Schaufler:

 - Improvements to the initialization of in-memory inodes

 - A fix in ramfs to propery ensure the initialization of in-memory
   inodes

 - Removal of duplicated code in smack_cred_transfer()

* tag 'Smack-for-6.9' of https://github.com/cschaufler/smack-next:
  Smack: use init_task_smack() in smack_cred_transfer()
  ramfs: Initialize security of in-memory inodes
  smack: Initialize the in-memory inode in smack_inode_init_security()
  smack: Always determine inode labels in smack_inode_init_security()
  smack: Handle SMACK64TRANSMUTE in smack_inode_setsecurity()
  smack: Set SMACK64TRANSMUTE only for dirs in smack_inode_setxattr()

681ba318

Merge tag 'seccomp-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 7f1a2774

Linus Torvalds authored Mar 12, 2024

Pull seccomp updates from Kees Cook:
 "There are no core kernel changes here; it's entirely selftests and
  samples:

   - Improve reliability of selftests (Terry Tritton, Kees Cook)

   - Fix strict-aliasing warning in samples (Arnd Bergmann)"

* tag 'seccomp-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  samples: user-trap: fix strict-aliasing warning
  selftests/seccomp: Pin benchmark to single CPU
  selftests/seccomp: user_notification_addfd check nextfd is available
  selftests/seccomp: Change the syscall used in KILL_THREAD test
  selftests/seccomp: Handle EINVAL on unshare(CLONE_NEWPID)

7f1a2774

Merge tag 'hardening-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 216532e1

Linus Torvalds authored Mar 12, 2024

Pull hardening updates from Kees Cook:
 "As is pretty normal for this tree, there are changes all over the
  place, especially for small fixes, selftest improvements, and improved
  macro usability.

  Some header changes ended up landing via this tree as they depended on
  the string header cleanups. Also, a notable set of changes is the work
  for the reintroduction of the UBSAN signed integer overflow sanitizer
  so that we can continue to make improvements on the compiler side to
  make this sanitizer a more viable future security hardening option.

  Summary:

   - string.h and related header cleanups (Tanzir Hasan, Andy
     Shevchenko)

   - VMCI memcpy() usage and struct_size() cleanups (Vasiliy Kovalev,
     Harshit Mogalapalli)

   - selftests/powerpc: Fix load_unaligned_zeropad build failure
     (Michael Ellerman)

   - hardened Kconfig fragment updates (Marco Elver, Lukas Bulwahn)

   - Handle tail call optimization better in LKDTM (Douglas Anderson)

   - Use long form types in overflow.h (Andy Shevchenko)

   - Add flags param to string_get_size() (Andy Shevchenko)

   - Add Coccinelle script for potential struct_size() use (Jacob
     Keller)

   - Fix objtool corner case under KCFI (Josh Poimboeuf)

   - Drop 13 year old backward compat CAP_SYS_ADMIN check (Jingzi Meng)

   - Add str_plural() helper (Michal Wajdeczko, Kees Cook)

   - Ignore relocations in .notes section

   - Add comments to explain how __is_constexpr() works

   - Fix m68k stack alignment expectations in stackinit Kunit test

   - Convert string selftests to KUnit

   - Add KUnit tests for fortified string functions

   - Improve reporting during fortified string warnings

   - Allow non-type arg to type_max() and type_min()

   - Allow strscpy() to be called with only 2 arguments

   - Add binary mode to leaking_addresses scanner

   - Various small cleanups to leaking_addresses scanner

   - Adding wrapping_*() arithmetic helper

   - Annotate initial signed integer wrap-around in refcount_t

   - Add explicit UBSAN section to MAINTAINERS

   - Fix UBSAN self-test warnings

   - Simplify UBSAN build via removal of CONFIG_UBSAN_SANITIZE_ALL

   - Reintroduce UBSAN's signed overflow sanitizer"

* tag 'hardening-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (51 commits)
  selftests/powerpc: Fix load_unaligned_zeropad build failure
  string: Convert helpers selftest to KUnit
  string: Convert selftest to KUnit
  sh: Fix build with CONFIG_UBSAN=y
  compiler.h: Explain how __is_constexpr() works
  overflow: Allow non-type arg to type_max() and type_min()
  VMCI: Fix possible memcpy() run-time warning in vmci_datagram_invoke_guest_handler()
  lib/string_helpers: Add flags param to string_get_size()
  x86, relocs: Ignore relocations in .notes section
  objtool: Fix UNWIND_HINT_{SAVE,RESTORE} across basic blocks
  overflow: Use POD in check_shl_overflow()
  lib: stackinit: Adjust target string to 8 bytes for m68k
  sparc: vdso: Disable UBSAN instrumentation
  kernel.h: Move lib/cmdline.c prototypes to string.h
  leaking_addresses: Provide mechanism to scan binary files
  leaking_addresses: Ignore input device status lines
  leaking_addresses: Use File::Temp for /tmp files
  MAINTAINERS: Update LEAKING_ADDRESSES details
  fortify: Improve buffer overflow reporting
  fortify: Add KUnit tests for runtime overflows
  ...

216532e1

Merge tag 'execve-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · b32273ee

Linus Torvalds authored Mar 12, 2024

Pull execve updates from Kees Cook:

 - Drop needless error path code in remove_arg_zero() (Li kunyu, Kees
   Cook)

 - binfmt_elf_efpic: Don't use missing interpreter's properties (Max
   Filippov)

 - Use /bin/bash for execveat selftests

* tag 'execve-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  exec: Simplify remove_arg_zero() error path
  selftests/exec: Perform script checks with /bin/bash
  exec: Delete unnecessary statements in remove_arg_zero()
  fs: binfmt_elf_efpic: don't use missing interpreter's properties

b32273ee

Merge tag 'pstore-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 41cb8c33

Linus Torvalds authored Mar 12, 2024

Pull pstore updates from Kees Cook:

 - Make PSTORE_RAM available by default on arm64 (Nícolas F R A Prado)

 - Allow for dynamic initialization in modular build (Guilherme G
   Piccoli)

 - Add missing allocation failure check (Kunwu Chan)

 - Avoid duplicate memory zeroing (Christophe JAILLET)

 - Avoid potential double-free during pstorefs umount

* tag 'pstore-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/zone: Don't clear memory twice
  pstore/zone: Add a null pointer check to the psz_kmsg_read
  efi: pstore: Allow dynamic initialization based on module parameter
  arm64: defconfig: Enable PSTORE_RAM
  pstore/ram: Register to module device table
  pstore: inode: Only d_invalidate() is needed

41cb8c33