Commits · a79ac2d1036a824abba982c33e938b717d1b659f · Kirill Smelkov / linux

16 Dec, 2019 10 commits

libbpf: Fix build by renaming variables · a79ac2d1

Prashant Bhole authored Dec 16, 2019

In btf__align_of() variable name 't' is shadowed by inner block
declaration of another variable with same name. Patch renames
variables in order to fix it.

  CC       sharedobjs/btf.o
btf.c: In function ‘btf__align_of’:
btf.c:303:21: error: declaration of ‘t’ shadows a previous local [-Werror=shadow]
  303 |   int i, align = 1, t;
      |                     ^
btf.c:283:25: note: shadowed declaration is here
  283 |  const struct btf_type *t = btf__type_by_id(btf, id);
      |

Fixes: 3d208f4c ("libbpf: Expose btf__align_of() API")
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191216082738.28421-1-prashantbhole.linux@gmail.com

a79ac2d1

Merge branch 'support-flex-arrays' · 0849e102

Alexei Starovoitov authored Dec 15, 2019

Andrii Nakryiko says:

====================
Add support for flexible array accesses in a relocatable manner in BPF CO-RE.
It's a typical pattern in C, and kernel in particular, to provide
a fixed-length struct with zero-sized or dimensionless array at the end. In
such cases variable-sized array contents follows immediately after the end of
a struct. This patch set adds support for such access pattern by allowing
accesses to such arrays.

Patch #1 adds libbpf support. Patch #2 adds few test cases for validation.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

0849e102

selftests/bpf: Add flexible array relocation tests · 5f2eecef

Andrii Nakryiko authored Dec 14, 2019

Add few tests validation CO-RE relocation handling of flexible array accesses.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191215070844.1014385-3-andriin@fb.com

5f2eecef

libbpf: Support flexible arrays in CO-RE · 1b484b30

Andrii Nakryiko authored Dec 14, 2019

Some data stuctures in kernel are defined with either zero-sized array or
flexible (dimensionless) array at the end of a struct. Actual data of such
array follows in memory immediately after the end of that struct, forming its
variable-sized "body" of elements. Support such access pattern in CO-RE
relocation handling.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com

1b484b30

Merge branch 'extern-var-support' · 01c6f7aa

Alexei Starovoitov authored Dec 15, 2019

Andrii Nakryiko says:

====================
It's often important for BPF program to know kernel version or some specific
config values (e.g., CONFIG_HZ to convert jiffies to seconds) and change or
adjust program logic based on their values. As of today, any such need has to
be resolved by recompiling BPF program for specific kernel and kernel
configuration. In practice this is usually achieved by using BCC and its
embedded LLVM/Clang. With such set up #ifdef CONFIG_XXX and similar
compile-time constructs allow to deal with kernel varieties.

With CO-RE (Compile Once – Run Everywhere) approach, this is not an option,
unfortunately. All such logic variations have to be done as a normal
C language constructs (i.e., if/else, variables, etc), not a preprocessor
directives. This patch series add support for such advanced scenarios through
C extern variables. These extern variables will be recognized by libbpf and
supplied through extra .extern internal map, similarly to global data. This
.extern map is read-only, which allows BPF verifier to track its content
precisely as constants. That gives an opportunity to have pre-compiled BPF
program, which can potentially use BPF functionality (e.g., BPF helpers) or
kernel features (types, fields, etc), that are available only on a subset of
targeted kernels, while effectively eleminating (through verifier's dead code
detection) such unsupported functionality for other kernels (typically, older
versions). Patch #3 explicitly tests a scenario of using unsupported BPF
helper, to validate the approach.

This patch set heavily relies on BTF type information emitted by compiler for
each extern variable declaration. Based on specific types, libbpf does strict
checks of config data values correctness. See patch #1 for details.

Outline of the patch set:
- patch #1 does a small clean up of internal map names contants;
- patch #2 adds all of the libbpf internal machinery for externs support,
  including setting up BTF information for .extern data section;
- patch #3 adds support for .extern into BPF skeleton;
- patch #4 adds externs selftests, as well as enhances test_skeleton.c test to
  validate mmap()-ed .extern datasection functionality.

v3->v4:
- clean up copyrights and rebase onto latest skeleton patches (Alexei);

v2->v3:
- truncate too long strings (Alexei);
- clean ups, adding comments (Alexei);

v1->v2:
- use BTF type information for externs (Alexei);
- add strings support;
- add BPF skeleton support for .extern.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

01c6f7aa

selftests/bpf: Add tests for libbpf-provided externs · 330a73a7

Andrii Nakryiko authored Dec 13, 2019

Add a set of tests validating libbpf-provided extern variables. One crucial
feature that's tested is dead code elimination together with using invalid BPF
helper. CONFIG_MISSING is not supposed to exist and should always be specified
by libbpf as zero, which allows BPF verifier to correctly do branch pruning
and not fail validation, when invalid BPF helper is called from dead if branch.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-5-andriin@fb.com

330a73a7

bpftool: Generate externs datasec in BPF skeleton · 2ad97d47

Andrii Nakryiko authored Dec 13, 2019

Add support for generation of mmap()-ed read-only view of libbpf-provided
extern variables. As externs are not supposed to be provided by user code
(that's what .data, .bss, and .rodata is for), don't mmap() it initially. Only
after skeleton load is performed, map .extern contents as read-only memory.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com

2ad97d47

libbpf: Support libbpf-provided extern variables · 166750bc

Andrii Nakryiko authored Dec 13, 2019

Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
    executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte
    long;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, strings, and integer values are supported.

Set of possible values is determined by declared type of extern variable.
Supported types of variables are:
- Tristate values. Are represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Are represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used both as a substritute for
  bool/tristate, or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in a range [-128, 127] or [0, 255] (depending on signedness of
    char in target architecture) are recognized and represented with
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. String of
  up to that length will be accepted and put in first N bytes of char array,
  with the rest of bytes zeroed out. If config string value is longer than
  space alloted, it will be truncated and warning message emitted. Char array
  is always zero terminated. String literals in config have to be enclosed in
  double quotes, just like C-style string literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces parsed config value to be in the
  supported range of corresponding integer type. Integers values in config can
  be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.

Config file itself is searched in /boot/config-$(uname -r) location with
fallback to /proc/config.gz, unless config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already depends
on zlib.

All detected extern variables, are put into a separate .extern internal map.
It, similarly to .rodata map, is marked as read-only from BPF program side, as
well as is frozen on load. This allows BPF verifier to track extern values as
constants and perform enhanced branch prediction and dead code elimination.
This can be relied upon for doing kernel version/feature detection and using
potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF
program, while still having a single version of BPF program running on old and
new kernels. Selftests are validating this explicitly for unexisting BPF
helper.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com

166750bc

libbpf: Extract internal map names into constants · ac9d1389

Andrii Nakryiko authored Dec 13, 2019

Instead of duplicating string literals, keep them in one place and consistent.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com

ac9d1389

Merge branch 'bpf-obj-skel' · f7c0bbf2

Alexei Starovoitov authored Dec 15, 2019

Andrii Nakryiko says:

====================
This patch set introduces an alternative and complimentary to existing libbpf
API interface for working with BPF objects, maps, programs, and global data
from userspace side. This approach is relying on code generation. bpftool
produces a struct (a.k.a. skeleton) tailored and specific to provided BPF
object file. It includes hard-coded fields and data structures for every map,
program, link, and global data present.

Altogether this approach significantly reduces amount of userspace boilerplate
code required to open, load, attach, and work with BPF objects. It improves
attach/detach story, by providing pre-allocated space for bpf_links, and
ensuring they are properly detached on shutdown. It allows to do away with by
name/title lookups of maps and programs, because libbpf's skeleton API, in
conjunction with generated code from bpftool, is filling in hard-coded fields
with actual pointers to corresponding struct bpf_map/bpf_program/bpf_link.

Also, thanks to BPF array mmap() support, working with global data (variables)
from userspace is now as natural as it is from BPF side: each variable is just
a struct field inside skeleton struct. Furthermore, this allows to have
a natural way for userspace to pre-initialize global data (including
previously impossible to initialize .rodata) by just assigning values to the
same per-variable fields. Libbpf will carefully take into account this
initialization image, will use it to pre-populate BPF maps at creation time,
and will re-mmap() BPF map's contents at exactly the same userspace memory
address such that it can continue working with all the same pointers without
any interruptions. If kernel doesn't support mmap(), global data will still be
successfully initialized, but after map creation global data structures inside
skeleton will be NULL-ed out. This allows userspace application to gracefully
handle lack of mmap() support, if necessary.

A bunch of selftests are also converted to using skeletons, demonstrating
significant simplification of userspace part of test and reduction in amount
of code necessary.

v3->v4:
- add OPTS_VALID check to btf_dump__emit_type_decl (Alexei);
- expose skeleton as LIBBPF_API functions (Alexei);
- copyright clean up, update internal map init refactor (Alexei);

v2->v3:
- make skeleton part of public API;
- expose btf_dump__emit_type_decl and btf__align_of APIs;
- move LIBBPF_API and DECLARE_LIBBPF_OPTS into libbpf_common.h for reuse;

v1->v2:
- checkpatch.pl and reverse Christmas tree styling (Jakub);
- sanitize variable names to accomodate in-function static vars;

rfc->v1:
- runqslower moved out into separate patch set waiting for vmlinux.h
  improvements;
- skeleton generation code deals with unknown internal maps more gracefully.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

f7c0bbf2

15 Dec, 2019 22 commits

bpftool: Add `gen skeleton` BASH completions · d9c00c3b

Andrii Nakryiko authored Dec 13, 2019

Add BASH completions for gen sub-command.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-18-andriin@fb.com

d9c00c3b

selftests/bpf: Add test validating data section to struct convertion layout · 197448ea

Andrii Nakryiko authored Dec 13, 2019

Add a simple selftests validating datasection-to-struct layour dumping. Global
variables are constructed in such a way as to cause both natural and
artificial padding (through custom alignment requirement).
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-17-andriin@fb.com

197448ea

selftests/bpf: Convert few more selftest to skeletons · dde53c1b

Andrii Nakryiko authored Dec 13, 2019

Convert few more selftests to use generated BPF skeletons as a demonstration
on how to use it.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-16-andriin@fb.com

dde53c1b

selftests/bpf: Add BPF skeletons selftests and convert attach_probe.c · f3c926a4

Andrii Nakryiko authored Dec 13, 2019

Add BPF skeleton generation to selftest/bpf's Makefile. Convert attach_probe.c
to use skeleton.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-15-andriin@fb.com

f3c926a4

bpftool: Add skeleton codegen command · 985ead41

Andrii Nakryiko authored Dec 13, 2019

Add `bpftool gen skeleton` command, which takes in compiled BPF .o object file
and dumps a BPF skeleton struct and related code to work with that skeleton.
Skeleton itself is tailored to a specific structure of provided BPF object
file, containing accessors (just plain struct fields) for every map and
program, as well as dedicated space for bpf_links. If BPF program is using
global variables, corresponding structure definitions of compatible memory
layout are emitted as well, making it possible to initialize and subsequently
read/update global variables values using simple and clear C syntax for
accessing fields. This skeleton majorly improves usability of
opening/loading/attaching of BPF object, as well as interacting with it
throughout the lifetime of loaded BPF object.

Generated skeleton struct has the following structure:

struct <object-name> {
	/* used by libbpf's skeleton API */
	struct bpf_object_skeleton *skeleton;
	/* bpf_object for libbpf APIs */
	struct bpf_object *obj;
	struct {
		/* for every defined map in BPF object: */
		struct bpf_map *<map-name>;
	} maps;
	struct {
		/* for every program in BPF object: */
		struct bpf_program *<program-name>;
	} progs;
	struct {
		/* for every program in BPF object: */
		struct bpf_link *<program-name>;
	} links;
	/* for every present global data section: */
	struct <object-name>__<one of bss, data, or rodata> {
		/* memory layout of corresponding data section,
		 * with every defined variable represented as a struct field
		 * with exactly the same type, but without const/volatile
		 * modifiers, e.g.:
		 */
		 int *my_var_1;
		 ...
	} *<one of bss, data, or rodata>;
};

This provides great usability improvements:
- no need to look up maps and programs by name, instead just
  my_obj->maps.my_map or my_obj->progs.my_prog would give necessary
  bpf_map/bpf_program pointers, which user can pass to existing libbpf APIs;
- pre-defined places for bpf_links, which will be automatically populated for
  program types that libbpf knows how to attach automatically (currently
  tracepoints, kprobe/kretprobe, raw tracepoint and tracing programs). On
  tearing down skeleton, all active bpf_links will be destroyed (meaning BPF
  programs will be detached, if they are attached). For cases in which libbpf
  doesn't know how to auto-attach BPF program, user can manually create link
  after loading skeleton and they will be auto-detached on skeleton
  destruction:

	my_obj->links.my_fancy_prog = bpf_program__attach_cgroup_whatever(
		my_obj->progs.my_fancy_prog, <whatever extra param);

- it's extremely easy and convenient to work with global data from userspace
  now. Both for read-only and read/write variables, it's possible to
  pre-initialize them before skeleton is loaded:

	skel = my_obj__open(raw_embed_data);
	my_obj->rodata->my_var = 123;
	my_obj__load(skel); /* 123 will be initialization value for my_var */

  After load, if kernel supports mmap() for BPF arrays, user can still read
  (and write for .bss and .data) variables values, but at that point it will
  be directly mmap()-ed to BPF array, backing global variables. This allows to
  seamlessly exchange data with BPF side. From userspace program's POV, all
  the pointers and memory contents stay the same, but mapped kernel memory
  changes to point to created map.
  If kernel doesn't yet support mmap() for BPF arrays, it's still possible to
  use those data section structs to pre-initialize .bss, .data, and .rodata,
  but after load their pointers will be reset to NULL, allowing user code to
  gracefully handle this condition, if necessary.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-14-andriin@fb.com

985ead41

libbpf: Add BPF object skeleton support · d66562fb

Andrii Nakryiko authored Dec 13, 2019

Add new set of APIs, allowing to open/load/attach BPF object through BPF
object skeleton, generated by bpftool for a specific BPF object file. All the
xxx_skeleton() APIs wrap up corresponding bpf_object_xxx() APIs, but
additionally also automate map/program lookups by name, global data
initialization and mmap()-ing, etc.  All this greatly improves and simplifies
userspace usability of working with BPF programs. See follow up patches for
examples.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-13-andriin@fb.com

d66562fb

libbpf: Reduce log level of supported section names dump · 3f519353

Andrii Nakryiko authored Dec 13, 2019

It's quite spammy. And now that bpf_object__open() is trying to determine
program type from its section name, we are getting these verbose messages all
the time. Reduce their log level to DEBUG.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-12-andriin@fb.com

3f519353

libbpf: Postpone BTF ID finding for TRACING programs to load phase · 13acb508

Andrii Nakryiko authored Dec 13, 2019

Move BTF ID determination for BPF_PROG_TYPE_TRACING programs to a load phase.
Performing it at open step is inconvenient, because it prevents BPF skeleton
generation on older host kernel, which doesn't contain BTF_KIND_FUNCs
information in vmlinux BTF. This is a common set up, though, when, e.g.,
selftests are compiled on older host kernel, but the test program itself is
executed in qemu VM with bleeding edge kernel. Having this BTF searching
performed at load time allows to successfully use bpf_object__open() for
codegen and inspection of BPF object file.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-11-andriin@fb.com

13acb508

libbpf: Refactor global data map initialization · eba9c5f4

Andrii Nakryiko authored Dec 13, 2019

Refactor global data map initialization to use anonymous mmap()-ed memory
instead of malloc()-ed one. This allows to do a transparent re-mmap()-ing of
already existing memory address to point to BPF map's memory after
bpf_object__load() step (done in follow up patch). This choreographed setup
allows to have a nice and unsurprising way to pre-initialize read-only (and
r/w as well) maps by user and after BPF map creation keep working with
mmap()-ed contents of this map. All in a way that doesn't require user code to
update any pointers: the illusion of working with memory contents is preserved
before and after actual BPF map instantiation.

Selftests and runqslower example demonstrate this feature in follow up patches.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-10-andriin@fb.com

eba9c5f4

libbpf: Expose BPF program's function name · 01af3bf0

Andrii Nakryiko authored Dec 13, 2019

Add APIs to get BPF program function name, as opposed to bpf_program__title(),
which returns BPF program function's section name. Function name has a benefit
of being a valid C identifier and uniquely identifies a specific BPF program,
while section name can be duplicated across multiple independent BPF programs.

Add also bpf_object__find_program_by_name(), similar to
bpf_object__find_program_by_title(), to facilitate looking up BPF programs by
their C function names.

Convert one of selftests to new API for look up.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-9-andriin@fb.com

01af3bf0

libbpf: Expose BTF-to-C type declaration emitting API · 9f81654e

Andrii Nakryiko authored Dec 13, 2019

Expose API that allows to emit type declaration and field/variable definition
(if optional field name is specified) in valid C syntax for any provided BTF
type. This is going to be used by bpftool when emitting data section layout as
a struct. As part of making this API useful in a stand-alone fashion, move
initialization of some of the internal btf_dump state to earlier phase.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-8-andriin@fb.com

9f81654e

libbpf: Expose btf__align_of() API · 3d208f4c

Andrii Nakryiko authored Dec 13, 2019

Expose BTF API that calculates type alignment requirements.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-7-andriin@fb.com

3d208f4c

libbpf: Extract common user-facing helpers · 544402d4

Andrii Nakryiko authored Dec 13, 2019

LIBBPF_API and DECLARE_LIBBPF_OPTS are needed in many public libbpf API
headers. Extract them into libbpf_common.h to avoid unnecessary
interdependency between btf.h, libbpf.h, and bpf.h or code duplication.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-6-andriin@fb.com

544402d4

libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files · 917f6b7b

Andrii Nakryiko authored Dec 13, 2019

Add a convenience macro BPF_EMBED_OBJ, which allows to embed other files
(typically used to embed BPF .o files) into a hosting userspace programs. To
C program it is exposed as struct bpf_embed_data, containing a pointer to
raw data and its size in bytes.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-5-andriin@fb.com

917f6b7b

libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h · 612d05be

Andrii Nakryiko authored Dec 13, 2019

Few libbpf APIs are not public but currently exposed through libbpf.h to be
used by bpftool. Move them to libbpf_internal.h, where intent of being
non-stable and non-public is much more obvious.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-4-andriin@fb.com

612d05be

libbpf: Add generic bpf_program__attach() · d7a18ea7

Andrii Nakryiko authored Dec 13, 2019

Generalize BPF program attaching and allow libbpf to auto-detect type (and
extra parameters, where applicable) and attach supported BPF program types
based on program sections. Currently this is supported for:
- kprobe/kretprobe;
- tracepoint;
- raw tracepoint;
- tracing programs (typed raw TP/fentry/fexit).

More types support can be trivially added within this framework.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-3-andriin@fb.com

d7a18ea7

libbpf: Don't require root for bpf_object__open() · 0d13bfce

Andrii Nakryiko authored Dec 13, 2019

Reorganize bpf_object__open and bpf_object__load steps such that
bpf_object__open doesn't need root access. This was previously done for
feature probing and BTF sanitization. This doesn't have to happen on open,
though, so move all those steps into the load phase.

This is important, because it makes it possible for tools like bpftool, to
just open BPF object file and inspect their contents: programs, maps, BTF,
etc. For such operations it is prohibitive to require root access. On the
other hand, there is a lot of custom libbpf logic in those steps, so its best
avoided for tools to reimplement all that on their own.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191214014341.3442258-2-andriin@fb.com

0d13bfce

libbpf: Fix readelf output parsing for Fedora · aa915931

Thadeu Lima de Souza Cascardo authored Dec 13, 2019

Fedora binutils has been patched to show "other info" for a symbol at the
end of the line. This was done in order to support unmaintained scripts
that would break with the extra info. [1]

[1] https://src.fedoraproject.org/rpms/binutils/c/b8265c46f7ddae23a792ee8306fbaaeacba83bf8

This in turn has been done to fix the build of ruby, because of checksec.
[2] Thanks Michael Ellerman for the pointer.

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1479302

As libbpf Makefile is not unmaintained, we can simply deal with either
output format, by just removing the "other info" field, as it always comes
inside brackets.

Fixes: 3464afdf (libbpf: Fix readelf output parsing on powerpc with recent binutils)
Reported-by: Justin Forbes <jmforbes@linuxtx.org>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Link: https://lore.kernel.org/bpf/20191213101114.GA3986@calabresa

aa915931

Merge branch 'bpftool-match-by-name' · a06ae6ac

Alexei Starovoitov authored Dec 15, 2019

Paul Chaignon says:

====================
When working with frequently modified BPF programs, both the ID and the
tag may change.  bpftool currently doesn't provide a "stable" way to match
such programs.  This patchset allows bpftool to match programs and maps by
name.

When given a tag that matches several programs, bpftool currently only
considers the first match.  The first patch changes that behavior to
either process all matching programs (for the show and dump commands) or
error out.  The second patch implements program lookup by name, with the
same behavior as for tags in case of ambiguity.  The last patch implements
map lookup by name.

Changelogs:
  Changes in v2:
    - Fix buffer overflow after realloc.
    - Add example output to commit message.
    - Properly close JSON arrays on errors.
    - Fix style errors (line breaks, for loops, exit labels, type for
      tagname).
    - Move do_show code for argc == 2 to do_show_subset functions.
    - Rebase.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

a06ae6ac

bpftool: Match maps by name · 99f9863a

Paul Chaignon authored Dec 13, 2019

This patch implements lookup by name for maps and changes the behavior of
lookups by tag to be consistent with prog subcommands. Similarly to
program subcommands, the show and dump commands will return all maps with
the given name (or tag), whereas other commands will error out if several
maps have the same name (resp. tag).

When a map has BTF info, it is dumped in JSON with available BTF info.
This patch requires that all matched maps have BTF info before switching
the output format to JSON.
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/8de1c9f273860b3ea1680502928f4da2336b853e.1576263640.git.paul.chaignon@gmail.com

99f9863a

bpftool: Match programs by name · a7d22ca2

Paul Chaignon authored Dec 13, 2019

When working with frequently modified BPF programs, both the ID and the
tag may change. bpftool currently doesn't provide a "stable" way to match
such programs.

This patch implements lookup by name for programs. The show and dump
commands will return all programs with the given name, whereas other
commands will error out if several programs have the same name.
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Link: https://lore.kernel.org/bpf/b5fc1a5dcfaeb5f16fc80295cdaa606dd2d91534.1576263640.git.paul.chaignon@gmail.com

a7d22ca2

bpftool: Match several programs with same tag · ec202509

Paul Chaignon authored Dec 13, 2019

When several BPF programs have the same tag, bpftool matches only the
first (in ID order).  This patch changes that behavior such that dump and
show commands return all matched programs.  Commands that require a single
program (e.g., pin and attach) will error out if given a tag that matches
several.  bpftool prog dump will also error out if file or visual are
given and several programs have the given tag.

In the case of the dump command, a program header is added before each
dump only if the tag matches several programs; this patch doesn't change
the output if a single program matches.  The output when several
programs match thus looks as follows.

$ ./bpftool prog dump xlated tag 6deef7357e7b4530
3: cgroup_skb  tag 6deef7357e7b4530  gpl
   0: (bf) r6 = r1
   [...]
   7: (95) exit

4: cgroup_skb  tag 6deef7357e7b4530  gpl
   0: (bf) r6 = r1
   [...]
   7: (95) exit
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/fb1fe943202659a69cd21dd5b907c205af1e1e22.1576263640.git.paul.chaignon@gmail.com

ec202509

13 Dec, 2019 8 commits

selftests/bpf: Test wire_len/gso_segs in BPF_PROG_TEST_RUN · a06bf42f

Stanislav Fomichev authored Dec 13, 2019

Make sure we can pass arbitrary data in wire_len/gso_segs.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191213223028.161282-2-sdf@google.com

a06bf42f

bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN · 850a88cc

Stanislav Fomichev authored Dec 13, 2019

wire_len should not be less than real len and is capped by GSO_MAX_SIZE.
gso_segs is capped by GSO_MAX_SEGS.

v2:
* set wire_len to skb->len when passed wire_len is 0 (Alexei Starovoitov)
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191213223028.161282-1-sdf@google.com

850a88cc

Merge branch 'bpf-dispatcher' · 02620d9e

Alexei Starovoitov authored Dec 13, 2019

Björn Töpel says:

====================
Overview
========

This is the 6th iteration of the series that introduces the BPF
dispatcher, which is a mechanism to avoid indirect calls.

The BPF dispatcher is a multi-way branch code generator, targeted for
BPF programs. E.g. when an XDP program is executed via the
bpf_prog_run_xdp(), it is invoked via an indirect call. With
retpolines enabled, the indirect call has a substantial performance
impact. The dispatcher is a mechanism that transform indirect calls to
direct calls, and therefore avoids the retpoline. The dispatcher is
generated using the BPF JIT, and relies on text poking provided by
bpf_arch_text_poke().

The dispatcher hijacks a trampoline function it via the __fentry__ nop
of the trampoline. One dispatcher instance currently supports up to 48
dispatch points. This can be extended in the future.

In this series, only one dispatcher instance is supported, and the
only user is XDP. The dispatcher is updated when an XDP program is
attached/detached to/from a netdev. An alternative to this could have
been to update the dispatcher at program load point, but as there are
usually more XDP programs loaded than attached, so the latter was
picked.

The XDP dispatcher is always enabled, if available, because it helps
even when retpolines are disabled. Please refer to the "Performance"
section below.

The first patch refactors the image allocation from the BPF trampoline
code. Patch two introduces the dispatcher, and patch three adds a
dispatcher for XDP, and wires up the XDP control-/ fast-path. Patch
four adds the dispatcher to BPF_TEST_RUN. Patch five adds a simple
selftest, and the last adds alignment to jump targets.

I have rebased the series on commit 679152d3 ("libbpf: Fix printf
compilation warnings on ppc64le arch").

Generated code, x86-64
======================

The dispatcher currently has a maximum of 48 entries, where one entry
is a unique BPF program. Multiple users of a dispatcher instance using
the same BPF program will share that entry.

The program/slot lookup is performed by a binary search, O(log
n). Let's have a look at the generated code.

The trampoline function has the following signature:

  unsigned int tramp(const void *ctx,
                     const struct bpf_insn *insnsi,
                     unsigned int (*bpf_func)(const void *,
                                              const struct bpf_insn *))

On Intel x86-64 this means that rdx will contain the bpf_func. To,
make it easier to read, I've let the BPF programs have the following
range: 0xffffffffffffffff (-1) to 0xfffffffffffffff0
(-16). 0xffffffff81c00f10 is the retpoline thunk, in this case
__x86_indirect_thunk_rdx. If retpolines are disabled the thunk will be
a regular indirect call.

The minimal dispatcher will then look like this:

ffffffffc0002000: cmp    rdx,0xffffffffffffffff
ffffffffc0002007: je     0xffffffffffffffff ; -1
ffffffffc000200d: jmp    0xffffffff81c00f10

A 16 entry dispatcher looks like this:

ffffffffc0020000: cmp    rdx,0xfffffffffffffff7 ; -9
ffffffffc0020007: jg     0xffffffffc0020130
ffffffffc002000d: cmp    rdx,0xfffffffffffffff3 ; -13
ffffffffc0020014: jg     0xffffffffc00200a0
ffffffffc002001a: cmp    rdx,0xfffffffffffffff1 ; -15
ffffffffc0020021: jg     0xffffffffc0020060
ffffffffc0020023: cmp    rdx,0xfffffffffffffff0 ; -16
ffffffffc002002a: jg     0xffffffffc0020040
ffffffffc002002c: cmp    rdx,0xfffffffffffffff0 ; -16
ffffffffc0020033: je     0xfffffffffffffff0 ; -16
ffffffffc0020039: jmp    0xffffffff81c00f10
ffffffffc002003e: xchg   ax,ax
ffffffffc0020040: cmp    rdx,0xfffffffffffffff1 ; -15
ffffffffc0020047: je     0xfffffffffffffff1 ; -15
ffffffffc002004d: jmp    0xffffffff81c00f10
ffffffffc0020052: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc002005a: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc0020060: cmp    rdx,0xfffffffffffffff2 ; -14
ffffffffc0020067: jg     0xffffffffc0020080
ffffffffc0020069: cmp    rdx,0xfffffffffffffff2 ; -14
ffffffffc0020070: je     0xfffffffffffffff2 ; -14
ffffffffc0020076: jmp    0xffffffff81c00f10
ffffffffc002007b: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc0020080: cmp    rdx,0xfffffffffffffff3 ; -13
ffffffffc0020087: je     0xfffffffffffffff3 ; -13
ffffffffc002008d: jmp    0xffffffff81c00f10
ffffffffc0020092: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc002009a: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc00200a0: cmp    rdx,0xfffffffffffffff5 ; -11
ffffffffc00200a7: jg     0xffffffffc00200f0
ffffffffc00200a9: cmp    rdx,0xfffffffffffffff4 ; -12
ffffffffc00200b0: jg     0xffffffffc00200d0
ffffffffc00200b2: cmp    rdx,0xfffffffffffffff4 ; -12
ffffffffc00200b9: je     0xfffffffffffffff4 ; -12
ffffffffc00200bf: jmp    0xffffffff81c00f10
ffffffffc00200c4: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc00200cc: nop    DWORD PTR [rax+0x0]
ffffffffc00200d0: cmp    rdx,0xfffffffffffffff5 ; -11
ffffffffc00200d7: je     0xfffffffffffffff5 ; -11
ffffffffc00200dd: jmp    0xffffffff81c00f10
ffffffffc00200e2: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc00200ea: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc00200f0: cmp    rdx,0xfffffffffffffff6 ; -10
ffffffffc00200f7: jg     0xffffffffc0020110
ffffffffc00200f9: cmp    rdx,0xfffffffffffffff6 ; -10
ffffffffc0020100: je     0xfffffffffffffff6 ; -10
ffffffffc0020106: jmp    0xffffffff81c00f10
ffffffffc002010b: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc0020110: cmp    rdx,0xfffffffffffffff7 ; -9
ffffffffc0020117: je     0xfffffffffffffff7 ; -9
ffffffffc002011d: jmp    0xffffffff81c00f10
ffffffffc0020122: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc002012a: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc0020130: cmp    rdx,0xfffffffffffffffb ; -5
ffffffffc0020137: jg     0xffffffffc00201d0
ffffffffc002013d: cmp    rdx,0xfffffffffffffff9 ; -7
ffffffffc0020144: jg     0xffffffffc0020190
ffffffffc0020146: cmp    rdx,0xfffffffffffffff8 ; -8
ffffffffc002014d: jg     0xffffffffc0020170
ffffffffc002014f: cmp    rdx,0xfffffffffffffff8 ; -8
ffffffffc0020156: je     0xfffffffffffffff8 ; -8
ffffffffc002015c: jmp    0xffffffff81c00f10
ffffffffc0020161: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc0020169: nop    DWORD PTR [rax+0x0]
ffffffffc0020170: cmp    rdx,0xfffffffffffffff9 ; -7
ffffffffc0020177: je     0xfffffffffffffff9 ; -7
ffffffffc002017d: jmp    0xffffffff81c00f10
ffffffffc0020182: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc002018a: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc0020190: cmp    rdx,0xfffffffffffffffa ; -6
ffffffffc0020197: jg     0xffffffffc00201b0
ffffffffc0020199: cmp    rdx,0xfffffffffffffffa ; -6
ffffffffc00201a0: je     0xfffffffffffffffa ; -6
ffffffffc00201a6: jmp    0xffffffff81c00f10
ffffffffc00201ab: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc00201b0: cmp    rdx,0xfffffffffffffffb ; -5
ffffffffc00201b7: je     0xfffffffffffffffb ; -5
ffffffffc00201bd: jmp    0xffffffff81c00f10
ffffffffc00201c2: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc00201ca: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc00201d0: cmp    rdx,0xfffffffffffffffd ; -3
ffffffffc00201d7: jg     0xffffffffc0020220
ffffffffc00201d9: cmp    rdx,0xfffffffffffffffc ; -4
ffffffffc00201e0: jg     0xffffffffc0020200
ffffffffc00201e2: cmp    rdx,0xfffffffffffffffc ; -4
ffffffffc00201e9: je     0xfffffffffffffffc ; -4
ffffffffc00201ef: jmp    0xffffffff81c00f10
ffffffffc00201f4: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc00201fc: nop    DWORD PTR [rax+0x0]
ffffffffc0020200: cmp    rdx,0xfffffffffffffffd ; -3
ffffffffc0020207: je     0xfffffffffffffffd ; -3
ffffffffc002020d: jmp    0xffffffff81c00f10
ffffffffc0020212: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc002021a: nop    WORD PTR [rax+rax*1+0x0]
ffffffffc0020220: cmp    rdx,0xfffffffffffffffe ; -2
ffffffffc0020227: jg     0xffffffffc0020240
ffffffffc0020229: cmp    rdx,0xfffffffffffffffe ; -2
ffffffffc0020230: je     0xfffffffffffffffe ; -2
ffffffffc0020236: jmp    0xffffffff81c00f10
ffffffffc002023b: nop    DWORD PTR [rax+rax*1+0x0]
ffffffffc0020240: cmp    rdx,0xffffffffffffffff ; -1
ffffffffc0020247: je     0xffffffffffffffff ; -1
ffffffffc002024d: jmp    0xffffffff81c00f10

The nops are there to align jump targets to 16 B.

Performance
===========

The tests were performed using the xdp_rxq_info sample program with
the following command-line:

1. XDP_DRV:
  # xdp_rxq_info --dev eth0 --action XDP_DROP
2. XDP_SKB:
  # xdp_rxq_info --dev eth0 -S --action XDP_DROP
3. xdp-perf, from selftests/bpf:
  # test_progs -v -t xdp_perf

Run with mitigations=auto
-------------------------

Baseline:
1. 21.7 Mpps (21736190)
2. 3.8 Mpps   (3837582)
3. 15 ns

Dispatcher:
1. 30.2 Mpps (30176320)
2. 4.0 Mpps   (4015579)
3. 5 ns

Dispatcher (full; walk all entries, and fallback):
1. 22.0 Mpps (21986704)
2. 3.8 Mpps   (3831298)
3. 17 ns

Run with mitigations=off
------------------------

Baseline:
1. 29.9 Mpps (29875135)
2. 4.1 Mpps   (4100179)
3. 4 ns

Dispatcher:
1. 30.4 Mpps (30439241)
2. 4.1 Mpps   (4109350)
1. 4 ns

Dispatcher (full; walk all entries, and fallback):
1. 28.9 Mpps (28903269)
2. 4.1 Mpps   (4080078)
3. 5 ns

xdp-perf runs, aliged vs non-aligned jump targets
-------------------------------------------------

In this test dispatchers of different sizes, with and without jump
target alignment, were exercised. As outlined above the function
lookup is performed via binary search. This means that depending on
the pointer value of the function, it can reside in the upper or lower
part of the search table. The performed tests were:

1. aligned, mititations=auto, function entry < other entries
2. aligned, mititations=auto, function entry > other entries
3. non-aligned, mititations=auto, function entry < other entries
4. non-aligned, mititations=auto, function entry > other entries
5. aligned, mititations=off, function entry < other entries
6. aligned, mititations=off, function entry > other entries
7. non-aligned, mititations=off, function entry < other entries
8. non-aligned, mititations=off, function entry > other entries

The micro benchmarks showed that alignment of jump target has some
positive impact.

A reply to this cover letter will contain complete data for all runs.

Multiple xdp-perf baseline with mitigations=auto
------------------------------------------------

 Performance counter stats for './test_progs -v -t xdp_perf' (1024 runs):

             16.69 msec task-clock                #    0.984 CPUs utilized            ( +-  0.08% )
                 2      context-switches          #    0.123 K/sec                    ( +-  1.11% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +- 70.68% )
                97      page-faults               #    0.006 M/sec                    ( +-  0.05% )
        49,254,635      cycles                    #    2.951 GHz                      ( +-  0.09% )  (12.28%)
        42,138,558      instructions              #    0.86  insn per cycle           ( +-  0.02% )  (36.15%)
         7,315,291      branches                  #  438.300 M/sec                    ( +-  0.01% )  (59.43%)
         1,011,201      branch-misses             #   13.82% of all branches          ( +-  0.01% )  (83.31%)
        15,440,788      L1-dcache-loads           #  925.143 M/sec                    ( +-  0.00% )  (99.40%)
            39,067      L1-dcache-load-misses     #    0.25% of all L1-dcache hits    ( +-  0.04% )
             6,531      LLC-loads                 #    0.391 M/sec                    ( +-  0.05% )
               442      LLC-load-misses           #    6.76% of all LL-cache hits     ( +-  0.77% )
   <not supported>      L1-icache-loads
            57,964      L1-icache-load-misses                                         ( +-  0.06% )
        15,442,496      dTLB-loads                #  925.246 M/sec                    ( +-  0.00% )
               514      dTLB-load-misses          #    0.00% of all dTLB cache hits   ( +-  0.73% )  (40.57%)
               130      iTLB-loads                #    0.008 M/sec                    ( +-  2.75% )  (16.69%)
     <not counted>      iTLB-load-misses                                              ( +-  8.71% )  (0.60%)
   <not supported>      L1-dcache-prefetches
   <not supported>      L1-dcache-prefetch-misses

         0.0169558 +- 0.0000127 seconds time elapsed  ( +-  0.07% )

Multiple xdp-perf dispatcher with mitigations=auto
--------------------------------------------------

Note that this includes generating the dispatcher.

 Performance counter stats for './test_progs -v -t xdp_perf' (1024 runs):

              4.80 msec task-clock                #    0.953 CPUs utilized            ( +-  0.06% )
                 1      context-switches          #    0.258 K/sec                    ( +-  1.57% )
                 0      cpu-migrations            #    0.000 K/sec
                97      page-faults               #    0.020 M/sec                    ( +-  0.05% )
        14,185,861      cycles                    #    2.955 GHz                      ( +-  0.17% )  (50.49%)
        45,691,935      instructions              #    3.22  insn per cycle           ( +-  0.01% )  (99.19%)
         8,346,008      branches                  # 1738.709 M/sec                    ( +-  0.00% )
            13,046      branch-misses             #    0.16% of all branches          ( +-  0.10% )
        15,443,735      L1-dcache-loads           # 3217.365 M/sec                    ( +-  0.00% )
            39,585      L1-dcache-load-misses     #    0.26% of all L1-dcache hits    ( +-  0.05% )
             7,138      LLC-loads                 #    1.487 M/sec                    ( +-  0.06% )
               671      LLC-load-misses           #    9.40% of all LL-cache hits     ( +-  0.73% )
   <not supported>      L1-icache-loads
            56,213      L1-icache-load-misses                                         ( +-  0.08% )
        15,443,735      dTLB-loads                # 3217.365 M/sec                    ( +-  0.00% )
     <not counted>      dTLB-load-misses                                              (0.00%)
     <not counted>      iTLB-loads                                                    (0.00%)
     <not counted>      iTLB-load-misses                                              (0.00%)
   <not supported>      L1-dcache-prefetches
   <not supported>      L1-dcache-prefetch-misses

        0.00503705 +- 0.00000546 seconds time elapsed  ( +-  0.11% )

Revisions
=========

v4->v5: [1]
  * Fixed s/xdp_ctx/ctx/ type-o (Toke)
  * Marked dispatcher trampoline with noinline attribute (Alexei)

v3->v4: [2]
  * Moved away from doing dispatcher lookup based on the trampoline
    function, to a model where the dispatcher instance is explicitly
    passed to the bpf_dispatcher_change_prog() (Alexei)

v2->v3: [3]
  * Removed xdp_call, and instead make the dispatcher available to all
    XDP users via bpf_prog_run_xdp() and dev_xdp_install(). (Toke)
  * Always enable the dispatcher, if available (Alexei)
  * Reuse BPF trampoline image allocator (Alexei)
  * Make sure the dispatcher is exercised in selftests (Alexei)
  * Only allow one dispatcher, and wire it to XDP

v1->v2: [4]
  * Fixed i386 build warning (kbuild robot)
  * Made bpf_dispatcher_lookup() static (kbuild robot)
  * Make sure xdp_call.h is only enabled for builtins
  * Add xdp_call() to ixgbe, mlx4, and mlx5

RFC->v1: [5]
  * Improved error handling (Edward and Andrii)
  * Explicit cleanup (Andrii)
  * Use 32B with sext cmp (Alexei)
  * Align jump targets to 16B (Alexei)
  * 4 to 16 entries (Toke)
  * Added stats to xdp_call_run()

[1] https://lore.kernel.org/bpf/20191211123017.13212-1-bjorn.topel@gmail.com/
[2] https://lore.kernel.org/bpf/20191209135522.16576-1-bjorn.topel@gmail.com/
[3] https://lore.kernel.org/bpf/20191123071226.6501-1-bjorn.topel@gmail.com/
[4] https://lore.kernel.org/bpf/20191119160757.27714-1-bjorn.topel@gmail.com/
[5] https://lore.kernel.org/bpf/20191113204737.31623-1-bjorn.topel@gmail.com/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

02620d9e

bpf, x86: Align dispatcher branch targets to 16B · 116eb788

Björn Töpel authored Dec 13, 2019

>From Intel 64 and IA-32 Architectures Optimization Reference Manual,
3.4.1.4 Code Alignment, Assembly/Compiler Coding Rule 11: All branch
targets should be 16-byte aligned.

This commits aligns branch targets according to the Intel manual.

The nops used to align branch targets make the dispatcher larger, and
therefore the number of supported dispatch points/programs are
descreased from 64 to 48.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191213175112.30208-7-bjorn.topel@gmail.com

116eb788

selftests: bpf: Add xdp_perf test · e754f5a6

Björn Töpel authored Dec 13, 2019

The xdp_perf is a dummy XDP test, only used to measure the the cost of
jumping into a naive XDP program one million times.

To build and run the program:
  $ cd tools/testing/selftests/bpf
  $ make
  $ ./test_progs -v -t xdp_perf
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191213175112.30208-6-bjorn.topel@gmail.com

e754f5a6

bpf: Start using the BPF dispatcher in BPF_TEST_RUN · f23c4b39

Björn Töpel authored Dec 13, 2019

In order to properly exercise the BPF dispatcher, this commit adds BPF
dispatcher usage to BPF_TEST_RUN when executing XDP programs.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191213175112.30208-5-bjorn.topel@gmail.com

f23c4b39

bpf, xdp: Start using the BPF dispatcher for XDP · 7e6897f9

Björn Töpel authored Dec 13, 2019

This commit adds a BPF dispatcher for XDP. The dispatcher is updated
from the XDP control-path, dev_xdp_install(), and used when an XDP
program is run via bpf_prog_run_xdp().
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191213175112.30208-4-bjorn.topel@gmail.com

7e6897f9

bpf: Introduce BPF dispatcher · 75ccbef6

Björn Töpel authored Dec 13, 2019

The BPF dispatcher is a multi-way branch code generator, mainly
targeted for XDP programs. When an XDP program is executed via the
bpf_prog_run_xdp(), it is invoked via an indirect call. The indirect
call has a substantial performance impact, when retpolines are
enabled. The dispatcher transform indirect calls to direct calls, and
therefore avoids the retpoline. The dispatcher is generated using the
BPF JIT, and relies on text poking provided by bpf_arch_text_poke().

The dispatcher hijacks a trampoline function it via the __fentry__ nop
of the trampoline. One dispatcher instance currently supports up to 64
dispatch points. A user creates a dispatcher with its corresponding
trampoline with the DEFINE_BPF_DISPATCHER macro.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191213175112.30208-3-bjorn.topel@gmail.com

75ccbef6