Commit 608b638e authored by Andrii Nakryiko

Merge branch 'Dynamic pointers'

Joanne Koong says:

====================

This patchset implements the basics of dynamic pointers in bpf.

A dynamic pointer (struct bpf_dynptr) is a pointer that stores extra metadata
alongside the address it points to. This abstraction is useful in bpf given
that every memory access in a bpf program must be safe. The verifier and bpf
helper functions can use the metadata to enforce safety guarantees for things
such as dynamically sized strings and kernel heap allocations.

From the program side, the bpf_dynptr is an opaque struct and the verifier
will enforce that its contents are never written to by the program.
It can only be written to through specific bpf helper functions.
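
As a rough illustration of the idea described above, a dynamic pointer can be modeled in plain userspace C as a "fat pointer": an address plus the metadata needed to bounds-check every access. This is only a sketch under stated assumptions, not the kernel implementation; struct fat_ptr, fat_ptr_read and fat_ptr_selfcheck are hypothetical names that do not appear in the patchset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of a dynamic pointer: address + metadata. */
struct fat_ptr {
	void *data;	/* start of the backing memory, NULL if invalid */
	uint32_t size;	/* number of usable bytes */
};

/* Copy len bytes starting at offset into dst, refusing out-of-bounds reads. */
int fat_ptr_read(void *dst, uint32_t len, const struct fat_ptr *src, uint32_t offset)
{
	if (!src->data)
		return -EINVAL;
	/* Widen to 64 bits so that offset + len cannot wrap around. */
	if ((uint64_t)offset + len > src->size)
		return -E2BIG;
	memcpy(dst, (const char *)src->data + offset, len);
	return 0;
}

void fat_ptr_selfcheck(void)
{
	char backing[8] = "dynptr!";
	struct fat_ptr p = { .data = backing, .size = sizeof(backing) };
	char out[8] = {0};

	assert(fat_ptr_read(out, 4, &p, 0) == 0);	/* in bounds */
	assert(memcmp(out, "dynp", 4) == 0);
	assert(fat_ptr_read(out, 4, &p, 6) == -E2BIG);	/* 6 + 4 > 8 */
}
```

The caller never computes an address itself; it only hands the opaque handle to an accessor, which is what lets the accessor enforce the bounds.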

There are several use cases for dynamic pointers in bpf programs. Some
examples include: dynamically sized ringbuf reservations without extra
memcpys, dynamic string parsing and memory comparisons, dynamic memory
allocations that can be persisted in maps, and dynamic + ergonomic parsing of
sk_buff and xdp_md packet data.

At a high-level, the patches are as follows:
1/6 - Adds verifier support for dynptrs
2/6 - Adds bpf_dynptr_from_mem (local dynptr)
3/6 - Adds dynptr support for ring buffers
4/6 - Adds bpf_dynptr_read and bpf_dynptr_write
5/6 - Adds dynptr data slices (ptr to the dynptr data)
6/6 - Tests to check that the verifier rejects invalid cases and passes valid ones

This is the first dynptr patchset in a larger series. The next series of
patches will add dynptrs that support dynamic memory allocations that can also
be persisted in maps, support for parsing packet data through dynptrs, convenience
helpers for using dynptrs as iterators, and more helper functions for interacting
with strings and memory dynamically.

Changelog:
----------
v5 -> v6:
v5:
https://lore.kernel.org/bpf/20220520044245.3305025-1-joannelkoong@gmail.com/
* enforce PTR_TO_MAP_VALUE for bpf_dynptr_from_mem data in check_helper_call
instead of using DYNPTR_TYPE_LOCAL when checking func arg compatibility
* remove MEM_DYNPTR modifier

v4 -> v5:
v4:
https://lore.kernel.org/bpf/20220509224257.3222614-1-joannelkoong@gmail.com/
* Remove malloc dynptr; this will be part of the 2nd patchset while we
figure out memory accounting
* For data slices, only set the register's ref_obj_id to dynptr_id (Alexei)
* Tidying (eg remove "inline", move offset checking to "check_func_arg_reg_off")
(David)
* Add a few new test cases, remove malloc-only ones.

v3 -> v4:
v3:
https://lore.kernel.org/bpf/20220428211059.4065379-1-joannelkoong@gmail.com/
1/6 - Change mem ptr + size check to use more concise inequality expression
(David + Andrii)
2/6 - Add check for meta->uninit_dynptr_regno not already set (Andrii)
      Move DYNPTR_TYPE_FLAG_MASK to include/linux/bpf.h (Andrii)
3/6 - Remove four underscores for invoking BPF_CALL (Andrii)
      Add __BPF_TYPE_FLAG_MAX and use it for __BPF_TYPE_LAST_FLAG (Andrii)
4/6 - Fix capacity to be bpf_dynptr size value in check_off_len (Andrii)
      Change -EINVAL to -E2BIG if len + offset is out of bounds (Andrii)
5/6 - Add check for only 1 dynptr arg for dynptr data function (Andrii)
6/6 - For ringbuf map, set max_entries from userspace (Andrii)
      Use err ?: ... for interacting with dynptr APIs (Andrii)
      Define array_map2 for add_dynptr_to_map2 test where value is a struct
with an embedded dynptr
      Remove ref id from missing_put_callback message, since on different
environments, ref id is not always = 1

v2 -> v3:
v2:
https://lore.kernel.org/bpf/20220416063429.3314021-1-joannelkoong@gmail.com/
* Reorder patches (move ringbuffer patch to be right after the verifier +
malloc dynptr patchset)
* Remove local type dynptrs (Andrii + Alexei)
* Mark stack slot as STACK_MISC after any writes into a dynptr instead of
explicitly prohibiting writes (Alexei)
* Pass number of slots, not memory size to is_spi_bounds_valid (Kumar)
* Check reference leaks by adding dynptr id to state->refs instead of checking
stack slots (Alexei)

v1 -> v2:
v1: https://lore.kernel.org/bpf/20220402015826.3941317-1-joannekoong@fb.com/
1/7 -
    * Remove ARG_PTR_TO_MAP_VALUE_UNINIT alias and use
      ARG_PTR_TO_MAP_VALUE | MEM_UNINIT directly (Andrii)
    * Drop arg_type_is_mem_ptr() wrapper function (Andrii)
2/7 -
    * Change name from MEM_RELEASE to OBJ_RELEASE (Andrii)
    * Use meta.release_ref instead of ref_obj_id != 0 to determine whether
      to release reference (Kumar)
    * Drop type_is_release_mem() wrapper function (Andrii)
3/7 -
    * Add checks for nested dynptrs edge-cases, which could lead to corrupt
      writes of the dynptr stack variable.
    * Add u64 flags to bpf_dynptr_from_mem() and bpf_dynptr_alloc() (Andrii)
    * Rename from bpf_malloc/bpf_free to bpf_dynptr_alloc/bpf_dynptr_put
      (Alexei)
    * Support alloc flag __GFP_ZERO (Andrii)
    * Reserve upper 8 bits in dynptr size and offset fields instead of
      reserving just the upper 4 bits (Andrii)
    * Allow dynptr zero-slices (Andrii)
    * Use the highest bit for is_rdonly instead of the 28th bit (Andrii)
    * Rename check_* functions to is_* functions for better readability
      (Andrii)
    * Add comment for code that checks the spi bounds (Andrii)
4/7 -
    * Fix doc description for bpf_dynptr_read (Toke)
    * Move bpf_dynptr_check_off_len() from function patch 1 to here (Andrii)
5/7 -
    * When finding the id for the dynptr to associate the data slice with,
      look for dynptr arg instead of assuming it is BPF_REG_1.
6/7 -
    * Add __force when casting from unsigned long to void * (kernel test
      robot)
    * Expand on docs for ringbuf dynptr APIs (Andrii)
7/7 -
    * Use table approach for defining test programs and error messages
      (Andrii)
    * Print out full log if there’s an error (Andrii)
    * Use bpf_object__find_program_by_name() instead of specifying
      program name as a string (Andrii)
    * Add 6 extra cases: invalid_nested_dynptrs1, invalid_nested_dynptrs2,
      invalid_ref_mem1, invalid_ref_mem2, zero_slice_access,
      and test_alloc_zero_bytes
    * Add checking for edge cases (eg allocing with invalid flags)
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
parents 1ec5ee8c 0cf7052a
......@@ -392,10 +392,18 @@ enum bpf_type_flag {
MEM_UNINIT = BIT(7 + BPF_BASE_TYPE_BITS),
/* DYNPTR points to memory local to the bpf program. */
DYNPTR_TYPE_LOCAL = BIT(8 + BPF_BASE_TYPE_BITS),
/* DYNPTR points to a ringbuf record. */
DYNPTR_TYPE_RINGBUF = BIT(9 + BPF_BASE_TYPE_BITS),
__BPF_TYPE_FLAG_MAX,
__BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1,
};
#define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF)
/* Max number of base types. */
#define BPF_BASE_TYPE_LIMIT (1UL << BPF_BASE_TYPE_BITS)
......@@ -438,6 +446,7 @@ enum bpf_arg_type {
ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */
ARG_PTR_TO_TIMER, /* pointer to bpf_timer */
ARG_PTR_TO_KPTR, /* pointer to referenced kptr */
ARG_PTR_TO_DYNPTR, /* pointer to bpf_dynptr. See bpf_type_flag for dynptr type */
__BPF_ARG_TYPE_MAX,
/* Extended arg_types. */
......@@ -479,6 +488,7 @@ enum bpf_return_type {
RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
RET_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
RET_PTR_TO_DYNPTR_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
/* This must be the last entry. Its purpose is to ensure the enum is
......@@ -2225,6 +2235,9 @@ extern const struct bpf_func_proto bpf_ringbuf_reserve_proto;
extern const struct bpf_func_proto bpf_ringbuf_submit_proto;
extern const struct bpf_func_proto bpf_ringbuf_discard_proto;
extern const struct bpf_func_proto bpf_ringbuf_query_proto;
extern const struct bpf_func_proto bpf_ringbuf_reserve_dynptr_proto;
extern const struct bpf_func_proto bpf_ringbuf_submit_dynptr_proto;
extern const struct bpf_func_proto bpf_ringbuf_discard_dynptr_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp6_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp_sock_proto;
extern const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto;
......@@ -2376,4 +2389,33 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
u32 **bin_buf, u32 num_args);
void bpf_bprintf_cleanup(void);
/* the implementation of the opaque uapi struct bpf_dynptr */
struct bpf_dynptr_kern {
void *data;
/* Size represents the number of usable bytes of dynptr data.
* If for example the offset is at 4 for a local dynptr whose data is
* of type u64, the number of usable bytes is 4.
*
* The upper 8 bits are reserved. It is as follows:
* Bits 0 - 23 = size
* Bits 24 - 30 = dynptr type
* Bit 31 = whether dynptr is read-only
*/
u32 size;
u32 offset;
} __aligned(8);
enum bpf_dynptr_type {
BPF_DYNPTR_TYPE_INVALID,
/* Points to memory that is local to the bpf program */
BPF_DYNPTR_TYPE_LOCAL,
/* Underlying data is a ringbuf record */
BPF_DYNPTR_TYPE_RINGBUF,
};
void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
enum bpf_dynptr_type type, u32 offset, u32 size);
void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
int bpf_dynptr_check_size(u32 size);
#endif /* _LINUX_BPF_H */
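
The comment on struct bpf_dynptr_kern above describes how the 32-bit size field also carries the dynptr type and a read-only flag. The userspace sketch below shows one way such packing can work. The mask, the shift of 28 and the bit 31 flag mirror the DYNPTR_SIZE_MASK, DYNPTR_TYPE_SHIFT and DYNPTR_RDONLY_BIT constants that appear later in this diff; every model_* name is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SIZE_MASK		0xFFFFFFu	/* bits 0-23: size */
#define MODEL_TYPE_SHIFT	28		/* type lives inside bits 24-30 */
#define MODEL_RDONLY_BIT	(1u << 31)	/* bit 31: read-only flag */

enum model_dynptr_type { MODEL_INVALID, MODEL_LOCAL, MODEL_RINGBUF };

/* Pack size, type and the read-only flag into one 32-bit word. */
uint32_t model_pack(uint32_t size, enum model_dynptr_type type, bool rdonly)
{
	uint32_t v = size & MODEL_SIZE_MASK;

	v |= (uint32_t)type << MODEL_TYPE_SHIFT;
	if (rdonly)
		v |= MODEL_RDONLY_BIT;
	return v;
}

uint32_t model_size(uint32_t packed)
{
	return packed & MODEL_SIZE_MASK;
}

bool model_is_rdonly(uint32_t packed)
{
	return (packed & MODEL_RDONLY_BIT) != 0;
}

enum model_dynptr_type model_type(uint32_t packed)
{
	/* Drop the read-only bit before shifting the type back down. */
	return (enum model_dynptr_type)((packed & ~MODEL_RDONLY_BIT) >> MODEL_TYPE_SHIFT);
}

void model_pack_selfcheck(void)
{
	uint32_t v = model_pack(100, MODEL_RINGBUF, false);

	assert(model_size(v) == 100);
	assert(model_type(v) == MODEL_RINGBUF);
	assert(!model_is_rdonly(v));
}
```

Because only 24 bits hold the size, the largest representable size in this model is 2^24 - 1, which is consistent with the DYNPTR_MAX_SIZE limit in the diff.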
......@@ -72,6 +72,18 @@ struct bpf_reg_state {
u32 mem_size; /* for PTR_TO_MEM | PTR_TO_MEM_OR_NULL */
/* For dynptr stack slots */
struct {
enum bpf_dynptr_type type;
/* A dynptr is 16 bytes so it takes up 2 stack slots.
* We need to track which slot is the first slot
* to protect against cases where the user may try to
* pass in an address starting at the second slot of the
* dynptr.
*/
bool first_slot;
} dynptr;
/* Max size from any of the above. */
struct {
unsigned long raw1;
......@@ -88,6 +100,8 @@ struct bpf_reg_state {
* for the purpose of tracking that it's freed.
* For PTR_TO_SOCKET this is used to share which pointers retain the
* same reference to the socket, to determine proper reference freeing.
* For stack slots that are dynptrs, this is used to track references to
* the dynptr to determine proper reference freeing.
*/
u32 id;
/* PTR_TO_SOCKET and PTR_TO_TCP_SOCK could be a ptr returned
......@@ -174,9 +188,15 @@ enum bpf_stack_slot_type {
STACK_SPILL, /* register spilled into stack */
STACK_MISC, /* BPF program wrote some data into this slot */
STACK_ZERO, /* BPF program wrote constant zero */
/* A dynptr is stored in this stack slot. The type of dynptr
* is stored in bpf_stack_state->spilled_ptr.dynptr.type
*/
STACK_DYNPTR,
};
#define BPF_REG_SIZE 8 /* size of eBPF register in bytes */
#define BPF_DYNPTR_SIZE sizeof(struct bpf_dynptr_kern)
#define BPF_DYNPTR_NR_SLOTS (BPF_DYNPTR_SIZE / BPF_REG_SIZE)
struct bpf_stack_state {
struct bpf_reg_state spilled_ptr;
......
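
The first_slot comment above explains that a 16-byte dynptr spans two 8-byte stack slots and that a pointer aimed at the second slot must be rejected. The sketch below models that bookkeeping in userspace C. It is a deliberate simplification (slots are indexed upward from 0, unlike the real verifier stack), and all model_* names and the eight-slot stack are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_REG_SIZE		8	/* bytes per stack slot */
#define MODEL_DYNPTR_SIZE	16	/* bytes per dynptr */
#define MODEL_NR_SLOTS		(MODEL_DYNPTR_SIZE / MODEL_REG_SIZE)
#define MODEL_STACK_SLOTS	8	/* hypothetical stack depth */

enum model_slot_type { SLOT_INVALID, SLOT_MISC, SLOT_DYNPTR };

struct model_slot {
	enum model_slot_type type;
	bool first_slot;	/* true only for the slot that starts a dynptr */
};

struct model_stack {
	struct model_slot slots[MODEL_STACK_SLOTS];
};

/* Record that a dynptr now occupies slots spi .. spi + MODEL_NR_SLOTS - 1. */
bool model_mark_dynptr(struct model_stack *st, size_t spi)
{
	if (spi > MODEL_STACK_SLOTS - MODEL_NR_SLOTS)
		return false;
	for (size_t i = 0; i < MODEL_NR_SLOTS; i++) {
		st->slots[spi + i].type = SLOT_DYNPTR;
		st->slots[spi + i].first_slot = (i == 0);
	}
	return true;
}

/* Accept spi as a dynptr argument only if it is the start of a full dynptr. */
bool model_is_dynptr_arg(const struct model_stack *st, size_t spi)
{
	if (spi > MODEL_STACK_SLOTS - MODEL_NR_SLOTS)
		return false;
	if (!st->slots[spi].first_slot)
		return false;
	for (size_t i = 0; i < MODEL_NR_SLOTS; i++)
		if (st->slots[spi + i].type != SLOT_DYNPTR)
			return false;
	return true;
}

void model_stack_selfcheck(void)
{
	struct model_stack st = {0};

	assert(model_mark_dynptr(&st, 2));
	assert(model_is_dynptr_arg(&st, 2));	/* start of the dynptr */
	assert(!model_is_dynptr_arg(&st, 3));	/* second slot is rejected */
}
```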
......@@ -5178,6 +5178,77 @@ union bpf_attr {
* Dynamically cast a *sk* pointer to a *mptcp_sock* pointer.
* Return
* *sk* if casting is valid, or **NULL** otherwise.
*
* long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Get a dynptr to local memory *data*.
*
* *data* must be a ptr to a map value.
* The maximum *size* supported is DYNPTR_MAX_SIZE.
* *flags* is currently unused.
* Return
* 0 on success, -E2BIG if the size exceeds DYNPTR_MAX_SIZE,
* -EINVAL if flags is not 0.
*
* long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*
* through the dynptr interface. *flags* must be 0.
*
* Please note that a corresponding bpf_ringbuf_submit_dynptr or
* bpf_ringbuf_discard_dynptr must be called on *ptr*, even if the
* reservation fails. This is enforced by the verifier.
* Return
* 0 on success, or a negative error in case of failure.
*
* void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags)
* Description
* Submit reserved ring buffer sample, pointed to by *data*,
* through the dynptr interface. This is a no-op if the dynptr is
* invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_submit'.
* Return
* Nothing. Always succeeds.
*
* void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags)
* Description
* Discard reserved ring buffer sample through the dynptr
* interface. This is a no-op if the dynptr is invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_discard'.
* Return
* Nothing. Always succeeds.
*
* long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset)
* Description
* Read *len* bytes from *src* into *dst*, starting from *offset*
* into *src*.
* Return
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *src*'s data, -EINVAL if *src* is an invalid dynptr.
*
* long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len)
* Description
* Write *len* bytes from *src* into *dst*, starting from *offset*
* into *dst*.
* Return
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
* is a read-only dynptr.
*
* void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
* Description
* Get a pointer to the underlying dynptr data.
*
* *len* must be a statically known value. The returned data slice
* is invalidated whenever the dynptr is invalidated.
* Return
* Pointer to the underlying dynptr data, NULL if the dynptr is
* read-only, if the dynptr is invalid, or if the offset and length
* is out of bounds.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
......@@ -5377,6 +5448,13 @@ union bpf_attr {
FN(kptr_xchg), \
FN(map_lookup_percpu_elem), \
FN(skc_to_mptcp_sock), \
FN(dynptr_from_mem), \
FN(ringbuf_reserve_dynptr), \
FN(ringbuf_submit_dynptr), \
FN(ringbuf_discard_dynptr), \
FN(dynptr_read), \
FN(dynptr_write), \
FN(dynptr_data), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
......@@ -6528,6 +6606,11 @@ struct bpf_timer {
__u64 :64;
} __attribute__((aligned(8)));
struct bpf_dynptr {
__u64 :64;
__u64 :64;
} __attribute__((aligned(8)));
struct bpf_sysctl {
__u32 write; /* Sysctl is being read (= 0) or written (= 1).
* Allows 1,2,4-byte read, but no write.
......
......@@ -1412,6 +1412,169 @@ const struct bpf_func_proto bpf_kptr_xchg_proto = {
.arg2_btf_id = BPF_PTR_POISON,
};
/* Since the upper 8 bits of dynptr->size is reserved, the
* maximum supported size is 2^24 - 1.
*/
#define DYNPTR_MAX_SIZE ((1UL << 24) - 1)
#define DYNPTR_TYPE_SHIFT 28
#define DYNPTR_SIZE_MASK 0xFFFFFF
#define DYNPTR_RDONLY_BIT BIT(31)
static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
{
return ptr->size & DYNPTR_RDONLY_BIT;
}
static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_type type)
{
ptr->size |= type << DYNPTR_TYPE_SHIFT;
}
static u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
{
return ptr->size & DYNPTR_SIZE_MASK;
}
int bpf_dynptr_check_size(u32 size)
{
return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
}
void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
enum bpf_dynptr_type type, u32 offset, u32 size)
{
ptr->data = data;
ptr->offset = offset;
ptr->size = size;
bpf_dynptr_set_type(ptr, type);
}
void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
{
memset(ptr, 0, sizeof(*ptr));
}
static int bpf_dynptr_check_off_len(struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
{
u32 size = bpf_dynptr_get_size(ptr);
if (len > size || offset > size - len)
return -E2BIG;
return 0;
}
BPF_CALL_4(bpf_dynptr_from_mem, void *, data, u32, size, u64, flags, struct bpf_dynptr_kern *, ptr)
{
int err;
err = bpf_dynptr_check_size(size);
if (err)
goto error;
/* flags is currently unsupported */
if (flags) {
err = -EINVAL;
goto error;
}
bpf_dynptr_init(ptr, data, BPF_DYNPTR_TYPE_LOCAL, 0, size);
return 0;
error:
bpf_dynptr_set_null(ptr);
return err;
}
const struct bpf_func_proto bpf_dynptr_from_mem_proto = {
.func = bpf_dynptr_from_mem,
.gpl_only = false,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_UNINIT_MEM,
.arg2_type = ARG_CONST_SIZE_OR_ZERO,
.arg3_type = ARG_ANYTHING,
.arg4_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT,
};
BPF_CALL_4(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src, u32, offset)
{
int err;
if (!src->data)
return -EINVAL;
err = bpf_dynptr_check_off_len(src, offset, len);
if (err)
return err;
memcpy(dst, src->data + src->offset + offset, len);
return 0;
}
const struct bpf_func_proto bpf_dynptr_read_proto = {
.func = bpf_dynptr_read,
.gpl_only = false,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_UNINIT_MEM,
.arg2_type = ARG_CONST_SIZE_OR_ZERO,
.arg3_type = ARG_PTR_TO_DYNPTR,
.arg4_type = ARG_ANYTHING,
};
BPF_CALL_4(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src, u32, len)
{
int err;
if (!dst->data || bpf_dynptr_is_rdonly(dst))
return -EINVAL;
err = bpf_dynptr_check_off_len(dst, offset, len);
if (err)
return err;
memcpy(dst->data + dst->offset + offset, src, len);
return 0;
}
const struct bpf_func_proto bpf_dynptr_write_proto = {
.func = bpf_dynptr_write,
.gpl_only = false,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_DYNPTR,
.arg2_type = ARG_ANYTHING,
.arg3_type = ARG_PTR_TO_MEM | MEM_RDONLY,
.arg4_type = ARG_CONST_SIZE_OR_ZERO,
};
BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
{
int err;
if (!ptr->data)
return 0;
err = bpf_dynptr_check_off_len(ptr, offset, len);
if (err)
return 0;
if (bpf_dynptr_is_rdonly(ptr))
return 0;
return (unsigned long)(ptr->data + ptr->offset + offset);
}
const struct bpf_func_proto bpf_dynptr_data_proto = {
.func = bpf_dynptr_data,
.gpl_only = false,
.ret_type = RET_PTR_TO_DYNPTR_MEM_OR_NULL,
.arg1_type = ARG_PTR_TO_DYNPTR,
.arg2_type = ARG_ANYTHING,
.arg3_type = ARG_CONST_ALLOC_SIZE_OR_ZERO,
};
const struct bpf_func_proto bpf_get_current_task_proto __weak;
const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
const struct bpf_func_proto bpf_probe_read_user_proto __weak;
......@@ -1460,12 +1623,26 @@ bpf_base_func_proto(enum bpf_func_id func_id)
return &bpf_ringbuf_discard_proto;
case BPF_FUNC_ringbuf_query:
return &bpf_ringbuf_query_proto;
case BPF_FUNC_ringbuf_reserve_dynptr:
return &bpf_ringbuf_reserve_dynptr_proto;
case BPF_FUNC_ringbuf_submit_dynptr:
return &bpf_ringbuf_submit_dynptr_proto;
case BPF_FUNC_ringbuf_discard_dynptr:
return &bpf_ringbuf_discard_dynptr_proto;
case BPF_FUNC_for_each_map_elem:
return &bpf_for_each_map_elem_proto;
case BPF_FUNC_loop:
return &bpf_loop_proto;
case BPF_FUNC_strncmp:
return &bpf_strncmp_proto;
case BPF_FUNC_dynptr_from_mem:
return &bpf_dynptr_from_mem_proto;
case BPF_FUNC_dynptr_read:
return &bpf_dynptr_read_proto;
case BPF_FUNC_dynptr_write:
return &bpf_dynptr_write_proto;
case BPF_FUNC_dynptr_data:
return &bpf_dynptr_data_proto;
default:
break;
}
......
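
bpf_dynptr_check_off_len() above tests "len > size || offset > size - len" rather than the more obvious "offset + len > size". The sketch below, written as ordinary userspace C with hypothetical function names, shows why the subtraction form is safer: with 32-bit unsigned operands the addition can wrap around and make an out-of-range request look valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive form: offset + len can wrap around in 32-bit arithmetic. */
bool naive_in_bounds(uint32_t size, uint32_t offset, uint32_t len)
{
	return offset + len <= size;
}

/* Form used in the diff: no addition, so nothing can wrap. */
bool safe_in_bounds(uint32_t size, uint32_t offset, uint32_t len)
{
	return !(len > size || offset > size - len);
}

void bounds_selfcheck(void)
{
	uint32_t size = 16;

	/* Both forms agree on an ordinary in-range request. */
	assert(naive_in_bounds(size, 4, 8) && safe_in_bounds(size, 4, 8));
	/* UINT32_MAX + 4 wraps to 3, so the naive check wrongly accepts it. */
	assert(naive_in_bounds(size, UINT32_MAX, 4));
	assert(!safe_in_bounds(size, UINT32_MAX, 4));
}
```

Checking len against size first guarantees that size - len itself cannot underflow.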
......@@ -475,3 +475,81 @@ const struct bpf_func_proto bpf_ringbuf_query_proto = {
.arg1_type = ARG_CONST_MAP_PTR,
.arg2_type = ARG_ANYTHING,
};
BPF_CALL_4(bpf_ringbuf_reserve_dynptr, struct bpf_map *, map, u32, size, u64, flags,
struct bpf_dynptr_kern *, ptr)
{
struct bpf_ringbuf_map *rb_map;
void *sample;
int err;
if (unlikely(flags)) {
bpf_dynptr_set_null(ptr);
return -EINVAL;
}
err = bpf_dynptr_check_size(size);
if (err) {
bpf_dynptr_set_null(ptr);
return err;
}
rb_map = container_of(map, struct bpf_ringbuf_map, map);
sample = __bpf_ringbuf_reserve(rb_map->rb, size);
if (!sample) {
bpf_dynptr_set_null(ptr);
return -EINVAL;
}
bpf_dynptr_init(ptr, sample, BPF_DYNPTR_TYPE_RINGBUF, 0, size);
return 0;
}
const struct bpf_func_proto bpf_ringbuf_reserve_dynptr_proto = {
.func = bpf_ringbuf_reserve_dynptr,
.ret_type = RET_INTEGER,
.arg1_type = ARG_CONST_MAP_PTR,
.arg2_type = ARG_ANYTHING,
.arg3_type = ARG_ANYTHING,
.arg4_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_RINGBUF | MEM_UNINIT,
};
BPF_CALL_2(bpf_ringbuf_submit_dynptr, struct bpf_dynptr_kern *, ptr, u64, flags)
{
if (!ptr->data)
return 0;
bpf_ringbuf_commit(ptr->data, flags, false /* discard */);
bpf_dynptr_set_null(ptr);
return 0;
}
const struct bpf_func_proto bpf_ringbuf_submit_dynptr_proto = {
.func = bpf_ringbuf_submit_dynptr,
.ret_type = RET_VOID,
.arg1_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_RINGBUF | OBJ_RELEASE,
.arg2_type = ARG_ANYTHING,
};
BPF_CALL_2(bpf_ringbuf_discard_dynptr, struct bpf_dynptr_kern *, ptr, u64, flags)
{
if (!ptr->data)
return 0;
bpf_ringbuf_commit(ptr->data, flags, true /* discard */);
bpf_dynptr_set_null(ptr);
return 0;
}
const struct bpf_func_proto bpf_ringbuf_discard_dynptr_proto = {
.func = bpf_ringbuf_discard_dynptr,
.ret_type = RET_VOID,
.arg1_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_RINGBUF | OBJ_RELEASE,
.arg2_type = ARG_ANYTHING,
};
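
The ringbuf helpers above follow a simple lifecycle: a failed reservation leaves a null dynptr behind, and submit or discard on a null dynptr does nothing, so a program can release unconditionally. The userspace sketch below models that contract with a toy buffer. It is not the kernel ring buffer; struct toy_ring, struct toy_dynptr and the toy_* functions are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_ring {
	char buf[64];
	size_t used;	/* bytes handed out so far */
	int committed;	/* records submitted */
	int discarded;	/* records discarded */
};

struct toy_dynptr {
	struct toy_ring *ring;
	void *data;	/* NULL means "invalid/null dynptr" */
	uint32_t size;
};

int toy_reserve(struct toy_ring *rb, uint32_t size, struct toy_dynptr *ptr)
{
	if (size > sizeof(rb->buf) - rb->used) {
		memset(ptr, 0, sizeof(*ptr));	/* null the dynptr on failure */
		return -EINVAL;
	}
	ptr->ring = rb;
	ptr->data = rb->buf + rb->used;
	ptr->size = size;
	rb->used += size;
	return 0;
}

static void toy_release(struct toy_dynptr *ptr, bool discard)
{
	if (!ptr->data)
		return;				/* no-op on a null dynptr */
	if (discard)
		ptr->ring->discarded++;
	else
		ptr->ring->committed++;
	memset(ptr, 0, sizeof(*ptr));		/* dynptr is invalid afterwards */
}

void toy_submit(struct toy_dynptr *ptr)
{
	toy_release(ptr, false);
}

void toy_discard(struct toy_dynptr *ptr)
{
	toy_release(ptr, true);
}

void toy_ring_selfcheck(void)
{
	struct toy_ring rb = {0};
	struct toy_dynptr ok, bad;

	assert(toy_reserve(&rb, 16, &ok) == 0);
	assert(toy_reserve(&rb, 100, &bad) == -EINVAL);
	toy_submit(&ok);
	toy_discard(&bad);	/* still called, but does nothing */
	assert(rb.committed == 1 && rb.discarded == 0);
}
```

In the real patchset the pairing of reserve with submit or discard is checked by the verifier at load time; this model only mirrors the runtime behaviour.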
......@@ -634,6 +634,7 @@ class PrinterHelpers(Printer):
'struct file',
'struct bpf_timer',
'struct mptcp_sock',
'struct bpf_dynptr',
]
known_types = {
'...',
......@@ -684,6 +685,7 @@ class PrinterHelpers(Printer):
'struct file',
'struct bpf_timer',
'struct mptcp_sock',
'struct bpf_dynptr',
}
mapped_types = {
'u8': '__u8',
......
......@@ -5178,6 +5178,77 @@ union bpf_attr {
* Dynamically cast a *sk* pointer to a *mptcp_sock* pointer.
* Return
* *sk* if casting is valid, or **NULL** otherwise.
*
* long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Get a dynptr to local memory *data*.
*
* *data* must be a ptr to a map value.
* The maximum *size* supported is DYNPTR_MAX_SIZE.
* *flags* is currently unused.
* Return
* 0 on success, -E2BIG if the size exceeds DYNPTR_MAX_SIZE,
* -EINVAL if flags is not 0.
*
* long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*
* through the dynptr interface. *flags* must be 0.
*
* Please note that a corresponding bpf_ringbuf_submit_dynptr or
* bpf_ringbuf_discard_dynptr must be called on *ptr*, even if the
* reservation fails. This is enforced by the verifier.
* Return
* 0 on success, or a negative error in case of failure.
*
* void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags)
* Description
* Submit reserved ring buffer sample, pointed to by *data*,
* through the dynptr interface. This is a no-op if the dynptr is
* invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_submit'.
* Return
* Nothing. Always succeeds.
*
* void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags)
* Description
* Discard reserved ring buffer sample through the dynptr
* interface. This is a no-op if the dynptr is invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_discard'.
* Return
* Nothing. Always succeeds.
*
* long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset)
* Description
* Read *len* bytes from *src* into *dst*, starting from *offset*
* into *src*.
* Return
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *src*'s data, -EINVAL if *src* is an invalid dynptr.
*
* long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len)
* Description
* Write *len* bytes from *src* into *dst*, starting from *offset*
* into *dst*.
* Return
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
* is a read-only dynptr.
*
* void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
* Description
* Get a pointer to the underlying dynptr data.
*
* *len* must be a statically known value. The returned data slice
* is invalidated whenever the dynptr is invalidated.
* Return
* Pointer to the underlying dynptr data, NULL if the dynptr is
* read-only, if the dynptr is invalid, or if the offset and length
* is out of bounds.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
......@@ -5377,6 +5448,13 @@ union bpf_attr {
FN(kptr_xchg), \
FN(map_lookup_percpu_elem), \
FN(skc_to_mptcp_sock), \
FN(dynptr_from_mem), \
FN(ringbuf_reserve_dynptr), \
FN(ringbuf_submit_dynptr), \
FN(ringbuf_discard_dynptr), \
FN(dynptr_read), \
FN(dynptr_write), \
FN(dynptr_data), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
......@@ -6528,6 +6606,11 @@ struct bpf_timer {
__u64 :64;
} __attribute__((aligned(8)));
struct bpf_dynptr {
__u64 :64;
__u64 :64;
} __attribute__((aligned(8)));
struct bpf_sysctl {
__u32 write; /* Sysctl is being read (= 0) or written (= 1).
* Allows 1,2,4-byte read, but no write.
......
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Facebook */
#include <test_progs.h>
#include "dynptr_fail.skel.h"
#include "dynptr_success.skel.h"
static size_t log_buf_sz = 1048576; /* 1 MB */
static char obj_log_buf[1048576];
static struct {
const char *prog_name;
const char *expected_err_msg;
} dynptr_tests[] = {
/* failure cases */
{"ringbuf_missing_release1", "Unreleased reference id=1"},
{"ringbuf_missing_release2", "Unreleased reference id=2"},
{"ringbuf_missing_release_callback", "Unreleased reference id"},
{"use_after_invalid", "Expected an initialized dynptr as arg #3"},
{"ringbuf_invalid_api", "type=mem expected=alloc_mem"},
{"add_dynptr_to_map1", "invalid indirect read from stack"},
{"add_dynptr_to_map2", "invalid indirect read from stack"},
{"data_slice_out_of_bounds_ringbuf", "value is outside of the allowed memory range"},
{"data_slice_out_of_bounds_map_value", "value is outside of the allowed memory range"},
{"data_slice_use_after_release", "invalid mem access 'scalar'"},
{"data_slice_missing_null_check1", "invalid mem access 'mem_or_null'"},
{"data_slice_missing_null_check2", "invalid mem access 'mem_or_null'"},
{"invalid_helper1", "invalid indirect read from stack"},
{"invalid_helper2", "Expected an initialized dynptr as arg #3"},
{"invalid_write1", "Expected an initialized dynptr as arg #1"},
{"invalid_write2", "Expected an initialized dynptr as arg #3"},
{"invalid_write3", "Expected an initialized ringbuf dynptr as arg #1"},
{"invalid_write4", "arg 1 is an unacquired reference"},
{"invalid_read1", "invalid read from stack"},
{"invalid_read2", "cannot pass in dynptr at an offset"},
{"invalid_read3", "invalid read from stack"},
{"invalid_read4", "invalid read from stack"},
{"invalid_offset", "invalid write to stack"},
{"global", "type=map_value expected=fp"},
{"release_twice", "arg 1 is an unacquired reference"},
{"release_twice_callback", "arg 1 is an unacquired reference"},
{"dynptr_from_mem_invalid_api",
"Unsupported reg type fp for bpf_dynptr_from_mem data"},
/* success cases */
{"test_read_write", NULL},
{"test_data_slice", NULL},
{"test_ringbuf", NULL},
};
static void verify_fail(const char *prog_name, const char *expected_err_msg)
{
LIBBPF_OPTS(bpf_object_open_opts, opts);
struct bpf_program *prog;
struct dynptr_fail *skel;
int err;
opts.kernel_log_buf = obj_log_buf;
opts.kernel_log_size = log_buf_sz;
opts.kernel_log_level = 1;
skel = dynptr_fail__open_opts(&opts);
if (!ASSERT_OK_PTR(skel, "dynptr_fail__open_opts"))
goto cleanup;
prog = bpf_object__find_program_by_name(skel->obj, prog_name);
if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
goto cleanup;
bpf_program__set_autoload(prog, true);
bpf_map__set_max_entries(skel->maps.ringbuf, getpagesize());
err = dynptr_fail__load(skel);
if (!ASSERT_ERR(err, "unexpected load success"))
goto cleanup;
if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) {
fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg);
fprintf(stderr, "Verifier output: %s\n", obj_log_buf);
}
cleanup:
dynptr_fail__destroy(skel);
}
static void verify_success(const char *prog_name)
{
struct dynptr_success *skel;
struct bpf_program *prog;
struct bpf_link *link;
skel = dynptr_success__open();
if (!ASSERT_OK_PTR(skel, "dynptr_success__open"))
return;
skel->bss->pid = getpid();
bpf_map__set_max_entries(skel->maps.ringbuf, getpagesize());
dynptr_success__load(skel);
if (!ASSERT_OK_PTR(skel, "dynptr_success__load"))
goto cleanup;
prog = bpf_object__find_program_by_name(skel->obj, prog_name);
if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
goto cleanup;
link = bpf_program__attach(prog);
if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
goto cleanup;
usleep(1);
ASSERT_EQ(skel->bss->err, 0, "err");
bpf_link__destroy(link);
cleanup:
dynptr_success__destroy(skel);
}
void test_dynptr(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(dynptr_tests); i++) {
if (!test__start_subtest(dynptr_tests[i].prog_name))
continue;
if (dynptr_tests[i].expected_err_msg)
verify_fail(dynptr_tests[i].prog_name,
dynptr_tests[i].expected_err_msg);
else
verify_success(dynptr_tests[i].prog_name);
}
}
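
The test driver above is table-driven: each entry names a program and either the verifier message it must fail with or NULL if it must load. The sketch below reproduces that dispatch logic in standalone C with a fake loader, so the control flow can be followed without libbpf. struct model_case, model_run_case and fake_load are hypothetical stand-ins, not part of the selftests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_case {
	const char *prog_name;
	const char *expected_err_msg;	/* NULL means the program must load */
};

/* Hypothetical stand-in for loading a program; returns the verifier log. */
typedef const char *(*model_loader)(const char *prog_name, int *load_err);

/* Returns 1 if the case behaves as its table entry demands, 0 otherwise. */
int model_run_case(const struct model_case *c, model_loader load)
{
	int load_err = 0;
	const char *log = load(c->prog_name, &load_err);

	if (!c->expected_err_msg)
		return load_err == 0;
	/* A failure case passes only if loading failed with the right message. */
	return load_err != 0 && log && strstr(log, c->expected_err_msg) != NULL;
}

/* Fake loader: rejects one known program name, accepts everything else. */
const char *fake_load(const char *prog_name, int *load_err)
{
	if (strcmp(prog_name, "release_twice") == 0) {
		*load_err = -1;
		return "arg 1 is an unacquired reference";
	}
	*load_err = 0;
	return "";
}

void model_table_selfcheck(void)
{
	static const struct model_case cases[] = {
		{"release_twice", "arg 1 is an unacquired reference"},
		{"test_ringbuf", NULL},
	};

	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
		assert(model_run_case(&cases[i], fake_load));
}
```

Matching with strstr keeps the expectations robust against extra log output around the message of interest.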
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Facebook */
#include <string.h>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
#include "errno.h"
char _license[] SEC("license") = "GPL";
int pid, err, val;
struct sample {
int pid;
int seq;
long value;
char comm[16];
};
struct {
__uint(type, BPF_MAP_TYPE_RINGBUF);
} ringbuf SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(max_entries, 1);
__type(key, __u32);
__type(value, __u32);
} array_map SEC(".maps");
SEC("tp/syscalls/sys_enter_nanosleep")
int test_read_write(void *ctx)
{
char write_data[64] = "hello there, world!!";
char read_data[64] = {}, buf[64] = {};
struct bpf_dynptr ptr;
int i;
if (bpf_get_current_pid_tgid() >> 32 != pid)
return 0;
bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
/* Write data into the dynptr */
err = err ?: bpf_dynptr_write(&ptr, 0, write_data, sizeof(write_data));
/* Read the data that was written into the dynptr */
err = err ?: bpf_dynptr_read(read_data, sizeof(read_data), &ptr, 0);
/* Ensure the data we read matches the data we wrote */
for (i = 0; i < sizeof(read_data); i++) {
if (read_data[i] != write_data[i]) {
err = 1;
break;
}
}
bpf_ringbuf_discard_dynptr(&ptr, 0);
return 0;
}
SEC("tp/syscalls/sys_enter_nanosleep")
int test_data_slice(void *ctx)
{
__u32 key = 0, val = 235, *map_val;
struct bpf_dynptr ptr;
__u32 map_val_size;
void *data;
map_val_size = sizeof(*map_val);
if (bpf_get_current_pid_tgid() >> 32 != pid)
return 0;
bpf_map_update_elem(&array_map, &key, &val, 0);
map_val = bpf_map_lookup_elem(&array_map, &key);
if (!map_val) {
err = 1;
return 0;
}
bpf_dynptr_from_mem(map_val, map_val_size, 0, &ptr);
/* Try getting a data slice that is out of range */
data = bpf_dynptr_data(&ptr, map_val_size + 1, 1);
if (data) {
err = 2;
return 0;
}
/* Try getting more bytes than available */
data = bpf_dynptr_data(&ptr, 0, map_val_size + 1);
if (data) {
err = 3;
return 0;
}
data = bpf_dynptr_data(&ptr, 0, sizeof(__u32));
if (!data) {
err = 4;
return 0;
}
*(__u32 *)data = 999;
err = bpf_probe_read_kernel(&val, sizeof(val), data);
if (err)
return 0;
if (val != *(int *)data)
err = 5;
return 0;
}
static int ringbuf_callback(__u32 index, void *data)
{
struct sample *sample;
struct bpf_dynptr *ptr = (struct bpf_dynptr *)data;
sample = bpf_dynptr_data(ptr, 0, sizeof(*sample));
if (!sample)
err = 2;
else
sample->pid += index;
return 0;
}
SEC("tp/syscalls/sys_enter_nanosleep")
int test_ringbuf(void *ctx)
{
struct bpf_dynptr ptr;
struct sample *sample;
if (bpf_get_current_pid_tgid() >> 32 != pid)
return 0;
val = 100;
/* check that you can reserve a dynamic size reservation */
err = bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
sample = err ? NULL : bpf_dynptr_data(&ptr, 0, sizeof(*sample));
if (!sample) {
err = 1;
goto done;
}
sample->pid = 10;
/* Can pass dynptr to callback functions */
bpf_loop(10, ringbuf_callback, &ptr, 0);
if (sample->pid != 55)
err = 2;
done:
bpf_ringbuf_discard_dynptr(&ptr, 0);
return 0;
}
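
test_read_write above chains helper calls with "err = err ?: call()". This relies on the GNU C conditional with an omitted middle operand, which GCC and Clang accept: the expression yields err if it is non-zero and otherwise evaluates the call. The sketch below demonstrates the short-circuit behaviour in userspace; chain_step, chain_run and chain_calls are hypothetical names.

```c
#include <assert.h>

int chain_calls;	/* counts how many steps actually ran */

int chain_step(int ret)
{
	chain_calls++;
	return ret;
}

/* Run up to three steps, stopping at the first non-zero result. */
int chain_run(int first, int second, int third)
{
	int err = 0;

	err = err ?: chain_step(first);
	err = err ?: chain_step(second);
	err = err ?: chain_step(third);
	return err;
}

void chain_selfcheck(void)
{
	chain_calls = 0;
	/* The second step fails, so the third is never executed. */
	assert(chain_run(0, -22, -7) == -22);
	assert(chain_calls == 2);
}
```

The first error is preserved and later calls are skipped, which keeps the test body linear without nested if statements.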