Commit db55174d authored by Daniel Borkmann

Merge branch 'bpf-kptr-rcu'

Alexei Starovoitov says:

====================
v4->v5:
fix typos, add acks.

v3->v4:
- patch 3 got much cleaner after BPF_KPTR_RCU was removed as suggested by David.

- make KF_RCU stronger and require that the bpf program check for NULL
before passing such pointers into a kfunc. The prog has to do that anyway
to access fields, and it aligns with the BTF_TYPE_SAFE_RCU allowlist
(see the sketch after this list).

- New patch 6: refactor RCU enforcement in the verifier.
Patches 2, 3, and 6 are part of one feature.
Patches 2 and 3 alone are incomplete, since RCU pointers are barely useful
without bpf_rcu_read_lock/unlock in GCC-compiled kernels.
Even if GCC lands support for btf_type_tag today, it will take time
to mandate that version for kernel builds. Hence go with an allowlist
approach. See patch 6 for details.
This allows starting strict enforcement of TRUSTED | UNTRUSTED
in one part of PTR_TO_BTF_ID accesses.
One step closer to KF_TRUSTED_ARGS by default.
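
As an illustration of the NULL-check requirement from the first item above,
here is a minimal program-side sketch (map and program names are hypothetical;
the flow mirrors the cgrp_kfunc selftest in this series). The kptr loaded from
the map value is PTR_TO_BTF_ID | MEM_RCU | PTR_MAYBE_NULL inside an RCU CS, so
it must be checked for NULL before being passed to a KF_RCU kfunc such as
bpf_cgroup_ancestor():

struct map_value {
	struct cgroup __kptr *cgrp;	/* struct cgroup is RCU protected */
};

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(rcu_kptr_sketch, struct cgroup *cgrp, const char *path)
{
	struct map_value *v;
	struct cgroup *kptr, *parent;
	int key = 0;

	v = bpf_map_lookup_elem(&some_map, &key);	/* some_map: hypothetical */
	if (!v)
		return 0;
	kptr = v->cgrp;		/* MEM_RCU | PTR_MAYBE_NULL in RCU CS */
	if (!kptr)		/* mandatory before any KF_RCU kfunc call */
		return 0;
	parent = bpf_cgroup_ancestor(kptr, 1);	/* KF_ACQUIRE | KF_RCU | KF_RET_NULL */
	if (parent)
		bpf_cgroup_release(parent);
	return 0;
}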

v2->v3:
- Instead of requiring bpf progs to tag fields with __kptr_rcu,
teach the verifier to infer RCU properties based on the type.
BPF_KPTR_RCU becomes a kernel-internal type of struct btf_field.
- Add patch 2 to tag cgroups and dfl_cgrp as trusted.
That bug was spotted by BPF CI on clang-compiled kernels,
since patch 3 is doing:
static bool in_rcu_cs(struct bpf_verifier_env *env)
{
        return env->cur_state->active_rcu_lock || !env->prog->aux->sleepable;
}
which makes all non-sleepable programs behave like they have an implicit
rcu_read_lock around them, which is the case in practice.
It was fine on gcc-compiled kernels where the task->cgroups dereference was
producing PTR_TO_BTF_ID, but on clang-compiled kernels the task->cgroups
dereference was producing PTR_TO_BTF_ID | MEM_RCU | MAYBE_NULL, which is more
correct, but selftests were failing. Patch 2 fixes this discrepancy
(see the sleepable-program sketch after this changelog entry).
With a few more patches like patch 2 we can make KF_TRUSTED_ARGS the default
for kfuncs and helpers.
- Add a comment in selftest patch 5 that it's a verifier-only check.
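
For example, with patch 2 a sleepable program can walk task->cgroups->dfl_cgrp
inside an explicit RCU CS; a minimal sketch modeled on the yes_rcu_lock
selftest in this series (map_a is assumed to be a cgroup local storage map):

SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int yes_rcu_lock_sketch(void *ctx)
{
	struct task_struct *task = bpf_get_current_task_btf();
	struct cgroup *cgrp;

	bpf_rcu_read_lock();
	cgrp = task->cgroups->dfl_cgrp;	/* trusted under RCU CS via the allowlist */
	(void)bpf_cgrp_storage_get(&map_a, cgrp, 0, BPF_LOCAL_STORAGE_GET_F_CREATE);
	bpf_rcu_read_unlock();
	return 0;
}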

v1->v2:
Instead of aggressively allowing dereferenced kptr_rcu pointers into KF_TRUSTED_ARGS
kfuncs, only allow them into KF_RCU kfuncs.
The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free, but the pointers
may be NULL and the pointee object's reference count could have reached zero, hence
kfuncs must do a != NULL check and consider the refcnt==0 case when accessing such
arguments (see the kfunc sketch after this entry).
No changes in patch 1.
Patches 2, 3, and 4 are adjusted to the above behavior.
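
On the kernel side, a KF_RCU kfunc then looks roughly like this (a sketch
mirroring bpf_kfunc_call_test_ref from this series; the refcount_read() guard
is one illustrative way to "consider the refcnt==0 case", not a mandated one):

__bpf_kfunc void bpf_kfunc_call_test_ref(struct prog_test_ref_kfunc *p)
{
	/* p is valid for the duration of the call, but p->cnt could be 0 */
	if (refcount_read(&p->cnt) == 0)
		return;	/* object is logically dead; don't act on it */
	/* ... safe to read p's fields here ... */
}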

v1:
The __kptr_ref tag turned out to be too limited, since any "trusted" pointer access
requires bpf_kptr_xchg(), which is impractical when the same pointer needs
to be dereferenced by multiple CPUs.
The "untrusted"-only access of __kptr isn't very useful in practice.
Rename __kptr to __kptr_untrusted with the eventual goal of deprecating it,
and rename __kptr_ref to __kptr, since that looks to be the more common use of
kptrs (see the sketch below).
Introduce __kptr_rcu, which can be directly dereferenced and used similarly
to native kernel C code.
Once the bpf_cpumask and task_struct kfuncs are converted to observe an RCU
grace period when the refcnt goes to zero, both __kptr and __kptr_untrusted can
be deprecated and __kptr_rcu can become the only __kptr tag.
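
Under the new naming the tags line up as follows (a sketch using the
bpf_helpers.h definitions from this series; note __kptr_rcu itself was dropped
in v4 in favor of inferring RCU-ness from the pointee type):

#define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted")))
#define __kptr __attribute__((btf_type_tag("kptr")))

struct map_value {
	struct prog_test_ref_kfunc __kptr_untrusted *unref_ptr;	/* was __kptr */
	struct prog_test_ref_kfunc __kptr *ref_ptr;			/* was __kptr_ref */
};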
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
parents 944459e8 6fcd486b
......@@ -314,7 +314,7 @@ Q: What is the compatibility story for special BPF types in map values?
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows to use helpers for
such objects on these fields inside map values. Users are also allowed to embed
pointers to some kernel types (with __kptr and __kptr_ref BTF tags). Will the
pointers to some kernel types (with __kptr_untrusted and __kptr BTF tags). Will the
kernel preserve backwards compatibility for these features?
A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
......@@ -324,7 +324,7 @@ For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.
For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
mechanism. The types that you can use with a __kptr_untrusted and __kptr tagged
pointer in your struct are NOT part of the UAPI contract. The supported types can
and will change across kernel releases. However, operations like accessing kptr
fields and bpf_kptr_xchg() helper will continue to be supported across kernel
......
......@@ -51,7 +51,7 @@ For example:
.. code-block:: c
struct cpumask_map_value {
struct bpf_cpumask __kptr_ref * cpumask;
struct bpf_cpumask __kptr * cpumask;
};
struct array_map {
......@@ -128,7 +128,7 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map:
/* struct containing the struct bpf_cpumask kptr which is stored in the map. */
struct cpumasks_kfunc_map_value {
struct bpf_cpumask __kptr_ref * bpf_cpumask;
struct bpf_cpumask __kptr * bpf_cpumask;
};
/* The map containing struct cpumasks_kfunc_map_value entries. */
......
......@@ -249,11 +249,13 @@ added later.
2.4.8 KF_RCU flag
-----------------
The KF_RCU flag is used for kfuncs which have a rcu ptr as its argument.
When used together with KF_ACQUIRE, it indicates the kfunc should have a
single argument which must be a trusted argument or a MEM_RCU pointer.
The argument may have reference count of 0 and the kfunc must take this
into consideration.
The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free. The pointers are not
NULL, but the object's refcount could have reached zero. The kfuncs need to
consider doing refcnt != 0 check, especially when returning a KF_ACQUIRE
pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should very likely
also be KF_RET_NULL.
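A sketch of the resulting pattern (``bpf_cgroup_ancestor()`` in this series
follows it): a KF_ACQUIRE | KF_RCU kfunc takes its reference with a
tryget-style helper, since the argument's refcount may already be zero, and is
marked KF_RET_NULL because the tryget can fail:

.. code-block:: c

        __bpf_kfunc struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level)
        {
                struct cgroup *ancestor;

                if (level > cgrp->level || level < 0)
                        return NULL;
                /* cgrp's refcnt could be 0 here, but ancestors can still be accessed */
                ancestor = cgrp->ancestors[level];
                if (!cgroup_tryget(ancestor))
                        return NULL;
                return ancestor;
        }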
.. _KF_deprecated_flag:
......@@ -544,7 +546,7 @@ Here's an example of how it can be used:
/* struct containing the struct task_struct kptr which is actually stored in the map. */
struct __cgroups_kfunc_map_value {
struct cgroup __kptr_ref * cgroup;
struct cgroup __kptr * cgroup;
};
/* The map containing struct __cgroups_kfunc_map_value entries. */
......
......@@ -2279,7 +2279,7 @@ struct bpf_core_ctx {
bool btf_nested_type_is_trusted(struct bpf_verifier_log *log,
const struct bpf_reg_state *reg,
int off);
int off, const char *suffix);
bool btf_type_ids_nocast_alias(struct bpf_verifier_log *log,
const struct btf *reg_btf, u32 reg_id,
......
......@@ -537,7 +537,6 @@ struct bpf_verifier_env {
bool bypass_spec_v1;
bool bypass_spec_v4;
bool seen_direct_write;
bool rcu_tag_supported;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
const struct bpf_line_info *prev_linfo;
struct bpf_verifier_log log;
......
......@@ -70,7 +70,7 @@
#define KF_TRUSTED_ARGS (1 << 4) /* kfunc only takes trusted pointer arguments */
#define KF_SLEEPABLE (1 << 5) /* kfunc may sleep */
#define KF_DESTRUCTIVE (1 << 6) /* kfunc performs destructive actions */
#define KF_RCU (1 << 7) /* kfunc only takes rcu pointer arguments */
#define KF_RCU (1 << 7) /* kfunc takes either rcu or trusted pointer arguments */
/*
* Tag marking a kernel function as a kfunc. This is meant to minimize the
......
......@@ -3288,9 +3288,9 @@ static int btf_find_kptr(const struct btf *btf, const struct btf_type *t,
/* Reject extra tags */
if (btf_type_is_type_tag(btf_type_by_id(btf, t->type)))
return -EINVAL;
if (!strcmp("kptr", __btf_name_by_offset(btf, t->name_off)))
if (!strcmp("kptr_untrusted", __btf_name_by_offset(btf, t->name_off)))
type = BPF_KPTR_UNREF;
else if (!strcmp("kptr_ref", __btf_name_by_offset(btf, t->name_off)))
else if (!strcmp("kptr", __btf_name_by_offset(btf, t->name_off)))
type = BPF_KPTR_REF;
else
return -EINVAL;
......@@ -6163,6 +6163,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
const char *tname, *mname, *tag_value;
u32 vlen, elem_id, mid;
*flag = 0;
again:
tname = __btf_name_by_offset(btf, t->name_off);
if (!btf_type_is_struct(t)) {
......@@ -6329,6 +6330,15 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
* of this field or inside of this struct
*/
if (btf_type_is_struct(mtype)) {
if (BTF_INFO_KIND(mtype->info) == BTF_KIND_UNION &&
btf_type_vlen(mtype) != 1)
/*
* walking unions yields untrusted pointers
* with exception of __bpf_md_ptr and other
* unions with a single member
*/
*flag |= PTR_UNTRUSTED;
/* our field must be inside that union or struct */
t = mtype;
......@@ -6373,7 +6383,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
stype = btf_type_skip_modifiers(btf, mtype->type, &id);
if (btf_type_is_struct(stype)) {
*next_btf_id = id;
*flag = tmp_flag;
*flag |= tmp_flag;
return WALK_PTR;
}
}
......@@ -8357,7 +8367,7 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
bool btf_nested_type_is_trusted(struct bpf_verifier_log *log,
const struct bpf_reg_state *reg,
int off)
int off, const char *suffix)
{
struct btf *btf = reg->btf;
const struct btf_type *walk_type, *safe_type;
......@@ -8374,7 +8384,7 @@ bool btf_nested_type_is_trusted(struct bpf_verifier_log *log,
tname = btf_name_by_offset(btf, walk_type->name_off);
ret = snprintf(safe_tname, sizeof(safe_tname), "%s__safe_fields", tname);
ret = snprintf(safe_tname, sizeof(safe_tname), "%s%s", tname, suffix);
if (ret < 0)
return false;
......
......@@ -427,26 +427,26 @@ BTF_ID_FLAGS(func, bpf_cpumask_create, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cpumask_release, KF_RELEASE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cpumask_first, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_first_zero, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_set_cpu, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_clear_cpu, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_test_cpu, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_test_and_set_cpu, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_test_and_clear_cpu, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_setall, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_clear, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_and, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_or, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_xor, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_equal, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_intersects, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_subset, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_full, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_any, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_any_and, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cpumask_first, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_first_zero, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_set_cpu, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_clear_cpu, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_test_cpu, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_test_and_set_cpu, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_test_and_clear_cpu, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_setall, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_clear, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_and, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_or, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_xor, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_equal, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_intersects, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_subset, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_full, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_any, KF_RCU)
BTF_ID_FLAGS(func, bpf_cpumask_any_and, KF_RCU)
BTF_SET8_END(cpumask_kfunc_btf_ids)
static const struct btf_kfunc_id_set cpumask_kfunc_set = {
......
......@@ -2163,8 +2163,10 @@ __bpf_kfunc struct cgroup *bpf_cgroup_ancestor(struct cgroup *cgrp, int level)
if (level > cgrp->level || level < 0)
return NULL;
/* cgrp's refcnt could be 0 here, but ancestors can still be accessed */
ancestor = cgrp->ancestors[level];
cgroup_get(ancestor);
if (!cgroup_tryget(ancestor))
return NULL;
return ancestor;
}
......@@ -2382,7 +2384,7 @@ BTF_ID_FLAGS(func, bpf_rbtree_first, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_cgroup_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_release, KF_RELEASE)
BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_cgroup_from_id, KF_ACQUIRE | KF_RET_NULL)
#endif
BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
......
......@@ -4218,7 +4218,7 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
struct bpf_reg_state *reg, u32 regno)
{
const char *targ_name = kernel_type_name(kptr_field->kptr.btf, kptr_field->kptr.btf_id);
int perm_flags = PTR_MAYBE_NULL | PTR_TRUSTED;
int perm_flags = PTR_MAYBE_NULL | PTR_TRUSTED | MEM_RCU;
const char *reg_name = "";
/* Only unreferenced case accepts untrusted pointers */
......@@ -4285,6 +4285,34 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
return -EINVAL;
}
/* The non-sleepable programs and sleepable programs with explicit bpf_rcu_read_lock()
* can dereference RCU protected pointers and result is PTR_TRUSTED.
*/
static bool in_rcu_cs(struct bpf_verifier_env *env)
{
return env->cur_state->active_rcu_lock || !env->prog->aux->sleepable;
}
/* Once GCC supports btf_type_tag the following mechanism will be replaced with tag check */
BTF_SET_START(rcu_protected_types)
BTF_ID(struct, prog_test_ref_kfunc)
BTF_ID(struct, cgroup)
BTF_SET_END(rcu_protected_types)
static bool rcu_protected_object(const struct btf *btf, u32 btf_id)
{
if (!btf_is_kernel(btf))
return false;
return btf_id_set_contains(&rcu_protected_types, btf_id);
}
static bool rcu_safe_kptr(const struct btf_field *field)
{
const struct btf_field_kptr *kptr = &field->kptr;
return field->type == BPF_KPTR_REF && rcu_protected_object(kptr->btf, kptr->btf_id);
}
static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,
int value_regno, int insn_idx,
struct btf_field *kptr_field)
......@@ -4319,7 +4347,10 @@ static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,
* value from map as PTR_TO_BTF_ID, with the correct type.
*/
mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, kptr_field->kptr.btf,
kptr_field->kptr.btf_id, PTR_MAYBE_NULL | PTR_UNTRUSTED);
kptr_field->kptr.btf_id,
rcu_safe_kptr(kptr_field) && in_rcu_cs(env) ?
PTR_MAYBE_NULL | MEM_RCU :
PTR_MAYBE_NULL | PTR_UNTRUSTED);
/* For mark_ptr_or_null_reg */
val_reg->id = ++env->id_gen;
} else if (class == BPF_STX) {
......@@ -5042,23 +5073,76 @@ static int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val)
return 0;
}
#define BTF_TYPE_SAFE_NESTED(__type) __PASTE(__type, __safe_fields)
#define BTF_TYPE_SAFE_RCU(__type) __PASTE(__type, __safe_rcu)
#define BTF_TYPE_SAFE_TRUSTED(__type) __PASTE(__type, __safe_trusted)
/*
* Allow list few fields as RCU trusted or full trusted.
* This logic doesn't allow mix tagging and will be removed once GCC supports
* btf_type_tag.
*/
BTF_TYPE_SAFE_NESTED(struct task_struct) {
/* RCU trusted: these fields are trusted in RCU CS and never NULL */
BTF_TYPE_SAFE_RCU(struct task_struct) {
const cpumask_t *cpus_ptr;
struct css_set __rcu *cgroups;
struct task_struct __rcu *real_parent;
struct task_struct *group_leader;
};
static bool nested_ptr_is_trusted(struct bpf_verifier_env *env,
struct bpf_reg_state *reg,
int off)
BTF_TYPE_SAFE_RCU(struct css_set) {
struct cgroup *dfl_cgrp;
};
/* full trusted: these fields are trusted even outside of RCU CS and never NULL */
BTF_TYPE_SAFE_TRUSTED(struct bpf_iter_meta) {
__bpf_md_ptr(struct seq_file *, seq);
};
BTF_TYPE_SAFE_TRUSTED(struct bpf_iter__task) {
__bpf_md_ptr(struct bpf_iter_meta *, meta);
__bpf_md_ptr(struct task_struct *, task);
};
BTF_TYPE_SAFE_TRUSTED(struct linux_binprm) {
struct file *file;
};
BTF_TYPE_SAFE_TRUSTED(struct file) {
struct inode *f_inode;
};
BTF_TYPE_SAFE_TRUSTED(struct dentry) {
/* no negative dentry-s in places where bpf can see it */
struct inode *d_inode;
};
BTF_TYPE_SAFE_TRUSTED(struct socket) {
struct sock *sk;
};
static bool type_is_rcu(struct bpf_verifier_env *env,
struct bpf_reg_state *reg,
int off)
{
/* If its parent is not trusted, it can't regain its trusted status. */
if (!is_trusted_reg(reg))
return false;
BTF_TYPE_EMIT(BTF_TYPE_SAFE_RCU(struct task_struct));
BTF_TYPE_EMIT(BTF_TYPE_SAFE_RCU(struct css_set));
return btf_nested_type_is_trusted(&env->log, reg, off, "__safe_rcu");
}
BTF_TYPE_EMIT(BTF_TYPE_SAFE_NESTED(struct task_struct));
static bool type_is_trusted(struct bpf_verifier_env *env,
struct bpf_reg_state *reg,
int off)
{
BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct bpf_iter_meta));
BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct bpf_iter__task));
BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct linux_binprm));
BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct file));
BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct dentry));
BTF_TYPE_EMIT(BTF_TYPE_SAFE_TRUSTED(struct socket));
return btf_nested_type_is_trusted(&env->log, reg, off);
return btf_nested_type_is_trusted(&env->log, reg, off, "__safe_trusted");
}
static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
......@@ -5144,41 +5228,56 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
if (ret < 0)
return ret;
/* If this is an untrusted pointer, all pointers formed by walking it
* also inherit the untrusted flag.
*/
if (type_flag(reg->type) & PTR_UNTRUSTED)
flag |= PTR_UNTRUSTED;
if (ret != PTR_TO_BTF_ID) {
/* just mark; */
/* By default any pointer obtained from walking a trusted pointer is no
* longer trusted, unless the field being accessed has explicitly been
* marked as inheriting its parent's state of trust.
*
* An RCU-protected pointer can also be deemed trusted if we are in an
* RCU read region. This case is handled below.
*/
if (nested_ptr_is_trusted(env, reg, off))
flag |= PTR_TRUSTED;
else
flag &= ~PTR_TRUSTED;
if (flag & MEM_RCU) {
/* Mark value register as MEM_RCU only if it is protected by
* bpf_rcu_read_lock() and the ptr reg is rcu or trusted. MEM_RCU
* itself can already indicate trustedness inside the rcu
* read lock region. Also mark rcu pointer as PTR_MAYBE_NULL since
* it could be null in some cases.
} else if (type_flag(reg->type) & PTR_UNTRUSTED) {
/* If this is an untrusted pointer, all pointers formed by walking it
* also inherit the untrusted flag.
*/
if (!env->cur_state->active_rcu_lock ||
!(is_trusted_reg(reg) || is_rcu_reg(reg)))
flag &= ~MEM_RCU;
else
flag |= PTR_MAYBE_NULL;
} else if (reg->type & MEM_RCU) {
/* ptr (reg) is marked as MEM_RCU, but the struct field is not tagged
* with __rcu. Mark the flag as PTR_UNTRUSTED conservatively.
flag = PTR_UNTRUSTED;
} else if (is_trusted_reg(reg) || is_rcu_reg(reg)) {
/* By default any pointer obtained from walking a trusted pointer is no
* longer trusted, unless the field being accessed has explicitly been
* marked as inheriting its parent's state of trust (either full or RCU).
* For example:
* 'cgroups' pointer is untrusted if task->cgroups dereference
* happened in a sleepable program outside of bpf_rcu_read_lock()
* section. In a non-sleepable program it's trusted while in RCU CS (aka MEM_RCU).
* Note bpf_rcu_read_unlock() converts MEM_RCU pointers to PTR_UNTRUSTED.
*
* A regular RCU-protected pointer with __rcu tag can also be deemed
* trusted if we are in an RCU CS. Such pointer can be NULL.
*/
flag |= PTR_UNTRUSTED;
if (type_is_trusted(env, reg, off)) {
flag |= PTR_TRUSTED;
} else if (in_rcu_cs(env) && !type_may_be_null(reg->type)) {
if (type_is_rcu(env, reg, off)) {
/* ignore __rcu tag and mark it MEM_RCU */
flag |= MEM_RCU;
} else if (flag & MEM_RCU) {
/* __rcu tagged pointers can be NULL */
flag |= PTR_MAYBE_NULL;
} else if (flag & (MEM_PERCPU | MEM_USER)) {
/* keep as-is */
} else {
/* walking unknown pointers yields untrusted pointer */
flag = PTR_UNTRUSTED;
}
} else {
/*
* If not in RCU CS or MEM_RCU pointer can be NULL then
* aggressively mark as untrusted otherwise such
* pointers will be plain PTR_TO_BTF_ID without flags
* and will be allowed to be passed into helpers for
* compat reasons.
*/
flag = PTR_UNTRUSTED;
}
} else {
/* Old compat. Deprecated */
flag &= ~PTR_TRUSTED;
}
if (atype == BPF_READ && value_regno >= 0)
......@@ -9670,7 +9769,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
return -EINVAL;
}
if (is_kfunc_trusted_args(meta) &&
if ((is_kfunc_trusted_args(meta) || is_kfunc_rcu(meta)) &&
(register_is_null(reg) || type_may_be_null(reg->type))) {
verbose(env, "Possibly NULL pointer passed to trusted arg%d\n", i);
return -EACCES;
......@@ -10006,10 +10105,6 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
rcu_lock = is_kfunc_bpf_rcu_read_lock(&meta);
rcu_unlock = is_kfunc_bpf_rcu_read_unlock(&meta);
if ((rcu_lock || rcu_unlock) && !env->rcu_tag_supported) {
verbose(env, "no vmlinux btf rcu tag support for kfunc %s\n", func_name);
return -EACCES;
}
if (env->cur_state->active_rcu_lock) {
struct bpf_func_state *state;
......@@ -14868,8 +14963,22 @@ static int do_check(struct bpf_verifier_env *env)
* src_reg == stack|map in some other branch.
* Reject it.
*/
verbose(env, "same insn cannot be used with different pointers\n");
return -EINVAL;
if (base_type(src_reg_type) == PTR_TO_BTF_ID &&
base_type(*prev_src_type) == PTR_TO_BTF_ID) {
/*
* Have to support a use case when one path through
* the program yields TRUSTED pointer while another
* is UNTRUSTED. Fallback to UNTRUSTED to generate
* BPF_PROBE_MEM.
*/
*prev_src_type = PTR_TO_BTF_ID | PTR_UNTRUSTED;
} else {
verbose(env,
"The same insn cannot be used with different pointers: %s",
reg_type_str(env, src_reg_type));
verbose(env, " != %s\n", reg_type_str(env, *prev_src_type));
return -EINVAL;
}
}
} else if (class == BPF_STX) {
......@@ -17941,8 +18050,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
env->bypass_spec_v1 = bpf_bypass_spec_v1();
env->bypass_spec_v4 = bpf_bypass_spec_v4();
env->bpf_capable = bpf_capable();
env->rcu_tag_supported = btf_vmlinux &&
btf_find_by_name_kind(btf_vmlinux, "rcu", BTF_KIND_TYPE_TAG) > 0;
if (is_priv)
env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
......
......@@ -737,6 +737,7 @@ __bpf_kfunc void bpf_kfunc_call_test_mem_len_fail2(u64 *mem, int len)
__bpf_kfunc void bpf_kfunc_call_test_ref(struct prog_test_ref_kfunc *p)
{
/* p != NULL, but p->cnt could be 0 */
}
__bpf_kfunc void bpf_kfunc_call_test_destructive(void)
......@@ -784,7 +785,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_fail3)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_pass1)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail1)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_mem_len_fail2)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS | KF_RCU)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE)
BTF_ID_FLAGS(func, bpf_kfunc_call_test_static_unused_arg)
BTF_SET8_END(test_sk_check_kfunc_ids)
......
......@@ -174,8 +174,8 @@ enum libbpf_tristate {
#define __kconfig __attribute__((section(".kconfig")))
#define __ksym __attribute__((section(".ksyms")))
#define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted")))
#define __kptr __attribute__((btf_type_tag("kptr")))
#define __kptr_ref __attribute__((btf_type_tag("kptr_ref")))
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b
......
......@@ -193,7 +193,7 @@ static void test_cgroup_iter_sleepable(int cgroup_fd, __u64 cgroup_id)
cgrp_ls_sleepable__destroy(skel);
}
static void test_no_rcu_lock(__u64 cgroup_id)
static void test_yes_rcu_lock(__u64 cgroup_id)
{
struct cgrp_ls_sleepable *skel;
int err;
......@@ -204,7 +204,7 @@ static void test_no_rcu_lock(__u64 cgroup_id)
skel->bss->target_pid = syscall(SYS_gettid);
bpf_program__set_autoload(skel->progs.no_rcu_lock, true);
bpf_program__set_autoload(skel->progs.yes_rcu_lock, true);
err = cgrp_ls_sleepable__load(skel);
if (!ASSERT_OK(err, "skel_load"))
goto out;
......@@ -220,7 +220,7 @@ static void test_no_rcu_lock(__u64 cgroup_id)
cgrp_ls_sleepable__destroy(skel);
}
static void test_rcu_lock(void)
static void test_no_rcu_lock(void)
{
struct cgrp_ls_sleepable *skel;
int err;
......@@ -229,7 +229,7 @@ static void test_rcu_lock(void)
if (!ASSERT_OK_PTR(skel, "skel_open"))
return;
bpf_program__set_autoload(skel->progs.yes_rcu_lock, true);
bpf_program__set_autoload(skel->progs.no_rcu_lock, true);
err = cgrp_ls_sleepable__load(skel);
ASSERT_ERR(err, "skel_load");
......@@ -256,10 +256,10 @@ void test_cgrp_local_storage(void)
test_negative();
if (test__start_subtest("cgroup_iter_sleepable"))
test_cgroup_iter_sleepable(cgroup_fd, cgroup_id);
if (test__start_subtest("yes_rcu_lock"))
test_yes_rcu_lock(cgroup_id);
if (test__start_subtest("no_rcu_lock"))
test_no_rcu_lock(cgroup_id);
if (test__start_subtest("rcu_lock"))
test_rcu_lock();
test_no_rcu_lock();
close(cgroup_fd);
}
......@@ -25,10 +25,10 @@ static void test_success(void)
bpf_program__set_autoload(skel->progs.get_cgroup_id, true);
bpf_program__set_autoload(skel->progs.task_succ, true);
bpf_program__set_autoload(skel->progs.no_lock, true);
bpf_program__set_autoload(skel->progs.two_regions, true);
bpf_program__set_autoload(skel->progs.non_sleepable_1, true);
bpf_program__set_autoload(skel->progs.non_sleepable_2, true);
bpf_program__set_autoload(skel->progs.task_trusted_non_rcuptr, true);
err = rcu_read_lock__load(skel);
if (!ASSERT_OK(err, "skel_load"))
goto out;
......@@ -69,6 +69,7 @@ static void test_rcuptr_acquire(void)
static const char * const inproper_region_tests[] = {
"miss_lock",
"no_lock",
"miss_unlock",
"non_sleepable_rcu_mismatch",
"inproper_sleepable_helper",
......@@ -99,7 +100,6 @@ static void test_inproper_region(void)
}
static const char * const rcuptr_misuse_tests[] = {
"task_untrusted_non_rcuptr",
"task_untrusted_rcuptr",
"cross_rcu_region",
};
......@@ -128,17 +128,8 @@ static void test_rcuptr_misuse(void)
void test_rcu_read_lock(void)
{
struct btf *vmlinux_btf;
int cgroup_fd;
vmlinux_btf = btf__load_vmlinux_btf();
if (!ASSERT_OK_PTR(vmlinux_btf, "could not load vmlinux BTF"))
return;
if (btf__find_by_name_kind(vmlinux_btf, "rcu", BTF_KIND_TYPE_TAG) < 0) {
test__skip();
goto out;
}
cgroup_fd = test__join_cgroup("/rcu_read_lock");
if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup /rcu_read_lock"))
goto out;
......@@ -153,6 +144,5 @@ void test_rcu_read_lock(void)
if (test__start_subtest("negative_tests_rcuptr_misuse"))
test_rcuptr_misuse();
close(cgroup_fd);
out:
btf__free(vmlinux_btf);
out:;
}
......@@ -4,7 +4,7 @@
#include <bpf/bpf_helpers.h>
struct map_value {
struct prog_test_ref_kfunc __kptr_ref *ptr;
struct prog_test_ref_kfunc __kptr *ptr;
};
struct {
......
......@@ -10,7 +10,7 @@
#include <bpf/bpf_tracing.h>
struct __cgrps_kfunc_map_value {
struct cgroup __kptr_ref * cgrp;
struct cgroup __kptr * cgrp;
};
struct hash_map {
......
......@@ -205,7 +205,7 @@ int BPF_PROG(cgrp_kfunc_get_unreleased, struct cgroup *cgrp, const char *path)
}
SEC("tp_btf/cgroup_mkdir")
__failure __msg("arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket")
__failure __msg("expects refcounted")
int BPF_PROG(cgrp_kfunc_release_untrusted, struct cgroup *cgrp, const char *path)
{
struct __cgrps_kfunc_map_value *v;
......
......@@ -61,7 +61,7 @@ int BPF_PROG(test_cgrp_acquire_leave_in_map, struct cgroup *cgrp, const char *pa
SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_xchg_release, struct cgroup *cgrp, const char *path)
{
struct cgroup *kptr;
struct cgroup *kptr, *cg;
struct __cgrps_kfunc_map_value *v;
long status;
......@@ -80,6 +80,16 @@ int BPF_PROG(test_cgrp_xchg_release, struct cgroup *cgrp, const char *path)
return 0;
}
kptr = v->cgrp;
if (!kptr) {
err = 4;
return 0;
}
cg = bpf_cgroup_ancestor(kptr, 1);
if (cg) /* verifier only check */
bpf_cgroup_release(cg);
kptr = bpf_kptr_xchg(&v->cgrp, NULL);
if (!kptr) {
err = 3;
......
......@@ -49,7 +49,7 @@ int no_rcu_lock(void *ctx)
if (task->pid != target_pid)
return 0;
/* ptr_to_btf_id semantics. should work. */
/* task->cgroups is untrusted in sleepable prog outside of RCU CS */
cgrp = task->cgroups->dfl_cgrp;
ptr = bpf_cgrp_storage_get(&map_a, cgrp, 0,
BPF_LOCAL_STORAGE_GET_F_CREATE);
......@@ -71,7 +71,7 @@ int yes_rcu_lock(void *ctx)
bpf_rcu_read_lock();
cgrp = task->cgroups->dfl_cgrp;
/* cgrp is untrusted and cannot pass to bpf_cgrp_storage_get() helper. */
/* cgrp is trusted under RCU CS */
ptr = bpf_cgrp_storage_get(&map_a, cgrp, 0, BPF_LOCAL_STORAGE_GET_F_CREATE);
if (ptr)
cgroup_id = cgrp->kn->id;
......
......@@ -10,7 +10,7 @@
int err;
struct __cpumask_map_value {
struct bpf_cpumask __kptr_ref * cpumask;
struct bpf_cpumask __kptr * cpumask;
};
struct array_map {
......
......@@ -44,7 +44,7 @@ int BPF_PROG(test_alloc_double_release, struct task_struct *task, u64 clone_flag
}
SEC("tp_btf/task_newtask")
__failure __msg("bpf_cpumask_acquire args#0 expected pointer to STRUCT bpf_cpumask")
__failure __msg("must be referenced")
int BPF_PROG(test_acquire_wrong_cpumask, struct task_struct *task, u64 clone_flags)
{
struct bpf_cpumask *cpumask;
......
......@@ -4,7 +4,7 @@
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>
static struct prog_test_ref_kfunc __kptr_ref *v;
static struct prog_test_ref_kfunc __kptr *v;
long total_sum = -1;
extern struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(unsigned long *sp) __ksym;
......
......@@ -4,7 +4,7 @@
#include <bpf/bpf_helpers.h>
struct map_value {
struct task_struct __kptr *ptr;
struct task_struct __kptr_untrusted *ptr;
};
struct {
......
......@@ -4,8 +4,8 @@
#include <bpf/bpf_helpers.h>
struct map_value {
struct prog_test_ref_kfunc __kptr *unref_ptr;
struct prog_test_ref_kfunc __kptr_ref *ref_ptr;
struct prog_test_ref_kfunc __kptr_untrusted *unref_ptr;
struct prog_test_ref_kfunc __kptr *ref_ptr;
};
struct array_map {
......@@ -118,6 +118,7 @@ extern struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(unsigned long *sp
extern struct prog_test_ref_kfunc *
bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **p, int a, int b) __ksym;
extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;
void bpf_kfunc_call_test_ref(struct prog_test_ref_kfunc *p) __ksym;
#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *) &(x)) = (val))
......@@ -147,12 +148,23 @@ static void test_kptr_ref(struct map_value *v)
WRITE_ONCE(v->unref_ptr, p);
if (!p)
return;
/*
* p is rcu_ptr_prog_test_ref_kfunc,
* because bpf prog is non-sleepable and runs in RCU CS.
* p can be passed to kfunc that requires KF_RCU.
*/
bpf_kfunc_call_test_ref(p);
if (p->a + p->b > 100)
return;
/* store NULL */
p = bpf_kptr_xchg(&v->ref_ptr, NULL);
if (!p)
return;
/*
* p is trusted_ptr_prog_test_ref_kfunc.
* p can be passed to kfunc that requires KF_RCU.
*/
bpf_kfunc_call_test_ref(p);
if (p->a + p->b > 100) {
bpf_kfunc_call_test_release(p);
return;
......
......@@ -7,9 +7,9 @@
struct map_value {
char buf[8];
struct prog_test_ref_kfunc __kptr *unref_ptr;
struct prog_test_ref_kfunc __kptr_ref *ref_ptr;
struct prog_test_member __kptr_ref *ref_memb_ptr;
struct prog_test_ref_kfunc __kptr_untrusted *unref_ptr;
struct prog_test_ref_kfunc __kptr *ref_ptr;
struct prog_test_member __kptr *ref_memb_ptr;
};
struct array_map {
......@@ -281,7 +281,7 @@ int reject_kptr_get_bad_type_match(struct __sk_buff *ctx)
}
SEC("?tc")
__failure __msg("R1 type=untrusted_ptr_or_null_ expected=percpu_ptr_")
__failure __msg("R1 type=rcu_ptr_or_null_ expected=percpu_ptr_")
int mark_ref_as_untrusted_or_null(struct __sk_buff *ctx)
{
struct map_value *v;
......@@ -316,7 +316,7 @@ int reject_untrusted_store_to_ref(struct __sk_buff *ctx)
}
SEC("?tc")
__failure __msg("R2 type=untrusted_ptr_ expected=ptr_")
__failure __msg("R2 must be referenced")
int reject_untrusted_xchg(struct __sk_buff *ctx)
{
struct prog_test_ref_kfunc *p;
......
......@@ -17,7 +17,7 @@ char _license[] SEC("license") = "GPL";
*/
SEC("tp_btf/task_newtask")
__failure __msg("R2 must be referenced or trusted")
__failure __msg("R2 must be")
int BPF_PROG(test_invalid_nested_user_cpus, struct task_struct *task, u64 clone_flags)
{
bpf_cpumask_test_cpu(0, task->user_cpus_ptr);
......
......@@ -81,7 +81,7 @@ int no_lock(void *ctx)
{
struct task_struct *task, *real_parent;
/* no bpf_rcu_read_lock(), old code still works */
/* old style ptr_to_btf_id is not allowed in sleepable */
task = bpf_get_current_task_btf();
real_parent = task->real_parent;
(void)bpf_task_storage_get(&map_a, real_parent, 0, 0);
......@@ -286,13 +286,13 @@ int nested_rcu_region(void *ctx)
}
SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
int task_untrusted_non_rcuptr(void *ctx)
int task_trusted_non_rcuptr(void *ctx)
{
struct task_struct *task, *group_leader;
task = bpf_get_current_task_btf();
bpf_rcu_read_lock();
/* the pointer group_leader marked as untrusted */
/* the pointer group_leader is explicitly marked as trusted */
group_leader = task->real_parent->group_leader;
(void)bpf_task_storage_get(&map_a, group_leader, 0, 0);
bpf_rcu_read_unlock();
......
......@@ -10,7 +10,7 @@
#include <bpf/bpf_tracing.h>
struct __tasks_kfunc_map_value {
struct task_struct __kptr_ref * task;
struct task_struct __kptr * task;
};
struct hash_map {
......
......@@ -699,13 +699,13 @@ static int create_cgroup_storage(bool percpu)
* struct bpf_timer t;
* };
* struct btf_ptr {
* struct prog_test_ref_kfunc __kptr_untrusted *ptr;
* struct prog_test_ref_kfunc __kptr *ptr;
* struct prog_test_ref_kfunc __kptr_ref *ptr;
* struct prog_test_member __kptr_ref *ptr;
* struct prog_test_member __kptr *ptr;
* }
*/
static const char btf_str_sec[] = "\0bpf_spin_lock\0val\0cnt\0l\0bpf_timer\0timer\0t"
"\0btf_ptr\0prog_test_ref_kfunc\0ptr\0kptr\0kptr_ref"
"\0btf_ptr\0prog_test_ref_kfunc\0ptr\0kptr\0kptr_untrusted"
"\0prog_test_member";
static __u32 btf_raw_types[] = {
/* int */
......@@ -724,20 +724,20 @@ static __u32 btf_raw_types[] = {
BTF_MEMBER_ENC(41, 4, 0), /* struct bpf_timer t; */
/* struct prog_test_ref_kfunc */ /* [6] */
BTF_STRUCT_ENC(51, 0, 0),
BTF_STRUCT_ENC(89, 0, 0), /* [7] */
BTF_STRUCT_ENC(95, 0, 0), /* [7] */
/* type tag "kptr_untrusted" */
BTF_TYPE_TAG_ENC(80, 6), /* [8] */
/* type tag "kptr" */
BTF_TYPE_TAG_ENC(75, 6), /* [8] */
/* type tag "kptr_ref" */
BTF_TYPE_TAG_ENC(80, 6), /* [9] */
BTF_TYPE_TAG_ENC(80, 7), /* [10] */
BTF_TYPE_TAG_ENC(75, 6), /* [9] */
BTF_TYPE_TAG_ENC(75, 7), /* [10] */
BTF_PTR_ENC(8), /* [11] */
BTF_PTR_ENC(9), /* [12] */
BTF_PTR_ENC(10), /* [13] */
/* struct btf_ptr */ /* [14] */
BTF_STRUCT_ENC(43, 3, 24),
BTF_MEMBER_ENC(71, 11, 0), /* struct prog_test_ref_kfunc __kptr *ptr; */
BTF_MEMBER_ENC(71, 12, 64), /* struct prog_test_ref_kfunc __kptr_ref *ptr; */
BTF_MEMBER_ENC(71, 13, 128), /* struct prog_test_member __kptr_ref *ptr; */
BTF_MEMBER_ENC(71, 11, 0), /* struct prog_test_ref_kfunc __kptr_untrusted *ptr; */
BTF_MEMBER_ENC(71, 12, 64), /* struct prog_test_ref_kfunc __kptr *ptr; */
BTF_MEMBER_ENC(71, 13, 128), /* struct prog_test_member __kptr *ptr; */
};
static char bpf_vlog[UINT_MAX >> 8];
......
......@@ -181,7 +181,7 @@
},
.result_unpriv = REJECT,
.result = REJECT,
.errstr = "negative offset ptr_ ptr R1 off=-4 disallowed",
.errstr = "ptr R1 off=-4 disallowed",
},
{
"calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset",
......@@ -243,7 +243,7 @@
},
.result_unpriv = REJECT,
.result = REJECT,
.errstr = "R1 must be referenced",
.errstr = "R1 must be",
},
{
"calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID",
......
......@@ -336,7 +336,7 @@
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.fixup_map_kptr = { 1 },
.result = REJECT,
.errstr = "R1 type=untrusted_ptr_or_null_ expected=percpu_ptr_",
.errstr = "R1 type=rcu_ptr_or_null_ expected=percpu_ptr_",
},
{
"map_kptr: ref: reject off != 0",
......