    bpf: Only invoke kptr dtor following non-NULL xchg · 1431d0b5
    David Vernet authored
    When a map value is being freed, we loop over all of the fields of the
    corresponding BPF object and issue the appropriate cleanup calls
    corresponding to the field's type. If the field is a referenced kptr, we
    atomically xchg the value out of the map, and invoke the kptr's
    destructor on whatever was there before (or bpf_obj_drop() it if it was
    a local kptr).
    
    Currently, we always invoke the destructor (either bpf_obj_drop() or the
    kptr's registered destructor) on any KPTR_REF-type field in a map, even
    if there wasn't a value in the map. This means that any function serving
    as the kptr's KF_RELEASE destructor must always treat the argument as
    possibly NULL, as the following can and regularly does happen:
    
    void *xchgd_field;
    
    /* No value was in the map, so xchgd_field is NULL */
    xchgd_field = (void *)xchg((unsigned long *)field_ptr, 0);
    field->kptr.dtor(xchgd_field);
    
    These are odd semantics to impose on KF_RELEASE kfuncs -- BPF programs
    are prohibited by the verifier from passing NULL pointers to KF_RELEASE
    kfuncs, so it doesn't make sense to require this of BPF programs, but
    not the main kernel destructor path. It's also unnecessary to invoke any
    cleanup logic for local kptrs. If there is no object there, there's
    nothing to drop.
    
    To allow KF_RELEASE kfuncs to fully assume that their argument is
    non-NULL, this patch invokes a KPTR_REF's destructor only when a
    non-NULL value is xchg'd out of the kptr map field.
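
The guarded behavior described above can be sketched in user space roughly as follows. This is only an illustration, not the kernel code: `struct kptr_field`, `kptr_field_free()`, `demo_dtor()` and `dtor_calls` are hypothetical names, and C11 `atomic_exchange()` stands in for the kernel's `xchg()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical destructor type; stands in for a kptr's registered dtor. */
typedef void (*kptr_dtor_fn)(void *obj);

/* Hypothetical, simplified field descriptor (not the kernel's struct). */
struct kptr_field {
	_Atomic uintptr_t slot;	/* pointer stored in the map value, 0 if empty */
	kptr_dtor_fn dtor;	/* destructor for the stored object */
};

static int dtor_calls;		/* counts destructor invocations for the demo */

/* Demo destructor: with the guard in place it may assume obj is non-NULL. */
static void demo_dtor(void *obj)
{
	assert(obj != NULL);
	*(int *)obj = 0;	/* "release" the object */
	dtor_calls++;
}

/* Atomically take the pointer out of the field; run the dtor only if set. */
static void kptr_field_free(struct kptr_field *field)
{
	void *xchgd_field = (void *)atomic_exchange(&field->slot, (uintptr_t)0);

	if (!xchgd_field)
		return;		/* no value was in the field, nothing to release */
	field->dtor(xchgd_field);
}
```

Calling `kptr_field_free()` a second time on the same field is then a no-op, because the first call already exchanged the slot to 0.
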
    
    Signed-off-by: David Vernet <void@manifault.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/r/20230325213144.486885-2-void@manifault.com