    bpf: enable non-root eBPF programs · 1be7f75d
    Alexei Starovoitov authored
    In order to let unprivileged users load and execute eBPF programs,
    teach the verifier to prevent pointer leaks.
    The verifier will prevent:
    - any arithmetic on pointers
      (except R10+Imm which is used to compute stack addresses)
    - comparison of pointers
      (except if (map_value_ptr == 0) ... )
    - passing pointers to helper functions
    - indirectly passing pointers in stack to helper functions
    - returning pointer from bpf program
    - storing pointers into ctx or maps
    
    Spill/fill of pointers into stack is allowed, but mangling
    of pointers stored in the stack or reading them byte by byte is not.
    
    Within bpf programs the pointers do exist, since programs need to
    be able to access maps, pass the skb pointer to LD_ABS insns, etc.,
    but programs cannot pass such pointer values to the outside
    or obfuscate them.
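
As a rough illustration of the rules above, the following user-space sketch models how a verifier could decide whether a register value may leave the program. It is a toy model, not the kernel code: the `reg_kind` enum and the `is_pointer_value`, `alu_allowed`, and `may_escape` functions are names assumed here for illustration, and the real verifier tracks far more state.

```c
#include <stdbool.h>

/* Hypothetical, simplified register states (assumption for this sketch). */
enum reg_kind {
    REG_SCALAR,           /* plain number, safe to expose */
    REG_PTR_TO_CTX,       /* pointer to the program context, e.g. the skb */
    REG_PTR_TO_MAP_VALUE, /* pointer returned by a map lookup */
    REG_PTR_TO_STACK,     /* pointer into the program's stack */
    REG_FRAME_PTR         /* R10, the read-only frame pointer */
};

/* True if exposing this register's value would leak a kernel address.
 * A privileged loader (allow_ptr_leaks == true) is not restricted. */
bool is_pointer_value(bool allow_ptr_leaks, enum reg_kind kind)
{
    if (allow_ptr_leaks)
        return false;
    return kind != REG_SCALAR;
}

/* Arithmetic on a destination register: for unprivileged programs the
 * only pointer arithmetic allowed is R10 + immediate (stack addresses). */
bool alu_allowed(bool allow_ptr_leaks, enum reg_kind dst,
                 bool is_add, bool src_is_imm)
{
    if (!is_pointer_value(allow_ptr_leaks, dst))
        return true;
    return dst == REG_FRAME_PTR && is_add && src_is_imm;
}

/* Returning a value, storing it into ctx or a map, or passing it to a
 * helper all let the value escape, so pointers are refused. */
bool may_escape(bool allow_ptr_leaks, enum reg_kind value)
{
    return !is_pointer_value(allow_ptr_leaks, value);
}
```

In this model, `may_escape(false, REG_PTR_TO_CTX)` is false, which corresponds to rejecting a store of the skb pointer into a map, while the same call with `allow_ptr_leaks` set to true is accepted.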
    
    Only allow BPF_PROG_TYPE_SOCKET_FILTER unprivileged programs,
    so that socket filters (tcpdump), af_packet (QUIC acceleration)
    and future kcm can use it.
    Tracing and tc cls/act program types still require root permissions,
    since tracing actually needs to be able to see all kernel pointers
    and tc is for root only.
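
The program-type restriction described above can be sketched as a single permission check. This is a simplified user-space model under stated assumptions: the `prog_type` enum and `check_load_permission` are hypothetical names, and the `has_cap_sys_admin` flag stands in for the kernel's capability check.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the program types named in the text. */
enum prog_type {
    PROG_SOCKET_FILTER,
    PROG_KPROBE,    /* tracing */
    PROG_SCHED_CLS, /* tc classifier */
    PROG_SCHED_ACT  /* tc action */
};

/* Returns 0 if the load may proceed, -EPERM otherwise.
 * Only socket filters may be loaded without root-level capability. */
int check_load_permission(enum prog_type type, bool has_cap_sys_admin)
{
    if (type != PROG_SOCKET_FILTER && !has_cap_sys_admin)
        return -EPERM;
    return 0;
}
```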
    
    For example, the following unprivileged socket filter program is allowed:
    int bpf_prog1(struct __sk_buff *skb)
    {
      u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
      u64 *value = bpf_map_lookup_elem(&my_map, &index);
    
      if (value)
    	*value += skb->len;
      return 0;
    }
    
    but the following program is not:
    int bpf_prog1(struct __sk_buff *skb)
    {
      u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
      u64 *value = bpf_map_lookup_elem(&my_map, &index);
    
      if (value)
    	*value += (u64) skb;
      return 0;
    }
    since it would leak the kernel address into the map.
    
    Unprivileged socket filter bpf programs have access to the
    following helper functions:
    - map lookup/update/delete (but they cannot store kernel pointers into them)
    - get_random (it's already exposed to unprivileged user space)
    - get_smp_processor_id
    - tail_call into another socket filter program
    - ktime_get_ns
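
One way to picture the helper list above is as an allowlist lookup. The sketch below is only illustrative: the string table and `helper_allowed` are assumptions made for this example, and the entry `get_prandom_u32` is used for the helper the text calls get_random. The kernel itself selects helpers by function ID, not by name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Helpers available to unprivileged socket filters, per the list above. */
static const char *const unpriv_socket_filter_helpers[] = {
    "map_lookup_elem",
    "map_update_elem",
    "map_delete_elem",
    "get_prandom_u32",
    "get_smp_processor_id",
    "tail_call",
    "ktime_get_ns",
};

/* True if the named helper appears in the allowlist. */
bool helper_allowed(const char *name)
{
    size_t n = sizeof(unpriv_socket_filter_helpers) /
               sizeof(unpriv_socket_filter_helpers[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(unpriv_socket_filter_helpers[i], name) == 0)
            return true;
    }
    return false;
}
```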
    
    The feature is controlled by the sysctl kernel.unprivileged_bpf_disabled.
    This toggle defaults to off (0), but can be set to true (1).  Once true,
    bpf programs and maps cannot be accessed from an unprivileged process,
    and the toggle cannot be set back to false.
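
For reference, a user-space program could check the toggle before attempting an unprivileged load. The sketch below assumes the sysctl is exposed at the usual procfs location, /proc/sys/kernel/unprivileged_bpf_disabled; the function name `read_unprivileged_bpf_disabled` is hypothetical, and the path is taken as a parameter so the function can be exercised against any file.

```c
#include <stdio.h>

/* Reads an integer sysctl value from the given file.
 * Returns the value on success, or -1 if the file cannot be opened
 * or does not start with an integer. */
int read_unprivileged_bpf_disabled(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller would typically pass "/proc/sys/kernel/unprivileged_bpf_disabled" and treat a result of 1 as meaning that unprivileged processes cannot use bpf programs or maps.
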
    Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
syscall.c 14.4 KB