1. 23 May, 2018 8 commits
    • Merge branch 'btf-uapi-cleanups' · ff4fb475
      Daniel Borkmann authored
      Martin KaFai Lau says:
      
      ====================
      This patch set makes some changes to clean up the unused
      bits in the BTF uapi.  It also makes the btf_header extensible.
      
      Please see individual patches for details.
      
      v2:
      - Remove NR_SECS from patch 2
      - Remove "unsigned" check on array->index_type from patch 3
      - Remove BTF_INT_VARARGS and further limit BTF_INT_ENCODING
        from 8 bits to 4 bits in patch 4
      - Adjustments in test_btf.c to reflect changes in v2
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Add tests for the btf uapi changes · 61746dbe
      Martin KaFai Lau authored
      This patch does the following:
      1. Modify libbpf and test_btf to reflect the uapi changes in btf
      2. Add test for the btf_header changes
      3. Add tests for array->index_type
      4. Add err_str check to the tests
      5. Fix a 4-byte hole in "struct test #1" by swapping "m" and "n"
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Sync bpf.h and btf.h to tools · f03b15d3
      Martin KaFai Lau authored
      This patch syncs the uapi bpf.h and btf.h to tools.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Rename btf_key_id and btf_value_id in bpf_map_info · 9b2cf328
      Martin KaFai Lau authored
      In "struct bpf_map_info", the names "btf_id", "btf_key_id" and
      "btf_value_id" could cause confusion: the "id" in "btf_id" means the
      BPF obj id given to the BTF object, while "btf_key_id" and
      "btf_value_id" mean the BTF type id within that BTF object.
      
      To make it clear, btf_key_id and btf_value_id are
      renamed to btf_key_type_id and btf_value_type_id.
      Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Remove unused bits from uapi/linux/btf.h · aea2f7b8
      Martin KaFai Lau authored
      This patch does the following:
      1. Limit BTF_MAX_TYPES and BTF_MAX_NAME_OFFSET to 64k.  We can
         raise it later.
      
      2. Remove the BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID.  They are
         currently encoded at the highest bit of a u32.
         They are removed because the current use case does not require
         supporting a parent type (i.e. a type_id referring to a type in
         another BTF file), nor does it support referring to a string in ELF.
      
         The BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID checks are replaced
         by BTF_TYPE_ID_CHECK and BTF_STR_OFFSET_CHECK which are
         defined in btf.c instead of uapi/linux/btf.h.
      
      3. Limit the BTF_INFO_KIND from 5 bits to 4 bits, which is enough.
         There is headroom in the unused bits if we ever need it later.
      
      4. The root bit in BTF_INFO is also removed because it is not
         used in the current use case.
      
      5. Remove BTF_INT_VARARGS since func type is not supported now.
         The BTF_INT_ENCODING is limited to 4 bits instead of 8 bits.
      
      The above can be added back later because the verifier
      ensures the unused bits are zeros.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
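To illustrate why unused bits that are verified to be zero can be reclaimed later, here is a minimal sketch of packing and checking a 32-bit info word. The bit positions (kind in bits 24-27, vlen in bits 0-15) and all names are assumptions made for this illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration only: 4-bit kind at bits 24-27,
 * 16-bit vlen at bits 0-15. Every other bit is treated as unused. */
#define SK_KIND_SHIFT 24
#define SK_KIND_MASK  0x0fu
#define SK_VLEN_MASK  0xffffu
#define SK_USED_BITS  ((SK_KIND_MASK << SK_KIND_SHIFT) | SK_VLEN_MASK)

/* Build an info word from a kind and a member count. */
static inline uint32_t make_info(uint32_t kind, uint32_t vlen)
{
    assert(kind <= SK_KIND_MASK);
    assert(vlen <= SK_VLEN_MASK);
    return (kind << SK_KIND_SHIFT) | vlen;
}

static inline uint32_t info_kind(uint32_t info)
{
    return (info >> SK_KIND_SHIFT) & SK_KIND_MASK;
}

static inline uint32_t info_vlen(uint32_t info)
{
    return info & SK_VLEN_MASK;
}

/* A verifier that rejects any set bit outside the used mask keeps
 * those bits free to be given a meaning in a later revision. */
static inline bool info_unused_bits_are_zero(uint32_t info)
{
    return (info & ~SK_USED_BITS) == 0;
}
```

Because a word with a stray high bit is rejected today, no existing valid input can conflict with a future use of that bit.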
    • bpf: btf: Check array->index_type · 4ef5f574
      Martin KaFai Lau authored
      Instead of ignoring the array->index_type field, enforce that
      it must be a BTF_KIND_INT of size 1, 2, 4 or 8 bytes.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
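The rule stated in this commit can be sketched as a small predicate. The struct and enum below are hypothetical stand-ins invented for this example, not the kernel's BTF types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified type description used only in this sketch. */
enum sketch_kind { SKETCH_KIND_INT = 1, SKETCH_KIND_PTR = 2 };

struct sketch_type {
    enum sketch_kind kind;
    uint32_t size; /* size in bytes */
};

/* An array index type is accepted only if it is an integer
 * whose size is 1, 2, 4 or 8 bytes. */
static inline bool index_type_is_valid(const struct sketch_type *t)
{
    assert(t != NULL);
    if (t->kind != SKETCH_KIND_INT)
        return false;
    switch (t->size) {
    case 1:
    case 2:
    case 4:
    case 8:
        return true;
    default:
        return false;
    }
}
```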
    • bpf: btf: Change how section is supported in btf_header · f80442a4
      Martin KaFai Lau authored
      There are currently unused section descriptions in the btf_header.  Those
      sections are here to support future BTF use cases.  For example, the
      func section (func_off) is to support function signature (e.g. the BPF
      prog function signature).
      
      Instead of spelling out all potential sections up-front in the btf_header,
      this patch changes btf_header such that extending it (e.g. adding
      a section) is possible later.  The unused sections can be removed for now
      and added back later.
      
      This patch:
      1. adds a hdr_len to the btf_header.  It will allow adding
         sections (and other info like parent_label and parent_name)
         later.  The check is similar to the existing bpf_attr check:
         if a user passes in a longer hdr_len, the kernel
         ensures the extra trailing bytes are 0.

      2. allows the section order in the BTF object to be
         different from its sec_off order in btf_header.

      3. follows each sec_off with a sec_len.  There must be no gaps or
         overlaps among sections.
      
      The string section is ensured to be at the end due to the 4 bytes
      alignment requirement of the type section.
      
      The above changes will allow enough flexibility to
      add new sections (and other info) to the btf_header later.
      
      This patch also removes an unnecessary !err check
      at the end of btf_parse().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
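One way to check the "no gap, no overlap, any order" rule from this commit is to sort the (offset, length) pairs and verify that they tile the data area exactly. This is only a sketch under that assumption, with hypothetical names; it does not claim to mirror how the kernel implements the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical section descriptor: an offset followed by a length. */
struct sec {
    uint32_t off;
    uint32_t len;
};

/* Order by offset, then by length so that ties sort deterministically. */
static int sec_cmp(const void *a, const void *b)
{
    const struct sec *x = a;
    const struct sec *y = b;

    if (x->off != y->off)
        return (x->off > y->off) - (x->off < y->off);
    return (x->len > y->len) - (x->len < y->len);
}

/* Returns true if the sections cover [0, total_len) exactly, with no
 * gap and no overlap. The header may list them in any order; the
 * array is sorted in place. */
static bool sections_are_contiguous(struct sec *secs, size_t n,
                                    uint32_t total_len)
{
    uint64_t expected = 0;

    if (n == 0)
        return total_len == 0;
    assert(secs != NULL);

    qsort(secs, n, sizeof(*secs), sec_cmp);
    for (size_t i = 0; i < n; i++) {
        if (secs[i].off != expected)
            return false; /* gap or overlap */
        expected += secs[i].len;
    }
    return expected == total_len;
}
```

Accumulating the running offset in 64 bits avoids a wrap-around when the 32-bit lengths are summed.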
    • bpf: Expose check_uarg_tail_zero() · dcab51f1
      Martin KaFai Lau authored
      This patch exposes check_uarg_tail_zero() which will
      be reused by a later BTF patch.  Its name is changed to
      bpf_check_uarg_tail_zero().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
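For readers unfamiliar with the tail-zero idea, the following is a userspace analogue, not the kernel function itself. It accepts a structure that is larger than the size the callee knows about only if every extra byte is zero. The function name and the choice of -E2BIG as the error value are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace analogue (hypothetical name). buf holds actual_size bytes;
 * the callee only understands the first expected_size bytes. */
static int tail_zero_check(const unsigned char *buf, size_t expected_size,
                           size_t actual_size)
{
    assert(buf != NULL);

    /* A caller using an older, shorter layout has no tail to check. */
    if (actual_size <= expected_size)
        return 0;

    /* A newer, longer layout is fine as long as the unknown part is 0. */
    for (size_t i = expected_size; i < actual_size; i++) {
        if (buf[i] != 0)
            return -E2BIG;
    }
    return 0;
}
```

This is what lets old and new callers interoperate: a newer caller that does not use any new field still passes the check.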
  2. 22 May, 2018 13 commits
  3. 18 May, 2018 12 commits
  4. 17 May, 2018 4 commits
    • bpf: sockmap, fix double-free · a7862293
      Gustavo A. R. Silva authored
      `e' is being freed twice.
      
      Fix this by removing one of the kfree() calls.
      
      Addresses-Coverity-ID: 1468983 ("Double free")
      Fixes: 81110384 ("bpf: sockmap, add hash map support")
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: sockmap, fix uninitialized variable · 0e436456
      Gustavo A. R. Silva authored
      There is a potential execution path in which variable err is
      returned without being properly initialized previously.
      
      Fix this by initializing variable err to 0.
      
      Addresses-Coverity-ID: 1468964 ("Uninitialized scalar variable")
      Fixes: e5cd3abc ("bpf: sockmap, refactor sockmap routines to work with hashmap")
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: change eBPF helper doc parsing script to allow for smaller indent · eeacb716
      Quentin Monnet authored
      Documentation for eBPF helpers can be parsed from bpf.h and eventually
      turned into a man page. Commit 6f96674d ("bpf: relax constraints on
      formatting for eBPF helper documentation") changed the script used to
      parse it, in order to allow for different indent style and to ease the
      work for writing documentation for future helpers.
      
      The script currently considers that the first tab can be replaced by 6
      to 8 spaces. But the documentation for bpf_fib_lookup() uses a mix of
      tabs (for the "Description" part) and of spaces ("Return" part), and
      has only a 5-space indent for the latter.
      
      We probably do not want to change the values accepted by the script each
      time a new helper gets a new indent style. However, it is worth noting
      that with those 5 spaces, the "Description" and "Return" part *look*
      aligned in the generated patch and in `git show`, so it is likely other
      helper authors will use the same length. Therefore, allow for helper
      documentation to use 5 spaces only for the first indent level.
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
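The relaxed rule (the first indent level is either one tab or 5 to 8 spaces) can be expressed as a small predicate. This C sketch is only an illustration of the rule as described in the commit message; the function name is hypothetical and it is not the parsing script's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* s points at the text where the first indent level is expected.
 * Accept a single tab, or a run of 5 to 8 spaces. */
static bool first_indent_ok(const char *s)
{
    size_t spaces = 0;

    assert(s != NULL);
    if (s[0] == '\t')
        return true;
    while (s[spaces] == ' ')
        spaces++;
    return spaces >= 5 && spaces <= 8;
}
```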
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · b9f672af
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2018-05-17
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      The main changes are:
      
      1) Provide a new BPF helper for doing a FIB and neighbor lookup
         in the kernel tables from an XDP or tc BPF program. The helper
         provides a fast-path for forwarding packets. The API supports
         IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are
         implemented in this initial work, from David (Ahern).
      
      2) Just a tiny diff but huge feature enabled for nfp driver by
         extending the BPF offload beyond a pure host processing offload.
         Offloaded XDP programs are allowed to set the RX queue index and
         thus opening the door for defining a fully programmable RSS/n-tuple
         filter replacement. Once BPF decided on a queue already, the device
         data-path will skip the conventional RSS processing completely,
         from Jakub.
      
      3) The original sockmap implementation was array based similar to
         devmap. However unlike devmap where an ifindex has a 1:1 mapping
         into the map there are use cases with sockets that need to be
         referenced using longer keys. Hence, sockhash map is added reusing
         as much of the sockmap code as possible, from John.
      
      4) Introduce BTF ID. The ID is allocated through an IDR, similar to
         BPF maps and progs. It also makes BTF accessible to user
         space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data
         through BPF_OBJ_GET_INFO_BY_FD, from Martin.
      
      5) Enable BPF stackmap with build_id also in NMI context. Due to the
         up_read() of current->mm->mmap_sem build_id cannot be parsed.
         This work defers the up_read() via a per-cpu irq_work so that
         at least limited support can be enabled, from Song.
      
      6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND
         JIT conversion as well as implementation of an optimized 32/64 bit
         immediate load in the arm64 JIT that reduces the number of
         emitted instructions; tested real-world programs
         shrank by three percent, from Daniel.
      
      7) Add ifindex parameter to the libbpf loader in order to enable
         BPF offload support. Right now only iproute2 can load offloaded
         BPF and this will also enable libbpf for direct integration into
         other applications, from David (Beckett).
      
      8) Convert the plain text documentation under Documentation/bpf/ into
         RST format since this is the appropriate standard the kernel is
         moving to for all documentation. Also add an overview README.rst,
         from Jesper.
      
      9) Add __printf verification attribute to the bpf_verifier_vlog()
         helper. Though it uses va_list we can still allow gcc to check
         the format string, from Mathieu.
      
      10) Fix a bash reference in the BPF selftest's Makefile. The '|& ...'
          is a bash 4.0+ feature which is not guaranteed to be available
          when calling out to shell, therefore use a more portable variant,
          from Joe.
      
      11) Fix a 64 bit division in xdp_umem_reg() by using div_u64()
          instead of relying on the gcc built-in, from Björn.
      
      12) Fix a sock hashmap kmalloc warning reported by syzbot when an
          overly large key size is used in hashmap then causing overflows
          in htab->elem_size. Reject bogus attr->key_size early in the
          sock_hash_alloc(), from Yonghong.
      
      13) Ensure in BPF selftests when urandom_read is being linked that
          --build-id is always enabled so that test_stacktrace_build_id[_nmi]
          won't be failing, from Alexei.
      
      14) Add bitsperlong.h as well as errno.h uapi headers into the tools
          header infrastructure which point to one of the arch specific
          uapi headers. This was needed in order to fix a build error on
          some systems for the BPF selftests, from Sirio.
      
      15) Allow for short options to be used in the xdp_monitor BPF sample
          code. And also a bpf.h tools uapi header sync in order to fix a
          selftest build failure. Both from Prashant.
      
      16) More formally clarify the meaning of ID in the direct packet access
          section of the BPF documentation, from Wang.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 16 May, 2018 3 commits
    • bpf: sockmap, on update propagate errors back to userspace · e23afe5e
      John Fastabend authored
      When an error occurs in the sockmap element update logic, also pass
      the error up to the user.
      
      Fixes: e5cd3abc ("bpf: sockmap, refactor sockmap routines to work with hashmap")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: fix sock hashmap kmalloc warning · 683d2ac3
      Yonghong Song authored
      syzbot reported a kernel warning below:
        WARNING: CPU: 0 PID: 4499 at mm/slab_common.c:996 kmalloc_slab+0x56/0x70 mm/slab_common.c:996
        Kernel panic - not syncing: panic_on_warn set ...
      
        CPU: 0 PID: 4499 Comm: syz-executor050 Not tainted 4.17.0-rc3+ #9
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x1b9/0x294 lib/dump_stack.c:113
         panic+0x22f/0x4de kernel/panic.c:184
         __warn.cold.8+0x163/0x1b3 kernel/panic.c:536
         report_bug+0x252/0x2d0 lib/bug.c:186
         fixup_bug arch/x86/kernel/traps.c:178 [inline]
         do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
         do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
         invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
        RIP: 0010:kmalloc_slab+0x56/0x70 mm/slab_common.c:996
        RSP: 0018:ffff8801d907fc58 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff8801aeecb280 RCX: ffffffff8185ebd7
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000ffffffe1
        RBP: ffff8801d907fc58 R08: ffff8801adb5e1c0 R09: ffffed0035a84700
        R10: ffffed0035a84700 R11: ffff8801ad423803 R12: ffff8801aeecb280
        R13: 00000000fffffff4 R14: ffff8801ad891a00 R15: 00000000014200c0
         __do_kmalloc mm/slab.c:3713 [inline]
         __kmalloc+0x25/0x760 mm/slab.c:3727
         kmalloc include/linux/slab.h:517 [inline]
         map_get_next_key+0x24a/0x640 kernel/bpf/syscall.c:858
         __do_sys_bpf kernel/bpf/syscall.c:2131 [inline]
         __se_sys_bpf kernel/bpf/syscall.c:2096 [inline]
         __x64_sys_bpf+0x354/0x4f0 kernel/bpf/syscall.c:2096
         do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The test case is against a sock hashmap with a key size of 0xffffffe1.
      Such a large key size causes the code below in
      sock_hash_alloc() to overflow and produce a smaller elem_size,
      so map creation succeeds.
          htab->elem_size = sizeof(struct htab_elem) +
                            round_up(htab->map.key_size, 8);
      
      Later, when map_get_next_key is called and the kernel fails
      to allocate the key, it issues the above warning.
      
      Similar to hashtab, ensure the key size is at most
      MAX_BPF_STACK for a successful map creation.
      
      Fixes: 81110384 ("bpf: sockmap, add hash map support")
      Reported-by: syzbot+e4566d29080e7f3460ff@syzkaller.appspotmail.com
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
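The arithmetic behind the overflow can be worked through in userspace. The sketch below assumes that elem_size is stored in a 32-bit field and uses a made-up header size of 48 bytes in place of sizeof(struct htab_elem). The bound of 512 stands in for MAX_BPF_STACK, and returning -E2BIG is this sketch's choice of error value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ELEM_HDR 48u  /* hypothetical sizeof(struct htab_elem) */
#define SKETCH_MAX_KEY  512u /* stand-in for MAX_BPF_STACK */

/* Round x up to the next multiple of 8, in 32-bit arithmetic. */
static inline uint32_t round_up8(uint32_t x)
{
    return (x + 7u) & ~7u;
}

/* The sum is computed in 64 bits and then truncated to 32 bits,
 * as happens when it is assigned to a 32-bit elem_size field. */
static inline uint32_t elem_size_unchecked(uint32_t key_size)
{
    uint64_t sum = (uint64_t)SKETCH_ELEM_HDR + round_up8(key_size);

    return (uint32_t)sum;
}

/* Rejecting oversized keys first keeps the sum far below 2^32. */
static inline int elem_size_checked(uint32_t key_size, uint32_t *out)
{
    assert(out != NULL);
    if (key_size > SKETCH_MAX_KEY)
        return -E2BIG;
    *out = elem_size_unchecked(key_size);
    return 0;
}
```

With these assumptions, a key size of 0xffffffe1 rounds up to 0xffffffe8, the 64-bit sum is 0x100000018, and truncation leaves an elem_size of only 0x18 (24) bytes, which is why map creation appeared to succeed.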
    • libbpf: add ifindex to enable offload support · f0307a7e
      David Beckett authored
      BPF programs currently can only be offloaded using iproute2. This
      patch will allow programs to be offloaded using libbpf calls.
      Signed-off-by: David Beckett <david.beckett@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>