1. 16 Aug, 2021 6 commits
    • bpf: Allow to specify user-provided bpf_cookie for BPF perf links · 82e6b1ee
      Andrii Nakryiko authored
      Add ability for users to specify custom u64 value (bpf_cookie) when creating
      BPF link for perf_event-backed BPF programs (kprobe/uprobe, perf_event,
      tracepoints).
      
      This is useful for cases when the same BPF program is used for attaching and
      processing invocation of different tracepoints/kprobes/uprobes in a generic
      fashion, but such that each invocation can be distinguished from the others (e.g.,
      BPF program can look up additional information associated with a specific
      kernel function without having to rely on function IP lookups). This enables
      new use cases to be implemented simply and efficiently that previously were
      possible only through code generation (and thus multiple instances of almost
      identical BPF program) or compilation at runtime (BCC-style) on target hosts
      (even more expensive resource-wise). For uprobes it is not even possible in
      some cases to know function IP beforehand (e.g., when attaching to shared
      library without PID filtering, in which case base load address is not known
      for a library).
      
      This is done by storing u64 bpf_cookie in struct bpf_prog_array_item,
      corresponding to each attached and run BPF program. Given cgroup BPF programs
      already use two 8-byte pointers for their needs and cgroup BPF programs don't
      have (yet?) support for bpf_cookie, reuse that space through union of
      cgroup_storage and new bpf_cookie field.
      
      Make it available to kprobe/tracepoint BPF programs through bpf_trace_run_ctx.
      This is set by BPF_PROG_RUN_ARRAY, used by kprobe/uprobe/tracepoint BPF
      program execution code, which luckily is now also split from
      BPF_PROG_RUN_ARRAY_CG. This run context will be utilized by a new BPF helper
      giving access to this user-provided cookie value from inside a BPF program.
      Generic perf_event BPF programs will access this value from perf_event itself
      through passed in BPF program context.
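
      The storage and pass-through scheme described above can be sketched in
      plain user-space C. This is an illustration only, not the kernel code: the
      type names mirror the commit text, but their layouts, the global
      current_run_ctx pointer, and the get_cookie(), run_array() and
      record_cookie() functions are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for kernel types; layouts are assumptions. */
struct bpf_run_ctx { int unused; };

struct bpf_trace_run_ctx {
	struct bpf_run_ctx run_ctx;   /* must stay the first member */
	uint64_t bpf_cookie;
};

struct bpf_prog;
typedef uint32_t (*prog_fn_t)(const struct bpf_prog *prog, const void *ctx);

struct bpf_prog { prog_fn_t fn; };

struct bpf_prog_array_item {
	struct bpf_prog *prog;
	union {
		void *cgroup_storage[2];  /* used by cgroup programs */
		uint64_t bpf_cookie;      /* used by perf_event-backed programs */
	};
};

/* Hypothetical "current run context" pointer for this sketch. */
static struct bpf_run_ctx *current_run_ctx;

/* Hypothetical helper: read the cookie of the program being run. */
static uint64_t get_cookie(void)
{
	struct bpf_trace_run_ctx *ctx;

	assert(current_run_ctx != NULL);
	ctx = (struct bpf_trace_run_ctx *)current_run_ctx;
	return ctx->bpf_cookie;
}

/* Run each program of a NULL-terminated array, publishing its cookie first. */
static uint32_t run_array(const struct bpf_prog_array_item *items,
			  const void *ctx)
{
	struct bpf_trace_run_ctx run_ctx = {0};
	struct bpf_run_ctx *old = current_run_ctx;
	uint32_t ret = 1;

	current_run_ctx = &run_ctx.run_ctx;
	for (; items->prog; items++) {
		run_ctx.bpf_cookie = items->bpf_cookie;
		ret &= items->prog->fn(items->prog, ctx);
	}
	current_run_ctx = old;
	return ret;
}

/* Toy "program" that records the cookie it was invoked with. */
static uint64_t seen[4];
static size_t nseen;

static uint32_t record_cookie(const struct bpf_prog *prog, const void *ctx)
{
	(void)prog;
	(void)ctx;
	if (nseen < 4)
		seen[nseen++] = get_cookie();
	return 1;
}
```

      Attaching the same record_cookie program twice with cookies 0x10 and 0x20
      and calling run_array() once leaves both values in seen[], in attachment
      order, which is the "one program, many attach points" pattern the commit
      message describes.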
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/bpf/20210815070609.987780-6-andrii@kernel.org
    • bpf: Implement minimal BPF perf link · b89fbfbb
      Andrii Nakryiko authored
      Introduce a new type of BPF link - BPF perf link. This brings perf_event-based
      BPF program attachments (perf_event, tracepoints, kprobes, and uprobes) into
      the common BPF link infrastructure. This makes it possible to list all active
      perf_event-based attachments, auto-detach a BPF program from its perf_event
      when the link's FD is closed, and use the generic BPF link fdinfo/get_info
      functionality.
      
      BPF_LINK_CREATE command expects perf_event's FD as target_fd. No extra flags
      are currently supported.
      
      Force-detaching and atomic BPF program updates are not yet implemented, but
      with perf_event-based BPF links we now have a common framework for this without
      the need to extend the ioctl()-based perf_event interface.
      
      One interesting consideration is a new value for bpf_attach_type, which
      BPF_LINK_CREATE command expects. Generally, it's either 1-to-1 mapping from
      bpf_attach_type to bpf_prog_type, or many-to-1 mapping from a subset of
      bpf_attach_types to one bpf_prog_type (e.g., see BPF_PROG_TYPE_SK_SKB or
      BPF_PROG_TYPE_CGROUP_SOCK). In this case, though, we have three different
      program types (KPROBE, TRACEPOINT, PERF_EVENT) using the same perf_event-based
      mechanism, so it's many bpf_prog_types to one bpf_attach_type. I chose to
      define a single BPF_PERF_EVENT attach type for all of them and adjust
      link_create()'s logic for checking correspondence between attach type and
      program type.
      
      The alternative would be to define three new attach types (e.g., BPF_KPROBE,
      BPF_TRACEPOINT, and BPF_PERF_EVENT), but that seemed like unnecessary overkill
      and BPF_KPROBE will cause naming conflicts with BPF_KPROBE() macro, defined by
      libbpf. I chose to not do this to avoid unnecessary proliferation of
      bpf_attach_type enum values and not have to deal with naming conflicts.
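
      The many-program-types-to-one-attach-type check can be illustrated with a
      small user-space sketch. The enumerator names below are shortened,
      hypothetical stand-ins for the kernel's bpf_prog_type and bpf_attach_type
      values, and attach_type_matches() is an assumed helper, not the actual
      link_create() logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, shortened stand-ins for kernel enum values. */
enum prog_type {
	PROG_TYPE_KPROBE,
	PROG_TYPE_TRACEPOINT,
	PROG_TYPE_PERF_EVENT,
	PROG_TYPE_XDP,
};

enum attach_type {
	ATTACH_PERF_EVENT,
	ATTACH_XDP,
};

/* Return true when the attach type is acceptable for the program type. */
static bool attach_type_matches(enum prog_type ptype, enum attach_type atype)
{
	switch (ptype) {
	case PROG_TYPE_KPROBE:
	case PROG_TYPE_TRACEPOINT:
	case PROG_TYPE_PERF_EVENT:
		/* Three program types share one perf_event-based attach type. */
		return atype == ATTACH_PERF_EVENT;
	case PROG_TYPE_XDP:
		/* The usual 1-to-1 case, shown for contrast. */
		return atype == ATTACH_XDP;
	}
	return false;
}
```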
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/bpf/20210815070609.987780-5-andrii@kernel.org
    • bpf: Refactor perf_event_set_bpf_prog() to use struct bpf_prog input · 652c1b17
      Andrii Nakryiko authored
      Make internal perf_event_set_bpf_prog() use struct bpf_prog pointer as an
      input argument, which makes it easier to re-use for other internal uses
      (coming up for BPF link in the next patch). BPF program FD is not as
      convenient and in some cases it's not available. So switch to struct bpf_prog,
      move refcounting out of the function, and let the caller do bpf_prog_put() in
      case of an error. This follows the approach of most of the other BPF internal
      functions.
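
      The ownership convention described here (the callee takes an already
      referenced program, the caller drops the reference on failure) might look
      roughly like the following user-space sketch. All names (struct prog,
      struct event, prog_get(), prog_put(), event_set_prog(), attach()) and the
      choice of -EEXIST are assumptions for illustration, not the kernel
      functions themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for a refcounted program and an event. */
struct prog { int refcnt; };
struct event { struct prog *prog; };

static void prog_get(struct prog *p)
{
	p->refcnt++;
}

static void prog_put(struct prog *p)
{
	assert(p->refcnt > 0);
	p->refcnt--;
}

/*
 * Callee: takes a program pointer, never touches the refcount.
 * On success the event keeps the reference the caller took.
 */
static int event_set_prog(struct event *ev, struct prog *p)
{
	if (ev->prog)
		return -EEXIST;   /* assumed error code for this sketch */
	ev->prog = p;
	return 0;
}

/* Caller: takes the reference up front and drops it only on error. */
static int attach(struct event *ev, struct prog *p)
{
	int err;

	prog_get(p);
	err = event_set_prog(ev, p);
	if (err)
		prog_put(p);
	return err;
}
```

      With this split, a second attach() to the same event fails and leaves the
      reference count exactly where it was before the call.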
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210815070609.987780-4-andrii@kernel.org
    • bpf: Refactor BPF_PROG_RUN_ARRAY family of macros into functions · 7d08c2c9
      Andrii Nakryiko authored
      Similar to BPF_PROG_RUN, turn BPF_PROG_RUN_ARRAY macros into proper functions
      with all the same readability and maintainability benefits. Making them into
      functions required shuffling around bpf_set_run_ctx/bpf_reset_run_ctx
      functions. Also, explicitly specifying the type of the BPF prog run callback
      required adjusting __bpf_prog_run_save_cb() to accept const void *, cast
      internally to const struct sk_buff.
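
      A rough user-space sketch of why an explicitly typed callback pushes the
      concrete context cast into the callback itself. struct prog, struct packet,
      run_prog_t, run_on_packet() and run_array() are hypothetical names chosen
      for this illustration, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: a "program" that accepts packets of a minimum length. */
struct prog { uint32_t min_len; };
struct packet { uint32_t len; };

/* One explicit callback type shared by every kind of context. */
typedef uint32_t (*run_prog_t)(const struct prog *prog, const void *ctx);

/* The callback takes const void * and casts to its concrete type internally. */
static uint32_t run_on_packet(const struct prog *prog, const void *ctx)
{
	const struct packet *pkt = ctx;

	assert(pkt != NULL);
	return pkt->len >= prog->min_len;
}

/* A function, not a macro: argument and callback types are checked by the compiler. */
static uint32_t run_array(const struct prog *const *progs, const void *ctx,
			  run_prog_t run)
{
	uint32_t ret = 1;

	for (; *progs != NULL; progs++)
		ret &= run(*progs, ctx);
	return ret;
}
```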
      
      Further, split out a cgroup-specific BPF_PROG_RUN_ARRAY_CG and
      BPF_PROG_RUN_ARRAY_CG_FLAGS from the more generic BPF_PROG_RUN_ARRAY due to
      the differences in bpf_run_ctx used for those two different use cases.
      
      I think BPF_PROG_RUN_ARRAY_CG would benefit from further refactoring to accept
      struct cgroup and enum bpf_attach_type instead of bpf_prog_array, fetching
      cgrp->bpf.effective[type] and RCU-dereferencing it internally. But that
      required including include/linux/cgroup-defs.h, which I wasn't sure is ok with
      everyone.
      
      The remaining generic BPF_PROG_RUN_ARRAY function will be extended to
      pass-through user-provided context value in the next patch.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210815070609.987780-3-andrii@kernel.org
    • bpf: Refactor BPF_PROG_RUN into a function · fb7dd8bc
      Andrii Nakryiko authored
      Turn BPF_PROG_RUN into a proper always-inlined function. No functional or
      performance changes are intended, but it makes it much easier to understand
      how BPF programs actually get executed. It's more obvious what types and
      callbacks are expected. Also extra () around input
      parameters can be dropped, as well as `__` variable prefixes intended to avoid
      naming collisions, which makes the code simpler to read and write.
      
      This refactoring also highlighted one extra issue. BPF_PROG_RUN is both
      a macro and an enum value (BPF_PROG_RUN == BPF_PROG_TEST_RUN). Turning
      BPF_PROG_RUN into a function causes naming conflict compilation error. So
      rename BPF_PROG_RUN into lower-case bpf_prog_run(), similar to
      bpf_prog_run_xdp(), bpf_prog_run_pin_on_cpu(), etc. All existing callers of
      BPF_PROG_RUN, the macro, are switched to bpf_prog_run() explicitly.
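
      The readability difference between the macro and function forms can be seen
      in a toy user-space comparison. PROG_RUN_MACRO, prog_run(), struct prog,
      struct insn and add_imm() are hypothetical names for this sketch; the
      kernel's bpf_prog_run() does more than this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, minimal stand-ins for an instruction and a program. */
struct insn { int32_t imm; };

typedef uint32_t (*bpf_func_t)(const void *ctx, const struct insn *insns);

struct prog {
	bpf_func_t fn;
	const struct insn *insns;
};

/* Macro form: every argument needs defensive parentheses, nothing is typed. */
#define PROG_RUN_MACRO(prog, ctx) ((prog)->fn((ctx), (prog)->insns))

/* Function form: parameter types are explicit and checked by the compiler. */
static inline uint32_t prog_run(const struct prog *prog, const void *ctx)
{
	return prog->fn(ctx, prog->insns);
}

/* Toy "program": adds the first instruction's immediate to the int behind ctx. */
static uint32_t add_imm(const void *ctx, const struct insn *insns)
{
	const int32_t *val = ctx;

	return (uint32_t)(*val + insns[0].imm);
}
```

      Both forms produce the same result for the same inputs; only the function
      form documents and enforces what a caller is expected to pass.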
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210815070609.987780-2-andrii@kernel.org
    • Colin Ian King's avatar
      1bda52f8
  2. 15 Aug, 2021 6 commits
  3. 14 Aug, 2021 3 commits
  4. 13 Aug, 2021 10 commits
    • selftests/bpf: Fix test_core_autosize on big-endian machines · d164dd9a
      Ilya Leoshkevich authored
      The "probed" part of test_core_autosize copies an integer using
      bpf_core_read() into an integer of a potentially different size.
      On big-endian machines a destination offset is required for this to
      produce a sensible result.
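
      Why a destination offset is needed can be shown with a small user-space
      sketch that copies a narrower unsigned integer into a zeroed 8-byte
      destination. host_is_big_endian() and widen_copy() are hypothetical helper
      names; the selftest itself uses bpf_core_read(), which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Detect host byte order by looking at the first byte of a known value. */
static int host_is_big_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 0;
}

/*
 * Copy an unsigned integer of src_size bytes into a zeroed uint64_t.
 * On little-endian hosts the low-order bytes sit at offset 0; on
 * big-endian hosts they sit at the end, so the copy must be shifted
 * by (8 - src_size) bytes to preserve the value.
 */
static uint64_t widen_copy(const void *src, size_t src_size)
{
	uint64_t dst = 0;
	size_t off;

	assert(src_size <= sizeof(dst));
	off = host_is_big_endian() ? sizeof(dst) - src_size : 0;
	memcpy((uint8_t *)&dst + off, src, src_size);
	return dst;
}
```

      Without the offset, a big-endian host would place the source bytes in the
      high-order end of the destination and read back a different value.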
      
      Fixes: 888d83b9 ("selftests/bpf: Validate libbpf's auto-sizing of LD/ST/STX instructions")
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20210812224814.187460-1-iii@linux.ibm.com
    • libbpf: Support weak typed ksyms. · 2211c825
      Hao Luo authored
      Currently weak typeless ksyms default to zero when they don't
      exist in the kernel. However, weak typed ksyms are rejected by libbpf
      if they cannot be resolved. This means that if a bpf object contains
      the declaration of a nonexistent weak typed ksym, it will be rejected
      even if there is no program that references the symbol.
      
      Nonexistent weak typed ksyms can also default to zero just like
      typeless ones. This allows programs that access weak typed ksyms to be
      accepted by verifier, if the accesses are guarded. For example,
      
      extern const int bpf_link_fops3 __ksym __weak;
      
      /* then in BPF program */
      
      if (&bpf_link_fops3) {
         /* use bpf_link_fops3 */
      }
      
      If actual use of a nonexistent typed ksym is not guarded properly, the
      verifier will see that the register is not PTR_TO_BTF_ID and won't
      allow it to be used for direct memory reads or passed to BPF helpers.
      Signed-off-by: Hao Luo <haoluo@google.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20210812003819.2439037-1-haoluo@google.com
    • selftests/bpf: Fix running of XDP bonding tests · cf7a5cba
      Jussi Maki authored
      An "innocent" cleanup in the last version of the XDP bonding patchset moved
      the "test__start_subtest" calls to the test main function, but I forgot to
      reverse the condition, which led to all tests being skipped. Fix it.
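
      The kind of inverted condition described here can be sketched as follows.
      This is a guess at the shape of the bug, not the selftest source:
      start_subtest() is a hypothetical stand-in for test__start_subtest()
      (assumed to return true when the subtest should run), and the subtest name
      and other functions are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subtest filter: NULL selects every subtest. */
static const char *selected_subtest;

/* Stand-in helper: returns true when the named subtest should run. */
static bool start_subtest(const char *name)
{
	return selected_subtest == NULL || strcmp(selected_subtest, name) == 0;
}

static int subtests_run;

static void subtest_body(void)
{
	subtests_run++;
}

/*
 * Buggy form: a negated check is right for an early return inside the
 * subtest, but after moving it to the caller it skips every selected subtest.
 */
static void run_buggy(void)
{
	if (!start_subtest("example_subtest"))
		subtest_body();
}

/* Fixed form: run the body only when the subtest is selected. */
static void run_fixed(void)
{
	if (start_subtest("example_subtest"))
		subtest_body();
}
```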
      
      Fixes: 6aab1c81 ("selftests/bpf: Add tests for XDP bonding")
      Signed-off-by: Jussi Maki <joamaki@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210811123627.20223-1-joamaki@gmail.com
    • net: in_irq() cleanup · afa79d08
      Changbin Du authored
      Replace the obsolete and ambiguous macro in_irq() with the new
      macro in_hardirq().
      Signed-off-by: Changbin Du <changbin.du@gmail.com>
      Link: https://lore.kernel.org/r/20210813145749.86512-1-changbin.du@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net, bonding: Disallow vlan+srcmac with XDP · 39a0876d
      Jussi Maki authored
      The new vlan+srcmac xmit policy is not implementable with XDP since
      in many cases the 802.1Q payload is not present in the packet. This
      can be for example due to hardware offload or in the case of veth
      due to use of skbuffs internally.
      
      This also fixes the NULL deref with the vlan+srcmac xmit policy
      reported by Jonathan Toppins by additionally checking the skb
      pointer.
      
      Fixes: a815bde5 ("net, bonding: Refactor bond_xmit_hash for use with xdp_buff")
      Reported-by: Jonathan Toppins <jtoppins@redhat.com>
      Signed-off-by: Jussi Maki <joamaki@gmail.com>
      Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
      Link: https://lore.kernel.org/r/20210812145241.12449-1-joamaki@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: fix holding spinlock in oob handling · 876c14ad
      Rao Shoaib authored
      syzkaller found that OOB code was holding spinlock
      while calling a function in which it could sleep.
      
      Reported-by: syzbot+8760ca6c1ee783ac4abd@syzkaller.appspotmail.com
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
      Link: https://lore.kernel.org/r/20210811220652.567434-1-Rao.Shoaib@oracle.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · f4083a75
      Jakub Kicinski authored
      Conflicts:
      
      drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
        9e266807 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp")
        9e518f25 ("bnxt_en: 1PPS functions to configure TSIO pins")
        099fdeda ("bnxt_en: Event handler for PPS events")
      
      kernel/bpf/helpers.c
      include/linux/bpf-cgroup.h
        a2baf4e8 ("bpf: Fix potentially incorrect results with bpf_get_local_storage()")
        c7603cfa ("bpf: Add ambient BPF runtime context stored in current")
      
      drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
        5957cc55 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray")
        2d0b41a3 ("net/mlx5: Refcount mlx5_irq with integer")
      
      MAINTAINERS
        7b637cd5 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo")
        7d901a1e ("net: phy: add Maxlinear GPY115/21x/24x driver")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · f8e6dfc6
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes, including fixes from netfilter, bpf, can and
        ieee802154.
      
        The size of this is pretty normal, but we got more fixes for 5.14
        changes this week than last week. Nothing major but the trend is the
        opposite of what we like. We'll see how the next week goes..
      
        Current release - regressions:
      
         - r8169: fix ASPM-related link-up regressions
      
         - bridge: fix flags interpretation for extern learn fdb entries
      
         - phy: micrel: fix link detection on ksz87xx switch
      
         - Revert "tipc: Return the correct errno code"
      
         - ptp: fix possible memory leak caused by invalid cast
      
        Current release - new code bugs:
      
         - bpf: add missing bpf_read_[un]lock_trace() for syscall program
      
         - bpf: fix potentially incorrect results with bpf_get_local_storage()
      
         - page_pool: mask the page->signature before the checking, avoid dma
           mapping leaks
      
         - netfilter: nfnetlink_hook: 5 fixes to information in netlink dumps
      
         - bnxt_en: fix firmware interface issues with PTP
      
         - mlx5: Bridge, fix ageing time
      
        Previous releases - regressions:
      
         - linkwatch: fix failure to restore device state across
           suspend/resume
      
         - bareudp: fix invalid read beyond skb's linear data
      
        Previous releases - always broken:
      
         - bpf: fix integer overflow involving bucket_size
      
         - ppp: fix issues when desired interface name is specified via
           netlink
      
         - wwan: mhi_wwan_ctrl: fix possible deadlock
      
         - dsa: microchip: ksz8795: fix number of VLAN related bugs
      
         - dsa: drivers: fix broken backpressure in .port_fdb_dump
      
         - dsa: qca: ar9331: make proper initial port defaults
      
        Misc:
      
         - bpf: add lockdown check for probe_write_user helper
      
         - netfilter: conntrack: remove offload_pickup sysctl before 5.14 is
           out
      
         - netfilter: conntrack: collect all entries in one cycle,
           heuristically slow down garbage collection scans on idle systems to
           prevent frequent wake ups"
      
      * tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
        vsock/virtio: avoid potential deadlock when vsock device remove
        wwan: core: Avoid returning NULL from wwan_create_dev()
        net: dsa: sja1105: unregister the MDIO buses during teardown
        Revert "tipc: Return the correct errno code"
        net: mscc: Fix non-GPL export of regmap APIs
        net: igmp: increase size of mr_ifc_count
        MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers
        tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
        net: pcs: xpcs: fix error handling on failed to allocate memory
        net: linkwatch: fix failure to restore device state across suspend/resume
        net: bridge: fix memleak in br_add_if()
        net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge
        net: bridge: fix flags interpretation for extern learn fdb entries
        net: dsa: sja1105: fix broken backpressure in .port_fdb_dump
        net: dsa: lantiq: fix broken backpressure in .port_fdb_dump
        net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
        net: dsa: hellcreek: fix broken backpressure in .port_fdb_dump
        bpf, core: Fix kernel-doc notation
        net: igmp: fix data-race in igmp_ifc_timer_expire()
        net: Fix memory leak in ieee802154_raw_deliver
        ...
    • Merge tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client · 3a03c67d
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A patch to avoid a soft lockup in ceph_check_delayed_caps() from Luis
        and a reference handling fix from Jeff that should address some memory
        corruption reports in the snaprealm area.
      
        Both marked for stable"
      
      * tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client:
        ceph: take snap_empty_lock atomically with snaprealm refcount change
        ceph: reduce contention in ceph_check_delayed_caps()
    • Merge tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm · 82cce5f4
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Another week, another set of pretty regular fixes, nothing really
        stands out too much.
      
        amdgpu:
         - Yellow carp update
         - RAS EEPROM fixes
         - BACO/BOCO fixes
         - Fix a memory leak in an error path
         - Freesync fix
         - VCN harvesting fix
         - Display fixes
      
        i915:
         - GVT fix for Windows VM hang.
         - Display fix of 12 BPC bits for display 12 and newer.
         - Don't try to access some media register for fused off domains.
         - Fix kerneldoc build warnings.
      
        mediatek:
         - Fix dpi bridge bug.
         - Fix cursor plane no update.
      
        meson:
         - Fix colors when booting with HDR"
      
      * tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm:
        drm/doc/rfc: drop lmem uapi section
        drm/i915: Only access SFC_DONE when media domain is not fused off
        drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg
        drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work
        drm/amd/display: Remove invalid assert for ODM + MPC case
        drm/amd/pm: bug fix for the runtime pm BACO
        drm/amdgpu: handle VCN instances when harvesting (v2)
        drm/meson: fix colour distortion from HDR set during vendor u-boot
        drm/i915/gvt: Fix cached atomics setting for Windows VM
        drm/amdgpu: Add preferred mode in modeset when freesync video mode's enabled.
        drm/amd/pm: Fix a memory leak in an error handling path in 'vangogh_tables_init()'
        drm/amdgpu: don't enable baco on boco platforms in runpm
        drm/amdgpu: set RAS EEPROM address from VBIOS
        drm/amd/pm: update smu v13.0.1 firmware header
        drm/mediatek: Fix cursor plane no update
        drm/mediatek: mtk-dpi: Set out_fmt from config if not the last bridge
        drm/mediatek: dpi: Fix NULL dereference in mtk_dpi_bridge_atomic_check
  5. 12 Aug, 2021 15 commits