1. 08 Jul, 2019 3 commits
    • Andrii Nakryiko's avatar
      selftests/bpf: test perf buffer API · ee5cf82c
      Andrii Nakryiko authored
      Add test verifying perf buffer API functionality.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      ee5cf82c
    • Andrii Nakryiko's avatar
      libbpf: auto-set PERF_EVENT_ARRAY size to number of CPUs · d7ff34d5
      Andrii Nakryiko authored
      For BPF_MAP_TYPE_PERF_EVENT_ARRAY typically correct size is number of
      possible CPUs. This is impossible to specify at compilation time. This
      change adds automatic setting of PERF_EVENT_ARRAY size to number of
      system CPUs, unless non-zero size is specified explicitly. This allows
      to adjust size for advanced specific cases, while providing convenient
      and logical defaults.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      d7ff34d5
    • Andrii Nakryiko's avatar
      libbpf: add perf buffer API · fb84b822
      Andrii Nakryiko authored
      BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program
      to user space for additional processing. libbpf already has very low-level API
      to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to
      use and requires a lot of code to set everything up. This patch adds
      perf_buffer abstraction on top of it, abstracting setting up and polling
      per-CPU logic into simple and convenient API, similar to what BCC provides.
      
      perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF
      map entries. It accepts two user-provided callbacks: one for handling raw
      samples and one for get notifications of lost samples due to buffer overflow.
      
      perf_buffer__new_raw() is similar, but provides more control over how
      perf events are set up (by accepting user-provided perf_event_attr), how
      they are handled (perf_event_header pointer is passed directly to
      user-provided callback), and on which CPUs ring buffers are created
      (it's possible to provide a list of CPUs and corresponding map keys to
      update). This API allows advanced users fuller control.
      
      perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
      utilizing epoll instance.
      
      perf_buffer__free() does corresponding clean up and unsets FDs from BPF map.
      
      All APIs are not thread-safe. User should ensure proper locking/coordination if
      used in multi-threaded set up.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      fb84b822
  2. 05 Jul, 2019 20 commits
  3. 04 Jul, 2019 5 commits
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · c4cde580
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2019-07-03
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      There is a minor merge conflict in mlx5 due to 8960b389 ("linux/dim:
      Rename externally used net_dim members") which has been pulled into your
      tree in the meantime, but resolution seems not that bad ... getting current
      bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed
      just so he's aware of the resolution below:
      
      ** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
      
        <<<<<<< HEAD
        static int mlx5e_open_cq(struct mlx5e_channel *c,
                                 struct dim_cq_moder moder,
                                 struct mlx5e_cq_param *param,
                                 struct mlx5e_cq *cq)
        =======
        int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder,
                          struct mlx5e_cq_param *param, struct mlx5e_cq *cq)
        >>>>>>> e5a3e259
      
      Resolution is to take the second chunk and rename net_dim_cq_moder into
      dim_cq_moder. Also the signature for mlx5e_open_cq() in ...
      
        drivers/net/ethernet/mellanox/mlx5/core/en.h +977
      
      ... and in mlx5e_open_xsk() ...
      
        drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64
      
      ... needs the same rename from net_dim_cq_moder into dim_cq_moder.
      
      ** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
      
        <<<<<<< HEAD
                int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix));
                struct dim_cq_moder icocq_moder = {0, 0};
                struct net_device *netdev = priv->netdev;
                struct mlx5e_channel *c;
                unsigned int irq;
        =======
                struct net_dim_cq_moder icocq_moder = {0, 0};
        >>>>>>> e5a3e259
      
      Take the second chunk and rename net_dim_cq_moder into dim_cq_moder
      as well.
      
      Let me know if you run into any issues. Anyway, the main changes are:
      
      1) Long-awaited AF_XDP support for mlx5e driver, from Maxim.
      
      2) Addition of two new per-cgroup BPF hooks for getsockopt and
         setsockopt along with a new sockopt program type which allows more
         fine-grained pass/reject settings for containers. Also add a sock_ops
         callback that can be selectively enabled on a per-socket basis and is
         executed for every RTT to help tracking TCP statistics, both features
         from Stanislav.
      
      3) Follow-up fix from loops in precision tracking which was not propagating
         precision marks and as a result verifier assumed that some branches were
         not taken and therefore wrongly removed as dead code, from Alexei.
      
      4) Fix BPF cgroup release synchronization race which could lead to a
         double-free if a leaf's cgroup_bpf object is released and a new BPF
         program is attached to the one of ancestor cgroups in parallel, from Roman.
      
      5) Support for bulking XDP_TX on veth devices which improves performance
         in some cases by around 9%, from Toshiaki.
      
      6) Allow for lookups into BPF devmap and improve feedback when calling into
         bpf_redirect_map() as lookup is now performed right away in the helper
         itself, from Toke.
      
      7) Add support for fq's Earliest Departure Time to the Host Bandwidth
         Manager (HBM) sample BPF program, from Lawrence.
      
      8) Various cleanups and minor fixes all over the place from many others.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c4cde580
    • René van Dorst's avatar
      net: ethernet: mediatek: Fix overlapping capability bits. · e2c74694
      René van Dorst authored
      Both MTK_TRGMII_MT7621_CLK and MTK_PATH_BIT are defined as bit 10.
      
      This can causes issues on non-MT7621 devices which has the
      MTK_PATH_BIT(MTK_ETH_PATH_GMAC1_RGMII) and MTK_TRGMII capability set.
      The wrong TRGMII setup code can be executed. The current wrongly executed
      code doesn’t do any harm on MT7623 and the TRGMII setup for the MT7623
      SOC side is done in MT7530 driver So it wasn’t noticed in the test.
      
      Move all capability bits in one enum so that they are all unique and easy
      to expand in the future.
      
      Because mtk_eth_path enum is merged in to mkt_eth_capabilities, the
      variable path value is no longer between 0 to number of paths,
      mtk_eth_path_name can’t be used anymore in this form. Convert the
      mtk_eth_path_name array to a function to lookup the pathname.
      
      The old code walked thru the mtk_eth_path enum, which is also merged
      with mkt_eth_capabilities. Expand array mtk_eth_muxc so it can store the
      name and capability bit of the mux. Convert the code so it can walk thru
      the mtk_eth_muxc array.
      
      Fixes: 8efaa653 ("net: ethernet: mediatek: Add MT7621 TRGMII mode support")
      Signed-off-by: default avatarRené van Dorst <opensource@vdorst.com>
      
      v1->v2:
      - Move all capability bits in one enum, suggested by Willem de Bruijn
      - Convert the mtk_eth_path_name array to a function to lookup the pathname
      - Expand array mtk_eth_muxc so it can also store the name and capability
        bit of the mux
      - Updated commit message
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e2c74694
    • Weifeng Voon's avatar
      net: stmmac: Enable dwmac4 jumbo frame more than 8KiB · c3efed5a
      Weifeng Voon authored
      Enable GMAC v4.xx and beyond to support 16KiB buffer.
      Signed-off-by: default avatarWeifeng Voon <weifeng.voon@intel.com>
      Signed-off-by: default avatarOng Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c3efed5a
    • Vincent Bernat's avatar
      bonding: add an option to specify a delay between peer notifications · 07a4ddec
      Vincent Bernat authored
      Currently, gratuitous ARP/ND packets are sent every `miimon'
      milliseconds. This commit allows a user to specify a custom delay
      through a new option, `peer_notif_delay'.
      
      Like for `updelay' and `downdelay', this delay should be a multiple of
      `miimon' to avoid managing an additional work queue. The configuration
      logic is copied from `updelay' and `downdelay'. However, the default
      value cannot be set using a module parameter: Netlink or sysfs should
      be used to configure this feature.
      
      When setting `miimon' to 100 and `peer_notif_delay' to 500, we can
      observe the 500 ms delay is respected:
      
          20:30:19.354693 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:19.874892 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:20.394919 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:20.914963 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
      
      In bond_mii_monitor(), I have tried to keep the lock logic readable.
      The change is due to the fact we cannot rely on a notification to
      lower the value of `bond->send_peer_notif' as `NETDEV_NOTIFY_PEERS' is
      only triggered once every N times, while we need to decrement the
      counter each time.
      
      iproute2 also needs to be updated to be able to specify this new
      attribute through `ip link'.
      Signed-off-by: default avatarVincent Bernat <vincent@bernat.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      07a4ddec
    • Colin Ian King's avatar
      net: ethernet: sun: remove redundant assignment to variable err · 2368a870
      Colin Ian King authored
      The variable err is being assigned with a value that is never
      read and it is being updated in the next statement with a new value.
      The assignment is redundant and can be removed.
      
      Addresses-Coverity: ("Unused value")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2368a870
  4. 03 Jul, 2019 12 commits