1. 13 Apr, 2022 40 commits
    • net: ipv6: add skb drop reasons to ip6_rcv_core() · 4daf841a
      Menglong Dong authored
      Replace kfree_skb() used in ip6_rcv_core() with kfree_skb_reason().
      No new drop reasons are added.
      
      It seems we now use 'SKB_DROP_REASON_IP_INHDR' for too many cases during
      IPv6 header parsing and checking, just as 'IPSTATS_MIB_INHDRERRORS'
      does. Is it too general to tell what happened?
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: add skb drop reasons to TLV parse · 7d9dbdfb
      Menglong Dong authored
      Replace kfree_skb() used in TLV encoded option header parsing with
      kfree_skb_reason(). The following functions are involved:
      
      ip6_parse_tlv()
      ipv6_hop_ra()
      ipv6_hop_ioam()
      ipv6_hop_jumbo()
      ipv6_hop_calipso()
      ipv6_dest_hao()
      
      Most skb drops during this process are regarded as 'InHdrErrors',
      as 'IPSTATS_MIB_INHDRERRORS' is used when ip6_parse_tlv() fails,
      so we use 'SKB_DROP_REASON_IP_INHDR' correspondingly.
      
      However, 'IP_INHDR' is a relatively general reason. Therefore, we
      can use other reasons with higher priority in some cases. For example,
      'SKB_DROP_REASON_UNHANDLED_PROTO' is used for unknown TLV options.
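
      As a rough illustration of the reason selection described above, here
      is a minimal userspace sketch. The enum values, the parse_tlv() name and
      the simplified rule that an unknown option may only be skipped when the
      two high-order bits of its type are zero are assumptions of this sketch,
      not the kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's enum skb_drop_reason values. */
enum drop_reason {
	DROP_NONE = 0,        /* option area accepted */
	DROP_IP_INHDR,        /* malformed header, the general reason */
	DROP_UNHANDLED_PROTO, /* unknown option that must not be skipped */
};

#define TLV_PAD1 0
#define TLV_PADN 1

/* Walk a TLV-encoded option area and report why it would be dropped. */
enum drop_reason parse_tlv(const uint8_t *opts, size_t len)
{
	size_t off = 0;

	while (off < len) {
		uint8_t type = opts[off];
		size_t optlen;

		if (type == TLV_PAD1) {  /* single byte, no length field */
			off++;
			continue;
		}
		if (off + 2 > len)       /* length byte is missing */
			return DROP_IP_INHDR;
		optlen = (size_t)opts[off + 1] + 2;
		if (off + optlen > len)  /* option overruns the header */
			return DROP_IP_INHDR;

		/* This sketch has no option handlers, so everything except
		 * padding is "unknown". Prefer the more specific reason when
		 * the option type forbids skipping it.
		 */
		if (type != TLV_PADN && (type >> 6) != 0)
			return DROP_UNHANDLED_PROTO;

		off += optlen;
	}
	return DROP_NONE;
}
```

      The general reason stays the fallback for malformed input, and the more
      specific reason is only returned where the cause is actually known.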
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: remove redundant statistics in ipv6_hop_jumbo() · bba98083
      Menglong Dong authored
      There are two call chains for ipv6_hop_jumbo(). The first one is:
      
      ipv6_destopt_rcv() -> ip6_parse_tlv() -> ipv6_hop_jumbo()
      
      On this call chain, the drop statistics will be done in
      ipv6_destopt_rcv() with 'IPSTATS_MIB_INHDRERRORS' if ipv6_hop_jumbo()
      returns false.
      
      The second call chain is:
      
      ip6_rcv_core() -> ipv6_parse_hopopts() -> ip6_parse_tlv() -> ipv6_hop_jumbo()
      
      And the drop statistics will also be done in ip6_rcv_core() with
      'IPSTATS_MIB_INHDRERRORS' if ipv6_hop_jumbo() returns false.
      
      Therefore, the statistics in ipv6_hop_jumbo() are redundant, which
      means the drop is counted twice. The statistics in ipv6_hop_jumbo()
      are almost the same as those in the callers, except for
      'IPSTATS_MIB_INTRUNCATEDPKTS', which it seems we have to ignore.
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: icmp: introduce function icmpv6_param_prob_reason() · 1ad6d548
      Menglong Dong authored
      In order to add the skb drop reasons support to icmpv6_param_prob(),
      introduce the function icmpv6_param_prob_reason() and make
      icmpv6_param_prob() an inline call to it. This new function will be
      used in the following patches.
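
      The wrapper pattern described above can be sketched in userspace as
      follows. The struct packet type, the enum values and the function names
      without the icmpv6_ prefix are hypothetical stand-ins; the real function
      also sends an ICMPv6 error, which this sketch leaves out.

```c
#include <stdint.h>

/* Hypothetical subset of drop reasons for this sketch. */
enum drop_reason {
	DROP_NOT_SPECIFIED = 0,
	DROP_IP_INHDR,
};

/* Hypothetical stand-in for an skb: it only records the drop reason. */
struct packet {
	enum drop_reason dropped_as;
};

/* New entry point: the caller says why the packet is being dropped. */
void param_prob_reason(struct packet *pkt, uint8_t code, int pos,
		       enum drop_reason reason)
{
	(void)code; /* a real implementation would report code and pos */
	(void)pos;  /* back to the sender in an ICMPv6 error */
	pkt->dropped_as = reason;
}

/* Old entry point kept as an inline wrapper, so existing callers keep
 * working and simply report an unspecified reason.
 */
static inline void param_prob(struct packet *pkt, uint8_t code, int pos)
{
	param_prob_reason(pkt, code, pos, DROP_NOT_SPECIFIED);
}
```

      Keeping the old name as a thin wrapper lets callers be converted one at
      a time.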
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip: add skb drop reasons to ip forwarding · 2edc1a38
      Menglong Dong authored
      Replace kfree_skb() which is used in ip6_forward() and ip_forward()
      with kfree_skb_reason().
      
      The new drop reason 'SKB_DROP_REASON_PKT_TOO_BIG' is introduced for
      the case where the packet length exceeds the MTU and the packet
      cannot be fragmented.
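
      A simplified sketch of that condition, assuming a hypothetical
      forward_check() helper and a may_fragment flag supplied by the caller
      (for IPv4 this would depend on the DF bit; the real forwarding paths
      check more than this):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of drop reasons for this sketch. */
enum drop_reason {
	DROP_NONE = 0,
	DROP_PKT_TOO_BIG,
};

/* Decide whether a packet of pkt_len bytes can leave via a link with
 * the given MTU. Only an oversized packet that may not be fragmented
 * is dropped.
 */
enum drop_reason forward_check(size_t pkt_len, size_t mtu, bool may_fragment)
{
	if (pkt_len <= mtu)
		return DROP_NONE;
	if (may_fragment)
		return DROP_NONE;
	return DROP_PKT_TOO_BIG;
}
```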
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: add skb drop reasons to ip6_pkt_drop() · 3ae42cc8
      Menglong Dong authored
      Replace kfree_skb() used in ip6_pkt_drop() with kfree_skb_reason().
      No new reason is added.
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: add skb drop reasons to ip_error() · c4eb6641
      Menglong Dong authored
      Eventually, I found the handler function for input route lookup
      failures: ip_error().
      
      The drop reasons used in ip_error() correspond closely to
      IPSTATS_MIB_*, and the following new reasons are introduced:
      
      SKB_DROP_REASON_IP_INADDRERRORS
      SKB_DROP_REASON_IP_INNOROUTES
      
      Wouldn't the names SKB_DROP_REASON_IP_HOSTUNREACH and
      SKB_DROP_REASON_IP_NETUNREACH be more accurate? To keep them
      corresponding to IPSTATS_MIB_*, we keep the names as they are.
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skb: add some helpers for skb drop reasons · d6d3146c
      Menglong Dong authored
      In order to simplify the definition and assignment for
      'enum skb_drop_reason', introduce some helpers.
      
      SKB_DR() is used to define a variable of type 'enum skb_drop_reason'
      with the 'SKB_DROP_REASON_NOT_SPECIFIED' initial value.
      
      SKB_DR_SET() is used to set the value of the variable. It may seem
      a little useless, but it makes the code shorter.
      
      SKB_DR_OR() is used to set the value of the variable if it is not set
      yet, which means its value is SKB_DROP_REASON_NOT_SPECIFIED.
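
      A userspace sketch of how the three helpers described above could be
      written and used. The macro bodies follow the descriptions in this
      message; the reduced enum and the classify() function are hypothetical
      and exist only to exercise the macros.

```c
/* Hypothetical subset of enum skb_drop_reason for a userspace sketch. */
enum skb_drop_reason {
	SKB_DROP_REASON_NOT_SPECIFIED,
	SKB_DROP_REASON_IP_INHDR,
	SKB_DROP_REASON_PKT_TOO_BIG,
};

/* Define a reason variable that starts out as "not specified". */
#define SKB_DR(name) \
	enum skb_drop_reason name = SKB_DROP_REASON_NOT_SPECIFIED

/* Set the variable unconditionally; the prefix is pasted on. */
#define SKB_DR_SET(name, reason) \
	(name = SKB_DROP_REASON_##reason)

/* Set the variable only if no reason has been recorded yet. */
#define SKB_DR_OR(name, reason)					\
	do {							\
		if (name == SKB_DROP_REASON_NOT_SPECIFIED)	\
			SKB_DR_SET(name, reason);		\
	} while (0)

/* Hypothetical caller: a specific reason wins over the general one. */
enum skb_drop_reason classify(int too_big, int bad_header)
{
	SKB_DR(reason);

	if (too_big)
		SKB_DR_SET(reason, PKT_TOO_BIG);
	if (bad_header)
		SKB_DR_OR(reason, IP_INHDR);
	return reason;
}
```

      SKB_DR_OR() is what lets a later, more general check avoid overwriting
      a reason that an earlier check already recorded.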
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'octeon_ep-driver' · dba47afd
      David S. Miller authored
      Veerasenareddy Burru says:
      
      ====================
      Add octeon_ep driver
      
      This driver implements networking functionality of Marvell's Octeon
      PCI Endpoint NIC.
      
      This driver supports the following devices:
       * Network controller: Cavium, Inc. Device b200
      
      V4 -> V5:
         - Fix warnings reported by clang.
         - Address comments from community reviews.
      
      V3 -> V4:
         - Fix warnings and errors reported by "make W=1 C=1".
      
      V2 -> V3:
         - Fix warnings and errors reported by kernel test robot:
           "Reported-by: kernel test robot <lkp@intel.com>"
      
      V1 -> V2:
          - Address review comments on original patch series.
          - Divide PATCH 1/4 from the original series into 4 patches in
            v2 patch series: PATCH 1/7 to PATCH 4/7.
          - Fix clang build errors.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: add ethtool support for Octeon PCI Endpoint NIC · 5cc256e7
      Veerasenareddy Burru authored
      Add support for the following ethtool commands:
      
      ethtool -i|--driver devname
      ethtool devname
      ethtool -s devname [speed N] [autoneg on|off] [advertise N]
      ethtool -S|--statistics devname
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: add Tx/Rx processing and interrupt support · 37d79d05
      Veerasenareddy Burru authored
      Add support to enable MSI-X and register interrupts.
      Add support to process Tx and Rx traffic. Includes processing
      Tx completions and Rx refill.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: add support for ndo ops · 6a610a46
      Veerasenareddy Burru authored
      Add support for ndo ops to set MAC address, change MTU, get stats.
      Add control path support to set MAC address, change MTU, get stats,
      set speed, get and set link mode.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: add Tx/Rx ring resource setup and cleanup · 397dfb57
      Veerasenareddy Burru authored
      Implement Tx/Rx ring resource allocation and cleanup.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: Add mailbox for control commands · 4ca2fbdd
      Veerasenareddy Burru authored
      Add mailbox between host and NIC to send control commands from host to
      NIC and receive responses and notifications from NIC to host driver,
      like link status update.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: add hardware configuration APIs · 1f2c2d0c
      Veerasenareddy Burru authored
      Implement hardware resource init and shutdown helper APIs.
      This includes hardware Tx/Rx queue init/enable/disable/reset,
      non queue interrupt handler that decodes non-queue interrupt type.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeon_ep: Add driver framework and device initialization · 862cd659
      Veerasenareddy Burru authored
      Add driver framework and device setup and initialization for Octeon
      PCI Endpoint NIC.
      
      Add implementation to load module, initialize, register network device,
      cleanup and unload module.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
      Signed-off-by: Satananda Burla <sburla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'br-flush-filtering' · 92716869
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: add flush filtering support
      
      This patch-set adds support to specify filtering conditions for a bulk
      delete (flush) operation. This version uses a new nlmsghdr delete flag
      called NLM_F_BULK in combination with a new ndo_fdb_del_bulk op which is
      used to signal that the driver supports bulk deletes (that avoids
      pushing common mac address checks to ndo_fdb_del implementations and
      also has a different prototype and parsed attribute expectations, more
      info in patch 03). The new delete flag can be used for any RTM_DEL*
      type, implementations just need to be careful with older kernels which
      are doing non-strict attribute parses. A new rtnl flag
      (RTNL_FLAG_BULK_DEL_SUPPORTED) is used to show that the delete supports
      NLM_F_BULK. A proper error is returned if bulk delete is not supported.
      For old kernels I use the fact that mac address attribute (lladdr) is
      mandatory in the classic fdb del case, but it's not allowed if bulk
      deleting so older kernels will error out.
      
      Patch 01 and 02 are minor rtnetlink cleanups to make the code easier to
      read. They remove hardcoded values and use names instead. Patch 03 uses
      BIT() for rtnl flags.
      Patch 04 adds the new NLM_F_BULK delete request modifier, patch 05 adds
      the new bulk delete flag and checks for it if the delete requests have
      NLM_F_BULK set, it also warns if rtnl register is called with a non-delete
      kind and the bulk delete flag is set.
      Patch 06 adds the new ndo_fdb_del_bulk call. Patch 07 adds NLM_F_BULK
      support to rtnl_fdb_del, on such request strict parsing is used only for
      the supported attributes, and if the ndo is implemented it's called, the
      NTF_SELF/MASTER rules are the same as for the standard rtnl_fdb_del.
      Patch 08 implements bridge-specific minimal ndo_fdb_del_bulk call which
      uses the current br_fdb_flush to delete all entries. Patch 09 adds
      filtering support to the new bridge flush op which supports target
      ifindex (port or bridge), vlan id and flags/state mask. Patch 10 adds
      ndm state and flags mask attributes which will be used for filtering.
      Patch 11 converts ndm state/flags and their masks to bridge-private flags
      and fills them in the filter descriptor for matching. Finally patch 12
      fills in the target ifindex (after validating it) and vlan id (already
      validated by rtnl_fdb_flush) for matching. Flush filtering is needed
      because user-space applications need a quick way to delete only a
      specific set of entries, e.g. mlag implementations need a way to flush only
      dynamic entries excluding externally learned ones or only externally
      learned ones without static entries etc. Also apps usually want to target
      only a specific vlan or port/vlan combination. The current 2 flush
      operations (per port and bridge-wide) are not extensible and cannot
      provide such filtering.
      
      I decided against embedding new attrs into the old flush attributes for
      multiple reasons - proper error handling on unsupported attributes,
      older kernels silently flushing all, need for a second mechanism to
      signal that the attribute should be parsed (e.g. using boolopts),
      special treatment for permanent entries.
      
      Examples:
      $ bridge fdb flush dev bridge vlan 100 static
      < flush all static entries on vlan 100 >
      $ bridge fdb flush dev bridge vlan 1 dynamic
      < flush all dynamic entries on vlan 1 >
      $ bridge fdb flush dev bridge port ens16 vlan 1 dynamic
      < flush all dynamic entries on port ens16 and vlan 1 >
      $ bridge fdb flush dev ens16 vlan 1 dynamic master
      < as above: flush all dynamic entries on port ens16 and vlan 1 >
      $ bridge fdb flush dev bridge nooffloaded nopermanent self
      < flush all non-offloaded and non-permanent entries >
      $ bridge fdb flush dev bridge static noextern_learn
      < flush all static entries which are not externally learned >
      $ bridge fdb flush dev bridge permanent
      < flush all permanent entries >
      $ bridge fdb flush dev bridge port bridge permanent
      < flush all permanent entries pointing to the bridge itself >
      
      Example of a flush call with unsupported netlink attribute (NDA_DST):
      $ bridge fdb flush dev bridge vlan 100 dynamic dst
      Error: Unsupported attribute.
      
      Example of a flush call on an older kernel:
      $ bridge fdb flush dev bridge dynamic
      Error: invalid address.
      
      Example of calling PF_UNSPEC RTM_DELNEIGH which doesn't support bulk delete
      with NLM_F_BULK set (ip neigh is changed to add the flag):
      $ ip n del 192.168.122.5 lladdr 00:11:22:33:44:55 dev ens3
      Error: Bulk delete is not supported.
      
      Note that all flags have their negated version (static vs nostatic etc)
      and there are some tricky cases to handle like "static" which in flag
      terms means fdbs that have NUD_NOARP but *not* NUD_PERMANENT, so the
      mask matches on both but we need only NUD_NOARP to be set. That's
      because permanent entries have both set so we can't just match on
      NUD_NOARP. Also note that this flush operation doesn't treat permanent
      entries in a special way (fdb_delete vs fdb_delete_local), it will
      delete them regardless if any port is using them. We can extend the api
      with a flag to do that if needed in the future.
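
      The "static" case above can be illustrated with a small mask/value
      comparison. This is a userspace sketch: the NUD_* values are copied from
      the uapi neighbour header as I understand them, and state_matches() and
      is_static() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Neighbour state bits, assumed to match linux/neighbour.h. */
#define NUD_NOARP     0x40
#define NUD_PERMANENT 0x80

/* True if the bits selected by mask have exactly the wanted values. */
bool state_matches(uint16_t state, uint16_t mask, uint16_t want)
{
	return (state & mask) == (want & mask);
}

/* "static": NUD_NOARP set and NUD_PERMANENT clear. The mask covers
 * both bits because permanent entries also have NUD_NOARP set.
 */
bool is_static(uint16_t state)
{
	return state_matches(state, NUD_NOARP | NUD_PERMANENT, NUD_NOARP);
}
```

      Matching on NUD_NOARP alone would also select permanent entries, which
      is why both bits are in the mask while only one is in the value.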
      
      Patch-sets (in order):
       - Initial bulk del infra and fdb flush filtering (this set)
       - iproute2 support
       - selftests
      
      v4: Add and check for rtnl del bulk supported flag when using
          NLM_F_BULK (new patch 05), patches 01 - 03 are also new minor cleanups
          to remove use of raw values and make code easier to read, don't
          rename br_fdb_flush in patch 08, set port ifindex as flush target if
          NDA_IFINDEX is missing and flush was called with port netdev and
          NTF_MASTER (patch 12).
      
      v3: Add NLM_F_BULK delete modifier and ndo_fdb_del_bulk callback,
          patches 01 - 03 and 06 are new. Patch 04 is changed to implement
          bulk_del instead of flush, patches 05, 07 and 08 are adjusted to
          use NDA_ attributes
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fdb: add support for flush filtering based on ifindex and vlan · 0dbe886a
      Nikolay Aleksandrov authored
      Add support for fdb flush filtering based on destination ifindex and
      vlan id. The ifindex must either match a port's device ifindex or the
      bridge's. The vlan support is trivial since it's already validated by
      rtnl_fdb_del, we just need to fill it in.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fdb: add support for flush filtering based on ndm flags and state · 564445fb
      Nikolay Aleksandrov authored
      Add support for fdb flush filtering based on ndm flags and state. NDM
      state and flags are mapped to bridge-specific flags and matched
      according to the specified masks. NTF_USE is used to represent
      added_by_user flag since it sets it on fdb add and we don't have a 1:1
      mapping for it. Only allowed bits can be set, NTF_SELF and NTF_MASTER are
      ignored.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: add ndm flags and state mask attributes · ea2c0f9e
      Nikolay Aleksandrov authored
      Add ndm flags/state masks which will be used for bulk delete filtering.
      All of these are used by the bridge and vxlan drivers. Also minimal attr
      policy validation is added, it is up to ndo_fdb_del_bulk implementers to
      further validate them.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fdb: add support for fine-grained flushing · 1f78ee14
      Nikolay Aleksandrov authored
      Add the ability to specify exactly which fdbs to be flushed. They are
      described by a new structure - net_bridge_fdb_flush_desc. Currently it
      can match on port/bridge ifindex, vlan id and fdb flags. It is used to
      describe the existing dynamic fdb flush operation. Note that this flush
      operation doesn't treat permanent entries in a special way (fdb_delete vs
      fdb_delete_local), it will delete them regardless if any port is using
      them, so currently it can't directly replace deletes which need to handle
      that case, although we can extend it later for that too.
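
      To make the matching rules concrete, here is a userspace sketch of a
      flush descriptor and its match function. The struct layouts, the flag
      bit values and the convention that a zero ifindex or vlan id means
      "match any" are assumptions of this sketch and may differ from
      net_bridge_fdb_flush_desc in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bridge-private fdb flag bits. */
#define FDB_FLAG_STATIC    0x1UL
#define FDB_FLAG_OFFLOADED 0x2UL

/* Hypothetical, reduced view of an fdb entry. */
struct fdb_entry {
	unsigned long flags;
	int port_ifindex;
	uint16_t vlan_id;
};

/* Hypothetical flush descriptor: which entries should be deleted. */
struct flush_desc {
	unsigned long flags;      /* wanted values of the masked flags */
	unsigned long flags_mask; /* which flags take part in the match */
	int port_ifindex;         /* 0: any port */
	uint16_t vlan_id;         /* 0: any vlan */
};

/* True if the entry is selected by the descriptor. */
bool flush_matches(const struct fdb_entry *f, const struct flush_desc *d)
{
	if ((f->flags & d->flags_mask) != d->flags)
		return false;
	if (d->port_ifindex && f->port_ifindex != d->port_ifindex)
		return false;
	if (d->vlan_id && f->vlan_id != d->vlan_id)
		return false;
	return true;
}
```

      An all-zero descriptor selects every entry, which corresponds to the
      flush-everything behaviour of the minimal bulk delete.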
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fdb: add ndo_fdb_del_bulk · edaef191
      Nikolay Aleksandrov authored
      Add a minimal ndo_fdb_del_bulk implementation which flushes all entries.
      Support for more fine-grained filtering will be added in the following
      patches.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_del · 9e834259
      Nikolay Aleksandrov authored
      When NLM_F_BULK is specified in a fdb del message we need to handle it
      differently. First since this is a new call we can strictly validate the
      passed attributes, at first only ifindex and vlan are allowed as these
      will be the initially supported filter attributes, any other attribute
      is rejected. The mac address is no longer mandatory, but we use it
      to error out in older kernels because it cannot be specified with bulk
      request (the attribute is not allowed) and then we have to dispatch
      the call to ndo_fdb_del_bulk if the device supports it. The del bulk
      callback can do further validation of the attributes if necessary.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add ndo_fdb_del_bulk · 1306d536
      Nikolay Aleksandrov authored
      Add a new netdev op called ndo_fdb_del_bulk, it will be later used for
      driver-specific bulk delete implementation dispatched from rtnetlink. The
      first user will be the bridge, we need it to signal to rtnetlink from
      the driver that we support bulk delete operation (NLM_F_BULK).
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: add bulk delete support flag · a6cec0bc
      Nikolay Aleksandrov authored
      Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to
      verify that the delete operation allows bulk object deletion. Also emit
      a warning if anyone tries to set it for non-delete kind.
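
      A userspace sketch of the check described above, combined with the
      kind extraction (mask 0x3) used elsewhere in this series. The constant
      values are assumptions: NLM_F_BULK is taken to be 0x200, the bit chosen
      for RTNL_FLAG_BULK_DEL_SUPPORTED is arbitrary here, and check_bulk() is
      a hypothetical helper name.

```c
#include <errno.h>

#define BIT(n) (1U << (n))

/* Assumed values for this sketch, see the note above. */
#define NLM_F_BULK                   0x200
#define RTNL_FLAG_BULK_DEL_SUPPORTED BIT(1)

/* rtnetlink message types come in groups of four: NEW, DEL, GET, SET. */
enum rtnl_kinds {
	RTNL_KIND_NEW,
	RTNL_KIND_DEL,
	RTNL_KIND_GET,
	RTNL_KIND_SET,
};
#define RTNL_KIND_MASK 0x3

static inline enum rtnl_kinds rtnl_msgtype_kind(int msgtype)
{
	return msgtype & RTNL_KIND_MASK;
}

/* Return 0 if the request may proceed, -EOPNOTSUPP if a bulk delete was
 * requested from a handler that did not declare support for it.
 */
int check_bulk(int msgtype, unsigned int nlmsg_flags,
	       unsigned int handler_flags)
{
	/* Modifier bits are interpreted per request kind, so the bulk
	 * bit is only meaningful on delete requests.
	 */
	if (rtnl_msgtype_kind(msgtype) != RTNL_KIND_DEL)
		return 0;
	if (!(nlmsg_flags & NLM_F_BULK))
		return 0;
	if (!(handler_flags & RTNL_FLAG_BULK_DEL_SUPPORTED))
		return -EOPNOTSUPP;
	return 0;
}
```

      For example, with RTM_DELNEIGH assumed to be 29, check_bulk(29,
      NLM_F_BULK, 0) returns -EOPNOTSUPP, while the same request against a
      handler registered with the support flag returns 0.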
      Suggested-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: netlink: add NLM_F_BULK delete request modifier · 545528d7
      Nikolay Aleksandrov authored
      Add a new delete request modifier called NLM_F_BULK which, when
      supported, would cause the request to delete multiple objects. The flag
      is a convenient way to signal that a multiple delete operation is
      requested which can be gradually added to different delete requests. In
      order to make sure older kernels will error out if the operation is not
      supported instead of doing something unintended we have to break a
      required condition when implementing support for this flag, f.e. for
      neighbors we will omit the mandatory mac address attribute.
      Initially it will be used to add flush with filtering support for bridge
      fdbs, but it also opens the door to add similar support to others.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: use BIT for flag values · 0569e31f
      Nikolay Aleksandrov authored
      Use BIT to define flag values.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: add helper to extract msg type's kind · 2e9ea3e3
      Nikolay Aleksandrov authored
      Add a helper which extracts the msg type's kind using the kind mask (0x3).
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rtnetlink: add msg kind names · 12dc5c2c
      Nikolay Aleksandrov authored
      Add rtnl kind names instead of using raw values. We'll need to
      check for DEL kind later to validate bulk flag support.
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-ti-storm-prevention-support' · ae10162c
      David S. Miller authored
      Grygorii Strashko says:
      
      ====================
      net: ethernet: ti: enable bc/mc storm prevention support
      
      This series first adds support for the ALE feature that rate limits the
      number of ingress broadcast (BC)/multicast (MC) packets per second, whose
      main purpose is BC/MC storm prevention.
      
      It then enables the corresponding ingress broadcast (BC)/multicast (MC)
      packet rate limiting in the TI CPSW switchdev and AM65x/J721E CPSW_NUSS
      drivers by implementing HW offload for simple tc-flower with policer
      action with matches on dst_mac/mask:
       - ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate
      limiting (exact match)
       - 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC
      packets rate limiting
      
      The CPSW supports MC/BC packets rate limiting in packets/sec and affects
      all ingress MC/BC packets and serves as BC/MC storm prevention feature.
      
      Examples:
      - BC rate limit to 1000pps:
        tc qdisc add dev eth0 clsact
        tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
        action police pkts_rate 1000 pkts_burst 1 drop
      
      - MC rate limit to 20000pps:
        tc qdisc add dev eth0 clsact
        tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
        action police pkts_rate 20000 pkts_burst 1 drop
      
        pkts_burst - not used.
      
      The solution was inspired by a patch from Vladimir Oltean [1].
      
      Changes in v3:
        - comments applied
        - policer validation added
      
      Changes in v2:
       - switch to packet-per-second policing introduced by
         commit 2ffe0395 ("net/sched: act_police: add support for packet-per-second policing") [2]
      
      v2: https://patchwork.kernel.org/project/netdevbpf/cover/20211101170122.19160-1-grygorii.strashko@ti.com/
      v1: https://patchwork.kernel.org/project/netdevbpf/cover/20201114035654.32658-1-grygorii.strashko@ti.com/
      
      [1] https://lore.kernel.org/patchwork/patch/1217254/
      [2] https://patchwork.kernel.org/project/netdevbpf/cover/20210312140831.23346-1-simon.horman@netronome.com/
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpsw_new: enable bc/mc storm prevention support · 127c9e97
      Grygorii Strashko authored
      This patch enables support for ingress broadcast(BC)/multicast(MC) packets
      rate limiting in TI CPSW switchdev driver (the corresponding ALE support
      was added in previous patch) by implementing HW offload for simple
      tc-flower with policer action with matches on dst_mac:
       - ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate
      limiting (exact match)
       - 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC
      packets rate limiting
      
      The CPSW supports MC/BC packets rate limiting in packets/sec and affects
      all ingress MC/BC packets and serves as BC/MC storm prevention feature.
      
      Examples:
      - BC rate limit to 1000pps:
        tc qdisc add dev eth0 clsact
        tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
        action police pkts_rate 1000 pkts_burst 1 drop
      
      - MC rate limit to 20000pps:
        tc qdisc add dev eth0 clsact
        tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
        action police pkts_rate 20000 pkts_burst 1 drop
      
        pkts_burst - not used.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: am65-cpsw: enable bc/mc storm prevention support · 5ec836be
      Grygorii Strashko authored
      This patch enables support for ingress broadcast(BC)/multicast(MC) packets
      rate limiting in TI AM65x CPSW driver (the corresponding ALE support was
      added in previous patch) by implementing HW offload for simple tc-flower
      with policer action with matches on dst_mac/mask:
       - ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate
      limiting (exact match)
       - 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC
      packets rate limiting
      
      The CPSW supports MC/BC packets rate limiting in packets/sec and affects
      all ingress MC/BC packets and serves as BC/MC storm prevention feature.
      
      Examples:
      - BC rate limit to 1000pps:
        tc qdisc add dev eth0 clsact
        tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
        action police pkts_rate 1000 pkts_burst 1 drop
      
      - MC rate limit to 20000pps:
        tc qdisc add dev eth0 clsact
        tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
        action police pkts_rate 20000 pkts_burst 1 drop
      
        pkts_burst - not used.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5ec836be
    • Grygorii Strashko's avatar
      drivers: net: cpsw: ale: add broadcast/multicast rate limit support · e3a5e33f
      Grygorii Strashko authored
      The CPSW ALE supports rate limiting the number of ingress
      broadcast (BC)/multicast (MC) packets per second, the main purpose of
      which is BC/MC storm prevention.
      
      The ALE BC/MC packet rate limit configuration consists of two parts:
      - global
        ALE_CONTROL.ENABLE_RATE_LIMIT bit 0, which enables rate limiting globally
        ALE_PRESCALE.PRESCALE, which specifies the rate limiting interval
      - per-port
        ALE_PORTCTLx.BCAST/MCAST_LIMIT, which specifies the number of BC/MC
        packets allowed per rate limiting interval.
        When port.BCAST/MCAST_LIMIT is 0, rate limiting is disabled for the port.
      
      When BC/MC packet rate limiting is enabled, the number of allowed packets
      per second is defined as:
        number_of_packets/sec = (Fclk / ALE_PRESCALE) * port.BCAST/MCAST_LIMIT
      
      Since the ALE_PRESCALE configuration is common to all ports, a 1ms
      interval is selected and configured during ALE initialization, while
      port.BCAST/MCAST_LIMIT is configured per port.
      This allows achieving:
       - min number_of_packets = 1000 when port.BCAST/MCAST_LIMIT = 1
       - max number_of_packets = 1000 * 255 = 255000
         when port.BCAST/MCAST_LIMIT = 0xFF
      
      The ALE_CONTROL.ENABLE_RATE_LIMIT can also be enabled once during ALE
      initialization, as rate limiting is then enabled per port by non-zero
      port.BCAST/MCAST_LIMIT values.
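
The arithmetic above can be sketched in a few lines of user-space C. This is only an illustration of the formula, not the driver code: the helper names and constants are hypothetical, and the 500 MHz clock in the usage note is an assumed value. The 1 ms interval and the 0xFF maximum come from the description above.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch; not taken from the driver. */
#define RATE_LIMIT_INTERVALS_PER_SEC 1000u /* 1 ms interval */
#define PORT_LIMIT_MAX               0xFFu /* largest per-port limit value */

/* Prescale value that yields a 1 ms interval for a given ALE clock. */
static uint32_t prescale_for_1ms(uint32_t fclk_hz)
{
	return fclk_hz / RATE_LIMIT_INTERVALS_PER_SEC;
}

/* number_of_packets/sec = (Fclk / ALE_PRESCALE) * limit */
static uint32_t limit_to_pps(uint32_t fclk_hz, uint32_t prescale, uint32_t limit)
{
	return (fclk_hz / prescale) * limit;
}

/*
 * Convert a requested packets/sec value into a per-port limit.
 * The request is truncated to a multiple of 1000 pps and clamped to 0xFF.
 * A result of 0 means rate limiting is disabled for the port.
 */
static uint32_t pps_to_limit(uint32_t pps)
{
	uint32_t limit = pps / RATE_LIMIT_INTERVALS_PER_SEC;

	return limit > PORT_LIMIT_MAX ? PORT_LIMIT_MAX : limit;
}
```

With an assumed Fclk of 500 MHz, prescale_for_1ms() gives 500000, and a limit of 1 or 0xFF then maps back to 1000 or 255000 packets/sec, matching the min/max figures above. Truncation is a choice made for this sketch; a request below 1000 pps becomes 0 and so disables limiting.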
      
      This patch implements the above logic in ALE and adds new ALE APIs
       cpsw_ale_rx_ratelimit_bc();
       cpsw_ale_rx_ratelimit_mc();
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e3a5e33f
    • Russell King (Oracle)'s avatar
      net: phylink: remove phylink_helper_basex_speed() · 1a95e04e
      Russell King (Oracle) authored
      As there are now no users of phylink_helper_basex_speed(), we can
      remove this obsolete functionality.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a95e04e
    • Dan Carpenter's avatar
      net: ethernet: mtk_eth_soc: use after free in __mtk_ppe_check_skb() · 17a5f6a7
      Dan Carpenter authored
      The __mtk_foe_entry_clear() function frees "entry" so we have to use
      the _safe() version of hlist_for_each_entry() to prevent a use after
      free.
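
The reason the _safe() iterator helps can be shown with a small user-space model. This is a sketch on a plain singly linked list with hypothetical names (struct node, list_push, list_remove_key), not the kernel hlist macros: the loop saves the next pointer before the current node may be freed, so advancing never reads freed memory.

```c
#include <stdlib.h>

struct node {
	int key;
	struct node *next;
};

/* Push a new node at the head; returns the new head (old head if malloc fails). */
static struct node *list_push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->key = key;
	n->next = head;
	return n;
}

/* Free every node whose key matches; returns the number of nodes freed. */
static int list_remove_key(struct node **head, int key)
{
	struct node **link = head;
	struct node *cur, *next;
	int freed = 0;

	for (cur = *head; cur; cur = next) {
		next = cur->next;	/* saved before cur may be freed */
		if (cur->key == key) {
			*link = next;	/* unlink first, then free */
			free(cur);
			freed++;
		} else {
			link = &cur->next;
		}
	}
	return freed;
}
```

Writing the loop step as cur = cur->next instead would read from a node that was just freed, which is the class of bug the commit describes.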
      
      Fixes: 33fc42de ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      17a5f6a7
    • Minghao Chi's avatar
      net: ethernet: ti: am65-cpsw-nuss: using pm_runtime_resume_and_get instead of pm_runtime_get_sync · 2240514c
      Minghao Chi authored
      Using pm_runtime_resume_and_get() is more appropriate
      because it simplifies the code.
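
Why the replacement simplifies callers can be illustrated with a user-space model. The mock_* names below are hypothetical stand-ins, not the kernel API. The model assumes the documented behaviour that pm_runtime_get_sync() leaves the usage counter incremented even when resume fails, while pm_runtime_resume_and_get() drops it again on failure.

```c
/* Hypothetical model of a device's runtime-PM state. */
struct mock_dev {
	int usage_count;	/* models the runtime-PM usage counter */
	int resume_err;		/* 0 on success, negative errno on failure */
};

/* Models get_sync: the counter is bumped whether or not resume succeeds. */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_err;
}

/* Models put_noidle: drop the counter without triggering an idle check. */
static void mock_put_noidle(struct mock_dev *dev)
{
	dev->usage_count--;
}

/* Models resume_and_get: on failure the counter is already balanced. */
static int mock_resume_and_get(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0) {
		mock_put_noidle(dev);
		return ret;
	}
	return 0;
}
```

In this model a caller of mock_get_sync() must remember to call mock_put_noidle() on its error path, whereas a caller of mock_resume_and_get() can return the error directly.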
      Reported-by: Zeal Robot <zealci@zte.com.cn>
      Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2240514c
    • Lin Ma's avatar
      NFC: NULL out the dev->rfkill to prevent UAF · 1b0e8141
      Lin Ma authored
      Commit 3e3b5dfc ("NFC: reorder the logic in nfc_{un,}register_device")
      assumes the device_is_registered() check in nfc_dev_up() will detect
      when the rfkill is unregistered. However, this check only takes
      effect once device_del(&dev->dev) is done in nfc_unregister_device().
      Hence, the rfkill object can still be dereferenced.
      
      The crash trace in latest kernel (5.18-rc2):
      
      [   68.760105] ==================================================================
      [   68.760330] BUG: KASAN: use-after-free in __lock_acquire+0x3ec1/0x6750
      [   68.760756] Read of size 8 at addr ffff888009c93018 by task fuzz/313
      [   68.760756]
      [   68.760756] CPU: 0 PID: 313 Comm: fuzz Not tainted 5.18.0-rc2 #4
      [   68.760756] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [   68.760756] Call Trace:
      [   68.760756]  <TASK>
      [   68.760756]  dump_stack_lvl+0x57/0x7d
      [   68.760756]  print_report.cold+0x5e/0x5db
      [   68.760756]  ? __lock_acquire+0x3ec1/0x6750
      [   68.760756]  kasan_report+0xbe/0x1c0
      [   68.760756]  ? __lock_acquire+0x3ec1/0x6750
      [   68.760756]  __lock_acquire+0x3ec1/0x6750
      [   68.760756]  ? lockdep_hardirqs_on_prepare+0x410/0x410
      [   68.760756]  ? register_lock_class+0x18d0/0x18d0
      [   68.760756]  lock_acquire+0x1ac/0x4f0
      [   68.760756]  ? rfkill_blocked+0xe/0x60
      [   68.760756]  ? lockdep_hardirqs_on_prepare+0x410/0x410
      [   68.760756]  ? mutex_lock_io_nested+0x12c0/0x12c0
      [   68.760756]  ? nla_get_range_signed+0x540/0x540
      [   68.760756]  ? _raw_spin_lock_irqsave+0x4e/0x50
      [   68.760756]  _raw_spin_lock_irqsave+0x39/0x50
      [   68.760756]  ? rfkill_blocked+0xe/0x60
      [   68.760756]  rfkill_blocked+0xe/0x60
      [   68.760756]  nfc_dev_up+0x84/0x260
      [   68.760756]  nfc_genl_dev_up+0x90/0xe0
      [   68.760756]  genl_family_rcv_msg_doit+0x1f4/0x2f0
      [   68.760756]  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x230/0x230
      [   68.760756]  ? security_capable+0x51/0x90
      [   68.760756]  genl_rcv_msg+0x280/0x500
      [   68.760756]  ? genl_get_cmd+0x3c0/0x3c0
      [   68.760756]  ? lock_acquire+0x1ac/0x4f0
      [   68.760756]  ? nfc_genl_dev_down+0xe0/0xe0
      [   68.760756]  ? lockdep_hardirqs_on_prepare+0x410/0x410
      [   68.760756]  netlink_rcv_skb+0x11b/0x340
      [   68.760756]  ? genl_get_cmd+0x3c0/0x3c0
      [   68.760756]  ? netlink_ack+0x9c0/0x9c0
      [   68.760756]  ? netlink_deliver_tap+0x136/0xb00
      [   68.760756]  genl_rcv+0x1f/0x30
      [   68.760756]  netlink_unicast+0x430/0x710
      [   68.760756]  ? memset+0x20/0x40
      [   68.760756]  ? netlink_attachskb+0x740/0x740
      [   68.760756]  ? __build_skb_around+0x1f4/0x2a0
      [   68.760756]  netlink_sendmsg+0x75d/0xc00
      [   68.760756]  ? netlink_unicast+0x710/0x710
      [   68.760756]  ? netlink_unicast+0x710/0x710
      [   68.760756]  sock_sendmsg+0xdf/0x110
      [   68.760756]  __sys_sendto+0x19e/0x270
      [   68.760756]  ? __ia32_sys_getpeername+0xa0/0xa0
      [   68.760756]  ? fd_install+0x178/0x4c0
      [   68.760756]  ? fd_install+0x195/0x4c0
      [   68.760756]  ? kernel_fpu_begin_mask+0x1c0/0x1c0
      [   68.760756]  __x64_sys_sendto+0xd8/0x1b0
      [   68.760756]  ? lockdep_hardirqs_on+0xbf/0x130
      [   68.760756]  ? syscall_enter_from_user_mode+0x1d/0x50
      [   68.760756]  do_syscall_64+0x3b/0x90
      [   68.760756]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   68.760756] RIP: 0033:0x7f67fb50e6b3
      ...
      [   68.760756] RSP: 002b:00007f67fa91fe90 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
      [   68.760756] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f67fb50e6b3
      [   68.760756] RDX: 000000000000001c RSI: 0000559354603090 RDI: 0000000000000003
      [   68.760756] RBP: 00007f67fa91ff00 R08: 00007f67fa91fedc R09: 000000000000000c
      [   68.760756] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe824d496e
      [   68.760756] R13: 00007ffe824d496f R14: 00007f67fa120000 R15: 0000000000000003
      
      [   68.760756]  </TASK>
      [   68.760756]
      [   68.760756] Allocated by task 279:
      [   68.760756]  kasan_save_stack+0x1e/0x40
      [   68.760756]  __kasan_kmalloc+0x81/0xa0
      [   68.760756]  rfkill_alloc+0x7f/0x280
      [   68.760756]  nfc_register_device+0xa3/0x1a0
      [   68.760756]  nci_register_device+0x77a/0xad0
      [   68.760756]  nfcmrvl_nci_register_dev+0x20b/0x2c0
      [   68.760756]  nfcmrvl_nci_uart_open+0xf2/0x1dd
      [   68.760756]  nci_uart_tty_ioctl+0x2c3/0x4a0
      [   68.760756]  tty_ioctl+0x764/0x1310
      [   68.760756]  __x64_sys_ioctl+0x122/0x190
      [   68.760756]  do_syscall_64+0x3b/0x90
      [   68.760756]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   68.760756]
      [   68.760756] Freed by task 314:
      [   68.760756]  kasan_save_stack+0x1e/0x40
      [   68.760756]  kasan_set_track+0x21/0x30
      [   68.760756]  kasan_set_free_info+0x20/0x30
      [   68.760756]  __kasan_slab_free+0x108/0x170
      [   68.760756]  kfree+0xb0/0x330
      [   68.760756]  device_release+0x96/0x200
      [   68.760756]  kobject_put+0xf9/0x1d0
      [   68.760756]  nfc_unregister_device+0x77/0x190
      [   68.760756]  nfcmrvl_nci_unregister_dev+0x88/0xd0
      [   68.760756]  nci_uart_tty_close+0xdf/0x180
      [   68.760756]  tty_ldisc_kill+0x73/0x110
      [   68.760756]  tty_ldisc_hangup+0x281/0x5b0
      [   68.760756]  __tty_hangup.part.0+0x431/0x890
      [   68.760756]  tty_release+0x3a8/0xc80
      [   68.760756]  __fput+0x1f0/0x8c0
      [   68.760756]  task_work_run+0xc9/0x170
      [   68.760756]  exit_to_user_mode_prepare+0x194/0x1a0
      [   68.760756]  syscall_exit_to_user_mode+0x19/0x50
      [   68.760756]  do_syscall_64+0x48/0x90
      [   68.760756]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      This patch just NULLs out dev->rfkill to make sure such a
      dereference cannot happen. This is safe since device_lock() already
      protects the check/write from a data race.
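
The pattern can be modelled in user space as follows. This is a sketch under stated assumptions, not the NFC code: the mock_* names are hypothetical and a pthread mutex stands in for device_lock(). The writer frees the object and clears the pointer under the lock, and the reader checks the pointer under the same lock before dereferencing it.

```c
#include <pthread.h>
#include <stdlib.h>

struct mock_rfkill {
	int blocked;
};

struct mock_dev {
	pthread_mutex_t lock;		/* stands in for device_lock() */
	struct mock_rfkill *rfkill;
};

/* Returns 0 on success, -1 if the rfkill object cannot be allocated. */
static int mock_dev_init(struct mock_dev *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->rfkill = calloc(1, sizeof(*dev->rfkill));
	return dev->rfkill ? 0 : -1;
}

static void mock_dev_unregister(struct mock_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	if (dev->rfkill) {
		free(dev->rfkill);
		dev->rfkill = NULL;	/* later readers see NULL, not a dangling pointer */
	}
	pthread_mutex_unlock(&dev->lock);
}

/* Returns 0 if the device may come up, -1 if it is blocked by rfkill. */
static int mock_dev_up(struct mock_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	if (dev->rfkill && dev->rfkill->blocked)
		ret = -1;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

Without the assignment to NULL, the check in mock_dev_up() would pass on a stale pointer after unregistration and read freed memory, which is the situation the KASAN report above shows.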
      
      Fixes: 3e3b5dfc ("NFC: reorder the logic in nfc_{un,}register_device")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1b0e8141
    • Guo Zhengkui's avatar
      ipv6: exthdrs: use swap() instead of open coding it · 5ee6ad1d
      Guo Zhengkui authored
      Address the following coccicheck warning:
      net/ipv6/exthdrs.c:620:44-45: WARNING opportunity for swap()
      
      by using swap() for the swapping of variable values and drop
      the tmp (`addr`) variable that is not needed any more.
      Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5ee6ad1d
    • Alaa Mohamed's avatar
      selftests: net: fib_rule_tests: add support to select a test to run · 816cda9a
      Alaa Mohamed authored
      Add a boilerplate test loop to run all tests
      in fib_rule_tests.sh
      Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      816cda9a
    • Lorenzo Bianconi's avatar
      net: ethernet: mtk_eth_soc: use standard property for cci-control-port · 4263f77a
      Lorenzo Bianconi authored
      Rely on standard cci-control-port property to identify CCI port
      reference.
      Update mt7622 dts binding.
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4263f77a