1. 14 May, 2021 8 commits
    • net: hns3: refactor dev capability and dev spec of debugfs · c929bc2a
      Jiaran Zhang authored
      Currently, the debugfs commands for dev capability and dev spec
      are implemented by "echo xxxx > cmd" and record the information
      in dmesg, which is unnecessary and heavy. To improve this, create
      a single file "dev_info" for them and query it with the command
      "cat dev_info", which returns the result to userspace rather than
      recording it in dmesg.
      
      The display style is below:
      $cat dev_info
      dev capability:
      support FD: yes
      support GRO: yes
      support FEC: yes
      support UDP GSO: no
      support PTP: no
      support INT QL: no
      support HW TX csum: no
      support UDP tunnel csum: no
      support TX push: no
      support imp-controlled PHY: no
      support rxd advanced layout: no
      
      dev spec:
      MAC entry num: 0
      MNG entry num: 0
      MAX non tso bd num: 8
      RSS ind tbl size: 512
      RSS key size: 40
      RSS size: 1
      Allocated RSS size: 0
      Task queue pairs numbers: 1
      RX buffer length: 2048
      Desc num per TX queue: 1024
      Desc num per RX queue: 1024
      Total number of enabled TCs: 1
      MAX INT QL: 0
      MAX INT GL: 8160
      MAX TM RATE: 100000
      MAX QSET number: 1024
      Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: refactor the debugfs process · 5e69ea7e
      Yufeng Mo authored
      Currently, each debugfs command needs to create a file to get
      the information. To better support more debugfs commands, the
      debugfs process is reconstructed, including the process of
      creating dentries and files, and obtaining information.
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: refactor out RX completion checksum · 1ddc028a
      Huazhong Tan authored
      Only when RXD advanced layout is enabled is the checksum of the
      entire packet calculated in some cases (e.g. IP fragments) and
      filled into the least significant 16 bits of the unused addr
      field.
      
      So refactor out the handling of RX completion checksum: adjust
      the location of the checksum in RX descriptor, and use ptype table
      to identify whether this kind of checksum is calculated.
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: support RXD advanced layout · 79664077
      Huazhong Tan authored
      Currently, the driver gets the packet type by parsing the
      L3_ID/L4_ID/OL3_ID/OL4_ID fields from the RX descriptor, which is
      time-consuming.
      
      Some new devices now support RXD advanced layout, which combines
      the previous OL3_ID/OL4_ID into an 8-bit ptype field, so the driver
      gets the packet type by looking up only one table, and L3_ID/L4_ID
      become reserved fields.
      
      Considering compatibility, the firmware will report capability of
      RXD advanced layout, the driver will identify and enable it by
      default. This patch provides basic function: identify and enable
      the RXD advanced layout, and refactor out hns3_rx_checksum() by
      using ptype table to handle RX checksum if supported.
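
      The single-table lookup described above can be sketched in user-space C
      as follows. This is only an illustration: the table contents, the enum
      names and rx_csum_from_ptype() are hypothetical and are not taken from
      the hns3 driver.

```c
#include <stdint.h>

/* Hypothetical checksum outcomes a driver could derive from the ptype. */
enum csum_level { CSUM_LVL_NONE, CSUM_LVL_UNNECESSARY, CSUM_LVL_COMPLETE };

struct ptype_info {
	uint8_t valid;          /* entry is populated */
	enum csum_level csum;   /* how the RX checksum should be treated */
};

/* Hypothetical entries; real ptype values come from the hardware spec. */
static const struct ptype_info ptype_table[256] = {
	[0x11] = { .valid = 1, .csum = CSUM_LVL_UNNECESSARY },
	[0x27] = { .valid = 1, .csum = CSUM_LVL_COMPLETE },
};

/* One array index replaces parsing several separate ID fields. */
static enum csum_level rx_csum_from_ptype(uint8_t ptype)
{
	const struct ptype_info *info = &ptype_table[ptype];

	return info->valid ? info->csum : CSUM_LVL_NONE;
}
```

      Because the index is an 8-bit value and the table has 256 entries, the
      lookup in this sketch needs no bounds check.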
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: thunderx: Drop unnecessary NULL check after container_of · fc25f9f6
      Guenter Roeck authored
      The result of container_of() operations is never NULL unless the embedded
      element is the first element of the structure. This is not the case here.
      The NULL check is therefore unnecessary and misleading. Remove it.
      
      This change was made automatically with the following Coccinelle script.
      
      @@
      type t;
      identifier v;
      statement s;
      @@
      
      <+...
      (
        t v = container_of(...);
      |
        v = container_of(...);
      )
        ...
        when != v
      - if (\( !v \| v == NULL \) ) s
      ...+>
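
      Why such a NULL check is misleading can be seen in a small user-space
      sketch. The container_of() below is a simplified stand-in for the kernel
      macro (no type checking), and struct outer, struct inner and
      outer_from_node() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified user-space stand-in for the kernel macro (no type checking). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct inner {
	int value;
};

/* Hypothetical wrapper; 'node' is deliberately not the first member. */
struct outer {
	int id;
	struct inner node;
};

static struct outer *outer_from_node(struct inner *n)
{
	/*
	 * Pure pointer arithmetic: the member offset is subtracted from 'n',
	 * so a valid 'n' always yields a non-NULL result.
	 */
	return container_of(n, struct outer, node);
}
```

      With a non-zero member offset, the result could only be NULL if the
      input pointer happened to equal that offset, which is not a meaningful
      error condition to test for.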
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: don't use netif_info et al before net_device is registered · fa44821a
      Heiner Kallweit authored
      Using netif_info() before the net_device is registered results in ugly
      messages like the following:
      sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
      Therefore use pci_info() et al until net_device is registered.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix build when IPv6 is disabled · 30515832
      Matteo Croce authored
      The br_ip6_multicast_add_router() prototype is defined only when
      CONFIG_IPV6 is enabled, but the function is always referenced, so there
      is this build error with CONFIG_IPV6 not defined:
      
      net/bridge/br_multicast.c: In function ‘__br_multicast_enable_port’:
      net/bridge/br_multicast.c:1743:3: error: implicit declaration of function ‘br_ip6_multicast_add_router’; did you mean ‘br_ip4_multicast_add_router’? [-Werror=implicit-function-declaration]
       1743 |   br_ip6_multicast_add_router(br, port);
            |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
            |   br_ip4_multicast_add_router
      net/bridge/br_multicast.c: At top level:
      net/bridge/br_multicast.c:2804:13: warning: conflicting types for ‘br_ip6_multicast_add_router’
       2804 | static void br_ip6_multicast_add_router(struct net_bridge *br,
            |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      net/bridge/br_multicast.c:2804:13: error: static declaration of ‘br_ip6_multicast_add_router’ follows non-static declaration
      net/bridge/br_multicast.c:1743:3: note: previous implicit declaration of ‘br_ip6_multicast_add_router’ was here
       1743 |   br_ip6_multicast_add_router(br, port);
            |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Fix this build error by moving the definition out of the #ifdef.
      
      Fixes: a3c02e76 ("net: bridge: mcast: split multicast router state for IPv4 and IPv6")
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix br_multicast_is_router stub when igmp is disabled · bbc6f2cc
      Nikolay Aleksandrov authored
      br_multicast_is_router takes two arguments when bridge IGMP is enabled
      and just one when it's disabled, fix the stub to take two as well.
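
      The general pattern, keeping a config-disabled stub in sync with the
      real prototype, might look like the sketch below. The macro
      HYPOTHETICAL_CONFIG_IGMP_SNOOPING and all type and function names are
      made up for illustration and do not match the bridge code.

```c
#include <stdbool.h>
#include <stddef.h>

struct bridge;   /* opaque, hypothetical stand-ins */
struct packet;

#ifdef HYPOTHETICAL_CONFIG_IGMP_SNOOPING
bool mcast_is_router(struct bridge *br, struct packet *pkt);
#else
/* The stub keeps the same two-argument signature as the real function. */
static inline bool mcast_is_router(struct bridge *br, struct packet *pkt)
{
	(void)br;
	(void)pkt;
	return false;
}
#endif

/* The caller compiles unchanged in both configurations. */
static bool should_forward_to_router(struct bridge *br, struct packet *pkt)
{
	return mcast_is_router(br, pkt);
}
```

      If the stub took only one argument, every caller would fail to build
      whenever the feature is compiled out.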
      
      Fixes: 1a3065a2 ("net: bridge: mcast: prepare is-router function for mcast router split")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 13 May, 2021 32 commits
    • net: mana: Use struct_size() in kzalloc() · ea89c862
      Gustavo A. R. Silva authored
      Make use of the struct_size() helper instead of an open-coded version,
      in order to avoid any potential type mistakes or integer overflows
      that, in the worst scenario, could lead to heap overflows.
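
      A rough user-space approximation of the size computation that
      struct_size() protects is sketched below. It assumes GCC or Clang (for
      the __builtin_*_overflow helpers); struct queue_table and both functions
      are hypothetical and are not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure ending in a flexible array member. */
struct queue_table {
	size_t count;
	uint32_t ids[];
};

/*
 * Approximates struct_size(tbl, ids, n): header size plus n elements,
 * saturating to SIZE_MAX instead of wrapping around on overflow.
 */
static size_t queue_table_size(size_t n)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, sizeof(uint32_t), &bytes) ||
	    __builtin_add_overflow(bytes, sizeof(struct queue_table), &bytes))
		return SIZE_MAX;
	return bytes;
}

static struct queue_table *queue_table_alloc(size_t n)
{
	size_t bytes = queue_table_size(n);
	struct queue_table *tbl;

	if (bytes == SIZE_MAX)
		return NULL;    /* refuse rather than under-allocate */
	tbl = calloc(1, bytes);
	if (tbl)
		tbl->count = n;
	return tbl;
}
```

      An open-coded sizeof(*tbl) + n * sizeof(uint32_t) would silently wrap
      for a huge n and allocate too little memory, which is the heap overflow
      scenario the commit message refers to.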
      
      This code was detected with the help of Coccinelle, and audited and
      fixed manually.
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Use struct_size() in kzalloc() · fe0bdaec
      Gustavo A. R. Silva authored
      Make use of the struct_size() helper instead of an open-coded version,
      in order to avoid any potential type mistakes or integer overflows
      that, in the worst scenario, could lead to heap overflows.
      
      This code was detected with the help of Coccinelle, and audited and
      fixed manually.
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: caif: Drop unnecessary NULL check after container_of · 0f3ee280
      Guenter Roeck authored
      The first parameter passed to chnl_recv_cb() can never be NULL since all
      callers dereferenced it. Consequently, container_of() on it is also never
      NULL, even though the reference into the structure points to the first
      element of the structure. The NULL check is therefore unnecessary.
      On top of that, it is misleading to perform a NULL check on the result of
      container_of() because the position of the contained element could change,
      which would make the test invalid. Remove the unnecessary NULL check.
      
      This change was made automatically with the following Coccinelle script.
      
      @@
      type t;
      identifier v;
      statement s;
      @@
      
      <+...
      (
        t v = container_of(...);
      |
        v = container_of(...);
      )
        ...
        when != v
      - if (\( !v \| v == NULL \) ) s
      ...+>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qed: remove redundant initialization of variable rc · 5efe2575
      Colin Ian King authored
      The variable rc is being initialized with a value that is never read,
      it is being updated later on.  The assignment is redundant and can be
      removed.
      
      Addresses-Coverity: ("Unused value")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'virtio_net-fixes' · 25e248a2
      David S. Miller authored
      Xuan Zhuo says:
      
      ====================
      virtio-net: fix for build_skb()
      
      The logic of this piece is really messy. Fortunately, my refactored patch can be
      completed with a small amount of testing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: get build_skb() buf by data ptr · 7bf64460
      Xuan Zhuo authored
      In the case of merge, the page passed into page_to_skb() may be a head
      page, not the page where the current data is located. So when trying to
      get the buf where the data is located, you should directly use the
      pointer (p) to get the address corresponding to the page.
      
      At the same time, the offset of the data in the page should also be
      obtained using offset_in_page().
      
      This patch solves this problem. But if you don’t use this patch, the
      original code can also run, because if the page is not the page of the
      current data, the calculated tailroom will be less than 0, and will not
      enter the logic of build_skb(). The significance of this patch is to
      modify this logical problem, allowing more situations to use
      build_skb().
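
      The address arithmetic involved can be illustrated with a user-space
      sketch. It assumes 4 KiB pages; the SKETCH_* macros and both helper
      names are illustrative stand-ins that only mimic what the kernel's
      offset_in_page() computes.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Offset of an address inside its own page, like offset_in_page(). */
static unsigned long sketch_offset_in_page(const void *p)
{
	return (unsigned long)((uintptr_t)p & ~SKETCH_PAGE_MASK);
}

/* Start of the page that actually holds 'p' (not necessarily a head page). */
static const void *sketch_page_base(const void *p)
{
	return (const void *)((uintptr_t)p & SKETCH_PAGE_MASK);
}
```

      Deriving both values from the data pointer itself avoids relying on a
      page argument that may refer to a different page than the one holding
      the data.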
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: fix for unable to handle page fault for address · 6c66c147
      Xuan Zhuo authored
      In merge mode, when xdp is enabled, if the headroom of buf is smaller
      than virtnet_get_headroom(), xdp_linearize_page() will be called but the
      variable of "headroom" is still 0, which leads to wrong logic after
      entering page_to_skb().
      
      [   16.600944] BUG: unable to handle page fault for address: ffffecbfff7b43c8
      [   16.602175] #PF: supervisor read access in kernel mode
      [   16.603350] #PF: error_code(0x0000) - not-present page
      [   16.604200] PGD 0 P4D 0
      [   16.604686] Oops: 0000 [#1] SMP PTI
      [   16.605306] CPU: 4 PID: 715 Comm: sh Tainted: G    B             5.12.0+ #312
      [   16.606429] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/04
      [   16.608217] RIP: 0010:unmap_page_range+0x947/0xde0
      [   16.609014] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065
      [   16.611863] RSP: 0018:ffffc90002503c58 EFLAGS: 00010286
      [   16.612720] RAX: ffffecbfff7b43c0 RBX: 00007f19f7203000 RCX: ffffffff812ff359
      [   16.613853] RDX: ffff888107778000 RSI: 0000000000000000 RDI: 0000000000000005
      [   16.614976] RBP: ffffea000425e000 R08: 0000000000000000 R09: 3030303030303030
      [   16.616124] R10: ffffffff82ed7d94 R11: 6637303030302052 R12: 7c00000afffded0f
      [   16.617276] R13: 0000000000000001 R14: ffff888119ee7010 R15: 00007f19f7202000
      [   16.618423] FS:  0000000000000000(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000
      [   16.619738] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   16.620670] CR2: ffffecbfff7b43c8 CR3: 0000000103220005 CR4: 0000000000370ee0
      [   16.621792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   16.622920] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   16.624047] Call Trace:
      [   16.624525]  ? release_pages+0x24d/0x730
      [   16.625209]  unmap_single_vma+0xa9/0x130
      [   16.625885]  unmap_vmas+0x76/0xf0
      [   16.626480]  exit_mmap+0xa0/0x210
      [   16.627129]  mmput+0x67/0x180
      [   16.627673]  do_exit+0x3d1/0xf10
      [   16.628259]  ? do_user_addr_fault+0x231/0x840
      [   16.629000]  do_group_exit+0x53/0xd0
      [   16.629631]  __x64_sys_exit_group+0x1d/0x20
      [   16.630354]  do_syscall_64+0x3c/0x80
      [   16.630988]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   16.631828] RIP: 0033:0x7f1a043d0191
      [   16.632464] Code: Unable to access opcode bytes at RIP 0x7f1a043d0167.
      [   16.633502] RSP: 002b:00007ffe3d993308 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
      [   16.634737] RAX: ffffffffffffffda RBX: 00007f1a044c9490 RCX: 00007f1a043d0191
      [   16.635857] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
      [   16.636986] RBP: 0000000000000000 R08: ffffffffffffff88 R09: 0000000000000001
      [   16.638120] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f1a044c9490
      [   16.639245] R13: 0000000000000001 R14: 00007f1a044c9968 R15: 0000000000000000
      [   16.640408] Modules linked in:
      [   16.640958] CR2: ffffecbfff7b43c8
      [   16.641557] ---[ end trace bc4891c6ce46354c ]---
      [   16.642335] RIP: 0010:unmap_page_range+0x947/0xde0
      [   16.643135] Code: 00 00 08 00 48 83 f8 01 45 19 e4 41 f7 d4 41 83 e4 03 e9 a4 fd ff ff e8 b7 63 ed ff 4c 89 e0 48 c1 e0 065
      [   16.645983] RSP: 0018:ffffc90002503c58 EFLAGS: 00010286
      [   16.646845] RAX: ffffecbfff7b43c0 RBX: 00007f19f7203000 RCX: ffffffff812ff359
      [   16.647970] RDX: ffff888107778000 RSI: 0000000000000000 RDI: 0000000000000005
      [   16.649091] RBP: ffffea000425e000 R08: 0000000000000000 R09: 3030303030303030
      [   16.650250] R10: ffffffff82ed7d94 R11: 6637303030302052 R12: 7c00000afffded0f
      [   16.651394] R13: 0000000000000001 R14: ffff888119ee7010 R15: 00007f19f7202000
      [   16.652529] FS:  0000000000000000(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000
      [   16.653887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   16.654841] CR2: ffffecbfff7b43c8 CR3: 0000000103220005 CR4: 0000000000370ee0
      [   16.655992] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   16.657150] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   16.658290] Kernel panic - not syncing: Fatal exception
      [   16.659613] Kernel Offset: disabled
      [   16.660234] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom")
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'atl1c-support-for-Mikrotik-10-25G-NIC-features' · 33b31426
      David S. Miller authored
      Gatis Peisenieks says:
      
      ====================
      atl1c: support for Mikrotik 10/25G NIC features
      
      The new Mikrotik 10/25G NIC maintains compatibility with existing atl1c
      driver. However it does have new features.
      
      This patch set adds support for reporting the card's higher link speed and
      max-mtu, enables rx csum offload and improves tx performance.
      
      v2:
          - fixed xmit_more handling as pointed out by Eric Dumazet
          - added a more reliable link detection on Mikrotik 10/25G NIC
            since MDIO op emulation can occasionally fail
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atl1c: improve link detection reliability on Mikrotik 10/25G NIC · ea0fbd05
      Gatis Peisenieks authored
      Mikrotik 10/25G NIC emulates the MDIO accesses, but the emulation is
      not 100% reliable - the MDIO ops can occasionally time out.
      
      This adds a reliable way of detecting link on Mikrotik 10/25G NIC.
      Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atl1c: enable rx csum offload on Mikrotik 10/25G NIC · b0390009
      Gatis Peisenieks authored
      Mikrotik 10/25G NIC supports hw checksum verification on rx for
      IP/IPv6 + TCP/UDP packets. HW checksum offload helps reduce host
      cpu load.
      
      This enables the csum offload specifically for Mikrotik 10/25G NIC
      as other HW supported by the driver is known to have problems with it.
      
      TCP iperf3 to Threadripper 3960X with NIC improved 16.5 -> 20.0 Gbps
      with mtu=1500.
      Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atl1c: adjust max mtu according to Mikrotik 10/25G NIC ability · 545fa3fb
      Gatis Peisenieks authored
      The new Mikrotik 10/25G NIC supports jumbo frames. Jumbo frames are
      supported for TSO as well.
      
      This enables the support for mtu up to 9500 bytes.
      Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atl1c: improve performance by avoiding unnecessary pcie writes on xmit · d7ab6419
      Gatis Peisenieks authored
      The kernel has an xmit_more facility that hints to the networking driver's
      xmit path whether more packets are coming soon. This information can be
      used to avoid an unnecessary, expensive PCIe transaction per tx packet.
      
      Max TX pps on Mikrotik 10/25G NIC in a Threadripper 3960X system
      improved from 1150Kpps to 1700Kpps.
      
      Testing L2 forwarding on AR8151 hardware did not reveal a measurable
      increase in latency.
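
      The batching idea behind xmit_more can be modelled in a few lines of
      user-space C. Everything here is hypothetical (struct tx_ring,
      tx_ring_xmit(), and the counter standing in for PCIe register writes);
      it is a simplified model and not the atl1c code.

```c
#include <stdbool.h>

/* Hypothetical TX ring; 'doorbell_writes' stands in for costly PCIe writes. */
struct tx_ring {
	unsigned int pending;           /* descriptors queued but not announced */
	unsigned int doorbell_writes;   /* number of simulated register writes */
};

/*
 * Queue one packet. 'more' mirrors the xmit_more hint: when true, another
 * packet is expected right away, so the doorbell write can be deferred.
 */
static void tx_ring_xmit(struct tx_ring *ring, bool more)
{
	ring->pending++;
	if (!more) {
		ring->doorbell_writes++;   /* one write announces the whole batch */
		ring->pending = 0;
	}
}
```

      A real driver also has to ring the doorbell when its queue is stopped,
      even if more packets were hinted, so that queued descriptors are not
      left unannounced; this sketch omits that case.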
      Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atl1c: show correct link speed on Mikrotik 10/25G NIC · f19d4997
      Gatis Peisenieks authored
      The new Mikrotik 10/25G NIC maintains compatibility with existing atl1c
      driver. However it does have new features.
      
      This defines some new register offsets, code for identifying the new type
      of NIC and correct speed detection for the NIC.
      Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hinic-cleanups' · 0d59c95e
      David S. Miller authored
      Guangbin Huang says:
      
      ====================
      net: hinic: some cleanups
      
      This patchset adds some cleanups for the hinic ethernet driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hinic: fix misspelled "acessing" · 5db8c86e
      Guangbin Huang authored
      The word "acessing" is misspelled, so fix it.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hinic: remove unnecessary parentheses · c8ad5df6
      Guangbin Huang authored
      There are some unnecessary parentheses; this patch deletes them.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hinic: add blank line after function declaration · 3402ab54
      Guangbin Huang authored
      There should be a blank line after a function declaration, so add two
      missing blank lines.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hinic: remove unnecessary blank line · 9afcb595
      Guangbin Huang authored
      Two blank lines are unnecessary; this patch removes them.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bridge-split-ipv4-ipv6-mc-router-state' · d38717af
      David S. Miller authored
      Linus Lüssing says:
      
      ====================
      net: bridge: split IPv4/v6 mc router state and export for batman-adv
      
      The following patches are splitting the so far combined multicast router
      state in the Linux bridge into two ones, one for IPv4 and one for IPv6,
      for a more fine-grained detection of multicast routers. This avoids
      sending IPv4 multicast packets to an IPv6-only multicast router and
      avoids sending IPv6 multicast packets to an IPv4-only multicast router.
      This also allows batman-adv to make use of the now split information in
      the final patch.
      
      The first eight patches prepare the bridge code to avoid duplicate
      code or IPv6-#ifdef clutter for the multicast router state split, and
      contain no functional changes yet.
      
      The ninth patch then implements the IPv4+IPv6 multicast router state
      split.
      
      Patch number ten adds IPv4+IPv6 specific timers to the mdb netlink
      router port dump, so that the timers' validity can be checked individually
      from userspace.
      
      The final, eleventh patch exports this now per protocol family multicast
      router state so that batman-adv can then later make full use of the
      Multicast Router Discovery (MRD) support in the Linux bridge. The
      batman-adv protocol format currently expects separate multicast router
      states for IPv4 and IPv6, therefore it depends on the first patch.
      batman-adv will then make use of these newly exported functions like
      this[0].
      
      Regards, Linus
      
      [0]: https://git.open-mesh.org/batman-adv.git/shortlog/refs/heads/linus/multicast-routeable-mrd
           -> https://git.open-mesh.org/batman-adv.git/commit/d4bed3a92427445708baeb1f2d1841c5fb816fd4
      
      Changelog v3:
      
      * Patch 01/11:
        * fixed/added missing rename of br->router_list to
          br->ip4_mc_router_list in br_multicast_flood()
      * Patch 02/11:
        * moved inline functions from br_forward.c to br_private.h
      * Patch 03/11:
        * removed inline attribute from functions added to br_mdb.c
      * Patch 04/11:
        * unchanged
      * Patch 05/11:
        * converted if()'s into switch-case in br_multicast_is_router()
      * Patch 06/11:
        * removed inline attribute from function added to br_multicast.c
      * Patch 07/11:
        * added missing static attribute to function
          br_ip4_multicast_get_rport_slot() added to br_multicast.c
      * Patch 08/11:
        * removed inline attribute from function added to br_multicast.c
      * Patch 09/11:
        * added missing static attribute to function
          br_ip6_multicast_get_rport_slot() added to br_multicast.c
        * removed inline attribute from function added to br_multicast.c
      * Patch 10/11:
        * unchanged
      * Patch 11/11:
        * simplified bridge check in br_multicast_has_router_adjacent()
          by using br_port_get_check_rcu()
        * added missing declaration for br_multicast_has_router_adjacent()
          in include/linux/if_bridge.h
      
      Changelog v2:
      
      * split into multiple patches as suggested by Nikolay
      * added helper functions to br_multicast_flood(), avoiding
        IPv6 #ifdef clutter
      * fixed reverse xmas tree ordering in br_rports_fill_info() and
        added helper functions to avoid IPv6 #ifdef clutter
      * Added a common br_multicast_add_router() and a helper function
        to retrieve the correct slot to avoid duplicate code for an
        ip4 and ip6 variant
      * replaced the "1" and "2" constants in br_multicast_is_router()
        with the appropriate enums
      * added br_{ip4,ip6}_multicast_rport_del() wrappers to reduce
        IPv6 #ifdef clutter
      * added return values to br_*multicast_rport_del() to only notify
        if the port was actually removed and did not race with a readdition
        somewhere else
      * added empty, void br_ip6_multicast_mark_router() if compiled
        without IPv6, to reduce IPv6 #ifdef clutter
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: export multicast router presence adjacent to a port · 3b85f9ba
      Linus Lüssing authored
      To properly support routable multicast addresses in batman-adv in a
      group-aware way, a batman-adv node needs to know if it serves multicast
      routers.
      
      This adds a function to the bridge to export this so that batman-adv
      can then make full use of the Multicast Router Discovery capability of
      the bridge.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: add ip4+ip6 mcast router timers to mdb netlink · b7fb0916
      Linus Lüssing authored
      Now that we have split the multicast router state into two, one for IPv4
      and one for IPv6, also add individual timers to the mdb netlink router
      port dump. Leaving the old timer attribute for backwards compatibility.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: split multicast router state for IPv4 and IPv6 · a3c02e76
      Linus Lüssing authored
      A multicast router for IPv4 does not imply that the same host also is a
      multicast router for IPv6 and vice versa.
      
      To reduce multicast traffic when a host is only a multicast router for
      one of these two protocol families, keep router state for IPv4 and IPv6
      separately. Similar to how querier state is kept separately.
      
      For backwards compatibility for netlink and switchdev notifications
      these two will still only notify if a port switched from either no
      IPv4/IPv6 multicast router to any IPv4/IPv6 multicast router or the
      other way round. However a full netlink MDB router dump will now also
      include a multicast router timeout for both IPv4 and IPv6.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: split router port del+notify for mcast router split · ed2d3597
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      their IPv4 and IPv6 variants, split router port deletion and notification
      into two functions. When we later disable a port, for instance, we want
      to send only one notification to switchdev and netlink for compatibility
      and want to avoid sending one for IPv4 and one for IPv6. For that the
      split is needed.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: prepare add-router function for mcast router split · d9b8c4d8
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      their IPv4 and IPv6 variants move the protocol specific router list
      and timer access to ip4 wrapper functions.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: prepare expiry functions for mcast router split · ee5fb222
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      their IPv4 and IPv6 variants move the protocol specific timer access to
      an ip4 wrapper function.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: prepare is-router function for mcast router split · 1a3065a2
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      their IPv4 and IPv6 variants make br_multicast_is_router() protocol
      family aware.
      
      Note that br_ip6_multicast_is_router() uses the currently still
      common ip4_mc_router_timer for now. It will be renamed to
      ip6_mc_router_timer later when the split is performed.
      
      While at it, also rename the "1" and "2" constants in
      br_multicast_is_router() to the MDB_RTR_TYPE_TEMP_QUERY and
      MDB_RTR_TYPE_PERM enums.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a3065a2
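
The commit above replaces magic numbers with named constants and makes the router check depend on the protocol family. The following is a minimal userspace sketch of that idea, not the kernel code. The values 1 and 2 for MDB_RTR_TYPE_TEMP_QUERY and MDB_RTR_TYPE_PERM come from the commit message; the other two enum entries, `struct model_bridge`, `enum model_proto` and `model_is_router()` are assumptions made for illustration. The two pending flags model the state after the split; at this commit the IPv6 check would still read the shared IPv4 timer.

```c
#include <stdbool.h>

/* 1 and 2 are taken from the commit message; 0 and 3 are assumed. */
enum model_rtr_type {
	MDB_RTR_TYPE_DISABLED   = 0,
	MDB_RTR_TYPE_TEMP_QUERY = 1,
	MDB_RTR_TYPE_PERM       = 2,
	MDB_RTR_TYPE_TEMP       = 3,
};

/* Hypothetical protocol selector; ANY stands for "no packet at hand". */
enum model_proto {
	MODEL_PROTO_ANY,
	MODEL_PROTO_IPV4,
	MODEL_PROTO_IPV6,
	MODEL_PROTO_OTHER,
};

/* Hypothetical stand-in for the bridge's multicast router state. */
struct model_bridge {
	enum model_rtr_type multicast_router;
	bool ip4_router_timer_pending;
	bool ip6_router_timer_pending;
};

/* Protocol-family-aware router check using named constants. */
bool model_is_router(const struct model_bridge *br, enum model_proto proto)
{
	switch (br->multicast_router) {
	case MDB_RTR_TYPE_PERM:
		return true;
	case MDB_RTR_TYPE_TEMP_QUERY:
		switch (proto) {
		case MODEL_PROTO_IPV4:
			return br->ip4_router_timer_pending;
		case MODEL_PROTO_IPV6:
			return br->ip6_router_timer_pending;
		case MODEL_PROTO_ANY:
			return br->ip4_router_timer_pending ||
			       br->ip6_router_timer_pending;
		default:
			return false;
		}
	default:
		return false;
	}
}
```

With a bridge in MDB_RTR_TYPE_TEMP_QUERY mode and only the IPv4 flag set, the check succeeds for IPv4 and for MODEL_PROTO_ANY but fails for IPv6, which is the per-family behavior the split is preparing for.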
    • net: bridge: mcast: prepare query reception for mcast router split · b19232ef
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      its IPv4 and IPv6 variants, and because br_multicast_mark_router() will
      be split for that, remove the select querier wrapper and instead add
      ip4 and ip6 variants of br_multicast_query_received().
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b19232ef
    • net: bridge: mcast: prepare mdb netlink for mcast router split · ff391c5d
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      its IPv4 and IPv6 variants, and to avoid IPv6 #ifdef clutter later, add
      some inline functions for the protocol-specific parts in the mdb router
      netlink code. Also, we need to iterate over the ports instead of the
      router list to be able to put one router port entry with both the IPv4
      and IPv6 multicast router info later.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff391c5d
    • net: bridge: mcast: add wrappers for router node retrieval · 44ebb081
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      its IPv4 and IPv6 variants, and to avoid IPv6 #ifdef clutter later, add
      two wrapper functions for router node retrieval in the payload
      forwarding code.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      44ebb081
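
The "avoid #ifdef clutter" pattern mentioned above can be sketched as follows: the conditional compilation lives inside one small wrapper, so callers stay free of preprocessor checks. This is an illustrative userspace model only. `MODEL_IPV6`, `struct model_node`, `struct model_bridge` and the `model_*` functions are hypothetical names and do not reproduce the bridge code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical switch standing in for the kernel's IPv6 config option. */
#ifndef MODEL_IPV6
#define MODEL_IPV6 1
#endif

struct model_node {
	int port_id;
};

struct model_bridge {
	struct model_node *ip4_first_router;
#if MODEL_IPV6
	struct model_node *ip6_first_router;
#endif
};

/* IPv4 wrapper: always available. */
struct model_node *model_ip4_first_router(struct model_bridge *br)
{
	return br->ip4_first_router;
}

/* IPv6 wrapper: the only place that knows whether IPv6 is built in. */
struct model_node *model_ip6_first_router(struct model_bridge *br)
{
#if MODEL_IPV6
	return br->ip6_first_router;
#else
	(void)br;
	return NULL;
#endif
}

/* Caller in the forwarding path: no #ifdef needed here. */
struct model_node *model_first_router(struct model_bridge *br, bool is_ipv6)
{
	return is_ipv6 ? model_ip6_first_router(br)
		       : model_ip4_first_router(br);
}
```

When `MODEL_IPV6` is defined as 0, the IPv6 wrapper simply returns NULL and the caller compiles unchanged.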
    • net: bridge: mcast: rename multicast router lists and timers · ce6f7097
      Linus Lüssing authored
      In preparation for the upcoming split of multicast router state into
      its IPv4 and IPv6 variants, rename the affected variables to the IPv4
      version first to avoid some renames in later commits.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce6f7097
    • net: Treat __napi_schedule_irqoff() as __napi_schedule() on PREEMPT_RT · 8380c81d
      Sebastian Andrzej Siewior authored
      __napi_schedule_irqoff() is an optimized version of __napi_schedule()
      which can be used where it is known that interrupts are disabled,
      e.g. in interrupt-handlers, spin_lock_irq() sections or hrtimer
      callbacks.
      
      On PREEMPT_RT enabled kernels this assumption is not true. Force-
      threaded interrupt handlers and spinlocks do not disable interrupts,
      and the NAPI hrtimer callback is forced into softirq context, which
      runs with interrupts enabled as well.
      
      Chasing all usage sites of __napi_schedule_irqoff() is a whack-a-mole
      game so make __napi_schedule_irqoff() invoke __napi_schedule() for
      PREEMPT_RT kernels.
      
      The callers of ____napi_schedule() in the networking core have been
      audited and are correct on PREEMPT_RT kernels as well.
      Reported-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8380c81d
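
The dispatch described in this commit can be modeled in userspace as below. This is a hedged sketch of the behavior stated in the message, not the kernel source: `MODEL_PREEMPT_RT` stands in for the PREEMPT_RT config option, and `model_irqs_enabled`, `model_irq_save_calls`, `struct model_napi` and all `model_*` functions are hypothetical. The low-level enqueue asserts that interrupts are off, which is the assumption the irqoff variant relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch standing in for the PREEMPT_RT config option. */
#ifndef MODEL_PREEMPT_RT
#define MODEL_PREEMPT_RT 1
#endif

struct model_napi {
	int scheduled;
};

bool model_irqs_enabled = true;  /* simulated interrupt state */
int model_irq_save_calls;        /* how often the safe path was taken */

/* Models the low-level enqueue, which must run with interrupts off. */
void model_enqueue(struct model_napi *n)
{
	assert(!model_irqs_enabled);
	n->scheduled++;
}

/* Models the safe variant: save and disable interrupts, enqueue, restore. */
void model_napi_schedule(struct model_napi *n)
{
	bool saved = model_irqs_enabled;

	model_irqs_enabled = false;
	model_irq_save_calls++;
	model_enqueue(n);
	model_irqs_enabled = saved;
}

/*
 * Models the optimized variant: without PREEMPT_RT it trusts the caller
 * to have interrupts disabled; with PREEMPT_RT it falls back to the
 * safe variant because that assumption no longer holds.
 */
void model_napi_schedule_irqoff(struct model_napi *n)
{
	if (!MODEL_PREEMPT_RT)
		model_enqueue(n);
	else
		model_napi_schedule(n);
}
```

With `MODEL_PREEMPT_RT` set to 1, a caller that still has (simulated) interrupts enabled can use the irqoff variant without tripping the assertion, because the call is routed through the saving path.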
    • alx: use fine-grained locking instead of RTNL · 4a5fe57e
      Johannes Berg authored
      In the alx driver, all locking depended on the RTNL. That
      causes issues with ipconfig ("ip=..." command line), which
      waits for the netdev to have a carrier while holding the
      RTNL. The alx workers etc. also require the RTNL, so the
      carrier won't be set until the RTNL is dropped and can be
      acquired by the alx workers. This causes long delays at
      boot, as reported by Nikolai Zhubr.
      
      Really the only sensible thing to do here is to not use
      the RTNL for everything, but instead have fine-grained
      locking for just the driver. Do that, it's not that hard.
      Reported-by: Nikolai Zhubr <zhubr.2@gmail.com>
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a5fe57e