1. 11 May, 2024 7 commits
  2. 10 May, 2024 4 commits
    • David S. Miller's avatar
      Merge tag 'gtp-24-05-07' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/gtp · f8beae07
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      gtp pull request 24-05-07
      
      This v3 includes:
      - fix for clang uninitialized variable per Jakub.
      - address Smatch and Coccinelle reports per Simon
      - remove inline in new IPv6 support per Simon
      - fix memleaks in netlink control plane per Simon
      -o-
      
      The following patchset contains IPv6 GTP driver support for net-next,
      this also includes IPv6 over IPv4 and vice-versa:
      
      Patch #1 removes an unnecessary stack variable initialization in the
               socket routine.
      
      Patch #2 deals with GTP extension headers. It parses the variable-length
               extension header to decapsulate packets accordingly. Otherwise,
               packets are dropped when these extension headers are present,
               which breaks interoperation with other non-Linux based GTP
               implementations.
      
      Patch #3 prepares for IPv6 support by moving IPv4 specific fields in PDP
               context objects to a union.
      
      Patch #4 adds IPv6 support while retaining backward compatibility.
               Three new attributes allow declaring an IPv6 GTP tunnel
               (GTPA_FAMILY, GTPA_PEER_ADDR6 and GTPA_MS_ADDR6), and
               IFLA_GTP_LOCAL6 declares the IPv6 GTP UDP socket. Up to this
               patch, only IPv6 outer in IPv6 inner is supported.
      
      Patch #5 uses the IPv6 address /64 prefix for UE/MS in the inner headers.
               Unlike IPv4, which provides a 1:1 mapping between UE/MS,
               the IPv6 tunnel encapsulates traffic for a /64 prefix, as
               specified by 3GPP TS. This patch has been split from Patch #4
               to highlight this behaviour.
      
      Patch #6 passes up IPv6 link-local traffic, such as IPv6 SLAAC, for
               handling to userspace so they are handled as control packets.
      
      Patch #7 prepares to allow for GTP IPv4 over IPv6 and vice-versa by
               moving IP specific debugging out of the function to build
               IPv4 and IPv6 GTP packets.
      
      Patch #8 generalizes TOS/DSCP handling following similar approach as
               in the existing iptunnel infrastructure.
      
      Patch #9 adds a helper function to build an IPv4 GTP packet in the outer
               header.
      
      Patch #10 adds a helper function to build an IPv6 GTP packet in the outer
                header.
      
      Patch #11 adds support for GTP IPv4-over-IPv6 and vice-versa.
      
      Patch #12 allows using the same TID/TEID (tunnel identifier) for inner
                IPv4 and IPv6 packets for better UE/MS dual stack integration.
      
      This series integrates with the osmocom.org project CI and TTCN-3 test
      infrastructure (Oliver Smith) as well as the userspace libgtpnl library.
      
      Thanks to Harald Welte, Oliver Smith and Pau Espin for reviewing and
      providing feedback through the osmocom.org redmine platform to make this
      happen.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8beae07
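Patch #5 in the series above matches UE/MS traffic on a /64 prefix rather than on a full address. The following is a minimal userspace sketch of such a comparison, not code from the GTP driver; `ipv6_prefix64_equal()` and `same_prefix64()` are hypothetical names introduced only for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not from the GTP driver): true when both
 * addresses share the same /64 prefix, i.e. the same first 8 bytes. */
static bool ipv6_prefix64_equal(const struct in6_addr *a,
                                const struct in6_addr *b)
{
    return memcmp(a->s6_addr, b->s6_addr, 8) == 0;
}

/* Convenience wrapper for textual addresses; returns false when
 * either string is not a valid IPv6 address. */
static bool same_prefix64(const char *x, const char *y)
{
    struct in6_addr a, b;

    if (inet_pton(AF_INET6, x, &a) != 1 || inet_pton(AF_INET6, y, &b) != 1)
        return false;
    return ipv6_prefix64_equal(&a, &b);
}
```

With this sketch, `2001:db8:0:1::1` and `2001:db8:0:1:abcd::2` compare equal, while `2001:db8:0:1::1` and `2001:db8:0:2::1` do not.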
    • gaoxingwang's avatar
      net: ipv6: fix wrong start position when receive hop-by-hop fragment · 1cd354fe
      gaoxingwang authored
      In IPv6, ipv6_rcv_core parses the hop-by-hop extension header and increases skb->transport_header by the length of that one extension header.
      But if other extension headers follow, such as a fragment header, skb->transport_header then points to the second extension header,
      not to the transport layer header or the first extension header.
      
      As a result, the start and nexthdrp variables do not point to the same position in ipv6frag_thdr_trunced,
      and ipv6_skip_exthdr returns an incorrect offset and frag_off. Sometimes the length of the last fragment is smaller than the incorrectly calculated offset, resulting in packet loss.
      We can use the network header offset to calculate the correct position and solve this problem.
      
      Fixes: 9d9e937b (ipv6/netfilter: Discard first fragment not including all headers)
      Signed-off-by: Gao Xingwang <gaoxingwang1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1cd354fe
    • Eric Dumazet's avatar
      tcp: get rid of twsk_unique() · 383eed2d
      Eric Dumazet authored
      DCCP is going away soon, and had no twsk_unique() method.
      
      We can directly call tcp_twsk_unique() for TCP sockets.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240507164140.940547-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      383eed2d
    • Praveen Kumar Kannoju's avatar
      net/sched: adjust device watchdog timer to detect stopped queue at right time · 33fb988b
      Praveen Kumar Kannoju authored
      Applications are sensitive to long network latency, particularly
      heartbeat monitoring ones. The longer the tx timeout recovery takes,
      the higher the risk for such applications on production machines.
      This patch remedies that while still honoring the device-set tx timeout.
      
      Modify the next watchdog timeout to be shorter than the device-specified
      one. Compute the next timeout as the device watchdog timeout less the
      time elapsed since the queue was stopped. At the next watchdog timeout,
      the tx timeout handler is called if the queue is still stopped. Whether
      or not the handler is called, restore the watchdog timeout to the
      device-specified value.
      Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
      Link: https://lore.kernel.org/r/20240508133617.4424-1-praveen.kannoju@oracle.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      33fb988b
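The timeout arithmetic described in the watchdog commit above can be sketched as follows. This is a simplified userspace illustration under stated assumptions, not the kernel change itself: `next_watchdog_delay()` is a hypothetical name, times are plain tick counts, and counter wraparound is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how long to wait before the next watchdog check.
 * now            - current time in ticks
 * stopped_at     - time at which the tx queue was stopped
 * watchdog_timeo - device-specified tx timeout in ticks
 * Returns the device timeout less the time the queue has already been
 * stopped, or 0 when the timeout has already expired. */
static uint64_t next_watchdog_delay(uint64_t now, uint64_t stopped_at,
                                    uint64_t watchdog_timeo)
{
    assert(now >= stopped_at); /* sketch ignores counter wraparound */

    uint64_t elapsed = now - stopped_at;

    if (elapsed >= watchdog_timeo)
        return 0;
    return watchdog_timeo - elapsed;
}
```

For example, with a device timeout of 500 ticks and a queue stopped 300 ticks ago, the next check would be due in 200 ticks instead of a full 500.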
  3. 09 May, 2024 29 commits
    • Joe Damato's avatar
      selftest: epoll_busy_poll: epoll busy poll tests · 60e0f986
      Joe Damato authored
      Add a simple test for the epoll busy poll ioctls, using the kernel
      selftest harness.
      
      This test ensures that the ioctls have the expected return codes and
      that the kernel properly gets and sets epoll busy poll parameters.
      
      The test can be expanded in the future to do real busy polling (provided
      another machine to act as the client is available).
      Signed-off-by: Joe Damato <jdamato@fastly.com>
      Link: https://lore.kernel.org/r/20240508184008.48264-1-jdamato@fastly.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      60e0f986
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · e7073830
      Jakub Kicinski authored
      Cross-merge networking fixes after downstream PR.
      
      No conflicts.
      
      Adjacent changes:
      
      drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
        35d92abf ("net: hns3: fix kernel crash when devlink reload during initialization")
        2a1a1a7b ("net: hns3: add command queue trace for hns3")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e7073830
    • Linus Torvalds's avatar
      Merge tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8c3b7565
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bluetooth and IPsec.
      
        The bridge patch is actually a follow-up to a recent fix in the same
        area. We have a pending v6.8 AF_UNIX regression; it should be solved
        soon, but not in time for this PR.
      
        Current release - regressions:
      
         - eth: ks8851: Queue RX packets in IRQ handler instead of disabling
           BHs
      
         - net: bridge: fix corrupted ethernet header on multicast-to-unicast
      
        Current release - new code bugs:
      
         - xfrm: fix possible bad pointer dereferencing in error path
      
        Previous releases - regressions:
      
         - core: fix out-of-bounds access in ops_init
      
         - ipv6:
            - fix potential uninit-value access in __ip6_make_skb()
            - fib6_rules: avoid possible NULL dereference in fib6_rule_action()
      
         - tcp: use refcount_inc_not_zero() in tcp_twsk_unique().
      
         - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation
      
         - rxrpc: fix congestion control algorithm
      
         - bluetooth:
            - l2cap: fix slab-use-after-free in l2cap_connect()
            - msft: fix slab-use-after-free in msft_do_close()
      
         - eth: hns3: fix kernel crash when devlink reload during
           initialization
      
         - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21
           family
      
        Previous releases - always broken:
      
         - xfrm: preserve vlan tags for transport mode software GRO
      
         - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets
      
         - eth: hns3: keep using user config after hardware reset"
      
      * tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
        net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports
        net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family
        net: hns3: fix kernel crash when devlink reload during initialization
        net: hns3: fix port vlan filter not disabled issue
        net: hns3: use appropriate barrier function after setting a bit value
        net: hns3: release PTP resources if pf initialization failed
        net: hns3: change type of numa_node_mask as nodemask_t
        net: hns3: direct return when receive a unknown mailbox message
        net: hns3: using user configure after hardware reset
        net/smc: fix neighbour and rtable leak in smc_ib_find_route()
        ipv6: prevent NULL dereference in ip6_output()
        hsr: Simplify code for announcing HSR nodes timer setup
        ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
        dt-bindings: net: mediatek: remove wrongly added clocks and SerDes
        rxrpc: Only transmit one ACK per jumbo packet received
        rxrpc: Fix congestion control algorithm
        selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC
        ipv6: Fix potential uninit-value access in __ip6_make_skb()
        net: phy: marvell-88q2xxx: add support for Rev B1 and B2
        appletalk: Improve handling of broadcast packets
        ...
      8c3b7565
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux · 62788b0f
      Linus Torvalds authored
      Pull ARM fix from Russell King:
      
       - clear stale KASan stack poison when a CPU resumes
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
        ARM: 9381/1: kasan: clear stale stack poison
      62788b0f
    • Linus Torvalds's avatar
      Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 1bbc9915
      Linus Torvalds authored
      Pull dentry leak fix from Al Viro:
       "Dentry leak fix in the qibfs driver that I forgot to send a pull
        request for ;-/
      
        My apologies - it actually sat in vfs.git#fixes for more than two
        months..."
      
      * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        qibfs: fix dentry leak
      1bbc9915
    • Samuel Thibault's avatar
      l2tp: Support several sockets with same IP/port quadruple · 628bc3e5
      Samuel Thibault authored
      Some l2tp providers will use 1701 as origin port and open several
      tunnels for the same origin and target. On the Linux side, this
      may mean opening several sockets, but then traffic will go to only
      one of them, losing the traffic for the tunnel of the other socket
      (or leaving it up to userland, consuming a lot of cpu%).
      
      This can also happen when the l2tp provider uses a cluster, and
      load-balancing happens to migrate from one origin IP to another one,
      for which a socket was already established. Reassigning tunnels
      from one socket to another would be very hairy for userland to manage.
      
      Lastly, as documented in l2tpconfig(1), as a client it may be necessary
      to use 1701 as origin port for odd firewall reasons, which could
      prevent establishing several tunnels to an l2tp server, for the
      same reason: traffic would arrive on only one of the two sockets.
      
      With the V2 protocol it is however easy to route traffic to the proper
      tunnel, by looking up the tunnel number in the network namespace. This
      fixes all three cases.
      Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Link: https://lore.kernel.org/r/20240506215336.1470009-1-samuel.thibault@ens-lyon.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      628bc3e5
    • Steffen Bätz's avatar
      net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports · 6e7ffa18
      Steffen Bätz authored
      On the mv88e6320 and 6321 switch family, ports 0/1 are serdes-only ports.
      Modify the mv88e6352_get_port4_serdes_cmode function to take a port
      number, since the register set of the 6352 is the same as on the 6320/21.
      Signed-off-by: Steffen Bätz <steffen@innosonix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Fabio Estevam <festevam@gmail.com>
      Link: https://lore.kernel.org/r/20240508072944.54880-3-steffen@innosonix.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      6e7ffa18
    • Steffen Bätz's avatar
      net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family · f39bf3cf
      Steffen Bätz authored
      As of commit de5c9bf4 ("net: phylink: require supported_interfaces to
      be filled"), Marvell 88e6320/21 switches fail to be probed:
      
      ...
      mv88e6085 30be0000.ethernet-1:00: phylink: error: empty supported_interfaces
      error creating PHYLINK: -22
      ...
      
      The problem stems from the use of mv88e6185_phylink_get_caps() to get
      the device capabilities.
      Since there are serdes only ports 0/1 included, create a new dedicated
      phylink_get_caps for the 6320 and 6321 to properly support their
      set of capabilities.
      
      Fixes: de5c9bf4 ("net: phylink: require supported_interfaces to be filled")
      Signed-off-by: Steffen Bätz <steffen@innosonix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Fabio Estevam <festevam@gmail.com>
      Link: https://lore.kernel.org/r/20240508072944.54880-2-steffen@innosonix.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      f39bf3cf
    • Paolo Abeni's avatar
      Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver' · 393ceeb9
      Paolo Abeni authored
      Jijie Shao says:
      
      ====================
      There are some bugfix for the HNS3 ethernet driver
      ====================
      
      Link: https://lore.kernel.org/r/20240507134224.2646246-1-shaojijie@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      393ceeb9
    • Yonglong Liu's avatar
      net: hns3: fix kernel crash when devlink reload during initialization · 35d92abf
      Yonglong Liu authored
      The devlink reload process will access the hardware resources,
      but the register operation is done before the hardware is initialized.
      So, processing the devlink reload during initialization may lead to kernel
      crash.
      
      This patch fixes this by registering the devlink after
      hardware initialization.
      
      Fixes: cd624299 ("net: hns3: add support for registering devlink for VF")
      Fixes: 93305b77 ("net: hns3: fix kernel crash when devlink reload during pf initialization")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      35d92abf
    • Yonglong Liu's avatar
      net: hns3: fix port vlan filter not disabled issue · f5db7a3b
      Yonglong Liu authored
      Due to a hardware limitation, a device that supports modifying the
      VLAN filter state but does not support bypassing the port VLAN filter
      should always disable the port VLAN filter. But the driver enables
      the port VLAN filter when initializing; if no VLAN id (except
      VLAN 0) has been added, the driver disables it in the service task.
      Most of the time this works fine. But there is a time window after
      the net device is registered and before the service task is scheduled.
      If the user adds a VLAN in this window, the driver does not update
      the VLAN filter state, and the port VLAN filter remains enabled.
      
      To fix the problem, if the device supports modifying the VLAN filter
      state but does not support bypassing the port VLAN filter, set the
      port VLAN filter to "off".
      
      Fixes: 184cd221 ("net: hns3: disable port VLAN filter when support function level VLAN filter control")
      Fixes: 2ba30662 ("net: hns3: add support for modify VLAN filter state")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      f5db7a3b
    • Peiyang Wang's avatar
      net: hns3: use appropriate barrier function after setting a bit value · 094c2812
      Peiyang Wang authored
      A memory barrier is needed in the following case. When setting the port
      down, hclgevf_set_timer_task will set DOWN in state. Meanwhile, the service
      task behaves differently based on whether the state is DOWN. Thus, to make
      sure the service task sees DOWN, use smp_mb__after_atomic after calling
      set_bit().
      
                CPU0                        CPU1
      ========================== ===================================
      hclgevf_set_timer_task()    hclgevf_periodic_service_task()
        set_bit(DOWN,state)         test_bit(DOWN,state)
      
      pf also has this issue.
      
      Fixes: ff200099 ("net: hns3: remove unnecessary work in hclgevf_main")
      Fixes: 1c6dfe6f ("net: hns3: remove mailbox and reset work in hclge_main")
      Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      094c2812
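The set_bit() plus barrier pattern from the commit above can be approximated in userspace with C11 atomics. This is only an analogy under stated assumptions, not kernel code: a relaxed atomic OR stands in for set_bit(), a sequentially consistent fence stands in for smp_mb__after_atomic(), and `STATE_DOWN`, `set_down()` and `is_down()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define STATE_DOWN (1u << 0)   /* hypothetical flag bit */

static atomic_uint state;      /* zero-initialized: not DOWN */

/* Writer side: set the flag, then issue a full fence so that later
 * memory accesses by this thread are ordered after the flag update. */
static void set_down(void)
{
    atomic_fetch_or_explicit(&state, STATE_DOWN, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}

/* Reader side: what a periodic service task would check. */
static bool is_down(void)
{
    return (atomic_load_explicit(&state, memory_order_relaxed) & STATE_DOWN) != 0;
}
```

The design point is that the atomic bit operation alone guarantees atomicity but not ordering against surrounding accesses; the explicit fence supplies the ordering.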
    • Peiyang Wang's avatar
      net: hns3: release PTP resources if pf initialization failed · 950aa423
      Peiyang Wang authored
      During the PF initialization process, hclge_update_port_info may return an
      error code for some reason. At this point, the PTP initialization has
      already completed. To avoid memory leaks, the resources allocated by PTP
      should be released. Therefore, when hclge_update_port_info returns an error
      code, hclge_ptp_uninit is called to release the corresponding resources.
      
      Fixes: eaf83ae5 ("net: hns3: add querying fec ability from firmware")
      Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      950aa423
    • Peiyang Wang's avatar
      net: hns3: change type of numa_node_mask as nodemask_t · 6639a7b9
      Peiyang Wang authored
      The kernel provides nodemask_t to describe a NUMA node mask. To
      improve portability, change the type of numa_node_mask to nodemask_t.
      
      Fixes: 38caee9d ("net: hns3: Add support of the HNAE3 framework")
      Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      6639a7b9
    • Jian Shen's avatar
      net: hns3: direct return when receive a unknown mailbox message · 669554c5
      Jian Shen authored
      Currently, the driver does not return when it receives an unknown
      mailbox message, and continues checking whether it needs to
      generate a response. This is unnecessary and may be incorrect.
      
      Fixes: bb5790b7 ("net: hns3: refactor mailbox response scheme between PF and VF")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      669554c5
    • Peiyang Wang's avatar
      net: hns3: using user configure after hardware reset · 05eb60e9
      Peiyang Wang authored
      When a reset occurs, the user's configuration is supposed to be restored.
      Currently, the port info (speed, duplex and autoneg) is stored in hclge_mac
      and is updated periodically. Consider the case where resets happen
      consecutively. During the first reset, the port info is configured with
      a temporary value because the PHY is reset and looking for the best link
      config. The second reset starts and uses the previous configuration, which
      is not the user's. The specific process is as follows:
      
      +------+               +----+                +----+
      | USER |               | PF |                | HW |
      +---+--+               +-+--+                +-+--+
          |  ethtool --reset   |                     |
          +------------------->|    reset command    |
          |  ethtool --reset   +-------------------->|
          +------------------->|                     +---+
          |                    +---+                 |   |
          |                    |   |reset currently  |   | HW RESET
          |                    |   |and wait to do   |   |
          |                    |<--+                 |   |
          |                    | send previous cfg   |<--+
          |                    | (1000M FULL AN_ON)  |
          |                    +-------------------->|
          |                    | read cfg(time task) |
          |                    | (10M HALF AN_OFF)   +---+
          |                    |<--------------------+   | cfg take effect
          |                    |    reset command    |<--+
          |                    +-------------------->|
          |                    |                     +---+
          |                    | send previous cfg   |   | HW RESET
          |                    | (10M HALF AN_OFF)   |<--+
          |                    +-------------------->|
          |                    | read cfg(time task) |
          |                    |  (10M HALF AN_OFF)  +---+
          |                    |<--------------------+   | cfg take effect
          |                    |                     |   |
          |                    | read cfg(time task) |<--+
          |                    |  (10M HALF AN_OFF)  |
          |                    |<--------------------+
          |                    |                     |
          v                    v                     v
      
      To avoid the above situation, this patch introduces req_speed, req_duplex
      and req_autoneg to store the user's configuration. They are used only
      after a hardware reset, to recover the user's configuration.
      
      Fixes: f5f2b3e4 ("net: hns3: add support for imp-controlled PHYs")
      Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      05eb60e9
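The idea in the commit above, keeping the user's requested settings apart from the periodically refreshed ones, can be sketched as below. This is a simplified model under stated assumptions, not the driver's data layout: `struct port_cfg`, `struct mac_state` and the three functions are hypothetical, and only the field names loosely mirror req_speed, req_duplex and req_autoneg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a port configuration. */
struct port_cfg {
    unsigned int speed;   /* Mb/s */
    bool full_duplex;
    bool autoneg;
};

struct mac_state {
    struct port_cfg cur;  /* refreshed periodically from hardware */
    struct port_cfg req;  /* what the user last asked for */
};

/* User changes the settings: remember them as the requested config. */
static void user_set(struct mac_state *m, struct port_cfg cfg)
{
    assert(m != NULL);
    m->req = cfg;
    m->cur = cfg;
}

/* Periodic task: may observe a temporary value while the PHY renegotiates. */
static void periodic_update(struct mac_state *m, struct port_cfg hw)
{
    assert(m != NULL);
    m->cur = hw;          /* req is deliberately left untouched */
}

/* After a hardware reset, restore from the requested config, not cur. */
static struct port_cfg cfg_after_reset(const struct mac_state *m)
{
    assert(m != NULL);
    return m->req;
}
```

In this model, a temporary 10M half-duplex reading during the first reset overwrites only `cur`, so a second reset would still restore the user's 1000M full-duplex setting.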
    • Wen Gu's avatar
      net/smc: fix neighbour and rtable leak in smc_ib_find_route() · 2ddc0dd7
      Wen Gu authored
      In smc_ib_find_route(), the neighbour found by neigh_lookup() and rtable
      resolved by ip_route_output_flow() are not released or put before return.
      It may cause the refcount leak, so fix it.
      
      Link: https://lore.kernel.org/r/20240506015439.108739-1-guwen@linux.alibaba.com
      Fixes: e5c4744c ("net/smc: add SMC-Rv2 connection establishment")
      Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20240507125331.2808-1-guwen@linux.alibaba.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2ddc0dd7
    • Jakub Kicinski's avatar
      Merge tag 'wireless-next-2024-05-08' of... · 83127eca
      Jakub Kicinski authored
      Merge tag 'wireless-next-2024-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
      
      Kalle Valo says:
      
      ====================
      wireless-next patches for v6.10
      
      The third, and most likely the last, "new features" pull request for
      v6.10 with changes both in stack and in drivers. In ath12k and rtw89
      we disabled Wireless Extensions just like with iwlwifi earlier. Wi-Fi
      7 devices will not support Wireless Extensions (WEXT) anymore so if
      someone is still using the legacy WEXT interface it's time to switch
      to nl80211 now!
      
      We merged wireless into wireless-next as we decided not to send a
      wireless pull request to v6.9 this late in the cycle. Also an
      immutable branch with MHI subsystem was merged to get ath11k and
      ath12k hibernation working.
      
      Major changes:
      
      mac80211/cfg80211
       * handle color change per link
      
      mt76
       * mt7921 LED control
       * mt7925 EHT radiotap support
       * mt7920e PCI support
      
      ath12k
       * debugfs support
       * dfs_simulate_radar debugfs file
       * disable Wireless Extensions
       * suspend and hibernation support
       * ACPI support
       * refactoring in preparation of multi-link support
      
      ath11k
       * support hibernation (required changes in qrtr and MHI subsystems)
       * ieee80211-freq-limit Device Tree property support
      
      ath10k
       * firmware-name Device Tree property support
      
      rtw89
       * complete features of new WiFi 7 chip 8922AE including BT-coexistence
         and WoWLAN
       * use BIOS ACPI settings to set TX power and channels
       * disable Wireless Extensions on Wi-Fi 7 devices
      
      iwlwifi
       * block_esr debugfs file
       * support again firmware API 90 (was reverted earlier)
       * provide channel survey information for Automatic Channel Selection (ACS)
      
      * tag 'wireless-next-2024-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (214 commits)
        wifi: mwl8k: initialize cmd->addr[] properly
        wifi: iwlwifi: Ensure prph_mac dump includes all addresses
        wifi: iwlwifi: mvm: don't request statistics in restart
        wifi: iwlwifi: mvm: exit EMLSR if secondary link is not used
        wifi: iwlwifi: mvm: add beacon template version 14
        wifi: iwlwifi: mvm: align UATS naming with firmware
        wifi: iwlwifi: Force SCU_ACTIVE for specific platforms
        wifi: iwlwifi: mvm: record and return channel survey information
        wifi: iwlwifi: mvm: add the firmware API for channel survey
        wifi: iwlwifi: mvm: Fix race in scan completion
        wifi: iwlwifi: mvm: Add a print for invalid link pair due to bandwidth
        wifi: iwlwifi: mvm: add a debugfs for reading EMLSR blocking reasons
        wifi: iwlwifi: mvm: Add active EMLSR blocking reasons prints
        wifi: iwlwifi: bump FW API to 90 for BZ/SC devices
        wifi: iwlwifi: mvm: fix primary link setting
        wifi: iwlwifi: mvm: use already determined cmd_id
        wifi: iwlwifi: mvm: don't reset link selection during restart
        wifi: iwlwifi: Print EMLSR states name
        wifi: iwlwifi: mvm: Block EMLSR when a p2p/softAP vif is active
        wifi: iwlwifi: mvm: fix typo in debug print
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20240508120726.85A10C113CC@smtp.kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      83127eca
    • Jakub Kicinski's avatar
      Merge branch 'netdevsim-add-napi-support' · d9308f51
      Jakub Kicinski authored
      David Wei says:
      
      ====================
      netdevsim: add NAPI support
      
      Add NAPI support to netdevsim and register its Rx queues with NAPI
      instances. Then add a selftest using the new netdev Python selftest
      infra to exercise the existing Netdev Netlink API, specifically the
      queue-get API.
      
      This expands test coverage and further fleshes out netdevsim as a test
      device. It's still my goal to make it useful for testing things like
      flow steering and ZC Rx.
      ====================
      
      Link: https://lore.kernel.org/r/20240507163228.2066817-1-dw@davidwei.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d9308f51
    • David Wei's avatar
      net: selftest: add test for netdev netlink queue-get API · 1cf27042
      David Wei authored
      Add a selftest for netdev generic netlink. For now there is only a
      single test that exercises the `queue-get` API.
      
      The test works with netdevsim by default or with a real device by
      setting NETIF.
      
      Add a timeout param to cmd() since ethtool -L can take a long time on
      real devices.
      Signed-off-by: David Wei <dw@davidwei.uk>
      Link: https://lore.kernel.org/r/20240507163228.2066817-3-dw@davidwei.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1cf27042
    • David Wei's avatar
      netdevsim: add NAPI support · 3762ec05
      David Wei authored
      Add NAPI support to netdevsim, similar to veth.
      
      * Add a nsim_rq rx queue structure to hold a NAPI instance and a skb
        queue.
      * During xmit, store the skb in the peer skb queue and schedule NAPI.
      * During napi_poll(), drain the skb queue and pass up the stack.
      * Add assoc between rxq and NAPI instance using netif_queue_set_napi().
      Signed-off-by: David Wei <dw@davidwei.uk>
      Link: https://lore.kernel.org/r/20240507163228.2066817-2-dw@davidwei.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3762ec05
    • Willem de Bruijn's avatar
      selftests: drv-net: add checksum tests · 1d0dc857
      Willem de Bruijn authored
      Run tools/testing/selftests/net/csum.c as part of drv-net.
      This binary covers multiple scenarios, based on arguments given,
      for both IPv4 and IPv6:
      
      - Accept UDP correct checksum
      - Detect UDP invalid checksum
      - Accept TCP correct checksum
      - Detect TCP invalid checksum
      
      - Transmit UDP: basic checksum offload
      - Transmit UDP: zero checksum conversion
      
      The test direction is reversed between receive and transmit tests, so
      that the NIC under test is always the local machine.
      
      In total this adds up to 12 testcases, with more to follow. For
      conciseness, I replaced individual functions with a function factory.
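
      As an illustration of the function-factory idea, here is a minimal
      sketch. It is not the code from csum.py: SCENARIOS, make_test() and the
      `run` callback are hypothetical names chosen for this example.

```python
import itertools

# Hypothetical labels for the six scenarios listed above.
SCENARIOS = (
    "rx_udp",
    "rx_udp_invalid",
    "rx_tcp",
    "rx_tcp_invalid",
    "tx_udp_offload",
    "tx_udp_zero",
)


def make_test(ipver, scenario):
    """Return a test function bound to one (IP version, scenario) pair."""

    def test(run):
        # `run` is whatever actually executes the checksum binary.
        return run(ipver, scenario)

    test.__name__ = f"ipv{ipver}_{scenario}"
    return test


# Two address families times six scenarios gives the 12 test cases.
TESTS = [make_test(v, s) for v, s in itertools.product((4, 6), SCENARIOS)]
```

      Each generated function carries its own name, so a test runner can
      report the cases individually even though they share one body.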
      
      Also detect hardware offload feature availability using Ethtool
      netlink and skip tests when either feature is off. This need may be
      common for offload feature tests and eventually deserving of a thin
      wrapper in lib.py.
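
      One possible shape for such a wrapper is sketched below. This is an
      assumption, not the lib.py API: SkipTest and require_features() are
      hypothetical, and the plain dict stands in for a feature query answered
      over Ethtool netlink.

```python
class SkipTest(Exception):
    """Raised when a test cannot run in the current configuration."""


def require_features(features, *needed):
    """Raise SkipTest unless every feature in `needed` is enabled.

    `features` maps feature names to booleans; a missing key counts as off.
    """
    missing = [name for name in needed if not features.get(name, False)]
    if missing:
        raise SkipTest("feature(s) off: " + ", ".join(missing))


# Placeholder feature state; the names are examples only.
features = {"rx-checksum": True, "tx-checksum-ip-generic": False}

require_features(features, "rx-checksum")  # passes silently

try:
    require_features(features, "tx-checksum-ip-generic")
    skipped = None
except SkipTest as exc:
    skipped = str(exc)
```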
      
      Missing are the PF_PACKET based send tests ('-P'). These use
      virtio_net_hdr to program hardware checksum offload, which requires
      looking up the local MAC address and (harder) the MAC of the next hop.
      I'll have to give some thought to how to do that robustly and where
      that code would belong.
      
      Tested:
      
              make -C tools/testing/selftests/ \
                      TARGETS="drivers/net drivers/net/hw" \
                      install INSTALL_PATH=/tmp/ksft
              cd /tmp/ksft
      
              sudo NETIF=ens4 REMOTE_TYPE=ssh \
                      REMOTE_ARGS="root@10.40.0.2" \
                      LOCAL_V4="10.40.0.1" \
                      REMOTE_V4="10.40.0.2" \
                      ./run_kselftest.sh -t drivers/net/hw:csum.py
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20240507154216.501111-1-willemdebruijn.kernel@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1d0dc857
    • Eric Dumazet's avatar
      ipv6: prevent NULL dereference in ip6_output() · 4db783d6
      Eric Dumazet authored
      According to syzbot, there is a chance that ip6_dst_idev()
      returns NULL in ip6_output(). Most places in IPv6 stack
      deal with a NULL idev just fine, but not here.
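
      The guard pattern can be shown with a small userspace model. This is a
      sketch only, not the kernel change: Inet6Dev and output_verdict() are
      hypothetical, and it assumes a missing idev is handled like an interface
      with IPv6 disabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inet6Dev:
    """Hypothetical stand-in for the per-device IPv6 state."""

    disable_ipv6: bool = False


def output_verdict(idev: Optional[Inet6Dev]) -> str:
    # Test for a missing idev first; only then is it safe to read its fields.
    if idev is None or idev.disable_ipv6:
        return "drop"
    return "xmit"
```

      Reading idev.disable_ipv6 without the None test would raise an
      AttributeError here, the Python counterpart of the fault shown below.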
      
      syzbot reported:
      
      general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7]
      CPU: 0 PID: 9775 Comm: syz-executor.4 Not tainted 6.9.0-rc5-syzkaller-00157-g6a30653b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
       RIP: 0010:ip6_output+0x231/0x3f0 net/ipv6/ip6_output.c:237
      Code: 3c 1e 00 49 89 df 74 08 4c 89 ef e8 19 58 db f7 48 8b 44 24 20 49 89 45 00 49 89 c5 48 8d 9d e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 4c 8b 74 24 28 0f 85 61 01 00 00 8b 1b 31 ff
      RSP: 0018:ffffc9000927f0d8 EFLAGS: 00010202
      RAX: 00000000000000bc RBX: 00000000000005e0 RCX: 0000000000040000
      RDX: ffffc900131f9000 RSI: 0000000000004f47 RDI: 0000000000004f48
      RBP: 0000000000000000 R08: ffffffff8a1f0b9a R09: 1ffffffff1f51fad
      R10: dffffc0000000000 R11: fffffbfff1f51fae R12: ffff8880293ec8c0
      R13: ffff88805d7fc000 R14: 1ffff1100527d91a R15: dffffc0000000000
      FS:  00007f135c6856c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000080 CR3: 0000000064096000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
        NF_HOOK include/linux/netfilter.h:314 [inline]
        ip6_xmit+0xefe/0x17f0 net/ipv6/ip6_output.c:358
        sctp_v6_xmit+0x9f2/0x13f0 net/sctp/ipv6.c:248
        sctp_packet_transmit+0x26ad/0x2ca0 net/sctp/output.c:653
        sctp_packet_singleton+0x22c/0x320 net/sctp/outqueue.c:783
        sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline]
        sctp_outq_flush+0x6d5/0x3e20 net/sctp/outqueue.c:1212
        sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
        sctp_do_sm+0x59cc/0x60c0 net/sctp/sm_sideeffect.c:1169
        sctp_primitive_ASSOCIATE+0x95/0xc0 net/sctp/primitive.c:73
        __sctp_connect+0x9cd/0xe30 net/sctp/socket.c:1234
        sctp_connect net/sctp/socket.c:4819 [inline]
        sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
        __sys_connect_file net/socket.c:2048 [inline]
        __sys_connect+0x2df/0x310 net/socket.c:2065
        __do_sys_connect net/socket.c:2075 [inline]
        __se_sys_connect net/socket.c:2072 [inline]
        __x64_sys_connect+0x7a/0x90 net/socket.c:2072
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 778d80be ("ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
      Link: https://lore.kernel.org/r/20240507161842.773961-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4db783d6
    • Lukasz Majewski's avatar
      hsr: Simplify code for announcing HSR nodes timer setup · 4893b8b3
      Lukasz Majewski authored
      Until now, the code that starts the HSR announce timer, which triggers
      the sending of supervisory frames, assumed that hsr_netdev_notify() would
      be called at least twice for an hsrX interface. This was required so that
      the old and current values of the network device's operstate would differ.
      
      This is a problem when the hsrX interface is already in the operational
      state when hsr_netdev_notify() is called: the timer is never configured
      to trigger and, as a result, hsrX does not send supervisory frames to
      the HSR ring.
      
      This error was discovered when running the hsr_ping.sh script. More
      specifically, for hsr1 and hsr2, hsr_netdev_notify() was called at least
      twice with different IF_OPER_{LOWERDOWN|DOWN|UP} states assigned in
      hsr_check_carrier_and_operstate(hsr), so supervisory frames were sent
      without issue.
      However, for hsr3, the notify function was called only once, with
      operstate set to IF_OPER_UP, and the timer responsible for triggering
      supervisory frames was never started.
      
      The solution is to use the netif_oper_up() and netif_running() helper
      functions to assess whether the hsrX network device is up.
      Only then, and only if the timer is not already pending, is it started.
      Otherwise it is deactivated.
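
      The decision described above can be sketched as a small userspace model.
      It is not the kernel code: AnnounceTimer and update_announce_timer() are
      hypothetical, and the two booleans stand in for the results of
      netif_running() and netif_oper_up().

```python
class AnnounceTimer:
    """Minimal timer model that only tracks whether it is pending."""

    def __init__(self):
        self.pending = False
        self.starts = 0

    def start(self):
        self.pending = True
        self.starts += 1

    def stop(self):
        self.pending = False


def update_announce_timer(timer, running, oper_up):
    """Decide from the current device state alone, with no old-state compare."""
    if running and oper_up:
        if not timer.pending:
            timer.start()
    else:
        timer.stop()


timer = AnnounceTimer()
# The hsr3 case: a single notification with the device already up.
update_announce_timer(timer, running=True, oper_up=True)
started_on_first_notify = timer.pending
# A repeated notification must not re-arm a pending timer.
update_announce_timer(timer, running=True, oper_up=True)
```

      Because the check no longer depends on a previous operstate value, a
      single notification is enough to arm the timer.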
      
      Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
      Signed-off-by: Lukasz Majewski <lukma@denx.de>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240507111214.3519800-1-lukma@denx.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4893b8b3
    • Eric Dumazet's avatar
      phonet: no longer hold RTNL in route_dumpit() · 58a4ff5d
      Eric Dumazet authored
      route_dumpit() already relies on RCU, RTNL is not needed.
      
      Also change return value at the end of a dump.
      This allows NLMSG_DONE to be appended to the current
      skb at the end of a dump, saving a couple of recvmsg()
      system calls.
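
      The effect of the return value can be illustrated with a simplified
      model of a netlink dump. This is an assumption-laden sketch, not the
      netlink core: recvmsg_calls_for_dump() is hypothetical, and it ignores
      whether the skb still has room for NLMSG_DONE.

```python
def recvmsg_calls_for_dump(num_routes, per_skb, return_zero_at_end):
    """Count the recvmsg() calls needed to read a whole dump in this model.

    Each loop iteration is one recvmsg(): the dump callback fills one skb
    and returns a value. A positive value means "call me again"; zero lets
    NLMSG_DONE be appended to the current skb, which ends the dump.
    """
    calls = 0
    pos = 0
    while True:
        calls += 1
        filled = min(per_skb, num_routes - pos)
        pos += filled
        finished = pos >= num_routes
        if filled and not (finished and return_zero_at_end):
            ret = filled  # positive: another recvmsg() is required
        else:
            ret = 0
        if ret == 0:
            return calls


old = recvmsg_calls_for_dump(5, per_skb=10, return_zero_at_end=False)
new = recvmsg_calls_for_dump(5, per_skb=10, return_zero_at_end=True)
```

      In this model the old behaviour needs an extra, empty round trip just to
      deliver NLMSG_DONE, while the new one finishes in the same skb.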
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Remi Denis-Courmont <courmisch@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240507121748.416287-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      58a4ff5d
    • Eric Dumazet's avatar
      net: annotate data-races around dev->if_port · 8d8b1a42
      Eric Dumazet authored
      Various ndo_set_config() methods can change dev->if_port.
      
      dev->if_port is going to be read locklessly from
      rtnl_fill_link_ifmap().
      
      Add corresponding WRITE_ONCE() on writer sides.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240507184144.1230469-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8d8b1a42
    • Eric Dumazet's avatar
      ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action() · d101291b
      Eric Dumazet authored
      syzbot is able to trigger the following crash [1],
      caused by unsafe ip6_dst_idev() use.
      
      Indeed ip6_dst_idev() can return NULL, and must always be checked.
      
      [1]
      
      Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
       RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline]
       RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267
      Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c
      RSP: 0018:ffffc9000fc1f2f0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1a772f98c8186700
      RDX: 0000000000000003 RSI: ffffffff8bcac4e0 RDI: ffffffff8c1f9760
      RBP: ffff8880673fb980 R08: ffffffff8fac15ef R09: 1ffffffff1f582bd
      R10: dffffc0000000000 R11: fffffbfff1f582be R12: dffffc0000000000
      R13: 0000000000000080 R14: ffff888076509000 R15: ffff88807a029a00
      FS:  00007f55e82ca6c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b31d23000 CR3: 0000000022b66000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
        fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317
        fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108
        ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline]
        ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649
        ip6_route_output include/net/ip6_route.h:93 [inline]
        ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120
        ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250
        sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326
        sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455
        sctp_assoc_add_peer+0x614/0x15c0 net/sctp/associola.c:662
        sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099
        __sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197
        sctp_connect net/sctp/socket.c:4819 [inline]
        sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
        __sys_connect_file net/socket.c:2048 [inline]
        __sys_connect+0x2df/0x310 net/socket.c:2065
        __do_sys_connect net/socket.c:2075 [inline]
        __se_sys_connect net/socket.c:2072 [inline]
        __x64_sys_connect+0x7a/0x90 net/socket.c:2072
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 5e5f3f0f ("[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240507163145.835254-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d101291b
    • Eric Dumazet's avatar
      net: dst_cache: minor optimization in dst_cache_set_ip6() · e2d09e5a
      Eric Dumazet authored
      There is no need to use this_cpu_ptr(dst_cache->cache) twice.
      
      Compiler is unable to optimize the second call, because of
      per-cpu constraints.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/20240507132717.627518-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e2d09e5a
    • Eric Dumazet's avatar
      net: dst_cache: annotate data-races around dst_cache->reset_ts · 3b09b2bd
      Eric Dumazet authored
      dst_cache->reset_ts is read or written locklessly,
      add READ_ONCE() and WRITE_ONCE() annotations.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/20240507132000.614591-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3b09b2bd