1. 17 Aug, 2022 6 commits
  2. 16 Aug, 2022 5 commits
  3. 15 Aug, 2022 11 commits
    • Merge branch 'mlxsw-fixes' · 5061e34c
      David S. Miller authored
      Petr Machata says:
      
      ====================
      mlxsw: Fixes for PTP support
      
      This set fixes several issues in mlxsw PTP code.
      
      - Patch #1 fixes compilation warnings.
      
      - Patch #2 adjusts the order of operation during cleanup, thereby
        closing the window after PTP state was already cleaned in the ASIC
        for the given port, but before the port is removed, when the user
        could still in theory make changes to the configuration.
      
      - Patch #3 protects the PTP configuration with a custom mutex, instead
        of relying on RTNL, which is not held in all access paths.
      
      - Patch #4 forbids enablement of PTP only in RX or only in TX. The
        driver implicitly assumed this would be the case, but neglected to
        sanitize the configuration.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5061e34c
    • mlxsw: spectrum_ptp: Forbid PTP enablement only in RX or in TX · e01885c3
      Amit Cohen authored
      Currently the mlxsw driver uses one global PTP configuration for all
      ports. The reason is that the switch behaves like a transparent clock
      between the CPU port and the front-panel ports. When time stamping is
      enabled on any port, the hardware is configured to update the correction
      field. Because the configuration of the CPU port affects all ports, the
      correction field update has to be global for all ports. Otherwise, the
      user will see odd values in the correction field, as the switch will
      update it in the CPU port, but not in all the front-panel ports.
      
      The CPU port is relevant in both RX and TX, so to avoid a problematic
      configuration, forbid enabling PTP in only one direction, i.e., only in
      RX or only in TX.
      
      Without the change:
      $ hwstamp_ctl -i swp1 -r 12 -t 0
      current settings:
      tx_type 0
      rx_filter 0
      new settings:
      tx_type 0
      rx_filter 2
      $ echo $?
      0
      
      With the change:
      $ hwstamp_ctl -i swp1 -r 12 -t 0
      current settings:
      tx_type 1
      rx_filter 2
      SIOCSHWTSTAMP failed: Invalid argument
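
The check described above can be sketched in userspace C. This is only an illustration, not the driver code: the struct and function names are hypothetical, and the sketch assumes that a value of 0 means "off" for both fields, as in the hwstamp_ctl output. The -EINVAL return corresponds to the "Invalid argument" error shown above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two relevant hwtstamp_config fields.
 * Assumption: 0 means time stamping is off in that direction. */
struct ptp_cfg_sketch {
	int tx_type;
	int rx_filter;
};

/* Return 0 when both directions are enabled or both are disabled,
 * -EINVAL when only RX or only TX time stamping is requested. */
int ptp_cfg_validate_sketch(const struct ptp_cfg_sketch *cfg)
{
	bool tx_on = cfg->tx_type != 0;
	bool rx_on = cfg->rx_filter != 0;

	if (tx_on != rx_on)
		return -EINVAL;
	return 0;
}
```

A request such as `-r 12 -t 0` maps to rx_filter = 12, tx_type = 0, which this check rejects.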
      
      Fixes: 08ef8bc8 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e01885c3
    • mlxsw: spectrum_ptp: Protect PTP configuration with a mutex · d72fdef2
      Amit Cohen authored
      Currently the functions mlxsw_sp2_ptp_{configure, deconfigure}_port()
      assume that they are called when RTNL is locked and they warn otherwise.
      
      The deconfigure function can be called when a port is removed, for
      example as part of a device reload. RTNL is not held on that path, so
      the function warns [1].
      
      To avoid this, do not assume that RTNL protects this code; add a
      dedicated mutex instead. The mutex protects 'ptp_state->config', which
      stores the existing global configuration in hardware. Use this mutex also
      to protect the code which configures the hardware. Then there is only
      one configuration at any time, which is updated in 'ptp_state', and the
      race is avoided.
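
As a rough userspace analogy (pthread mutex instead of a kernel mutex), the pattern is to take one dedicated lock around both the cached configuration and the hardware update. All names below are hypothetical, and the "hardware write" is simulated by a counter.

```c
#include <pthread.h>

/* Hypothetical stand-in for the driver's PTP state. */
struct ptp_state_sketch {
	pthread_mutex_t lock;  /* protects config and the hardware update */
	int config;            /* last configuration written to hardware */
	int hw_writes;         /* counts simulated hardware updates */
};

/* Apply a new global configuration. The cached value and the
 * (simulated) hardware write are updated under the same lock, so
 * callers that do not hold any outer lock cannot race. */
int ptp_configure_sketch(struct ptp_state_sketch *s, int new_cfg)
{
	pthread_mutex_lock(&s->lock);
	if (s->config != new_cfg) {
		s->hw_writes++;        /* stands in for the register write */
		s->config = new_cfg;
	}
	pthread_mutex_unlock(&s->lock);
	return 0;
}
```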
      
      [1]:
      RTNL: assertion failed at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c (1600)
      WARNING: CPU: 1 PID: 1583493 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1600 mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300 [mlxsw_spectrum]
      [...]
      CPU: 1 PID: 1583493 Comm: devlink Not tainted 5.19.0-rc8-custom-127022-gb371dffda095 #789
      Hardware name: Mellanox Technologies Ltd. MSN3420/VMOD0005, BIOS 5.11 01/06/2019
      RIP: 0010:mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300 [mlxsw_spectrum]
      [...]
      Call Trace:
       <TASK>
       mlxsw_sp_port_remove+0x7e/0x190 [mlxsw_spectrum]
       mlxsw_sp_fini+0xd1/0x270 [mlxsw_spectrum]
       mlxsw_core_bus_device_unregister+0x55/0x280 [mlxsw_core]
       mlxsw_devlink_core_bus_device_reload_down+0x1c/0x30 [mlxsw_core]
       devlink_reload+0x1ee/0x230
       devlink_nl_cmd_reload+0x4de/0x580
       genl_family_rcv_msg_doit+0xdc/0x140
       genl_rcv_msg+0xd7/0x1d0
       netlink_rcv_skb+0x49/0xf0
       genl_rcv+0x1f/0x30
       netlink_unicast+0x22f/0x350
       netlink_sendmsg+0x208/0x440
       __sys_sendto+0xf0/0x140
       __x64_sys_sendto+0x1b/0x20
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 08ef8bc8 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
      Reported-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d72fdef2
    • mlxsw: spectrum: Clear PTP configuration after unregistering the netdevice · a159e986
      Amit Cohen authored
      Currently as part of removing port, PTP API is called to clear the
      existing configuration and set the 'rx_filter' and 'tx_type' to zero.
      The clearing is done before unregistering the netdevice, which means that
      there is a window of time in which the user can reconfigure PTP in the
      port, and this configuration will not be cleared.
      
      Reorder the operations, clear PTP configuration after unregistering the
      netdevice.
      
      Fixes: 87486427 ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a159e986
    • mlxsw: spectrum_ptp: Fix compilation warnings · 12e09138
      Amit Cohen authored
      In case that 'CONFIG_PTP_1588_CLOCK' is not enabled in the config file,
      there are implementations for the functions
      mlxsw_{sp,sp2}_ptp_txhdr_construct() as part of 'spectrum_ptp.h'. In this
      case, they should be defined as 'static' as they are not supposed to be
      used out of this file. Make the functions 'static', otherwise the following
      warnings are returned:
      
      "warning: no previous prototype for 'mlxsw_sp_ptp_txhdr_construct'"
      "warning: no previous prototype for 'mlxsw_sp2_ptp_txhdr_construct'"
      
      In addition, make the functions 'inline'. If 'spectrum_ptp.h' is
      included anywhere else and the functions are not used there, plain
      'static' definitions would trigger compilation warnings about unused
      static functions.
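
A minimal sketch of the header pattern, under stated assumptions: CONFIG_PTP_SKETCH and the function name are hypothetical stand-ins, and the -EOPNOTSUPP return value of the stub is an assumption, not taken from the commit.

```c
#include <errno.h>
#include <stddef.h>

/* CONFIG_PTP_SKETCH is a hypothetical stand-in for CONFIG_PTP_1588_CLOCK. */
#ifdef CONFIG_PTP_SKETCH
/* Real implementation lives in a .c file; only the prototype is here. */
int ptp_txhdr_construct_sketch(void *skb);
#else
/* Stub defined in the header itself. 'static' gives it internal linkage,
 * so there is no "no previous prototype" warning; 'inline' keeps the
 * compiler quiet when an including file never calls it. */
static inline int ptp_txhdr_construct_sketch(void *skb)
{
	(void)skb;
	return -EOPNOTSUPP;	/* assumed return value for the stub */
}
#endif
```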
      
      Fixes: 24157bc6 ("mlxsw: Send PTP packets as data packets to overcome a limitation")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      12e09138
    • net_sched: cls_route: disallow handle of 0 · 02799571
      Jamal Hadi Salim authored
      Follows up on:
      https://lore.kernel.org/all/20220809170518.164662-1-cascardo@canonical.com/
      
      handle of 0 implies from/to of universe realm which is not very
      sensible.
      
      Let's see what this patch will do:
      $ sudo tc qdisc add dev $DEV root handle 1:0 prio
      
      // let's manufacture a way to insert handle of 0
      $ sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 \
      route to 0 from 0 classid 1:10 action ok
      
      // gets rejected...
      Error: handle of 0 is not valid.
      We have an error talking to the kernel, -1
      
      // let's create a legit entry..
      $ sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 route from 10 \
      classid 1:10 action ok
      
      // what did the kernel insert?
      $ sudo tc filter ls dev $DEV parent 1:0
      filter protocol ip pref 100 route chain 0
      filter protocol ip pref 100 route chain 0 fh 0x000a8000 flowid 1:10 from 10
      	action order 1: gact action pass
      	 random type none pass val 0
      	 index 1 ref 1 bind 1
      
      // Let's try to replace that legit entry with a handle of 0
      $ sudo tc filter replace dev $DEV parent 1:0 protocol ip prio 100 \
      handle 0x000a8000 route to 0 from 0 classid 1:10 action drop
      
      Error: Replacing with handle of 0 is invalid.
      We have an error talking to the kernel, -1
      
      And last, let's run Cascardo's POC:
      $ ./poc
      0
      0
      -22
      -22
      -22
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02799571
    • net: fix potential refcount leak in ndisc_router_discovery() · 7396ba87
      Xin Xiong authored
      The issue happens on specific paths in the function. After both
      objects `rt` and `neigh` are grabbed successfully, when `lifetime` is
      nonzero but the metric needs to change, the function just deletes the
      route and sets `rt` to NULL. Then, it may try grabbing `rt` and `neigh`
      again if the above conditions hold. The function simply overwrites
      `neigh` on success, or returns on failure, without decreasing the
      reference count of the previous `neigh`. This may result in memory leaks.
      
      Fix it by decrementing the reference count of `neigh` in place.
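
The leak pattern and its fix can be illustrated with a toy reference counter. This is a sketch with hypothetical names, not the ndisc code: the point is only that the old reference must be dropped before the pointer is overwritten.

```c
#include <stddef.h>

/* Hypothetical reference-counted object standing in for `neigh`. */
struct neigh_sketch {
	int refcnt;
};

struct neigh_sketch *neigh_hold_sketch(struct neigh_sketch *n)
{
	if (n)
		n->refcnt++;
	return n;
}

void neigh_release_sketch(struct neigh_sketch *n)
{
	if (n)
		n->refcnt--;
}

/* Re-grab pattern: release the reference we already hold before the
 * pointer is overwritten. Skipping the release is the leak. */
struct neigh_sketch *neigh_regrab_sketch(struct neigh_sketch *old,
					 struct neigh_sketch *fresh)
{
	neigh_release_sketch(old);
	return neigh_hold_sketch(fresh);
}
```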
      
      Fixes: 6b2e04bc ("net: allow user to set metric on default route learned via Router Advertisement")
      Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
      Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7396ba87
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 27b8d4d7
      David S. Miller authored
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-08-11 (ice)
      
      This series contains updates to ice driver only.
      
      Benjamin corrects a misplaced parenthesis for a WARN_ON check.
      
      Michal removes WARN_ON from a check as it's recoverable and does not
      warrant a call trace.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      27b8d4d7
    • neighbour: make proxy_queue.qlen limit per-device · 0ff4eb3d
      Alexander Mikhalitsyn authored
      Right now we have a neigh_param PROXY_QLEN which specifies maximum length
      of neigh_table->proxy_queue. But in fact, this limitation doesn't work well
      because check condition looks like:
      tbl->proxy_queue.qlen > NEIGH_VAR(p, PROXY_QLEN)
      
      The problem is that p (struct neigh_parms) is a per-device thing,
      but tbl (struct neigh_table) is a system-wide global thing.
      
      It seems reasonable to make proxy_queue limit per-device based.
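
A small sketch of the idea, assuming a per-device counter is kept next to the per-device limit (the struct and field names are hypothetical; the `>` comparison mirrors the condition quoted above):

```c
#include <stdbool.h>

/* Hypothetical per-device parameters: limit plus a per-device counter. */
struct parms_sketch {
	int proxy_qlen_limit;  /* stands in for NEIGH_VAR(p, PROXY_QLEN) */
	int qlen;              /* packets this device has in the proxy queue */
};

/* Hypothetical system-wide table with one shared queue length. */
struct table_sketch {
	int proxy_queue_qlen;
};

/* Enqueue decision based on the per-device counter, so one busy
 * device cannot exhaust the limit for every other device. */
bool proxy_enqueue_sketch(struct table_sketch *tbl, struct parms_sketch *p)
{
	if (p->qlen > p->proxy_qlen_limit)
		return false;          /* drop: this device is over its limit */
	p->qlen++;
	tbl->proxy_queue_qlen++;
	return true;
}
```

With the old global comparison, a second device would be refused as soon as the shared queue grew past its own limit, even if it had nothing queued itself.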
      
      v2:
      	- nothing changed in this patch
      v3:
      	- rebase to net tree
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Yajun Deng <yajun.deng@linux.dev>
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
      Cc: Konstantin Khorenko <khorenko@virtuozzo.com>
      Cc: kernel@openvz.org
      Cc: devel@openvz.org
      Suggested-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
      Reviewed-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ff4eb3d
    • neigh: fix possible DoS due to net iface start/stop loop · 66ba215c
      Denis V. Lunev authored
      Normal processing of an ARP request (usually an Ethernet broadcast
      packet) arriving at the host looks like this:
      * the packet comes to arp_process() call and is passed through routing
        procedure
      * the request is put into the queue using pneigh_enqueue() if
        corresponding ARP record is not local (common case for container
        records on the host)
      * the request is processed by timer (within 80 jiffies by default) and
        ARP reply is sent from the same arp_process() using
        NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED condition (flag is set inside
        pneigh_enqueue())
      
      Here is the problem: the Linux kernel calls pneigh_queue_purge(),
      which destroys the whole queue of ARP requests, on ANY network interface
      start/stop event through __neigh_ifdown().
      
      This was not a problem originally, as network interface start/stop
      was accessible only to the host 'root', which could do more
      destructive things anyway. But the world has changed and Linux
      containers are now available. A container's 'root' has access to this
      API and should be considered an untrusted user in the container
      hosting world.
      
      Thus there is an attack vector against other containers on the node:
      a container's root can endlessly start/stop interfaces. We have observed
      a similar situation on a real production node, where a docker container
      was doing exactly that and other containers on the node became
      inaccessible.
      
      The proposed patch does a very simple thing. It drops only packets from
      the same namespace in pneigh_queue_purge(), where the network interface
      state change is detected. This is enough to prevent the problem for the
      whole node while preserving the original semantics of the code.
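
The selective purge can be sketched with a plain array instead of an skb queue. The names and the integer namespace id are hypothetical; the kernel code works on a linked queue and compares network namespaces.

```c
#include <stddef.h>

/* Hypothetical queued packet tagged with the namespace it belongs to. */
struct pkt_sketch {
	int netns_id;
};

/* Drop only the entries that belong to `netns_id`, compacting the
 * array in place. Returns the number of entries that remain. */
size_t queue_purge_ns_sketch(struct pkt_sketch *q, size_t len, int netns_id)
{
	size_t kept = 0;

	for (size_t i = 0; i < len; i++) {
		if (q[i].netns_id != netns_id)
			q[kept++] = q[i];
	}
	return kept;
}
```

Requests queued by other namespaces survive an interface start/stop in the purging namespace.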
      
      v2:
      	- do del_timer_sync() if queue is empty after pneigh_queue_purge()
      v3:
      	- rebase to net tree
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Yajun Deng <yajun.deng@linux.dev>
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
      Cc: Konstantin Khorenko <khorenko@virtuozzo.com>
      Cc: kernel@openvz.org
      Cc: devel@openvz.org
      Investigated-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      66ba215c
    • net: qrtr: start MHI channel after endpoit creation · 68a838b8
      Maxim Kochetkov authored
      An MHI channel may generate an event/interrupt right after it is enabled.
      This may lead to 2 race conditions.
      
      1)
      Such event may be dropped by qcom_mhi_qrtr_dl_callback() at check:
      
      	if (!qdev || mhi_res->transaction_status)
      		return;
      
      Because dev_set_drvdata(&mhi_dev->dev, qdev) may not have been performed
      at this moment. In this situation qrtr-ns will be unable to enumerate
      services in the device.
      ---------------------------------------------------------------
      
      2)
      Such an event may arrive after dev_set_drvdata() and before
      qrtr_endpoint_register(). In this case the kernel will panic on
      accessing a wrong pointer in qcom_mhi_qrtr_dl_callback():
      
      	rc = qrtr_endpoint_post(&qdev->ep, mhi_res->buf_addr,
      				mhi_res->bytes_xferd);
      
      Because endpoint is not created yet.
      --------------------------------------------------------------
      So move mhi_prepare_for_transfer_autoqueue after endpoint creation
      to fix it.
      
      Fixes: a2e2cc0d ("net: qrtr: Start MHI channels during init")
      Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
      Reviewed-by: Hemant Kumar <quic_hemantk@quicinc.com>
      Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
      Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68a838b8
  4. 13 Aug, 2022 3 commits
  5. 12 Aug, 2022 12 commits
    • iavf: Fix deadlock in initialization · cbe9e511
      Ivan Vecera authored
      Fix deadlock that occurs when iavf interface is a part of failover
      configuration.
      
      1. Mutex crit_lock is taken at the beginning of iavf_watchdog_task()
      2. Function iavf_init_config_adapter() is called when adapter
         state is __IAVF_INIT_CONFIG_ADAPTER
      3. iavf_init_config_adapter() calls register_netdevice() that emits
         NETDEV_REGISTER event
      4. Notifier function failover_event() then calls
         net_failover_slave_register() that calls dev_open()
      5. dev_open() calls iavf_open() that tries to take crit_lock in an
         endless loop
      
      Stack trace:
      ...
      [  790.251876]  usleep_range_state+0x5b/0x80
      [  790.252547]  iavf_open+0x37/0x1d0 [iavf]
      [  790.253139]  __dev_open+0xcd/0x160
      [  790.253699]  dev_open+0x47/0x90
      [  790.254323]  net_failover_slave_register+0x122/0x220 [net_failover]
      [  790.255213]  failover_slave_register.part.7+0xd2/0x180 [failover]
      [  790.256050]  failover_event+0x122/0x1ab [failover]
      [  790.256821]  notifier_call_chain+0x47/0x70
      [  790.257510]  register_netdevice+0x20f/0x550
      [  790.258263]  iavf_watchdog_task+0x7c8/0xea0 [iavf]
      [  790.259009]  process_one_work+0x1a7/0x360
      [  790.259705]  worker_thread+0x30/0x390
      
      To fix the situation we should check the current adapter state after
      first unsuccessful mutex_trylock() and return with -EBUSY if it is
      __IAVF_INIT_CONFIG_ADAPTER.
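
A userspace sketch of the bail-out, under stated assumptions: the lock is modeled with a C11 atomic_flag rather than a kernel mutex, all names are hypothetical, and the retry bound with its -EAGAIN result exists only so the sketch terminates (the driver loops with a short sleep instead).

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

enum state_sketch { STATE_INIT_CONFIG_ADAPTER, STATE_DOWN };

/* Hypothetical adapter: a try-lock plus the current state. */
struct adapter_sketch {
	atomic_flag crit_lock;     /* set while the lock is held */
	enum state_sketch state;
};

bool crit_trylock_sketch(struct adapter_sketch *a)
{
	return !atomic_flag_test_and_set(&a->crit_lock);
}

void crit_unlock_sketch(struct adapter_sketch *a)
{
	atomic_flag_clear(&a->crit_lock);
}

/* Open path: if the lock is busy and the adapter is still in the
 * init-config state, the lock holder is the one calling us, so
 * waiting would never end. Return -EBUSY instead of spinning. */
int open_sketch(struct adapter_sketch *a, int max_retries)
{
	for (int i = 0; i < max_retries; i++) {
		if (crit_trylock_sketch(a)) {
			/* ... bring the interface up ... */
			crit_unlock_sketch(a);
			return 0;
		}
		if (a->state == STATE_INIT_CONFIG_ADAPTER)
			return -EBUSY;
		/* the driver sleeps briefly here before retrying */
	}
	return -EAGAIN;	/* sketch-only bound, not driver behavior */
}
```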
      
      Fixes: 226d5285 ("iavf: fix locking of critical sections")
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      cbe9e511
    • iavf: Fix reset error handling · 31071173
      Przemyslaw Patynowski authored
      Do not call iavf_close in iavf_reset_task error handling. Doing so can
      lead to a double call of napi_disable, which can lead to a deadlock there.
      Removing the VF would then leave the iavf_remove task stuck, because it
      requires crit_lock, which is held by iavf_close.
      Call iavf_disable_vf if the reset fails, so that the driver cleans up
      the remaining invalid resources.
      During rapid VF resets, HW can fail to set up the VF mailbox. Wrong
      error handling can lead to iavf_remove being stuck with:
      [ 5218.999087] iavf 0000:82:01.0: Failed to init adminq: -53
      ...
      [ 5267.189211] INFO: task repro.sh:11219 blocked for more than 30 seconds.
      [ 5267.189520]       Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 5267.189764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 5267.190062] task:repro.sh        state:D stack:    0 pid:11219 ppid:  8162 flags:0x00000000
      [ 5267.190347] Call Trace:
      [ 5267.190647]  <TASK>
      [ 5267.190927]  __schedule+0x460/0x9f0
      [ 5267.191264]  schedule+0x44/0xb0
      [ 5267.191563]  schedule_preempt_disabled+0x14/0x20
      [ 5267.191890]  __mutex_lock.isra.12+0x6e3/0xac0
      [ 5267.192237]  ? iavf_remove+0xf9/0x6c0 [iavf]
      [ 5267.192565]  iavf_remove+0x12a/0x6c0 [iavf]
      [ 5267.192911]  ? _raw_spin_unlock_irqrestore+0x1e/0x40
      [ 5267.193285]  pci_device_remove+0x36/0xb0
      [ 5267.193619]  device_release_driver_internal+0xc1/0x150
      [ 5267.193974]  pci_stop_bus_device+0x69/0x90
      [ 5267.194361]  pci_stop_and_remove_bus_device+0xe/0x20
      [ 5267.194735]  pci_iov_remove_virtfn+0xba/0x120
      [ 5267.195130]  sriov_disable+0x2f/0xe0
      [ 5267.195506]  ice_free_vfs+0x7d/0x2f0 [ice]
      [ 5267.196056]  ? pci_get_device+0x4f/0x70
      [ 5267.196496]  ice_sriov_configure+0x78/0x1a0 [ice]
      [ 5267.196995]  sriov_numvfs_store+0xfe/0x140
      [ 5267.197466]  kernfs_fop_write_iter+0x12e/0x1c0
      [ 5267.197918]  new_sync_write+0x10c/0x190
      [ 5267.198404]  vfs_write+0x24e/0x2d0
      [ 5267.198886]  ksys_write+0x5c/0xd0
      [ 5267.199367]  do_syscall_64+0x3a/0x80
      [ 5267.199827]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 5267.200317] RIP: 0033:0x7f5b381205c8
      [ 5267.200814] RSP: 002b:00007fff8c7e8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 5267.201981] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5b381205c8
      [ 5267.202620] RDX: 0000000000000002 RSI: 00005569420ee900 RDI: 0000000000000001
      [ 5267.203426] RBP: 00005569420ee900 R08: 000000000000000a R09: 00007f5b38180820
      [ 5267.204327] R10: 000000000000000a R11: 0000000000000246 R12: 00007f5b383c06e0
      [ 5267.205193] R13: 0000000000000002 R14: 00007f5b383bb880 R15: 0000000000000002
      [ 5267.206041]  </TASK>
      [ 5267.206970] Kernel panic - not syncing: hung_task: blocked tasks
      [ 5267.207809] CPU: 48 PID: 551 Comm: khungtaskd Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 5267.208726] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
      [ 5267.209623] Call Trace:
      [ 5267.210569]  <TASK>
      [ 5267.211480]  dump_stack_lvl+0x33/0x42
      [ 5267.212472]  panic+0x107/0x294
      [ 5267.213467]  watchdog.cold.8+0xc/0xbb
      [ 5267.214413]  ? proc_dohung_task_timeout_secs+0x30/0x30
      [ 5267.215511]  kthread+0xf4/0x120
      [ 5267.216459]  ? kthread_complete_and_exit+0x20/0x20
      [ 5267.217505]  ret_from_fork+0x22/0x30
      [ 5267.218459]  </TASK>
      
      Fixes: f0db7892 ("i40evf: use netdev variable in reset task")
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Marek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      31071173
    • iavf: Fix NULL pointer dereference in iavf_get_link_ksettings · 541a1af4
      Przemyslaw Patynowski authored
      Fix a possible NULL pointer dereference caused by freeing adapter->vf_res
      in iavf_init_get_resources. A previous commit introduced a regression,
      where receiving IAVF_ERR_ADMIN_QUEUE_NO_WORK from iavf_get_vf_config
      would free adapter->vf_res. However, the netdev is still registered, so
      ethtool_ops can be called. Calling iavf_get_link_ksettings with no vf_res
      results in:
      [ 9385.242676] BUG: kernel NULL pointer dereference, address: 0000000000000008
      [ 9385.242683] #PF: supervisor read access in kernel mode
      [ 9385.242686] #PF: error_code(0x0000) - not-present page
      [ 9385.242690] PGD 0 P4D 0
      [ 9385.242696] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
      [ 9385.242701] CPU: 6 PID: 3217 Comm: pmdalinux Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 9385.242708] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
      [ 9385.242710] RIP: 0010:iavf_get_link_ksettings+0x29/0xd0 [iavf]
      [ 9385.242745] Code: 00 0f 1f 44 00 00 b8 01 ef ff ff 48 c7 46 30 00 00 00 00 48 c7 46 38 00 00 00 00 c6 46 0b 00 66 89 46 08 48 8b 87 68 0e 00 00 <f6> 40 08 80 75 50 8b 87 5c 0e 00 00 83 f8 08 74 7a 76 1d 83 f8 20
      [ 9385.242749] RSP: 0018:ffffc0560ec7fbd0 EFLAGS: 00010246
      [ 9385.242755] RAX: 0000000000000000 RBX: ffffc0560ec7fc08 RCX: 0000000000000000
      [ 9385.242759] RDX: ffffffffc0ad4550 RSI: ffffc0560ec7fc08 RDI: ffffa0fc66674000
      [ 9385.242762] RBP: 00007ffd1fb2bf50 R08: b6a2d54b892363ee R09: ffffa101dc14fb00
      [ 9385.242765] R10: 0000000000000000 R11: 0000000000000004 R12: ffffa0fc66674000
      [ 9385.242768] R13: 0000000000000000 R14: ffffa0fc66674000 R15: 00000000ffffffa1
      [ 9385.242771] FS:  00007f93711a2980(0000) GS:ffffa0fad72c0000(0000) knlGS:0000000000000000
      [ 9385.242775] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9385.242778] CR2: 0000000000000008 CR3: 0000000a8e61c003 CR4: 00000000003706e0
      [ 9385.242781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9385.242784] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9385.242787] Call Trace:
      [ 9385.242791]  <TASK>
      [ 9385.242793]  ethtool_get_settings+0x71/0x1a0
      [ 9385.242814]  __dev_ethtool+0x426/0x2f40
      [ 9385.242823]  ? slab_post_alloc_hook+0x4f/0x280
      [ 9385.242836]  ? kmem_cache_alloc_trace+0x15d/0x2f0
      [ 9385.242841]  ? dev_ethtool+0x59/0x170
      [ 9385.242848]  dev_ethtool+0xa7/0x170
      [ 9385.242856]  dev_ioctl+0xc3/0x520
      [ 9385.242866]  sock_do_ioctl+0xa0/0xe0
      [ 9385.242877]  sock_ioctl+0x22f/0x320
      [ 9385.242885]  __x64_sys_ioctl+0x84/0xc0
      [ 9385.242896]  do_syscall_64+0x3a/0x80
      [ 9385.242904]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 9385.242918] RIP: 0033:0x7f93702396db
      [ 9385.242923] Code: 73 01 c3 48 8b 0d ad 57 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 57 38 00 f7 d8 64 89 01 48
      [ 9385.242927] RSP: 002b:00007ffd1fb2bf18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 9385.242932] RAX: ffffffffffffffda RBX: 000055671b1d2fe0 RCX: 00007f93702396db
      [ 9385.242935] RDX: 00007ffd1fb2bf20 RSI: 0000000000008946 RDI: 0000000000000007
      [ 9385.242937] RBP: 00007ffd1fb2bf20 R08: 0000000000000003 R09: 0030763066307330
      [ 9385.242940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd1fb2bf80
      [ 9385.242942] R13: 0000000000000007 R14: 0000556719f6de90 R15: 00007ffd1fb2c1b0
      [ 9385.242948]  </TASK>
      [ 9385.242949] Modules linked in: iavf(E) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink vfat fat irdma ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support ice irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl i40e pcspkr intel_cstate joydev mei_me intel_uncore mxm_wmi mei ipmi_ssif lpc_ich ipmi_si acpi_power_meter xfs libcrc32c mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft crc64 syscopyarea sg sysfillrect sysimgblt fb_sys_fops drm ixgbe ahci libahci libata crc32c_intel mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
      [ 9385.243065]  [last unloaded: iavf]
      
      The dereference happens in the if (ADV_LINK_SUPPORT(adapter)) statement.
      
      Fixes: 209f2f9c ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Marek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      541a1af4
    • iavf: Fix adminq error handling · 41983161
      Przemyslaw Patynowski authored
      iavf_alloc_asq_bufs/iavf_alloc_arq_bufs allocate memory for the VF
      mailbox with dma_alloc_coherent.
      Free the DMA regions for both ASQ and ARQ if an error happens during
      configuration of the ASQ/ARQ registers.
      Without this change, the following can be seen when unloading the interface:
      74626.583369: dma_debug_device_change: device driver has pending DMA allocations while released from device [count=32]
      One of leaked entries details: [device address=0x0000000b27ff9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent]
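
The error-path cleanup can be sketched in userspace with malloc/free in place of the DMA allocator. Names are hypothetical and the register configuration is simulated by a flag; the shape to note is that a failure after allocation must release both buffers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical admin queue with one send and one receive buffer area. */
struct adminq_sketch {
	void *asq_bufs;
	void *arq_bufs;
};

/* Simulated register configuration step that can be made to fail. */
int config_regs_sketch(bool fail)
{
	return fail ? -EIO : 0;
}

/* Allocate both buffer areas, then configure. Any failure after the
 * allocations goes through one cleanup label that frees both areas,
 * so nothing stays allocated when init reports an error. */
int adminq_init_sketch(struct adminq_sketch *aq, bool fail_config)
{
	int err;

	aq->asq_bufs = malloc(4096);
	aq->arq_bufs = malloc(4096);
	if (!aq->asq_bufs || !aq->arq_bufs) {
		err = -ENOMEM;
		goto free_bufs;
	}

	err = config_regs_sketch(fail_config);
	if (err)
		goto free_bufs;
	return 0;

free_bufs:
	free(aq->asq_bufs);	/* free(NULL) is a no-op */
	free(aq->arq_bufs);
	aq->asq_bufs = NULL;
	aq->arq_bufs = NULL;
	return err;
}
```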
      
      Fixes: d358aa9a ("i40evf: init code and hardware support")
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Marek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      41983161
    • net: lan966x: fix checking for return value of platform_get_irq_byname() · 40b4ac88
      Li Qiong authored
      platform_get_irq_byname() returns a non-zero IRQ number on success
      or a negative error number on failure, so "if (irq)" is always true.
      Change it to "if (irq > 0)".
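
The difference between the two conditions is easy to see in isolation. The helper names below are hypothetical; -ENXIO is used only as an example of a negative error value.

```c
#include <errno.h>
#include <stdbool.h>

/* Old check: any non-zero value, including a negative errno, passes. */
bool irq_check_old_sketch(int irq)
{
	return irq != 0;
}

/* Fixed check: only a positive IRQ number passes. */
bool irq_check_fixed_sketch(int irq)
{
	return irq > 0;
}
```

An error return such as -ENXIO passes the old check but is rejected by the fixed one.
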
      Signed-off-by: Li Qiong <liqiong@nfschina.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40b4ac88
    • net: cxgb3: Fix comment typo · 75d8620d
      Jason Wang authored
      The double `the' is duplicated in the comment, remove one.
      Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      75d8620d
    • bnx2x: Fix comment typo · 0619d0fa
      Jason Wang authored
      The double `the' is duplicated in the comment, remove one.
      Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0619d0fa
    • net: ipa: Fix comment typo · 9221b289
      Jason Wang authored
      The double `is' is duplicated in the comment, remove one.
      Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9221b289
    • virtio_net: fix endian-ness for RSS · 95bb6330
      Michael S. Tsirkin authored
      Using native endian-ness for device supplied fields is wrong
      on BE platforms. Sparse warns about this.
      
      Fixes: 91f41f01 ("drivers/net/virtio_net: Added RSS hash report.")
      Cc: "Andrew Melnychenko" <andrew@daynix.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95bb6330
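The endianness problem named in the virtio_net fix can be illustrated with a small userspace sketch. It assumes the device-supplied fields are little-endian on the wire; `struct toy_hash_report` and both helper names are hypothetical and do not come from the driver. Decoding byte by byte gives the same result on little-endian and big-endian hosts, whereas reading the field as a native integer would not.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian field from raw bytes, independent of the
 * host byte order. */
static uint32_t le32_to_host(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Decode a 16-bit little-endian field from raw bytes. */
static uint16_t le16_to_host(const uint8_t b[2])
{
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Hypothetical wire layout of a device-supplied hash report; both fields
 * are assumed to be little-endian. */
struct toy_hash_report {
    uint8_t hash_value[4];
    uint8_t hash_report[2];
};
```

In kernel code the same intent is normally expressed with endian-annotated types and conversion helpers, which is what lets sparse flag the mismatch.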
    • Xin Xiong's avatar
      net/sunrpc: fix potential memory leaks in rpc_sysfs_xprt_state_change() · bfc48f1b
      Xin Xiong authored
      The issue happens on some error handling paths. When the function
      fails to grab the object `xprt`, it simply returns 0, forgetting to
      decrease the reference count of another object `xps`, which is
      increased by rpc_sysfs_xprt_kobj_get_xprt_switch(), causing refcount
      leaks. Also, the function forgets to check whether `xps` is valid
      before using it, which may result in NULL-dereferencing issues.
      
      Fix it by adding proper error handling code when either `xprt` or
      `xps` is NULL.
      
      Fixes: 5b7eb784 ("SUNRPC: take a xprt offline using sysfs")
      Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
      Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bfc48f1b
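The error-handling pattern described in the sunrpc fix can be sketched with toy reference counts. Everything here (`struct toy_obj`, `toy_get()`, `toy_put()`, `toy_state_change()`) is hypothetical and only models the idea: check both objects before use, and release whatever was acquired on every exit path.

```c
#include <stddef.h>

/* Toy reference-counted object; stands in for the kernel objects. */
struct toy_obj {
    int refs;
};

/* Take a reference; a NULL source models a failed lookup. */
static struct toy_obj *toy_get(struct toy_obj *o)
{
    if (o)
        o->refs++;
    return o;
}

/* Drop a reference; tolerates NULL so error paths stay simple. */
static void toy_put(struct toy_obj *o)
{
    if (o)
        o->refs--;
}

/* Returns 1 if the work was done, 0 if either object was unavailable.
 * Both references are dropped on every path, including the early exit. */
static int toy_state_change(struct toy_obj *xprt_src, struct toy_obj *xps_src)
{
    struct toy_obj *xprt = toy_get(xprt_src);
    struct toy_obj *xps = toy_get(xps_src);
    int ret = 0;

    if (!xprt || !xps)
        goto out;   /* nothing to do, but still release what we hold */

    ret = 1;        /* the real work would happen here */
out:
    toy_put(xprt);
    toy_put(xps);
    return ret;
}
```

Funnelling every exit through a single cleanup label keeps each acquire paired with a release, so an early return cannot skip one.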
    • Jilin Yuan's avatar
      skfp/h: fix repeated words in comments · 86d2155e
      Jilin Yuan authored
      Delete the redundant word 'the'.
      Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      86d2155e
    • Mikulas Patocka's avatar
      rds: add missing barrier to release_refill · 9f414eb4
      Mikulas Patocka authored
      The functions clear_bit and set_bit do not imply a memory barrier, thus it
      may be possible that the waitqueue_active function (which does not take
      any locks) is moved before clear_bit and it could miss a wakeup event.
      
      Fix this bug by adding a memory barrier after clear_bit.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9f414eb4
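The ordering issue in the rds fix can be approximated in userspace with C11 atomics. This is a sketch under assumptions, not the kernel code: `struct toy_conn`, `REFILL_BIT` and `toy_release_refill()` are hypothetical, the relaxed `atomic_fetch_and_explicit()` stands in for `clear_bit()`, the `has_waiter` flag stands in for `waitqueue_active()`, and a sequentially consistent fence plays the role of the added barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REFILL_BIT 0x1u

struct toy_conn {
    atomic_uint flags;      /* bit 0: refill in progress */
    atomic_bool has_waiter; /* stands in for waitqueue_active() */
    int wakeups;            /* counts wake-ups issued, for illustration */
};

static void toy_release_refill(struct toy_conn *c)
{
    /* Relaxed clear: like clear_bit(), it implies no ordering on its own. */
    atomic_fetch_and_explicit(&c->flags, ~REFILL_BIT, memory_order_relaxed);

    /* Full fence so the waiter check below cannot be ordered before the
     * clear; without it a wake-up could be missed. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&c->has_waiter, memory_order_relaxed))
        c->wakeups++;       /* a real implementation would wake the waiter */
}
```

The waiting side needs matching ordering between registering itself and re-checking the bit; this sketch shows only the releasing side that the commit changes.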
  6. 11 Aug, 2022 3 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 7ebfc85e
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth, bpf, can and netfilter.
      
        A little larger than usual but it's all fixes, no late features. It's
        large partially because of timing, and partially because of follow ups
        to stuff that got merged a week or so before the merge window and
        wasn't as widely tested. Maybe the Bluetooth fixes are a little
        alarming so we'll address that, but the rest seems okay and not scary.
      
        Notably we're including a fix for the netfilter Kconfig [1], your WiFi
        warning [2] and a bluetooth fix which should unblock syzbot [3].
      
        Current release - regressions:
      
         - Bluetooth:
            - don't try to cancel uninitialized works [3]
            - L2CAP: fix use-after-free caused by l2cap_chan_put
      
         - tls: rx: fix device offload after recent rework
      
         - devlink: fix UAF on failed reload and leftover locks in mlxsw
      
        Current release - new code bugs:
      
         - netfilter:
            - flowtable: fix incorrect Kconfig dependencies [1]
            - nf_tables: fix crash when nf_trace is enabled
      
         - bpf:
            - use proper target btf when exporting attach_btf_obj_id
            - arm64: fixes for bpf trampoline support
      
         - Bluetooth:
            - ISO: unlock on error path in iso_sock_setsockopt()
            - ISO: fix info leak in iso_sock_getsockopt()
            - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
            - ISO: fix memory corruption on iso_pinfo.base
            - ISO: fix not using the correct QoS
            - hci_conn: fix updating ISO QoS PHY
      
         - phy: dp83867: fix get nvmem cell fail
      
        Previous releases - regressions:
      
         - wifi: cfg80211: fix validating BSS pointers in
           __cfg80211_connect_result [2]
      
         - atm: bring back zatm uAPI after ATM had been removed
      
         - properly fix old bug making bonding ARP monitor mode not being able
           to work with software devices with lockless Tx
      
         - tap: fix null-deref on skb->dev in dev_parse_header_protocol
      
         - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
           devices and breaks others
      
         - netfilter:
            - nf_tables: many fixes rejecting cross-object linking which may
              lead to UAFs
            - nf_tables: fix null deref due to zeroed list head
            - nf_tables: validate variable length element extension
      
         - bgmac: fix a BUG triggered by wrong bytes_compl
      
         - bcmgenet: indicate MAC is in charge of PHY PM
      
        Previous releases - always broken:
      
         - bpf:
            - fix bad pointer deref in bpf_sys_bpf() injected via test infra
            - disallow non-builtin bpf programs calling the prog_run command
            - don't reinit map value in prealloc_lru_pop
            - fix UAFs during the read of map iterator fd
            - fix invalidity check for values in sk local storage map
            - reject sleepable program for non-resched map iterator
      
         - mptcp:
            - move subflow cleanup in mptcp_destroy_common()
            - do not queue data on closed subflows
      
         - virtio_net: fix memory leak inside XDP_TX with mergeable
      
         - vsock: fix memory leak when multiple threads try to connect()
      
         - rework sk_user_data sharing to prevent psock leaks
      
         - geneve: fix TOS inheriting for ipv4
      
         - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
      
         - phy: c45 baset1: do not skip aneg configuration if clock role is
           not specified
      
         - rose: avoid overflow when /proc displays timer information
      
         - x25: fix call timeouts in blocking connects
      
         - can: mcp251x: fix race condition on receive interrupt
      
         - can: j1939:
            - replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
            - fix memory leak of skbs in j1939_session_destroy()
      
        Misc:
      
         - docs: bpf: clarify that many things are not uAPI
      
         - seg6: initialize induction variable to first valid array index (to
           silence clang vs objtool warning)
      
         - can: ems_usb: fix clang 14's -Wunaligned-access warning"
      
      * tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
        net: atm: bring back zatm uAPI
        dpaa2-eth: trace the allocated address instead of page struct
        net: add missing kdoc for struct genl_multicast_group::flags
        nfp: fix use-after-free in area_cache_get()
        MAINTAINERS: use my korg address for mt7601u
        mlxsw: minimal: Fix deadlock in ports creation
        bonding: fix reference count leak in balance-alb mode
        net: usb: qmi_wwan: Add support for Cinterion MV32
        bpf: Shut up kern_sys_bpf warning.
        net/tls: Use RCU API to access tls_ctx->netdev
        tls: rx: device: don't try to copy too much on detach
        tls: rx: device: bound the frag walk
        net_sched: cls_route: remove from list when handle is 0
        selftests: forwarding: Fix failing tests with old libnet
        net: refactor bpf_sk_reuseport_detach()
        net: fix refcount bug in sk_psock_get (2)
        selftests/bpf: Ensure sleepable program is rejected by hash map iter
        selftests/bpf: Add write tests for sk local storage map iterator
        selftests/bpf: Add tests for reading a dangling map iter fd
        bpf: Only allow sleepable program for resched-able iterator
        ...
      7ebfc85e
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e091ba5c
      Linus Torvalds authored
      Pull more ACPI updates from Rafael Wysocki:
       "These fix up direct references to the fwnode field in struct device
        and extend ACPI device properties support.
      
        Specifics:
      
         - Replace direct references to the fwnode field in struct device with
           dev_fwnode() and device_match_fwnode() (Andy Shevchenko)
      
         - Make the ACPI code handling device properties support properties
           with buffer values (Sakari Ailus)"
      
      * tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: property: Fix error handling in acpi_init_properties()
        ACPI: VIOT: Do not dereference fwnode in struct device
        ACPI: property: Read buffer properties as integers
        ACPI: property: Add support for parsing buffer property UUID
        ACPI: property: Unify integer value reading functions
        ACPI: property: Switch node property referencing from ifs to a switch
        ACPI: property: Move property ref argument parsing into a new function
        ACPI: property: Use acpi_object_type consistently in property ref parsing
        ACPI: property: Tie data nodes to acpi handles
        ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
      e091ba5c
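The accessor approach mentioned in the ACPI pull request (going through `dev_fwnode()` and `device_match_fwnode()` rather than reading the field directly) can be modelled with toy types. The `toy_` structures and functions below are hypothetical simplifications; the real kernel helpers operate on `struct device` and handle cases this sketch ignores.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_fwnode {
    int id;
};

struct toy_device {
    struct toy_fwnode *fwnode;  /* field callers should not touch directly */
};

/* Toy counterpart of dev_fwnode(): the one place that knows the layout. */
static struct toy_fwnode *toy_dev_fwnode(const struct toy_device *dev)
{
    return dev->fwnode;
}

/* Toy counterpart of device_match_fwnode(): compares via the accessor. */
static bool toy_device_match_fwnode(const struct toy_device *dev,
                                    const struct toy_fwnode *fwnode)
{
    return toy_dev_fwnode(dev) == fwnode;
}
```

Keeping callers behind the accessor means the way a device stores its firmware node can change later without touching every user.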
    • Linus Torvalds's avatar
      Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 8745889a
      Linus Torvalds authored
      Pull more iomap updates from Darrick Wong:
       "In the past 10 days or so I've not heard any ZOMG STOP style
        complaints about removing ->writepage support from gfs2 or zonefs, so
        here's the pull request removing them (and the underlying fs iomap
        support) from the kernel:
      
         - Remove iomap_writepage and all callers, since the mm apparently
           never called the zonefs or gfs2 writepage functions"
      
      * tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: remove iomap_writepage
        zonefs: remove ->writepage
        gfs2: remove ->writepage
        gfs2: stop using generic_writepages in gfs2_ail1_start_one
      8745889a