1. 01 Jan, 2023 3 commits
  2. 30 Dec, 2022 18 commits
  3. 28 Dec, 2022 19 commits
    • net/mlx5: Lag, fix failure to cancel delayed bond work · 4d1c1379
      Eli Cohen authored
      Commit 0d4e8ed1 ("net/mlx5: Lag, avoid lockdep warnings")
      accidentally removed the call that cancels the delayed bond work, so
      a queued delayed work item may expire and run on an already destroyed
      workqueue.

      Fix by restoring the call to cancel_delayed_work_sync() before
      destroying the workqueue.

      This prevents a call trace such as this:
      
      [  329.230417] BUG: kernel NULL pointer dereference, address: 0000000000000000
       [  329.231444] #PF: supervisor write access in kernel mode
       [  329.232233] #PF: error_code(0x0002) - not-present page
       [  329.233007] PGD 0 P4D 0
       [  329.233476] Oops: 0002 [#1] SMP
       [  329.234012] CPU: 5 PID: 145 Comm: kworker/u20:4 Tainted: G OE      6.0.0-rc5_mlnx #1
       [  329.235282] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       [  329.236868] Workqueue: mlx5_cmd_0000:08:00.1 cmd_work_handler [mlx5_core]
       [  329.237886] RIP: 0010:_raw_spin_lock+0xc/0x20
       [  329.238585] Code: f0 0f b1 17 75 02 f3 c3 89 c6 e9 6f 3c 5f ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 f3 c3 89 c6 e9 45 3c 5f ff 0f 1f 44 00 00 0f 1f
       [  329.241156] RSP: 0018:ffffc900001b0e98 EFLAGS: 00010046
       [  329.241940] RAX: 0000000000000000 RBX: ffffffff82374ae0 RCX: 0000000000000000
       [  329.242954] RDX: 0000000000000001 RSI: 0000000000000014 RDI: 0000000000000000
       [  329.243974] RBP: ffff888106ccf000 R08: ffff8881004000c8 R09: ffff888100400000
       [  329.244990] R10: 0000000000000000 R11: ffffffff826669f8 R12: 0000000000002000
       [  329.246009] R13: 0000000000000005 R14: ffff888100aa7ce0 R15: ffff88852ca80000
       [  329.247030] FS:  0000000000000000(0000) GS:ffff88852ca80000(0000) knlGS:0000000000000000
       [  329.248260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [  329.249111] CR2: 0000000000000000 CR3: 000000016d675001 CR4: 0000000000770ee0
       [  329.250133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [  329.251152] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [  329.252176] PKRU: 55555554
      
      Fixes: 0d4e8ed1 ("net/mlx5: Lag, avoid lockdep warnings")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Set geneve_tlv_option_0_exist when matching on geneve option · e54638a8
      Maor Dickman authored
      The cited patch added support for matching on a geneve option by setting
      the geneve_tlv_option_0_data mask and key, but it didn't set the
      geneve_tlv_option_0_exist bit, which some HW requires when matching on the
      geneve_tlv_option_0_data parameter. As a result, packets may in some cases
      wrongly match rules with a different geneve option.

      For example, a packet with geneve_tlv_object class=789 and data=456
      will wrongly match a rule that matches geneve_tlv_object class=123 and data=456.

      Fix it by setting the geneve_tlv_option_0_exist bit, when supported by the HW,
      when matching on the geneve_tlv_option_0_data parameter.
      
      Fixes: 9272e3df ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix hw mtu initializing at XDP SQ allocation · 1e267ab8
      Adham Faris authored
      The current XDP xmit functions (mlx5e_xmit_xdp_frame_mpwqe and
      mlx5e_xmit_xdp_frame) validate the XDP packet length by comparing it to
      the hw mtu (configured at XDP SQ allocation) before transmitting it. This
      check does not account for the ethernet FCS length (calculated and filled
      in by the NIC). Hence, when we try sending packets with length > (hw-mtu -
      ethernet-fcs-size), the device port drops them and tx_errors_phy is
      incremented. The desired behavior is for the driver to catch these
      packets and drop them.

      Fix this behavior in the XDP SQ allocation function (mlx5e_alloc_xdpsq) by
      subtracting the ethernet FCS size (4 bytes) from the current hw mtu
      value, since the ethernet FCS is calculated and written to ethernet frames
      by the NIC.
      
      Fixes: d8bec2b2 ("net/mlx5e: Support bpf_xdp_adjust_head()")
      Signed-off-by: Adham Faris <afaris@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
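The length check described in the hw mtu commit above can be illustrated with a small userspace sketch. This is not the driver code: the helper names and the SKETCH_ETH_FCS_LEN constant are hypothetical, and the only facts taken from the commit message are that the FCS is 4 bytes and that it is subtracted from the hw mtu stored for the XDP SQ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ethernet frame check sequence length in bytes; the NIC appends it. */
#define SKETCH_ETH_FCS_LEN 4u

/* Hypothetical helper: the MTU value an XDP SQ would store for its check. */
uint32_t sketch_xdpsq_hw_mtu(uint32_t hw_mtu)
{
	/* Guard against underflow for unrealistically small inputs. */
	return hw_mtu > SKETCH_ETH_FCS_LEN ? hw_mtu - SKETCH_ETH_FCS_LEN : 0;
}

/* Hypothetical helper: true if the driver should transmit the frame,
 * false if it should drop it before it reaches the port. */
bool sketch_xdp_len_ok(uint32_t frame_len, uint32_t sq_hw_mtu)
{
	return frame_len <= sq_hw_mtu;
}
```

With an assumed hw mtu of 1522, a 1520-byte frame passes a check against the unadjusted value but is rejected once the 4 FCS bytes are subtracted, which is the case the commit wants the driver to catch.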
    • net/mlx5e: Always clear dest encap in neigh-update-del · 2951b2e1
      Chris Mi authored
      The cited commit introduced a bug for flows with multiple encapsulations.
      If one dest encap becomes invalid, the slow path flag is set on the flow.
      But when other dest encaps become invalid, they are not cleared because
      the flow already has the slow path flag set. When neigh-update-add runs,
      it will use an invalid encap.

      Fix it by checking the slow path flag after clearing the dest encap.
      
      Fixes: 9a5f9cc7 ("net/mlx5e: Fix possible use-after-free deleting fdb rule")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: CT: Fix ct debugfs folder name · 849190e3
      Chris Mi authored
      sprintf, not sscanf, is needed to build a string. Otherwise
      dirname is null and neither "ct_nic" nor "ct_fdb" is created.
      But the distinction is redundant anyway, as the driver could be in
      switchdev mode and still add nic rules. So use "ct" as the folder name.
      
      Fixes: 77422a8f ("net/mlx5e: CT: Add ct driver counters")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
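The sprintf/sscanf mix-up in the ct debugfs commit above is easy to reproduce in userspace. The sketch below is only an illustration of the two libc calls, not the driver code; the function names and buffer sizes are hypothetical. Note that the commit itself ends up using the fixed name "ct" instead of building a name at all.

```c
#include <stdio.h>
#include <stddef.h>

/* sscanf() parses its first argument and never writes into it. With an
 * empty input it reports an input failure (EOF), so the name stays empty. */
int sketch_parse_empty_name(void)
{
	char dirname[16] = "";
	char suffix[8] = "";

	return sscanf(dirname, "ct_%7s", suffix);
}

/* snprintf() formats into the destination buffer, which is what building
 * a directory name requires. Returns the length of the formatted name. */
int sketch_build_name(char *dirname, size_t len, const char *ns)
{
	return snprintf(dirname, len, "ct_%s", ns);
}
```

Calling sketch_build_name(name, sizeof(name), "fdb") fills name with "ct_fdb", while sketch_parse_empty_name() only returns EOF and produces no name.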
    • net/mlx5e: Fix RX reporter for XSK RQs · f8c18a57
      Tariq Toukan authored
      The RX reporter mistakenly reads from the regular (inactive) RQ
      when the XSK RQ is active. Fix it here.
      
      Fixes: 3db4c85c ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default · b12d581e
      Dragos Tatulea authored
      mlx5e_build_nic_params will turn CQE compression on if the hardware
      capability is enabled and the slow_pci_heuristic condition is detected.
      As IPoIB doesn't support CQE compression, make sure to disable the
      feature in the IPoIB profile init.
      
      Please note that the feature is not exposed to the user for IPoIB
      interfaces, so it can't be subsequently turned on.
      
      Fixes: b797a684 ("net/mlx5e: Enable CQE compression when PCI is slower than link")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix RoCE setting at HCA level · c4ad5f2b
      Shay Drory authored
      An mlx5 PF can disable RoCE for its VFs and SFs. In such a case, RoCE is
      marked as unsupported on those VFs/SFs.
      The cited patch added an option to disable (and enable) RoCE at the HCA
      level. However, that commit didn't check whether RoCE is supported on
      the HCA and allowed the user to try to set RoCE to on.
      Fix it by checking whether the HCA supports RoCE.
      
      Fixes: fbfa97b4 ("net/mlx5: Disable roce at HCA level")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Avoid recovery in probe flows · 9078e843
      Shay Drory authored
      Currently, recovery is done without considering whether the device is
      still in the probe flow.
      This may lead to recovery before the device has finished probing
      successfully, e.g. while mlx5_init_one() is running. The recovery flow
      uses functionality that is loaded only by mlx5_init_one(), and there
      is no point in running recovery before mlx5_init_one() has finished
      successfully.

      Fix it by waiting for the probe flow to finish and checking whether the
      device is probed before trying to perform recovery.
      
      Fixes: 51d138c2 ("net/mlx5: Fix health error state handling")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix io_eq_size and event_eq_size params validation · 44aee8ea
      Shay Drory authored
      The io_eq_size and event_eq_size params are of param type
      DEVLINK_PARAM_TYPE_U32, but the validation callback treats them
      as DEVLINK_PARAM_TYPE_U16.

      This causes a validation mismatch on big-endian systems, where
      in-range values were rejected while 268500991 was accepted.
      Fix it by checking the U32 value in the validation callback.
      
      Fixes: 0844fa5f ("net/mlx5: Let user configure io_eq_size param")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
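The big-endian mismatch in the io_eq_size commit above can be sketched without big-endian hardware. On such a machine, a u16 that aliases a stored u32 sees the two most significant bytes, which the shift below emulates. The function names are hypothetical and the 64..4096 range is an assumption made for this sketch; the commit message does not state the valid range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed valid range for the sketch; not taken from the commit message. */
#define SKETCH_EQ_MIN 64u
#define SKETCH_EQ_MAX 4096u

/* Emulate reading a u16 union member from storage that holds a u32 on a
 * big-endian machine: the u16 aliases the two most significant bytes. */
uint16_t sketch_be_u16_alias(uint32_t vu32)
{
	return (uint16_t)(vu32 >> 16);
}

/* Buggy check: validates the aliased u16 instead of the stored u32. */
bool sketch_validate_as_u16(uint32_t vu32)
{
	uint16_t v = sketch_be_u16_alias(vu32);

	return v >= SKETCH_EQ_MIN && v <= SKETCH_EQ_MAX;
}

/* Fixed check: validates the u32 value that was actually stored. */
bool sketch_validate_as_u32(uint32_t vu32)
{
	return vu32 >= SKETCH_EQ_MIN && vu32 <= SKETCH_EQ_MAX;
}
```

Under these assumptions, 268500991 is 0x1000FFFF, so its upper half is 0x1000 (4096) and the u16 check accepts it, while an ordinary value such as 1024 has an upper half of 0 and is rejected. The u32 check gives the opposite, intended results.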
    • net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path · 2a35b2c2
      Jiri Pirko authored
      Two cleanup calls are missing in the mlx5_init_once() error path.
      Add them so that the error path flow matches
      mlx5_cleanup_once().
      
      Fixes: 52ec462e ("net/mlx5: Add reserved-gids support")
      Fixes: 7c39afb3 ("net/mlx5: PTP code migration to driver core section")
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
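The idea behind the mlx5_init_once() commit above, an error path that undoes the same steps as the regular cleanup function, is commonly written with goto labels in reverse order. The sketch below is a generic userspace illustration with three hypothetical steps tracked in a bitmask; none of the names correspond to real mlx5 functions.

```c
#include <stdint.h>

/* Bit i is set while hypothetical step i is initialized. */
static uint32_t sketch_live;

static int sketch_init_step(int i, int fail_at)
{
	if (i == fail_at)
		return -1;	/* simulated failure of this step */
	sketch_live |= 1u << i;
	return 0;
}

static void sketch_cleanup_step(int i)
{
	sketch_live &= ~(1u << i);
}

/* Normal teardown: release every step in reverse order. */
void sketch_cleanup_once(void)
{
	sketch_cleanup_step(2);
	sketch_cleanup_step(1);
	sketch_cleanup_step(0);
}

/* Init whose error path mirrors sketch_cleanup_once(): each label undoes
 * exactly the steps that succeeded before the failure. */
int sketch_init_once(int fail_at)
{
	int err;

	err = sketch_init_step(0, fail_at);
	if (err)
		return err;
	err = sketch_init_step(1, fail_at);
	if (err)
		goto err_step0;
	err = sketch_init_step(2, fail_at);
	if (err)
		goto err_step1;
	return 0;

err_step1:
	sketch_cleanup_step(1);
err_step0:
	sketch_cleanup_step(0);
	return err;
}

/* Accessor so callers can check that nothing was leaked. */
uint32_t sketch_live_mask(void)
{
	return sketch_live;
}
```

If a label were missing from the error path, a failure in a later step would leave an earlier step's bit set, which is the kind of leak the commit describes.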
    • net/mlx5: E-Switch, properly handle ingress tagged packets on VST · 1f0ae22a
      Moshe Shemesh authored
      Fix SRIOV VST mode behavior to insert a cvlan when a guest tag is already
      present in the frame. The previous VST mode behavior was to drop packets or
      override the existing tag, depending on the device version.

      This patch fixes the behavior by correctly building the HW steering
      rule with a push vlan action; for older devices, we ask the FW to stack
      the vlan when a vlan is already present.
      
      Fixes: 07bab950 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
      Fixes: dfcb1ed3 ("net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/sched: fix retpoline wrapper compilation on configs without tc filters · 40cab44b
      Pedro Tammela authored
      Rudi reports a compilation failure on x86_64 when CONFIG_NET_CLS or
      CONFIG_NET_CLS_ACT is not set but CONFIG_RETPOLINE is set.
      A misplaced '#endif' was causing the issue.
      
      Fixes: 7f0e8102 ("net/sched: add retpoline wrapper for tc")
      Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: convert sysfs snprintf to sysfs_emit · c2052189
      Xuezhi Zhang authored
      Follow the advice of Documentation/filesystems/sysfs.rst:
      show() should only use sysfs_emit() or sysfs_emit_at()
      when formatting the value to be returned to user space.
      Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'r8169-fixes' · 0e3d1835
      David S. Miller authored
      Chunhao Lin says:
      
      ====================
      r8169: fix dmar pte write access is not set error
      
      This series fixes dmar pte write access is not set error.
      
      Chunhao Lin (2):
        r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down()
        r8169: fix dmar pte write access is not set error
      
      v2:
      -update commit message
      -adjust the code according to current kernel code
      v3:
      -update title and commit message
      -split the patch
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: fix dmar pte write access is not set error · bb41c13c
      Chunhao Lin authored
      When the device is closed with wol enabled, rx is left enabled. When the
      device is opened again, this causes rx packets to be DMAed to the wrong
      memory address after pci_set_master(), and the system log shows the
      messages below.
      
      DMAR: DRHD: handling fault status reg 3
      DMAR: [DMA Write] Request device [02:00.0] PASID ffffffff fault addr
      ffdd4000 [fault reason 05] PTE Write access is not set
      
      With this patch, the driver disables tx/rx when the device is closed. If
      wol is enabled, it only enables the rx filter and disables rxdv_gate (if
      supported), so that the hardware receives packets into the fifo but does
      not DMA them.
      Signed-off-by: Chunhao Lin <hau@realtek.com>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down() · ad425666
      Chunhao Lin authored
      There is no functional change. Move these two functions in preparation
      for the following patch, "r8169: fix dmar pte write access is not set error".
      Signed-off-by: Chunhao Lin <hau@realtek.com>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ethtool_gert_phy_stats-fixes' · e71460d4
      David S. Miller authored
      Daniil Tatianin says:
      
      ====================
      net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers
      
      This series fixes a potential NULL dereference in ethtool_get_phy_stats
      while also attempting to refactor/split said function into multiple
      helpers so that it's easier to reason about what's going on.
      
      I've taken Andrew Lunn's suggestions on the previous version of this
      patch and added a bit of my own.
      
      Changes since v1:
      - Remove an extra newline in the first patch
      - Move WARN_ON_ONCE into the if check as it already returns the
        result of the comparison
      - Actually split ethtool_get_phy_stats instead of attempting to
        refactor it
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers · 201ed315
      Daniil Tatianin authored
      So that it's easier to follow and make sense of the branching and
      various conditions.
      
      Stats retrieval has been split into two separate functions
      ethtool_get_phy_stats_phydev & ethtool_get_phy_stats_ethtool.
      The former attempts to retrieve the stats using phydev & phy_ops, while
      the latter uses ethtool_ops.
      
      Actual n_stats validation & array allocation has been moved into a new
      ethtool_vzalloc_stats_array helper.
      
      This also fixes a potential NULL dereference of
      ops->get_ethtool_phy_stats where it was getting called in an else branch
      unconditionally without making sure it was actually present.
      
      Found by Linux Verification Center (linuxtesting.org) with the SVACE
      static analysis tool.
      Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
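The NULL dereference mentioned in the ethtool_get_phy_stats commit above comes from calling an optional callback without checking it first. The following is a minimal userspace sketch of the guarded pattern; struct sketch_ops, its members, and the example callbacks are hypothetical stand-ins and do not reproduce the kernel's ethtool_ops.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table; a driver may leave either callback NULL. */
struct sketch_ops {
	int (*get_sset_count)(void);
	void (*get_phy_stats)(uint64_t *data, int n);
};

/* Returns the number of stats written, -EOPNOTSUPP when a required
 * callback is missing, or -EINVAL when the count does not fit in cap. */
int sketch_get_phy_stats(const struct sketch_ops *ops, uint64_t *data, int cap)
{
	int n;

	/* Check every callback before it is used, not only the first one. */
	if (!ops || !ops->get_sset_count || !ops->get_phy_stats)
		return -EOPNOTSUPP;

	n = ops->get_sset_count();
	if (n < 0 || n > cap)
		return -EINVAL;

	ops->get_phy_stats(data, n);
	return n;
}

/* Example callbacks used to exercise the helper. */
int sketch_count_two(void)
{
	return 2;
}

void sketch_fill(uint64_t *data, int n)
{
	for (int i = 0; i < n; i++)
		data[i] = (uint64_t)i + 1;
}
```

An ops table that sets only get_sset_count makes sketch_get_phy_stats() return -EOPNOTSUPP instead of jumping through a NULL pointer.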