1. 01 Jan, 2023 7 commits
    • net: phy: Update documentation for get_rate_matching · 6d4cfcf9
      Sean Anderson authored
      Now that phylink no longer calls phy_get_rate_matching with
      PHY_INTERFACE_MODE_NA, phys no longer need to support it. Remove the
      documentation mandating support.
      
      Fixes: 7642cc28 ("net: phylink: fix PHY validation with rate adaption")
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dsa-qca8k-fixes' · d02b8256
      David S. Miller authored
      Christian Marangi says:
      
      ====================
      net: dsa: qca8k: multiple fix on mdio read/write
      
      Due to some problems in reading the Documentation and elaborating on it,
      some wrong assumptions were made. The errors were reported and noticed only
      now because of how things are set up in the code flow.
      
      The first 2 patches fix mgmt eth, where the length calculation is very
      confusing and in steps of word size. (The related commit descriptions have
      an extensive explanation of how this mess works.)
      
      The last 3 patches revert the broken mdio cache and apply a correct version
      that should still save some extra mdio accesses in the phy poll scenario.
      
      These 5 patches fix each related problem and apply what the Documentation
      actually says.
      
      Changes v2:
      - Add cover letter
      - Fix typo in revert patch
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: qca8k: improve mdio master read/write by using single lo/hi · a4165830
      Christian Marangi authored
      Improve mdio master read/write by using single mii read/write lo/hi.
      
      In a read and write we need to poll the mdio master regs in a busy loop
      to check for a specific bit present in the upper half of the reg. We can
      ignore the other half since it won't contain useful data. This will save
      an additional useless read for each read and write operation.
      
      In a read operation the returned data is present in the mdio master reg
      lower half. We can ignore the other half since it won't contain useful
      data. This will save an additional useless read for each read operation.
      
      In a read operation it's needed to just set the hi half of the mdio
      master reg as the lo half will be replaced by the result. This will save
      an additional useless write for each read operation.
      Tested-by: Ronald Wahl <ronald.wahl@raritan.com>
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
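The access savings described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `fake_*` names, the struct layout, the position of the busy bit and the retry limit of 20 are all hypothetical and are not taken from the qca8k driver. It models a register whose hi half carries the command and busy bit and whose lo half carries the result, and counts 16-bit bus transfers.

```c
#include <stdint.h>

#define FAKE_BUSY 0x8000u /* hypothetical busy bit inside the hi half */

struct fake_mdio_reg {
	uint16_t lo;             /* result data */
	uint16_t hi;             /* command and busy bit */
	unsigned int accesses;   /* 16-bit bus transfers performed so far */
	unsigned int busy_reads; /* how many polls still report busy */
};

/* Write only the hi half: one bus transfer. */
void fake_write_hi(struct fake_mdio_reg *r, uint16_t val)
{
	r->accesses++;
	r->hi = val;
}

/* Read only the hi half: one bus transfer. The model clears the busy
 * bit once the configured number of busy polls has been used up. */
uint16_t fake_read_hi(struct fake_mdio_reg *r)
{
	r->accesses++;
	if (r->busy_reads > 0)
		r->busy_reads--;
	else
		r->hi = (uint16_t)(r->hi & ~FAKE_BUSY);
	return r->hi;
}

/* Read only the lo half: one bus transfer. */
uint16_t fake_read_lo(struct fake_mdio_reg *r)
{
	r->accesses++;
	return r->lo;
}

/* Read operation: set only the hi half, poll only the hi half,
 * then fetch the result from the lo half only.
 * Returns 0 on success, -1 if the busy bit never clears. */
int fake_mdio_read(struct fake_mdio_reg *r, uint16_t cmd, uint16_t *data)
{
	int tries;

	fake_write_hi(r, (uint16_t)(cmd | FAKE_BUSY));

	for (tries = 0; tries < 20; tries++) {
		if (!(fake_read_hi(r) & FAKE_BUSY)) {
			*data = fake_read_lo(r);
			return 0;
		}
	}
	return -1;
}
```

In this model every step touches a single half, so a read costs one write, one transfer per poll and one final read, instead of two transfers per step.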
    • net: dsa: qca8k: introduce single mii read/write lo/hi · cfbd6de5
      Christian Marangi authored
      It may be useful to read/write just the lo or hi half of a reg.
      
      This is especially useful for phy poll with the use of mdio master.
      The mdio master reg is composed of the first 16 bits, related to setup,
      and the other half, holding the returned data or the data to write.
      
      Refactor the mii function to permit single mii read/write of lo or hi
      half of the reg.
      Tested-by: Ronald Wahl <ronald.wahl@raritan.com>
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
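As a rough illustration of what reading or writing just one half of a 32-bit reg means, the helpers below split a value into its lo and hi 16-bit halves and replace a single half. The function names are hypothetical and are not the ones used by the driver.

```c
#include <stdint.h>

/* Lower 16 bits of a 32-bit register value. */
uint16_t reg_lo(uint32_t val)
{
	return (uint16_t)(val & 0xffffu);
}

/* Upper 16 bits of a 32-bit register value. */
uint16_t reg_hi(uint32_t val)
{
	return (uint16_t)(val >> 16);
}

/* Rebuild the 32-bit value from its two halves. */
uint32_t reg_join(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | lo;
}

/* Replace only the hi half, leaving the lo half untouched. */
uint32_t reg_set_hi(uint32_t old, uint16_t hi)
{
	return reg_join(reg_lo(old), hi);
}

/* Replace only the lo half, leaving the hi half untouched. */
uint32_t reg_set_lo(uint32_t old, uint16_t lo)
{
	return reg_join(lo, reg_hi(old));
}
```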
    • Revert "net: dsa: qca8k: cache lo and hi for mdio write" · 03cb9e6d
      Christian Marangi authored
      This reverts commit 2481d206.
      
      The Documentation is very confusing about the topic.
      The cache logic for hi and lo is wrong and actually misses some regs that
      need to be written.
      
      What the Documentation actually intended was that it's possible to skip
      writing hi OR lo if that half of the reg does not need to be written or read.
      
      Revert the change in favor of a better and correct implementation.
      Reported-by: Ronald Wahl <ronald.wahl@raritan.com>
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Cc: stable@vger.kernel.org # v5.18+
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: tag_qca: fix wrong MGMT_DATA2 size · d9dba91b
      Christian Marangi authored
      It was discovered that MGMT_DATA2 can contain up to 28 bytes of data
      instead of the 12 bytes given in the Documentation, a figure obtained by
      taking the 16-byte limit declared in the Documentation and subtracting the
      first 4 bytes carried in the packet header.
      
      Update the define with the real world value.
      Tested-by: Ronald Wahl <ronald.wahl@raritan.com>
      Fixes: c2ee8181 ("net: dsa: tag_qca: add define for handling mgmt Ethernet packet")
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Cc: stable@vger.kernel.org # v5.18+
      Signed-off-by: David S. Miller <davem@davemloft.net>
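The size arithmetic can be written down as compile-time constants. The macro names below are hypothetical; only the byte counts come from the commit messages in this series (a documented limit of 16 bytes, an observed limit of 32 bytes, and 4 bytes carried in the packet header).

```c
/* Hypothetical names; the values are taken from the commit messages. */
#define MGMT_DOC_MAX_LEN   16 /* limit declared in the Documentation */
#define MGMT_REAL_MAX_LEN  32 /* limit observed on real hardware */
#define MGMT_DATA1_LEN      4 /* first 4 bytes, carried in the packet header */

/* What is left for MGMT_DATA2 under each assumption. */
#define MGMT_DATA2_LEN_DOC   (MGMT_DOC_MAX_LEN - MGMT_DATA1_LEN)
#define MGMT_DATA2_LEN_REAL  (MGMT_REAL_MAX_LEN - MGMT_DATA1_LEN)

_Static_assert(MGMT_DATA2_LEN_DOC == 12, "size implied by the Documentation");
_Static_assert(MGMT_DATA2_LEN_REAL == 28, "size found in practice");
```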
    • net: dsa: qca8k: fix wrong length value for mgmt eth packet · 9807ae69
      Christian Marangi authored
      The assumption that the Documentation was right about how this value works
      was wrong. It was discovered that the length value of the mgmt header is in
      steps of word size.
      
      As an example, to process 4 bytes of data the correct length to set is 2.
      To process 8 bytes it is 4, for 12 bytes 6, for 16 bytes 8...
      
      Odd values will always return the next size on the ack packet.
      (A length of 3 (6 bytes) will always return 8 bytes of data.)
      
      This means that a value of 15 (0xf) actually means reading/writing 32 bytes
      of data instead of 16 bytes. This behaviour is completely undocumented in
      the switch Documentation.
      
      In fact, according to the Documentation the max amount that mgmt eth can
      process is 16 bytes of data, while in reality it can process 32 bytes at once.
      
      To handle this we always round up the length after dividing it by the word
      size. We check if the result is odd and round up another time to align
      to what the switch will provide in the ack packet.
      The workaround for the length limit of 15 is still needed, as the max value
      of the length reg is 0xf (15).
      Reported-by: Ronald Wahl <ronald.wahl@raritan.com>
      Tested-by: Ronald Wahl <ronald.wahl@raritan.com>
      Fixes: 90386223 ("net: dsa: qca8k: add support for larger read/write size with mgmt Ethernet")
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Cc: stable@vger.kernel.org # v5.18+
      Signed-off-by: David S. Miller <davem@davemloft.net>
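The rounding rule described in this commit message can be sketched as a small standalone function. This is an interpretation of the text, not the driver code: the names are hypothetical, the word size of 2 bytes is inferred from the examples (4 bytes gives 2, 8 bytes gives 4), and the final clamp is one way to express the "length limit of 15" workaround.

```c
#include <stddef.h>

/* Hypothetical names; the values follow the commit message. */
#define MGMT_WORD_SIZE 2u  /* the length field counts 2-byte words */
#define MGMT_LEN_MAX   15u /* largest value the length reg can hold (0xf) */

/* Convert a byte count (1..32) into the value to put in the length field. */
unsigned int mgmt_len_field(size_t nbytes)
{
	/* Divide by the word size, rounding up. */
	unsigned int len =
		(unsigned int)((nbytes + MGMT_WORD_SIZE - 1) / MGMT_WORD_SIZE);

	/* The switch answers an odd length with the next size, so align
	 * to what the ack packet will actually contain. */
	if (len % 2 != 0)
		len++;

	/* 32 bytes would need a value of 16, which does not fit in the reg.
	 * A value of 15 is answered with 32 bytes anyway, so clamp to it. */
	if (len > MGMT_LEN_MAX)
		len = MGMT_LEN_MAX;

	return len;
}
```

Under these assumptions a request for 6 bytes yields a length of 4, matching the note that a length of 3 is always answered with 8 bytes, and a request for 32 bytes yields 15.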
  2. 30 Dec, 2022 18 commits
  3. 28 Dec, 2022 15 commits
    • net/mlx5: Lag, fix failure to cancel delayed bond work · 4d1c1379
      Eli Cohen authored
      Commit 0d4e8ed1 ("net/mlx5: Lag, avoid lockdep warnings")
      accidentally removed a call to cancel the delayed bond work, so the
      queued delayed work may expire and run on an already destroyed
      workqueue.
      
      Fix by restoring the call to cancel_delayed_work_sync() before
      destroying the workqueue.
      
      This prevents call trace such as this:
      
      [  329.230417] BUG: kernel NULL pointer dereference, address: 0000000000000000
       [  329.231444] #PF: supervisor write access in kernel mode
       [  329.232233] #PF: error_code(0x0002) - not-present page
       [  329.233007] PGD 0 P4D 0
       [  329.233476] Oops: 0002 [#1] SMP
       [  329.234012] CPU: 5 PID: 145 Comm: kworker/u20:4 Tainted: G OE      6.0.0-rc5_mlnx #1
       [  329.235282] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       [  329.236868] Workqueue: mlx5_cmd_0000:08:00.1 cmd_work_handler [mlx5_core]
       [  329.237886] RIP: 0010:_raw_spin_lock+0xc/0x20
       [  329.238585] Code: f0 0f b1 17 75 02 f3 c3 89 c6 e9 6f 3c 5f ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 f3 c3 89 c6 e9 45 3c 5f ff 0f 1f 44 00 00 0f 1f
       [  329.241156] RSP: 0018:ffffc900001b0e98 EFLAGS: 00010046
       [  329.241940] RAX: 0000000000000000 RBX: ffffffff82374ae0 RCX: 0000000000000000
       [  329.242954] RDX: 0000000000000001 RSI: 0000000000000014 RDI: 0000000000000000
       [  329.243974] RBP: ffff888106ccf000 R08: ffff8881004000c8 R09: ffff888100400000
       [  329.244990] R10: 0000000000000000 R11: ffffffff826669f8 R12: 0000000000002000
       [  329.246009] R13: 0000000000000005 R14: ffff888100aa7ce0 R15: ffff88852ca80000
       [  329.247030] FS:  0000000000000000(0000) GS:ffff88852ca80000(0000) knlGS:0000000000000000
       [  329.248260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [  329.249111] CR2: 0000000000000000 CR3: 000000016d675001 CR4: 0000000000770ee0
       [  329.250133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [  329.251152] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [  329.252176] PKRU: 55555554
      
      Fixes: 0d4e8ed1 ("net/mlx5: Lag, avoid lockdep warnings")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Set geneve_tlv_option_0_exist when matching on geneve option · e54638a8
      Maor Dickman authored
      The cited patch added support for matching on a geneve option by setting
      the geneve_tlv_option_0_data mask and key, but didn't set the
      geneve_tlv_option_0_exist bit, which is required on some HWs when matching
      on the geneve_tlv_option_0_data parameter. In some cases this may cause
      packets to wrongly match rules with a different geneve option.
      
      An example of such a case is a packet with geneve_tlv_object class=789 and
      data=456, which will wrongly match a rule that matches geneve_tlv_object
      class=123 and data=456.
      
      Fix it by setting the geneve_tlv_option_0_exist bit, when supported by the
      HW, when matching on the geneve_tlv_option_0_data parameter.
      
      Fixes: 9272e3df ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix hw mtu initializing at XDP SQ allocation · 1e267ab8
      Adham Faris authored
      The current xdp xmit functions (mlx5e_xmit_xdp_frame_mpwqe or
      mlx5e_xmit_xdp_frame) validate the xdp packet length by comparing it to
      the hw mtu (configured at xdp sq allocation) before transmitting it. This
      check does not account for the ethernet fcs length (calculated and filled
      in by the nic). Hence, when we try sending packets with length > (hw-mtu -
      ethernet-fcs-size), the device port drops them and tx_errors_phy is
      incremented. The desired behavior is for the driver to catch these packets
      and drop them.
      
      Fix this behavior in the XDP SQ allocation function (mlx5e_alloc_xdpsq) by
      subtracting the ethernet FCS size (4 bytes) from the current hw mtu
      value, since the ethernet FCS is calculated and written to ethernet frames
      by the nic.
      
      Fixes: d8bec2b2 ("net/mlx5e: Support bpf_xdp_adjust_head()")
      Signed-off-by: Adham Faris <afaris@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
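The length check described here reduces to a single subtraction. The sketch below uses hypothetical function names and assumes `hw_mtu` is at least 4; it is not the mlx5e code, only an illustration of why the FCS has to be left out of the budget the driver compares against.

```c
#include <stdbool.h>
#include <stdint.h>

#define FCS_LEN 4u /* ethernet FCS size in bytes, appended by the nic */

/* Largest frame the driver should hand to the device, given that the
 * nic will still add the FCS on top. Assumes hw_mtu >= FCS_LEN. */
uint32_t xdp_max_xmit_len(uint32_t hw_mtu)
{
	return hw_mtu - FCS_LEN;
}

/* True if the frame can be sent without the port dropping it. */
bool xdp_frame_fits(uint32_t frame_len, uint32_t hw_mtu)
{
	return frame_len <= xdp_max_xmit_len(hw_mtu);
}
```

With a hypothetical hw mtu of 1522, a 1518-byte frame fits while a 1519-byte frame is rejected by the driver instead of being dropped by the port.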
    • net/mlx5e: Always clear dest encap in neigh-update-del · 2951b2e1
      Chris Mi authored
      The cited commit introduced a bug for flows with multiple encapsulations.
      If one dest encap becomes invalid, the slow path flag is set on the flow.
      But when other dest encaps become invalid, they are not cleared because
      the slow path flag is already set on the flow. When neigh-update-add
      runs, it will use an invalid encap.
      
      Fix it by checking slow path flag after clearing dest encap.
      
      Fixes: 9a5f9cc7 ("net/mlx5e: Fix possible use-after-free deleting fdb rule")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: CT: Fix ct debugfs folder name · 849190e3
      Chris Mi authored
      sprintf, not sscanf, is needed to build a string. Otherwise
      dirname is null and neither "ct_nic" nor "ct_fdb" will be created.
      But it's redundant anyway, as the driver could be in switchdev mode but
      still add nic rules. So use "ct" as the folder name.
      
      Fixes: 77422a8f ("net/mlx5e: CT: Add ct driver counters")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix RX reporter for XSK RQs · f8c18a57
      Tariq Toukan authored
      RX reporter mistakenly reads from the regular (inactive) RQ
      when XSK RQ is active. Fix it here.
      
      Fixes: 3db4c85c ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default · b12d581e
      Dragos Tatulea authored
      mlx5e_build_nic_params will turn CQE compression on if the hardware
      capability is enabled and the slow_pci_heuristic condition is detected.
      As IPoIB doesn't support CQE compression, make sure to disable the
      feature in the IPoIB profile init.
      
      Please note that the feature is not exposed to the user for IPoIB
      interfaces, so it can't be subsequently turned on.
      
      Fixes: b797a684 ("net/mlx5e: Enable CQE compression when PCI is slower than link")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix RoCE setting at HCA level · c4ad5f2b
      Shay Drory authored
      mlx5 PF can disable RoCE for its VFs and SFs. In such case RoCE is
      marked as unsupported on those VFs/SFs.
      The cited patch added an option to disable (and enable) RoCE at HCA
      level. However, that commit didn't check whether RoCE is supported on
      the HCA and allowed the user to try to turn RoCE on.
      Fix it by checking whether the HCA supports RoCE.
      
      Fixes: fbfa97b4 ("net/mlx5: Disable roce at HCA level")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Avoid recovery in probe flows · 9078e843
      Shay Drory authored
      Currently, recovery is done without considering whether the device is
      still in probe flow.
      This may lead to recovery before the device has finished probing
      successfully, e.g. while mlx5_init_one() is running. The recovery flow
      uses functionality that is loaded only by mlx5_init_one(), and there
      is no point in running recovery before mlx5_init_one() has finished
      successfully.
      
      Fix it by waiting for probe flow to finish and checking whether the
      device is probed before trying to perform recovery.
      
      Fixes: 51d138c2 ("net/mlx5: Fix health error state handling")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix io_eq_size and event_eq_size params validation · 44aee8ea
      Shay Drory authored
      The io_eq_size and event_eq_size params are of param type
      DEVLINK_PARAM_TYPE_U32, but the validation callback treats them
      as DEVLINK_PARAM_TYPE_U16.
      
      This causes a validation mismatch on big-endian systems, where
      values in range were rejected while 268500991 was accepted.
      Fix it by checking the U32 value in the validation callback.
      
      Fixes: 0844fa5f ("net/mlx5: Let user configure io_eq_size param")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
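Why 268500991 slips through can be shown with a little arithmetic. The sketch below emulates what a big-endian machine sees when a 16-bit member overlays a 32-bit value in a union: the two most significant bytes. The bounds of 64 and 4096 are assumptions made for illustration, and the function names are hypothetical; none of this is the devlink or mlx5 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed valid range, for illustration only. */
#define EQ_SIZE_MIN 64u
#define EQ_SIZE_MAX 4096u

/* On a big-endian machine a u16 that shares storage with a u32 reads
 * the two most significant bytes of the u32. */
uint16_t u16_view_big_endian(uint32_t v)
{
	return (uint16_t)(v >> 16);
}

/* Buggy check: validates the u16 view of a u32 parameter. */
bool validate_as_u16_be(uint32_t v)
{
	uint16_t seen = u16_view_big_endian(v);

	return seen >= EQ_SIZE_MIN && seen <= EQ_SIZE_MAX;
}

/* Fixed check: validates the full u32 value. */
bool validate_as_u32(uint32_t v)
{
	return v >= EQ_SIZE_MIN && v <= EQ_SIZE_MAX;
}
```

268500991 is 0x1000FFFF, so its upper half reads as 0x1000 (4096) and passes the buggy check, while an in-range value such as 1024 (0x00000400) reads as 0 and is rejected.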
    • net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path · 2a35b2c2
      Jiri Pirko authored
      There are two cleanup calls missing in the mlx5_init_once() error path.
      Add them, making the error path flow the same as
      mlx5_cleanup_once().
      
      Fixes: 52ec462e ("net/mlx5: Add reserved-gids support")
      Fixes: 7c39afb3 ("net/mlx5: PTP code migration to driver core section")
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
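The general pattern behind this kind of fix is an error path that undoes, in reverse order, every step that already succeeded, so that it mirrors the cleanup function. The sketch below is generic and entirely hypothetical: the three steps, the `fail_at` knob and the `live` counter exist only to make the unwinding observable, and none of it is taken from mlx5.

```c
/* Hypothetical context: 'live' counts initialized resources and
 * 'fail_at' selects which step (1..3) should fail, 0 for none. */
struct init_ctx {
	int live;
	int fail_at;
};

int step_init(struct init_ctx *c, int step)
{
	if (c->fail_at == step)
		return -1;
	c->live++;
	return 0;
}

void step_cleanup(struct init_ctx *c)
{
	c->live--;
}

/* Each failure jumps to a label that undoes exactly the steps that
 * succeeded before it, in reverse order. */
int init_once(struct init_ctx *c)
{
	int err;

	err = step_init(c, 1);
	if (err)
		return err;

	err = step_init(c, 2);
	if (err)
		goto err_step1;

	err = step_init(c, 3);
	if (err)
		goto err_step2;

	return 0;

err_step2:
	step_cleanup(c); /* undo step 2 */
err_step1:
	step_cleanup(c); /* undo step 1 */
	return err;
}

/* Full teardown: the same calls as the complete error path. */
void cleanup_once(struct init_ctx *c)
{
	step_cleanup(c); /* step 3 */
	step_cleanup(c); /* step 2 */
	step_cleanup(c); /* step 1 */
}
```

A missing cleanup call in one of the labels would leave `live` above zero after a failed init, which is the kind of leak such a fix closes.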
    • net/mlx5: E-Switch, properly handle ingress tagged packets on VST · 1f0ae22a
      Moshe Shemesh authored
      Fix SRIOV VST mode behavior to insert cvlan when a guest tag is already
      present in the frame. Previous VST mode behavior was to drop packets or
      override existing tag, depending on the device version.
      
      In this patch we fix this behavior by correctly building the HW steering
      rule with a push vlan action, or for older devices we ask the FW to stack
      the vlan when a vlan is already present.
      
      Fixes: 07bab950 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
      Fixes: dfcb1ed3 ("net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/sched: fix retpoline wrapper compilation on configs without tc filters · 40cab44b
      Pedro Tammela authored
      Rudi reports a compilation failure on x86_64 when CONFIG_NET_CLS or
      CONFIG_NET_CLS_ACT is not set but CONFIG_RETPOLINE is set.
      A misplaced '#endif' was causing the issue.
      
      Fixes: 7f0e8102 ("net/sched: add retpoline wrapper for tc")
      Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: convert sysfs snprintf to sysfs_emit · c2052189
      Xuezhi Zhang authored
      Follow the advice of Documentation/filesystems/sysfs.rst:
      show() should only use sysfs_emit() or sysfs_emit_at()
      when formatting the value to be returned to user space.
      Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'r8169-fixes' · 0e3d1835
      David S. Miller authored
      Chunhao Lin says:
      
      ====================
      r8169: fix dmar pte write access is not set error
      
      This series fixes the "dmar pte write access is not set" error.
      
      Chunhao Lin (2):
        r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down()
        r8169: fix dmar pte write access is not set error
      
      v2:
      -update commit message
      -adjust the code according to current kernel code
      v3:
      -update title and commit message
      -split the patch
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>