1. 23 Jul, 2020 4 commits
    • enetc: Remove the imdio bus on PF probe bailout · c6dd6488
      Claudiu Manoil authored
      enetc_imdio_remove() is missing from the enetc_pf_probe()
      bailout path. Not surprisingly because enetc_setup_serdes()
      is registering the imdio bus for internal purposes, and it's
      not obvious that enetc_imdio_remove() currently performs the
      teardown of enetc_setup_serdes().
      To fix this, define enetc_teardown_serdes() to wrap
      enetc_imdio_remove() (improve code maintenance) and call it
      on bailout and remove paths.
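
      The pairing described above can be sketched in plain userspace C.
      This is a minimal illustration, not the driver code: struct
      pf_sketch and every function below are hypothetical stand-ins for
      the enetc structures and for enetc_setup_serdes() /
      enetc_teardown_serdes().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PF private data. */
struct pf_sketch {
	bool imdio_registered;	/* true while the internal MDIO bus exists */
	bool netdev_registered;
};

/* Stand-in for enetc_setup_serdes(): registers the internal MDIO bus. */
static int setup_serdes_sketch(struct pf_sketch *pf)
{
	assert(pf != NULL);
	pf->imdio_registered = true;
	return 0;
}

/* Stand-in for enetc_teardown_serdes(): the named counterpart of setup. */
static void teardown_serdes_sketch(struct pf_sketch *pf)
{
	pf->imdio_registered = false;	/* the real wrapper calls enetc_imdio_remove() */
}

/* Probe with a bailout path; fail_late simulates a failure after setup. */
int pf_probe_sketch(struct pf_sketch *pf, bool fail_late)
{
	int err = setup_serdes_sketch(pf);

	if (err)
		return err;
	if (fail_late) {
		err = -1;
		goto err_teardown;
	}
	pf->netdev_registered = true;
	return 0;

err_teardown:
	teardown_serdes_sketch(pf);	/* the step that was missing on bailout */
	return err;
}

/* The remove path undoes the same setup through the same wrapper. */
void pf_remove_sketch(struct pf_sketch *pf)
{
	pf->netdev_registered = false;
	teardown_serdes_sketch(pf);
}
```

      Giving the teardown its own name makes it visible at a glance
      that both the bailout and the remove path undo the setup step.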
      
      Fixes: 975d183e ("net: enetc: Initialize SerDes for SGMII and USXGMII protocols")
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qed: Remove unneeded cast from memory allocation · 7979a7d2
      Wang Hai authored
      Remove casting the values returned by memory allocation function.
      
      Coccinelle emits WARNING: casting value returned by memory allocation
      function to (struct roce_destroy_qp_req_output_params *) is useless.
      
      This issue was detected by using the Coccinelle software.
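
      As a userspace illustration of the same cleanup (a sketch only;
      struct out_params_sketch and both functions are hypothetical, and
      calloc() stands in for the kernel allocator):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an output-parameters structure. */
struct out_params_sketch {
	unsigned int qp_id;
	unsigned int flags;
};

struct out_params_sketch *alloc_out_params(void)
{
	/*
	 * void * converts implicitly in C, so no cast is needed, and
	 * sizeof(*p) keeps the size tied to the pointer's type.
	 */
	struct out_params_sketch *p = calloc(1, sizeof(*p));

	return p;	/* NULL on failure; the caller must check */
}

/* Example caller: allocate, use, free. Returns 0 on success, -1 on failure. */
int out_params_roundtrip(unsigned int qp_id)
{
	struct out_params_sketch *p = alloc_out_params();

	if (!p)
		return -1;
	assert(p->flags == 0);	/* calloc() zero-fills the object */
	p->qp_id = qp_id;
	free(p);
	return 0;
}
```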
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fix check in get_phy_c45_ids · fb16d465
      Vladimir Oltean authored
      After the patch below, the iteration through the available MMDs is
      completely short-circuited, and devs_in_pkg remains set to the initial
      value of zero.
      
      Due to devs_in_pkg being zero, the rest of get_phy_c45_ids() is
      short-circuited too: the following loop never reaches below this point
      either (it executes "continue" for every device in package, failing to
      retrieve PHY ID for any of them):
      
      	/* Now probe Device Identifiers for each device present. */
      	for (i = 1; i < num_ids; i++) {
      		if (!(devs_in_pkg & (1 << i)))
      			continue;
      
      So c45_ids->device_ids remains populated with zeroes. This causes an
      Aquantia AQR412 PHY (same as any C45 PHY would, in fact) to be probed by
      the Generic PHY driver.
      
      The issue seems to be a case of submitting partially committed work (and
      therefore testing something other than what was submitted).
      
      The intention of the patch was to delay exiting the loop until one more
      condition is reached (the devs_in_pkg read from hardware is either 0, OR
      mostly f's). So fix the patch to reflect that.
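
      The intended loop condition can be sketched as follows. This is a
      userspace illustration, not the phylib code: the callback type,
      the sample reader, the slot count and the 0x1fffffff mask used for
      "mostly f's" are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MMD_NUM_SKETCH 32	/* assumed number of MMD slots to scan */

/* Hypothetical callback: the devices-in-package word read at one MMD. */
typedef uint32_t (*read_devs_fn)(int mmd);

/* True when the value means "nothing usable": zero or mostly f's. */
static int devs_in_pkg_invalid(uint32_t v)
{
	return v == 0 || (v & 0x1fffffff) == 0x1fffffff;
}

/* Keep scanning while the last value read is still unusable. */
uint32_t find_devs_in_pkg(read_devs_fn read_devs)
{
	uint32_t devs_in_pkg = 0;
	int i;

	assert(read_devs != NULL);
	for (i = 1; i < MMD_NUM_SKETCH && devs_in_pkg_invalid(devs_in_pkg); i++)
		devs_in_pkg = read_devs(i);

	return devs_in_pkg;	/* may still be unusable; the caller must check */
}

/* Sample reader: MMD 1 answers all f's, MMD 2 answers 0, MMD 3 is valid. */
uint32_t sample_read_devs(int mmd)
{
	if (mmd == 1)
		return 0xffffffff;
	if (mmd == 2)
		return 0;
	return 0xe000009a;
}
```

      With the initial value of zero counted as "keep going", the loop
      body runs at least once, which is exactly what the broken check
      prevented.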
      
      Tested with traffic on a LS1028A-QDS, the PHY is now probed correctly
      using the Aquantia driver. The devs_in_pkg bit field is set to
      0xe000009a, and the MMDs that are present have the following IDs:
      
      [    5.600772] libphy: get_phy_c45_ids: device_ids[1]=0x3a1b662
      [    5.618781] libphy: get_phy_c45_ids: device_ids[3]=0x3a1b662
      [    5.630797] libphy: get_phy_c45_ids: device_ids[4]=0x3a1b662
      [    5.654535] libphy: get_phy_c45_ids: device_ids[7]=0x3a1b662
      [    5.791723] libphy: get_phy_c45_ids: device_ids[29]=0x3a1b662
      [    5.804050] libphy: get_phy_c45_ids: device_ids[30]=0x3a1b662
      [    5.816375] libphy: get_phy_c45_ids: device_ids[31]=0x0
      
      [    7.690237] mscc_felix 0000:00:00.5: PHY [0.5:00] driver [Aquantia AQR412] (irq=POLL)
      [    7.704739] mscc_felix 0000:00:00.5: PHY [0.5:01] driver [Aquantia AQR412] (irq=POLL)
      [    7.718918] mscc_felix 0000:00:00.5: PHY [0.5:02] driver [Aquantia AQR412] (irq=POLL)
      [    7.733044] mscc_felix 0000:00:00.5: PHY [0.5:03] driver [Aquantia AQR412] (irq=POLL)
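
      The device_ids[] indices in the log follow directly from the set
      bits of devs_in_pkg. A small helper (hypothetical, for
      illustration only) that lists them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store the positions of the set bits of devs_in_pkg (bit 0 skipped,
 * as in the probe loop quoted above) into out[]; return how many.
 */
int mmds_present(uint32_t devs_in_pkg, int out[32])
{
	int n = 0;
	int i;

	assert(out != NULL);
	for (i = 1; i < 32; i++)
		if (devs_in_pkg & (UINT32_C(1) << i))
			out[n++] = i;
	return n;
}
```

      For 0xe000009a this yields 1, 3, 4, 7, 29, 30 and 31, matching
      the seven device_ids[] entries printed above.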
      
      Fixes: bba238ed ("net: phy: continue searching for C45 MMDs even if first returned ffff:ffff")
      Reported-by: Colin King <colin.king@canonical.com>
      Reported-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dccp: Add SIOCOUTQ IOCTL support (send buffer fill) · 749c08f8
      Richard Sailer authored
      This adds support for the SIOCOUTQ IOCTL to get the send buffer fill
      of a DCCP socket, like UDP and TCP sockets already have.
      
      Regarding the used data field: DCCP uses per packet sequence numbers,
      not per byte, so sequence numbers can't be used like in TCP. sk_wmem_queued
      is not used by DCCP and always 0, even in test on highly congested paths.
      Therefore this uses sk_wmem_alloc like in UDP.
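
      From userspace the query looks the same for every socket type
      that supports it. A minimal sketch, assuming a Linux system with
      <linux/sockios.h>; the helper name send_queue_bytes() is made up
      for the example:

```c
#include <assert.h>
#include <linux/sockios.h>	/* SIOCOUTQ */
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Return the send-buffer fill of fd in bytes, or -1 if the ioctl fails. */
int send_queue_bytes(int fd)
{
	int pending = 0;

	assert(fd >= 0);
	if (ioctl(fd, SIOCOUTQ, &pending) < 0)
		return -1;
	return pending;
}
```

      With this patch the helper should also work on a socket created
      with socket(AF_INET, SOCK_DCCP, IPPROTO_DCCP), provided the
      running kernel has DCCP support; on kernels without it the ioctl
      is expected to fail and the helper returns -1.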
      Signed-off-by: Richard Sailer <richard_siegfried@systemli.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 22 Jul, 2020 15 commits
  3. 21 Jul, 2020 21 commits
    • Merge branch 'dpaa2-eth-add-support-for-TBF-offload' · 4303aa98
      David S. Miller authored
      Ioana Ciornei says:
      
      ====================
      dpaa2-eth: add support for TBF offload
      
      This patch set adds support for TBF offload in dpaa2-eth.
      The first patch restructures how the .ndo_setup_tc() callback is
      implemented (each Qdisc is treated in a separate function), the second
      patch just adds the necessary APIs for configuring the Tx shaper and the
      last one handles TC_SETUP_QDISC_TBF and configures the shaper as
      requested.
      ====================
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: add support for TBF offload · 3657cdaf
      Ioana Ciornei authored
      React to TC_SETUP_QDISC_TBF and configure the egress shaper as
      appropriate with the maximum rate and burst size requested by the user.
      TBF can only be offloaded on DPAA2 when it's the root qdisc, i.e. it's a
      per port shaper.
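
      The root-only rule and the translation into shaper parameters can
      be sketched like this. Everything here is hypothetical: the
      structures are not the driver's, SKETCH_HANDLE_ROOT merely plays
      the role of TC_H_ROOT, and the Mbit/s hardware unit is an
      assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HANDLE_ROOT 0xFFFFFFFFu	/* plays the role of TC_H_ROOT */

/* Hypothetical view of what the user asked for. */
struct tbf_request_sketch {
	uint32_t parent;	/* where the qdisc is attached */
	uint64_t rate_bytes_ps;	/* requested maximum rate, bytes per second */
	uint32_t max_burst;	/* requested burst size, bytes */
};

/* Hypothetical per-port shaper configuration. */
struct shaper_cfg_sketch {
	uint32_t rate_mbps;	/* assumed hardware unit: Mbit/s */
	uint32_t max_burst;
};

int setup_tbf_sketch(const struct tbf_request_sketch *req,
		     struct shaper_cfg_sketch *out)
{
	assert(req != NULL && out != NULL);

	/* A per-port shaper can only represent a root qdisc. */
	if (req->parent != SKETCH_HANDLE_ROOT)
		return -EOPNOTSUPP;

	out->rate_mbps = (uint32_t)(req->rate_bytes_ps * 8 / 1000000);
	out->max_burst = req->max_burst;
	return 0;
}
```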
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: add API for Tx shaping · 39344a89
      Ioana Ciornei authored
      Add the necessary API (dpni_set_tx_shaping) for configuring the rate and
      burst size of a per port shaper in DPAA2.
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: move the mqprio setup into a separate function · e3ec13be
      Ioana Ciornei authored
      Move the setup done for MQPRIO into a separate function so that
      with the addition of another offload we do not crowd
      dpaa2_eth_setup_tc(). After this restructuring it's easier to see what
      is supported in terms of Qdisc offloading.
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: move helper to where its used · c1d069e3
      Florian Westphal authored
      Only used in token.c.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'devlink-small-improvements' · 1fe4085f
      David S. Miller authored
      Parav Pandit says:
      
      ====================
      devlink small improvements
      
      This short series improves the devlink code by adding a lock comment,
      simplifying checks and limiting the scope of the mutex lock to the
      necessary fields.
      
      Patch summary:
      Patch-1 Keeps the devlink_mutex only for necessary changes
      Patch-2 Avoids duplicate check for reload flag
      Patch-3 Adds missing comment for the scope of devlink instance lock
      Patch-4 Constify devlink instance pointer
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Constify devlink instance pointer · eac5f8a9
      Parav Pandit authored
      Constify devlink instance pointer while checking if reload operation is
      supported or not.
      
      This helps to review the scope of checks done in reload.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Add comment for devlink instance lock · 336ce1c9
      Parav Pandit authored
      Add comment to describe the purpose of devlink instance lock.
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Avoid duplicate check for reload enabled flag · 9232a3e6
      Parav Pandit authored
      Whether the reload operation is enabled is already checked by
      devlink_reload(). Hence, remove the duplicate check from
      devlink_nl_cmd_reload().
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Do not hold devlink mutex when initializing devlink fields · 6553e561
      Parav Pandit authored
      There is no need to hold a device global lock when initializing
      devlink device fields of a devlink instance which is not yet part of the
      devices list.
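
      The same idea in a userspace sketch with a pthread mutex. The
      structure, the list and the function names are hypothetical and
      only illustrate the locking scope; they are not the devlink code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical instance kept on a global list. */
struct instance_sketch {
	int reload_enabled;
	struct instance_sketch *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct instance_sketch *instances;

void instance_register(struct instance_sketch *inst)
{
	assert(inst != NULL);

	/* Nobody else can reach inst yet, so no lock is needed here. */
	inst->reload_enabled = 0;
	inst->next = NULL;

	/* Only publishing the instance on the shared list needs the lock. */
	pthread_mutex_lock(&list_lock);
	inst->next = instances;
	instances = inst;
	pthread_mutex_unlock(&list_lock);
}

int instance_count(void)
{
	struct instance_sketch *p;
	int n = 0;

	pthread_mutex_lock(&list_lock);
	for (p = instances; p; p = p->next)
		n++;
	pthread_mutex_unlock(&list_lock);
	return n;
}
```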
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: allow to enable ASPM on RTL8125A · 3fc364c0
      Heiner Kallweit authored
      For most chip versions this has been added already. Also allow
      enabling ASPM on RTL8125A.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ena-driver-new-features' · 4c8024f7
      David S. Miller authored
      Arthur Kiyanovski says:
      
      ====================
      ENA driver new features
      
      V4 changes:
      -----------
      Add smp_rmb() to "net: ena: avoid unnecessary rearming of interrupt
      vector when busy-polling" to adhere to the linux kernel memory model,
      and update the commit message accordingly.
      
      V3 changes:
      -----------
      1. Add "net: ena: enable support of rss hash key and function
         changes" patch again, with more explanations why it should
         be in net-next in commit message.
      2. Add synchronization considerations to "net: ena: avoid unnecessary
         rearming of interrupt vector when busy-polling"
      
      V2 changes:
      -----------
      1. Update commit messages of 2 patches to be more verbose.
      2. Remove "net: ena: enable support of rss hash key and function
         changes" patch. Will be resubmitted to net.
      
      V1 cover letter:
      ----------------
      This patchset contains performance improvements, support for new devices
      and functionality:
      
      1. Support for upcoming ENA devices
      2. Avoid unnecessary IRQ unmasking in busy poll to reduce interrupt rate
      3. Enabling device support for RSS function and key manipulation
      4. Support for NIC-based traffic mirroring (SPAN port)
      5. Additional PCI device ID
      6. Cosmetic changes
      ====================
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: support new LLQ acceleration mode · 0e3a3f6d
      Arthur Kiyanovski authored
      New devices add a new hardware acceleration engine, which adds some
      restrictions to the driver.
      Metadata descriptor must be present for each packet and the maximum
      burst size between two doorbells is now limited to a number
      advertised by the device.
      
      This patch adds:
      1. A handshake protocol between the driver and the device, so the
      device will enable the accelerated queues only when both sides
      support it.
      
      2. The driver support for the new acceleration engine:
      2.1. Send metadata descriptor for each Tx packet.
      2.2. Limit the number of packets sent between doorbells.(*)
      
      (*) A previous driver implementation of this feature was committed in
      commit 05d62ca2 ("net: ena: add handling of llq max tx burst size")
      however the design of the interface between the driver and device
      changed since then. This change is reflected in this commit.
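
      Item 2.2 can be pictured with a small counter-based sketch. It is
      only an illustration under assumptions: the structure and function
      names are hypothetical and the real driver/device interface is
      not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Tx queue state. */
struct txq_sketch {
	uint32_t max_burst;	/* limit advertised by the device, in packets */
	uint32_t pending;	/* packets queued since the last doorbell */
	uint32_t doorbells;	/* number of doorbells rung so far */
};

/* Tell the device about everything queued so far. */
void txq_ring_doorbell(struct txq_sketch *q)
{
	if (q->pending == 0)
		return;
	q->doorbells++;
	q->pending = 0;
}

/* Queue one packet, never exceeding max_burst packets per doorbell. */
void txq_queue_packet(struct txq_sketch *q)
{
	assert(q->max_burst > 0);
	if (q->pending == q->max_burst)
		txq_ring_doorbell(q);	/* burst budget used up: flush first */
	q->pending++;
}
```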
      Signed-off-by: Netanel Belgazal <netanel@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: move llq configuration from ena_probe to ena_device_init() · c29efeae
      Arthur Kiyanovski authored
      When the ENA device resets to recover from some error state, all LLQ
      configuration values are reset to their defaults, because LLQ is
      initialized only once during ena_probe().
      
      Changes in this commit:
      1. Move the LLQ configuration process into ena_device_init()
      which is called from both ena_probe() and ena_restore_device(). This
      way, LLQ setup configurations that are different from the default
      values will survive resets.
      
      2. Extract the LLQ bar mapping to ena_map_llq_bar(),
      and call once in the lifetime of the driver from ena_probe(),
      since there is no need to unmap and map the LLQ bar again every reset.
      
      3. Map the LLQ bar if it exists, regardless if initialization of LLQ
      placement policy (ENA_ADMIN_PLACEMENT_POLICY_DEV) succeeded
      or not. Initialization might fail the first time, falling back to the
      ENA_ADMIN_PLACEMENT_POLICY_HOST placement policy, but later succeed
      after device reset, in which case the LLQ bar needs to be mapped
      already.
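
      The split between "map once" and "configure on every init" can be
      sketched as below. All names and the default value are
      hypothetical; the sketch only mirrors the call structure
      described in items 1 and 2.

```c
#include <assert.h>

#define LLQ_DEFAULT_ENTRY_SIZE 128	/* assumed default, for illustration */

/* Hypothetical device state. */
struct ena_dev_sketch {
	int bar_map_count;	/* how many times the LLQ BAR was mapped */
	int llq_entry_size;	/* one LLQ setting that a reset clears */
};

/* Done once in the driver's lifetime, from probe only. */
static void map_llq_bar_sketch(struct ena_dev_sketch *dev)
{
	dev->bar_map_count++;
}

/* Runs on probe and after every reset, so LLQ settings are re-applied. */
static void device_init_sketch(struct ena_dev_sketch *dev, int wanted_size)
{
	assert(wanted_size > 0);
	dev->llq_entry_size = wanted_size;
}

void probe_sketch(struct ena_dev_sketch *dev, int wanted_size)
{
	map_llq_bar_sketch(dev);
	device_init_sketch(dev, wanted_size);
}

void restore_sketch(struct ena_dev_sketch *dev, int wanted_size)
{
	dev->llq_entry_size = LLQ_DEFAULT_ENTRY_SIZE;	/* the reset wipes the setting */
	device_init_sketch(dev, wanted_size);		/* re-apply without re-mapping */
}
```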
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: enable support of rss hash key and function changes · 0ee60edf
      Arthur Kiyanovski authored
      Add the rss_configurable_function_key bit to driver_supported_feature.
      
      This bit tells the device that the driver in question supports the
      retrieving and updating of RSS function and hash key, and therefore
      the device should allow RSS function and key manipulation.
      
      This commit turns on device support for hash key and RSS function
      management. Without this commit this feature is turned off at the
      device and appears to the user as unsupported.
      
      This commit concludes the following series of already merged commits:
      commit 0af3c4e2 ("net: ena: changes to RSS hash key allocation")
      commit c1bd17e5 ("net: ena: change default RSS hash function to Toeplitz")
      commit f66c2ea3 ("net: ena: allow setting the hash function without changing the key")
      commit e9a1de37 ("net: ena: fix error returning in ena_com_get_hash_function()")
      commit 80f8443f ("net: ena: avoid unnecessary admin command when RSS function set fails")
      commit 6a4f7dc8 ("net: ena: rss: do not allocate key when not supported")
      commit 0d1c3de7 ("net: ena: fix incorrect default RSS key")
      
      The above commits represent the last part of the implementation of
      this feature, and with them merged the feature can be enabled
      in the device.
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: add support for traffic mirroring · 0f505c60
      Arthur Kiyanovski authored
      Add support for traffic mirroring, where the hardware reads the
      buffer from the instance memory directly.
      
      Traffic Mirroring needs access to the rx buffers in the instance.
      To have this access, this patch:
      1. Changes the code to map and unmap the rx buffers bidirectionally.
      2. Enables the relevant bit in driver_supported_features to indicate
         to the FW that this driver supports traffic mirroring.
      
      Rx completion is not generated until mirroring is done to avoid
      the situation where the driver changes the buffer before it is
      mirrored.
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: cosmetic: change ena_com_stats_admin stats to u64 · 0dcec686
      Arthur Kiyanovski authored
      The size of the admin statistics in ena_com_stats_admin is changed
      from 32 bits to 64 bits to align with the sizes of the other statistics
      in the driver (i.e. rx_stats, tx_stats and ena_stats_dev).
      
      This is done as part of an effort to create a unified API to read
      statistics.
      Signed-off-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: cosmetic: satisfy gcc warning · 79890d3f
      Arthur Kiyanovski authored
      gcc 4.8 reports a warning when initializing with = {0}.
      Dropping the "0" from the braces fixes the issue.
      This fix is not ANSI compatible but is allowed by gcc.
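
      A small sketch of the two spellings. The commit does not say
      which structure or which warning was involved; the nested
      aggregate below is only one plausible trigger and is an
      assumption, as are all names.

```c
#include <assert.h>

struct inner_sketch {
	int a;
	int b;
};

struct outer_sketch {
	struct inner_sketch first;	/* nested aggregate as first member */
	int tail;
};

/* Empty braces: accepted by gcc as an extension (standard only in C23). */
int empty_braces_zero_init(void)
{
	struct outer_sketch o = {};

	return o.first.a == 0 && o.first.b == 0 && o.tail == 0;
}

/* Portable alternative: brace the nested aggregate explicitly. */
int nested_braces_zero_init(void)
{
	struct outer_sketch o = { { 0 } };

	return o.first.a == 0 && o.first.b == 0 && o.tail == 0;
}

/* Both spellings zero every member. */
void zero_init_demo(void)
{
	assert(empty_braces_zero_init());
	assert(nested_braces_zero_init());
}
```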
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: add reserved PCI device ID · 866032ab
      Arthur Kiyanovski authored
      Add a reserved PCI device ID to the driver's table,
      used for internal testing purposes.
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: avoid unnecessary rearming of interrupt vector when busy-polling · 1e5ae350
      Arthur Kiyanovski authored
      For an overview of the race created by this patch, see the
      "synchronization" section below.
      
      In napi busy-poll mode, the kernel invokes the napi handler of the
      device repeatedly to poll the NIC's receive queues. This process
      repeats until a timeout, specific for each connection, is up.
      By polling packets in busy-poll mode the user may gain lower latency
      and higher throughput (since the kernel no longer waits for interrupts
      to poll the queues) at the expense of CPU usage.
      
      Upon completing a napi routine, the driver checks whether
      the routine was called by an interrupt handler. If so, the driver
      re-enables interrupts for the device. This is needed since an
      interrupt routine invocation disables future invocations until
      explicitly re-enabled.
      
      The driver avoids re-enabling the interrupts if they were not disabled
      in the first place (e.g. if the driver is in busy-poll mode).
      Originally, the driver checked whether interrupt re-enabling is needed
      by reading the 'ena_napi->unmask_interrupt' variable. This atomic
      variable was set upon interrupt and cleared after re-enabling it.
      
      In the 4.10 Linux version, the 'napi_complete_done' call was changed
      so that it returns 'false' when device should not re-enable
      interrupts, and 'true' otherwise. The change includes reading the
      "NAPIF_STATE_IN_BUSY_POLL" flag to check if the napi call is in
      busy-poll mode, and if so, return 'false'.
      The driver was changed to re-enable interrupts according to this
      routine's return value.
      The Linux community rejected the use of the
      'ena_napi->unmask_interrupt' variable to determine whether
      unmasking is needed, and urged using the napi_complete_done()
      return value alone.
      See https://lore.kernel.org/patchwork/patch/741149/ for more details.
      
      As explained, a busy-poll session exists for a specified timeout
      value, after which it exits the busy-poll mode and re-enters it later.
      This leads to many invocations of the napi handler where
      napi_complete_done() falsely indicates that interrupts should be
      re-enabled.
      This creates a bug in which the interrupts are re-enabled
      unnecessarily.
      To reproduce this bug:
          1) echo 50 | sudo tee /proc/sys/net/core/busy_poll
          2) echo 50 | sudo tee /proc/sys/net/core/busy_read
          3) Add counters that check whether
          'ena_unmask_interrupt(tx_ring, rx_ring);'
          is called without disabling the interrupts in the first
          place (i.e. without calling the interrupt routine
          ena_intr_msix_io())
      
      Steps 1+2 enable busy-poll as the default mode for new connections.
      
      The busy poll routine rearms the interrupts after every session by
      design, and so we need to add an extra check that the interrupts were
      masked in the first place.
      
      synchronization:
      This patch introduces a race between the interrupt handler
      ena_intr_msix_io() and the napi routine ena_io_poll().
      Some macros and instructions were added to prevent this race from leaving
      the interrupts masked. The following specifies the different race
      scenarios in this patch:
      
      1) interrupt handler and napi routine run sequentially
          i) interrupt handler is called, sets 'interrupts_masked' flag and
      	successfully schedules the napi handler via softirq.
      
          In this scenario the napi routine might not see the flag change
          for several reasons:
      	a) The flag is stored in a register by the compiler. For this
      	case the WRITE_ONCE macro is used, which prevents this.
      	b) The compiler might reorder the instruction. For this the
      	smp_wmb() instruction was used which implies a compiler memory
      	barrier.
      	c) On archs with weak consistency model (like ARM64) the napi
      	routine might be scheduled and start running before the flag
      	STORE instruction is committed to cache/memory. To ensure this
      	doesn't happen, the smp_wmb() instruction was added. It ensures
      	that the flag set instruction is committed before scheduling
      	napi.
      
          ii) compiler reorders the flag's value check in the 'if' with
          the flag set in the napi routine.
      
          This scenario is prevented by smp_rmb() call after the flag check.
      
      2) interrupt handler and napi routine run in parallel (can happen when
      busy poll routine invokes the napi handler)
      
          i) interrupt handler sets the flag in one core, while the napi
          routine reads it in another core.
      
          This scenario also is divided into two cases:
      	a) napi_complete_done() doesn't finish running, in which case
      	napi_schedule() would just set NAPIF_STATE_MISSED and the napi
      	routine would reschedule itself without changing the flag's value.
      
      	b) napi_complete_done() finishes running. In this case the
      	napi routine might override the flag's value.
      	This doesn't present any risk since it later unmasks the
      	interrupt vector.
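
      The flag handling described in the synchronization section can be
      approximated in userspace with C11 atomics. This is an analogue
      under assumptions, not the driver code: the structure and function
      names are hypothetical, relaxed atomic accesses stand in for
      WRITE_ONCE()/READ_ONCE(), and the fences are merely placed where
      the text places smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vector state. */
struct napi_sketch {
	atomic_bool interrupts_masked;	/* set by the interrupt handler */
	atomic_bool scheduled;		/* stands in for scheduling napi */
	int unmask_calls;		/* how often the vector was re-armed */
};

/* Interrupt handler side: publish the flag, then schedule polling. */
void irq_handler_sketch(struct napi_sketch *n)
{
	assert(n != NULL);
	atomic_store_explicit(&n->interrupts_masked, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* position of smp_wmb() */
	atomic_store_explicit(&n->scheduled, true, memory_order_relaxed);
}

/*
 * Poll side: re-arm only if polling completed AND the vector was really
 * masked by an interrupt. Returns true when the vector was re-armed.
 */
bool poll_done_sketch(struct napi_sketch *n, bool complete_done)
{
	if (complete_done &&
	    atomic_load_explicit(&n->interrupts_masked, memory_order_relaxed)) {
		atomic_thread_fence(memory_order_acquire);	/* position of smp_rmb() */
		atomic_store_explicit(&n->interrupts_masked, false,
				      memory_order_relaxed);
		n->unmask_calls++;
		return true;
	}
	return false;
}
```

      A busy-poll pass that was never preceded by an interrupt finds
      the flag clear and therefore skips the re-arm, which is the
      behaviour the patch is after.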
      Signed-off-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix ILT and XRCD bitmap memory leaks · d4eae993
      Yuval Basson authored
      - Free ILT lines used for XRC-SRQ's contexts.
      - Free XRCD bitmap
      
      Fixes: b8204ad8 ("qed: changes to ILT to support XRC")
      Fixes: 7bfb399e ("qed: Add XRC to RoCE")
      Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
      Signed-off-by: Yuval Basson <ybason@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>