1. 23 Jan, 2023 8 commits
    • net: dsa: add plumbing for changing and getting MAC merge layer state · 5f6c2d49
      Vladimir Oltean authored
      The DSA core is in charge of the ethtool_ops of the net devices
      associated with switch ports, so in case a hardware driver supports the
      MAC merge layer, DSA must pass the callbacks through to the driver.
      Add support for precisely that.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: add helpers for MM fragment size translation · dd1c4164
      Vladimir Oltean authored
      We deliberately make the Linux UAPI pass the minimum fragment size in
      octets, even though IEEE 802.3 defines it as discrete values, and
      addFragSize is just the multiplier. This is because nothing prevents
      operating with an in-between value for the fragment size of non-final
      preempted fragments, and hardware which supports the in-between sizes
      may even appear.
      
      For hardware which only understands the addFragSize multiplier,
      create two helpers which translate between the multiplier and the
      values passed in octets.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
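The translation described above can be sketched in user-space C. This is an illustration, not the kernel's helpers: the `demo_` names are hypothetical, and the formula assumes that the discrete sizes in IEEE 802.3 are 64 * (1 + addFragSize) octets including a 4-octet FCS, with addFragSize limited to 0..3.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_ETH_ZLEN    60 /* minimum frame length without FCS, in octets */
#define DEMO_ETH_FCS_LEN  4 /* frame check sequence length, in octets */

/* Multiplier (0..3) to minimum non-final fragment size in octets, FCS excluded.
 * Assumed formula: 64 * (1 + addFragSize) - 4.
 */
static uint32_t demo_frag_size_add_to_min(uint32_t val_add)
{
	return (DEMO_ETH_ZLEN + DEMO_ETH_FCS_LEN) * (1 + val_add) - DEMO_ETH_FCS_LEN;
}

/* Octets to multiplier. Only sizes that match one of the four discrete
 * values exactly can be expressed; anything in between is rejected.
 */
static int demo_frag_size_min_to_add(uint32_t val_min, uint32_t *val_add)
{
	for (uint32_t add = 0; add < 4; add++) {
		if (demo_frag_size_add_to_min(add) == val_min) {
			*val_add = add;
			return 0;
		}
	}

	return -EINVAL;
}
```

Under these assumptions the four representable sizes are 60, 124, 188 and 252 octets; hardware that supports in-between sizes would not need the reverse helper at all.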
    • net: ethtool: add helpers for aggregate statistics · 449c5459
      Vladimir Oltean authored
      When a pMAC exists but the driver is unable to atomically query the
      aggregate eMAC+pMAC statistics, the user should be given back at least
      the sum of eMAC and pMAC counters queried separately.
      
      This is a generic problem, so add helpers in ethtool to do this
      operation, if the driver doesn't have a better way to report aggregate
      stats. Do this in a way that does not require changes to these functions
      when new stats are added (basically treat the structures as an array of
      u64 values, except for the first element which is the stats source).
      
      In include/linux/ethtool.h, there is already a section where helper
      function prototypes should be placed. The trouble is, this section is
      too early, before the definitions of struct ethtool_eth_mac_stats et al.
      Move that section to the end and append these new helpers to it.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
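The "array of u64 values after the stats source" idea can be sketched as follows. This is a user-space illustration under stated assumptions, not the kernel helpers: `struct demo_mac_stats`, its fields and the `demo_` functions are hypothetical, and the sketch assumes every member after the source is a 64-bit counter.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_stats_src { DEMO_SRC_AGGREGATE, DEMO_SRC_EMAC, DEMO_SRC_PMAC };

/* Hypothetical stats structure: the source first, then only u64 counters. */
struct demo_mac_stats {
	enum demo_stats_src src;
	uint64_t frames_tx_ok;
	uint64_t frames_rx_ok;
	uint64_t fcs_errors;
};

/* Sum every u64 found between 'offset' and 'size' in the two inputs.
 * New counters appended to the structure are picked up automatically.
 */
static void demo_aggregate_stats(void *aggr, const void *emac, const void *pmac,
				 size_t size, size_t offset)
{
	size_t num_stats = (size - offset) / sizeof(uint64_t);

	for (size_t i = 0; i < num_stats; i++) {
		size_t pos = offset + i * sizeof(uint64_t);
		uint64_t e, p, sum;

		memcpy(&e, (const char *)emac + pos, sizeof(e));
		memcpy(&p, (const char *)pmac + pos, sizeof(p));
		sum = e + p;
		memcpy((char *)aggr + pos, &sum, sizeof(sum));
	}
}

/* Type-specific wrapper: mark the result as aggregate, then sum the counters. */
static void demo_aggregate_mac_stats(struct demo_mac_stats *aggr,
				     const struct demo_mac_stats *emac,
				     const struct demo_mac_stats *pmac)
{
	aggr->src = DEMO_SRC_AGGREGATE;
	demo_aggregate_stats(aggr, emac, pmac, sizeof(*aggr),
			     offsetof(struct demo_mac_stats, frames_tx_ok));
}
```

A driver that can read the aggregate counters atomically from hardware would report those directly instead of summing two separate reads.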
    • docs: ethtool: document ETHTOOL_A_STATS_SRC and ETHTOOL_A_PAUSE_STATS_SRC · c319df10
      Vladimir Oltean authored
      Two new netlink attributes were added to PAUSE_GET and STATS_GET and
      their replies. Document them.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC) · 04692c90
      Vladimir Oltean authored
      IEEE 802.3-2018 clause 99 defines a MAC Merge sublayer which contains an
      Express MAC and a Preemptible MAC. Both MACs are hidden from higher and
      lower layers and visible as a single MAC (packet classification to eMAC
      or pMAC on TX is done based on priority; classification on RX is done
      based on SFD).
      
      For devices which support a MAC Merge sublayer, it is desirable to
      retrieve individual packet counters from the eMAC and the pMAC, as well
      as aggregate statistics (their sum).
      
      Introduce a new ETHTOOL_A_STATS_SRC attribute which is part of the
      policy of ETHTOOL_MSG_STATS_GET, and an ETHTOOL_A_PAUSE_STATS_SRC
      which is part of the policy of ETHTOOL_MSG_PAUSE_GET (accepted when
      ETHTOOL_FLAG_STATS is set in the common ethtool header). Both of these
      take values from enum ethtool_mac_stats_src, defaulting to "aggregate"
      in the absence of the attribute.
      
      This enum was added to all driver-facing structures, but existing
      drivers do not need to pay attention to it; only the ones which
      report the MAC merge layer as supported do.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
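The "default to aggregate when the attribute is absent" behaviour can be sketched like this. It is a user-space illustration only: the `demo_` enum and function are hypothetical stand-ins for enum ethtool_mac_stats_src and the netlink parsing, and a NULL pointer stands in for a missing attribute.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three statistics sources named above. */
enum demo_mac_stats_src {
	DEMO_MAC_STATS_SRC_AGGREGATE,
	DEMO_MAC_STATS_SRC_EMAC,
	DEMO_MAC_STATS_SRC_PMAC,
};

/* 'attr' is NULL when the request did not carry a source attribute. */
static int demo_parse_stats_src(const uint32_t *attr,
				enum demo_mac_stats_src *src)
{
	if (!attr) {
		/* Absent attribute: report the sum of eMAC and pMAC. */
		*src = DEMO_MAC_STATS_SRC_AGGREGATE;
		return 0;
	}

	if (*attr > DEMO_MAC_STATS_SRC_PMAC)
		return -EINVAL;

	*src = (enum demo_mac_stats_src)*attr;
	return 0;
}
```

Because the default matches what requests looked like before the attribute existed, user space that never sends it keeps getting a single set of counters.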
    • docs: ethtool-netlink: document interface for MAC Merge layer · 37000004
      Vladimir Oltean authored
      Show details about the structures passed back and forth related to MAC
      Merge layer configuration, state and statistics. The rendered htmldocs
      will be much more verbose due to the kerneldoc references.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: add support for MAC Merge layer · 2b30f829
      Vladimir Oltean authored
      The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2
      specifications (the other being Frame Preemption; IEEE 802.1Q-2018
      clause 6.7.2), which work together to minimize latency caused by frame
      interference at TX. The overall goal of TSN is for normal traffic and
      traffic with a bounded deadline to be able to cohabitate on the same L2
      network and not bother each other too much.
      
      The standards achieve this (partly) by introducing the concept of
      preemptible traffic, i.e. Ethernet frames that have a custom value for
      the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented
      and reassembled at L2 on a link-local basis. The non-preemptible frames
      are called express traffic; they are transmitted using a normal SFD, and
      they can preempt preemptible frames, therefore having lower latency,
      which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo
      frames around 9K). Preemption is not recursive, i.e. a P frame cannot
      preempt another P frame. Preemption also does not depend upon priority,
      or otherwise said, an E frame with prio 0 will still preempt a P frame
      with prio 7.
      
      In terms of implementation, the standards talk about the presence of an
      express MAC (eMAC) which handles express traffic, and a preemptible MAC
      (pMAC) which handles preemptible traffic, and these MACs are multiplexed
      on the same MII by a MAC merge layer.
      
      To support frame preemption, the definition of the SFD was generalized
      to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an
      Ethernet frame fragment, or a complete frame. Stations unaware of an SMD
      value different from the standard SFD will treat P frames as error
      frames. To prevent that from happening, a negotiation process is
      defined.
      
      On RX, packets are dispatched to the eMAC or pMAC after being filtered
      by their SMD. On TX, the eMAC/pMAC classification decision is taken by
      the 802.1Q spec, based on packet priority (each of the 8 user priority
      values may have an admin-status of preemptible or express).
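The TX classification just described, together with the preemption rule stated earlier (an E frame preempts a P frame regardless of priority, and a P frame never preempts another P frame), can be sketched as below. This is an illustration only: the `demo_` names and the choice of a bitmask to hold the per-priority admin-status are assumptions, not part of this patch, which deliberately leaves the 802.1Q side out.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_mac { DEMO_EMAC, DEMO_PMAC };

/* Bit N of 'preemptible_prios' set means priority N has an admin-status
 * of preemptible; a clear bit means express.
 */
static enum demo_mac demo_tx_classify(uint8_t preemptible_prios,
				      unsigned int prio)
{
	return (preemptible_prios & (1u << (prio & 7))) ? DEMO_PMAC : DEMO_EMAC;
}

/* Only an express frame can interrupt, and only a preemptible frame can
 * be interrupted. Priority plays no role in this decision.
 */
static bool demo_can_preempt(enum demo_mac in_flight, enum demo_mac incoming)
{
	return in_flight == DEMO_PMAC && incoming == DEMO_EMAC;
}
```

With priority 7 marked preemptible and priority 0 left express, a priority 0 frame still preempts a priority 7 frame in this sketch, which is the case the message above calls out.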
      
      The MAC Merge layer and the Frame Preemption parameters have some degree
      of independence in terms of how software stacks are supposed to deal
      with them. The activation of the MM layer is supposed to be controlled
      by an LLDP daemon (after it has been communicated that the link partner
      also supports it), after which a (hardware-based or not) verification
      handshake takes place, before actually enabling the feature. So the
      process is intended to be relatively plug-and-play, whereas FP settings
      are supposed to be coordinated across a network using something
      approximating NETCONF.
      
      The support contained here is exclusively for the 802.3 (MAC Merge)
      portions and not for the 802.1Q (Frame Preemption) parts. This API is
      sufficient for an LLDP daemon to do its job. The FP adminStatus variable
      from 802.1Q is outside the scope of an LLDP daemon.
      
      I have taken a few creative licenses and augmented the Linux kernel UAPI
      compared to the standard managed objects recommended by IEEE 802.3.
      These are:
      
      - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive
        Processing state diagram, a MAC Merge layer is always supposed to be
        able to receive P frames. However, this implies keeping the pMAC
        powered on, which will consume needless power in applications where FP
        will never be used. If LLDP is used, the reception of an Additional
        Ethernet Capabilities TLV from the link partner is sufficient
        indication that the pMAC should be enabled. So my proposal is that in
        Linux, we keep the pMAC turned off by default and that user space
        turns it on when needed.
      
      - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called
        aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in
        the boolean netlink attributes offered, so this is also positive here.
        Other than the meaning being reversed, they correspond to the same
        thing.
      
      - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for an LLDP
        daemon to maximize the verifyTime variable (delay between SMD-V
        transmissions), to maximize its chances that the LP replies. IEEE says
        that the verifyTime can range between 1 and 128 ms, but the NXP ENETC
        stupidly keeps this variable in a 7 bit register, so the maximum
        supported value is 127 ms. I could have chosen to hardcode this in the
        LLDP daemon to a lower value, but why not let the kernel expose its
        supported range directly.
      
      - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called
        aMACMergeAddFragSize, and expresses the "additional" fragment size
        (on top of ETH_ZLEN), whereas this expresses the absolute value of the
        fragment size.
      
      - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed
        object mandated by the standard, but user space clearly needs to know
        the minimum fragment size supported by our local receiver,
        since LLDP must advertise a value no lower than that.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
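Two of the UAPI choices listed above, the positive-logic verify flag and the device-reported maximum verify time, can be sketched as follows. This is a user-space illustration: `struct demo_mm_cfg` and the `demo_` functions are hypothetical and do not reflect the kernel's actual structures or error codes.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the MAC Merge configuration. */
struct demo_mm_cfg {
	bool pmac_enabled;    /* off by default, turned on by user space */
	bool verify_enabled;  /* positive logic, unlike the IEEE object */
	uint32_t verify_time; /* delay between SMD-V transmissions, in ms */
};

/* Accept a verify time of at least 1 ms and at most what the device
 * reports as its maximum (128 ms per IEEE, 127 ms on a 7-bit register).
 */
static int demo_mm_check_verify_time(const struct demo_mm_cfg *cfg,
				     uint32_t max_verify_time)
{
	if (cfg->verify_time < 1 || cfg->verify_time > max_verify_time)
		return -ERANGE;

	return 0;
}

/* Value of the IEEE aMACMergeVerifyDisableTx object: the inverse of the flag. */
static bool demo_mm_verify_disable_tx(const struct demo_mm_cfg *cfg)
{
	return !cfg->verify_enabled;
}
```

An LLDP daemon following the reasoning above would read the device's maximum and request exactly that value rather than hardcoding 128 ms.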
    • net/sock: Introduce trace_sk_data_ready() · 40e0b090
      Peilin Ye authored
      As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
      callback implementations.  For example:
      
      <...>
        iperf-609  [002] .....  70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
        iperf-609  [002] .....  70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
      <...>
      Suggested-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 21 Jan, 2023 17 commits
  3. 20 Jan, 2023 15 commits