1. 02 Jan, 2024 20 commits
  2. 01 Jan, 2024 17 commits
    • Merge branch 'phy-listing-link_topology-tracking' · 9fb3dc1e
      David S. Miller authored
      Maxime Chevallier says:
      
      ====================
      Introduce PHY listing and link_topology tracking
      
      Here's a V5 of the multi-PHY support series.
      
      At a glance, besides some minor fixes and R'd-by from Andrew, one of the
      things this series does is remove the ASSERT_RTNL() from the
      topo_add_phy/del_phy operations.
      
      These operations will take a PHY device and put it into the list of
      devices associated to a netdevice. The main thing to protect here is the
      list itself, but since we use xarrays, my naive understanding of it is
      that it contains its own protection scheme. There shouldn't be a need
      for more locking, as the insertion/deletion paths are already hooked
      into the PHY connection to a netdev, or disconnection from it.
      
      Now for the rest of the cover :
      
      As a reminder, this ongoing work aims ultimately at supporting complex
      link topologies that involve multiplexing multiple PHYs/SFPs on a single
      netdevice. As a first step, it's required that we are able to enumerate the
      PHYs on a given ethernet interface.
      
      By just doing so, we also improve already-existing use-cases, namely the
      copper SFP modules support when a media-converter is used (as we have 2
      PHYs on the link, but only one is referenced by net_device.phydev, which
      is used on a variety of netlink commands).
      
      The series is structured as follows :
      
      - The first patch adds the notion of phy_link_topology, which tracks
      all PHYs attached to a netdevice.
      
      - Patches 2, 3 and 4 add some plumbing into SFP and phylib to be able
        to connect the dots when building the topology tree, to know which PHY
        is connected to which SFP bus, trying not to be too invasive on phylib.
      
      - Patch 5 allows passing a PHY_INDEX to ethnl commands. I'm uncertain about
        this, as there are at least 4 netlink commands (5 with the one introduced
        in patch 7) that target PHYs directly or indirectly, which to me makes
        it worthwhile to have a generic way to pass a PHY index to commands. However,
        the approach taken may be too generic.
      
      - Patch 6 is the netlink spec update + ethtool-user.c|h autogenerated code
      update (the autogenerated code triggers checkpatch warning though)
      
      - Patch 7 introduces a new netlink command set to list PHYs on a netdevice.
      It implements a custom DUMP and GET operation to allow filtered dumps,
      that lists all PHYs on a given netdevice. I couldn't use most of ethnl's
      plumbing though.
      
      - Patch 8 is the netlink spec update + ethtool-user.c|h update for that
      new command
      
      - Patches 9, 10, 11 and 12 update the PLCA, strset, cable-test and pse netlink
      commands to use the user-provided PHY instead of net_device.phydev.
      
      - Finally, patch 13 adds some documentation for this whole work.
      
      Examples
      ========
      
      Here's a short overview of the kind of operations you can have regarding
      the PHY topology. These tests were performed on a MacchiatoBin, which
      has 4 interfaces :
      
      eth0 and eth1 have the following layout:
      
      MAC - PHY - SFP
      
      eth2 has this more classic topology :
      
      MAC - PHY - RJ45
      
      Finally, eth3 has the following topology :
      
      MAC - SFP
      
      When performing a dump with all interfaces down, we don't get any
      result, as no PHY has been attached to their respective net_device :
      
      None
      
      The following output is with eth0, eth2 and eth3 up, but no SFP module
      inserted in any of the interfaces :
      
      [{'downstream-sfp-name': 'sfp-eth0',
        'drvname': 'mv88x3310',
        'header': {'dev-index': 2, 'dev-name': 'eth0'},
        'id': 0,
        'index': 1,
        'name': 'f212a600.mdio-mii:00',
        'upstream-type': 'mac'},
       {'drvname': 'Marvell 88E1510',
        'header': {'dev-index': 4, 'dev-name': 'eth2'},
        'id': 21040593,
        'index': 1,
        'name': 'f212a200.mdio-mii:00',
        'upstream-type': 'mac'}]
      
      And here is a dump operation with a copper SFP in the eth0 port :
      
      [{'downstream-sfp-name': 'sfp-eth0',
        'drvname': 'mv88x3310',
        'header': {'dev-index': 2, 'dev-name': 'eth0'},
        'id': 0,
        'index': 1,
        'name': 'f212a600.mdio-mii:00',
        'upstream-type': 'mac'},
       {'drvname': 'Marvell 88E1111',
        'header': {'dev-index': 2, 'dev-name': 'eth0'},
        'id': 21040322,
        'index': 2,
        'name': 'i2c:sfp-eth0:16',
        'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
        'upstream-type': 'phy'},
       {'drvname': 'Marvell 88E1510',
        'header': {'dev-index': 4, 'dev-name': 'eth2'},
        'id': 21040593,
        'index': 1,
        'name': 'f212a200.mdio-mii:00',
        'upstream-type': 'mac'}]
      
       -- Note that this shouldn't actually work as the 88x3310 PHY doesn't allow
      a 1G SFP to be connected to its SFP interface, and I don't have a 10G copper SFP,
      so for the sake of the demo I applied the following modification, which
      of course gives a non-functional link, but the PHY attach still works,
      which is what I want to demonstrate :
      
      @@ -488,7 +488,7 @@ static int mv3310_sfp_insert(void *upstream, const struct sfp_eeprom_id *id)
      
              if (iface != PHY_INTERFACE_MODE_10GBASER) {
                      dev_err(&phydev->mdio.dev, "incompatible SFP module inserted\n");
      -               return -EINVAL;
      +               //return -EINVAL;
              }
              return 0;
       }
      
      Finally an example of the filtered DUMP operation that Jakub suggested
      in V1 :
      
      [{'downstream-sfp-name': 'sfp-eth0',
        'drvname': 'mv88x3310',
        'header': {'dev-index': 2, 'dev-name': 'eth0'},
        'id': 0,
        'index': 1,
        'name': 'f212a600.mdio-mii:00',
        'upstream-type': 'mac'},
       {'drvname': 'Marvell 88E1111',
        'header': {'dev-index': 2, 'dev-name': 'eth0'},
        'id': 21040322,
        'index': 2,
        'name': 'i2c:sfp-eth0:16',
        'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
        'upstream-type': 'phy'}]
      
      And a classic GET operation allows querying a single PHY's info :
      
      {'drvname': 'Marvell 88E1111',
       'header': {'dev-index': 2, 'dev-name': 'eth0'},
       'id': 21040322,
       'index': 2,
       'name': 'i2c:sfp-eth0:16',
       'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
       'upstream-type': 'phy'}
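
As an illustration that is not part of the series, the dumps above carry enough information to rebuild each interface's PHY chain in userspace. The following is a minimal Python sketch: the helper name `phy_chains` is hypothetical, the input is the two eth0 entries copied from the filtered dump, and it assumes that an entry whose 'upstream-type' is 'phy' always carries an 'upstream' attribute with the parent's index, as it does in the examples.

```python
from collections import defaultdict

# The two eth0 entries, copied from the filtered dump above.
dump = [
    {'downstream-sfp-name': 'sfp-eth0',
     'drvname': 'mv88x3310',
     'header': {'dev-index': 2, 'dev-name': 'eth0'},
     'id': 0,
     'index': 1,
     'name': 'f212a600.mdio-mii:00',
     'upstream-type': 'mac'},
    {'drvname': 'Marvell 88E1111',
     'header': {'dev-index': 2, 'dev-name': 'eth0'},
     'id': 21040322,
     'index': 2,
     'name': 'i2c:sfp-eth0:16',
     'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
     'upstream-type': 'phy'},
]


def phy_chains(entries):
    """Return {dev-name: [driver names ordered from the MAC outwards]}."""
    # PHY indices are only unique per netdevice, so group by device first.
    per_dev = defaultdict(dict)
    for entry in entries:
        per_dev[entry['header']['dev-name']][entry['index']] = entry

    chains = {}
    for dev, phys in per_dev.items():
        # Map parent index -> child entries; None stands for the MAC.
        children = defaultdict(list)
        for entry in phys.values():
            if entry['upstream-type'] == 'phy':
                parent = entry['upstream']['index']
            else:
                parent = None
            children[parent].append(entry)

        # Walk breadth-first, starting from the PHYs attached to the MAC.
        chain, todo = [], list(children[None])
        while todo:
            entry = todo.pop(0)
            chain.append(entry['drvname'])
            todo.extend(children[entry['index']])
        chains[dev] = chain
    return chains


chains = phy_chains(dump)
print(chains)  # {'eth0': ['mv88x3310', 'Marvell 88E1111']}
```

Grouping by 'dev-name' before following 'upstream' links matters because, as the unfiltered dump shows, eth0 and eth2 both have a PHY with index 1.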
      
      Changes in V5:
      - Removed the RTNL assertion in the topology ops
      - Made the phy_topo_get_phy inline
      - Fixed the PSE-PD multi-PHY support by re-adding a wrongly dropped
        check
      - Fixed some typos in the documentation
      - Fixed reverse xmas trees
      
      Changes in V4:
      - Dropped the RFC flag
      - Made the net_device integration independent of having phylib enabled
      - Removed the autogenerated ethtool-user code for the YNL specs
      
      Changes in V3:
      - Added RTNL assertions where needed
      - Fixed issues in the DUMP code for PHY_GET, which crashed when running it
        twice in a row
      - Added the documentation, and moved in-source docs around
      - Renamed link_topology to phy_link_topology
      
      Changes in V2:
      - Added the DUMP operation
      - Added much more information in the reported data, to be able to reconstruct
        precisely the topology tree
      - Renamed phy_list to link_topology
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: document phy_link_topology · 32bb4515
      Maxime Chevallier authored
      The newly introduced phy_link_topology tracks all ethernet PHYs that are
      attached to a netdevice. Document the base principle, internal and
      external APIs. As the phy_link_topology is expected to be extended, this
      documentation will hold any further improvements and additions made
      relative to topology handling.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: strset: Allow querying phy stats by index · d078d480
      Maxime Chevallier authored
      The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer
      from the ethnl request to allow querying PHY stats from each PHY on the
      link.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: cable-test: Target the command to the requested PHY · fcc4b105
      Maxime Chevallier authored
      Cable testing is a PHY-specific command. Instead of targeting the command
      towards dev->phydev, use the request to pick the targeted PHY.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: pse-pd: Target the command to the requested PHY · 345237db
      Maxime Chevallier authored
      PSE and PD configuration is a PHY-specific command. Instead of targeting
      the command towards dev->phydev, use the request to pick the targeted
      PHY device.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: plca: Target the command to the requested PHY · 7db69ec9
      Maxime Chevallier authored
      PLCA is a PHY-specific command. Instead of targeting the command
      towards dev->phydev, use the request to pick the targeted PHY.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: specs: add ethnl PHY_GET command set · 95132a01
      Maxime Chevallier authored
      The PHY_GET command, supporting both DUMP and GET operations, is used to
      retrieve the list of PHYs connected to a netdevice, and get topology
      information to know where exactly it sits on the physical link.
      
      Add the netlink specs corresponding to that command.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: Introduce a command to list PHYs on an interface · 63d5eaf3
      Maxime Chevallier authored
      As we have the ability to track the PHYs connected to a net_device
      through the link_topology, we can expose this list to userspace. This
      allows userspace to use these identifiers for phy-specific commands and
      take the decision of which PHY to target by knowing the link topology.
      
      Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
      devices on only one interface.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: specs: add phy-index as a header parameter · c29451ae
      Maxime Chevallier authored
      Update the spec to take the newly introduced phy-index as a generic
      request parameter.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: Allow passing a phy index for some commands · 2ab0edb5
      Maxime Chevallier authored
      Some netlink commands are targeted towards ethernet PHYs, to control some
      of their features. As there are several such commands, add the ability to
      pass a PHY index in the ethnl request, which will populate the generic
      ethnl_req_info with the relevant phydev when the command targets a PHY.
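
The lookup this describes can be modelled in a few lines of Python. This is only a sketch of the idea, not the kernel code: `NetDevice` and `resolve_phy` are hypothetical names, and the fallback to the netdevice's default PHY when no index is supplied is an assumption made for the illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class NetDevice:
    """Hypothetical userspace model of a netdevice, not the kernel struct."""
    name: str
    phydev: Optional[str] = None                        # PHY closest to the MAC
    phys: Dict[int, str] = field(default_factory=dict)  # PHY index -> PHY name


def resolve_phy(dev: NetDevice, phy_index: Optional[int] = None) -> Optional[str]:
    """Pick the PHY a PHY-targeting command should act on."""
    if phy_index is None:
        # Assumption: without an explicit index, use the default PHY.
        return dev.phydev
    try:
        return dev.phys[phy_index]
    except KeyError:
        raise LookupError(f"{dev.name}: no PHY with index {phy_index}") from None


eth0 = NetDevice(
    name="eth0",
    phydev="f212a600.mdio-mii:00",
    phys={1: "f212a600.mdio-mii:00", 2: "i2c:sfp-eth0:16"},
)
print(resolve_phy(eth0))     # f212a600.mdio-mii:00
print(resolve_phy(eth0, 2))  # i2c:sfp-eth0:16
```

Resolving the PHY once, while parsing the request header, lets every PHY-targeting command share the same lookup instead of each one dereferencing the netdevice's default PHY itself.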
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sfp: Add helper to return the SFP bus name · dedd702a
      Maxime Chevallier authored
      Knowing the bus name is helpful when we want to expose the link topology
      to userspace. Add a helper to return the SFP bus name.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: add helpers to handle sfp phy connect/disconnect · 034fcc21
      Maxime Chevallier authored
      There are a few PHY drivers that can handle SFP modules through their
      sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
      SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
      upstream PHY's netdev's namespace.
      
      By doing so, these SFP PHYs can be enumerated and exposed to users,
      which will be able to use their capabilities.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sfp: pass the phy_device when disconnecting an sfp module's PHY · 9c5625f5
      Maxime Chevallier authored
      Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
      operation. This is preparatory work to help track phy devices across
      a net_device's link.
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Introduce ethernet link topology representation · 02018c54
      Maxime Chevallier authored
      Link topologies containing multiple network PHYs attached to the same
      net_device can be found when using a PHY as a media converter for use
      with an SFP connector, on which an SFP transceiver containing a PHY can
      be used.
      
      With the current model, the transceiver's PHY can't be used for
      operations such as cable testing, timestamping, macsec offload, etc.
      
      The reason is that most of the logic for these configurations, coming
      from either ethtool netlink or ioctls, tends to use netdev->phydev, which
      in multi-phy systems will reference the PHY closest to the MAC.
      
      Introduce a numbering scheme that allows enumerating PHY devices that
      belong to any netdev, which can in turn allow userspace to take more
      precise decisions with regard to each PHY's configuration.
      
      The numbering is maintained per-netdev, in a phy_device_list.
      The numbering works similarly to a netdevice's ifindex, with
      identifiers that are only recycled once INT_MAX has been reached.
      
      This prevents races that could occur between PHY listing and SFP
      transceiver removal/insertion.
      
      The identifiers are assigned at phy_attach time, as the numbering
      depends on the netdevice the phy is attached to.
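
The ifindex-like behaviour described above can be sketched with a toy allocator. This is a Python model under stated assumptions, not the kernel implementation: the class and method names are hypothetical, and starting the numbering at 1 is inferred from the example dumps in the cover letter.

```python
class PhyIndexAllocator:
    """Toy model of a per-netdev cyclic PHY index allocator (hypothetical)."""

    def __init__(self, max_index=2**31 - 1):
        self.max_index = max_index  # stand-in for INT_MAX
        self.next_index = 1         # assumption: numbering starts at 1
        self.in_use = {}            # index -> PHY name

    def attach(self, phy_name):
        # Search upwards from the last position, wrapping around at most once.
        for _ in range(self.max_index):
            candidate = self.next_index
            if candidate < self.max_index:
                self.next_index = candidate + 1
            else:
                self.next_index = 1
            if candidate not in self.in_use:
                self.in_use[candidate] = phy_name
                return candidate
        raise RuntimeError("no free PHY index")

    def detach(self, index):
        del self.in_use[index]


topo = PhyIndexAllocator()
first = topo.attach("f212a600.mdio-mii:00")  # 1
sfp = topo.attach("i2c:sfp-eth0:16")         # 2
topo.detach(sfp)                             # SFP module removed
again = topo.attach("i2c:sfp-eth0:16")       # 3: freed index 2 is not reused yet
print(first, sfp, again)                     # 1 2 3
```

Because a freed index is not handed out again until the counter wraps, an index obtained from an earlier listing cannot silently end up pointing at a different, newly inserted module.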
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'nf-next-23-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next · 109bf4cf
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      netfilter pull request 23-12-22
      
      The following patchset contains Netfilter updates for net-next:
      
      1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a
         race scenario with two concurrent processes running a dump-and-reset
         which exposes negative counters to userspace, from Phil Sutter.
      
      2) Use GFP_KERNEL in pipapo GC, from Florian Westphal.
      
      3) Reorder nf_flowtable struct members, place the read-mostly parts
         accessed by the datapath first. From Florian Westphal.
      
      4) Set on dead flag for NFT_MSG_NEWSET in abort path,
         from Florian Westphal.
      
      5) Support filtering zone in ctnetlink, from Felix Huettner.
      
      6) Bail out if user tries to redefine an existing chain with different
         type in nf_tables.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 240436c0
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      bpf-next-for-netdev
      The following pull-request contains BPF updates for your *net-next* tree.
      
      We've added 22 non-merge commits during the last 3 day(s) which contain
      a total of 23 files changed, 652 insertions(+), 431 deletions(-).
      
      The main changes are:
      
      1) Add verifier support for annotating user's global BPF subprogram arguments
         with a few commonly requested annotations for a better developer experience,
         from Andrii Nakryiko.
      
         These tags are:
           - Ability to annotate a special PTR_TO_CTX argument
           - Ability to annotate a generic PTR_TO_MEM as non-NULL
      
      2) Support BPF verifier tracking of BPF_JNE which helps cases when the compiler
         transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like, from
         Menglong Dong.
      
      3) Fix a warning in bpf_mem_cache's check_obj_size() as reported by LKP, from Hou Tao.
      
      4) Re-support uid/gid options when mounting bpffs which had to be reverted with
         the prior token series revert to avoid conflicts, from Daniel Borkmann.
      
      5) Fix a libbpf NULL pointer dereference in bpf_object__collect_prog_relos() found
         from fuzzing the library with malformed ELF files, from Mingyi Zhang.
      
      6) Skip DWARF sections in libbpf's linker sanity check given compiler options to
         generate compressed debug sections can trigger a rejection due to misalignment,
         from Alyssa Ross.
      
      7) Fix an unnecessary use of the comma operator in BPF verifier, from Simon Horman.
      
      8) Fix format specifier for unsigned long values in cpustat sample, from Colin Ian King.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mdio: get/put device node during (un)registration · cff9c565
      Luiz Angelo Daros de Luca authored
      The __of_mdiobus_register() function was storing the device node in
      dev.of_node without increasing its reference count. It implicitly relied
      on the caller to maintain the allocated node until the mdiobus was
      unregistered.
      
      Now, __of_mdiobus_register() will acquire the node before assigning it,
      and of_mdiobus_unregister_callback() will be called at the end of
      mdio_unregister().
      
      Drivers can now release the node immediately after MDIO registration.
      Some of them are already doing that even before this patch.
      Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 29 Dec, 2023 3 commits
    • Merge tag 'mlx5-updates-2023-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 92de776d
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2023-12-20
      
      mlx5 Socket direct support and management PF profile.
      
      Tariq Says:
      ===========
      Support Socket-Direct multi-dev netdev
      
      This series adds support for combining multiple devices (PFs) of the
      same port under one netdev instance. Passing traffic through different
      devices belonging to different NUMA sockets saves cross-numa traffic and
      allows apps running on the same netdev from different numas to still
      feel a sense of proximity to the device and achieve improved
      performance.
      
      We achieve this by grouping PFs together, and creating the netdev only
      once all group members are probed. Symmetrically, we destroy the netdev
      once any of the PFs is removed.
      
      The channels are distributed between all devices; a proper configuration
      would utilize the closest NUMA node when working on a certain app/cpu.
      
      We pick one device to be a primary (leader), and it fills a special
      role.  The other devices (secondaries) are disconnected from the network
      at the chip level (set to silent mode). All RX/TX traffic is steered
      through the primary to/from the secondaries.
      
      Currently, we limit the support to PFs only, and up to two devices
      (sockets).
      
      ===========
      
      Armen Says:
      ===========
      Management PF support and module integration
      
      This patch rolls out comprehensive support for the Management Physical
      Function (MGMT PF) within the mlx5 driver. It involves updating the
      mlx5 interface header to introduce necessary definitions for MGMT PF
      and adding a new management PF netdev profile, which will allow the host
      side to communicate with the embedded Linux on BlueField devices.
      
      ===========
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: Use internal flags for multicast groups · cd4d7263
      Ido Schimmel authored
      As explained in commit e0378187 ("drop_monitor: Require
      'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
      multicast group structure reuses uAPI flags despite the field not being
      exposed to user space. This makes it impossible to extend its use
      without adding new uAPI flags, which is inappropriate for internal
      kernel checks.
      
      Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert
      the existing users to use them instead of the uAPI flags.
      
      Tested using the reproducers in commit 44ec98ea ("psample: Require
      'CAP_NET_ADMIN' when joining "packets" group") and commit e0378187
      ("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group").
      
      No functional changes intended.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • iucv: make iucv_bus const · f732ba4a
      Greg Kroah-Hartman authored
      Now that the driver core can properly handle constant struct bus_type,
      move the iucv_bus variable to be a constant structure as well, placing
      it into read-only memory which cannot be modified at runtime.
      
      Cc: Wenjia Zhang <wenjia@linux.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: linux-s390@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Acked-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>