1. 19 Oct, 2022 20 commits
    • selftests: bridge_vlan_mcast: Delete qdiscs during cleanup · 6fb1faa1
      Ido Schimmel authored
      The qdiscs are added during setup, but not deleted during cleanup,
      resulting in the following error messages:
      
       # ./bridge_vlan_mcast.sh
       [...]
       # ./bridge_vlan_mcast.sh
       Error: Exclusivity flag on, cannot modify.
       Error: Exclusivity flag on, cannot modify.
      
      Solve by deleting the qdiscs during cleanup.
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6fb1faa1
    • Merge branch 'dpaa-phylink' · 5cacb2c7
      David S. Miller authored
      Sean Anderson says:
      
      ====================
      net: dpaa: Convert to phylink
      
      This series converts the DPAA driver to phylink.
      
      I have tried to maintain backwards compatibility with existing device
      trees wherever possible. However, one area where I was unable to
      achieve this was with QSGMII. Please refer to patch 2 for details.
      
      All mac drivers have now been converted. I would greatly appreciate if
      anyone has T-series or P-series boards they can test/debug this series
      on. I only have an LS1046ARDB. Everything but QSGMII should work without
      breakage; QSGMII needs patches 7 and 8. For this reason, the last 4
      patches in this series should be applied together (and should not go
      through separate trees).
      
      Changes in v7:
      - provide phylink_validate_mask_caps() helper
      - Fix oops if memac_pcs_create returned -EPROBE_DEFER
      - Fix using pcs-names instead of pcs-handle-names
      - Fix not checking for -ENODATA when looking for sgmii pcs
      - Fix 81-character line
      - Simplify memac_validate with phylink_validate_mask_caps
      
      Changes in v6:
      - Remove unnecessary $ref from renesas,rzn1-a5psw
      - Remove unnecessary type from pcs-handle-names
      - Add maxItems to pcs-handle
      - Fix 81-character line
      - Fix uninitialized variable in dtsec_mac_config
      
      Changes in v5:
      - Add Lynx PCS binding
      
      Changes in v4:
      - Use pcs-handle-names instead of pcs-names, as discussed
      - Don't fail if phy support was not compiled in
      - Split off rate adaptation series
      - Split off DPAA "preparation" series
      - Split off Lynx 10G support
      - t208x: Mark MAC1 and MAC2 as 10G
      - Add XFI PCS for t208x MAC1/MAC2
      
      Changes in v3:
      - Expand pcs-handle to an array
      - Add vendor prefix 'fsl,' to rgmii and mii properties.
      - Set maxItems for pcs-names
      - Remove phy-* properties from example because dt-schema complains and I
        can't be bothered to figure out how to make it work.
      - Add pcs-handle as a preferred version of pcsphy-handle
      - Deprecate pcsphy-handle
      - Remove mii/rmii properties
      - Put the PCS mdiodev only after we are done with it (since the PCS
        does not perform a get itself).
      - Remove _return label from memac_initialization in favor of returning
        directly
      - Fix grabbing the default PCS not checking for -ENODATA from
        of_property_match_string
      - Set DTSEC_ECNTRL_R100M in dtsec_link_up instead of dtsec_mac_config
      - Remove rmii/mii properties
      - Replace 1000Base... with 1000BASE... to match IEEE capitalization
      - Add compatibles for QSGMII PCSs
      - Split arm and powerpc dts updates
      
      Changes in v2:
      - Better document how we select which PCS to use in the default case
      - Move PCS_LYNX dependency to fman Kconfig
      - Remove unused variable slow_10g_if
      - Restrict valid link modes based on the phy interface. This is easier
        to set up, and mostly captures what I intended to do the first time.
        We now have a custom validate which restricts half-duplex for some SoCs
        for RGMII, but generally just uses the default phylink validate.
      - Configure the SerDes in enable/disable
      - Properly implement all ethtool ops and ioctls. These were mostly
        stubbed out just enough to compile last time.
      - Convert 10GEC and dTSEC as well
      - Fix capitalization of mEMAC in commit messages
      - Add nodes for QSGMII PCSs
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5cacb2c7
    • arm64: dts: layerscape: Add nodes for QSGMII PCSs · 4e748b1b
      Sean Anderson authored
      Now that we actually read registers from QSGMII PCSs, it's important
      that we have the correct address (instead of hoping that we're the MAC
      with all the QSGMII PCSs on its bus). This adds nodes for the QSGMII
      PCSs.  The exact mapping of QSGMII to MACs depends on the SoC.
      
      Since the first QSGMII PCSs share an address with the SGMII and XFI
      PCSs, we only add new nodes for PCSs 2-4. This avoids address conflicts
      on the bus.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e748b1b
    • powerpc: dts: qoriq: Add nodes for QSGMII PCSs · 4e31b808
      Sean Anderson authored
      Now that we actually read registers from QSGMII PCSs, it's important
      that we have the correct address (instead of hoping that we're the MAC
      with all the QSGMII PCSs on its bus). This adds nodes for the QSGMII
      PCSs. They have the same addresses on all SoCs (e.g. if QSGMIIA is
      present it's used for MACs 1 through 4).
      
      Since the first QSGMII PCSs share an address with the SGMII and XFI
      PCSs, we only add new nodes for PCSs 2-4. This avoids address conflicts
      on the bus.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e31b808
    • powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G · 36926a7d
      Sean Anderson authored
      On the T208X SoCs, MAC1 and MAC2 support XGMII. Add some new MAC dtsi
      fragments, and mark the QMAN ports as 10G.
      
      Fixes: da414bb9 ("powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)")
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      36926a7d
    • net: dpaa: Convert to phylink · 5d93cfcf
      Sean Anderson authored
      This converts DPAA to phylink. All macs are converted. This should work
      with no device tree modifications (including those made in this series),
      except for QSGMII (as noted previously).
      
      The mEMAC configuration is one of the trickier areas. I have tried to
      capture all the restrictions across the various models. Most of the time,
      we assume that if the serdes supports a mode or the phy-interface-mode
      specifies it, then we support it. The only place we can't do this is
      (RG)MII, since there's no serdes. In that case, we rely on a (new)
      devicetree property. There are also several cases where half-duplex is
      broken. Unfortunately, only a single compatible is used for the MAC, so we
      have to use the board compatible instead.
      
      The 10GEC conversion is very straightforward, since it only supports XAUI.
      There is generally nothing to configure.
      
      The dTSEC conversion is broadly similar to mEMAC, but is simpler because we
      don't support configuring the SerDes (though this can be easily added) and
      we don't have multiple PCSs. From what I can tell, there's nothing
      different in the driver or documentation between SGMII and 1000BASE-X
      except for the advertising. Similarly, I couldn't find anything about
      2500BASE-X. In both cases, I treat them like SGMII. These modes aren't used
      by any in-tree boards. Similarly, despite being mentioned in the driver, I
      couldn't find any documented SoCs which supported QSGMII.  I have left it
      unimplemented for now.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d93cfcf
    • net: fman: memac: Use lynx pcs driver · a7c2a32e
      Sean Anderson authored
      Although not stated in the datasheet, as far as I can tell the PCS for mEMACs
      is a "Lynx." By reusing the existing driver, we can remove the PCS
      management code from the memac driver. This requires calling some PCS
      functions manually which phylink would usually do for us, but we will let
      it do that soon.
      
      One problem is that we don't actually have a PCS for QSGMII. We pretend
      that each mEMAC's MDIO bus has four QSGMII PCSs, but this is not the case.
      Only the "base" mEMAC's MDIO bus has the four QSGMII PCSs. This is not an
      issue yet, because we never get the PCS state. However, it will be once the
      conversion to phylink is complete, since the links will appear to never
      come up. To get around this, we allow specifying multiple PCSs in pcsphy.
      This breaks backwards compatibility with old device trees, but only for
      QSGMII. IMO this is the only reasonable way to figure out what the actual
      QSGMII PCS is.
      
      Additionally, we now also support a separate XFI PCS. This can allow the
      SerDes driver to set different addresses for the SGMII and XFI PCSs so they
      can be accessed at the same time.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a7c2a32e
    • net: fman: memac: Add serdes support · 0fc83bd7
      Sean Anderson authored
      This adds support for using a serdes which has to be configured. This is
      primarily in preparation for phylink conversion, which will then change the
      serdes mode dynamically.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fc83bd7
    • net: phylink: provide phylink_validate_mask_caps() helper · f392a184
      Russell King (Oracle) authored
      Provide a helper that restricts the link modes according to the
      phylink capabilities.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      [rebased on net-next/master and added documentation]
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f392a184
    • dt-bindings: net: fman: Add additional interface properties · 045d0501
      Sean Anderson authored
      At the moment, mEMACs are configured almost completely based on the
      phy-connection-type. That is, if the phy interface is RGMII, it is assumed
      that RGMII is supported. For some interfaces, it is assumed that the
      RCW/bootloader has set up the SerDes properly. This is generally OK, but
      restricts runtime reconfiguration. The actual link state is never
      reported.
      
      To address these shortcomings, the driver will need additional
      information. First, it needs to know how to access the PCS/PMAs (in
      order to configure them and get the link status). The SGMII PCS/PMA is
      the only currently-described PCS/PMA. Add the XFI and QSGMII PCS/PMAs as
      well. The XFI (and 10GBASE-KR) PCS/PMA is a c45 "phy" which sits on the
      same MDIO bus as SGMII PCS/PMA. By default they will have conflicting
      addresses, but they are also not enabled at the same time by default.
      Therefore, we can let the XFI PCS/PMA be the default when
      phy-connection-type is xgmii. This will allow for
      backwards-compatibility.
      
      QSGMII, however, cannot work with the current binding. This is because
      the QSGMII PCS/PMAs are only present on one MAC's MDIO bus. At the
      moment this is worked around by having every MAC write to the PCS/PMA
      addresses (without checking if they are present). This only works if
      each MAC has the same configuration, and only if we don't need to know
      the status. Because the QSGMII PCS/PMA will typically be located on a
      different MDIO bus than the MAC's SGMII PCS/PMA, there is no fallback
      for the QSGMII PCS/PMA.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      045d0501
    • dt-bindings: net: Add Lynx PCS binding · 00af103d
      Sean Anderson authored
      This binding is fairly bare-bones for now, since the Lynx driver doesn't
      parse any properties (or match based on the compatible). We just need it
      in order to prevent the PCS nodes from having phy devices attached to
      them. This is not really a problem, but it is a bit inefficient.
      
      This binding is really for three separate PCSs (SGMII, QSGMII, and XFI).
      However, the driver treats all of them the same. This works because the
      SGMII and XFI devices typically use the same address, and the SerDes
      driver (or RCW) muxes between them. The QSGMII PCSs have the same
      register layout as the SGMII PCSs. To do things properly, we'd probably
      do something like
      
      	ethernet-pcs@0 {
      		#pcs-cells = <1>;
      		compatible = "fsl,lynx-pcs";
      		reg = <0>, <1>, <2>, <3>;
      	};
      
      but that would add complexity, and we can describe the hardware just
      fine using separate PCSs for now.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00af103d
    • dt-bindings: net: Expand pcs-handle to an array · 76025ee5
      Sean Anderson authored
      This allows multiple phandles to be specified for pcs-handle, such as
      when multiple PCSs are present for a single MAC. To differentiate
      between them, also add a pcs-handle-names property.
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76025ee5
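As a rough illustration of how a consumer might use pcs-handle-names to pick an entry out of the pcs-handle array, the following userspace sketch maps a name to an index. It only models the idea: match_name() is a hypothetical helper, the names "sgmii", "qsgmii" and "xfi" are assumed for the example, and the -ENODATA return for a missing name mirrors the error the cover letter above says must be checked after of_property_match_string.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of a name lookup in a list such as
 * pcs-handle-names. Returns the index of the matching name, which would
 * also be the index into the pcs-handle array, or -ENODATA if the name
 * is not listed.
 */
static int match_name(const char *const *names, size_t n, const char *want)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(names[i], want) == 0)
			return (int)i;
	}
	return -ENODATA;
}
```

A caller that wants a default PCS has to treat -ENODATA as "not named" rather than as a hard failure; what the default should be is a driver policy that this sketch does not try to capture.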
    • Merge branch 'net-marvell-yaml' · 88a2b3cb
      David S. Miller authored
      Michał Grzelak says:
      
      ====================
      net: further improvements to marvell,pp2.yaml
      
      This patchset addresses problems with reg ranges and
      additional $refs. It also limits phy-mode and aligns examples.
      
      Best regards,
      Michał
      
      ---
      Changelog:
      v4->v5
      - drop '+' from all patternProperties
      - restrict range of patternProperties to [0-2] in top level
      - drop the $ref in patternProperties:'^...':properties:reg
      - add patternProperties:'^...':properties:reg:maximum:2
      - drop $ref in patternProperties:'^...':properties:phys
      - add patternProperties:'^...':properties:phys:maxItems:1
      - limit phy-mode to the subset found in dts files
      - reflect the order of subnodes' properties in subnodes' required:
      - restrict range of pattern to [0-2] in marvell,armada-7k-pp22 case
      - restrict range of pattern to [0-1] in marvell,armada-375-pp2 case
      - align to 4 spaces all examples:
      - add specified maximum to allOf:if:then-else:properties:reg
      
      v3->v4
      - change commit message of first patch
      - move allOf:$ref to patternProperties:'^...':$ref
      - deprecate port-id in favour of reg
      - move reg to front of properties list in patternProperties
      - reflect the order of properties in required list in
        patternProperties
      - add unevaluatedProperties: false to patternProperties
      - change unevaluated- to additionalProperties at top level
      - add property phys: to ports subnode
      - extend example binding with additional information about phys and sfp
      - hook phys property to phy-consumer.yaml schema
      
      v2->v3
      - move 'reg:description' to 'allOf:if:then'
      - change '#size-cells: true' and '#address-cells: true'
        to '#size-cells: const: 0' and '#address-cells: const: 1'
      - replace all occurrences of pattern "^eth\{hex_num}*"
        with "^(ethernet-)?port@[0-9]+$"
      - add description in 'patternProperties:^...'
      - add 'patternProperties:^...:interrupt-names:minItems: 1'
      - add 'patternProperties:^...:reg:description'
      - update 'patternProperties:^...:port-id:description'
      - add 'patternProperties:^...:required: - reg'
      - update '*:description:' to uppercase
      - add 'allOf:then:required:marvell,system-controller'
      - skip quotation marks from 'allOf:$ref'
      - add 'else' schema to match 'allOf:if:then'
      - restrict 'clocks' in 'allOf:if:then'
      - restrict 'clock-names' in 'allOf:if:then'
      - add #address-cells=<1>; #size-cells=<0>; in 'examples:'
      - change every "ethX" to "ethernet-port@X" in 'examples:'
      - add "reg" and comment in all ports in 'examples:'
      - change /ethernet/eth0/phy-mode in examples://Armada-375
        to "rgmii-id"
      - replace each cpm_ with cp0_ in 'examples:'
      - replace each _syscon0 with _clk0 in 'examples:'
      - remove each eth0X label in 'examples:'
      - update armada-375.dtsi and armada-cp11x.dtsi to match
        marvell,pp2.yaml
      
      v1->v2
      - move 'properties' to the front of the file
      - remove blank line after 'properties'
      - move 'compatible' to the front of 'properties'
      - move 'clocks', 'clock-names' and 'reg' definitions to 'properties'
      - substitute all occurrences of 'marvell,armada-7k-pp2' with
        'marvell,armada-7k-pp22'
      - add properties:#size-cells and properties:#address-cells
      - specify list in 'interrupt-names'
      - remove blank lines after 'patternProperties'
      - remove '^interrupt' and '^#.*-cells$' patterns
      - remove blank line after 'allOf'
      - remove first 'if-then-else' block from 'allOf'
      - negate the condition in allOf:if schema
      - delete 'interrupt-controller' from section 'examples'
      - delete '#interrupt-cells' from section 'examples'
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88a2b3cb
    • ARM: dts: armada-375: Update network description to match schema · 844e4498
      Marcin Wojtas authored
      Update the PP2 ethernet ports subnodes' names to match
      schema enforced by the marvell,pp2.yaml contents.
      
      Add new required properties ('reg') which contains information
      about the port ID, keeping 'port-id' ones for backward
      compatibility.
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      844e4498
    • arm64: dts: marvell: Update network description to match schema · 2994bf77
      Marcin Wojtas authored
      Update the PP2 ethernet ports subnodes' names to match
      schema enforced by the marvell,pp2.yaml contents.
      
      Add new required properties ('reg') which contains information
      about the port ID, keeping 'port-id' ones for backward
      compatibility.
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2994bf77
    • dt-bindings: net: marvell,pp2: convert to json-schema · c4d175c3
      Michał Grzelak authored
      Convert the marvell,pp2 bindings from text to proper schema.
      
      Move 'marvell,system-controller' and 'dma-coherent' properties from
      port up to the controller node, to match what is actually done in DT.
      
      Rename all subnodes to match "^(ethernet-)?port@[0-2]$" and deprecate
      port-id in favour of 'reg'.
      Signed-off-by: Michał Grzelak <mig@semihalf.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4d175c3
    • enic: define constants for legacy interrupts offset · e2ac2a00
      Govindarajulu Varadarajan authored
      Use macros instead of function calls. These values are constant and will
      not change.
      Signed-off-by: Govindarajulu Varadarajan <govind.varadar@gmail.com>
      Link: https://lore.kernel.org/r/20221018005804.188643-1-govind.varadar@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e2ac2a00
    • net: fec: remove the unused functions · f3d27ae0
      Shenwei Wang authored
      Remove the functions that became unused when the driver was simplified
      to manage RX buffers with the page pool.
      Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
      Link: https://lore.kernel.org/r/20221017161236.1563975-1-shenwei.wang@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f3d27ae0
    • net: remove smc911x driver · a2fd0844
      Arnd Bergmann authored
      This driver was used on Arm and SH machines until 2009, when the
      last platforms moved to the smsc911x driver for the same hardware.
      
      Time to retire this version.
      
      Link: https://lore.kernel.org/netdev/1232010482-3744-1-git-send-email-steve.glendinning@smsc.com/
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20221017121900.3520108-1-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a2fd0844
    • Merge tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 3566a79c
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2022-10-18
      
      We've added 33 non-merge commits during the last 14 day(s) which contain
      a total of 31 files changed, 874 insertions(+), 538 deletions(-).
      
      The main changes are:
      
      1) Add RCU grace period chaining to BPF to wait for the completion
         of access from both sleepable and non-sleepable BPF programs,
         from Hou Tao & Paul E. McKenney.
      
      2) Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
         values. In the wild we have seen OS vendors doing buggy backports
         where helper call numbers mismatched. This is an attempt to make
         backports more foolproof, from Andrii Nakryiko.
      
      3) Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions,
         from Roberto Sassu.
      
      4) Fix libbpf's BTF dumper for structs with padding-only fields,
         from Eduard Zingerman.
      
      5) Fix various libbpf bugs which have been found from fuzzing with
         malformed BPF object files, from Shung-Hsi Yu.
      
      6) Clean up an unneeded check on existence of SSE2 in BPF x86-64 JIT,
         from Jie Meng.
      
      7) Fix various ASAN bugs in both libbpf and selftests when running
         the BPF selftest suite on arm64, from Xu Kuohai.
      
      8) Fix missing bpf_iter_vma_offset__destroy() call in BPF iter selftest
         and use in-skeleton link pointer to remove an explicit bpf_link__destroy(),
         from Jiri Olsa.
      
      9) Fix BPF CI breakage by pointing to iptables-legacy instead of relying
         on symlinked iptables which got upgraded to iptables-nft,
         from Martin KaFai Lau.
      
      10) Minor BPF selftest improvements all over the place, from various others.
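Item 2 above is about giving each helper an explicit number instead of relying on its position in a list. A minimal sketch of that idea follows. The DEMO_* names and the four entries are hypothetical and are not the real BPF UAPI definitions.

```c
/*
 * Hypothetical helper list: every entry carries its own number, so the
 * value no longer depends on the entry's position in the list.
 */
#define DEMO_FUNC_MAPPER(FN) \
	FN(unspec, 0)        \
	FN(map_lookup, 1)    \
	FN(map_update, 2)    \
	FN(map_delete, 3)

/* Expand each (name, value) pair into an explicitly numbered enumerator. */
#define DEMO_ENUM_ENTRY(name, value) DEMO_FUNC_##name = value,
enum demo_func_id {
	DEMO_FUNC_MAPPER(DEMO_ENUM_ENTRY)
	DEMO_FUNC_MAX_ID,
};
#undef DEMO_ENUM_ENTRY
```

With implicit numbering, a backport that drops or reorders one entry silently renumbers everything after it. With explicit values, the remaining entries keep their numbers.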
      
      * tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (33 commits)
        bpf/docs: Update README for most recent vmtest.sh
        bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
        bpf: Use rcu_trace_implies_rcu_gp() in local storage map
        bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
        rcu-tasks: Provide rcu_trace_implies_rcu_gp()
        selftests/bpf: Use sys_pidfd_open() helper when possible
        libbpf: Fix null-pointer dereference in find_prog_by_sec_insn()
        libbpf: Deal with section with no data gracefully
        libbpf: Use elf_getshdrnum() instead of e_shnum
        selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c
        selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow
        selftest/bpf: Fix memory leak in kprobe_multi_test
        selftests/bpf: Fix memory leak caused by not destroying skeleton
        libbpf: Fix memory leak in parse_usdt_arg()
        libbpf: Fix use-after-free in btf_dump_name_dups
        selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test
        selftests/bpf: Alphabetize DENYLISTs
        selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id()
        libbpf: Introduce bpf_link_get_fd_by_id_opts()
        libbpf: Introduce bpf_btf_get_fd_by_id_opts()
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20221018210631.11211-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3566a79c
  2. 18 Oct, 2022 7 commits
  3. 13 Oct, 2022 13 commits
    • selftests/bpf: Use sys_pidfd_open() helper when possible · 62c69e89
      Hou Tao authored
      SYS_pidfd_open may be undefined with old glibc, so use the sys_pidfd_open()
      helper defined in task_local_storage_helpers.h instead to fix a potential
      build failure.
      
      And according to commit 7615d9e1 ("arch: wire-up pidfd_open()"), the
      syscall number of pidfd_open is always 434 except on the alpha architecture,
      so update the definition of __NR_pidfd_open accordingly.
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011071249.3471760-1-houtao@huaweicloud.com
      62c69e89
    • Merge branch 'libbpf: fix fuzzer-reported issues' · e94e0a2d
      Andrii Nakryiko authored
      Shung-Hsi Yu says:
      
      ====================
      
      Hi, this patch set fixes several fuzzer-reported issues in libbpf when
      dealing with (malformed) BPF object files:
      
      - patch #1 fixes an out-of-bounds heap write reported by oss-fuzz (currently
        incorrectly marked as fixed)
      
      - patches #2 and #3 fix null-pointer dereferences found by a locally-run
        fuzzer.
      
      v2:
      - Rebase to bpf-next
      - Move elf_getshdrnum() closer to where its result is used in patch #1, as
        suggested by Andrii
        - Touch up the comment in bpf_object__elf_collect(), replacing mention of
          e_shnum with elf_getshdrnum()
      - Minor wording change in commit message of patch #1 for better readability
      - Remove extra note that comes after commit message in patch #1
      
      v1: https://lore.kernel.org/bpf/20221007174816.17536-1-shung-hsi.yu@suse.com/
      ====================
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      e94e0a2d
    • Merge branch 'Fix bugs found by ASAN when running selftests' · 6e73e683
      Andrii Nakryiko authored
      Xu Kuohai says:
      
      ====================
      
      From: Xu Kuohai <xukuohai@huawei.com>
      
      This series fixes bugs found by ASAN when running bpf selftests on arm64.
      
      v4:
      - Address Andrii's suggestions
      
      v3: https://lore.kernel.org/bpf/5311e154-c2d4-91a5-ccb8-f5adede579ed@huawei.com
      - Fix error failure of case test_xdp_adjust_tail_grow exposed by this series
      
      v2: https://lore.kernel.org/bpf/20221010070454.577433-1-xukuohai@huaweicloud.com
      - Rebase and fix conflict
      
      v1: https://lore.kernel.org/bpf/20221009131830.395569-1-xukuohai@huaweicloud.com
      ====================
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      6e73e683
    • libbpf: Fix null-pointer dereference in find_prog_by_sec_insn() · d0d382f9
      Shung-Hsi Yu authored
      When there are no program sections, obj->programs is left unallocated,
      and find_prog_by_sec_insn()'s search lands on &obj->programs[0] == NULL,
      and will cause null-pointer dereference in the following access to
      prog->sec_idx.
      
      Guard the search with obj->nr_programs similar to what's being done in
      __bpf_program__iter() to prevent null-pointer access from happening.
      
      Fixes: db2b8b06 ("libbpf: Support CO-RE relocations for multi-prog sections")
      Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20221012022353.7350-4-shung-hsi.yu@suse.com
      d0d382f9
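The guard described in this commit can be illustrated with a simplified model. The struct and function names below are hypothetical stand-ins, not libbpf's own definitions. The sketch assumes the programs array is sorted by section index and then by instruction offset, and that it is NULL when nr_programs is 0.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the libbpf structures. */
struct prog_model {
	size_t sec_idx;       /* ELF section the program lives in */
	size_t sec_insn_off;  /* first instruction, relative to the section */
	size_t sec_insn_cnt;  /* number of instructions */
};

struct obj_model {
	struct prog_model *programs;  /* NULL when nr_programs == 0 */
	size_t nr_programs;
};

/*
 * Find the program that contains instruction insn_idx of section sec_idx.
 * The early return is the guard: without it, an object with no programs
 * would be dereferenced through a NULL programs pointer below.
 */
static struct prog_model *find_prog(const struct obj_model *obj,
				    size_t sec_idx, size_t insn_idx)
{
	struct prog_model *p;
	size_t lo = 0, hi;

	if (obj->nr_programs == 0)
		return NULL;
	hi = obj->nr_programs - 1;

	/* Binary search for the last program starting at or before the target. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo + 1) / 2;

		p = &obj->programs[mid];
		if (p->sec_idx < sec_idx ||
		    (p->sec_idx == sec_idx && p->sec_insn_off <= insn_idx))
			lo = mid;
		else
			hi = mid - 1;
	}

	/* The candidate only counts if the target falls inside its range. */
	p = &obj->programs[lo];
	if (p->sec_idx == sec_idx && insn_idx >= p->sec_insn_off &&
	    insn_idx < p->sec_insn_off + p->sec_insn_cnt)
		return p;
	return NULL;
}
```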
    • libbpf: Deal with section with no data gracefully · 35a85550
      Shung-Hsi Yu authored
      ELF section data pointer returned by libelf may be NULL (if section has
      SHT_NOBITS), so null check section data pointer before attempting to
      copy license and kversion section.
      
      Fixes: cb1e5e96 ("bpf tools: Collect version and license from ELF sections")
      Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20221012022353.7350-3-shung-hsi.yu@suse.com
      35a85550
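A minimal sketch of the null check described in this commit, under the assumption that section data is handed over as a buffer pointer plus a size. sec_data_model and copy_license() are hypothetical names for illustration, not libbpf or libelf API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the buffer/size pair of an ELF section. */
struct sec_data_model {
	const void *d_buf;  /* may be NULL, e.g. for a section with no file data */
	size_t d_size;
};

/*
 * Copy a license string out of section data into dst, always
 * NUL-terminating and truncating if needed. Returns 0 on success and
 * -1 when there is no buffer to copy from, instead of passing a NULL
 * source pointer to memcpy().
 */
static int copy_license(char *dst, size_t dst_sz,
			const struct sec_data_model *data)
{
	size_t n;

	if (dst_sz == 0 || data == NULL || data->d_buf == NULL)
		return -1;

	n = data->d_size < dst_sz - 1 ? data->d_size : dst_sz - 1;
	memcpy(dst, data->d_buf, n);
	dst[n] = '\0';
	return 0;
}
```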
    • libbpf: Use elf_getshdrnum() instead of e_shnum · 51deedc9
      Shung-Hsi Yu authored
      This commit replaces e_shnum with the elf_getshdrnum() helper to fix two
      oss-fuzz-reported heap-buffer overflows in __bpf_object__open. Both
      reports are incorrectly marked as fixed while still being
      reproducible in the latest libbpf.
      
        # clusterfuzz-testcase-minimized-bpf-object-fuzzer-5747922482888704
        libbpf: loading object 'fuzz-object' from buffer
        libbpf: sec_cnt is 0
        libbpf: elf: section(1) .data, size 0, link 538976288, flags 2020202020202020, type=2
        libbpf: elf: section(2) .data, size 32, link 538976288, flags 202020202020ff20, type=1
        =================================================================
        ==13==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000c0 at pc 0x0000005a7b46 bp 0x7ffd12214af0 sp 0x7ffd12214ae8
        WRITE of size 4 at 0x6020000000c0 thread T0
        SCARINESS: 46 (4-byte-write-heap-buffer-overflow-far-from-bounds)
            #0 0x5a7b45 in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3414:24
            #1 0x5733c0 in bpf_object_open /src/libbpf/src/libbpf.c:7223:16
            #2 0x5739fd in bpf_object__open_mem /src/libbpf/src/libbpf.c:7263:20
            ...
      
      The issue lies in libbpf's direct use of the e_shnum field in the ELF
      header as the section header count, whereas libelf implements extra
      logic that, when e_shnum == 0 && e_shoff != 0, uses the sh_size member
      of the initial section header as the real section header count (part
      of the ELF spec to accommodate situations where the section header
      count is SHN_LORESERVE or larger).
      
      The above inconsistency leads to libbpf writing into a zero-entry
      calloc area. So instead of using e_shnum directly, use the
      elf_getshdrnum() helper provided by libelf to retrieve the section
      header count into sec_cnt.
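
The fallback rule described above can be sketched in a few lines. This is an illustration only, not libelf's or libbpf's code: `mini_ehdr`, `mini_shdr` and `section_count()` are hypothetical stand-ins that keep just the fields named in the commit message. In real code the count would come from libelf's elf_getshdrnum().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down views of the ELF header and a section header. */
struct mini_ehdr {
	uint64_t e_shoff;	/* file offset of the section header table */
	uint16_t e_shnum;	/* section header count, 0 if it does not fit */
};

struct mini_shdr {
	uint64_t sh_size;	/* for section 0: real count when e_shnum == 0 */
};

/*
 * Return the real number of section headers. When e_shnum is 0 but a
 * section header table exists, the count is stored in the sh_size
 * member of the initial section header.
 */
static size_t section_count(const struct mini_ehdr *eh,
			    const struct mini_shdr *first)
{
	if (eh->e_shnum == 0 && eh->e_shoff != 0 && first)
		return (size_t)first->sh_size;
	return eh->e_shnum;
}
```

Sizing an array from e_shnum while iterating sections through a library that applies this rule is what produces the mismatch: the array has zero entries, but the iteration still visits real sections.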
      
      Fixes: 0d6988e1 ("libbpf: Fix section counting logic")
      Fixes: 25bbbd7a ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps")
      Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40868
      Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40957
      Link: https://lore.kernel.org/bpf/20221012022353.7350-2-shung-hsi.yu@suse.com
      51deedc9
    • Xu Kuohai's avatar
      selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c · cbc1c998
      Xu Kuohai authored
      xdp_adjust_tail.c calls ASSERT_OK() to check the return value of
      bpf_prog_test_load(), but the condition is not correct. Fix it.
      
      Fixes: 791cad02 ("bpf: selftests: Get rid of CHECK macro in xdp_adjust_tail.c")
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011120108.782373-7-xukuohai@huaweicloud.com
      cbc1c998
    • Xu Kuohai's avatar
      selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow · 4abdb1d5
      Xu Kuohai authored
      test_xdp_adjust_tail_grow failed with ipv6:
        test_xdp_adjust_tail_grow:FAIL:ipv6 unexpected error: -28 (errno 28)
      
      The reason is that this test case tests ipv4 before ipv6, and when the
      ipv4 test finishes, topts.data_size_out is set to 54, which is smaller
      than the ipv6 output data size of 114, so the ipv6 test fails with an
      ENOSPC error.
      
      Fix it by resetting topts.data_size_out to sizeof(buf) before testing
      ipv6.
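
The failure mode is typical of in/out size parameters. The sketch below uses a mock, `mock_test_run()`, which is a hypothetical stand-in and not the real test-run API: it treats `*data_size_out` as the buffer capacity on input and overwrites it with the produced size on output. The sizes 54 and 114 come from the message above; the capacity of 128 is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Mock of a call with an in/out size argument: on input *data_size_out
 * is the capacity of the output buffer, on success it is overwritten
 * with the number of bytes actually produced.
 */
static int mock_test_run(size_t produced, size_t *data_size_out)
{
	if (*data_size_out < produced)
		return -ENOSPC;		/* -28: output does not fit */

	*data_size_out = produced;
	return 0;
}
```

After a first run that produces 54 bytes, the variable holds 54, so a second run producing 114 bytes fails with -ENOSPC unless the caller first restores the full capacity.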
      
      Fixes: 04fcb5f9 ("selftests/bpf: Migrate from bpf_prog_test_run")
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011120108.782373-6-xukuohai@huaweicloud.com
      4abdb1d5
    • Xu Kuohai's avatar
      selftest/bpf: Fix memory leak in kprobe_multi_test · 6d2e21dc
      Xu Kuohai authored
      The get_syms() function in kprobe_multi_test.c does not free the string
      memory allocated by sscanf correctly. Fix it.
      
      Fixes: 5b6c7e5c ("selftests/bpf: Add attach bench test")
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011120108.782373-5-xukuohai@huaweicloud.com
      6d2e21dc
    • Xu Kuohai's avatar
      selftests/bpf: Fix memory leak caused by not destroying skeleton · 6e8280b9
      Xu Kuohai authored
      Some test cases do not destroy the skeleton object correctly, causing
      ASAN to report memory leak warnings. Fix it.
      
      Fixes: 0ef6740e ("selftests/bpf: Add tests for kptr_ref refcounting")
      Fixes: 1642a394 ("selftests/bpf: Add struct argument tests with fentry/fexit programs.")
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011120108.782373-4-xukuohai@huaweicloud.com
      6e8280b9
    • Xu Kuohai's avatar
      libbpf: Fix memory leak in parse_usdt_arg() · 0dc9254e
      Xu Kuohai authored
      In the arm64 version of parse_usdt_arg(), when sscanf returns 2, reg_name
      is allocated but not freed. Fix it.
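
A minimal sketch of the leak pattern, assuming a glibc-style sscanf() that supports the `m` allocation modifier. The `SIZE@[REG]` format string and the `parse_reg_arg()` helper are hypothetical and only loosely modelled on the description above; they are not the actual parse_usdt_arg() code.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical parser for strings such as "8@[x0]". The %m[...]
 * conversion makes sscanf() allocate the matched string, so the caller
 * owns it and must free it on every path once a match has happened.
 */
static int parse_reg_arg(const char *s, int *sz,
			 char *reg_out, size_t reg_out_sz)
{
	char *reg = NULL;
	int ret = -EINVAL;

	if (sscanf(s, " %d @ [ %m[a-z0-9] ]", sz, &reg) == 2) {
		/* Copy what is needed before releasing the allocation. */
		snprintf(reg_out, reg_out_sz, "%s", reg);
		ret = 0;
	}

	free(reg);	/* free(NULL) is a no-op when nothing was allocated */
	return ret;
}
```

Initialising the pointer to NULL and freeing it at a single exit point keeps each successful sscanf() branch from needing its own cleanup.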
      
      Fixes: 0f861992 ("libbpf: Usdt aarch64 arg parsing support")
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011120108.782373-3-xukuohai@huaweicloud.com
      0dc9254e
    • Xu Kuohai's avatar
      libbpf: Fix use-after-free in btf_dump_name_dups · 93c660ca
      Xu Kuohai authored
      ASAN reports a use-after-free in btf_dump_name_dups:
      
      ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928
      READ of size 2 at 0xffff927006db thread T0
          #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614)
          #1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127
          #2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143
          #3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212
          #4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525
          #5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552
          #6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567
          #7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912
          #8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798
          #9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282
          #10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236
          #11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
          #12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
          #13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
          #14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
          #15 0xaaaab5d65990  (test_progs+0x185990)
      
      0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0)
      freed by thread T0 here:
          #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
          #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
          #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
          #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
          #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
          #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
          #6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032
          #7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232
          #8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
          #9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
          #10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
          #11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
          #12 0xaaaab5d65990  (test_progs+0x185990)
      
      previously allocated by thread T0 here:
          #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
          #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
          #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
          #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
          #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
          #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
          #6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070
          #7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102
          #8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162
          #9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
          #10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
          #11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
          #12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
          #13 0xaaaab5d65990  (test_progs+0x185990)
      
      The reason is that the key stored in the hash table name_map is a
      string address, and the string memory is allocated by realloc(). When
      that memory is later resized by realloc(), the old memory may be
      freed, so the address stored in name_map refers to freed memory,
      causing a use-after-free.
      
      Fix it by storing the address of a duplicated string in name_map.
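
The hazard and the fix can be sketched without any libbpf code. In the illustration below, `strbuf`, `strbuf_add()` and `dup_key()` are hypothetical names: the buffer grows with realloc(), so pointers into it may dangle after any later append, while a privately duplicated key stays valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string storage backed by realloc(). */
struct strbuf {
	char *data;
	size_t len;
};

/* Append s (with its NUL) and return its offset, or -1 on failure. */
static long strbuf_add(struct strbuf *b, const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = realloc(b->data, b->len + n);
	long off;

	if (!p)
		return -1;
	memcpy(p + b->len, s, n);
	b->data = p;		/* the old block may have been freed */
	off = (long)b->len;
	b->len += n;
	return off;
}

/*
 * Return a private copy of the string at off. Unlike b->data + off,
 * the copy is unaffected by later reallocations; the caller frees it.
 */
static char *dup_key(const struct strbuf *b, long off)
{
	size_t n = strlen(b->data + off) + 1;
	char *key = malloc(n);

	if (key)
		memcpy(key, b->data + off, n);
	return key;
}
```

The trade-off is one extra allocation per key, which then has to be released when the map holding the keys is torn down.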
      
      Fixes: 919d2b1d ("libbpf: Allow modification of BTF and add btf__add_str API")
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com
      93c660ca
    • Linus Torvalds's avatar
      Merge tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 66ae0436
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter, and wifi.
      
      Current release - regressions:
      
         - Revert "net/sched: taprio: make qdisc_leaf() see the
           per-netdev-queue pfifo child qdiscs", it may cause crashes when the
           qdisc is reconfigured
      
         - inet: ping: fix splat due to packet allocation refactoring in inet
      
         - tcp: clean up kernel listener's reqsk in inet_twsk_purge(), fix UAF
           due to races when per-netns hash table is used
      
        Current release - new code bugs:
      
         - eth: adin1110: check in netdev_event that netdev belongs to driver
      
         - fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.
      
        Previous releases - regressions:
      
         - ipv4: handle attempt to delete multipath route when fib_info
           contains an nh reference, avoid oob access
      
         - wifi: fix handful of bugs in the new Multi-BSSID code
      
         - wifi: mt76: fix rate reporting / throughput regression on mt7915
           and newer, fix checksum offload
      
         - wifi: iwlwifi: mvm: fix double list_add at
           iwl_mvm_mac_wake_tx_queue (other cases)
      
         - wifi: mac80211: do not drop packets smaller than the LLC-SNAP
           header on fast-rx
      
        Previous releases - always broken:
      
         - ieee802154: don't warn zero-sized raw_sendmsg()
      
         - ipv6: ping: fix wrong checksum for large frames
      
         - mctp: prevent double key removal and unref
      
         - tcp/udp: fix memory leaks and races around IPV6_ADDRFORM
      
         - hv_netvsc: fix race between VF offering and VF association message
      
        Misc:
      
         - remove -Warray-bounds silencing in the drivers, compilers fixed"
      
      * tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
        sunhme: fix an IS_ERR() vs NULL check in probe
        net: marvell: prestera: fix a couple NULL vs IS_ERR() checks
        kcm: avoid potential race in kcm_tx_work
        tcp: Clean up kernel listener's reqsk in inet_twsk_purge()
        net: phy: micrel: Fixes FIELD_GET assertion
        openvswitch: add nf_ct_is_confirmed check before assigning the helper
        tcp: Fix data races around icsk->icsk_af_ops.
        ipv6: Fix data races around sk->sk_prot.
        tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
        udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
        tcp/udp: Fix memory leak in ipv6_renew_options().
        mctp: prevent double key removal and unref
        selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1
        netfilter: rpfilter/fib: Populate flowic_l3mdev field
        selftests: netfilter: Test reverse path filtering
        net/mlx5: Make ASO poll CQ usable in atomic context
        tcp: cdg: allow tcp_cdg_release() to be called multiple times
        inet: ping: fix recent breakage
        ipv6: ping: fix wrong checksum for large frames
        net: ethernet: ti: am65-cpsw: set correct devlink flavour for unused ports
        ...
      66ae0436