1. 08 Jan, 2022 4 commits
    • net: dsa: felix: add port fast age support · 5cad43a5
      Vladimir Oltean authored
      Add support for flushing the MAC table on a given port in the ocelot
      switch library, and use this functionality in the felix DSA driver.
      
      This operation is needed when a port leaves a bridge to become
      standalone, when learning is disabled on the port, and when the STP
      state changes to one where no FDB entry should be present.

      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220107144229.244584-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mscc: ocelot: fix incorrect balancing with down LAG ports · a14e6b69
      Vladimir Oltean authored
      Assuming the test setup described here:
      https://patchwork.kernel.org/project/netdevbpf/cover/20210205130240.4072854-1-vladimir.oltean@nxp.com/
      (swp1 and swp2 are in bond0, and bond0 is in a bridge with swp0)
      
      it can be seen that when swp1 goes down (on either board A or B), then
      traffic that should go through that port isn't forwarded anywhere.
      
      A dump of the PGID table shows the following:
      
      PGID_DST[0] = ports 0
      PGID_DST[1] = ports 1
      PGID_DST[2] = ports 2
      PGID_DST[3] = ports 3
      PGID_DST[4] = ports 4
      PGID_DST[5] = ports 5
      PGID_DST[6] = no ports
      PGID_AGGR[0] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[1] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[2] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[3] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[4] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[5] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[6] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[7] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[8] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[9] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[10] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[11] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[12] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[13] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[14] = ports 0, 1, 2, 3, 4, 5
      PGID_AGGR[15] = ports 0, 1, 2, 3, 4, 5
      PGID_SRC[0] = ports 1, 2
      PGID_SRC[1] = ports 0
      PGID_SRC[2] = ports 0
      PGID_SRC[3] = no ports
      PGID_SRC[4] = no ports
      PGID_SRC[5] = no ports
      PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5
      
      Whereas a "good" PGID configuration for that setup should have looked
      like this:
      
      PGID_DST[0] = ports 0
      PGID_DST[1] = ports 1, 2
      PGID_DST[2] = ports 1, 2
      PGID_DST[3] = ports 3
      PGID_DST[4] = ports 4
      PGID_DST[5] = ports 5
      PGID_DST[6] = no ports
      PGID_AGGR[0] = ports 0, 2, 3, 4, 5
      PGID_AGGR[1] = ports 0, 2, 3, 4, 5
      PGID_AGGR[2] = ports 0, 2, 3, 4, 5
      PGID_AGGR[3] = ports 0, 2, 3, 4, 5
      PGID_AGGR[4] = ports 0, 2, 3, 4, 5
      PGID_AGGR[5] = ports 0, 2, 3, 4, 5
      PGID_AGGR[6] = ports 0, 2, 3, 4, 5
      PGID_AGGR[7] = ports 0, 2, 3, 4, 5
      PGID_AGGR[8] = ports 0, 2, 3, 4, 5
      PGID_AGGR[9] = ports 0, 2, 3, 4, 5
      PGID_AGGR[10] = ports 0, 2, 3, 4, 5
      PGID_AGGR[11] = ports 0, 2, 3, 4, 5
      PGID_AGGR[12] = ports 0, 2, 3, 4, 5
      PGID_AGGR[13] = ports 0, 2, 3, 4, 5
      PGID_AGGR[14] = ports 0, 2, 3, 4, 5
      PGID_AGGR[15] = ports 0, 2, 3, 4, 5
      PGID_SRC[0] = ports 1, 2
      PGID_SRC[1] = ports 0
      PGID_SRC[2] = ports 0
      PGID_SRC[3] = no ports
      PGID_SRC[4] = no ports
      PGID_SRC[5] = no ports
      PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5
      
      In other words, in the "bad" configuration, the attempt is to remove the
      inactive swp1 from the destination ports via PGID_DST. But when a MAC
      table entry is learned, it is learned towards PGID_DST 1, because that
      is the logical port id of the LAG itself (it is equal to the lowest
      numbered member port). So when swp1 becomes inactive, if we set
      PGID_DST[1] to contain just swp1 and not swp2, the packet will not have
      any chance to reach the destination via swp2.
      
      The "correct" way to remove swp1 as a destination is via PGID_AGGR
      (remove swp1 from the aggregation port groups for all aggregation
      codes). This means that PGID_DST[1] and PGID_DST[2] must still contain
      both swp1 and swp2. This makes the MAC table still treat packets
      destined towards the single-port LAG as "multicast", and the inactive
      ports are removed via the aggregation code tables.
      
      The change presented here is a design one: the ocelot_get_bond_mask()
      function used to take an "only_active_ports" argument. We don't need
      that. The only call site that specifies only_active_ports=true,
      ocelot_set_aggr_pgids(), must retrieve the entire bonding mask, because
      it must program that into PGID_DST. Additionally, it must also clear the
      inactive ports from the bond mask here, which it can't do if bond_mask
      just contains the active ports:
      
      	ac = ocelot_read_rix(ocelot, ANA_PGID_PGID, i);
      	ac &= ~bond_mask;  <---- here
      	/* Don't do division by zero if there was no active
      	 * port. Just make all aggregation codes zero.
      	 */
      	if (num_active_ports)
      		ac |= BIT(aggr_idx[i % num_active_ports]);
      	ocelot_write_rix(ocelot, ac, ANA_PGID_PGID, i);
      
      So it becomes the responsibility of ocelot_set_aggr_pgids() to take
      ocelot_port->lag_tx_active into consideration when populating the
      aggr_idx array.
      
      Fixes: 23ca3b72 ("net: mscc: ocelot: rebalance LAGs on link up/down events")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220107164332.402133-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · a5e7d9bb
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2022-01-07
      
      This series contains updates to i40e and iavf drivers.
      
      Karen limits per VF MAC filters so that one VF does not consume all
      filters for i40e.
      
      Jedrzej reduces busy wait time for admin queue calls for i40e.
      
      Mateusz updates firmware versions to reflect new supported NVM images
      and renames an error to remove non-inclusive language for i40e.
      
      Yang Li fixes a set but not used warning for i40e.
      
      Jason Wang removes an unneeded variable for iavf.
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
        iavf: remove an unneeded variable
        i40e: remove variables set but not used
        i40e: Remove non-inclusive language
        i40e: Update FW API version
        i40e: Minimize amount of busy-waiting during AQ send
        i40e: Add ensurance of MacVlan resources for every trusted VF
      ====================
      
      Link: https://lore.kernel.org/r/20220107175704.438387-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/tls: Fix skb memory leak when running kTLS traffic · ffef737f
      Gal Pressman authored
      The cited Fixes commit introduced a memory leak when running kTLS
      traffic (with/without hardware offloads).
      I'm running nginx on the server side and wrk on the client side and get
      the following:
      
        unreferenced object 0xffff8881935e9b80 (size 224):
        comm "softirq", pid 0, jiffies 4294903611 (age 43.204s)
        hex dump (first 32 bytes):
          80 9b d0 36 81 88 ff ff 00 00 00 00 00 00 00 00  ...6............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000efe2a999>] build_skb+0x1f/0x170
          [<00000000ef521785>] mlx5e_skb_from_cqe_mpwrq_linear+0x2bc/0x610 [mlx5_core]
          [<00000000945d0ffe>] mlx5e_handle_rx_cqe_mpwrq+0x264/0x9e0 [mlx5_core]
          [<00000000cb675b06>] mlx5e_poll_rx_cq+0x3ad/0x17a0 [mlx5_core]
          [<0000000018aac6a9>] mlx5e_napi_poll+0x28c/0x1b60 [mlx5_core]
          [<000000001f3369d1>] __napi_poll+0x9f/0x560
          [<00000000cfa11f72>] net_rx_action+0x357/0xa60
          [<000000008653b8d7>] __do_softirq+0x282/0x94e
          [<00000000644923c6>] __irq_exit_rcu+0x11f/0x170
          [<00000000d4085f8f>] irq_exit_rcu+0xa/0x20
          [<00000000d412fef4>] common_interrupt+0x7d/0xa0
          [<00000000bfb0cebc>] asm_common_interrupt+0x1e/0x40
          [<00000000d80d0890>] default_idle+0x53/0x70
          [<00000000f2b9780e>] default_idle_call+0x8c/0xd0
          [<00000000c7659e15>] do_idle+0x394/0x450
      
      I'm not familiar with these areas of the code, but I've added this
      sk_defer_free_flush() to tls_sw_recvmsg() based on a hunch and it
      resolved the issue.
      
      Fixes: f35f8219 ("tcp: defer skb freeing after socket lock is released")
      Signed-off-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220102081253.9123-1-gal@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 07 Jan, 2022 36 commits