- 22 Nov, 2023 20 commits
-
-
Gan, Yi Fang authored
Current implementation supports driver level VLAN tag stripping only. The features is always on if CONFIG_VLAN_8021Q is enabled in kernel config and is not user configurable. This patch add support to MAC level VLAN tag stripping and can be configured through ethtool. If the rx-vlan-offload is off, the VLAN tag will be stripped by driver. If the rx-vlan-offload is on, the VLAN tag will be stripped by MAC. Command: ethtool -K <interface> rx-vlan-offload off | on Signed-off-by: Lai Peter Jun Ann <jun.ann.lai@intel.com> Signed-off-by: Gan, Yi Fang <yi.fang.gan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Murali Karicheri authored
When MC (multicast) list is updated by the networking layer due to a user command and as well as when allmulti flag is set, it needs to be passed to the enslaved Ethernet devices. This patch allows this to happen by implementing ndo_change_rx_flags() and ndo_set_rx_mode() API calls that in turns pass it to the slave devices using existing API calls. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski authored
Daniel Borkmann says: ==================== pull-request: bpf-next 2023-11-21 We've added 85 non-merge commits during the last 12 day(s) which contain a total of 63 files changed, 4464 insertions(+), 1484 deletions(-). The main changes are: 1) Huge batch of verifier changes to improve BPF register bounds logic and range support along with a large test suite, and verifier log improvements, all from Andrii Nakryiko. 2) Add a new kfunc which acquires the associated cgroup of a task within a specific cgroup v1 hierarchy where the latter is identified by its id, from Yafang Shao. 3) Extend verifier to allow bpf_refcount_acquire() of a map value field obtained via direct load which is a use-case needed in sched_ext, from Dave Marchevsky. 4) Fix bpf_get_task_stack() helper to add the correct crosstask check for the get_perf_callchain(), from Jordan Rome. 5) Fix BPF task_iter internals where lockless usage of next_thread() was wrong. The rework also simplifies the code, from Oleg Nesterov. 6) Fix uninitialized tail padding via LIBBPF_OPTS_RESET, and another fix for certain BPF UAPI structs to fix verifier failures seen in bpf_dynptr usage, from Yonghong Song. 7) Add BPF selftest fixes for map_percpu_stats flakes due to per-CPU BPF memory allocator not being able to allocate per-CPU pointer successfully, from Hou Tao. 8) Add prep work around dynptr and string handling for kfuncs which is later going to be used by file verification via BPF LSM and fsverity, from Song Liu. 9) Improve BPF selftests to update multiple prog_tests to use ASSERT_* macros, from Yuran Pereira. 10) Optimize LPM trie lookup to check prefixlen before walking the trie, from Florian Lehner. 11) Consolidate virtio/9p configs from BPF selftests in config.vm file given they are needed consistently across archs, from Manu Bretelle. 12) Small BPF verifier refactor to remove register_is_const(), from Shung-Hsi Yu. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca selftests/bpf: reduce verboseness of reg_bounds selftest logs bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) bpf: bpf_iter_task_next: use __next_thread() rather than next_thread() bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() bpf: emit frameno for PTR_TO_STACK regs if it differs from current one bpf: smarter verifier log number printing logic bpf: omit default off=0 and imm=0 in register state log bpf: emit map name in register state if applicable and available bpf: print spilled register state in stack slot bpf: extract register state printing bpf: move verifier state printing code to kernel/bpf/log.c bpf: move verbose_linfo() into kernel/bpf/log.c bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS bpf: Remove test for MOVSX32 with offset=32 selftests/bpf: add iter test requiring range x range logic veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag ... ==================== Link: https://lore.kernel.org/r/20231122000500.28126-1-daniel@iogearbox.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Michael Chan says: ==================== bnxt_en: Prepare to support new P7 chips This patchset is to prepare the driver to support the new P7 chips by refactoring and modifying the code. The P7 chip is built on the P5 chip and many code paths can be modified to support both chips. The whole patchset to have basic support for P7 chips is about 20 patches so a follow-on patchset will complete the support and add the new PCI IDs. The first 8 patches are changes to the backing store logic to support both chips with mostly common code paths and datastructures. Both chips require host backing store memory but the relevant firmware APIs have been modified to make it easier to support new backing store memory types. The next 4 patches are changes to TX and RX ring indexing logic and NAPI logic. The main changes are to increment the TX and RX producers unbounded and to do any masking only when needed. These changes are needed to support the P7 chips which require additional higher bits in these producer indices. The NAPI logic is also slightly modifed. The last patch is a rename of BNXT_FLAG_CHIP_P5 to BNXT_FLAG_P5_PLUS and other related macro changes to make it clear that the P5_PLUS macro applies to P5 and newer chips. ==================== Link: https://lore.kernel.org/r/20231120234405.194542-1-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Randy Schacher authored
In preparation to support a new P7 chip which has a lot of similarities with the P5 chip, rename the BNXT_FLAG_CHIP_P5 flag to BNXT_FLAG_CHIP_P5_PLUS. This will make it clear that the flag is for P5 and newer chips. Also, since there are no additional P5 variants in production, rename BNXT_FLAG_CHIP_P5_THOR() to BNXT_FLAG_CHIP_P5() to keep the naming more simple. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Randy Schacher <stuart.schacher@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-14-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Modify the NAPI logic for the new doorbell mechanism on P7 chips. These changes are compatible with the current P5 chips. In the current logic, bnxt_poll_p5() services 1 or more CQs for each MSIX. Each MSIX has an associated NQ and each NQ has 1 or more associated CQs. If any CQ reaches NAPI budget, we'll stay in polling mode and will unconditionally check and service all CQs until we exit polling. We always re-arm all CQs when we exit polling. To be compatible with the new Toggle bit mechanism in P7 chips, we need to modify the logic so that we service and re-arm the CQ only if we receive an NQE notification for work for that CQ. We add a new had_nqe_notify bit to the cp_ring_info structure and it gets set when we see the NQE notification for that CQ anytime during polling. We'll service and re-arm only the CQs with the had_nqe_notify bits set. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-13-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Modify the RX indexing logic for both RX ring and RX aggregation ring just like the TX logic. Change it so that the index increments unbounded and mask it only when needed. Modify the existing RX macros so that the index is not masked. Add new macros RING_RX()/RING_RX_AGG() to mask it only when needed to get the index of rxr->rx_buf_ring[] and rxr->rx_agg_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-12-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Change the TX ring logic so that the index increments unbounded and mask it only when needed. Modify the existing macros so that the index is not masked. Add a new macro RING_TX() to mask it only when needed to get the index of txr->tx_buf_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-11-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
This allows the doorbell related logic to mask the doorbell index to the proper range before writing the doorbell. The current code masks the doorbell index immediately to keep it in the legal ranges for the most part. Subsequent patches will change the logic so that the index increments unbounded and it only gets masked before use. This is preparation work for the new chip that requires an additional Epoch bit in the doorbell that needs to toggle when the index has wrapped around. This patch just adds the basic infrastructure and the logic is largely unchanged. We now replace RING_CMP() with the new DB_RING_IDX() at appropriate places where we mask the completion ring index before writing the doorbell. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-10-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Newer chips starting with 57600 will use this new firmware HWRM call to configure backing store memory. Add this new call if it is supported by the firmware. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-9-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Use the new v2 firmware API if supported by the firmware. We now have the infrastructure to support the v2 API. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-8-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
In bnxt_alloc_ctx_mem(), the logic to set up the context memory entries and to allocate the context memory tables is done repetitively. Add a helper function to simplify the code. The setup of the Fast Path TQM entries relies on some information from the Slow Path TQM entries. Copy the SP_TQM entries to the FP_TQM entries to simplify the logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-7-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Use the newly added pg_info field in bnxt_ctx_mem_type struct and remove the standalone page info structures in bnxt_ctx_mem_info. This now completes the reorganization of the context memory structures to work better with the new and more flexible firmware interface for newer chips. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-6-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
This will further improve the organization of the bnxt_ctx_mem_info structure by moving the standalone page info structures into the bnxt_ctx_mem_type array. Add the allocation and free logic first and the next patch will migrate to use the new infrastructure. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-5-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
The current code uses a flat bnxt_ctx_mem_info structure to store 8 types of context memory for the NIC. All the context memory types are very similar and have similar parameters. They can all share a common structure to improve the organization. Also, new firmware interface will provide a new API to retrieve each type of context memory by calling the API repeatedly. This patch reorganizes the bnxt_ctx_mem_info structure to fit better with the new firmware interface. It will also work with the legacy firmware interface. The flat fields in bnxt_ctx_mem_info are replaced by the bnxt_ctx_mem_type array. The bnxt_mem_init array info will no longer be needed. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-4-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
We always free bp->ctx right after calling bnxt_free_ctx_mem(), so just free it at the end of that function to make things simpler. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-3-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
bnxt_alloc_ctx_mem() calls bnxt_hwrm_func_backing_store_qcaps() to allocate the memory for bp->ctx. Initialize bp->ctx with the allocated memory and let the caller free it during unwind. The unwind logic is already there, we just need to always set bp->ctx to the allocated memory so the caller will always free it. This simplifies the logic and makes it easier to expand on the backing store logic. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-2-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Jakub Kicinski says: ==================== net: page_pool: plit the page_pool_params into fast and slow Small refactoring in prep for adding more page pool params which won't be needed on the fast path. v1: https://lore.kernel.org/all/20231024160220.3973311-1-kuba@kernel.org/ RFC: https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/ ==================== Link: https://lore.kernel.org/r/20231121000048.789613-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
To fully benefit from previous commit add one byte of state in the first cache line recording if we need to look at the slow part. The packing isn't all that impressive right now, we create a 7B hole. I'm expecting Olek's rework will reshuffle this, anyway. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://lore.kernel.org/r/20231121000048.789613-3-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
struct page_pool is rather performance critical and we use 16B of the first cache line to store 2 pointers used only by test code. Future patches will add more informational (non-fast path) attributes. It's convenient for the user of the API to not have to worry which fields are fast and which are slow path. Use struct groups to split the params into the two categories internally. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20231121000048.789613-2-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 21 Nov, 2023 20 commits
-
-
Jakub Kicinski authored
Petr Machata says: ==================== mlxsw: Preparations for support of CFF flood mode PGT is an in-HW table that maps addresses to sets of ports. Then when some HW process needs a set of ports as an argument, instead of embedding the actual set in the dynamic configuration, what gets configured is the address referencing the set. The HW then works with the appropriate PGT entry. Among other allocations, the PGT currently contains two large blocks for bridge flooding: one for 802.1q and one for 802.1d. Within each of these blocks are three tables, for unknown-unicast, multicast and broadcast flooding: . . . | 802.1q | 802.1d | . . . | UC | MC | BC | UC | MC | BC | \______ _____/ \_____ ______/ v v FID flood vectors Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an 802.1q bridge) uses three flood vectors spread across a fairly large region of PGT. This way of organizing the flood table (called "controlled") is not very flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one would need to completely rewrite the bridge PGT blocks, or resort to hacks such as storing individual MC flood vectors into unused part of the bridge table. In order to address these shortcomings, Spectrum-2 and above support what is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode, each FID has a little table of its own, with three entries adjacent to each other, one for unknown-UC, one for MC, one for BC. This allows for a much more fine-grained approach to PGT management, where bits of it are allocated on demand. . . . | FID | FID | FID | FID | FID | . . . |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B| \_____________ _____________/ v FID flood vectors Besides the FID table organization, the CFF flood mode also impacts Router Subport (RSP) table. This table contains flood vectors for rFIDs, which are FIDs that reference front panel ports or LAGs. The RSP table contains two entries per front panel port and LAG, one for unknown-UC traffic, and one for everything else. Currently, the FW allocates and manages the table in its own part of PGT. rFIDs are marked with flood_rsp bit and managed specially. In CFF mode, rFIDs are managed as all other FIDs. The driver therefore has to allocate and maintain the flood vectors. Like with bridge FIDs, this is more work, but increases flexibility of the system. The FW currently supports both the controlled and CFF flood modes. To shed complexity, in the future it should only support CFF flood mode. Hence this patchset, which is the first in series of two to add CFF flood mode support to mlxsw. There are FW versions out there that do not support CFF flood mode, and on Spectrum-1 in particular, there is no plan to support it at all. mlxsw will therefore have to support both controlled flood mode as well as CFF. Another aspect is that at least on Spectrum-1, there are FW versions out there that claim to support CFF flood mode, but then reject or ignore configurations enabling the same. The driver thus has to have a say in whether an attempt to configure CFF flood mode should even be made. Much like with the LAG mode, the feature is therefore expressed in terms of "does the driver prefer CFF flood mode?", and "what flood mode the PCI module managed to configure the FW with". This gives to the driver a chance to determine whether CFF flood mode configuration should be attempted. In this patchset, we lay the ground with new definitions, registers and their fields, and some minor code shaping. The next patchset will be more focused on introducing necessary abstractions and implementation. - Patches #1 and #2 add CFF-related items to the command interface. - Patch #3 adds a new resource, for maximum number of flood profiles supported. (A flood profile is a mapping between traffic type and offset in the per-FID flood vector table.) - Patches #4 to #8 adjust reg.h. The SFFP register is added, which is used for configuring the abovementioned traffic-type-to-offset mapping. The SFMR, register, which serves for FID configuration, is extended with fields specific to CFF mode. And other minor adjustments. - Patches #9 and #10 add the plumbing for CFF mode: a way to request that CFF flood mode be configured, and a way to query the flood mode that was actually configured. - Patch #11 removes dead code. - Patches #12 and #13 add helpers that the next patchset will make use of. Patch #14 moves RIF setup ahead so that FID code can make use of it. ==================== Link: https://lore.kernel.org/r/cover.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
For subport RIFs, the setup initializes, among other things, RIF port and LAG numbers. Those are important to determine where in the PGT the RIF FID will be stored. Therefore, call the RIF setup before fid_get. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/f24d8cad7e4748b8e8e0e16894ca6a20704dea32.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
In the CFF flood mode, responsibility for management of the PGT entries for rFIDs is moved from FW to the driver. All rFIDs are based off either a front panel port, or a LAG port. The flood vectors for port-based rFIDs enable just the port itself, the ones for LAG-based rFIDs enable all member ports of the LAG in question. Since all rFIDs based off the same port have the same flood vector, and similarly for LAG-based rFIDs, the flood entries are shared. The PGT address of the flood vector is therefore determined based on the port (or LAG) number of the RIF connected with the rFID. Add a helper to determine subport number given a RIF, to be used in these calculations. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d7ab43cf5b021f785f363f236e4b6780d10eea93.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
Both mlxsw_sp_fid_op() and mlxsw_sp_fid_edit_op() pack the core of SFMR the same way. Extract the common code into a helper and call that. Extract out of that a wrapper that just calls mlxsw_reg_sfmr_pack(), because it will be useful for the dummy family later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/31f32b4d767183f6cb197148d0792feab2efadba.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
The caller already only calls mlxsw_sp_fid_flood_tables_init() and mlxsw_sp_fid_flood_tables_fini() if (fid_family->flood_tables). There is no configuration where the pointer is non-NULL, but the number of tables is zero. So drop the conditions. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/897c6841bc756ac632b797bf67ac83c6a66ba359.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
There are FW versions out there that do not support CFF flood mode, and on Spectrum-1 in particular, there is no plan to support it at all. mlxsw will therefore have to support both controlled flood mode as well as CFF. There are also FW versions out there that claim to support CFF flood mode, but then reject or ignore configurations enabling the same. The driver thus has to have a say in whether an attempt to configure CFF flood mode should even be made, and what to use as a fallback. Hence express the feature in terms of "does the driver prefer CFF flood mode?", and "what flood mode the PCI module managed to configure the FW with". This gives to the driver a chance to determine whether CFF flood mode configuration should be attempted. The latter bit was added in previous patches. In this patch, add the bit that allows the driver to determine whether CFF enablement should be attempted, and the enablement code itself. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/41640a0ee58e0a9538f820f7b601a0e35f6449e4.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
CFF mode, for Compressed FID Flooding, is a way of organizing flood vectors in the PGT table. The bus module determines whether CFF is supported, can configure flood mode to CFF if it is, and knows what flood mode has been configured. Therefore add a bus callback to determine the configured flood mode. Also add to core an API to query it. Since after this patch, we rely on mlxsw_pci->flood_mode being set, it becomes a coding error if a driver invokes this function with a set of fields that misses the initialization. Warn and bail out in that case. The CFF mode is not used as of this patch. The code to actually use it will be added later. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/889d58759dd40f5037f2206b9fc4a78a9240da80.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
Add the field cff_mid_base, which specifies at which point in PGT the per-FID flood table is stored. Add cff_prf_id, the profile ID, which determines on which row of the flood table a flood vector can be found for a given traffic type. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/3ad7ae38cf6534bedcd876f16090d109a814b3e3.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
In CFF mode, it is necessary to set a different set of SFMR fields. Leave in mlxsw_reg_sfmr_pack() only the common bits, and move the parts relevant to controlled flood mode directly to the call site. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/6f29639ebc3ca0722272e6c644ca910096469413.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
The MLXSW_REG_ZERO at the beginning of the function wipes the whole payload. There's no need to set vtfp and vv to false explicitly. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/04a51ea7cf31eea0ef7707311d8e864e2d9ef307.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
Some existing fields and the whole register of SFGC are reserved in CFF mode. Backport the reservation note to these fields. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/e1d5977a8cb778227e4ea2fd1515529957ce5de7.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
The SFFP register populates the fid flooding profile tables used for the NVE flooding and Compressed-FID Flooding (CFF). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/ca42eb67763bd0c7cf035afc62ef73632f3f61a6.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
max_cap_nve_flood_prf describes maximum number of NVE flooding profiles. The same value then applies for flooding profiles for flooding in CFF mode. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/064a2e013d879e5f5494167a6c120c4bb85a2204.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
PGT, a port-group table is an in-HW block of specialized memory that holds sets of ports. Allocated within the PGT are series of flood tables that describe to which ports traffic of various types (unknown UC, BC, MC) should be flooded from which FID. The hitherto-used layout of these flood tables is being replaced with a more flexible scheme, called compressed FID flooding (CFF). CFF can be configured through CONFIG_PROFILE.flood_mode. In this patch, add MLXSW_CMD_MBOX_CONFIG_PROFILE_FLOOD_MODE_CFF, the value to use to enable the CFF mode. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/fc2e063742856492f8f22b0b87abf431ea6d53d0.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
PGT, a port-group table is an in-HW block of specialized memory that holds sets of ports. Allocated within the PGT are series of flood tables that describe to which ports traffic of various types (unknown UC, BC, MC) should be flooded from which FID. The hitherto-used layout of these flood tables is being replaced with a more flexible scheme, called compressed FID flooding (CFF). CFF can be configured through CONFIG_PROFILE.flood_mode. cff_support determines whether CONFIG_PROFILE.flood_mode can be set to CFF. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/af727d0e1095e30fa45c7e60404637cdc491aeec.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Networking supports changing netdevice's netns and name at the same time. This allows avoiding name conflicts and having to rename the interface in multiple steps. E.g. netns1={eth0, eth1}, netns2={eth1} - we want to move netns1:eth1 to netns2 and call it eth0 there. If we can't rename "in flight" we'd need to (1) rename eth1 -> $tmp, (2) change netns, (3) rename $tmp -> eth0. To rename the underlying struct device we have to call device_rename(). The rename()'s MOVE event, however, doesn't "belong" to either the old or the new namespace. If there are conflicts on both sides it's actually impossible to issue a real MOVE (old name -> new name) without confusing user space. And Daniel reports that such confusions do in fact happen for systemd, in real life. Since we already issue explicit REMOVE and ADD events manually - suppress the MOVE event completely. Move the ADD after the rename, so that the REMOVE uses the old name, and the ADD the new one. If there is no rename this changes the picture as follows: Before: old ns | KERNEL[213.399289] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[213.401302] add /devices/virtual/net/eth0 (net) new ns | KERNEL[213.401397] move /devices/virtual/net/eth0 (net) After: old ns | KERNEL[266.774257] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[266.774509] add /devices/virtual/net/eth0 (net) If there is a rename and a conflict (using the exact eth0/eth1 example explained above) we get this: Before: old ns | KERNEL[224.316833] remove /devices/virtual/net/eth1 (net) new ns | KERNEL[224.318551] add /devices/virtual/net/eth1 (net) new ns | KERNEL[224.319662] move /devices/virtual/net/eth0 (net) After: old ns | KERNEL[333.033166] remove /devices/virtual/net/eth1 (net) new ns | KERNEL[333.035098] add /devices/virtual/net/eth0 (net) Note that "in flight" rename is only performed when needed. If there is no conflict for old name in the target netns - the rename will be performed separately by dev_change_name(), as if the rename was a different command, and there will still be a MOVE event for the rename: Before: old ns | KERNEL[194.416429] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[194.418809] add /devices/virtual/net/eth0 (net) new ns | KERNEL[194.418869] move /devices/virtual/net/eth0 (net) new ns | KERNEL[194.420866] move /devices/virtual/net/eth1 (net) After: old ns | KERNEL[71.917520] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[71.919155] add /devices/virtual/net/eth0 (net) new ns | KERNEL[71.920729] move /devices/virtual/net/eth1 (net) If deleting the MOVE event breaks some user space we should insert an explicit kobject_uevent(MOVE) after the ADD, like this: @@ -11192,6 +11192,12 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net, kobject_uevent(&dev->dev.kobj, KOBJ_ADD); netdev_adjacent_add_links(dev); + /* User space wants an explicit MOVE event, issue one unless + * dev_change_name() will get called later and issue one. + */ + if (!pat || new_name[0]) + kobject_uevent(&dev->dev.kobj, KOBJ_MOVE); + /* Adapt owner in case owning user namespace of target network * namespace is different from the original one. */ Reported-by: Daniel Gröber <dxld@darkboxed.org> Link: https://lore.kernel.org/all/20231010121003.x3yi6fihecewjy4e@House.clients.dxld.at/Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/all/20231120184140.578375-1-kuba@kernel.org/Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jose Ignacio Tornos Martinez authored
The device is always reset two consecutive times (ax88179_reset is called twice), one from usbnet_probe during the device binding and the other from usbnet_open. Remove the non-necessary reset during the device binding and let the reset operation from open to keep the normal behavior (tested with generic ASIX Electronics Corp. AX88179 Gigabit Ethernet device). Reported-by: Herb Wei <weihao.bj@ieisystem.com> Tested-by: Herb Wei <weihao.bj@ieisystem.com> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Link: https://lore.kernel.org/r/20231120121239.54504-1-jtornosm@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Andrii Nakryiko authored
Yuran Pereira says: ==================== selftests/bpf: Update multiple prog_tests to use ASSERT_ macros Multiple files/programs in `tools/testing/selftests/bpf/prog_tests/` still heavily use the `CHECK` macro, even when better `ASSERT_` alternatives are available. As it was already pointed out by Yonghong Song [1] in the bpf selftests the use of the ASSERT_* series of macros is preferred over the CHECK macro. This patchset replaces the usage of `CHECK(` macros to the equivalent `ASSERT_` family of macros in the following prog_tests: - bind_perm.c - bpf_obj_id.c - bpf_tcp_ca.c - vmlinux.c [1] https://lore.kernel.org/lkml/0a142924-633c-44e6-9a92-2dc019656bf2@linux.dev Changes in v3: - Addressed the following points mentioned by Yonghong Song - Improved `bpf_map_lookup_elem` assertion in bpf_tcp_ca. - Replaced assertion introduced in v2 with one that checks `thread_ret` instead of `pthread_join`. This ensures that `server`'s return value (thread_ret) is the one being checked, as oposed to `pthread_join`'s return value, since the latter one is less likely to fail. Changes in v2: - Fixed pthread_join assertion that broke the previous test Previous version: v2 - https://lore.kernel.org/lkml/GV1PR10MB6563AECF8E94798A1E5B36A4E8B6A@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM v1 - https://lore.kernel.org/lkml/GV1PR10MB6563FCFF1C5DEBE84FEA985FE8B0A@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM ==================== Link: https://lore.kernel.org/r/GV1PR10MB6563BEFEA4269E1DDBC264B1E8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COMSigned-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Yuran Pereira authored
vmlinux.c uses the `CHECK` calls even though the use of ASSERT_ series of macros is preferred in the bpf selftests. This patch replaces all `CHECK` calls for equivalent `ASSERT_` macro calls. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/GV1PR10MB6563ED1023A2A3AEF30BDA5DE8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
-
Yuran Pereira authored
bpf_obj_id uses the `CHECK` calls even though the use of ASSERT_ series of macros is preferred in the bpf selftests. This patch replaces all `CHECK` calls for equivalent `ASSERT_` macro calls. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/GV1PR10MB65639AA3A10B4BBAA79952C7E8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
-