- 22 Nov, 2023 16 commits
-
-
Randy Schacher authored
In preparation to support a new P7 chip which has a lot of similarities with the P5 chip, rename the BNXT_FLAG_CHIP_P5 flag to BNXT_FLAG_CHIP_P5_PLUS. This will make it clear that the flag is for P5 and newer chips. Also, since there are no additional P5 variants in production, rename BNXT_FLAG_CHIP_P5_THOR() to BNXT_FLAG_CHIP_P5() to keep the naming more simple. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Randy Schacher <stuart.schacher@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-14-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Modify the NAPI logic for the new doorbell mechanism on P7 chips. These changes are compatible with the current P5 chips. In the current logic, bnxt_poll_p5() services 1 or more CQs for each MSIX. Each MSIX has an associated NQ and each NQ has 1 or more associated CQs. If any CQ reaches NAPI budget, we'll stay in polling mode and will unconditionally check and service all CQs until we exit polling. We always re-arm all CQs when we exit polling. To be compatible with the new Toggle bit mechanism in P7 chips, we need to modify the logic so that we service and re-arm the CQ only if we receive an NQE notification for work for that CQ. We add a new had_nqe_notify bit to the cp_ring_info structure and it gets set when we see the NQE notification for that CQ anytime during polling. We'll service and re-arm only the CQs with the had_nqe_notify bits set. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-13-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Modify the RX indexing logic for both RX ring and RX aggregation ring just like the TX logic. Change it so that the index increments unbounded and mask it only when needed. Modify the existing RX macros so that the index is not masked. Add new macros RING_RX()/RING_RX_AGG() to mask it only when needed to get the index of rxr->rx_buf_ring[] and rxr->rx_agg_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-12-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Change the TX ring logic so that the index increments unbounded and mask it only when needed. Modify the existing macros so that the index is not masked. Add a new macro RING_TX() to mask it only when needed to get the index of txr->tx_buf_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-11-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
This allows the doorbell related logic to mask the doorbell index to the proper range before writing the doorbell. The current code masks the doorbell index immediately to keep it in the legal ranges for the most part. Subsequent patches will change the logic so that the index increments unbounded and it only gets masked before use. This is preparation work for the new chip that requires an additional Epoch bit in the doorbell that needs to toggle when the index has wrapped around. This patch just adds the basic infrastructure and the logic is largely unchanged. We now replace RING_CMP() with the new DB_RING_IDX() at appropriate places where we mask the completion ring index before writing the doorbell. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-10-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Newer chips starting with 57600 will use this new firmware HWRM call to configure backing store memory. Add this new call if it is supported by the firmware. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-9-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Use the new v2 firmware API if supported by the firmware. We now have the infrastructure to support the v2 API. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-8-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
In bnxt_alloc_ctx_mem(), the logic to set up the context memory entries and to allocate the context memory tables is done repetitively. Add a helper function to simplify the code. The setup of the Fast Path TQM entries relies on some information from the Slow Path TQM entries. Copy the SP_TQM entries to the FP_TQM entries to simplify the logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-7-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
Use the newly added pg_info field in bnxt_ctx_mem_type struct and remove the standalone page info structures in bnxt_ctx_mem_info. This now completes the reorganization of the context memory structures to work better with the new and more flexible firmware interface for newer chips. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-6-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
This will further improve the organization of the bnxt_ctx_mem_info structure by moving the standalone page info structures into the bnxt_ctx_mem_type array. Add the allocation and free logic first and the next patch will migrate to use the new infrastructure. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-5-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
The current code uses a flat bnxt_ctx_mem_info structure to store 8 types of context memory for the NIC. All the context memory types are very similar and have similar parameters. They can all share a common structure to improve the organization. Also, new firmware interface will provide a new API to retrieve each type of context memory by calling the API repeatedly. This patch reorganizes the bnxt_ctx_mem_info structure to fit better with the new firmware interface. It will also work with the legacy firmware interface. The flat fields in bnxt_ctx_mem_info are replaced by the bnxt_ctx_mem_type array. The bnxt_mem_init array info will no longer be needed. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-4-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
We always free bp->ctx right after calling bnxt_free_ctx_mem(), so just free it at the end of that function to make things simpler. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-3-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
bnxt_alloc_ctx_mem() calls bnxt_hwrm_func_backing_store_qcaps() to allocate the memory for bp->ctx. Initialize bp->ctx with the allocated memory and let the caller free it during unwind. The unwind logic is already there, we just need to always set bp->ctx to the allocated memory so the caller will always free it. This simplifies the logic and makes it easier to expand on the backing store logic. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-2-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Jakub Kicinski says: ==================== net: page_pool: plit the page_pool_params into fast and slow Small refactoring in prep for adding more page pool params which won't be needed on the fast path. v1: https://lore.kernel.org/all/20231024160220.3973311-1-kuba@kernel.org/ RFC: https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/ ==================== Link: https://lore.kernel.org/r/20231121000048.789613-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
To fully benefit from previous commit add one byte of state in the first cache line recording if we need to look at the slow part. The packing isn't all that impressive right now, we create a 7B hole. I'm expecting Olek's rework will reshuffle this, anyway. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://lore.kernel.org/r/20231121000048.789613-3-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
struct page_pool is rather performance critical and we use 16B of the first cache line to store 2 pointers used only by test code. Future patches will add more informational (non-fast path) attributes. It's convenient for the user of the API to not have to worry which fields are fast and which are slow path. Use struct groups to split the params into the two categories internally. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20231121000048.789613-2-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 21 Nov, 2023 24 commits
-
-
Jakub Kicinski authored
Petr Machata says: ==================== mlxsw: Preparations for support of CFF flood mode PGT is an in-HW table that maps addresses to sets of ports. Then when some HW process needs a set of ports as an argument, instead of embedding the actual set in the dynamic configuration, what gets configured is the address referencing the set. The HW then works with the appropriate PGT entry. Among other allocations, the PGT currently contains two large blocks for bridge flooding: one for 802.1q and one for 802.1d. Within each of these blocks are three tables, for unknown-unicast, multicast and broadcast flooding: . . . | 802.1q | 802.1d | . . . | UC | MC | BC | UC | MC | BC | \______ _____/ \_____ ______/ v v FID flood vectors Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an 802.1q bridge) uses three flood vectors spread across a fairly large region of PGT. This way of organizing the flood table (called "controlled") is not very flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one would need to completely rewrite the bridge PGT blocks, or resort to hacks such as storing individual MC flood vectors into unused part of the bridge table. In order to address these shortcomings, Spectrum-2 and above support what is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode, each FID has a little table of its own, with three entries adjacent to each other, one for unknown-UC, one for MC, one for BC. This allows for a much more fine-grained approach to PGT management, where bits of it are allocated on demand. . . . | FID | FID | FID | FID | FID | . . . |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B| \_____________ _____________/ v FID flood vectors Besides the FID table organization, the CFF flood mode also impacts Router Subport (RSP) table. This table contains flood vectors for rFIDs, which are FIDs that reference front panel ports or LAGs. The RSP table contains two entries per front panel port and LAG, one for unknown-UC traffic, and one for everything else. Currently, the FW allocates and manages the table in its own part of PGT. rFIDs are marked with flood_rsp bit and managed specially. In CFF mode, rFIDs are managed as all other FIDs. The driver therefore has to allocate and maintain the flood vectors. Like with bridge FIDs, this is more work, but increases flexibility of the system. The FW currently supports both the controlled and CFF flood modes. To shed complexity, in the future it should only support CFF flood mode. Hence this patchset, which is the first in series of two to add CFF flood mode support to mlxsw. There are FW versions out there that do not support CFF flood mode, and on Spectrum-1 in particular, there is no plan to support it at all. mlxsw will therefore have to support both controlled flood mode as well as CFF. Another aspect is that at least on Spectrum-1, there are FW versions out there that claim to support CFF flood mode, but then reject or ignore configurations enabling the same. The driver thus has to have a say in whether an attempt to configure CFF flood mode should even be made. Much like with the LAG mode, the feature is therefore expressed in terms of "does the driver prefer CFF flood mode?", and "what flood mode the PCI module managed to configure the FW with". This gives to the driver a chance to determine whether CFF flood mode configuration should be attempted. In this patchset, we lay the ground with new definitions, registers and their fields, and some minor code shaping. The next patchset will be more focused on introducing necessary abstractions and implementation. - Patches #1 and #2 add CFF-related items to the command interface. - Patch #3 adds a new resource, for maximum number of flood profiles supported. (A flood profile is a mapping between traffic type and offset in the per-FID flood vector table.) - Patches #4 to #8 adjust reg.h. The SFFP register is added, which is used for configuring the abovementioned traffic-type-to-offset mapping. The SFMR, register, which serves for FID configuration, is extended with fields specific to CFF mode. And other minor adjustments. - Patches #9 and #10 add the plumbing for CFF mode: a way to request that CFF flood mode be configured, and a way to query the flood mode that was actually configured. - Patch #11 removes dead code. - Patches #12 and #13 add helpers that the next patchset will make use of. Patch #14 moves RIF setup ahead so that FID code can make use of it. ==================== Link: https://lore.kernel.org/r/cover.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
For subport RIFs, the setup initializes, among other things, RIF port and LAG numbers. Those are important to determine where in the PGT the RIF FID will be stored. Therefore, call the RIF setup before fid_get. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/f24d8cad7e4748b8e8e0e16894ca6a20704dea32.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
In the CFF flood mode, responsibility for management of the PGT entries for rFIDs is moved from FW to the driver. All rFIDs are based off either a front panel port, or a LAG port. The flood vectors for port-based rFIDs enable just the port itself, the ones for LAG-based rFIDs enable all member ports of the LAG in question. Since all rFIDs based off the same port have the same flood vector, and similarly for LAG-based rFIDs, the flood entries are shared. The PGT address of the flood vector is therefore determined based on the port (or LAG) number of the RIF connected with the rFID. Add a helper to determine subport number given a RIF, to be used in these calculations. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d7ab43cf5b021f785f363f236e4b6780d10eea93.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
Both mlxsw_sp_fid_op() and mlxsw_sp_fid_edit_op() pack the core of SFMR the same way. Extract the common code into a helper and call that. Extract out of that a wrapper that just calls mlxsw_reg_sfmr_pack(), because it will be useful for the dummy family later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/31f32b4d767183f6cb197148d0792feab2efadba.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
The caller already only calls mlxsw_sp_fid_flood_tables_init() and mlxsw_sp_fid_flood_tables_fini() if (fid_family->flood_tables). There is no configuration where the pointer is non-NULL, but the number of tables is zero. So drop the conditions. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/897c6841bc756ac632b797bf67ac83c6a66ba359.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
There are FW versions out there that do not support CFF flood mode, and on Spectrum-1 in particular, there is no plan to support it at all. mlxsw will therefore have to support both controlled flood mode as well as CFF. There are also FW versions out there that claim to support CFF flood mode, but then reject or ignore configurations enabling the same. The driver thus has to have a say in whether an attempt to configure CFF flood mode should even be made, and what to use as a fallback. Hence express the feature in terms of "does the driver prefer CFF flood mode?", and "what flood mode the PCI module managed to configure the FW with". This gives to the driver a chance to determine whether CFF flood mode configuration should be attempted. The latter bit was added in previous patches. In this patch, add the bit that allows the driver to determine whether CFF enablement should be attempted, and the enablement code itself. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/41640a0ee58e0a9538f820f7b601a0e35f6449e4.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
CFF mode, for Compressed FID Flooding, is a way of organizing flood vectors in the PGT table. The bus module determines whether CFF is supported, can configure flood mode to CFF if it is, and knows what flood mode has been configured. Therefore add a bus callback to determine the configured flood mode. Also add to core an API to query it. Since after this patch, we rely on mlxsw_pci->flood_mode being set, it becomes a coding error if a driver invokes this function with a set of fields that misses the initialization. Warn and bail out in that case. The CFF mode is not used as of this patch. The code to actually use it will be added later. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/889d58759dd40f5037f2206b9fc4a78a9240da80.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
Add the field cff_mid_base, which specifies at which point in PGT the per-FID flood table is stored. Add cff_prf_id, the profile ID, which determines on which row of the flood table a flood vector can be found for a given traffic type. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/3ad7ae38cf6534bedcd876f16090d109a814b3e3.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
In CFF mode, it is necessary to set a different set of SFMR fields. Leave in mlxsw_reg_sfmr_pack() only the common bits, and move the parts relevant to controlled flood mode directly to the call site. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/6f29639ebc3ca0722272e6c644ca910096469413.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
The MLXSW_REG_ZERO at the beginning of the function wipes the whole payload. There's no need to set vtfp and vv to false explicitly. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/04a51ea7cf31eea0ef7707311d8e864e2d9ef307.1700503644.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
Some existing fields and the whole register of SFGC are reserved in CFF mode. Backport the reservation note to these fields. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/e1d5977a8cb778227e4ea2fd1515529957ce5de7.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
The SFFP register populates the fid flooding profile tables used for the NVE flooding and Compressed-FID Flooding (CFF). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/ca42eb67763bd0c7cf035afc62ef73632f3f61a6.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
max_cap_nve_flood_prf describes maximum number of NVE flooding profiles. The same value then applies for flooding profiles for flooding in CFF mode. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/064a2e013d879e5f5494167a6c120c4bb85a2204.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
PGT, a port-group table is an in-HW block of specialized memory that holds sets of ports. Allocated within the PGT are series of flood tables that describe to which ports traffic of various types (unknown UC, BC, MC) should be flooded from which FID. The hitherto-used layout of these flood tables is being replaced with a more flexible scheme, called compressed FID flooding (CFF). CFF can be configured through CONFIG_PROFILE.flood_mode. In this patch, add MLXSW_CMD_MBOX_CONFIG_PROFILE_FLOOD_MODE_CFF, the value to use to enable the CFF mode. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/fc2e063742856492f8f22b0b87abf431ea6d53d0.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
PGT, a port-group table is an in-HW block of specialized memory that holds sets of ports. Allocated within the PGT are series of flood tables that describe to which ports traffic of various types (unknown UC, BC, MC) should be flooded from which FID. The hitherto-used layout of these flood tables is being replaced with a more flexible scheme, called compressed FID flooding (CFF). CFF can be configured through CONFIG_PROFILE.flood_mode. cff_support determines whether CONFIG_PROFILE.flood_mode can be set to CFF. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/af727d0e1095e30fa45c7e60404637cdc491aeec.1700503643.git.petrm@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Networking supports changing netdevice's netns and name at the same time. This allows avoiding name conflicts and having to rename the interface in multiple steps. E.g. netns1={eth0, eth1}, netns2={eth1} - we want to move netns1:eth1 to netns2 and call it eth0 there. If we can't rename "in flight" we'd need to (1) rename eth1 -> $tmp, (2) change netns, (3) rename $tmp -> eth0. To rename the underlying struct device we have to call device_rename(). The rename()'s MOVE event, however, doesn't "belong" to either the old or the new namespace. If there are conflicts on both sides it's actually impossible to issue a real MOVE (old name -> new name) without confusing user space. And Daniel reports that such confusions do in fact happen for systemd, in real life. Since we already issue explicit REMOVE and ADD events manually - suppress the MOVE event completely. Move the ADD after the rename, so that the REMOVE uses the old name, and the ADD the new one. If there is no rename this changes the picture as follows: Before: old ns | KERNEL[213.399289] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[213.401302] add /devices/virtual/net/eth0 (net) new ns | KERNEL[213.401397] move /devices/virtual/net/eth0 (net) After: old ns | KERNEL[266.774257] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[266.774509] add /devices/virtual/net/eth0 (net) If there is a rename and a conflict (using the exact eth0/eth1 example explained above) we get this: Before: old ns | KERNEL[224.316833] remove /devices/virtual/net/eth1 (net) new ns | KERNEL[224.318551] add /devices/virtual/net/eth1 (net) new ns | KERNEL[224.319662] move /devices/virtual/net/eth0 (net) After: old ns | KERNEL[333.033166] remove /devices/virtual/net/eth1 (net) new ns | KERNEL[333.035098] add /devices/virtual/net/eth0 (net) Note that "in flight" rename is only performed when needed. If there is no conflict for old name in the target netns - the rename will be performed separately by dev_change_name(), as if the rename was a different command, and there will still be a MOVE event for the rename: Before: old ns | KERNEL[194.416429] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[194.418809] add /devices/virtual/net/eth0 (net) new ns | KERNEL[194.418869] move /devices/virtual/net/eth0 (net) new ns | KERNEL[194.420866] move /devices/virtual/net/eth1 (net) After: old ns | KERNEL[71.917520] remove /devices/virtual/net/eth0 (net) new ns | KERNEL[71.919155] add /devices/virtual/net/eth0 (net) new ns | KERNEL[71.920729] move /devices/virtual/net/eth1 (net) If deleting the MOVE event breaks some user space we should insert an explicit kobject_uevent(MOVE) after the ADD, like this: @@ -11192,6 +11192,12 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net, kobject_uevent(&dev->dev.kobj, KOBJ_ADD); netdev_adjacent_add_links(dev); + /* User space wants an explicit MOVE event, issue one unless + * dev_change_name() will get called later and issue one. + */ + if (!pat || new_name[0]) + kobject_uevent(&dev->dev.kobj, KOBJ_MOVE); + /* Adapt owner in case owning user namespace of target network * namespace is different from the original one. */ Reported-by: Daniel Gröber <dxld@darkboxed.org> Link: https://lore.kernel.org/all/20231010121003.x3yi6fihecewjy4e@House.clients.dxld.at/Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/all/20231120184140.578375-1-kuba@kernel.org/Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jose Ignacio Tornos Martinez authored
The device is always reset two consecutive times (ax88179_reset is called twice), one from usbnet_probe during the device binding and the other from usbnet_open. Remove the non-necessary reset during the device binding and let the reset operation from open to keep the normal behavior (tested with generic ASIX Electronics Corp. AX88179 Gigabit Ethernet device). Reported-by: Herb Wei <weihao.bj@ieisystem.com> Tested-by: Herb Wei <weihao.bj@ieisystem.com> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Link: https://lore.kernel.org/r/20231120121239.54504-1-jtornosm@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Russell King (Oracle) authored
Use for_each_set_bit() rather than open coding the for() test_bit() loop. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/E1r4p15-00Cpxe-C7@rmk-PC.armlinux.org.ukSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Baruch Siach authored
The code to show extended descriptor is identical to normal one. Consolidate the code to remove duplication. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Link: https://lore.kernel.org/r/a2a5c5ce9338bdea60ec71d7eeb00fe757281557.1700372381.git.baruch@tkos.co.ilSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Baruch Siach authored
One newline per line should be enough. Reduce the verbosity of descriptors dump. Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Link: https://lore.kernel.org/r/444f3b1dd409fdb14ed2a1ae7679a86b110dadcd.1700372381.git.baruch@tkos.co.ilSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Zhengchao Shao authored
If failed to allocate "tags" or could not find the final upper device from start_dev's upper list in bond_verify_device_path(), only the loopback detection of the current upper device should be affected, and the system is no need to be panic. So return -ENOMEM in alb_upper_dev_walk to stop walking, print some warn information when failed to allocate memory for vlan tags in bond_verify_device_path. I also think that the following function calls netdev_walk_all_upper_dev_rcu ---->>>alb_upper_dev_walk ---------->>>bond_verify_device_path From this way, "end device" can eventually be obtained from "start device" in bond_verify_device_path, IS_ERR(tags) could be instead of IS_ERR_OR_NULL(tags) in alb_upper_dev_walk. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/20231118081653.1481260-1-shaozhengchao@huawei.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Shinas Rasheed authored
Add PCI Endpoint NIC support for Octeon CN10K devices. CN10K devices are part of Octeon 10 family products with similar PCI NIC characteristics. These include: - CN10KA - CNF10KA - CNF10KB - CN10KB Update supported device list in Documentation Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://lore.kernel.org/r/20231117103817.2468176-1-srasheed@marvell.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lorenzo Bianconi authored
Introduce WED offloading support for boards with more than 4GB of memory. Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/1c7efdf5d384ea7af3c0209723e40b2ee0f956bf.1700239272.git.lorenzo@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Pedro Tammela says: ==================== selftests: tc-testing: more updates to tdc Address the issues making tdc timeout on downstream CIs like lkp and tuxsuite. ==================== Link: https://lore.kernel.org/r/20231117171208.2066136-1-pctammela@mojatatu.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-