- 28 Jul, 2021 6 commits
-
-
Peilin Ye authored
Currently, when doing rate limiting using the tc-police(8) action, the easiest way is to simply drop the packets which exceed or conform the configured bandwidth limit. Add a new option to tc-skbmod(8), so that users may use the ECN [1] extension to explicitly inform the receiver about the congestion instead of dropping packets "on the floor". The 2 least significant bits of the Traffic Class field in IPv4 and IPv6 headers are used to represent different ECN states [2]: 0b00: "Non ECN-Capable Transport", Non-ECT 0b10: "ECN Capable Transport", ECT(0) 0b01: "ECN Capable Transport", ECT(1) 0b11: "Congestion Encountered", CE As an example: $ tc filter add dev eth0 parent 1: protocol ip prio 10 \ matchall action skbmod ecn Doing the above marks all ECT(0) and ECT(1) packets as CE. It does NOT affect Non-ECT or non-IP packets. In the tc-police scenario mentioned above, users may pipe a tc-police action and a tc-skbmod "ecn" action together to achieve ECN-based rate limiting. For TCP connections, upon receiving a CE packet, the receiver will respond with an ECE packet, asking the sender to reduce their congestion window. However ECN also works with other L4 protocols e.g. DCCP and SCTP [2], and our implementation does not touch or care about L4 headers. The updated tc-skbmod SYNOPSIS looks like the following: tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ... Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod command. Trying to use more than one of them at a time is considered undefined behavior; pipe multiple tc-skbmod commands together instead. "set" and "swap" only affect Ethernet packets, while "ecn" only affects IPv{4,6} packets. It is also worth mentioning that, in theory, the same effect could be achieved by piping a "police" action and a "bpf" action using the bpf_skb_ecn_set_ce() helper, but this requires eBPF programming from the user, thus impractical. Depends on patch "net/sched: act_skbmod: Skip non-Ethernet packets". [1] https://datatracker.ietf.org/doc/html/rfc3168 [2] https://en.wikipedia.org/wiki/Explicit_Congestion_NotificationReviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yang Yingliang authored
If nfp_tunnel_add_ipv6_off() fails, it should return error code in nfp_fl_ct_add_offload(). Fixes: 5a2b9304 ("nfp: flower-ct: compile match sections of flow_payload") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Leon Romanovsky says: ==================== Remove duplicated devlink registration check Changelog: v1: * Added two new patches that remove registration field from mlx5 and ti drivers. v0: https://lore.kernel.org/lkml/ed7bbb1e4c51dd58e6035a058e93d16f883b09ce.1627215829.git.leonro@nvidia.com -------------------------------------------------------------------- Both registered flag and devlink pointer are set at the same time and indicate the same thing - devlink/devlink_port are ready. Instead of checking ->registered use devlink pointer as an indication. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Leon Romanovsky authored
Both registered flag and devlink pointer are set at the same time and indicate the same thing - devlink/devlink_port are ready. Instead of checking ->registered use devlink pointer as an indication. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Leon Romanovsky authored
Devlink is an integral part of mlx5 driver and all flows ensure that devlink_*_register() will success. That makes the ->registered check an obsolete. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Leon Romanovsky authored
The commit that introduced devlink support released devlink resources in wrong order, that made an unwind flow to be asymmetrical. In addition, the am65-cpsw-nuss used internal to devlink core field - registered. In order to fix the unwind flow and remove such access to the registered field, rewrite the code to call devlink_port_unregister only on registered ports. Fixes: 58356eb3 ("net: ti: am65-cpsw-nuss: Add devlink support") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Jul, 2021 34 commits
-
-
David S. Miller authored
Alex Elder says: ==================== net: ipa: add clock references This series continues preparation for implementing runtime power management for IPA. We need to ensure that the IPA core clock and interconnects are operational whenever IPA hardware is accessed. And in particular this means that any external entry point that can lead to accessing IPA hardware must guarantee the hardware is "up" when it is accessed. The first four patches in this series take IPA clock references when needed by such external entry points, dropping those references in those same functions when they are no longer required. The last patch is a bit different, though it too prepares for enabling runtime power management. It avoids suspending/resuming endpoints if setup is not complete. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Until we complete the setup stage of initialization, GSI is not initialized and therefore endpoints aren't usable. So avoid suspending endpoints during system suspend unless setup is complete. Clear the setup_complete flag at the top of ipa_teardown() to reflect the fact that things are no longer in setup state. Get rid of a misplaced (and superfluous) comment. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The IPA network device can be opened at any time, and an opened network device can be stopped any time. Both of these callback functions require access to the hardware, and therefore they need the IPA clock to be operational. Take an IPA clock reference in both the ->open and ->stop callback functions, dropping the reference when they are done accessing hardware. The ->start_xmit callback requires a little different handling, and that will be added separately. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The remoteproc SSR callback function for the modem requires hardware access when handling a modem crash or shutdown. Take and later release an IPA clock reference in ipa_modem_crashed(), to ensure the hardware is operational. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Two places call ipa_setup(). The first, ipa_probe(), holds an IPA clock reference when calling ipa_setup() (if the AP is responsible for IPA firmware loading). But if the modem is loading IPA firmware, ipa_smp2p_modem_setup_ready_isr() calls ipa_setup() after the modem has signaled the hardware is ready. This can happen at any time, and there is no guarantee the hardware is active. Have ipa_smp2p_modem_setup() take an IPA clock reference before it calls ipa_setup(), and release it once setup is complete. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Any entry point that leads to IPA hardware access must ensure the hardware is operational (clocked). Currently we ensure this by taking an extra clock reference during setup that is not released until we receive a system suspend request. But this extra reference will soon go away. When the platform driver ->probe function is called, we first need hardware access in ipa_config(). Although ipa_config() takes an IPA clock reference, it the special reference taken to prevent suspending the hardware. Have ipa_probe() take a reference before calling ipa_config(), so that the "no-suspend" reference can eventually go away. Drop this reference before ipa_probe() returns. Similarly, the driver ->remove function can be called at any time. Take an IPA clock reference at the beginning of that function, and drop it again after the deconfig stage has completed (at which point hardware access is no longer needed). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Alex Elder says: ==================== net: ipa: IPA interrupt cleanup The first patch in this series makes all IPA interrupt handling be done in a threaded context. The remaining ones refactor some code to simplify that threaded handler function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Now that ipa_isr_thread() is a simple wrapper that gets a clock reference around ipa_interrupt_process_all(), get rid of the called function and just open-code it in ipa_isr_thread(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The pending IPA interrupts are checked by ipa_isr_thread(), and interrupts are processed only if an enabled interrupt has a condition pending. But ipa_interrupt_process_all() now makes the same check, so the one in ipa_isr_thread() can just be skipped. Also in ipa_isr_thread(), any interrupt conditions pending which are not enabled are cleared. Here too, ipa_interrupt_process_all() now clears such excess interrupt conditions, so ipa_isr_thread() doesn't have to. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
We ignore any IPA interrupt that has no handler. If any interrupt conditions without a handler exist when an IPA interrupt occurs, clear those conditions. Add a debug message to report which ones are being cleared. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
When the IPA interrupt handler runs, the IPA core clock must already be operational, and the interconnect providing access by the AP to IPA config space must be enabled too. Currently we ensure this by taking a top-level "stay awake" IPA clock reference, but that will soon go away. In preparation for that, move all handling for the IPA IRQ into the thread function. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Skripkin authored
Syzbot reported warning in netlbl_cipsov4_add(). The problem was in too big doi_def->map.std->lvl.local_size passed to kcalloc(). Since this value comes from userpace there is no need to warn if value is not correct. The same problem may occur with other kcalloc() calls in this function, so, I've added __GFP_NOWARN flag to all kcalloc() calls there. Reported-and-tested-by: syzbot+cdd51ee2e6b0b2e18c0d@syzkaller.appspotmail.com Fixes: 96cb8e33 ("[NetLabel]: CIPSOv4 and Unlabeled packet integration") Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Shannon Nelson says: ==================== ionic: driver updates 27-July-2021 This is a collection of small driver updates for adding a couple of small features and for a bit of code cleaning. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
Prefix the log output with the function string as in other debug messages. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
If there's only one queue, there is no need to enable the rxhashing. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
There are a few things that we can't safely do when the fw is resetting, as the driver may be in the middle of rebuilding queue structures. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
We don't use these fields, so remove them from the definition. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
Add the new VF to our internal count before we start configuring it. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
Based on Alex's review notes on [1], we don't need to write to the buf_info elements as often, and can tighten up how they are used. Also, use prefetchw() to warm up the page struct for a later get_page(). [1] https://lore.kernel.org/netdev/CAKgT0UfyjoAN7LTnq0NMZfXRv4v7iTCPyAb9pVr3qWMhop_BVw@mail.gmail.com/Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
Initialize err to 0 instead of ENOMEM, and specifically set err to ENOMEM in the devm_kcalloc() failure cases. Also, add an error message to the end of reconfig. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
Print the version of the DSC firmware seen when we do a fresh ident check. Because the FW can be updated by the external orchestration system, this helps us track that FW has been updated on the DSC. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
The top 4 bits of the fw_status in dev_info_regs is reserved for the status generation. This generation number is an arbitrary value defined when firmware starts up. If the FW is killed/crashed/stopped and then restarted, it will create a different generation number. With this mechanism, the host driver can detect that the FW has crashed and restarted, and the driver can then take steps to re-initialize its connection. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
When running in a small kdump kernel, we can play nice and minimize our resource use to help make sure that kdump is successful in its mission. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Arnd Bergmann says: ==================== ndo_ioctl rework This series is a follow-up to the series for removing compat_alloc_user_space() and copy_in_user() that has now been merged. I wanted to be sure I address all the ways that 'struct ifreq' is used in device drivers through .ndo_do_ioctl, originally to prove that my approach of changing the struct definition was correct, but then I discarded that approach and went on anyway. Roughly, the contents here are: - split out all the users of SIOCDEVPRIVATE ioctls into a separate ndo_siocdevprivate callback, to better see what gets used where - fix compat handling for those drivers that pass data directly inside of 'ifreq' rather than using an indirect ifr_data pointer - remove unreachable code in ndo_ioctl handlers that relies on command codes we never pass into that, in particular for wireless drivers - split out the ethernet specific ioctls into yet another ndo_eth_ioctl callback, as these are by far the most common use of ndo_do_ioctl today. I considered splitting them further into MII and timestamp controls, but went with the simpler change for now. - split out bonding and wandev ioctls into separate helpers - rework the bridge handling with a separate callback At this point, only a few oddball things remain in ndo_do_ioctl: appletalk and ieee802154 pass down SIOCSIFADDR/SIOCGIFADDR and some wireless drivers have completely dead code. I have thoroughly compile tested this on randconfig builds, but not done any notable runtime testing, so please review. All of it is also available as part of a larger branch at https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git \ compat-alloc-user-space-12 Changes since v2: - rebase to net-next - fix qeth regression - Cc driver maintainers for each patch and in cover letter Changes since v1: - rebase to linux-5.14-rc2 - add conversion for ndo_siowandev, bridge and bonding drivers - leave broken wifi drivers untouched for now Link: https://lore.kernel.org/netdev/20201106221743.3271965-14-arnd@kernel.org/ ====================
-
Arnd Bergmann authored
All other user triggered operations are gone from ndo_ioctl, so move the SIOCBOND family into a custom operation as well. The .ndo_ioctl() helper is no longer called by the dev_ioctl.c code now, but there are still a few definitions in obsolete wireless drivers as well as the appletalk and ieee802154 layers to call SIOCSIFADDR/SIOCGIFADDR helpers from inside the kernel. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
Working towards obsoleting the .ndo_do_ioctl operation entirely, stop passing the SIOCBRADDIF/SIOCBRDELIF device ioctl commands into this callback. My first attempt was to add another ndo_siocbr() callback, but as there is only a single driver that takes these commands and there is already a hook mechanism to call directly into this driver, extend this hook instead, and use it for both the deviceless and the device specific ioctl commands. Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: bridge@lists.linux-foundation.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
Some drivers that use SIOCDEVPRIVATE ioctl commands modify the ifreq structure and expect it to be passed back to user space, which has never really happened for compat mode because the calling these drivers through ndo_do_ioctl requires overwriting the ifr_data pointer. Now that all drivers are converted to ndo_siocdevprivate, change it to handle this correctly in both compat and native mode. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
In order to further reduce the scope of ndo_do_ioctl(), move out the SIOCWANDEV handling into a new network device operation function. Adjust the prototype to only pass the if_settings sub-structure in place of the ifreq, and remove the redundant 'cmd' argument in the process. Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: "Jan \"Yenya\" Kasprzak" <kas@fi.muni.cz> Cc: Kevin Curtis <kevin.curtis@farsite.co.uk> Cc: Zhao Qiang <qiang.zhao@nxp.com> Cc: Martin Schiller <ms@dev.tdt.de> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: linux-x25@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The compat handlers for SIOCDEVPRIVATE are incorrect for any driver that passes data as part of struct ifreq rather than as an ifr_data pointer, or that passes data back this way, since the compat_ifr_data_ioctl() helper overwrites the ifr_data pointer and does not copy anything back out. Since all drivers using devprivate commands are now converted to the new .ndo_siocdevprivate callback, fix this by adding the missing piece and passing the pointer separately the whole way. This further unifies the native and compat logic for socket ioctls, as the new code now passes the correct pointer as well as the correct data for both native and compat ioctls. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The ndo_do_ioctl callback is never called with the COSAIO* commands, so this is never used. Call the hdlc_ioctl function directly instead. Any user space code that relied on this function working as intended has never worked in a mainline kernel since before linux-1.0. Cc: "Jan \"Yenya\" Kasprzak" <kas@fi.muni.cz> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The wan drivers each support some custom SIOCDEVPRIVATE ioctls, plus the common SIOCWANDEV command. Split these so the ioctl callback only deals with SIOCWANDEV and the rest is handled by ndo_siocdevprivate. It might make sense to also split out SIOCWANDEV into a separate callback in order to eventually remove ndo_do_ioctl entirely. Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: Kevin Curtis <kevin.curtis@farsite.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
ppp has a custom statistics interface using SIOCDEVPRIVATE ioctl commands that works correctly in compat mode. Convert it to use ndo_siocdevprivate as a cleanup. Cc: Paul Mackerras <paulus@samba.org> Cc: linux-ppp@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The private sb1000 ioctl commands all work correctly in compat mode. Change the to ndo_siocdevprivate as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-