- 19 Jan, 2018 40 commits
-
-
David S. Miller authored
Alexander Aring says: ==================== net: sched: cls: add extack support this patch adds extack support for TC classifier subsystem. The first patch fixes some code style issues for this patch series pointed out by checkpatch. The other patches until the last one prepares extack handling for the TC classifier subsystem and handle generic extack errors. The last patch is an example for u32 classifier to add extack support inside the callbacks delete and change. There exists a init callback as well, but most classifier implementation run a kalloc() once to allocate something. Not necessary _yet_ to add extack support now. - Alex Cc: David Ahern <dsahern@gmail.com> changes since v3: - fix accidentally move of config option mismatch message in PATCH 2/8 correct one is 4/8, detected by kbuildbot (Thank you) - Removed patch "net: sched: cls: add extack support for tc_setup_cb_call" PATCH 7/8 in version v2 as suggested by Jakub Kicinski (Thank you) - changed NL_SET_ERR_MSG to NL_SET_ERR_MSG_MOD as suggested by Jakub Kicinski in u32 cls (Thank You) - Removed text from cover letter that I was waiting for Jiri's Patches as detected by Jamal Hadi Salim (Thank you). changes since v2: - rebased on Jiri's patches (Thank you) - several spelling fixes pointed out by Cong Wang (Thank you) - several spelling fixes pointed out by David Ahern (Thank you) - use David Ahern recommendation if config option is mismatch, but combine it with Cong Wang recommendation to put config name into it (Thank you) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch adds extack support for the u32 classifier as example for delete and init callback. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch adds extack handling for the tcf_change_indev function which is common used by TC classifier implementations. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch adds extack support for classifier delete callback api. This prepares to handle extack support inside each specific classifier implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
The tcf_exts_validate function calls the act api change callback. For preparing extack support for act api, this patch adds the extack as parameter for this function which is common used in cls implementations. Furthermore the tcf_exts_validate will call action init callback which prepares the TC action subsystem for extack support. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch adds extack support for classifier change callback api. This prepares to handle extack support inside each specific classifier implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch adds extack support for generic cls handling. The extack will be set deeper to each called function which is not part of netdev core api. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch changes some code style issues pointed out by checkpatch inside the TC cls subsystem. Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuval Mintz authored
During initialization the driver checks whether the flashed FW image suits its requirements by checking that it's sufficiently new. However, there's only a weak backward compatibility scheme that is actually guaranteed by the FW, so driver must also upper bound the version to prevent compatibility issues between current driver and some possible future fw. Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== nfp: devlink, capabilities extensions and updates This series starts with an improvement to the usability of the device memory accessors (CPP transactions). Next few patches are devoted to fixing the devlink locking. After recent patches for mlxsw the locking scheme of devlink ops has to be reworked. Following patches improve NFP code dealing with "representors", and expands the error message printed when driver has no support for loaded FW. Second part of the series is focused on vNIC capabilities read from vNIC control memory (often referred to as "BAR0" for historical reasons). TLV capability format is established and immediately made use of. The next patches rework parsing of features for control vNIC which allows apps to mask out features they don't want enabled. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
BPF firmware currently exposes IRQ moderation capability. The driver will make use of it by default, inserting 50 usec delay to every control message exchange. This cuts the number of messages per second we can exchange by almost half. None of the other capabilities make much sense for BPF control vNIC, either. Disable them all. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Most vNIC capabilities are netdev related. It makes no sense to initialize them and waste FW resources. Some are even counter-productive, like IRQ moderation, which will slow down exchange of control messages. Add to nfp_app a mask of enabled control vNIC capabilities for apps to use. Make flower and BPF enable all capabilities for now. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
nfp_net_init() is a little long and we are about to add more code to reading capabilties. Move the capability reading, parsing and validating out. Only actual initialization will stay in nfp_net_init(). No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Allow specifying alternative vNIC mailbox location in TLV caps. This way we can size the mailbox to the needs and not necessarily waste 512B of ctrl memory space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
PCIe island clock frequency is used when converting coalescing parameters from usecs to NFP timestamps. Most chips don't run at 1200MHz, allow FW to provide us with the real frequency. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
NFP is entirely programmable, including the PCI data interface. Using a fixed control BAR layout certainly makes implementations easier, but require careful considerations when space is allocated. Once BAR area is allocated to one feature nothing else can use it. Allocating space statically also requires it to be sized upfront, which leads to either unnecessary limitation or wastage. We currently have a 32bit capability word defined which tells drivers which application FW features are supported. Most of the bits are exhausted. The same bits are also reused for enabling specific features. Bulk of capabilities don't have a need for an enable bit, however, leading to confusion and wastage. TLVs seems like a better fit for expressing capabilities of applications running on programmable hardware. This patch leaves the front of the BAR as is, and declares a TLV capability start at offset 0x58. Most of the space up to 0x0d90 is already allocated, but the used space can be wrapped with RESERVED TLVs. E.g.: Address Type Length 0x0058 RESERVED 0xe00 /* Wrap basic structures */ 0x0e5c FEATURE_A 0x004 0x0e64 FEATURE_B 0x004 0x0e6c RESERVED 0x990 /* Wrap qeueue stats */ 0x1800 FEATURE_C 0x100 Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
When driver app matching loaded FW is not found users are faced with: nfp: failed to find app with ID 0x%02x This message does not properly explain that matching driver code is either not built into the driver or the driver is too old. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Representors are grouped in sets by type. Currently the whole sets are under RCU protection, but individual representor pointers are not. This causes some inconveniences when representors have to be destroyed, because we have to allocate new sets to remove any representors. Protect the individual pointers with RCU. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
The write side of repr tables is always done under pf->lock. Add a helper to dereference repr table pointers under protection of that lock. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Devlink used to have two global locks: devlink lock and port lock, our lock ordering looked like this: devlink lock -> driver's pf->lock -> devlink port lock After recent changes port lock was replaced with per-instance lock. Unfortunately, new per-instance lock is taken on most operations now. This means we can only grab the pf->lock from the port split/unsplit ops. Lock ordering looks like this: devlink lock -> driver's pf->lock -> devlink instance lock Since we can't take pf->lock from most devlink ops, make sure nfp_apps are prepared to service them as soon as devlink is registered. Locking the pf must be pushed down after nfp_app_init() callback. The init order looks like this: nfp_app_init devlink_register nfp_app_start netdev/port_register As soon as app_init is done nfp_apps must be ready to service devlink-related callbacks. apps can only register their own devlink objects from nfp_app_start. Fixes: 2406e7e5 ("devlink: Add per devlink instance lock") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
NFP app is currently shut down as soon as all the vNICs are gone. This means we can't depend on the app existing throughout the lifetime of the device. Free the app only from PCI remove path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Currently the helpers for accessing 4 or 8 byte values over the CPP bus return the length of IO on success. If the IO was short caller has to deal with error handling. The short IO for 4/8B values is completely impractical. Make the helpers return an error if full access was not possible. Fix the few places which are actually dealing with errors correctly, most call sites already only deal with negative return codes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Yuchung Cheng says: ==================== tcp: do not use RTT from delayed ACKs for min-RTT This patch set prevents TCP sender from using RTT samples from (suspected) delayed ACKs as the minimum RTT, to avoid unbounded over-estimation of the network path delay. This issue is common when a connection has extended periods of one packet chit-chat beyond the min RTT filter window. The first patch does that for TCP general min RTT estimation. The second patch addresses specifically the BBR congestion control's min RTT filter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
A persistent connection may send tiny amount of data (e.g. health-check) for a long period of time. BBR's windowed min RTT filter may only see RTT samples from delayed ACKs causing BBR to grossly over-estimate the path delay depending how much the ACK was delayed at the receiver. This patch skips RTT samples that are likely coming from delayed ACKs. Note that it is possible the sender never obtains a valid measure to set the min RTT. In this case BBR will continue to set cwnd to initial window which seems fine because the connection is thin stream. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
This patch avoids having TCP sender or congestion control overestimate the min RTT by orders of magnitude. This happens when all the samples in the windowed filter are one-packet transfer like small request and health-check like chit-chat, which is farily common for applications using persistent connections. This patch tries to conservatively labels and skip RTT samples obtained from this type of workload. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Maloy authored
Letting tipc_poll() dereference a socket's pointer to struct tipc_group entails a race risk, as the group item may be deleted in a concurrent tipc_sk_join() or tipc_sk_leave() thread. We now move the 'open' flag in struct tipc_group to struct tipc_sock, and let the former retain only a pointer to the moved field. This will eliminate the race risk. Reported-by: syzbot+799dafde0286795858ac@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Remove the switch block in l2tp_nl_cmd_session_create() that checks pseudowire-specific parameters since just L2TP_PWTYPE_ETH and L2TP_PWTYPE_PPP are currently supported and no actual checks are performed. Moreover the L2TP_PWTYPE_IP/default case presents a harmless issue in error handling (break instead of goto out_tunnel) Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ganesh Goudar authored
on T6, IPv6 filter would occupy 2 tids instead of 4. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Lorenzo Bianconi says: ==================== l2tp: set l2specific_len based on l2specific_type Do not rely on l2specific_len value provided by userspace but set sublayer length according to l2specific_type. Mark L2TP_ATTR_L2SPEC_LEN attribute as not used Changes since v2: - drop the patch related to a fix in the switch default case in l2tp_nl_cmd_session_create() - use L2SPECTYPE_NONE as default case in l2tp_get_l2specific_len() Changes since v1: - remove l2specific_len parameter - add sanity check on l2specific_type provided by userspace ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Remove l2specific_len configuration parameter since now L2-Specific Sublayer length is computed according to l2specific_type provided by userspace. Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Remove l2specific_len dependency while building l2tpv3 header or parsing the received frame since default L2-Specific Sublayer is always four bytes long and we don't need to rely on a user supplied value. Moreover in l2tp netlink code there are no sanity checks to enforce the relation between l2specific_len and l2specific_type, so sending a malformed netlink message is possible to set l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than 4 leaking memory on the wire and sending corrupted frames. Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Add sanity check on l2specific_type provided by userspace in l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and L2TP_L2SPECTYPE_NONE are currently supported. Moreover explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT only if the userspace does not provide a value for it Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Rahul Lakkireddy says: ==================== cxgb4: reduce memory footprint for collecting firmware dump Firmware dump can be large (upto 2 GB). In low memory conditions, ethtool fails to allocate such large memory. So, use zlib deflate to compress collected firmware dump. Patch 1 updates collection logic to use compression. Patch 2 adds zlib deflate to compress collected firmware dump. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rahul Lakkireddy authored
Use zlib deflate to compress firmware dump. Collect and compress as much firmware dump as possible into a 32 MB buffer. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rahul Lakkireddy authored
Update firmware dump collection logic to use compression when available. Let collection logic attempt to do compression, instead of returning out of memory early. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Talat Batheesh authored
Helmut reported a bug about devision by zero while running traffic and doing physical cable pull test. When the cable unplugged the ppms become zero, so when dividing the current ppms by the previous ppms in the next dim iteration there is devision by zero. This patch prevent this division for both ppms and epms. Fixes: c3164d2f ("net/mlx5e: Added BW check for DIM decision mechanism") Fixes: 4c4dbb4a ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Reported-by: Helmut Grauer <helmut.grauer@de.ibm.com> Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Fixes the following sparse warnings: net/core/devlink.c:2297:25: warning: symbol 'devlink_resource_find' was not declared. Should it be static? net/core/devlink.c:2322:6: warning: symbol 'devlink_resource_validate_children' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Fixes the following sparse warning: drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:289:5: warning: symbol 'mlxsw_sp_kvdl_part_occ' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhu Yanjun authored
The variable miistat is not used. So it is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-