Commits · c6614738a89ce7feea70bdc89c74953d007b1a2f · nexedi / linux

15 Nov, 2019 17 commits

octeontx2-af: Add macro to generate mbox handlers declarations · c6614738

Subbaraya Sundeep authored Nov 14, 2019

For every mailbox handler added to rvu, we are adding a function
declaration in rvu header file. Cleaned this up by adding a macro
to generate these declarations automatically.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6614738

octeontx2-af: Sync hw mbox with bounce buffer. · fdb90298

Geetha sowjanya authored Nov 14, 2019

If mailbox client has a bounce buffer or a intermediate buffer where
mbox messages are framed then copy them from there to HW buffer.
If 'mbase' and 'hw_mbase' are not same then assume 'mbase' points to
bounce buffer.

This patch also adds msg_size field to mbox header to copy only valid
data instead of whole buffer.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdb90298

octeontx2-af: Add mbox API to validate all responses · a36740f6

Sunil Goutham authored Nov 14, 2019

Added a new mailbox API which goes through all responses
to check their IDs and response codes.

Also added logic to prevent queuing multiple works to
process the same mailbox message. This scenario happens
when AF is processing a PF's request and menawhile PF
sends ACK to AF sent UP message, then mbox_hdr->num_msgs
in the PF->AF DOWN mbox region will be nonzero and AF
will end up processing PF's request again. This is fixed
by taking a backup of num_msgs counter and clearing the
same in the mbox region before scheduling work.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a36740f6

octeontx2-af: Add NPC MCAM entry allocation status to debugfs · e07fb507

Sunil Goutham authored Nov 14, 2019

Added support to display current NPC MCAM entries and counter's allocation
status ín debugfs.

cat /sys/kernel/debug/octeontx2/npc/mcam_info' will dump following info
- MCAM Rx and Tx keysize
- Total MCAM entries and counters
- Current available count
- Count of number of MCAM entries and counters allocated
  by a RVU PF/VF device.

Also, one NPC MCAM counter (last one) is reserved and mapped to
NPC RX_INTF's MISS_ACTION to count dropped packets due to no MCAM
entry match. This pkt drop counter can be checked via debugfs.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e07fb507

octeontx2-af: Add per CGX port level NIX Rx/Tx counters · f967488d

Linu Cherian authored Nov 14, 2019

A CGX port is shared by a RVU PF and it's VFs. These per
CGX port level NIX Rx/Tx counters are cumilative stats of
all NIXLFs sharing this port. These stats when compared
to CGX Rx/Tx stats helps in identifying pkts dropped within
the system, if any.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f967488d

octeontx2-af: Add CGX LMAC stats to debugfs · c57211b5

Prakash Brahmajyosyula authored Nov 14, 2019

This patch adds CGX LMAC physical interface or serdes Rx/Tx
packet stats to debugfs.

'cat cgx<idx>/lmac<idx>/stats' dumps the current interface link
status and Rx/Tx stats. Stats include pkt received/transmitted,
dropped, pause frames etc etc.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c57211b5

octeontx2-af: Add NDC block stats to debugfs. · c5a797e0

Prakash Brahmajyosyula authored Nov 14, 2019

NDC is a data cache unit which caches NPA and NIX block's
aura/pool/RQ/SQ/CQ/etc contexts to reduce number of costly
DRAM accesses.

This patch adds support to dump cache's performance stats
like cache line hit/miss counters, average cycles taken for
accessing cached and non-cached data. This will help in
checking if NPA/NIX context reads/writes are having NDC cache
misses which inturn might effect performance.

Also changed NDC enums to reflect correct NDC hardware instance.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5a797e0

octeontx2-af: Add NIX RQ, SQ and CQ contexts to debugfs · 02e202c3

Prakash Brahmajyosyula authored Nov 14, 2019

To aid in debugging NIX block related issues, added support to dump
NIX block LF's RQ, SQ and CQ hardware contexts in debugfs. User can
check which contexts are enabled currently and dump it's current HW
context.

Four new files 'qsize', 'rq_ctx', 'sq_ctx' and 'cq_ctx' are added to the
debugfs at 'sys/kernel/debug/octeontx2/nix/'

'echo <nixlf index> > qsize' will display current enabled CQ/SQ/RQs.
'echo <nixlf> [rq number/all] > rq_ctx',
'echo <nixlf> [sq number/all] > sq_ctx' &
'echo <nixlf> [cq number/all] > cq_ctx' will dump RQ/SQ/CQ's current
hardware context.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02e202c3

octeontx2-af: Add NPA aura and pool contexts to debugfs · 8756828a

Christina Jacob authored Nov 14, 2019

To aid in debugging NPA related issues, added support to dump
NPA (pool allocator) block LF's aura and pool hardware contexts in
debugfs. User can check which contexts are enabled currently and dump
it's current HW context.

Three new files 'qsize', 'aura_ctx', 'pool_ctx' are added to the
debugfs at 'sys/kernel/debug/octeontx2/npa/'

'echo <npalf index> > qsize' will display current enabled Aura/Pools.
'echo <npalf> [aura number/all] > aura_ctx' &
'echo <npalf> [aura number/all] > pool_ctx' will dump Aura/Pool
context info.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8756828a

octeontx2-af: Dump current resource provisioning status · 23205e6d

Christina Jacob authored Nov 14, 2019

Added support to dump current resource provisioning status
of all resource virtualization unit (RVU) block's
(i.e NPA, NIX, SSO, SSOW, CPT, TIM) local functions attached
to a PF_FUNC into a debugfs file.

'cat /sys/kernel/debug/octeontx2/rsrc_alloc'
will show the current block LF's allocation status.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23205e6d

net: mvneta: fix build skb for bm capable devices · b37fa92e

Lorenzo Bianconi authored Nov 14, 2019

Fix build_skb for bm capable devices when they fall-back using swbm path
(e.g. when bm properties are configured in device tree but
CONFIG_MVNETA_BM_ENABLE is not set). In this case rx_offset_correction is
overwritten so we need to use it building skb instead of
MVNETA_SKB_HEADROOM directly

Fixes: 8dc9a088 ("net: mvneta: rely on build_skb in mvneta_rx_swbm poll routine")
Fixes: 0db51da7 ("net: mvneta: add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reported-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

b37fa92e

Merge tag 'mlx5-updates-2019-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f97d139a

David S. Miller authored Nov 14, 2019

Saeed Mahameed says:

====================
mlx5-updates-2019-11-12

1) Merge mlx5-next for devlink reload and flowtable offloads dependencies
2) Devlink reload support
3) TC Flowtable offloads
4) Misc cleanup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

f97d139a

r8169: use r8168d_modify_extpage in rtl8168f_config_eee_phy · d0db136f

Heiner Kallweit authored Nov 13, 2019

Use r8168d_modify_extpage() also in rtl8168f_config_eee_phy() to
simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d0db136f

ibmveth: Detect unsupported packets before sending to the hypervisor · 6f227543

Cris Forno authored Nov 13, 2019

Currently, when ibmveth receive a loopback packet, it reports an
ambiguous error message "tx: h_send_logical_lan failed with rc=-4"
because the hypervisor rejects those types of packets. This fix
detects loopback packet and assures the source packet's MAC address
matches the driver's MAC address before transmitting to the
hypervisor.
Signed-off-by: Cris Forno <cforno12@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f227543

net: phy: dp83869: Add TI dp83869 phy · 01db923e

Dan Murphy authored Nov 13, 2019

Add support for the TI DP83869 Gigabit ethernet phy
device.

The DP83869 is a robust, low power, fully featured
Physical Layer transceiver with integrated PMD
sublayers to support 10BASE-T, 100BASE-TX and
1000BASE-T Ethernet protocols.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01db923e

dt-bindings: net: dp83869: Add TI dp83869 phy · 4d66c56f

Dan Murphy authored Nov 13, 2019

Add dt bindings for the TI dp83869 Gigabit ethernet phy
device.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4d66c56f

net: openvswitch: add hash info to upcall · bd1903b7

Tonghao Zhang authored Nov 13, 2019

When using the kernel datapath, the upcall don't
include skb hash info relatived. That will introduce
some problem, because the hash of skb is important
in kernel stack. For example, VXLAN module uses
it to select UDP src port. The tx queue selection
may also use the hash in stack.

Hash is computed in different ways. Hash is random
for a TCP socket, and hash may be computed in hardware,
or software stack. Recalculation hash is not easy.

Hash of TCP socket is computed:
tcp_v4_connect
    -> sk_set_txhash (is random)

__tcp_transmit_skb
    -> skb_set_hash_from_sk

There will be one upcall, without information of skb
hash, to ovs-vswitchd, for the first packet of a TCP
session. The rest packets will be processed in Open vSwitch
modules, hash kept. If this tcp session is forward to
VXLAN module, then the UDP src port of first tcp packet
is different from rest packets.

TCP packets may come from the host or dockers, to Open vSwitch.
To fix it, we store the hash info to upcall, and restore hash
when packets sent back.

+---------------+          +-------------------------+
|   Docker/VMs  |          |     ovs-vswitchd        |
+----+----------+          +-+--------------------+--+
     |                       ^                    |
     |                       |                    |
     |                       |  upcall            v restore packet hash (not recalculate)
     |                     +-+--------------------+--+
     |  tap netdev         |                         |   vxlan module
     +--------------->     +-->  Open vSwitch ko     +-->
       or internal type    |                         |
                           +-------------------------+

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.htmlSigned-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bd1903b7

14 Nov, 2019 8 commits

Merge branch 'Rework-mt762x-GDM-setup-flow' · 839554b7

David S. Miller authored Nov 14, 2019

MarkLee says:

====================
Rework mt762x GDM setup flow

The mt762x GDM block is mainly used to setup the HW internal
rx path from GMAC to RX DMA engine(PDMA) and the packet
switching engine(PSE) is responsed to do the data forward
following the GDM configuration.

This patch set have three goals :

1. Integrate GDM/PSE setup operations into single function "mtk_gdm_config"

2. Refine the timing of GDM/PSE setup, move it from mtk_hw_init
   to mtk_open

3. Enable GDM GDMA_DROP_ALL mode to drop all packet during the
   stop operation
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

839554b7

net: ethernet: mediatek: Enable GDM GDMA_DROP_ALL mode · 8d66a818

MarkLee authored Nov 13, 2019

Enable GDM GDMA_DROP_ALL mode to drop all packet during the
stop operation. This is recommended by the mt762x HW design
to drop all packet from GMAC before stopping PDMA.
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d66a818

net: ethernet: mediatek: Refine the timing of GDM/PSE setup · 5ac9eda0

MarkLee authored Nov 13, 2019

Refine the timing of GDM/PSE setup, move it from mtk_hw_init
to mtk_open. This is recommended by the mt762x HW design to
do GDM/PSE setup only after PDMA has been started.

We exclude mt7628 in mtk_gdm_config function since it is a old IP
and there is no GDM/PSE block on it.
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5ac9eda0

net: ethernet: mediatek: Integrate GDM/PSE setup operations · 8d3f4a95

MarkLee authored Nov 13, 2019

Integrate GDM/PSE setup operations into single function "mtk_gdm_config"
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d3f4a95

net: dsa: sja1105: Simplify reset handling · abfb228a

Vladimir Oltean authored Nov 13, 2019

We don't really need 10k species of reset. Remove everything except cold
reset which is what is actually used. Too bad the hardware designers
couldn't agree to use the same bit field for rev 1 and rev 2, so the
(*reset_cmd) function pointer is there to stay.

However let's simplify the prototype and give it a struct dsa_switch (we
want to avoid forward-declarations of structures, in this case struct
sja1105_private, wherever we can).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

abfb228a

Merge branch 'PTP-clock-source-for-SJA1105-tc-taprio-offload' · ccb68993

David S. Miller authored Nov 14, 2019

Vladimir Oltean says:

====================
PTP clock source for SJA1105 tc-taprio offload

This series makes the IEEE 802.1Qbv egress scheduler of the sja1105
switch use a time reference that is synchronized to the network. This
enables quite a few real Time Sensitive Networking use cases, since in
this mode the switch can offer its clients a TDMA sort of access to the
network, and guaranteed latency for frames that are properly scheduled
based on the common PTP time.

The driver needs to do a 2-part activity:
- Program the gate control list into the static config and upload it
  over SPI to the switch (already supported)
- Write the activation time of the scheduler (base-time) into the
  PTPSCHTM register, and set the PTPSTRTSCH bit.
- Monitor the activation of the scheduler at the planned time and its
  health.

Ok, 3 parts.

The time-aware scheduler cannot be programmed to activate at a time in
the past, and there is some logic to avoid that.

PTPCLKCORP is one of those "black magic" registers that just need to be
written to the length of the cycle. There is a 40-line long comment in
the second patch which explains why.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ccb68993

net: dsa: sja1105: Implement state machine for TAS with PTP clock source · 86db36a3

Vladimir Oltean authored Nov 12, 2019

Tested using the following bash script and the tc from iproute2-next:

	#!/bin/bash

	set -e -u -o pipefail

	NSEC_PER_SEC="1000000000"

	gatemask() {
		local tc_list="$1"
		local mask=0

		for tc in ${tc_list}; do
			mask=$((${mask} | (1 << ${tc})))
		done

		printf "%02x" ${mask}
	}

	if ! systemctl is-active --quiet ptp4l; then
		echo "Please start the ptp4l service"
		exit
	fi

	now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
	# Phase-align the base time to the start of the next second.
	sec=$(echo "${now}" | gawk -F. '{ print $1; }')
	base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"

	tc qdisc add dev swp5 parent root handle 100 taprio \
		num_tc 8 \
		map 0 1 2 3 5 6 7 \
		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
		base-time ${base_time} \
		sched-entry S $(gatemask 7) 100000 \
		sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
		clockid CLOCK_TAI flags 2

The "state machine" is a workqueue invoked after each manipulation
command on the PTP clock (reset, adjust time, set time, adjust
frequency) which checks over the state of the time-aware scheduler.
So it is not monitored periodically, only in reaction to a PTP command
typically triggered from a userspace daemon (linuxptp). Otherwise there
is no reason for things to go wrong.

Now that the timecounter/cyclecounter has been replaced with hardware
operations on the PTP clock, the TAS Kconfig now depends upon PTP and
the standalone clocksource operating mode has been removed.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

86db36a3

net: dsa: sja1105: Make the PTP command read-write · 41603d78

Vladimir Oltean authored Nov 12, 2019

The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate
whether the time-aware scheduler is running or not. We will be using
that for monitoring the scheduler in the next patch, so refactor the PTP
command API in order to allow that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

41603d78

13 Nov, 2019 15 commits

cxgb4: Fix an error code in cxgb4_mqprio_alloc_hw_resources() · 72c99609

Dan Carpenter authored Nov 13, 2019

"ret" is zero or possibly uninitialized on this error path.  It
should be a negative error code instead.

Fixes: 2d0cb84d ("cxgb4: add ETHOFLD hardware queue support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72c99609

net: atlantic: Signedness bug in aq_vec_isr_legacy() · d4137871

Dan Carpenter authored Nov 13, 2019

irqreturn_t type is an enum and in this context it's unsigned, so "err"
can't be irqreturn_t or it breaks the error handling.  In fact the "err"
variable is only used to store integers (never irqreturn_t) so it should
be declared as int.

I removed the initialization because it's not required.  Using a bogus
initializer turns off GCC's uninitialized variable warnings.  Secondly,
there is a GCC warning about unused assignments and we would like to
enable that feature eventually so we have been trying to remove these
unnecessary initializers.

Fixes: 7b0c342f ("net: atlantic: code style cleanup")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4137871

bnxt_en: Fix array overrun in bnxt_fill_l2_rewrite_fields(). · 3128aad1

Venkat Duvvuru authored Nov 13, 2019

Fix the array overrun while keeping the eth_addr and eth_addr_mask
pointers as u16 to avoid unaligned u16 access.  These were overlooked
when modifying the code to use u16 pointer for proper alignment.

Fixes: 90f90624 ("bnxt_en: Add support for L2 rewrite")
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3128aad1

net/mlx5: TC: Offload flow table rules · 84179981

Paul Blakey authored Nov 12, 2019

Since both tc rules and flow table rules are of the same format,
we can re-use tc parsing for that, and move the flow table rules
to their steering domain - In this case, the next chain after
max tc chain.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

84179981

net/mlx5: Add devlink reload · 4383cfcc

Michael Guralnik authored Oct 27, 2019

Implement devlink reload for mlx5.

Usage example:
devlink dev reload pci/0000:06:00.0
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

4383cfcc

net/mlx5e: Set netdev name space on creation · 71c6eaeb

Michael Guralnik authored Oct 29, 2019

Use devlink instance name space to set the netdev net namespace.

Preparation patch for devlink reload implementation.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

71c6eaeb

net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6 · 85bf490a

Eli Cohen authored Oct 31, 2019

Be sure to release the neighbour in case of failures after successful
route lookup.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

85bf490a

net/mlx5: Remove redundant NULL initializations · e6014afd

Eli Cohen authored Oct 30, 2019

Neighbour initializations to NULL are not necessary as the pointers are
not used if an error is returned, and if success returned, pointers are
initialized.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

e6014afd

net/mlx5: Read num_vfs before disabling SR-IOV · a7cba0a4

Parav Pandit authored Oct 28, 2019

mlx5_device_disable_sriov() currently reads num_vfs from the PCI core.
However when mlx5_device_disable_sriov() is executed, SR-IOV is
already disabled at the PCI level.
Due to this disable_hca() cleanup is not done during SR-IOV disable
flow.

mlx5_sriov_disable()
  pci_enable_sriov()
  mlx5_device_disable_sriov() <- num_vfs is zero here.

When SR-IOV enablement fails during mlx5_sriov_enable(), HCA's are left
in enabled stage because mlx5_device_disable_sriov() relies on num_vfs
from PCI core.

mlx5_sriov_enable()
  mlx5_device_enable_sriov()
  pci_enable_sriov() <- Fails

Hence, to overcome above issues,
(a) Read num_vfs before disabling SR-IOV and use it.
(b) Use num_vfs given when enabling sriov in error unwinding path.

Fixes: d886aba6 ("net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

a7cba0a4

net/mlx5: DR, Fix matcher builders select check · 86bb811b

Alex Vesker authored Nov 10, 2019

When selecting a matcher ste_builder_arr will always be evaluated
as true, instead check if num_of_builders is set for validity.

Fixes: 667f2646 ("net/mlx5: DR, Support IPv4 and IPv6 mixed matcher")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

86bb811b

Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux · c94ef13b

Saeed Mahameed authored Nov 13, 2019

1) New generic devlink param "enable_roce", for downstream devlink
   reload support

2) Do vport ACL configuration on per vport basis when
   enabling/disabling a vport. This enables to have vports enabled/disabled
   outside of eswitch config for future

3) Split the code for legacy vs offloads mode and make it clear

4) Tide up vport locking and workqueue usage

5) Fix metadata enablement for ECPF

6) Make explicit use of VF property to publish IB_DEVICE_VIRTUAL_FUNCTION

7) E-Switch and flow steering core low level support and refactoring for
   netfilter flowtables offload
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

c94ef13b

net/mlx5: Add new chain for netfilter flow table offload · 975b992f

Paul Blakey authored Nov 12, 2019

Netfilter tables (nftables) implements a software datapath that
comes after tc ingress datapath. The datapath supports offloading
such rules via the flow table offload API.

This API is currently only used by NFT and it doesn't provide the
global priority in regards to tc offload, so we assume offloading such
rules must come after tc. It does provide a flow table priority
parameter, so we need to provide some supported priority range.

For that, split fastpath prio to two, flow table offload and tc offload,
with one dedicated priority chain for flow table offload.

Next patch will re-use the multi chain API to access this chain by
allowing access to this chain by the fdb_sub_namespace.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

975b992f

net/mlx5: Refactor creating fast path prio chains · 439e843f

Paul Blakey authored Nov 12, 2019

Next patch will re-use this to add a new chain but in a
different prio.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

439e843f

net/mlx5: Accumulate levels for chains prio namespaces · 34b13cb3

Paul Blakey authored Nov 12, 2019

Tc chains are implemented by creating a chained prio steering type, and
inside it there is a namespace for each chain (FDB_TC_MAX_CHAINS). Each
of those has a list of priorities.

Currently, all namespaces in a prio start at the parent prio level.
But since we can jump from chain (namespace) to another chain in the
same prio, we need the levels for higher chains to be higher as well.
So we created unused prios to account for levels in previous namespaces.

Fix that by accumulating the namespaces levels if we are inside a chained
type prio, and removing the unused prios.

Fixes: 328edb49 ('net/mlx5: Split FDB fast path prio to multiple namespaces')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

34b13cb3

net/mlx5: Define fdb tc levels per prio · 4db7b98e

Paul Blakey authored Nov 12, 2019

Define FDB_TC_LEVELS_PER_PRIO instead of magic number 2.
This is the number of levels used by each tc prio table in the fdb.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

4db7b98e