Commits · 1214628cb1868254e107230c9052f28ff9899b6a · nexedi / linux

07 Feb, 2017 5 commits

bridge: move write-heavy fdb members in their own cache line · 1214628c

Nikolay Aleksandrov authored Feb 04, 2017

Fdb's used and updated fields are written to on every packet forward and
packet receive respectively. Thus if we are receiving packets from a
particular fdb, they'll cause false-sharing with everyone who has looked
it up (even if it didn't match, since mac/vid share cache line!). The
"used" field is even worse since it is updated on every packet forward
to that fdb, thus the standard config where X ports use a single gateway
results in 100% fdb false-sharing. Note that this patch does not prevent
the last scenario, but it makes it better for other bridge participants
which are not using that fdb (and are only doing lookups over it).
The point is that with this move only the communicating parties get the
false-sharing; a later patch will show how to avoid that too.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
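
The layout idea described above can be illustrated in userspace C. This is a minimal sketch, not the kernel's actual fdb structure: the struct name, the field set, and the 64-byte cache line size are assumptions. The kernel has its own annotations for this purpose (such as `____cacheline_aligned_in_smp`); the sketch uses C11 `alignas` instead.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical, simplified fdb entry; not the kernel's actual struct. */
struct fdb_entry {
    /* Read-mostly lookup key, touched by every lookup on the chain. */
    uint8_t  mac[6];
    uint16_t vid;
    void    *dst_port;

    /* Write-heavy timestamps start on their own cache line. */
    alignas(CACHE_LINE) unsigned long updated; /* written on packet receive */
    unsigned long used;                        /* written on packet forward */
};

/* Index of the cache line (relative to the struct start) holding an offset. */
static size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE;
}

static_assert(offsetof(struct fdb_entry, updated) % CACHE_LINE == 0,
              "write-heavy fields must start a new cache line");
```

With this layout, a lookup that only compares `mac` and `vid` reads a cache line that the timestamp writers never dirty.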

bridge: move to workqueue gc · f7cdee8a

Nikolay Aleksandrov authored Feb 04, 2017

Move the fdb garbage collector to a workqueue which fires at least 10
milliseconds apart and cleans chain by chain allowing for other tasks
to run in the meantime. When having thousands of fdbs the system is much
more responsive. Most importantly, remove the need to check if the
matched entry has expired in __br_fdb_get, which causes false-sharing and
is completely unnecessary if we clean up entries; at worst we'll get 10ms
of traffic for that entry before it gets deleted.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
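
A rough userspace sketch of a chain-by-chain collector that reports when it wants to run again. The table shape, the names, and the `yield_point()` stand-in are hypothetical; the real code uses the kernel workqueue API, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CHAINS   4    /* hypothetical hash size */
#define CHAIN_LEN    3    /* hypothetical fixed chain capacity */
#define MIN_DELAY_MS 10UL /* passes are at least 10 ms apart */

struct entry {
    bool in_use;
    unsigned long expires_ms;
};

static struct entry table[NUM_CHAINS][CHAIN_LEN];

/* Stand-in for a point where other tasks may run between chains. */
static void yield_point(void) { }

/*
 * One pass: drop expired entries chain by chain and return the delay
 * (in ms) after which the next pass should be scheduled.
 */
static unsigned long gc_pass(unsigned long now_ms, unsigned long ageing_ms)
{
    unsigned long next = ageing_ms; /* default when nothing expires sooner */

    assert(ageing_ms > 0);
    for (size_t c = 0; c < NUM_CHAINS; c++) {
        for (size_t i = 0; i < CHAIN_LEN; i++) {
            struct entry *e = &table[c][i];

            if (!e->in_use)
                continue;
            if (e->expires_ms <= now_ms)
                e->in_use = false; /* expired: delete */
            else if (e->expires_ms - now_ms < next)
                next = e->expires_ms - now_ms;
        }
        yield_point();
    }
    return next < MIN_DELAY_MS ? MIN_DELAY_MS : next;
}
```

Because the collector owns expiry, a lookup no longer has to read per-entry timestamps; the cost is that an expired entry may still match for up to one minimum delay.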

bridge: modify bridge and port to have often accessed fields in one cache line · 1f90c7f3

Nikolay Aleksandrov authored Feb 04, 2017

Move around net_bridge so the vlan fields are in the beginning since
they're checked on every packet even if vlan filtering is disabled.
For the port move flags & vlan group to the beginning, so they're in the
same cache line with the port's state (both flags and state are checked
on each packet).
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: enable verifier to add 0 to packet ptr · 63dfef75

William Tu authored Feb 04, 2017

The patch fixes the case when adding a zero value to the packet
pointer. The zero value could come from a BPF_K immediate or from a
src_reg of type CONST_IMM. The patch fixes both; otherwise the verifier
reports the following error:
  [...]
    R0=imm0,min_value=0,max_value=0
    R1=pkt(id=0,off=0,r=4)
    R2=pkt_end R3=fp-12
    R4=imm4,min_value=4,max_value=4
    R5=pkt(id=0,off=4,r=4)
  269: (bf) r2 = r0     // r2 becomes imm0
  270: (77) r2 >>= 3
  271: (bf) r4 = r1     // r4 becomes pkt ptr
  272: (0f) r4 += r2    // r4 += 0
  addition of negative constant to packet pointer is not allowed
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Mihai Budiu <mbudiu@vmware.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
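
The intended behaviour can be sketched as a three-way decision on the constant being added. This is an illustration only, not the verifier's code: the type and function names are hypothetical and the packet-pointer state is reduced to a single offset.

```c
#include <assert.h>
#include <stdint.h>

enum add_result {
    ADD_REJECTED, /* negative constant: not allowed */
    ADD_NOOP,     /* zero: pointer state is left unchanged */
    ADD_ADVANCED  /* positive constant: offset grows */
};

/* Hypothetical, heavily reduced packet-pointer state. */
struct pkt_ptr {
    int32_t off;
};

/* Decide what adding a known constant to a packet pointer should do. */
static enum add_result pkt_ptr_add_const(struct pkt_ptr *p, int32_t imm)
{
    if (imm < 0)
        return ADD_REJECTED;
    if (imm == 0)
        return ADD_NOOP; /* the case this patch makes acceptable */
    p->off += imm;
    return ADD_ADVANCED;
}
```

In the log above, r2 holds 0 when instruction 272 executes, so the addition falls into the `ADD_NOOP` branch of this sketch.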

bpf: test for AND edge cases · 29200c19

Josef Bacik authored Feb 03, 2017

These two tests are based on the work done for f23cc643. The first test is
just a basic one to make sure we don't allow AND'ing negative values, even if it
would result in a valid index for the array. The second is a cleaned up version
of the original testcase provided by Jann Horn that resulted in the commit.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Feb, 2017 35 commits

Merge branch 'dsa-add-fabric-notifier' · 9172d2a0

David S. Miller authored Feb 06, 2017

Vivien Didelot says:

====================
net: dsa: add fabric notifier

When a switch fabric is composed of multiple switch chips, these chips
must be programmed accordingly when an event occurs on one of them.

One example of such an event is hardware bridging: when a Linux bridge
spans interconnected chips, they must be programmed to allow external
ports to ingress frames on their internal ports.

Another example is cross-chip hardware VLANs. Switch chips in-between
interconnected bridge ports must also configure a given VLAN to allow
packets to pass through them.

In order to support that, this patchset introduces a non-intrusive
notifier mechanism. It adds a notifier head in every DSA switch tree
(the said fabric), and a notifier block in every DSA switch chip.

When an event occurs, it is chained to all notifiers of the fabric.
Switch chips can react accordingly if they are cross-chip capable.

On a dynamic debug enabled system, bridging a port in a multi-chip
fabric will print something like this (ZII Rev B board):

    # brctl addif br0 lan3
    mv88e6085 0.1:00: crosschip DSA port 1.0 bridged to br0
    mv88e6085 0.4:00: crosschip DSA port 1.0 bridged to br0
    # brctl delif br0 lan3
    mv88e6085 0.1:00: crosschip DSA port 1.0 unbridged from br0
    mv88e6085 0.4:00: crosschip DSA port 1.0 unbridged from br0

Currently only bridging events are added. A patchset introducing support
for cross-chip hardware bridging configuration in mv88e6xxx will follow
right after. Then events for switchdev operations are next on the line.

We should note that non-switchdev events do not support rolling-back
switch-wide operations. We'll have to work on closer integration with
switchdev for that, like introducing new attributes or objects, to
benefit from the prepare and commit phases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
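
The head-per-fabric, block-per-chip arrangement described in this series can be sketched in plain C. Everything here is a hypothetical userspace model (the struct names, the event codes, the `bridge_info` payload); the kernel's own notifier API is not used.

```c
#include <assert.h>
#include <stddef.h>

struct notifier_block {
    int (*call)(struct notifier_block *nb, unsigned long event, void *info);
    struct notifier_block *next;
};

/* One head per fabric (switch tree). */
struct notifier_head {
    struct notifier_block *first;
};

enum { EV_BRIDGE_JOIN = 1, EV_BRIDGE_LEAVE = 2 }; /* hypothetical events */

struct bridge_info {
    int sw_index; /* chip on which the port was (un)bridged */
    int port;
};

/* One block per switch chip, embedded in the chip's state. */
struct sw_chip {
    int index;
    int crosschip_events;
    struct notifier_block nb;
};

static void fabric_register(struct notifier_head *h, struct notifier_block *nb)
{
    nb->next = h->first;
    h->first = nb;
}

/* Chain an event to every registered chip; returns how many were called. */
static int fabric_notify(struct notifier_head *h, unsigned long event, void *info)
{
    int delivered = 0;

    for (struct notifier_block *nb = h->first; nb; nb = nb->next) {
        nb->call(nb, event, info);
        delivered++;
    }
    return delivered;
}

/* A chip reacts only to bridging events that originate on another chip. */
static int sw_chip_event(struct notifier_block *nb, unsigned long event, void *info)
{
    struct sw_chip *chip =
        (struct sw_chip *)((char *)nb - offsetof(struct sw_chip, nb));
    const struct bridge_info *bi = info;

    if ((event == EV_BRIDGE_JOIN || event == EV_BRIDGE_LEAVE) &&
        bi->sw_index != chip->index)
        chip->crosschip_events++;
    return 0;
}
```

Every chip sees every event, and each one decides locally whether it is cross-chip relevant, which keeps the originating code unaware of the fabric topology.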

net: dsa: introduce bridge notifier · 04d3a4c6

Vivien Didelot authored Feb 03, 2017

A slave device will now notify the switch fabric once its port is
bridged or unbridged, instead of calling directly its switch operations.

This code allows propagating cross-chip bridging events in the fabric.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: add switch notifier · f515f192

Vivien Didelot authored Feb 03, 2017

Add a notifier block per DSA switch, registered against a notifier head
in the switch fabric they belong to.

This infrastructure will allow propagating fabric-wide events such as
port bridging, VLAN configuration, etc. If a DSA switch driver cares
about cross-chip configuration, such events can be caught.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: change state setter scope · c5d35cb3

Vivien Didelot authored Feb 03, 2017

The scope of the functions inside net/dsa/slave.c must be the slave
net_device pointer. Change the state setter helper accordingly to
simplify callers.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: rollback bridging on error · 9c265426

Vivien Didelot authored Feb 03, 2017

When an error is returned during the bridging of a port in a
NETDEV_CHANGEUPPER event, net/core/dev.c rolls back the operation.

Be consistent and unassign dp->bridge_dev when this happens.

In the meantime, add comments to document this behavior.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: simplify netdevice events handling · 8e92ab3a

Vivien Didelot authored Feb 03, 2017

Simplify the code handling the slave netdevice notifier call by
providing a dsa_slave_changeupper helper for NETDEV_CHANGEUPPER, and so
on (only this event is supported at the moment.)

Return NOTIFY_DONE when we did not care about an event, and NOTIFY_OK
when we were concerned but no error occurred, as the API suggests.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
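
A small sketch of the return-value convention mentioned above. The `NOTIFY_*` values mirror include/linux/notifier.h; the event codes, the function names, and the use of a plain `NOTIFY_BAD` on error are simplifications made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000 /* don't care */
#define NOTIFY_OK        0x0001 /* suits me */
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

enum { EV_CHANGEUPPER = 1, EV_OTHER = 2 }; /* hypothetical event codes */

/* Per-event helper: only ports we own are of interest. */
static int sketch_changeupper(bool is_our_port, int err)
{
    if (!is_our_port)
        return NOTIFY_DONE;
    return err ? NOTIFY_BAD : NOTIFY_OK;
}

/* Dispatcher: one helper per supported event, NOTIFY_DONE otherwise. */
static int sketch_netdevice_event(unsigned long event, bool is_our_port, int err)
{
    switch (event) {
    case EV_CHANGEUPPER:
        return sketch_changeupper(is_our_port, err);
    default:
        return NOTIFY_DONE;
    }
}
```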

net: dsa: move netdevice notifier registration · 88e4f0ca

Vivien Didelot authored Feb 03, 2017

Move the netdevice notifier block register code in slave.c and provide
helpers for dsa.c to register and unregister it.

At the same time, check for errors since (un)register_netdevice_notifier
may fail.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: fix another maybe-uninitialized false-positive · 321fa4ff

Arnd Bergmann authored Feb 03, 2017

In commit abeffce9 ("net/mlx5e: Fix a -Wmaybe-uninitialized warning"), I fixed a
gcc warning for the ipv4 offload handling. Now we get the same warning for the
added ipv6 support:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:815:40: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]

We can apply the same workaround here as well.

Fixes: ce99f6b9 ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-next: treewide use is_vlan_dev() helper function. · d0d7b10b

Parav Pandit authored Feb 04, 2017

This patch uses the is_vlan_dev() helper function instead of open-coded
flag comparisons, which are exactly what is_vlan_dev() does.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jon Maxwell <jmaxwell37@gmail.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
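
For readers unfamiliar with the helper, the pattern looks roughly like this. The `net_device` struct is reduced to one field and the flag bit positions should be read as illustrative; only the shape of the helper is the point.

```c
#include <assert.h>
#include <stdbool.h>

#define IFF_802_1Q_VLAN (1u << 0) /* bit position is illustrative */
#define IFF_EBRIDGE     (1u << 1) /* bit position is illustrative */

/* Reduced stand-in for the kernel's struct net_device. */
struct net_device {
    unsigned int priv_flags;
};

/* The helper wraps the flag test that callers used to open-code. */
static bool is_vlan_dev(const struct net_device *dev)
{
    return dev->priv_flags & IFF_802_1Q_VLAN;
}
```

A call site then changes from `if (dev->priv_flags & IFF_802_1Q_VLAN)` to `if (is_vlan_dev(dev))`.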

net/mlx4_en: fix a condition · 73cfb2a2

Dan Carpenter authored Feb 03, 2017

There is a "||" vs "|" typo here so we test 0x1 instead of 0x6.

Fixes: 1f8176f7 ("net/mlx4_en: Check the enabling pptx/pprx flags in SET_PORT wrapper flow")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
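
The effect of the typo is easy to reproduce in isolation. The flag names and the assumption that the two bits are 0x2 and 0x4 are hypothetical, chosen only so that their bitwise OR is the 0x6 mentioned above.

```c
#include <assert.h>

#define FLAG_A 0x2u /* hypothetical flag */
#define FLAG_B 0x4u /* hypothetical flag */

/* Logical OR of two non-zero values collapses to 1. */
static unsigned int mask_with_typo(unsigned int a, unsigned int b)
{
    return a || b;
}

/* Bitwise OR combines the bits as intended. */
static unsigned int mask_fixed(unsigned int a, unsigned int b)
{
    return a | b;
}

/* True when any bit of mask is set in field. */
static int any_flag_set(unsigned int field, unsigned int mask)
{
    return (field & mask) != 0;
}
```

With the typo, a field that has only `FLAG_B` set is tested against 0x1 and the check silently fails.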

sfc: don't rearm interrupts if busy polling · f820c0ac

Bert Kenward authored Feb 06, 2017

Since commit 364b6055 ("net: busy-poll: return busypolling status
to drivers"), napi_complete_done() returns a boolean that can be used
by drivers to conditionally rearm interrupts.

Testing with a 7142 shows a small latency improvement of ~100 ns.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
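
The resulting poll pattern can be sketched with a mock in place of the real NAPI call. `mock_napi_complete_done()`, the `napi` and `channel` structs, and the rearm counter are all hypothetical stand-ins; only the "rearm when the completion call returns true" shape is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NAPI context and a driver channel. */
struct napi {
    bool busy_polling;
};

struct channel {
    struct napi napi;
    int irq_rearms;
};

/*
 * Mock of the completion call: returns false when a busy-polling
 * caller will keep polling, so the driver should not rearm interrupts.
 */
static bool mock_napi_complete_done(struct napi *n, int work_done)
{
    (void)work_done;
    return !n->busy_polling;
}

/* Poll handler: rearm only when polling is really finished. */
static int channel_poll(struct channel *ch, int work_done, int budget)
{
    if (work_done < budget &&
        mock_napi_complete_done(&ch->napi, work_done))
        ch->irq_rearms++; /* stand-in for re-enabling the interrupt */
    return work_done;
}
```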

sctp: process fwd tsn chunk only when prsctp is enabled · d15c9ede

Xin Long authored Feb 03, 2017

This patch checks whether asoc->peer.prsctp_capable is set before
processing a fwd tsn chunk. If it is not, an ERROR is returned to the
peer, just as rfc3758 section 3.3.1 demands.
Reported-by: Julian Cordes <julian.cordes@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-cleanup-neigh-handling' · 3bc32d03

David S. Miller authored Feb 06, 2017

Jiri Pirko says:

====================
mlxsw: cleanup neigh handling

Ido says:

This series addresses long standing issues in the mlxsw driver
concerning neighbour reflection. It also prepares the code for follow-up
changes dealing with proper resource cleanup and nexthop reflection.

The first two patches convert the neighbour reflection code to use an
ordered workqueue, to prevent re-ordering of NEIGH_UPDATE events that
may happen following subsequent patches.

The third to fifth patches remove the ndo_neigh_{construct,destroy}
entry points from the driver, thereby relying only on NEIGH_UPDATE
events for neighbour reflection. This simplifies the code considerably.

Last patches are fallout and adjust nits in the code I noticed while
going over it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Fix typo in comment · fd76d910

Ido Schimmel authored Feb 06, 2017

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Don't read 'nud_state' without lock · 01b1aa35

Ido Schimmel authored Feb 06, 2017

We periodically ask the neighbouring system to try and resolve
neighbours that are used for nexthops, but aren't currently resolved.

However, 'nud_state' is protected by the neighbour lock, so we shouldn't
access it without taking it. Instead, we can simply check the
'connected' field of the neighbour entry, which we update upon
NEIGH_UPDATE events.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Remove redundant check · 8a0b7275

Ido Schimmel authored Feb 06, 2017

We only add neighbour entries that are also used for nexthops to
'nexthop_neighs_list', so when iterating over this list there's no need
to check that the entry is indeed used for nexthops.

Remove the redundant check.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: remove ndo_neigh_{construct, destroy} from stacked devices · a8eca326

Ido Schimmel authored Feb 06, 2017

In commit 18bfb924 ("net: introduce default neigh_construct/destroy
ndo calls for L2 upper devices") we added these ndos to stacked devices
such as team and bond, so that calls will be propagated to mlxsw.

However, the previous commit removed the reliance on these ndos, and no
new users of them have appeared since the above-mentioned commit. We can
therefore safely remove this dead code.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Simplify neighbour reflection · 5c8802f1

Ido Schimmel authored Feb 06, 2017

Up until now we had two interfaces for neighbour related configuration:
ndo_neigh_{construct,destroy} and NEIGH_UPDATE netevents. The ndos were
used to add and remove neighbours from the driver's cache, whereas the
netevent was used to reflect the neighbours into the device's tables.

However, if the NUD state of a neighbour isn't NUD_VALID or if the
neighbour is dead, then there's really no reason for us to keep it
inside our cache. The only exception to this rule are neighbours that
are also used for nexthops, which we periodically refresh to get them
resolved.

We can therefore eliminate the ndo entry point into the driver and
simplify the code, making it similar to the FIB reflection, which is
based solely on events. This also helps us avoid a locking issue, in
which the RIF cache was traversed without proper locking during
insertion into the neigh entry cache.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Remove unused variable · de04b6a3

Ido Schimmel authored Feb 06, 2017

Since commit 33b1341c ("mlxsw: spectrum_router: Fix handling of
neighbour structure") we no longer use destination IP for neighbour
lookup, so remove it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Use ordered workqueue for neigh updates · e60234dd

Ido Schimmel authored Feb 06, 2017

We currently associate each neighbour entry with a work item, so it's
not possible to have multiple events queued for the same neighbour
entry. However, this is about to be changed so that the neighbour entry
is only resolved when the work item is scheduled.

The above can result in a mismatch between the kernel's and the device's
neighbour table, unless the associated work items are processed in the
order in which they were submitted.

Do that by migrating the NEIGH_UPDATE work items to be processed in the
ordered workqueue which was recently introduced in mlxsw in commit
a3832b31 ("mlxsw: core: Create an ordered workqueue for FIB
offload").
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: core: Queue work immediately instead of delaying it · a0e4761d

Ido Schimmel authored Feb 06, 2017

We always use zero delay before queueing a work on the ordered workqueue
('mlxsw_owq'), so use work_struct directly instead of delayable work.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-next-for-4.11-20170206' of... · fcdc103d

David S. Miller authored Feb 06, 2017

Merge tag 'linux-can-next-for-4.11-20170206' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2017-02-06

this is a pull request of 16 patches for net-next/master.

The first two patches by David Jander and me add the rx-offload
framework for CAN devices to the kernel. The remaining 14 patches
convert the flexcan driver to make use of it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: reg: Fix HTGT register length · e158e5ef

Elad Raz authored Feb 06, 2017

HTGT register length is limited to 32 bytes and not 256 bytes.
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvneta: implement .set_wol and .get_wol · b60a00f9

Jingju Hou authored Feb 06, 2017

The mvneta itself does not support WOL, but the PHY might.
So pass the calls to the PHY.
Signed-off-by: Jingju Hou <houjingj@marvell.com>
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: flexcan: switch imx6 and vf610 to timestamp based offloading · 096de07f

Marc Kleine-Budde authored Sep 01, 2015

This patch switches the imx6 and vf610 based SoCs from the hardware FIFO
to the timestamp based rx offloading.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: add support for timestamp based rx-offload · b3cf53e9

Marc Kleine-Budde authored Sep 01, 2015

The flexcan IP core has 64 mailboxes. For now they are configured for
RX as a hardware FIFO. This FIFO has a fixed depth of 6 CAN frames. In
some high load scenarios it turns out that this buffer is too small.

In order to have a buffer larger than the 6 frames FIFO, this patch adds
support for timestamp based offloading via the generic rx-offload
infrastructure.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
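
The core of a timestamp-based scheme is restoring reception order across mailboxes. The sketch below assumes a free-running 16-bit timestamp per frame and uses hypothetical names; it is not the rx-offload implementation. The wraparound-safe comparison only gives a consistent order while all compared timestamps lie within half the counter range of each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical frame as read from a mailbox. */
struct rx_frame {
    uint16_t timestamp; /* assumed 16-bit free-running counter */
    uint32_t can_id;
};

/* Compare timestamps so that a counter wraparound still sorts correctly. */
static int ts_compare(const void *a, const void *b)
{
    const struct rx_frame *fa = a;
    const struct rx_frame *fb = b;
    int16_t diff = (int16_t)(uint16_t)(fa->timestamp - fb->timestamp);

    return (diff > 0) - (diff < 0);
}

/* Put frames collected from several mailboxes back into arrival order. */
static void sort_by_timestamp(struct rx_frame *frames, size_t n)
{
    qsort(frames, n, sizeof(*frames), ts_compare);
}
```

In the test below, the frame stamped 0x0001 is treated as newer than 0xFFFF because the counter wrapped between them.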

can: flexcan: add quirk FLEXCAN_QUIRK_ENABLE_EACEN_RRS · 9eb7aa89

Marc Kleine-Budde authored Sep 01, 2015

In order to receive RTR frames in the non HW FIFO mode the RRS and EACEN bits
of the reg_ctrl2 have to be activated. As this has no side effect in the FIFO
mode, we do this unconditionally on cores with the reg_ctrl2.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: activate individual RX masking and initialize reg_rximr · 4bd888a8

Marc Kleine-Budde authored Aug 31, 2015

Modern flexcan IP cores support two RX modes. One is using the 6 frames deep
hardware FIFO, the other is using up to 64 mailboxes (in non FIFO mode). For
now only the HW FIFO mode is activated.

In order to make use of the RX mailboxes the individual RX masking feature has
to be activated, otherwise matching mailboxes are overwritten during the
reception process. This however switches on the individual RX masking, which
uses reg_rximr registers for masking.

This patch activates the individual RX masking feature unconditionally and
initializes the mask registers (reg_rximr) with 0x0 == "don't care", which
switches off any filtering.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
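
Why a mask of 0x0 disables filtering can be shown with the usual mask-matching rule, where a set mask bit means "this ID bit must match" and a cleared bit means "don't care". The function name and the example IDs are hypothetical; register access is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A frame is accepted when every ID bit selected by the mask equals
 * the corresponding bit of the mailbox's filter ID.
 */
static bool mailbox_accepts(uint32_t frame_id, uint32_t filter_id, uint32_t mask)
{
    return ((frame_id ^ filter_id) & mask) == 0;
}
```

With `mask == 0x0` the expression is always zero, so every frame is accepted regardless of the filter ID.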

can: flexcan: make use of rx-offload's irq_offload_fifo · 30164759

Marc Kleine-Budde authored May 10, 2015

This patch converts the flexcan driver to make use of the rx-offload
can_rx_offload_irq_offload_fifo() helper function. The idea is to read
the CAN frames already in the interrupt context, as the depth of the
flexcan HW FIFO is too shallow, resulting in too many missed frames.
During a normal NAPI poll the frames are then pushed into the upper
layers.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: make TX mailbox selectable during runtime · b93917c3

Marc Kleine-Budde authored Jul 12, 2015

This patch makes the TX mailbox selectable during runtime. This is a preparation
patch for making the use of the hardware FIFO selectable at runtime, as the TX
mailbox number is different in HW FIFO and normal mode.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: calculate default value for imask1 during runtime · 28ac7dcd

Marc Kleine-Budde authored Aug 04, 2015

This patch converts the define FLEXCAN_IFLAG_DEFAULT into the runtime
calculated value priv->reg_imask1_default. This is a preparation patch to make
the TX mailbox selectable during runtime, too.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: flexcan_irq(): don't unconditionally return IRQ_HANDLED · dd2f122a

Marc Kleine-Budde authored Jan 18, 2017

This patch changes the flexcan_irq() function to only return
IRQ_HANDLED, if the interrupt really has been handled, otherwise
IRQ_NONE is returned.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
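
A minimal sketch of the pattern: start from IRQ_NONE and upgrade to IRQ_HANDLED only when some work was actually done. The enum values mirror the kernel's; the flag bits, counters, and function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the kernel's irqreturn enum. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

#define PENDING_RX  (1u << 0) /* hypothetical status bit */
#define PENDING_ERR (1u << 1) /* hypothetical status bit */

/* Report IRQ_HANDLED only if at least one source was serviced. */
static irqreturn_t sketch_irq(uint32_t pending, int *rx_work, int *err_work)
{
    irqreturn_t handled = IRQ_NONE;

    if (pending & PENDING_RX) {
        (*rx_work)++;
        handled = IRQ_HANDLED;
    }
    if (pending & PENDING_ERR) {
        (*err_work)++;
        handled = IRQ_HANDLED;
    }
    return handled;
}
```

An accurate return value matters on shared interrupt lines and for the kernel's spurious-interrupt detection.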

can: flexcan: flexcan_poll_bus_err(): fold in do_bus_err() · a5c02f66

Marc Kleine-Budde authored Jan 18, 2017

This patch folds the do_bus_err() function into
flexcan_poll_bus_err().
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: flexcan_poll_state(): no need to initialize new_state, rx_state, tx_state · 238443df

Marc Kleine-Budde authored Jan 18, 2017

This patch removes the unneeded initialisation of the new_state,
rx_state and tx_state variables.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: do_bus_err(): convert rx_,tx_errors into bool · d166f56b

Marc Kleine-Budde authored Jan 17, 2017

This patch converts the rx_errors and tx_errors from int into bool
values, to reflect their actual meaning.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
