Commits · 7a723099be325f6c5edd25c775b672a056907f75 · nexedi / linux

05 Jun, 2018 4 commits

Merge branch 'net-phy-improve-PM-handling-of-PHY-MDIO' · 7a723099

David S. Miller authored Jun 05, 2018

Heiner Kallweit says:

====================
net: phy: improve PM handling of PHY/MDIO

The current implementation of the MDIO bus PM ops doesn't actually
implement bus-specific PM ops but just calls PM ops defined on a device
level, which doesn't seem to be fully in line with the core PM model.

When looking e.g. at __device_suspend() the PM core looks for PM ops
of a device in a specific order:
1. device PM domain
2. device type
3. device class
4. device bus

I think there is a good reason why there are no PM ops on device level.
The situation can be improved by modeling PHYs as a device type of
MDIO device. If PM ops are needed for some other type of MDIO device,
it could be modeled as a struct device_type as well.
====================
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: remove PM ops from MDIO bus · 9107c05e

Heiner Kallweit authored Jun 02, 2018

The current implementation of the MDIO bus PM ops doesn't actually
implement bus-specific PM ops but just calls PM ops defined on a device
level, which doesn't seem to be fully in line with the core PM model.

When looking e.g. at __device_suspend() the PM core looks for PM ops
of a device in a specific order:
1. device PM domain
2. device type
3. device class
4. device bus

I think there is a good reason why there are no PM ops on device level.

Now that a device type representation of PHYs as a special type of MDIO
device has been added (PHYs being the only user of the MDIO bus PM ops),
the MDIO bus PM ops can be removed, including the pm member of
struct mdio_device.

If PM ops are needed for some other type of MDIO device, it should be
modeled as a struct device_type as well.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
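
The lookup order quoted in this message can be modelled in user space. The following is only a sketch of the four steps listed above, not the kernel's __device_suspend(); struct pm_holder, the simplified struct device and pick_pm_ops() are hypothetical names introduced for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the pm pointer matters. */
struct dev_pm_ops { const char *name; };

struct pm_holder { const struct dev_pm_ops *pm; };

struct device {
	const struct pm_holder *pm_domain;   /* 1. device PM domain */
	const struct pm_holder *type;        /* 2. device type */
	const struct pm_holder *dev_class;   /* 3. device class */
	const struct pm_holder *bus;         /* 4. device bus */
};

/* Return the first PM ops found, in the order listed in the commit message. */
const struct dev_pm_ops *pick_pm_ops(const struct device *dev)
{
	if (dev->pm_domain && dev->pm_domain->pm)
		return dev->pm_domain->pm;
	if (dev->type && dev->type->pm)
		return dev->type->pm;
	if (dev->dev_class && dev->dev_class->pm)
		return dev->dev_class->pm;
	if (dev->bus && dev->bus->pm)
		return dev->bus->pm;
	return NULL;
}
```

In this model, a PHY whose device type carries PM ops is served at step 2, so bus-level PM ops are never consulted for it.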

net: phy: add struct device_type representation of a PHY · 7f4828ff

Heiner Kallweit authored Jun 02, 2018

A PHY is a type of MDIO device, so let's model it as struct device_type
and place PM ops, attribute groups and release callback on device type
level. For this the attribute definitions have to be moved.
This change allows us to get rid of the PM ops on a bus level in a second
step.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7d840a60

David S. Miller authored Jun 04, 2018

04 Jun, 2018 36 commits

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · d67b66b4

David S. Miller authored Jun 04, 2018

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2018-06-04

This series contains a smorgasbord of updates to documentation, e1000e,
igb, ixgbe, ixgbevf and i40e.

Benjamin Poirier fixes a potential kernel crash due to NULL pointer
dereference in e1000e.

Jeff updates the kernel documentation for e100 and e1000 to correct
default values and URLs which were incorrect in the documentation.  Also
took the time to update these to the new reStructured text format for
kernel documentation.

Joanna Yurdal fixes a missing PTP transmit timestamp by ensuring that
TSICR gets cleared when ICR is cleared.

Sergey updates igb to reset all the transmit queues at one time so that
we only have to wait once for all the queues to be reset.

Alex fixes ixgbevf so that malicious driver detection (MDD) can co-exist
with XDP.

Emil and Tony extend the RTNL lock to ensure we get the most up-to-date
values for the bits and avoid a possible race condition when going down.

YueHaibing from Huawei introduces a helper function in ixgbe for
operation reads to simplify the code a bit more.

Daniel Borkmann adds support for XDP meta data when using build SKB
for i40e.

Shannon Nelson provides two fixes for the IPSec code in ixgbe. The first
makes sure we do not try to offload the decryption of any incoming
packet that is destined for the management engine.  The other fix
resolves a cast problem introduced by a sparse cleanup patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Fix the process of adding broadcast addresses to tcam · f0b964e5

Xi Wang authored Jun 04, 2018

If the multicast mask value in the device tree is not configured as all
0xff, the broadcast MAC address will be lost from the tcam table after
the command 'ifconfig up' is executed. The address is added by
hns_ae_start, but is cleared later by hns_nic_set_rx_mode, which is
called in the dev_open process.

Fix this by not using the multicast mask when adding a broadcast
address.

Fixes: b5996f11 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: return error code when tcf proto is not found · 0e399035

Vlad Buslov authored Jun 04, 2018

If requested tcf proto is not found, get and del filter netlink protocol
handlers output error message to extack, but do not return actual error
code. Add check to return ENOENT when result of tp find function is NULL
pointer.

Fixes: c431f89b ("net: sched: split tc_ctl_tfilter into three handlers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
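
The pattern described here (report the miss and also return an error code) can be sketched in user space as follows. This is an illustration only, not the cls_api code; struct tcf_proto_sketch, find_tp(), get_filter() and the message text are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

struct tcf_proto_sketch { unsigned int prio; };

/* Hypothetical lookup: returns NULL when no entry has the requested priority. */
struct tcf_proto_sketch *find_tp(struct tcf_proto_sketch *tbl, size_t n,
				 unsigned int prio)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].prio == prio)
			return &tbl[i];
	return NULL;
}

/* Handler sketch: set the message and return a real error code on a miss. */
int get_filter(struct tcf_proto_sketch *tbl, size_t n, unsigned int prio,
	       const char **extack_msg)
{
	struct tcf_proto_sketch *tp = find_tp(tbl, n, prio);

	if (!tp) {
		*extack_msg = "requested tcf proto not found";
		return -ENOENT;   /* the step that was missing before the fix */
	}
	*extack_msg = NULL;
	return 0;
}
```

Without the return of -ENOENT, a caller checking only the return value would treat the failed lookup as success.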

team: use netdev_features_t instead of u32 · 25ea6654

Dan Carpenter authored Jun 04, 2018

This code was introduced in 2011 around the same time that we made
netdev_features_t a u64 type.  These days a u32 is not big enough to
hold all the potential features.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
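
A small user-space sketch of why a u32 temporary is a problem for a 64-bit feature mask. The typedef features_t stands in for netdev_features_t, and both function names are hypothetical.

```c
#include <stdint.h>

typedef uint64_t features_t;   /* stands in for netdev_features_t (u64) */

/* Buggy variant: the mask passes through a 32-bit temporary and loses bits 32..63. */
features_t mask_features_u32(features_t features, features_t mask)
{
	uint32_t m = (uint32_t)mask;

	return features & m;
}

/* Fixed variant: the temporary has the same width as the feature set. */
features_t mask_features_u64(features_t features, features_t mask)
{
	features_t m = mask;

	return features & m;
}
```

Any feature bit at position 32 or above is silently dropped by the first variant and preserved by the second.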

net_failover: Use netdev_features_t instead of u32 · a746407a

Dan Carpenter authored Jun 04, 2018

The features mask needs to be a netdev_features_t (u64) because a u32
is not big enough.

Fixes: cfc80d9a ("net: Introduce net_failover driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: use dma_zalloc_coherent instead of allocator/memset · ff2e351e

YueHaibing authored Jun 04, 2018

Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wan/fsl_ucc_hdlc: use dma_zalloc_coherent instead of allocator/memset · 1f55c286

YueHaibing authored Jun 04, 2018

Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-upstream' of... · 828da432

David S. Miller authored Jun 04, 2018

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2018-06-04

Here's one last bluetooth-next pull request for the 4.18 kernel:

 - New USB device IDs for Realtek 8822BE and 8723DE
 - reset/resume fix for Dell Inspiron 5565
 - Fix HCI_UART_INIT_PENDING flag behavior
 - Fix patching behavior for some ATH3012 models
 - A few other minor cleanups & fixes

Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

docs: networking: fix minor typos in various documentation files · bb38ccce

Olivier Gayot authored Jun 04, 2018

This patch fixes some typos/misspelling errors in the
Documentation/networking files.
Signed-off-by: Olivier Gayot <olivier.gayot@sigexec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: do not allow changing SO_REUSEADDR/SO_REUSEPORT on bound sockets · f396922d

Maciej Żenczykowski authored Jun 03, 2018

It is not safe to do so because such sockets are already in the
hash tables and changing these options can result in invalidating
the tb->fastreuse(port) caching.

This can have later far reaching consequences wrt. bind conflict checks
which rely on these caches (for optimization purposes).

Not to mention that you can currently end up with two identical
non-reuseport listening sockets bound to the same local ip:port
by clearing reuseport on them after they've already both been bound.

There is unfortunately no EISBOUND error or anything similar,
and EISCONN seems to be misleading for a bound-but-not-connected
socket, so use EUCLEAN 'Structure needs cleaning' which AFAICT
is the closest you can get to meaning 'socket in bad state'.
(although perhaps EINVAL wouldn't be a bad choice either?)

This does unfortunately run the risk of breaking buggy
userspace programs...
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Change-Id: I77c2b3429b2fdf42671eee0fa7a8ba721c94963b
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-tcp: extend tcp_tw_reuse sysctl to enable loopback only optimization · 79e9fed4

Maciej Żenczykowski authored Jun 03, 2018

This changes the /proc/sys/net/ipv4/tcp_tw_reuse from a boolean
to an integer.

It now takes the values 0, 1 and 2, where 0 and 1 behave as before,
while 2 enables timewait socket reuse only for sockets that we can
prove are loopback connections:
  ie. bound to 'lo' interface or where one of source or destination
  IPs is 127.0.0.0/8, ::ffff:127.0.0.0/104 or ::1.

This enables quicker reuse of ephemeral ports for loopback connections
- where tcp_tw_reuse is 100% safe from a protocol perspective
(this assumes no artificially induced packet loss on 'lo').

This also makes establishing many loopback connections *much* faster
(allocating ports out of the first half of the ephemeral port range
is significantly faster than allocating from the second half).

Without this change in a 32K ephemeral port space my sample program
(it just establishes and closes [::1]:ephemeral -> [::1]:server_port
connections in a tight loop) fails after 32765 connections in 24 seconds.
With it enabled 50000 connections only take 4.7 seconds.

This is particularly problematic for IPv6 where we only have one local
address and cannot play tricks with varying source IP from 127.0.0.0/8
pool.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Wei Wang <weiwan@google.com>
Change-Id: I0377961749979d0301b7b62871a32a4b34b654e1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
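
The address part of the "provably loopback" test described above can be sketched in user space. This is an assumption-laden illustration, not the kernel implementation: it ignores the "bound to 'lo'" case, and all three function names are hypothetical. The gate function models only the sysctl value, not the other conditions the kernel applies before reusing a timewait socket.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>

/* 127.0.0.0/8; s_addr is stored in network byte order. */
bool ipv4_is_loopback_addr(struct in_addr a)
{
	return (ntohl(a.s_addr) >> 24) == 127;
}

/* ::1, or an IPv4-mapped address inside ::ffff:127.0.0.0/104. */
bool ipv6_is_loopback_addr(const struct in6_addr *a)
{
	if (IN6_IS_ADDR_LOOPBACK(a))
		return true;
	return IN6_IS_ADDR_V4MAPPED(a) && a->s6_addr[12] == 127;
}

/* Sysctl gate only: 0 never, 1 as before, 2 loopback traffic only. */
bool tw_reuse_candidate(int mode, bool loopback)
{
	return mode == 1 || (mode == 2 && loopback);
}
```

With mode 2, a connection to ::1 or 127.0.0.1 passes the gate, while one to a routable address does not.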

qed: Add srq core support for RoCE and iWARP · 39dbc646

Yuval Bason authored Jun 03, 2018

This patch adds support for configuring SRQ and provides the necessary
APIs for rdma upper layer driver (qedr) to enable the SRQ feature.
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Yuval Bason <yuval.bason@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bnx2-warnings' · 7a9ee41b

David S. Miller authored Jun 04, 2018

Varsha Rao says:

====================
net: bnx2: Fix checkpatch and clang warnings

This patchset fixes NULL comparison and extra parentheses, checkpatch
and clang warnings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: bnx2: Replace NULL comparison · b8aac410

Varsha Rao authored Jun 03, 2018

This patch fixes the checkpatch issue of NULL comparison. Replace x == NULL
with !x, by using the following coccinelle script:

@disable is_null@
expression e;
@@
-e==NULL
+!e
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: bnx2: Remove extra parentheses · 6dc5aa21

Varsha Rao authored Jun 03, 2018

The following coccinelle script removes extra parentheses to fix the
clang warning of extraneous parentheses.

@disable paren@
identifier i;
expression e;
statement s;
@@
if (
-(i == e)
+i == e
 )
s
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: gemini: fix spelling mistake: "it" -> "is" · 13ce3bc9

YueHaibing authored Jun 03, 2018

Trivial fix to spelling mistake in gemini dev_warn message
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cls_flower: Fix comparing of old filter mask with new filter · f6521c58

Paul Blakey authored Jun 03, 2018

We incorrectly compare the mask and the result is that we can't modify
an already existing rule.

Fix that by comparing correctly.

Fixes: 05cd271f ("cls_flower: Support multiple masks per priority")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cls_flower: Fix missing free of rhashtable · de9dc650

Paul Blakey authored Jun 03, 2018

When destroying the instance, destroy the head rhashtable.

Fixes: 05cd271f ("cls_flower: Support multiple masks per priority")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skbuff.h: drop unneeded <linux/slab.h> · 1f4c7413

Randy Dunlap authored Jun 02, 2018

<linux/skbuff.h> does not use nor need <linux/slab.h>, so drop this
header file from skbuff.h.

<linux/skbuff.h> is currently #included in around 1200 C source and
header files, making it the 31st most-used header file.

Build tested [allmodconfig] on 20 arch-es.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: chelsio: Use zeroing memory allocator instead of allocator/memset · 40434a67

YueHaibing authored Jun 03, 2018

Use dma_zalloc_coherent for allocating zeroed
memory and remove unnecessary memset function.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Fix handling of call quietly cancelled out on server · 1a025028

David Howells authored Jun 03, 2018

Sometimes an in-progress call will stop responding on the fileserver when
the fileserver quietly cancels the call with an internally marked abort
(RX_CALL_DEAD), without sending an ABORT to the client.

This causes the client's call to eventually expire from lack of incoming
packets directed its way, which currently leads to it being cancelled
locally with ETIME.  Note that it's not currently clear as to why this
happens as it's really hard to reproduce.

The rotation policy implemented by kAFS, however, doesn't differentiate
between ETIME meaning we didn't get any response from the server and ETIME
meaning the call got cancelled mid-flow.  The latter leads to an oops when
fetching data as the rotation partially resets the afs_read descriptor,
which can result in a cleared page pointer being dereferenced because that
page has already been filled.

Handle this by the following means:

 (1) Set a flag on a call when we receive a packet for it.

 (2) Store the highest packet serial number so far received for a call
     (bearing in mind this may wrap).

 (3) If, when the "not received anything recently" timeout expires on a
     call, we've received at least one packet for a call and the connection
     as a whole has received packets more recently than that call, then
     cancel the call locally with ECONNRESET rather than ETIME.

     This indicates that the call was definitely in progress on the server.

 (4) In kAFS, if the rotation algorithm sees ECONNRESET rather than ETIME,
     don't try the next server, but rather abort the call.

     This avoids the oops as we don't try to reuse the afs_read struct.
     Rather, as-yet ungotten pages will be reread at a later date.

Also:

 (5) Add an rxrpc tracepoint to log detection of the call being reset.

Without this, I occasionally see an oops like the following:

    general protection fault: 0000 [#1] SMP PTI
    ...
    RIP: 0010:_copy_to_iter+0x204/0x310
    RSP: 0018:ffff8800cae0f828 EFLAGS: 00010206
    RAX: 0000000000000560 RBX: 0000000000000560 RCX: 0000000000000560
    RDX: ffff8800cae0f968 RSI: ffff8800d58b3312 RDI: 0005080000000000
    RBP: ffff8800cae0f968 R08: 0000000000000560 R09: ffff8800ca00f400
    R10: ffff8800c36f28d4 R11: 00000000000008c4 R12: ffff8800cae0f958
    R13: 0000000000000560 R14: ffff8800d58b3312 R15: 0000000000000560
    FS:  00007fdaef108080(0000) GS:ffff8800ca680000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fb28a8fa000 CR3: 00000000d2a76002 CR4: 00000000001606e0
    Call Trace:
     skb_copy_datagram_iter+0x14e/0x289
     rxrpc_recvmsg_data.isra.0+0x6f3/0xf68
     ? trace_buffer_unlock_commit_regs+0x4f/0x89
     rxrpc_kernel_recv_data+0x149/0x421
     afs_extract_data+0x1e0/0x798
     ? afs_wait_for_call_to_complete+0xc9/0x52e
     afs_deliver_fs_fetch_data+0x33a/0x5ab
     afs_deliver_to_call+0x1ee/0x5e0
     ? afs_wait_for_call_to_complete+0xc9/0x52e
     afs_wait_for_call_to_complete+0x12b/0x52e
     ? wake_up_q+0x54/0x54
     afs_make_call+0x287/0x462
     ? afs_fs_fetch_data+0x3e6/0x3ed
     ? rcu_read_lock_sched_held+0x5d/0x63
     afs_fs_fetch_data+0x3e6/0x3ed
     afs_fetch_data+0xbb/0x14a
     afs_readpages+0x317/0x40d
     __do_page_cache_readahead+0x203/0x2ba
     ? ondemand_readahead+0x3a7/0x3c1
     ondemand_readahead+0x3a7/0x3c1
     generic_file_buffered_read+0x18b/0x62f
     __vfs_read+0xdb/0xfe
     vfs_read+0xb2/0x137
     ksys_read+0x50/0x8c
     do_syscall_64+0x7d/0x1a0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

Note the weird value in RDI which is a result of trying to kmap() a NULL
page pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
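
Steps (1) to (4) above can be sketched as a small decision function. This is a user-space model under stated assumptions, not the rxrpc or kAFS code: struct call_state, its fields, and the use of a serial-number comparison to express "the connection has received packets more recently than the call" are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct call_state {
	bool rx_seen;           /* (1) set once any packet arrived for the call */
	uint32_t call_serial;   /* (2) highest serial received for this call */
	uint32_t conn_serial;   /* highest serial received on the connection */
};

/* Wrap-safe "a is after b" for 32-bit serial numbers. */
bool serial_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* (3) Pick the local error when the "nothing received recently" timer fires. */
int timeout_error(const struct call_state *c)
{
	if (c->rx_seen && serial_after(c->conn_serial, c->call_serial))
		return -ECONNRESET;   /* the call was in progress on the server */
	return -ETIME;                /* no sign the server ever responded */
}

/* (4) Rotation sketch: in this model only ETIME moves on to the next server. */
bool try_next_server(int err)
{
	return err == -ETIME;
}
```

The signed-difference comparison keeps the check valid when the serial counter wraps, which the message notes may happen.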

Allow ethtool to change tun link settings · 4e24f2dd

Chas Williams authored Jun 02, 2018

Let user space set whatever it would like to advertise for the
tun interface.  Preserve the existing defaults.
Signed-off-by: Chas Williams <3chas3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sh_eth-fix-and-clean-up-sh_eth_soft_swap' · 4cd328f8

David S. Miller authored Jun 04, 2018

Sergei Shtylyov says:

====================
sh_eth: fix & clean up sh_eth_soft_swap()

Here's a set of 3 patches against DaveM's 'net-next.git' repo. The first one fixes an
old buffer endianness issue (luckily, the ARM SoCs are smart enough to not actually
care), plus a couple of cleanups around sh_eth_soft_swap()...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: use DIV_ROUND_UP() in sh_eth_soft_swap() · 1100149a

Sergei Shtylyov authored Jun 02, 2018

When initializing 'maxp' in sh_eth_soft_swap(), the buffer length needs
to be rounded up -- that's just asking for DIV_ROUND_UP()!
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
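
For reference, a user-space sketch of the rounding this refers to. The macro below has the same shape as the kernel's DIV_ROUND_UP(); words_for_len() is a hypothetical helper, not a function from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the kernel macro of this name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of 32-bit words needed to cover len bytes. */
size_t words_for_len(size_t len)
{
	return DIV_ROUND_UP(len, sizeof(uint32_t));
}
```

A 5-byte buffer therefore spans 2 words, so a word-wise loop must also process the partially filled last word.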

sh_eth: uninline sh_eth_soft_swap() · bb2fa4e8

Sergei Shtylyov authored Jun 02, 2018

sh_eth_soft_swap() is called twice by the driver, remove *inline* and
move that function from the header to the driver itself to let gcc decide
whether to expand it inline or not...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: make sh_eth_soft_swap() work on ARM · 232b6743

Sergei Shtylyov authored Jun 02, 2018

Browsing thru the driver disassembly, I noticed that ARM gcc generated
no code whatsoever for sh_eth_soft_swap() while building a little-endian
kernel -- apparently __LITTLE_ENDIAN__ was not being #define'd, however
it got implicitly #define'd when building with the SH gcc (I could only
find the explicit #define __LITTLE_ENDIAN that was #include'd when building
a little-endian kernel). Luckily, the Ether controller only doing big-
endian DMA is encountered on the early SH771x SoCs only and all ARM SoCs
implement EDMR.DE and thus set 'sh_eth_cpu_data::hw_swap'. But anyway, we
need to fix the #ifdef inside sh_eth_soft_swap() to something that would
work on all architectures...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
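
The failure mode (an #ifdef on a macro that one toolchain never defines, so the body compiles to nothing) can be illustrated in user space. The sketch below is not the driver code: it tests the __BYTE_ORDER__ macro predefined by GCC and Clang, which is an assumption chosen for portability here, whereas the kernel has its own endianness macros. swap32() and soft_swap() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-swap one 32-bit value. */
uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Swap every 32-bit word of buf on little-endian hosts; do nothing otherwise.
 * buf must have room for len rounded up to a multiple of 4 bytes.
 */
void soft_swap(unsigned char *buf, size_t len)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	size_t words = (len + sizeof(uint32_t) - 1) / sizeof(uint32_t);

	for (size_t i = 0; i < words; i++) {
		uint32_t w;

		memcpy(&w, buf + 4 * i, sizeof(w));   /* avoids unaligned access */
		w = swap32(w);
		memcpy(buf + 4 * i, &w, sizeof(w));
	}
#else
	(void)buf;
	(void)len;
#endif
}
```

If the condition named a macro the compiler does not define, the function would silently become a no-op, which is the situation the message describes for the ARM build.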

ixgbe: fix broken ipsec Rx with proper cast on spi · 9a75fa5c

Shannon Nelson authored May 31, 2018

Fix up a cast problem introduced by a sparse cleanup patch.  This fixes
a problem where the encrypted packets were not recognized on Rx and
subsequently dropped.

Fixes: 9cfbfa70 ("ixgbe: cleanup sparse warnings")
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: check ipsec ip addr against mgmt filters · 2a8a1552

Shannon Nelson authored May 30, 2018

Make sure we don't try to offload the decryption of an incoming
packet that should get delivered to the management engine.  This
is a corner case that will likely be very seldom seen, but could
really confuse someone if they were to hit it.
Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Merge branch 'mlxsw-Fixes-in-offloading-of-mirror-to-gretap' · 20677108

David S. Miller authored Jun 04, 2018

Ido Schimmel says:

====================
mlxsw: Fixes in offloading of mirror-to-gretap

Petr says:

These two patches fix issues in offloading of mirror-to-gretap when
bridge is present in the underlay.

In patch #1, reconsideration of SPAN configuration is not done right at
the point that SWITCHDEV_OBJ_ID_PORT_VLAN deletion notification is
distributed, but is postponed, because the notifications are actually
distributed before the relevant change is implemented in the bridge.

Patch #2 fixes a problem in configuring VLAN tagging in situations where a
VLAN device is on top of an 802.1Q bridge whose egress port is marked as
"egress untagged". In that case, mlxsw would neglect to suppress the
tagging implicitly assumed after the VLAN device was seen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_span: Suppress VLAN on BRIDGE_VLAN_INFO_UNTAGGED · 1fc68bb7

Petr Machata authored Jun 02, 2018

When offloading mirroring to gretap or ip6gretap netdevices, an 802.1q
bridge is one of the soft devices permissible in the underlay when
resolving the packet path. After the packet path is resolved to a
particular bridge egress device, flags on packet VLAN determine whether
the egressed packet should be tagged.

The current logic however only ever sets the VLAN tag, never suppresses
it. Thus if there's a VLAN netdevice above the bridge that determines
the packet VLAN, that VLAN is never unset, and mirroring is configured
with VLAN tagging.

Fix by setting the packet VLAN on both branches: set to zero (for unset)
when BRIDGE_VLAN_INFO_UNTAGGED, copy the resolved VLAN (e.g. from bridge
PVID) otherwise.

Fixes: 946a11e7 ("mlxsw: spectrum_span: Allow bridge for gretap mirror")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
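
The "set on both branches" fix can be sketched as follows. This is a user-space model, not the mlxsw code: the flag constant is a stand-in defined for this sketch (the kernel provides BRIDGE_VLAN_INFO_UNTAGGED itself), and both function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in flag for this sketch only. */
#define SKETCH_VLAN_INFO_UNTAGGED (1u << 2)

/* Old behaviour per the message: only ever sets, so a stale VID survives. */
void set_vid_old(uint16_t *vid, uint16_t resolved, unsigned int flags)
{
	if (!(flags & SKETCH_VLAN_INFO_UNTAGGED))
		*vid = resolved;
}

/* Fixed behaviour: zero (unset) when untagged, the resolved VID otherwise. */
void set_vid_fixed(uint16_t *vid, uint16_t resolved, unsigned int flags)
{
	*vid = (flags & SKETCH_VLAN_INFO_UNTAGGED) ? 0 : resolved;
}
```

If a VLAN netdevice above the bridge already set the VID to 10 and the egress port is untagged, the old variant leaves 10 in place while the fixed variant clears it.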

mlxsw: spectrum_switchdev: Postpone respin on object deletion · f07ff014

Petr Machata authored Jun 02, 2018

VLAN deletion notifications are emitted before the relevant change is
projected to bridge configuration. Thus, like with VLAN addition,
schedule SPAN respin for later.

Fixes: c520bc69 ("mlxsw: Respin SPAN on switchdev events")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: fix possible race in reset subtask · 88adce4e

Tony Nguyen authored May 30, 2018

Similar to ixgbevf, the same possibility for race exists. Extend the RTNL
lock in ixgbe_reset_subtask() to protect the state bits; this is to make
sure that we get the most up-to-date values for the bits and avoid a
possible race when going down.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

bpf, i40e: add meta data support · cc5b114d

Daniel Borkmann authored May 28, 2018

Add support for XDP meta data when using build skb variant of
the i40e driver. Implementation is analogous to the existing
ixgbe and ixgbevf support for meta data from 366a88fe ("bpf,
ixgbe: add meta data support") and be833332 ("ixgbevf: Add
support for meta data"). With the build skb variant we get
192 bytes of extra headroom which can be used for encaps or
meta data.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ipv6: omit traffic class when calculating flow hash · fa1be7e0

Michal Kubecek authored Jun 04, 2018

Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between paths. It's also inconsistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.

Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().

Fixes: 51ebd318 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018 ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
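
A sketch of why hashing on flow info rather than flow label makes the key depend on traffic class. The first 32 bits of an IPv6 header hold 4 bits of version, 8 bits of traffic class and 20 bits of flow label. The code below works in host byte order for readability, which is an assumption of this sketch (the kernel's masks are in network byte order), and all names are hypothetical.

```c
#include <stdint.h>

#define FLOWINFO_MASK  0x0fffffffu   /* traffic class + flow label */
#define FLOWLABEL_MASK 0x000fffffu   /* flow label only */

/* Build the first header word from a traffic class and a flow label. */
uint32_t make_word(uint8_t tclass, uint32_t label)
{
	return (6u << 28) | ((uint32_t)tclass << 20) | (label & FLOWLABEL_MASK);
}

/* Key that still contains the traffic class bits. */
uint32_t key_from_flowinfo(uint32_t w)
{
	return w & FLOWINFO_MASK;
}

/* Key restricted to the 20-bit flow label. */
uint32_t key_from_flowlabel(uint32_t w)
{
	return w & FLOWLABEL_MASK;
}
```

Two packets of one connection that differ only in traffic class get different keys from the first function and the same key from the second, so only the second keeps the ECMP path stable.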

ixgbe: introduce a helper to simplify code · e9c72183

YueHaibing authored May 23, 2018

ixgbe_dbg_reg_ops_read and ixgbe_dbg_netdev_ops_read contain the same
copy-pasted code except for ixgbe_dbg_netdev_ops_buf/ixgbe_dbg_reg_ops_buf,
so introduce a helper ixgbe_dbg_common_ops_read to remove the redundant code.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Revert "ipv6: omit traffic class when calculating flow hash" · a925ab48

David S. Miller authored Jun 04, 2018

This reverts commit 87ae68c8.

Applied the wrong version of this fix, correct version
coming up.
Signed-off-by: David S. Miller <davem@davemloft.net>
