Commits · 59fc137ebdd1a93bfec991a1c8dd96002433b2e9 · Kirill Smelkov / linux

19 Dec, 2018 9 commits

Merge branch 'vxlan-Various-fixes' · 59fc137e

David S. Miller authored Dec 18, 2018

Petr Machata says:

====================
vxlan: Various fixes

This patch set contains three fixes for the vxlan driver.

Patch #1 fixes handling of offload mark on replaced VXLAN FDB entries. A
way to trigger this is to replace the FDB entry with one that can not be
offloaded. A future patch set should make it possible to veto such FDB
changes. However the FDB might still fail to be offloaded due to another
issue, and the offload mark should reflect that.

Patch #2 fixes problems in __vxlan_dev_create() when a call to
rtnl_configure_link() fails. These failures would be tricky to hit on a
real system; the most likely vector is through an error in vxlan_open().
However, with the abovementioned vetoing patchset, vetoing the created
entry would trigger the same problems (and be easier to reproduce).

Patch #3 fixes a problem in vxlan_changelink(). In situations where the
default remote configured in the FDB table (if any) does not exactly
match the remote address configured at the VXLAN device, changing the
remote address breaks the default FDB entry. Patch #4 is then a self
test for this issue.

v3:
- Patch #2:
    - Reuse the same errout block for both cleanup paths. Use a bool to
      decide whether the unregister_netdevice() call should be made.

v2:
- Drop former patch #3
- Patch #2:
    - Delete the default entry before calling unregister_netdevice(). That
      takes care of former patch #3, hence tweak the commit message to
      mention that problem as well.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: net: Add test_vxlan_fdb_changelink.sh · 55cbe079

Petr Machata authored Dec 18, 2018

Add a test to exercise the fix from the previous patch.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: changelink: Fix handling of default remotes · ce5e098f

Petr Machata authored Dec 18, 2018

Default remotes are stored as FDB entries with an Ethernet address of
00:00:00:00:00:00. When a request is made to change a remote address of
a VXLAN device, vxlan_changelink() first deletes the existing default
remote, and then creates a new FDB entry.

This works well as long as the list of default remotes matches exactly
the configuration of a VXLAN remote address. Thus when the VXLAN device
has a remote of X, there should be exactly one default remote FDB entry
X. If the VXLAN device has no remote address, there should be no such
entry.

Besides using "ip link set", it is possible to manipulate the list of
default remotes by using "bridge fdb". It is therefore easy to break
the above condition. Under such circumstances, the __vxlan_fdb_delete()
call doesn't delete the FDB entry itself, but just one remote. The
following vxlan_fdb_create() then creates a new FDB entry, leading to a
situation where two entries exist for the address 00:00:00:00:00:00,
each with a different subset of default remotes.

An even more obvious breakage rooted in the same cause can be observed
when a remote address is configured for a VXLAN device that did not have
one before. In that case vxlan_changelink() doesn't remove any remote,
and just creates a new FDB entry for the new address:

$ ip link add name vx up type vxlan id 2000 dstport 4789
$ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
$ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
$ ip link set dev vx type vxlan remote 192.0.2.30
$ bridge fdb sh dev vx | grep 00:00:00:00:00:00
00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
00:00:00:00:00:00 dst 192.0.2.30 self permanent

To fix this, instead of calling vxlan_fdb_create() directly, defer to
vxlan_fdb_update(). That has logic to handle the duplicates properly.
Additionally, it also handles notifications, so drop that call from
changelink as well.

Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: Fix error path in __vxlan_dev_create() · 6db92468

Petr Machata authored Dec 18, 2018

When a failure occurs in rtnl_configure_link(), the current code
calls unregister_netdevice() to roll back the earlier call to
register_netdevice(), and jumps to errout, which calls
vxlan_fdb_destroy().

However, unregister_netdevice() transitively calls ndo_uninit, which is
vxlan_uninit(), and that already takes care of deleting the default FDB
entry by calling vxlan_fdb_delete_default(). Since the entry added
earlier in __vxlan_dev_create() is exactly the default entry, the
cleanup code in the errout block always leads to a double free and thus a
panic.

Besides, since vxlan_fdb_delete_default() always destroys the FDB entry
with notification enabled, the deletion of the default entry is notified
even before the addition was notified.

Instead, move the unregister_netdevice() call after the manual destroy,
which solves both problems.
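
The corrected cleanup order can be sketched in userspace C. This is only an illustration, not the driver code: every `*_sketch` name, the `fail_step` parameter and the two globals are hypothetical stand-ins, and the order of steps inside the function is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>

static bool default_entry;    /* true while the default FDB entry exists */
static int unregister_calls;  /* how often the device was unregistered */

/* Stand-in for the manual destroy; a second call would be the double free. */
static void fdb_destroy_sketch(void)
{
	assert(default_entry);
	default_entry = false;
}

/* Stand-in for unregister: uninit deletes the default entry if still there. */
static void unregister_sketch(void)
{
	if (default_entry)
		fdb_destroy_sketch();
	unregister_calls++;
}

/* fail_step: 0 = success, 1 = registration fails, 2 = link configuration fails. */
static int dev_create_sketch(int fail_step)
{
	bool unregister = false;
	int err;

	unregister_calls = 0;
	default_entry = true;          /* default FDB entry created */

	if (fail_step == 1) {
		err = -1;
		goto errout;
	}
	unregister = true;             /* device is now registered */

	if (fail_step == 2) {
		err = -2;
		goto errout;
	}
	return 0;

errout:
	fdb_destroy_sketch();          /* manual destroy first ... */
	if (unregister)
		unregister_sketch();   /* ... then roll back the registration */
	return err;
}
```

Because the entry is already gone when the unregister stand-in runs, its own delete step finds nothing to free, which is the property the single errout block relies on.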

Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: Unmark offloaded bit on replaced FDB entries · 6ad0b5a4

Petr Machata authored Dec 18, 2018

When rdst of an offloaded FDB entry is replaced, it certainly isn't
offloaded anymore. Drivers are notified about such replacements, and can
re-mark the entry as offloaded again if they so wish. However until a
driver does so explicitly, assume a replaced FDB entry is not offloaded.

Note that replaces coming via vxlan_fdb_external_learn_add() are always
immediately followed by an explicit offload marking.

Fixes: 0efe1173 ("vxlan: Support marking RDSTs as offloaded")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'macb-DMA-race-fixes' · a9d6d897

David S. Miller authored Dec 18, 2018

Anssi Hannula says:

====================
net: macb: DMA race condition fixes

Here are a couple of race condition fixes for the macb driver. The first
two are for issues observed at runtime on real HW.

v2:
- added received Tested-bys and Acked-bys to the first two patches
- in patch 3/3, moved the timestamp protection barrier closer to the
  timestamp reads
- in patch 3/3, removed unnecessary move of the addr assignment in
  gem_rx() to keep the patch minimal for maximum clarity
- in patch 3/3, clarified commit message and comments

The 3/3 is the same one I improperly sent last week as a standalone
patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: add missing barriers when reading descriptors · 6e0af298

Anssi Hannula authored Dec 17, 2018

When reading buffer descriptors on RX or on TX completion, an
RX_USED/TX_USED bit is checked first to ensure that the descriptors have
been populated, i.e. the ownership has been transferred. However, there
are no memory barriers to ensure that the data protected by the
RX_USED/TX_USED bit is up-to-date with respect to that bit.

Specifically:

- TX timestamp descriptors may be loaded before ctrl is loaded for the
  TX_USED check, which is racy as the descriptors may be updated between
  the loads, causing old timestamp descriptor data to be used.

- RX ctrl may be loaded before addr is loaded for the RX_USED check,
  which is racy as a new frame may be written between the loads, causing
  old ctrl descriptor data to be used.
  This issue exists for both macb_rx() and gem_rx() variants.

Fix the races by adding DMA read memory barriers on those paths and
reordering the reads in macb_rx().

I have not observed any actual problems in practice caused by these
being missing, though.

Tested on a ZynqMP based system.
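
The read-side ordering for the RX case can be illustrated with a userspace analogy. This is a sketch under assumptions, not the driver code: the descriptor layout and all `*_sketch` names are hypothetical, and a C11 acquire fence stands in for the kernel's DMA read barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_USED_SKETCH 0x1u  /* ownership flag in bit 0 of addr */

struct rx_desc_sketch {
	_Atomic uint32_t addr;  /* carries the RX_USED flag */
	uint32_t ctrl;          /* data protected by the flag */
};

/* Returns true and copies ctrl only if the descriptor has been handed over. */
static bool rx_read_sketch(struct rx_desc_sketch *d, uint32_t *ctrl_out)
{
	assert(d && ctrl_out);

	/* Check the ownership flag first. */
	uint32_t addr = atomic_load_explicit(&d->addr, memory_order_relaxed);
	if (!(addr & RX_USED_SKETCH))
		return false;

	/* Keep the ctrl load from being ordered before the flag load. */
	atomic_thread_fence(memory_order_acquire);

	*ctrl_out = d->ctrl;
	return true;
}
```

The point of the sketch is the position of the fence: between the flag check and the read of the data that the flag protects.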

Fixes: 89e5785f ("[PATCH] Atmel MACB ethernet driver")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: fix dropped RX frames due to a race · 8159ecab

Anssi Hannula authored Dec 17, 2018

Bit RX_USED set to 0 in the address field allows the controller to write
data to the receive buffer descriptor.

The driver does not ensure the ctrl field is ready (cleared) when the
controller sees the RX_USED=0 written by the driver. The ctrl field might
only be cleared after the controller has already updated it according to
a newly received frame, causing the frame to be discarded in gem_rx() due
to unexpected ctrl field contents.

A message is logged when the above scenario occurs:

  macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor

Fix the issue by ensuring that when the controller sees RX_USED=0 the
ctrl field is already cleared.

This issue was observed on a ZynqMP based system.

Fixes: 4df95131 ("net/macb: change RX path for GEM")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: fix random memory corruption on RX with 64-bit DMA · e100a897

Anssi Hannula authored Dec 17, 2018

64-bit DMA addresses are split in upper and lower halves that are
written in separate fields on GEM. For RX, bit 0 of the address is used
as the ownership bit (RX_USED). When the RX_USED bit is unset the
controller is allowed to write data to the buffer.

The driver does not guarantee that the controller already sees the upper
half when the RX_USED bit is cleared, possibly resulting in the
controller writing an incoming frame to an address with an incorrect
upper half and therefore possibly corrupting unrelated system memory.

Fix that by adding the necessary DMA memory barrier between the writes.

This corruption was observed on a ZynqMP based system.
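
The write-side ordering can be illustrated with a userspace analogy. This is a sketch under assumptions, not the driver code: the two-field descriptor and all `*_sketch` names are hypothetical, and a C11 release fence stands in for the kernel's DMA write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RX_USED_SKETCH 0x1u  /* ownership flag in bit 0 of the lower half */

struct rx_desc64_sketch {
	_Atomic uint32_t addr;  /* lower half, bit 0 = RX_USED */
	uint32_t addrh;         /* upper half */
};

/* Hand a buffer to the controller: upper half first, ownership bit last. */
static void rx_give_buffer_sketch(struct rx_desc64_sketch *d, uint64_t dma)
{
	assert(d && !(dma & RX_USED_SKETCH));  /* bit 0 must be free for the flag */

	d->addrh = (uint32_t)(dma >> 32);

	/* Make the upper half visible before the flag is cleared. */
	atomic_thread_fence(memory_order_release);

	/* Writing the lower half with bit 0 clear transfers ownership. */
	atomic_store_explicit(&d->addr, (uint32_t)dma, memory_order_relaxed);
}
```

For example, handing over the address `0x1234567890` leaves `0x12` in the upper field before `0x34567890` lands in the lower one.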

Fixes: fff8019a ("net: macb: Add 64 bit addressing support for GEM")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Acked-by: Harini Katakam <harini.katakam@xilinx.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Dec, 2018 16 commits

net: Use __kernel_clockid_t in uapi net_stamp.h · e2c4cf7f

Davide Caratti authored Dec 17, 2018

Herton reports the following error when building a userspace program that
includes net_tstamp.h:

 In file included from foo.c:2:
 /usr/include/linux/net_tstamp.h:158:2: error: unknown type name
 ‘clockid_t’
   clockid_t clockid; /* reference clockid */
   ^~~~~~~~~

Fix it by using __kernel_clockid_t in place of clockid_t.
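
The effect of the change can be illustrated with a small userspace sketch. It assumes the Linux uapi headers are installed; the struct and helper names are hypothetical and only mimic the style of a uapi header that must compile without `<time.h>`.

```c
#include <assert.h>
#include <linux/types.h>  /* __u32 and __kernel_clockid_t, no <time.h> needed */

/* Hypothetical struct in the style of a uapi header. */
struct txtime_sketch {
	__kernel_clockid_t clockid;  /* reference clockid */
	__u32 flags;
};

/* Build a value; the type comes from the kernel headers, not from libc. */
static struct txtime_sketch make_txtime_sketch(int clockid, unsigned int flags)
{
	struct txtime_sketch t = { .clockid = clockid, .flags = flags };

	assert(clockid >= 0);
	return t;
}
```

With `clockid_t` in place of `__kernel_clockid_t`, the same file would depend on the including program having pulled in a libc header that defines that type.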

Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.")
Cc: Timothy Redaelli <tredaelli@redhat.com>
Reported-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Tested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: restart tx after tx used bit read · 42983885

Claudiu Beznea authored Dec 17, 2018

On some platforms (currently detected only on SAMA5D4) TX might get stuck
even though packets are still present in DMA memory and TX start was
issued for them. This happens due to a race condition between the MACB driver
updating the next TX buffer descriptor to be used and the IP reading the same
descriptor. In such a case, the "TX USED BIT READ" interrupt is asserted.
The GEM/MACB user guide specifies that if a "TX USED BIT READ" interrupt
is asserted, TX must be restarted. Restart TX if the used bit is read and
packets are present in the software TX queue. Packets are removed from the software
TX queue if TX was successful for them (see macb_tx_interrupt()).
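
The restart condition can be sketched as a small predicate. This is an illustration only: the bit position, the queue layout and all `*_sketch` names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ISR_TXUBR_SKETCH (1u << 3)  /* hypothetical "TX USED BIT READ" bit */

/* Hypothetical software TX queue: entries between tail and head are pending. */
struct txq_sketch {
	unsigned int head;
	unsigned int tail;
};

/* Restart TX only if the interrupt fired and packets are still queued. */
static bool tx_needs_restart_sketch(uint32_t isr, const struct txq_sketch *q)
{
	assert(q);

	if (!(isr & ISR_TXUBR_SKETCH))
		return false;

	return q->head != q->tail;
}
```

An empty queue yields no restart even when the interrupt is asserted, which matches the condition stated above.
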
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: Fix an error code in probe() · b26322d2

Dan Carpenter authored Dec 17, 2018

The function should return an error if create_singlethread_workqueue()
fails.

Fixes: 34877a15 ("net: stmmac: Rework and fix TX Timeout code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: check group dests after tipc_wait_for_cond() · 3c6306d4

Cong Wang authored Dec 16, 2018

Similar to commit 143ece65 ("tipc: check tsk->group in tipc_wait_for_cond()")
we have to reload grp->dests too after we re-take the sock lock.
This means we need to move the dsts check after tipc_wait_for_cond()
too.

Fixes: 75da2163 ("tipc: introduce communication groups")
Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Fix an error code qed_ll2_start_xmit() · f07d4276

Dan Carpenter authored Dec 17, 2018

We accidentally deleted the code to set "rc = -ENOMEM;" and this patch
adds it back.

Fixes: d2201a21 ("qed: No need for LL2 frags indication")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvpp2: 10G modes aren't supported on all ports · 00679177

Antoine Tenart authored Dec 11, 2018

The mvpp2_phylink_validate() function sets all modes that are
supported by a given PPv2 port. A recent change made all ports
advertise support for 10G modes in certain cases. This is not true,
as only port #0 can do so. This patch fixes it.

Fixes: 01b3fd5a ("net: mvpp2: fix detection of 10G SFP modules")
Cc: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

VSOCK: Send reset control packet when socket is partially bound · a915b982

Jorgen Hansen authored Dec 18, 2018

If a server side socket is bound to an address, but not in the listening
state yet, incoming connection requests should receive a reset control
packet in response. However, the function used to send the reset
silently drops the reset packet if the sending socket isn't bound
to a remote address (as is the case for a bound socket not yet in
the listening state). This change fixes this by using the src
of the incoming packet as destination for the reset packet in
this case.
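
The destination choice described above can be sketched as follows. This is an illustration under assumptions: the address struct, the "unbound" marker value and all `*_sketch` names are hypothetical and not the vsock API.

```c
#include <assert.h>
#include <stdbool.h>

#define CID_UNBOUND_SKETCH 0xffffffffu  /* hypothetical "no address" marker */

struct vsock_addr_sketch {
	unsigned int cid;
	unsigned int port;
};

static bool addr_bound_sketch(const struct vsock_addr_sketch *a)
{
	return a->cid != CID_UNBOUND_SKETCH;
}

/*
 * Pick the reset destination: the socket's remote address if it has one,
 * otherwise the source of the incoming packet.
 */
static struct vsock_addr_sketch
reset_dst_sketch(const struct vsock_addr_sketch *remote,
		 const struct vsock_addr_sketch *pkt_src)
{
	assert(remote && pkt_src);

	return addr_bound_sketch(remote) ? *remote : *pkt_src;
}
```

For a socket that is bound locally but not yet listening, the remote address is unset, so the reset goes back to whoever sent the connection request instead of being dropped.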

Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · fde9cd69

David S. Miller authored Dec 18, 2018

Steffen Klassert says:

====================
pull request (net): ipsec 2018-12-18

1) Fix error return code in xfrm_output_one()
   when no dst_entry is attached to the skb.
   From Wei Yongjun.

2) The xfrm state hash bucket count reported to
   userspace is off by one. Fix from Benjamin Poirier.

3) Fix NULL pointer dereference in xfrm_input when
   skb_dst_force clears the dst_entry.

4) Fix freeing of xfrm states on acquire. We use a
   dedicated slab cache for the xfrm states now,
   so free it properly with kmem_cache_free.
   From Mathias Krause.

Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-VXLAN-and-firmware-flashing-fixes' · 8d013b79

David S. Miller authored Dec 18, 2018

Ido Schimmel says:

====================
mlxsw: VXLAN and firmware flashing fixes

Patch #1 fixes firmware flashing failures by increasing the time period
after which the driver fails the transaction with the firmware. The
problem is explained in detail in the commit message.

Patch #2 adds a missing trap for decapsulated ARP packets. It is
necessary for VXLAN routing to work.

Patch #3 fixes a memory leak during driver reload caused by NULLing a
pointer before kfree().

Please consider patch #1 for 4.19.y
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_nve: Fix memory leak upon driver reload · 5edb7e8b

Ido Schimmel authored Dec 18, 2018

The pointer was NULLed before freeing the memory, resulting in a memory
leak. Trace from kmemleak:

unreferenced object 0xffff88820ae36528 (size 512):
  comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330
    [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0
    [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690
    [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260
    [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360
    [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220
    [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0
    [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150
    [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180
    [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0
    [<00000000efc4eae8>] genl_rcv+0x2d/0x40
    [<00000000ea645603>] netlink_unicast+0x52f/0x740
    [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50
    [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120
    [<00000000d85795a9>] __sys_sendto+0x397/0x620
    [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0
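
The ordering bug behind this leak can be illustrated with a userspace sketch. The struct and function names are hypothetical, and `calloc()`/`free()` stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

struct nve_sketch {
	int placeholder;
};

struct sp_sketch {
	struct nve_sketch *nve;
};

static int sp_nve_init_sketch(struct sp_sketch *sp)
{
	assert(sp);
	sp->nve = calloc(1, sizeof(*sp->nve));
	return sp->nve ? 0 : -1;
}

static void sp_nve_fini_sketch(struct sp_sketch *sp)
{
	assert(sp);
	/*
	 * Free first, then clear. In the reverse order free() would receive
	 * NULL, do nothing, and the allocation would be lost.
	 */
	free(sp->nve);
	sp->nve = NULL;
}
```

Since freeing a NULL pointer is a silent no-op, the reversed order produces no error at the call site, only the leak reported later.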

Fixes: 6e6030bd ("mlxsw: spectrum_nve: Implement common NVE core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Add trap for decapsulated ARP packets · 5d504391

Ido Schimmel authored Dec 18, 2018

After a packet is decapsulated, it is classified to the relevant FID
based on its VNI and undergoes L2 forwarding.

Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap
decapsulated ARP packets during L2 forwarding and instead can only trap
such packets in the underlay router during decapsulation.

Add this missing packet trap, which is required for VXLAN routing when
the MAC of the target host is not known.

Fixes: b02597d5 ("mlxsw: spectrum: Add NVE packet traps")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: core: Increase timeout during firmware flash process · cf0b70e7

Shalom Toledo authored Dec 18, 2018

During the firmware flash process, some of the EMADs get timed out, which
causes the driver to send them again with a limit of 5 retries. There are
some situations in which 5 retries is not enough and the EMAD access fails.
If the failed EMAD was related to the flashing process, the driver fails
the flashing.

The reason for these timeouts during firmware flashing is cache misses in
the CPU running the firmware. In case the CPU needs to fetch instructions
from the flash when a firmware is flashed, it needs to wait for the
flashing to complete. Since flashing takes time, it is possible for pending
EMADs to time out.

Fix by increasing EMADs' timeout while flashing firmware.

Fixes: ce6ef68f ("mlxsw: spectrum: Implement the ethtool flash_device callback")
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: set ethtool regs version · a5f39326

Vivien Didelot authored Dec 17, 2018

Currently the ethtool_regs version is set to 0 for all DSA drivers.

Use this field to store the chip ID to simplify the pretty dump of
any interfaces registered by the "dsa" driver.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-SO_TIMESTAMPING-fixes' · b3329901

David S. Miller authored Dec 17, 2018

Willem de Bruijn says:

====================
net: SO_TIMESTAMPING fixes

Fix two omissions:

- tx timestamping is missing for AF_INET6/SOCK_RAW/IPPROTO_RAW
- SOF_TIMESTAMPING_OPT_ID is missing for IPPROTO_RAW, PF_PACKET, CAN

Discovered while expanding the selftest in

  tools/testing/selftests/networking/timestamping/txtimestamp.c

Will send the test patchset to net-next once the fixes make it to that
branch. For now, it is available at

  https://github.com/wdebruij/linux/commits/txtimestamp-test-1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add missing SOF_TIMESTAMPING_OPT_ID support · 8f932f76

Willem de Bruijn authored Dec 17, 2018

SOF_TIMESTAMPING_OPT_ID is supported on TCP, UDP and RAW sockets.
But it was missing on RAW with IPPROTO_IP, PF_PACKET and CAN.

Add skb_setup_tx_timestamp that configures both tx_flags and tskey
for these paths that do not need corking or use bytestream keys.

Fixes: 09c2d251 ("net-timestamp: add key to disambiguate concurrent datagrams")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: add missing tx timestamping on IPPROTO_RAW · fbfb2321

Willem de Bruijn authored Dec 17, 2018

Raw sockets support tx timestamping, but one case is missing.

IPPROTO_RAW takes a separate packet construction path. raw_send_hdrinc
has an explicit call to sock_tx_timestamp, but rawv6_send_hdrinc does
not. Add it.

Fixes: 11878b40 ("net-timestamp: SOCK_RAW and PING timestamping")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Dec, 2018 1 commit

MAINTAINERS: change my email address · 255fe81a

Vivien Didelot authored Dec 17, 2018

Make my Gmail address the primary one from now on.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Dec, 2018 12 commits

net: mvneta: fix operation for 64K PAGE_SIZE · e735fd55

Marcin Wojtas authored Dec 11, 2018

Recent changes in the mvneta driver reworked allocation
and handling of the ingress buffers to use entire pages.
Apart from that, in the SW BM scenario the HW must be informed
via PRXDQS about the biggest possible incoming buffer
that can be propagated by RX descriptors.

The BufferSize field was filled according to the MTU-dependent
pkt_size value. A later change to PAGE_SIZE broke RX operation
when using 64K pages, as the field is simply too small.

This patch conditionally limits the value passed to the BufferSize
field of the PRXDQS register, depending on the PAGE_SIZE used.
While at it, remove the now unused frag_size field of the mvneta_port
structure.

Fixes: 562e2f46 ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hns-fixes' · 369a094d

David S. Miller authored Dec 16, 2018

Peng Li says:

====================
net: hns: Code improvements & fixes for HNS driver

This patchset introduces some code improvements and fixes
for the identified problems in the HNS driver.

Every patch is independent.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Fix ping failed when use net bridge and send multicast · 6adafc35

Yonglong Liu authored Dec 15, 2018

Create a net bridge and add eth and vnet to it. The vnet is used
by a virtual machine. When an outside host pings the virtual machine
while the virtual machine is sending multicast, the ping
packets are lost.

The multicast packet is sent to the eth, which sends it to the bridge as well,
so the bridge learns the MAC of the eth. When the outside host pings the virtual
machine, the packet unexpectedly matches the promisc entry of the eth,
and the bridge sends it to the eth instead of the vnet, so the ping is lost.

This patch moves the promisc TCAM entry to the end of the 512 TCAM
entries, which indicates lower priority, and splits the single promisc entry
into two, mc and uc, so that packets do not match the wrong TCAM entry.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Add mac pcs config when enable|disable mac · 726ae5c9

Yonglong Liu authored Dec 15, 2018

In some cases, enabling or disabling the mac and adjusting the link can make
the link between the mac and the phy hard to establish (or abnormal). This
patch adds rx PCS handling to avoid this bug.

Disable the rx PCS when the driver disables the gmac, and enable the rx PCS
when the driver enables the mac.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Fix ntuple-filters status error. · 7e74a19c

Yonglong Liu authored Dec 15, 2018

The ntuple-filters feature is forced on by the chip,
but ethtool shows "ntuple-filters: off [fixed]".
This patch corrects the output to "ntuple-filters: on [fixed]".
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Avoid net reset caused by pause frames storm · a57275d3

Yonglong Liu authored Dec 15, 2018

A large number of MAC pause frames on the network
can cause a tx timeout on the net device, after which the net device
is reset in an attempt to recover it. That reset does not help and
causes other problems.

Instead, double ndev->watchdog_timeo each time the device watchdog fires
until watchdog_timeo reaches 40s, and only then try a reset to recover
the device.
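
The back-off policy can be sketched as a small helper. This is an illustration under assumptions, not the driver code: the names, the millisecond unit and the clamping to the limit are hypothetical choices, and only the doubling and the 40s limit come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define WATCHDOG_LIMIT_MS_SKETCH 40000u  /* 40s, from the description above */

/*
 * Called on every tx timeout. Returns true when a reset should be tried;
 * otherwise doubles the timeout (clamped to the limit) and returns false.
 */
static bool tx_timeout_sketch(unsigned int *timeo_ms)
{
	assert(timeo_ms && *timeo_ms > 0);

	if (*timeo_ms < WATCHDOG_LIMIT_MS_SKETCH) {
		*timeo_ms *= 2;
		if (*timeo_ms > WATCHDOG_LIMIT_MS_SKETCH)
			*timeo_ms = WATCHDOG_LIMIT_MS_SKETCH;
		return false;
	}

	return true;
}
```

Starting from a hypothetical 5s timeout, three timeouts raise the value to 10s, 20s and 40s, and only the fourth one asks for a reset.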

On tx timeout, dfx information such as hardware registers is collected.
Some counter registers are cleared when read, so move this task
before the net state update, which also reads the counter registers.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Free irq when exit from abnormal branch · c82bd077

Yonglong Liu authored Dec 15, 2018

1. In "hns_nic_init_irq", if requesting an irq fails at index i,
   the function returns directly without releasing the irq resources
   that were already requested.

2. In "hns_nic_net_up", if an error branch is taken after
   "hns_nic_init_irq", the irqs that were already requested
   are not released.
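
The unwinding needed for the first case can be sketched in userspace C. This is an illustration only: the fixed array, the `fail_at` parameter and all `*_sketch` names are hypothetical stand-ins for the kernel irq calls.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_IRQS_SKETCH 4

static bool irq_held[NUM_IRQS_SKETCH];

/* Stand-in for requesting an irq; fails at index fail_at (-1 = never). */
static int request_irq_sketch(int i, int fail_at)
{
	assert(!irq_held[i]);
	if (i == fail_at)
		return -1;
	irq_held[i] = true;
	return 0;
}

/* Stand-in for freeing an irq; freeing one that is not held is a bug. */
static void free_irq_sketch(int i)
{
	assert(irq_held[i]);
	irq_held[i] = false;
}

static int init_irqs_sketch(int fail_at)
{
	int ret = 0;
	int i;

	for (i = 0; i < NUM_IRQS_SKETCH; i++) {
		ret = request_irq_sketch(i, fail_at);
		if (ret)
			goto out_free;
	}
	return 0;

out_free:
	/* Release only the irqs requested before the failing index. */
	while (--i >= 0)
		free_irq_sketch(i);
	return ret;
}

static int irqs_held_sketch(void)
{
	int n = 0;

	for (int i = 0; i < NUM_IRQS_SKETCH; i++)
		n += irq_held[i];
	return n;
}
```

A failure at index 2 releases indexes 1 and 0 before returning, so nothing stays requested after an error.
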
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Clean rx fbd when ae stopped. · 31f6b61d

Yonglong Liu authored Dec 15, 2018

If there are packets in hardware when changing the speed or duplex,
the hardware may hang.

This patch adds code to wait for the rx fbd to be cleaned up when the ae is
stopped.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Fixed bug that netdev was opened twice · 5778b13b

Yonglong Liu authored Dec 15, 2018

After resetting dsaf to try to repair a chip error such as an ecc error,
the net device is opened if the net interface is up. If at the same time
a user brings the net device up with the ifconfig command,
the net device is opened twice in a row.

napi_enable is called when the device is opened, and a kernel panic
occurs if it is called twice in a row, as shown below:
static inline void napi_enable(struct napi_struct *n)
{
         BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
         smp_mb__before_clear_bit();
         clear_bit(NAPI_STATE_SCHED, &n->state);
}

[37255.571996] Kernel panic - not syncing: BUG!
[37255.595234] Call trace:
[37255.597694] [<ffff80000008ab48>] dump_backtrace+0x0/0x1a0
[37255.603114] [<ffff80000008ad08>] show_stack+0x20/0x28
[37255.608187] [<ffff8000009c4944>] dump_stack+0x98/0xb8
[37255.613258] [<ffff8000009c149c>] panic+0x10c/0x26c
[37255.618070] [<ffff80000070f134>] hns_nic_net_up+0x30c/0x4e0
[37255.623664] [<ffff80000070f39c>] hns_nic_net_open+0x94/0x12c
[37255.629346] [<ffff80000084be78>] __dev_open+0xf4/0x168
[37255.634504] [<ffff80000084c1ac>] __dev_change_flags+0x98/0x15c
[37255.640359] [<ffff80000084c29c>] dev_change_flags+0x2c/0x68
[37255.769580] [<ffff8000008dc400>] devinet_ioctl+0x650/0x704
[37255.775086] [<ffff8000008ddc38>] inet_ioctl+0x98/0xb4
[37255.780159] [<ffff800000827b7c>] sock_do_ioctl+0x44/0x84
[37255.785490] [<ffff800000828e04>] sock_ioctl+0x248/0x30c
[37255.790737] [<ffff80000026dc6c>] do_vfs_ioctl+0x480/0x618
[37255.796156] [<ffff80000026de94>] SyS_ioctl+0x90/0xa4
[37255.801139] SMP: stopping secondary CPUs
[37255.805079] kbox: catch panic event.
[37255.809586] collected_len = 128928, LOG_BUF_LEN_LOCAL = 131072
[37255.816103] flush cache 0xffff80003f000000  size 0x800000
[37255.822192] flush cache 0xffff80003f000000  size 0x800000
[37255.828289] flush cache 0xffff80003f000000  size 0x800000
[37255.834378] kbox: no notify die func register. no need to notify
[37255.840413] ---[ end Kernel panic - not syncing: BUG!

This patch fixes the bug by checking the NIC_STATE_DOWN flag.
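
A guard of this kind can be sketched in userspace C. This is an illustration under assumptions, not the driver code: a C11 atomic flag stands in for the NIC_STATE_DOWN bit, and all `*_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool nic_down_sketch = true;  /* stand-in for the DOWN state */
static int napi_enable_calls;               /* counts the guarded step */

/* Bring the device up; a second call while already up does nothing. */
static int net_up_sketch(void)
{
	/* Only the caller that flips "down" from true to false proceeds. */
	if (!atomic_exchange(&nic_down_sketch, false))
		return 0;

	napi_enable_calls++;
	assert(napi_enable_calls == 1);  /* mirrors the BUG_ON in the panic */
	return 0;
}
```

Two opens in a row then reach the guarded step only once.
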
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Some registers use wrong address according to the datasheet. · 4ad26f11

Yonglong Liu authored Dec 15, 2018

According to the hip06 datasheet:
1. Six registers use the wrong address:
   RCB_COM_SF_CFG_INTMASK_RING
   RCB_COM_SF_CFG_RING_STS
   RCB_COM_SF_CFG_RING
   RCB_COM_SF_CFG_INTMASK_BD
   RCB_COM_SF_CFG_BD_RINT_STS
   DSAF_INODE_VC1_IN_PKT_NUM_0_REG
2. The offset of DSAF_INODE_VC1_IN_PKT_NUM_0_REG should be
   0x103C + 0x80 * all_chn_num
3. The offset used to show the value of DSAF_INODE_IN_DATA_STP_DISC_0_REG
   is wrong, so the value of DSAF_INODE_SW_VLAN_TAG_DISC_0_REG is
   overwritten

These registers are only used by "ethtool -d", so the wrong addresses did not
cause the ndev to malfunction.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: All ports can not work when insmod hns ko after rmmod. · 308c6caf

Yonglong Liu authored Dec 15, 2018

There are two test cases:
1. Remove the 4 modules hns_enet_drv/hns_dsaf/hnae/hns_mdio
   and install them again. The "ifconfig down/ifconfig up"
   command pair is then needed to bring a port back to work.

   This patch calls the phy_stop function when initializing the phy to fix
   this bug.

2. Remove the 2 modules hns_enet_drv/hns_dsaf and install them again.
   No port can be used anymore, because registering the phy devices
   fails (the phy devices already exist).

   The phy devices are registered when hns_dsaf is installed; this patch
   removes them when hns_dsaf is removed.

The two cases are sometimes related: fixing the second case also requires
fixing the first case, so fix them together.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns: Incorrect offset address used for some registers. · 4e1d4be6

Yonglong Liu authored Dec 15, 2018

According to the hip06 Datasheet:
1. The offset of INGRESS_SW_VLAN_TAG_DISC should be 0x1A00+4*all_chn_num
2. The offset of INGRESS_IN_DATA_STP_DISC should be 0x1A50+4*all_chn_num
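
The two offset formulas can be written out as small helpers. This is a sketch: the function and macro names are hypothetical, and only the base values and the 4-byte stride come from the list above.

```c
#include <assert.h>
#include <stdint.h>

#define SW_VLAN_TAG_DISC_BASE_SKETCH 0x1A00u
#define IN_DATA_STP_DISC_BASE_SKETCH 0x1A50u

/* Arithmetic check: the two bases are 0x50 bytes, i.e. 20 four-byte slots, apart. */
static_assert(IN_DATA_STP_DISC_BASE_SKETCH - SW_VLAN_TAG_DISC_BASE_SKETCH == 4 * 20,
	      "bases are 20 four-byte slots apart");

/* Offset of INGRESS_SW_VLAN_TAG_DISC for a given channel number. */
static uint32_t sw_vlan_tag_disc_off_sketch(uint32_t all_chn_num)
{
	return SW_VLAN_TAG_DISC_BASE_SKETCH + 4u * all_chn_num;
}

/* Offset of INGRESS_IN_DATA_STP_DISC for a given channel number. */
static uint32_t in_data_stp_disc_off_sketch(uint32_t all_chn_num)
{
	return IN_DATA_STP_DISC_BASE_SKETCH + 4u * all_chn_num;
}
```

For channel 3 the first helper gives 0x1A0C, and for channel 2 the second gives 0x1A58.
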
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Dec, 2018 2 commits

net: clear skb->tstamp in forwarding paths · 8203e2d8

Eric Dumazet authored Dec 14, 2018

Sergey reported that forwarding was no longer working
if fq packet scheduler was used.

This is caused by the recent switch to the EDT model, since incoming
packets might have been timestamped by __net_timestamp().

__net_timestamp() uses ktime_get_real(), while fq expects packets
using CLOCK_MONOTONIC base.

The fix is to clear skb->tstamp in forwarding paths.

Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sergey Matyukevich <geomatsi@gmail.com>
Tested-by: Sergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mod_devicetable.h: correct kerneldoc typo, "PHYSID2" -> "MII_PHYSID2" · 15c6d8e5

Robert P. J. Day authored Dec 13, 2018

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
