Commits · 84e93d999a677ee3229e244e9eb29209c3bb6677 · Kirill Smelkov / linux

31 Oct, 2019 9 commits

wimax: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops · 84e93d99

zhong jiang authored Oct 30, 2019

It is more clear to use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs file
operation rather than DEFINE_SIMPLE_ATTRIBUTE.

It is detected with the help of coccinelle.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

84e93d99

net: dsa: add ethtool pause configuration support · a2a1a13b

Heiner Kallweit authored Oct 29, 2019

This patch adds glue logic to make pause settings per port
configurable vie ethtool.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a2a1a13b

vxlan: drop "vxlan" parameter in vxlan_fdb_alloc() · 1d7a5526

Guillaume Nault authored Oct 29, 2019

This parameter has never been used.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d7a5526

net: phy: marvell: add downshift support for 88E1145 · a319fb52

Heiner Kallweit authored Oct 29, 2019

Add downshift support for 88E1145, it uses the same downshift
configuration registers as 88E1111.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a319fb52

Merge branch 'ICMP-flow-improvements' · 29f52875

David S. Miller authored Oct 30, 2019

Matteo Croce says:

====================
ICMP flow improvements

This series improves the flow inspector handling of ICMP packets:
The first two patches just add some comments in the code which would have saved
me a few minutes of time, and refactor a piece of code.
The third one adds to the flow inspector the capability to extract the
Identifier field, if present, so echo requests and replies are classified
as part of the same flow.
The fourth patch uses the function introduced earlier to the bonding driver,
so echo replies can be balanced across bonding slaves.

v1 -> v2:
 - remove unused struct members
 - add an helper to check for the Id field
 - use a local flow_dissector_key in the bonding to avoid
   changing behaviour of the flow dissector
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

29f52875

bonding: balance ICMP echoes in layer3+4 mode · 58deb77c

Matteo Croce authored Oct 29, 2019

The bonding uses the L4 ports to balance flows between slaves. As the ICMP
protocol has no ports, those packets are sent all to the same device:

    # tcpdump -qltnni veth0 ip |sed 's/^/0: /' &
    # tcpdump -qltnni veth1 ip |sed 's/^/1: /' &
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 315, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 315, seq 1, length 64
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 316, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 316, seq 1, length 64
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 317, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 317, seq 1, length 64

But some ICMP packets have an Identifier field which is
used to match packets within sessions, let's use this value in the hash
function to balance these packets between bond slaves:

    # ping -qc1 192.168.0.2
    0: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 303, seq 1, length 64
    0: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 303, seq 1, length 64
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 304, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 304, seq 1, length 64

Aso, let's use a flow_dissector_key which defines FLOW_DISSECTOR_KEY_ICMP,
so we can balance pings encapsulated in a tunnel when using mode encap3+4:

    # ping -q 192.168.1.2 -c1
    0: IP 192.168.0.1 > 192.168.0.2: GREv0, length 102: IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 585, seq 1, length 64
    0: IP 192.168.0.2 > 192.168.0.1: GREv0, length 102: IP 192.168.1.2 > 192.168.1.1: ICMP echo reply, id 585, seq 1, length 64
    # ping -q 192.168.1.2 -c1
    1: IP 192.168.0.1 > 192.168.0.2: GREv0, length 102: IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 586, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: GREv0, length 102: IP 192.168.1.2 > 192.168.1.1: ICMP echo reply, id 586, seq 1, length 64
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

58deb77c

flow_dissector: extract more ICMP information · 5dec597e

Matteo Croce authored Oct 29, 2019

The ICMP flow dissector currently parses only the Type and Code fields.
Some ICMP packets (echo, timestamp) have a 16 bit Identifier field which
is used to correlate packets.
Add such field in flow_dissector_key_icmp and replace skb_flow_get_be16()
with a more complex function which populate this field.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5dec597e

flow_dissector: skip the ICMP dissector for non ICMP packets · 3b336d6f

Matteo Croce authored Oct 29, 2019

FLOW_DISSECTOR_KEY_ICMP is checked for every packet, not only ICMP ones.
Even if the test overhead is probably negligible, move the
ICMP dissector code under the big 'switch(ip_proto)' so it gets called
only for ICMP packets.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3b336d6f

flow_dissector: add meaningful comments · 98298e6c

Matteo Croce authored Oct 29, 2019

Documents two piece of code which can't be understood at a glance.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

98298e6c

30 Oct, 2019 31 commits

tc-testing: fixed two failing pedit tests · c4917bfc

Roman Mashak authored Oct 30, 2019

Two pedit tests were failing due to incorrect operation
value in matchPattern, should be 'add' not 'val', so fix it.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4917bfc

tipc: add smart nagle feature · c0bceb97

Jon Maloy authored Oct 30, 2019

We introduce a feature that works like a combination of TCP_NAGLE and
TCP_CORK, but without some of the weaknesses of those. In particular,
we will not observe long delivery delays because of delayed acks, since
the algorithm itself decides if and when acks are to be sent from the
receiving peer.

- The nagle property as such is determined by manipulating a new
  'maxnagle' field in struct tipc_sock. If certain conditions are met,
  'maxnagle' will define max size of the messages which can be bundled.
  If it is set to zero no messages are ever bundled, implying that the
  nagle property is disabled.
- A socket with the nagle property enabled enters nagle mode when more
  than 4 messages have been sent out without receiving any data message
  from the peer.
- A socket leaves nagle mode whenever it receives a data message from
  the peer.

In nagle mode, messages smaller than 'maxnagle' are accumulated in the
socket write queue. The last buffer in the queue is marked with a new
'ack_required' bit, which forces the receiving peer to send a CONN_ACK
message back to the sender upon reception.

The accumulated contents of the write queue is transmitted when one of
the following events or conditions occur.

- A CONN_ACK message is received from the peer.
- A data message is received from the peer.
- A SOCK_WAKEUP pseudo message is received from the link level.
- The write queue contains more than 64 1k blocks of data.
- The connection is being shut down.
- There is no CONN_ACK message to expect. I.e., there is currently
  no outstanding message where the 'ack_required' bit was set. As a
  consequence, the first message added after we enter nagle mode
  is always sent directly with this bit set.

This new feature gives a 50-100% improvement of throughput for small
(i.e., less than MTU size) messages, while it might add up to one RTT
to latency time when the socket is in nagle mode.
Acked-by: Ying Xue <ying.xue@windreiver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0bceb97

Merge branch 'mlxsw-Update-firmware-version' · 6c814e8c

David S. Miller authored Oct 30, 2019

Ido Schimmel says:

====================
mlxsw: Update firmware version

This patch set updates the firmware version for Spectrum-1 and enforces
a firmware version for Spectrum-2.

The version adds support for querying port module type. It will be used
by a followup patch set from Jiri to make port split code more generic.

Patch #1 increases the size of an existing register in order to be
compatible with the new firmware version. In the future the firmware
will assign default values to fields not specified by the driver.

Patch #2 temporarily increases the PCI reset timeout for SN3800 systems.
Note that in normal cases the driver will need to wait no longer than 5
seconds for the device to become ready following reset command.

Patch #3 bumps the firmware version for Spectrum-1.

Patch #4 enforces a minimum firmware version for Spectrum-2.

v2:
* Added patch #2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6c814e8c

mlxsw: Enforce firmware version for Spectrum-2 · a72afb68

Ido Schimmel authored Oct 30, 2019

In a similar fashion to Spectrum-1, enforce a specific firmware version
for Spectrum-2 so that the driver and firmware are always in sync with
regards to new features.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a72afb68

mlxsw: Bump firmware version to 13.2000.2308 · 5fd2ef46

Ido Schimmel authored Oct 30, 2019

The version adds support for querying port module type. It will be used
by a followup patch set from Jiri to make port split code more generic.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5fd2ef46

mlxsw: pci: Increase PCI reset timeout for SN3800 systems · ff298839

Ido Schimmel authored Oct 30, 2019

SN3800 Spectrum-2 based systems have gearboxes that need to be
initialized by the firmware during its initialization flow. In certain
cases, the firmware might need to flash these gearboxes, which is
currently a time-consuming process.

In newer firmware versions, the firmware will not signal to the driver
that it is ready until the gearboxes are flashed. Increase the PCI reset
timeout for these situations. In normal cases, the driver will need to
wait no longer than 5 seconds.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff298839

mlxsw: reg: Increase size of MPAR register · 5075066a

Ido Schimmel authored Oct 30, 2019

In new firmware versions this register is extended with a sampling rate
for Spectrum-2 and future ASICs.

Increase the size of the register to ensure the field is initialized to
0 which means every packet is mirrored.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5075066a

Merge branch 'nfc-pn533-add-uart-phy-driver' · 74923441

David S. Miller authored Oct 29, 2019

Lars Poeschel says:

====================
nfc: pn533: add uart phy driver

The purpose of this patch series is to add a uart phy driver to the
pn533 nfc driver.
It first changes the dt strings and docs. The dt compatible strings
need to change, because I would add "pn532-uart" to the already
existing "pn533-i2c" one. These two are now unified into just
"pn532". Then the neccessary changes to the pn533 core driver are
made. Then the uart phy is added.
As the pn532 chip supports a autopoll, I wanted to use this instead
of the software poll loop in the pn533 core driver. It is added and
activated by the last to patches.
The way to add the autopoll later in seperate patches is chosen, to
show, that the uart phy driver can also work with the software poll
loop, if someone needs that for some reason.
In v11 of this patchseries I address a byte ordering issue reported
by kbuild test robot in patch 5/7.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

74923441

nfc: pn532_uart: Make use of pn532 autopoll · e4a5dc18

Lars Poeschel authored Oct 29, 2019

This switches the pn532 UART phy driver from manually polling to the new
autopoll mechanism.

Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4a5dc18

nfc: pn533: Add autopoll capability · c64b875f

Lars Poeschel authored Oct 29, 2019

pn532 devices support an autopoll command, that lets the chip
automatically poll for selected nfc technologies instead of manually
looping through every single nfc technology the user is interested in.
This is faster and less cpu and bus intensive than manually polling.
This adds this autopoll capability to the pn533 driver.

Cc: Johan Hovold <johan@kernel.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

c64b875f

nfc: pn533: add UART phy driver · c656aa4c

Lars Poeschel authored Oct 29, 2019

This adds the UART phy interface for the pn533 driver.
The pn533 driver can be used through UART interface this way.
It is implemented as a serdev device.

Cc: Johan Hovold <johan@kernel.org>
Cc: Claudiu Beznea <Claudiu.Beznea@microchip.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

c656aa4c

nfc: pn533: Split pn533 init & nfc_register · 843cc92e

Lars Poeschel authored Oct 29, 2019

There is a problem in the initialisation and setup of the pn533: It
registers with nfc too early. It could happen, that it finished
registering with nfc and someone starts using it. But setup of the pn533
is not yet finished. Bad or at least unintended things could happen.
So I split out nfc registering (and unregistering) to seperate functions
that have to be called late in probe then.
i2c requires a bit more love: i2c requests an irq in it's probe
function. 'Commit 32ecc75d ("NFC: pn533: change order operations in
dev registation")' shows, this can not happen too early. An irq can be
served before structs are fully initialized. The way chosen to prevent
this is to request the irq after nfc_alloc_device initialized the
structs, but before nfc_register_device. So there is now this
pn532_i2c_nfc_alloc function.

Cc: Johan Hovold <johan@kernel.org>
Cc: Claudiu Beznea <Claudiu.Beznea@microchip.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

843cc92e

nfc: pn533: Add dev_up/dev_down hooks to phy_ops · 0bf2840c

Lars Poeschel authored Oct 29, 2019

This adds hooks for dev_up and dev_down to the phy_ops. They are
optional.
The idea is to inform the phy driver when the nfc chip is really going
to be used. When it is not used, the phy driver can suspend it's
interface to the nfc chip to save some power. The nfc chip is considered
not in use before dev_up and after dev_down.

Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

0bf2840c

nfc: pn532: Add uart phy docs and rename it · 3c57b395

Lars Poeschel authored Oct 29, 2019

This adds documentation about the uart phy to the pn532 binding doc. As
the filename "pn533-i2c.txt" is not appropriate any more, rename it to
the more general "pn532.txt".
This also documents the deprecation of the compatible strings ending
with "...-i2c".

Cc: Johan Hovold <johan@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c57b395

nfc: pn533: i2c: "pn532" as dt compatible string · 3d5f3a67

Lars Poeschel authored Oct 29, 2019

It is favourable to have one unified compatible string for devices that
have multiple interfaces. So this adds simply "pn532" as the devicetree
binding compatible string and makes a note that the old ones are
deprecated.

Cc: Johan Hovold <johan@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d5f3a67

Merge branch 'bridge-fdbs-bitops' · 9014fc31

David S. Miller authored Oct 29, 2019

Nikolay Aleksandrov says:

====================
net: bridge: convert fdbs to use bitops

We'd like to have a well-defined behaviour when changing fdb flags. The
problem is that we've added new fields which are changed from all
contexts without any locking. We are aware of the bit test/change races
and these are fine (we can remove them later), but it is considered
undefined behaviour to change bitfields from multiple threads and also
on some architectures that can result in unexpected results,
specifically when all fields between the changed ones are also
bitfields. The conversion to bitops shows the intent clearly and
makes them use functions with well-defined behaviour in such cases.
There is no overhead for the fast-path, the bit changing functions are
used only in special cases when learning and in the slow path.
In addition this conversion allows us to simplify fdb flag handling and
avoid bugs for future bits (e.g. a forgetting to clear the new bit when
allocating a new fdb). All bridge selftests passed, also tried all of the
converted bits manually in a VM.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

9014fc31

net: bridge: fdb: set flags directly in fdb_create · 3fb01a31

Nikolay Aleksandrov authored Oct 29, 2019

No need to have separate arguments for each flag, just set the flags to
whatever was passed to fdb_create() before the fdb is published.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fb01a31

net: bridge: fdb: convert offloaded to use bitops · d38c6e3d

Nikolay Aleksandrov authored Oct 29, 2019

Convert the offloaded field to a flag and use bitops.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d38c6e3d

net: bridge: fdb: convert added_by_external_learn to use bitops · b5cd9f7c

Nikolay Aleksandrov authored Oct 29, 2019

Convert the added_by_external_learn field to a flag and use bitops.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b5cd9f7c

net: bridge: fdb: convert added_by_user to bitops · ac3ca6af

Nikolay Aleksandrov authored Oct 29, 2019

Straight-forward convert of the added_by_user field to bitops.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac3ca6af

net: bridge: fdb: convert is_sticky to bitops · e0458d9a

Nikolay Aleksandrov authored Oct 29, 2019

Straight-forward convert of the is_sticky field to bitops.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0458d9a

net: bridge: fdb: convert is_static to bitops · 29e63fff

Nikolay Aleksandrov authored Oct 29, 2019

Convert the is_static to bitops, make use of the combined
test_and_set/clear_bit to simplify expressions in fdb_add_entry.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29e63fff

net: bridge: fdb: convert is_local to bitops · 6869c3b0

Nikolay Aleksandrov authored Oct 29, 2019

The patch adds a new fdb flags field in the hole between the two cache
lines and uses it to convert is_local to bitops.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6869c3b0

net/smc: remove unneeded include for smc.h · 8466a57d

Ursula Braun authored Oct 29, 2019

The only smc-related reference in net/sock.h is struct smc_hashinfo.
But just its address is refered to. Thus there is no need for the
include of net/smc.h. Remove it.
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8466a57d

tipc: improve throughput between nodes in netns · f73b1281

Hoang Le authored Oct 29, 2019

Currently, TIPC transports intra-node user data messages directly
socket to socket, hence shortcutting all the lower layers of the
communication stack. This gives TIPC very good intra node performance,
both regarding throughput and latency.

We now introduce a similar mechanism for TIPC data traffic across
network namespaces located in the same kernel. On the send path, the
call chain is as always accompanied by the sending node's network name
space pointer. However, once we have reliably established that the
receiving node is represented by a namespace on the same host, we just
replace the namespace pointer with the receiving node/namespace's
ditto, and follow the regular socket receive patch though the receiving
node. This technique gives us a throughput similar to the node internal
throughput, several times larger than if we let the traffic go though
the full network stacks. As a comparison, max throughput for 64k
messages is four times larger than TCP throughput for the same type of
traffic.

To meet any security concerns, the following should be noted.

- All nodes joining a cluster are supposed to have been be certified
and authenticated by mechanisms outside TIPC. This is no different for
nodes/namespaces on the same host; they have to auto discover each
other using the attached interfaces, and establish links which are
supervised via the regular link monitoring mechanism. Hence, a kernel
local node has no other way to join a cluster than any other node, and
have to obey to policies set in the IP or device layers of the stack.

- Only when a sender has established with 100% certainty that the peer
node is located in a kernel local namespace does it choose to let user
data messages, and only those, take the crossover path to the receiving
node/namespace.

- If the receiving node/namespace is removed, its namespace pointer
is invalidated at all peer nodes, and their neighbor link monitoring
will eventually note that this node is gone.

- To ensure the "100% certainty" criteria, and prevent any possible
spoofing, received discovery messages must contain a proof that the
sender knows a common secret. We use the hash mix of the sending
node/namespace for this purpose, since it can be accessed directly by
all other namespaces in the kernel. Upon reception of a discovery
message, the receiver checks this proof against all the local
namespaces'hash_mix:es. If it finds a match, that, along with a
matching node id and cluster id, this is deemed sufficient proof that
the peer node in question is in a local namespace, and a wormhole can
be opened.

- We should also consider that TIPC is intended to be a cluster local
IPC mechanism (just like e.g. UNIX sockets) rather than a network
protocol, and hence we think it can justified to allow it to shortcut the
lower protocol layers.

Regarding traceability, we should notice that since commit 6c9081a3
("tipc: add loopback device tracking") it is possible to follow the node
internal packet flow by just activating tcpdump on the loopback
interface. This will be true even for this mechanism; by activating
tcpdump on the involved nodes' loopback interfaces their inter-name
space messaging can easily be tracked.

v2:
- update 'net' pointer when node left/rejoined
v3:
- grab read/write lock when using node ref obj
v4:
- clone traffics between netns to loopback
Suggested-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

f73b1281

inet: do not call sublist_rcv on empty list · 51210ad5

Florian Westphal authored Oct 29, 2019

syzbot triggered struct net NULL deref in NF_HOOK_LIST:
RIP: 0010:NF_HOOK_LIST include/linux/netfilter.h:331 [inline]
RIP: 0010:ip6_sublist_rcv+0x5c9/0x930 net/ipv6/ip6_input.c:292
 ipv6_list_rcv+0x373/0x4b0 net/ipv6/ip6_input.c:328
 __netif_receive_skb_list_ptype net/core/dev.c:5274 [inline]

Reason:
void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
                   struct net_device *orig_dev)
[..]
        list_for_each_entry_safe(skb, next, head, list) {
		/* iterates list */
                skb = ip6_rcv_core(skb, dev, net);
		/* ip6_rcv_core drops skb -> NULL is returned */
                if (skb == NULL)
                        continue;
	[..]
	}
	/* sublist is empty -> curr_net is NULL */
        ip6_sublist_rcv(&sublist, curr_dev, curr_net);

Before the recent change NF_HOOK_LIST did a list iteration before
struct net deref, i.e. it was a no-op in the empty list case.

List iteration now happens after *net deref, causing crash.

Follow the same pattern as the ip(v6)_list_rcv loop and add a list_empty
test for the final sublist dispatch too.

Cc: Edward Cree <ecree@solarflare.com>
Reported-by: syzbot+c54f457cad330e57e967@syzkaller.appspotmail.com
Fixes: ca58fbe0 ("netfilter: add and use nf_hook_slow_list()")
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

51210ad5

broadcom: bnxt: Fix use true/false for bool · acda6180

Saurav Girepunje authored Oct 29, 2019

Use true/false for bool type in bnxt_timer function.
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

acda6180

cavium: thunder: Fix use true/false for bool type · cb5ff33f

Saurav Girepunje authored Oct 29, 2019

use true/false on bool type variables for assignment.
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cb5ff33f

Merge branch 'net-phy-marvell-fix-and-extend-downshift-support' · 5b5168c7

David S. Miller authored Oct 29, 2019

Heiner Kallweit says:

====================
net: phy: marvell: fix and extend downshift support

This series includes two fixes and two extensions for downshift support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

5b5168c7

net: phy: marvell: add PHY tunable support for more PHY versions · 262caf47

Heiner Kallweit authored Oct 28, 2019

More PHY versions are compatible with the existing downshift
implementation, so let's add downshift support for them.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

262caf47

net: phy: marvell: add downshift support for M88E1111 · 5c6bc519

Heiner Kallweit authored Oct 28, 2019

This patch adds downshift support for M88E1111. This PHY version uses
another register for downshift configuration, reading downshift status
is possible via the same register as for other PHY versions.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5c6bc519