Commits · 80f5be5adee20d4c253f31082a40df39a6a4a9d0 · Kirill Smelkov / linux

06 Apr, 2017 11 commits

afs: afs_fsync() does two flushes, one of which is redundant · 80f5be5a

David Howells authored Apr 06, 2017

afs_fsync() calls filemap_write_and_wait_range() and then does a walk
through the writeback records and flushes those - which should achieve
exactly the same thing.

Get rid of the filemap_write_and_wait_range() since that's uninterruptible,
whereas the wait for the writeback records is interruptible.

Further, we can at least contract the inode-locked region to just the
afs_writeback_call().
Signed-off-by: David Howells <dhowells@redhat.com>

80f5be5a

afs: Don't pass vnode pointer around in afs_writeback struct · f67bf645

David Howells authored Apr 06, 2017

Don't pass the vnode pointer around in the afs_writeback struct as the
struct may get freed, yet we still need the pointer.  Pass it around
separately.
Signed-off-by: David Howells <dhowells@redhat.com>

f67bf645

rxrpc: Trace client call connection · 89ca6948

David Howells authored Apr 06, 2017

Add a tracepoint (rxrpc_connect_call) to log the combination of rxrpc_call
pointer, afs_call pointer/user data and wire call parameters to make it
easier to match the tracebuffer contents to captured network packets.
Signed-off-by: David Howells <dhowells@redhat.com>

89ca6948

rxrpc: Trace changes in a call's receive window size · 740586d2

David Howells authored Apr 06, 2017

Add a tracepoint (rxrpc_rx_rwind_change) to log changes in a call's receive
window size as imposed by the peer through an ACK packet.
Signed-off-by: David Howells <dhowells@redhat.com>

740586d2

rxrpc: Trace received aborts · 005ede28

David Howells authored Apr 06, 2017

Add a tracepoint (rxrpc_rx_abort) to record received aborts.
Signed-off-by: David Howells <dhowells@redhat.com>

005ede28

rxrpc: Trace protocol errors in received packets · fb46f6ee

David Howells authored Apr 06, 2017

Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received
packets.  The following changes are made:

 (1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a
     call and mark the call aborted.  This is wrapped by
     rxrpc_abort_eproto() that makes the why string usable in trace.

 (2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error
     generation points, replacing rxrpc_abort_call() with the latter.

 (3) Only send an abort packet in rxkad_verify_packet*() if we actually
     managed to abort the call.

Note that a trace event is also emitted if a kernel user (e.g. afs) tries
to send data through a call when it's not in the transmission phase, though
it's not technically a receive event.
Signed-off-by: David Howells <dhowells@redhat.com>

fb46f6ee

rxrpc: Handle temporary errors better in rxkad security · ef68622d

David Howells authored Apr 06, 2017

In the rxkad security module, when we encounter a temporary error (such as
ENOMEM) from which we could conceivably recover, don't abort the
connection, but rather permit retransmission of the relevant packets to
induce a retry.

Note that I'm leaving some places that could be merged together to insert
tracing in the next patch.

Signed-off-by; David Howells <dhowells@redhat.com>

ef68622d

rxrpc: Note a successfully aborted kernel operation · 84a4c09c

David Howells authored Apr 06, 2017

Make rxrpc_kernel_abort_call() return an indication as to whether it
actually aborted the operation or not so that kafs can trace the failure of
the operation.  Note that 'success' in this context means changing the
state of the call, not necessarily successfully transmitting an ABORT
packet.
Signed-off-by: David Howells <dhowells@redhat.com>

84a4c09c

rxrpc: Use negative error codes in rxrpc_call struct · 3a92789a

David Howells authored Apr 06, 2017

Use negative error codes in struct rxrpc_call::error because that's what
the kernel normally deals with and to make the code consistent. We only
turn them positive when transcribing into a cmsg for userspace recvmsg.
Signed-off-by: David Howells <dhowells@redhat.com>

3a92789a

bonding: attempt to better support longer hw addresses · faeeb317

Jarod Wilson authored Apr 04, 2017

People are using bonding over Infiniband IPoIB connections, and who knows
what else. Infiniband has a hardware address length of 20 octets
(INFINIBAND_ALEN), and the network core defines a MAX_ADDR_LEN of 32.
Various places in the bonding code are currently hard-wired to 6 octets
(ETH_ALEN), such as the 3ad code, which I've left untouched here. Besides,
only alb is currently possible on Infiniband links right now anyway, due
to commit 1533e773, so the alb code is where most of the changes are.

One major component of this change is the addition of a bond_hw_addr_copy
function that takes a length argument, instead of using ether_addr_copy
everywhere that hardware addresses need to be copied about. The other
major component of this change is converting the bonding code from using
struct sockaddr for address storage to struct sockaddr_storage, as the
former has an address storage space of only 14, while the latter is 128
minus a few, which is necessary to support bonding over device with up to
MAX_ADDR_LEN octet hardware addresses. Additionally, this probably fixes
up some memory corruption issues with the current code, where it's
possible to write an infiniband hardware address into a sockaddr declared
on the stack.

Lightly tested on a dual mlx4 IPoIB setup, which properly shows a 20-octet
hardware address now:

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
Primary Slave: mlx4_ib0 (primary_reselect always)
Currently Active Slave: mlx4_ib0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 100
Down Delay (ms): 100

Slave Interface: mlx4_ib0
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr:
80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:1d:67:01
Slave queue ID: 0

Slave Interface: mlx4_ib1
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr:
80:00:02:09:fe:80:00:00:00:00:00:01:e4:1d:2d:03:00:1d:67:02
Slave queue ID: 0

Also tested with a standard 1Gbps NIC bonding setup (with a mix of
e1000 and e1000e cards), running LNST's bonding tests.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

faeeb317

sfc: don't insert mc_list on low-latency firmware if it's too long · 148cbab6

Edward Cree authored Apr 04, 2017

If the mc_list is longer than 256 addresses, we enter mc_promisc mode.
If we're in mc_promisc mode and the firmware doesn't support cascaded
 multicast, normally we also insert our mc_list, to prevent stealing by
 another VI.  However, if the mc_list was too long, this isn't really
 helpful - the MC groups that didn't fit in the list can still get
 stolen, and having only some of them stealable will probably cause
 more confusing behaviour than having them all stealable.  Since
 inserting 256 multicast filters takes a long time and can lead to MCDI
 state machine timeouts, just skip the mc_list insert in this overflow
 condition.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

148cbab6

05 Apr, 2017 29 commits

Merge branch 'nfp-ksettings' · a4b7c07f

David S. Miller authored Apr 05, 2017

Jakub Kicinski says:

====================
nfp: ethtool link settings

This series adds support for getting and setting link settings
via the (moderately) new ethtool ksettings ops.

First patch introduces minimal speed and duplex reporting using
the information directly provided in PCI BAR0 memory.

Next few changes deal with the need to refresh port state read
from the service process and patch 6 finally uses that information
to provide link speed and duplex.  Patches 7 and 8 add auto
negotiation and port type reporting.

Remaining changes provide the set support for speed and auto
negotiation.  An upcoming series will also add port splitting
support via devlink.

Quite a bit of churn in this series is caused by the fact that
currently port speed and split changes will usually require a
reboot to take effect.  Current service process code is not capable
of performing MAC reinitialization after chip has been passing
traffic.  To make sure user is aware of this limitation we refuse
the configuration unless netdev is down, print warning to the logs
and if configuration was performed but did take effect we unregister
the netdev.  Service process has a "reboot needed" sticky bit, so
reloading the driver will not bring the netdev back.

Note that there is a helper in patch 13 which is marked as
__always_inline, because the FIELD_* macros require the parameters
to be known at compilation time.  I hope that is OK.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a4b7c07f

nfp: add support for .set_link_ksettings() · 7c698737

Jakub Kicinski authored Apr 04, 2017

Support setting link speed and autonegotiation through
set_link_ksettings() ethtool op.  If the port is reconfigured
in incompatible way and reboot is required the netdev will get
unregistered and not come back until user reboots the system.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c698737

nfp: NSP backend for link configuration operations · 5a560832

Jakub Kicinski authored Apr 04, 2017

Add NSP backend for upcoming link configuration operations.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5a560832

nfp: add extended error messages · 85eb97dd

Jakub Kicinski authored Apr 04, 2017

Allow NSP to set option code even when error is reported.  This provides
a way for NSP to give user more precise information about why command
failed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85eb97dd

nfp: turn NSP port entry into a union · e890ae8e

Jakub Kicinski authored Apr 04, 2017

Make NSP port structure a union to simplify accessing the fields
from generic macros.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e890ae8e

nfp: allow multi-stage NSP configuration · 30a02921

Jakub Kicinski authored Apr 04, 2017

NSP commands may be slow to respond, we should try to avoid doing
a command-per-item when user requested to change multiple parameters
for instance with an ethtool .set_settings() command.

Introduce a way of internal NSP code to carry state in NSP structure
and add start/finish calls to perform the initialization and kick off
of the configuration request, with potentially many parameters being
modified in between.

nfp_eth_set_mod_enable() will make use of the new code internally,
other "set" functions to follow.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30a02921

nfp: separate high level and low level NSP headers · ce22f5a2

Jakub Kicinski authored Apr 04, 2017

We will soon add more NSP commands and structure definitions.
Move all high-level NSP header contents to a common nfp_nsp.h file.
Right now it mostly boils down to renaming nfp_nsp_eth.h and
moving some functions from nfp.h there.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce22f5a2

nfp: report port type in ethtool · 9f9e0da5

Jakub Kicinski authored Apr 04, 2017

Service process firmware provides us with information about media
and interface (SFP module) plugged in, translate that to Linux's
PORT_* defines and report via ethtool.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f9e0da5

nfp: report auto-negotiation in ethtool · 42b1e6aa

Jakub Kicinski authored Apr 04, 2017

NSP ABI version 0.17 is exposing the autonegotiation settings.
Report whether autoneg is on via ethtool.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42b1e6aa

nfp: report link speed from NSP · 21d529d5

Jakub Kicinski authored Apr 04, 2017

On the PF prefer the link speed value provided by the NSP.
Refresh port table if needed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21d529d5

nfp: add port state refresh · 172f638c

Jakub Kicinski authored Apr 04, 2017

We will need a way of refreshing port state for link settings
get/set.  For get we need to refresh port speed and type.

When settings are changed the reconfiguration may require
reboot before it's effective.  Unregister netdevs affected
by reconfiguration from a workqueue.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

172f638c

nfp: track link state changes · cee42951

Jakub Kicinski authored Apr 04, 2017

For caching link settings - remember if we have seen link events
since the last time the eth_port information was refreshed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cee42951

nfp: add mutex protection for the port list · d12537df

Jakub Kicinski authored Apr 04, 2017

We will want to unregister netdevs after their port got reconfigured.
For that we need to make sure manipulations of port list from the
port reconfiguration flow will not race with driver's .remove()
callback.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d12537df

nfp: don't spawn netdevs for reconfigured ports · b9de0077

Jakub Kicinski authored Apr 04, 2017

After port reconfiguration (port split, media type change)
firmware will continue to report old configuration until
reboot.  NSP will inform us that reconfiguration is pending.
To avoid user confusion refuse to spawn netdevs until the
new configuration is applied (reboot).

We need to split the netdev to eth_table port matching from
MAC search and move it earlier in the probe() flow.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9de0077

nfp: add support for .get_link_ksettings() · 265aeb51

Jakub Kicinski authored Apr 04, 2017

Read link speed from the BAR.  This provides very basic information
and works for both PFs and VFs.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

265aeb51

Merge tag 'linux-can-next-for-4.13-20170404' of... · 18148f09

David S. Miller authored Apr 05, 2017

Merge tag 'linux-can-next-for-4.13-20170404' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2017-03-03

this is a pull request of 5 patches for net-next/master.

There are two patches by Yegor Yefremov which convert the ti_hecc
driver into a DT only driver, as there is no in-tree user of the old
platform driver interface anymore. The next patch by Mario Kicherer
adds network namespace support to the can subsystem. The last two
patches by Akshay Bhat add support for the holt_hi311x SPI CAN driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

18148f09

selftests: add a generic testsuite for ethernet device · af0e5461

LABBE Corentin authored Apr 04, 2017

This patch add a generic testsuite for testing ethernet network device driver.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

af0e5461

Merge branch 'rtnetlink-event-type' · 2e507e24

David S. Miller authored Apr 05, 2017

Vladislav Yasevich says:

====================
rtnetlink: Updates to rtnetlink_event()

This series came out of the conversation that started as a result
my first attempt to add netdevice event info to netlink messages.

This series converts event processing to a 'white list', where
we explicitely permit events to generate netlink messages.  This
is meant to make people take a closer look and determine wheter
these events should really trigger netlink messages.

I am also adding a V2 of my patch to add event type to the netlink
message.  This version supports all events that we currently generate.

I will also update my patch to iproute that will show this data
through 'ip monitor'.

I actually need the ability to trap NETDEV_NOTIFY_PEERS event
(as well as possible NETDEV_RESEND_IGMP) to support hanlding of
macvtap on top of bonding.  I hope others will also find this info usefull.

V2: Added missed events (from David Ahern)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2e507e24

rtnl: Add support for netdev event to link messages · def12888

Vlad Yasevich authored Apr 04, 2017

When netdev events happen, a rtnetlink_event() handler will send
messages for every event in it's white list.  These messages contain
current information about a particular device, but they do not include
the iformation about which event just happened.  The consumer of
the message has to try to infer this information.  In some cases
(ex: NETDEV_NOTIFY_PEERS), that is not possible.

This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT
that would have an encoding of the which event triggered this
message.  This would allow the the message consumer to easily determine
if it is interested in a particular event or not.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

def12888

rtnetlink: Convert rtnetlink_event to white list · 5138e86f

Vlad Yasevich authored Apr 04, 2017

The rtnetlink_event currently functions as a blacklist where
we block cerntain netdev events from being sent to user space.
As a result, events have been added to the system that userspace
probably doesn't care about.

This patch converts the implementation to the white list so that
newly events would have to be specifically added to the list to
be sent to userspace.  This would force new event implementers to
consider whether a given event is usefull to user space or if it's
just a kernel event.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5138e86f

net: tcp: Define the TCP_MAX_WSCALE instead of literal number 14 · 589c49cb

Gao Feng authored Apr 04, 2017

Define one new macro TCP_MAX_WSCALE instead of literal number '14',
and use U16_MAX instead of 65535 as the max value of TCP window.
There is another minor change, use rounddown(space, mss) instead of
(space / mss) * mss;
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

589c49cb

net: ibm: emac: remove unused sysrq handler for 'c' key · 5e351410

Eric Biggers authored Apr 03, 2017

Since commit d6580a9f ("kexec: sysrq: simplify sysrq-c handler"),
the sysrq handler for the 'c' key has been sysrq_crash_op. Debugging
code in the ibm_emac driver also tries to register a handler for the 'c'
key, but this has no effect because register_sysrq_key() doesn't replace
existing handlers. Since evidently no one has cared enough to fix this
in the last 8 years, and it's very rare for drivers to register sysrq
handlers (for good reason), just remove the dead code.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e351410

bonding: fix active-backup transition · 3f3c278c

Mahesh Bandewar authored Apr 03, 2017

Earlier patch c4adfc82 ("bonding: make speed, duplex setting
consistent with link state") made an attempt to keep slave state
consistent with speed and duplex settings. Unfortunately link-state
transition is used to change the active link especially when used
in conjunction with mii-mon. The above mentioned patch broke that
logic. Also when speed and duplex settings for a link are updated
during a link-event, the link-status should not be changed to
invoke correct transition logic.

This patch fixes this issue by moving the link-state update outside
of the bond_update_speed_duplex() fn and to the places where this fn
is called and update link-state selectively.

Fixes: c4adfc82 ("bonding: make speed, duplex setting consistent
with link state")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reviewed-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

3f3c278c

netlink/diag: report flags for netlink sockets · 457c79e5

Andrey Vagin authored Apr 03, 2017

cb_running is reported in /proc/self/net/netlink and it is reported by
the ss tool, when it gets information from the proc files.

sock_diag is a new interface which is used instead of proc files, so it
looks reasonable that this interface has to report no less information
about sockets than proc files.

We use these flags to dump and restore netlink sockets.
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

457c79e5

qed: Add a missing error code · c8004600

Dan Carpenter authored Apr 03, 2017

We should be returning -ENOMEM if qed_mcp_cmd_add_elem() fails.  The
current code returns success.

Fixes: 4ed1eea8 ("qed: Revise MFW command locking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8004600

net: sched: choke: remove some dead code · 2c099ccb

Dan Carpenter authored Apr 03, 2017

We accidentally left this dead code behind after commit 5952fde1
("net: sched: choke: remove dead filter classify code").
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2c099ccb

liquidio: clear the correct memory · 781159fb

Dan Carpenter authored Apr 03, 2017

There is a cut and paste bug here so we accidentally clear the first
few bytes of "resp" a second time instead clearing "ctx".

Fixes: 50c0add5 ("liquidio: refactor interrupt moderation code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

781159fb

net: stmmac: rx queue to dma channel mapping fix · 03cf65a9

Joao Pinto authored Apr 03, 2017

In hardware configurations where multiple queues are active,
the rx queue needs to be mapped into a dma channel, even if
a single rx queue is used.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03cf65a9

phy/ethtool: Add missing SPEED_<foo> strings · 1f37b177

Joe Perches authored Apr 02, 2017

Add all the currently available SPEED_<foo> strings.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f37b177