Commits · fa642f08839bf2ff35b2f6c6a6c062aee8121ba8 · Kirill Smelkov / linux

07 Sep, 2018 1 commit

openvswitch: Derive IP protocol number for IPv6 later frags · fa642f08

Yi-Hung Wei authored Sep 04, 2018

Currently, OVS only parses the IP protocol number for the first
IPv6 fragment, but sets the IP protocol number for the later fragments
to be NEXTHDF_FRAGMENT.  This patch tries to derive the IP protocol
number for the IPV6 later frags so that we can match that.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa642f08

06 Sep, 2018 39 commits

liquidio CN23XX: Remove set but not used variable 'ring_flag' · ddc4d236

YueHaibing authored Sep 06, 2018

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c: In function 'cn23xx_setup_octeon_vf_device':
drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c:619:20: warning:
variable 'ring_flag' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddc4d236

liquidio: Add spoof checking on a VF MAC address · 48875222

Weilin Chang authored Sep 05, 2018

1. Provide the API to set/unset the spoof checking feature.
2. Add a function to periodically provide the count of found
   packets with spoof VF MAC address.
3. Prevent VF MAC address changing while the spoofchk of the VF is
   on unless the changing MAC address is issued from PF.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

48875222

Merge tag 'mlx5e-updates-2018-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · ddc9cc01

David S. Miller authored Sep 06, 2018

Saeed Mahameed says:

====================
mlx5e-updates-2018-09-05

This series provides updates to mlx5 ethernet driver.

1) Starting with a four patches series to optimize flow counters updates,
From Vlad Buslov:
==============================================

By default mlx5 driver updates cached counters each second. Update function
consumes noticeable amount of CPU resources. The goal of this patch series
is to optimize update function.

Investigation revealed following bottlenecks in fs counters
implementation:
 1) Update code(scheduled each second) iterates over all counters twice.
 (first for finding and deleting counters that are marked for deletion,
 second iteration is for actually updating the counters)
 2) Counters are stored in rb tree. Linear iteration over all rb tree
 elements(rb_next in profiling data) consumed ~65% of time spent in
 update function.

Following optimizations were implemented:
 1) Instead of just marking counters for deletion, store them in
 standalone list. This removes first iteration over whole counters tree.
 2) Store counters in sorted list to optimize traversing them and remove
 calls to rb_next.

First implementation of these changes caused degradation of performance,
instead of improving it. Investigation revealed that there first cache
line of struct mlx5_fc is full and adding anything to it causes amount
of cache misses to double. To mitigate that, following refactorings were
implemented:
 - Change 'addlist' list type from double linked to single linked. This
 allowes to get free space for one additional pointer that is used to
 store deletion list(optimization 1)
 - Substitute rb tree with idr. Idr is non-intrusive data structure and
 doesn't require adding any new members to struct mlx5_fc. Use free
 space that became available for double linked sorted list that is used
 for traversing all counters. (optimization 2)

Described changes reduced CPU time spent in mlx5_fc_stats_work from 70%
to 44%. (global perf profile mode)
============================================

The rest of the series are misc updates:

2) From Kamal, Move mlx5e_priv_flags into en_ethtool.c, to avoid a
compilation warning.

3) From Roi Dayan, Move Q counters allocation and drop RQ to init_rx profile
function to avoid allocating Q counters when not required.

4) From Shay Agroskin, Replace PTP clock lock from RW lock to seq lock.
Almost double the packet rate when timestamping is active on multiple TX
queues.

5) From: Natali Shechtman, set ECN for received packets using CQE indication.

6) From: Alaa Hleihel, don't set CHECKSUM_COMPLETE on SCTP packets.
CHECKSUM_COMPLETE is not applicable to SCTP protocol.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ddc9cc01

Merge branch 'dsa-b53-SerDes-support' · 2002bc32

David S. Miller authored Sep 06, 2018

Florian Fainelli says:

====================
net: dsa: b53: SerDes support

This patch series adds support for the SerDes found on NorthStar Plus
(NSP) which allows us to use the SFP port on the BCM958625HR board (and
other similar designs).

Changes in v3:

- properly hunk the request_threaded_irq() bits into patch #2

Changes in v2:

- migrate to threaded interrupt (Andrew)
- fixed a case where MLO_AN_FIXED's mac_config would still call into
  the serdes_config callback
- added an additional check on the phylink interface in mac_config
- default to ARCH_BCM_NSP instead of ARCH_BCM_IPROC which is really
  the NSP Kconfig bit we want
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2002bc32

net: dsa: b53: Add SerDes support · 0e01491d

Florian Fainelli authored Sep 05, 2018

Add support for the Northstar Plus SerDes which is accessed through a
special page of the switch. Since this is something that most people
probably will not want to use, make it a configurable option with a
default on ARCH_BCM_NSP where it is the most useful currently.

The SerDes supports both SGMII and 1000baseX modes for both lanes, and
2500baseX for one of the lanes, and is internally looking like a
seemingly standard MII PHY, except for the few bits that got repurposed.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

0e01491d

net: dsa: b53: Add PHYLINK support · a8e8b985

Florian Fainelli authored Sep 05, 2018

Add support for PHYLINK, things are reasonably straight forward since we
do not yet support SerDes interfaces, that leaves us with just
MLO_AN_PHY and MLO_AN_FIXED to deal with.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8e8b985

net: dsa: b53: Add helper to set link parameters · 5e004460

Florian Fainelli authored Sep 05, 2018

Extract the logic from b53_adjust_link() responsible for overriding a
given port's link, speed, duplex and pause settings and make two helper
functions to set the port's configuration and the port's link settings.
We will make use of both, as separate functions while adding PHYLINK
support next.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e004460

net: dsa: b53: Make SRAB driver manage port interrupts · 16994374

Florian Fainelli authored Sep 05, 2018

Update the SRAB driver to manage per-port interrupts. Since we cannot
sleep during b53_io_ops, schedule a workqueue whenever we get a port
specific interrupt. We will later make use of this to call back into
PHYLINK when there is e.g: a link state change.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

16994374

net: dsa: b53: Add ability to enable/disable port interrupts · 8ca7c160

Florian Fainelli authored Sep 05, 2018

Some switches expose individual interrupt line(s) for port specific
event(s), allow configuring these interrupts at an appropriate time
during port_enable/disable callbacks where all port specific resources
are known to be set-up and ready for use.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ca7c160

qed*: Utilize FW 8.37.7.0 · a3f72307

Denis Bolotin authored Sep 05, 2018

This patch adds a new qed firmware with fixes and support for new features.

Fixes:
- Fix a rare case of device crash with iWARP, iSCSI or FCoE offload.
- Fix GRE tunneled traffic when iWARP offload is enabled.
- Fix RoCE failure in ib_send_bw when using inline data.
- Fix latency optimization flow for inline WQEs.
- BigBear 100G fix

RDMA:
- Reduce task context size.
- Application page sizes above 2GB support.
- Performance improvements.

ETH:
- Tenant DCB support.
- Replace RSS indirection table update interface.

Misc:
- Debug Tools changes.
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3f72307

Merge branch 'rtnetlink-add-IFA_TARGET_NETNSID-for-RTM_GETADDR' · 6ef848ef

David S. Miller authored Sep 05, 2018

Christian Brauner says:

====================
rtnetlink: add IFA_TARGET_NETNSID for RTM_GETADDR

This iteration should mainly addresses the suggestion to use
IFA_TARGET_NETNSID as the property name. Additionally, an an alias for
the already existing IFLA_IF_NETNSID property is added.

Note that two additional cleanup patches (8\9 and 9\9) were added to
address concerns raised that passing more than 6 arguments to a function
will cause additional variables to be pushed onto the stack instead of
being placed into registers. The way I addressed this is by introducing
two new struct inet{6}_fill_args that are used to pass common
information down to inet{6}_fill_if*() functions shortening all those
functions to three pointer arguments.
If this is something more people than Kirill find useful they can be
kept if not they can simply be dropped in later iterations of this
series or when merging.

Here is a short overview:
1. Rename from IFA_IF_NETNSID to IFA_TARGET_NETNSID.
2. Add IFLA_TARGET_NETNSID as an alias for IFA_IFLA_NETNSID and switch
   all occurrences over to the new alias.
3. Add inet4_fill_args struct to avoid passing more than 6 arguments in
   inet_fill_if*() functions.
4. Add inet6_fill_args struct to avoid passing more than 6 arguments in
   inet_fill_if*() functions.

The only functional change is the export of rtnl_get_net_ns_capable()
which is needed in case ipv6 is built as a module.

Note, I did not change the property name to IFA_TARGET_NSID as there was
no clear agreement what would be preferred. My personal preference is to
keep the IFA_IF_NETNSID name because it aligns naturally with the
IFLA_IF_NETNSID property for RTM_*LINK requests. Jiri seems to prefer
this name too.
However, if there is agreement that another property name makes more
sense I'm happy to send a v2 that changes this.

To test this patchset I performed 1 million getifaddrs() requests
against a network namespace containing 5 interfaces (lo, eth{0-4}). The
first test used a network namespace aware getifaddrs() implementation I
wrote and the second test used the traditional setns() + getifaddrs()
method. The results show that this patchsets allows userspace to cut
retrieval time in half:
1. netns_getifaddrs():      82 microseconds
2. setns() + getifaddrs(): 162 microseconds

A while back we introduced and enabled IFLA_IF_NETNSID in
RTM_{DEL,GET,NEW}LINK requests (cf. [1], [2], [3], [4], [5]). This has led
to signficant performance increases since it allows userspace to avoid
taking the hit of a setns(netns_fd, CLONE_NEWNET), then getting the
interfaces from the netns associated with the netns_fd. Especially when a
lot of network namespaces are in use, using setns() becomes increasingly
problematic when performance matters.
Usually, RTML_GETLINK requests are followed by RTM_GETADDR requests (cf.
getifaddrs() style functions and friends). But currently, RTM_GETADDR
requests do not support a similar property like IFLA_IF_NETNSID for
RTM_*LINK requests.
This is problematic since userspace can retrieve interfaces from another
network namespace by sending a IFLA_IF_NETNSID property along but
RTM_GETLINK request but is still forced to use the legacy setns() style of
retrieving interfaces in RTM_GETADDR requests.

The goal of this series is to make it possible to perform RTM_GETADDR
requests on different network namespaces. To this end a new IFA_IF_NETNSID
property for RTM_*ADDR requests is introduced. It can be used to send a
network namespace identifier along in RTM_*ADDR requests.  The network
namespace identifier will be used to retrieve the target network namespace
in which the request is supposed to be fulfilled.  This aligns the behavior
of RTM_*ADDR requests with the behavior of RTM_*LINK requests.

- The caller must have assigned a valid network namespace identifier for
  the target network namespace.
- The caller must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.

[1]: commit 7973bfd8 ("rtnetlink: remove check for IFLA_IF_NETNSID")
[2]: commit 5bb8ed07 ("rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK")
[3]: commit b61ad68a ("rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK")
[4]: commit c310bfcb ("rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK")
[5]: commit 7c4f63ba ("rtnetlink: enable IFLA_IF_NETNSID in do_setlink()")
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6ef848ef

ipv6: add inet6_fill_args · 203651b6

Christian Brauner authored Sep 04, 2018

inet6_fill_if{addr,mcaddr, acaddr}() already took 6 arguments which
meant the 7th argument would need to be pushed onto the stack on x86.
Add a new struct inet6_fill_args which holds common information passed
to inet6_fill_if{addr,mcaddr, acaddr}() and shortens the functions to
three pointer arguments.
Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

203651b6

ipv4: add inet_fill_args · 978a46fa

Christian Brauner authored Sep 04, 2018

inet_fill_ifaddr() already took 6 arguments which meant the 7th argument
would need to be pushed onto the stack on x86.
Add a new struct inet_fill_args which holds common information passed
to inet_fill_ifaddr() and shortens the function to three pointer arguments.
Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

978a46fa

rtnetlink: s/IFLA_IF_NETNSID/IFLA_TARGET_NETNSID/g · 7e4a8d5a

Christian Brauner authored Sep 04, 2018

IFLA_TARGET_NETNSID is the new alias for IFLA_IF_NETNSID. This commit
replaces all occurrences of IFLA_IF_NETNSID with the new alias to
indicate that this identifier is the preferred one.
Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e4a8d5a

if_link: add IFLA_TARGET_NETNSID alias · 19d8f1ad

Christian Brauner authored Sep 04, 2018

This adds IFLA_TARGET_NETNSID as an alias for IFLA_IF_NETNSID for
RTM_*LINK requests.
The new name is clearer and also aligns with the newly introduced
IFA_TARGET_NETNSID propert for RTM_*ADDR requests.
Signed-off-by: Christian Brauner <christian@brauner.io>
Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19d8f1ad

rtnetlink: move type calculation out of loop · 87ccbb1f

Christian Brauner authored Sep 04, 2018

I don't see how the type - which is one of
RTM_{GETADDR,GETROUTE,GETNETCONF} - can change. So do the message type
calculation once before entering the for loop.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

87ccbb1f

ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR · 6ecf4c37

Christian Brauner authored Sep 04, 2018

- Backwards Compatibility:
  If userspace wants to determine whether ipv6 RTM_GETADDR requests
  support the new IFA_TARGET_NETNSID property it should verify that the
  reply includes the IFA_TARGET_NETNSID property. If it does not
  userspace should assume that IFA_TARGET_NETNSID is not supported for
  ipv6 RTM_GETADDR requests on this kernel.
- From what I gather from current userspace tools that make use of
  RTM_GETADDR requests some of them pass down struct ifinfomsg when they
  should actually pass down struct ifaddrmsg. To not break existing
  tools that pass down the wrong struct we will do the same as for
  RTM_GETLINK | NLM_F_DUMP requests and not error out when the
  nlmsg_parse() fails.

- Security:
  Callers must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

6ecf4c37

ipv4: enable IFA_TARGET_NETNSID for RTM_GETADDR · d3807145

Christian Brauner authored Sep 04, 2018

- Backwards Compatibility:
  If userspace wants to determine whether ipv4 RTM_GETADDR requests
  support the new IFA_TARGET_NETNSID property it should verify that the
  reply includes the IFA_TARGET_NETNSID property. If it does not
  userspace should assume that IFA_TARGET_NETNSID is not supported for
  ipv4 RTM_GETADDR requests on this kernel.
- From what I gather from current userspace tools that make use of
  RTM_GETADDR requests some of them pass down struct ifinfomsg when they
  should actually pass down struct ifaddrmsg. To not break existing
  tools that pass down the wrong struct we will do the same as for
  RTM_GETLINK | NLM_F_DUMP requests and not error out when the
  nlmsg_parse() fails.

- Security:
  Callers must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3807145

if_addr: add IFA_TARGET_NETNSID · 9f3c057c

Christian Brauner authored Sep 04, 2018

This adds a new IFA_TARGET_NETNSID property to be used by address
families such as PF_INET and PF_INET6.
The IFA_TARGET_NETNSID property can be used to send a network namespace
identifier as part of a request. If a IFA_TARGET_NETNSID property is
identified it will be used to retrieve the target network namespace in
which the request is to be made.
Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f3c057c

rtnetlink: add rtnl_get_net_ns_capable() · c383edc4

Christian Brauner authored Sep 04, 2018

get_target_net() will be used in follow-up patches in ipv{4,6} codepaths to
retrieve network namespaces based on network namespace identifiers. So
remove the static declaration and export in the rtnetlink header. Also,
rename it to rtnl_get_net_ns_capable() to make it obvious what this
function is doing.
Export rtnl_get_net_ns_capable() so it can be used when ipv6 is built as
a module.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

c383edc4

Merge branch 'net-lan78xx-Minor-improvements' · d4cc5976

David S. Miller authored Sep 05, 2018

Stefan Wahren says:

====================
net: lan78xx: Minor improvements

This patch series contains some minor improvements for the lan78xx
driver.

Changes in V2:
- Keep Copyright comment as multi-line
- Add Raghuram's Reviewed-by
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d4cc5976

net: lan78xx: Make declaration style consistent · 51ceac9f

Stefan Wahren authored Sep 04, 2018

This patch makes some declaration more consistent.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Raghuram Chary Jallipalli <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

51ceac9f

net: lan78xx: Switch to SPDX identifier · 6be665a5

Stefan Wahren authored Sep 04, 2018

Adopt the SPDX license identifier headers to ease license compliance
management.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6be665a5

net: lan78xx: Drop unnecessary strcpy in lan78xx_probe · 7a6b022d

Stefan Wahren authored Sep 04, 2018

There is no need for this strcpy because alloc_etherdev() already
does this job.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Raghuram Chary Jallipalli <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a6b022d

net: lan78xx: Bail out if lan78xx_get_endpoints fails · fa8cd98c

Stefan Wahren authored Sep 04, 2018

We need to bail out if lan78xx_get_endpoints() fails, otherwise the
result is overwritten.

Fixes: 55d7de9d ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Raghuram Chary Jallipalli <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa8cd98c

nfp: separate VXLAN and GRE feature handling · 7848418e

Jakub Kicinski authored Sep 04, 2018

VXLAN and GRE FW features have to currently be both advertised
for the driver to enable them.  Separate the handling.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7848418e

Merge branch 'nfp-improve-the-new-rtsym-helpers' · eebd3faa

David S. Miller authored Sep 05, 2018

Jakub Kicinski says:

====================
nfp: improve the new rtsym helpers

This set fixes a bug in ABS rtsym handling I added in net-next,
it expands the error checking and reporting on the rtsym accesses.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

eebd3faa

nfp: validate rtsym accesses fall within the symbol · e84b2f2d

Jakub Kicinski authored Sep 04, 2018

With the accesses to rtsyms now all going via special helpers
we can easily make sure the driver is not reading past the
end of the symbol.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e84b2f2d

nfp: prefix rtsym error messages with symbol name · 31e380f3

Jakub Kicinski authored Sep 04, 2018

For ease of debug preface all error messages with the name
of the symbol which caused them.  Use the same message format
for existing messages while at it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31e380f3

nfp: fix readq on absolute RTsyms · 3c576de3

Jakub Kicinski authored Sep 04, 2018

Return the error and report value through the output param.

Fixes: 640917dd ("nfp: support access to absolute RTsyms")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c576de3

failover: Add missing check to validate 'slave_dev' in net_failover_slave_unregister · 9e7e6cab

YueHaibing authored Sep 04, 2018

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/net_failover.c: In function 'net_failover_slave_unregister':
drivers/net/net_failover.c:598:35: warning:
 variable 'primary_dev' set but not used [-Wunused-but-set-variable]

There should check the validity of 'slave_dev'.

Fixes: cfc80d9a ("net: Introduce net_failover driver")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9e7e6cab

netlink: Make groups check less stupid in netlink_bind() · 428f944b

Dmitry Safonov authored Sep 03, 2018

As Linus noted, the test for 0 is needless, groups type can follow the
usual kernel style and 8*sizeof(unsigned long) is BITS_PER_LONG:

> The code [..] isn't technically incorrect...
> But it is stupid.
> Why stupid? Because the test for 0 is pointless.
>
> Just doing
>        if (nlk->ngroups < 8*sizeof(groups))
>                groups &= (1UL << nlk->ngroups) - 1;
>
> would have been fine and more understandable, since the "mask by shift
> count" already does the right thing for a ngroups value of 0. Now that
> test for zero makes me go "what's special about zero?". It turns out
> that the answer to that is "nothing".
[..]
> The type of "groups" is kind of silly too.
>
> Yeah, "long unsigned int" isn't _technically_ wrong. But we normally
> call that type "unsigned long".

Cleanup my piece of pointlessness.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org
Fairly-blamed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

428f944b

packet: add sockopt to ignore outgoing packets · fa788d98

Vincent Whitchurch authored Sep 03, 2018

Currently, the only way to ignore outgoing packets on a packet socket is
via the BPF filter. With MSG_ZEROCOPY, packets that are looped into
AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even
if the filter run from packet_rcv() would reject them. So the presence
of a packet socket on the interface takes away the benefits of
MSG_ZEROCOPY, even if the packet socket is not interested in outgoing
packets. (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily
cloned, but the cost for that is much lower.)

Add a socket option to allow AF_PACKET sockets to ignore outgoing
packets to solve this. Note that the *BSDs already have something
similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT.

The first intended user is lldpd.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa788d98

net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets · fe1dc069

Alaa Hleihel authored Jul 10, 2018

CHECKSUM_COMPLETE is not applicable to SCTP protocol.
Setting it for SCTP packets leads to CRC32c validation failure.

Fixes: bbceefce ("net/mlx5e: Support RX CHECKSUM_COMPLETE")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

fe1dc069

net/mlx5e: Set ECN for received packets using CQE indication · f007c13d

Natali Shechtman authored Mar 20, 2018

In multi-host (MH) NIC scheme, a single HW port serves multiple hosts
or sockets on the same host.
The HW uses a mechanism in the PCIe buffer which monitors
the amount of consumed PCIe buffers per host.
On a certain configuration, under congestion,
the HW emulates a switch doing ECN marking on packets using ECN
indication on the completion descriptor (CQE).

The driver needs to set the ECN bits on the packet SKB,
such that the network stack can react on that, this commit does that.
Signed-off-by: Natali Shechtman <natali@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

f007c13d

net/mlx5e: Replace PTP clock lock from RW lock to seq lock · 64109f1d

Shay Agroskin authored Jun 05, 2018

Changed "priv.clock.lock" lock from 'rw_lock' to 'seq_lock'
in order to improve packet rate performance.

Tested on Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz.
Sent 64b packets between two peers connected by ConnectX-5,
and measured packet rate for the receiver in three modes:
	no time-stamping (base rate)
	time-stamping using rw_lock (old lock) for critical region
	time-stamping using seq_lock (new lock) for critical region
Only the receiver time stamped its packets.

The measured packet rate improvements are:

	Single flow (multiple TX rings to single RX ring):
		without timestamping:	  4.26 (M packets)/sec
		with rw-lock (old lock):  4.1  (M packets)/sec
		with seq-lock (new lock): 4.16 (M packets)/sec
		1.46% improvement

	Multiple flows (multiple TX rings to six RX rings):
		without timestamping: 	  22   (M packets)/sec
		with rw-lock (old lock):  11.7 (M packets)/sec
		with seq-lock (new lock): 21.3 (M packets)/sec
		82.05% improvement

The packet rate improvement is due to the lack of atomic operations
for the 'readers' by the seq-lock.
Since there are much more 'readers' than 'writers' contention
on this lock, almost all atomic operations are saved.
this results in a dramatic decrease in overall
cache misses.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

64109f1d

net/mlx5e: Move Q counters allocation and drop RQ to init_rx · 1462e48d

Roi Dayan authored Aug 05, 2018

Not all profiles query the HW Q counters in update_stats() callback.
HW Q couners are limited per device and in case of representors all
their Q counters are allocated on the parent PF device.
Avoid reundant allocation of HW Q counters by moving the allocation
to init_rx profile callback.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

1462e48d

net/mlx5e: Move mlx5e_priv_flags into en_ethtool.c · d2408205

Kamal Heib authored Jul 15, 2018

Move the definition of mlx5e_priv_flags into en_ethtool.c because it's
only used there.

Fixes: 4e59e288 ("net/mlx5e: Introduce net device priv flags infrastructure")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

d2408205

net/mlx5: Add flow counters idr · 12d6066c

Vlad Buslov authored Jul 24, 2018

Previous patch in series changed flow counter storage structure from
rb_tree to linked list in order to improve flow counter traversal
performance. The drawback of such solution is that flow counter lookup by
id becomes linear in complexity.

Store pointers to flow counters in idr in order to improve lookup
performance to logarithmic again. Idr is non-intrusive data structure and
doesn't require extending flow counter struct with new elements. This means
that idr can be used for lookup, while linked list from previous patch is
used for traversal, and struct mlx5_fc size is <= 2 cache lines.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

12d6066c