Commits · 3dd69e3dd21969a34406862ae299538e2e6387e7 · Kirill Smelkov / linux

19 Jan, 2017 15 commits

net/mlx5e: Reorder update stats · 3dd69e3d

Saeed Mahameed authored Dec 28, 2016

Reorder update stats flow to update most important counters last,
to get more accurate results.

New update order:
	- PCIe counters
	- Port counters
	- Vport counters
	- Queue counters
	- Software counters
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>

3dd69e3d

net/mlx5: Move cached hca caps to designated caps struct · 701052c5

Gal Pressman authored Dec 14, 2016

The caps structure consists of hca caps and port/management caps,
all under one roof.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

701052c5

net/mlx5e: Expose PCIe statistics to ethtool · 0f7f3481

Gal Pressman authored Nov 17, 2016

This patch exposes PCIe performance counters, queried with
ethtool -S <devname>.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

0f7f3481

net/mlx5: Add MPCNT register infrastructure · 8ed1a630

Gal Pressman authored Nov 17, 2016

Add the needed infrastructure for future use of MPCNT register.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

8ed1a630

net/mlx5e: Expose physical layer statistical counters to ethtool · 5db0a4f6

Gal Pressman authored Aug 23, 2016

Use ethtool -S to query physical layer statistical counters including:
- rx_symbol_errors_phy: Number of symbol errors that were not corrected
  by FEC correction algorithm or that FEC was not active on this interface.

- rx_corrected_bits_phy: Number of corrected bits according to active
  FEC (RS/FC).
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

5db0a4f6

net/mlx5: Add PPCNT physical layer statistical group infrastructure · d8dc0508

Gal Pressman authored Sep 27, 2016

Add the needed infrastructure for future use of PPCNT physical layer
statistical group.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>

d8dc0508

net/mlx5: Query and cache PCAM, MCAM registers on initialization · 71862561

Gal Pressman authored Dec 08, 2016

On load_one, we now cache our capabilities registers internally, similar
to QUERY_HCA_CAP. Capabilities can later be queried using macros
introduced in this patch.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

71862561

net/mlx5: Implement PCAM, MCAM access register commands · c835ad64

Gal Pressman authored Dec 08, 2016

Introduced registers will expose capabilities of new registers and
features related to port/management.
Driver will query MCAM and PCAM in order to avoid failing on old
firmwares with lack of support.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

c835ad64

net/mlx5: Expose PCAM, MCAM registers infrastructure · cfdcbcea

Gal Pressman authored Dec 08, 2016

PCAM: Ports capabilities mask register.
MCAM: Management capabilities mask register.

PCAM and MCAM registers will provide information regarding firmware
support for different features, in order to avoid cases where new driver
combined with old firmware results in syndromes (for ex. PCIe counters
before this patchset).
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

cfdcbcea

net/mlx5e: Receive s-tagged packets in promiscuous mode · 8a271746

Mohamad Haj Yahia authored Oct 09, 2016

Today when the driver enter to promiscuous mode or vlan
filter is disabled, we add flow rule to receive any c-taggd
packets, therefore s-tagged packets are dropped.
In order to receive s-tagged packets as well we need to add
flow rule to receive any s-tagged packet.

Fixes: 7cb21b79 ('net/mlx5e: Rename en_flow_table.c to en_fs.c')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

8a271746

net/mlx5: Add support to s-tag in mlx5 firmware interface · 10543365

Mohamad Haj Yahia authored Oct 09, 2016

Add svlan_tag and rename vlan_tag to cvlan_tag in flow table entry
match param.
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

10543365

net/mlx5e: Implement 1PPS support · ee7f1220

Eugenia Emantayev authored Aug 22, 2016

This patch enables the 1PPS IN and 1PPS OUT support according
to the advertised HCA capability. Single pin may be configured
to one of the above mutual exclusive functions via standard
Linux tools and APIs. For example, testptp open source application.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

ee7f1220

net/mlx5: Add MTPPS and MTPPSE registers infrastructure · f9a1ef72

Eugenia Emantayev authored Oct 10, 2016

Implement query and set functionality for MTPPS and MTPPSE registers.
MTPPS (Management Pulse Per Second) provides the device PPS capabilities,
configures the PPS in and out modules and holds the PPS in time stamp.
Query MTPPS is supported only when HCA_CAP.pps is set and modify is supported
when HCA_CAP.pps_modify is set.

MTPPSE (Management Pulse Per Second Event) configures the different event
generation modes for PPS. Supported when HCA_CAP.pps is set.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

f9a1ef72

net/mlx5: Fix version printout in case of health issue · 712bfef6

Eli Cohen authored Dec 16, 2016

Firmware representation of the firmware version on the health buffer has
changed for newer device. The representation in the initialization
segment does not and will not change. In addition, we print the health
buffer firmware version as a raw hex number.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

712bfef6

net/mlx5: Remove information print after attempt to load mlx5_ib module · f82eed45

Leon Romanovsky authored Dec 01, 2016

Infiniband part of mlx5 driver can be compiled as a module
or as a part of bzImage (compiled in). In the second case,
the call to request_module will return an error -ENOENT.
It will cause to a misleading print "failed request module
on mlx5_ib".

This patch removes this print, In order to comply with mlx4.

Fixes: f66f049f ("net/mlx5_core: Request the mlx5 IB module on driver load")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

f82eed45

18 Jan, 2017 25 commits

net: Remove usage of net_device last_rx member · 4a7c9726

Tobias Klauser authored Jan 18, 2017

The network stack no longer uses the last_rx member of struct net_device
since the bonding driver switched to use its own private last_rx in
commit 9f242738 ("bonding: use last_arp_rx in slave_last_rx()").

However, some drivers still (ab)use the field for their own purposes and
some driver just update it without actually using it.

Previously, there was an accompanying comment for the last_rx member
added in commit 4dc89133 ("net: add a comment on netdev->last_rx")
which asked drivers not to update is, unless really needed. However,
this commend was removed in commit f8ff080d ("bonding: remove
useless updating of slave->dev->last_rx"), so some drivers added later
on still did update last_rx.

Remove all usage of last_rx and switch three drivers (sky2, atp and
smc91c92_cs) which actually read and write it to use their own private
copy in netdev_priv.

Compile-tested with allyesconfig and allmodconfig on x86 and arm.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a7c9726

net: dsa: use cpu_switch instead of ds[0] · 9520ed8f

Vivien Didelot authored Jan 17, 2017

Now that the DSA Ethernet switches are true Linux devices, the CPU
switch is not necessarily the first one. If its address is higher than
the second switch on the same MDIO bus, its index will be 1, not 0.

Avoid any confusion by using dst->cpu_switch instead of dst->ds[0].
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9520ed8f

net: dsa: store CPU switch structure in the tree · b22de490

Vivien Didelot authored Jan 17, 2017

Store a dsa_switch pointer to the CPU switch in the tree instead of only
its index. This avoids the need to initialize it to -1.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b22de490

net: ethernet: ti: davinci_cpdma: correct check on NULL in set rate · e33c2ef1

Ivan Khoronzhuk authored Jan 18, 2017

Check "ch" on NULL first, then get ctlr.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e33c2ef1

Merge branch 'vhost_net-batching' · e3e37e70

David S. Miller authored Jan 18, 2017

Jason Wang says:

====================
vhost_net tx batching

This series tries to implement tx batching support for vhost. This was
done by using MSG_MORE as a hint for under layer socket. The backend
(e.g tap) can then batch the packets temporarily in a list and
submit it all once the number of bacthed exceeds a limitation.

Tests shows obvious improvement on guest pktgen over over
mlx4(noqueue) on host:

                                     Mpps  -+%
        rx-frames = 0                0.91  +0%
        rx-frames = 4                1.00  +9.8%
        rx-frames = 8                1.00  +9.8%
        rx-frames = 16               1.01  +10.9%
        rx-frames = 32               1.07  +17.5%
        rx-frames = 48               1.07  +17.5%
        rx-frames = 64               1.08  +18.6%
        rx-frames = 64 (no MSG_MORE) 0.91  +0%

Changes from V4:
- stick to NAPI_POLL_WEIGHT for rx-frames is user specify a value
  greater than it.
Changes from V3:
- use ethtool instead of module parameter to control the maximum
  number of batched packets
- avoid overhead when MSG_MORE were not set and no packet queued
Changes from V2:
- remove uselss queue limitation check (and we don't drop any packet now)
Changes from V1:
- drop NAPI handler since we don't use NAPI now
- fix the issues that may exceeds max pending of zerocopy
- more improvement on available buffer detection
- move the limitation of batched pacekts from vhost to tuntap
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e3e37e70

tun: rx batching · 5503fcec

Jason Wang authored Jan 18, 2017

We can only process 1 packet at one time during sendmsg(). This often
lead bad cache utilization under heavy load. So this patch tries to do
some batching during rx before submitting them to host network
stack. This is done through accepting MSG_MORE as a hint from
sendmsg() caller, if it was set, batch the packet temporarily in a
linked list and submit them all once MSG_MORE were cleared.

Tests were done by pktgen (burst=128) in guest over mlx4(noqueue) on host:

                                 Mpps  -+%
    rx-frames = 0                0.91  +0%
    rx-frames = 4                1.00  +9.8%
    rx-frames = 8                1.00  +9.8%
    rx-frames = 16               1.01  +10.9%
    rx-frames = 32               1.07  +17.5%
    rx-frames = 48               1.07  +17.5%
    rx-frames = 64               1.08  +18.6%
    rx-frames = 64 (no MSG_MORE) 0.91  +0%

User were allowed to change per device batched packets through
ethtool -C rx-frames. NAPI_POLL_WEIGHT were used as upper limitation
to prevent bh from being disabled too long.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5503fcec

vhost_net: tx batching · 0ed005ce

Jason Wang authored Jan 18, 2017

This patch tries to utilize tuntap rx batching by peeking the tx
virtqueue during transmission, if there's more available buffers in
the virtqueue, set MSG_MORE flag for a hint for backend (e.g tuntap)
to batch the packets.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ed005ce

vhost: better detection of available buffers · 275bf960

Jason Wang authored Jan 18, 2017

This patch tries to do several tweaks on vhost_vq_avail_empty() for a
better performance:

- check cached avail index first which could avoid userspace memory access.
- using unlikely() for the failure of userspace access
- check vq->last_avail_idx instead of cached avail index as the last
  step.

This patch is need for batching supports which needs to peek whether
or not there's still available buffers in the ring.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

275bf960

net:add one common config ARCH_WANT_RELAX_ORDER to support relax ordering · 1a8b6d76

Mao Wenan authored Jan 18, 2017

Relax ordering(RO) is one feature of 82599 NIC, to enable this feature can
enhance the performance for some cpu architecure, such as SPARC and so on.
Currently it only supports one special cpu architecture(SPARC) in 82599
driver to enable RO feature, this is not very common for other cpu architecture
which really needs RO feature.
This patch add one common config CONFIG_ARCH_WANT_RELAX_ORDER to set RO feature,
and should define CONFIG_ARCH_WANT_RELAX_ORDER in sparc Kconfig firstly.
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a8b6d76

Merge branch 'ipv6-simplify-rt6_fill_node' · 1e48aac1

David S. Miller authored Jan 18, 2017

David Ahern says:

====================
net: ipv6: simplify rt6_fill_node

Remove a couple of unnecessary input arguments to rt6_fill_node.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1e48aac1

net: ipv6: remove prefix arg to rt6_fill_node · f8cfe2ce

David Ahern authored Jan 17, 2017

The prefix arg to rt6_fill_node is non-0 in only 1 path - rt6_dump_route
where a user is requesting a prefix only dump. Simplify rt6_fill_node
by removing the prefix arg and moving the prefix check to rt6_dump_route.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8cfe2ce

net: ipv6: remove nowait arg to rt6_fill_node · fd61c6ba

David Ahern authored Jan 17, 2017

All callers of rt6_fill_node pass 0 for nowait arg. Remove the arg and
simplify rt6_fill_node accordingly.

rt6_fill_node passes the nowait of 0 to ip6mr_get_route. Remove the
nowait arg from it as well.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fd61c6ba

Merge branch 'sctp-sender-side-stream-reconf-ssn-reset-request-chunk' · 1ce463dd

David S. Miller authored Jan 18, 2017

Xin Long says:

====================
sctp: add sender-side procedures for stream reconf ssn reset request chunk

Patch 6/6 is to implement sender-side procedures for the Outgoing
and Incoming SSN Reset Request Parameter described in rfc6525
section 5.1.2 and 5.1.3

Patches 1-5/6 are ahead of it to define some apis and asoc members
for it.

Note that with this patchset, asoc->reconf_enable has no chance yet to
be set, until the patch "sctp: add get and set sockopt for reconf_enable"
is applied in the future. As we can not just enable it when sctp is not
capable of processing reconf chunk yet.

v1->v2:
  - put these into a smaller group.
  - rename some temporary variables in the codes.
  - rename the titles of the commits and improve some changelogs.
v2->v3:
  - re-split the patchset and make sure it has no dead codes for review.
v3->v4:
  - move sctp_make_reconf() into patch 1/6 to avoid kbuild warning.
  - drop unused struct sctp_strreset_req.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1ce463dd