Commits · 39b7b6a6247568f99df059e8a4c9bd674f6b99ac · nexedi / linux

19 Jan, 2017 1 commit

net/sched: cls_flower: reduce fl_change stack size · 39b7b6a6

Arnd Bergmann authored Jan 19, 2017

The new ARP support has pushed the stack size over the edge on ARM,
as there are two large objects on the stack in this function (mask
and tb) and both have now grown a bit more:

net/sched/cls_flower.c: In function 'fl_change':
net/sched/cls_flower.c:928:1: error: the frame size of 1072 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

We can solve this by dynamically allocating one or both of them.
I first tried to do it just for the mask, but that only saved
152 bytes on ARM, while this version just does it for the 'tb'
array, bringing the stack size back down to 664 bytes.

Fixes: 99d31326 ("net/sched: cls_flower: Support matching on ARP")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

39b7b6a6

18 Jan, 2017 28 commits

net: Remove usage of net_device last_rx member · 4a7c9726

Tobias Klauser authored Jan 18, 2017

The network stack no longer uses the last_rx member of struct net_device
since the bonding driver switched to use its own private last_rx in
commit 9f242738 ("bonding: use last_arp_rx in slave_last_rx()").

However, some drivers still (ab)use the field for their own purposes and
some driver just update it without actually using it.

Previously, there was an accompanying comment for the last_rx member
added in commit 4dc89133 ("net: add a comment on netdev->last_rx")
which asked drivers not to update is, unless really needed. However,
this commend was removed in commit f8ff080d ("bonding: remove
useless updating of slave->dev->last_rx"), so some drivers added later
on still did update last_rx.

Remove all usage of last_rx and switch three drivers (sky2, atp and
smc91c92_cs) which actually read and write it to use their own private
copy in netdev_priv.

Compile-tested with allyesconfig and allmodconfig on x86 and arm.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a7c9726

net: dsa: use cpu_switch instead of ds[0] · 9520ed8f

Vivien Didelot authored Jan 17, 2017

Now that the DSA Ethernet switches are true Linux devices, the CPU
switch is not necessarily the first one. If its address is higher than
the second switch on the same MDIO bus, its index will be 1, not 0.

Avoid any confusion by using dst->cpu_switch instead of dst->ds[0].
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9520ed8f

net: dsa: store CPU switch structure in the tree · b22de490

Vivien Didelot authored Jan 17, 2017

Store a dsa_switch pointer to the CPU switch in the tree instead of only
its index. This avoids the need to initialize it to -1.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b22de490

net: ethernet: ti: davinci_cpdma: correct check on NULL in set rate · e33c2ef1

Ivan Khoronzhuk authored Jan 18, 2017

Check "ch" on NULL first, then get ctlr.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e33c2ef1

Merge branch 'vhost_net-batching' · e3e37e70

David S. Miller authored Jan 18, 2017

Jason Wang says:

====================
vhost_net tx batching

This series tries to implement tx batching support for vhost. This was
done by using MSG_MORE as a hint for under layer socket. The backend
(e.g tap) can then batch the packets temporarily in a list and
submit it all once the number of bacthed exceeds a limitation.

Tests shows obvious improvement on guest pktgen over over
mlx4(noqueue) on host:

                                     Mpps  -+%
        rx-frames = 0                0.91  +0%
        rx-frames = 4                1.00  +9.8%
        rx-frames = 8                1.00  +9.8%
        rx-frames = 16               1.01  +10.9%
        rx-frames = 32               1.07  +17.5%
        rx-frames = 48               1.07  +17.5%
        rx-frames = 64               1.08  +18.6%
        rx-frames = 64 (no MSG_MORE) 0.91  +0%

Changes from V4:
- stick to NAPI_POLL_WEIGHT for rx-frames is user specify a value
  greater than it.
Changes from V3:
- use ethtool instead of module parameter to control the maximum
  number of batched packets
- avoid overhead when MSG_MORE were not set and no packet queued
Changes from V2:
- remove uselss queue limitation check (and we don't drop any packet now)
Changes from V1:
- drop NAPI handler since we don't use NAPI now
- fix the issues that may exceeds max pending of zerocopy
- more improvement on available buffer detection
- move the limitation of batched pacekts from vhost to tuntap
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e3e37e70

tun: rx batching · 5503fcec

Jason Wang authored Jan 18, 2017

We can only process 1 packet at one time during sendmsg(). This often
lead bad cache utilization under heavy load. So this patch tries to do
some batching during rx before submitting them to host network
stack. This is done through accepting MSG_MORE as a hint from
sendmsg() caller, if it was set, batch the packet temporarily in a
linked list and submit them all once MSG_MORE were cleared.

Tests were done by pktgen (burst=128) in guest over mlx4(noqueue) on host:

                                 Mpps  -+%
    rx-frames = 0                0.91  +0%
    rx-frames = 4                1.00  +9.8%
    rx-frames = 8                1.00  +9.8%
    rx-frames = 16               1.01  +10.9%
    rx-frames = 32               1.07  +17.5%
    rx-frames = 48               1.07  +17.5%
    rx-frames = 64               1.08  +18.6%
    rx-frames = 64 (no MSG_MORE) 0.91  +0%

User were allowed to change per device batched packets through
ethtool -C rx-frames. NAPI_POLL_WEIGHT were used as upper limitation
to prevent bh from being disabled too long.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5503fcec

vhost_net: tx batching · 0ed005ce

Jason Wang authored Jan 18, 2017

This patch tries to utilize tuntap rx batching by peeking the tx
virtqueue during transmission, if there's more available buffers in
the virtqueue, set MSG_MORE flag for a hint for backend (e.g tuntap)
to batch the packets.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ed005ce

vhost: better detection of available buffers · 275bf960

Jason Wang authored Jan 18, 2017

This patch tries to do several tweaks on vhost_vq_avail_empty() for a
better performance:

- check cached avail index first which could avoid userspace memory access.
- using unlikely() for the failure of userspace access
- check vq->last_avail_idx instead of cached avail index as the last
  step.

This patch is need for batching supports which needs to peek whether
or not there's still available buffers in the ring.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

275bf960

net:add one common config ARCH_WANT_RELAX_ORDER to support relax ordering · 1a8b6d76

Mao Wenan authored Jan 18, 2017

Relax ordering(RO) is one feature of 82599 NIC, to enable this feature can
enhance the performance for some cpu architecure, such as SPARC and so on.
Currently it only supports one special cpu architecture(SPARC) in 82599
driver to enable RO feature, this is not very common for other cpu architecture
which really needs RO feature.
This patch add one common config CONFIG_ARCH_WANT_RELAX_ORDER to set RO feature,
and should define CONFIG_ARCH_WANT_RELAX_ORDER in sparc Kconfig firstly.
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a8b6d76

Merge branch 'ipv6-simplify-rt6_fill_node' · 1e48aac1

David S. Miller authored Jan 18, 2017

David Ahern says:

====================
net: ipv6: simplify rt6_fill_node

Remove a couple of unnecessary input arguments to rt6_fill_node.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1e48aac1

net: ipv6: remove prefix arg to rt6_fill_node · f8cfe2ce

David Ahern authored Jan 17, 2017

The prefix arg to rt6_fill_node is non-0 in only 1 path - rt6_dump_route
where a user is requesting a prefix only dump. Simplify rt6_fill_node
by removing the prefix arg and moving the prefix check to rt6_dump_route.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8cfe2ce

net: ipv6: remove nowait arg to rt6_fill_node · fd61c6ba

David Ahern authored Jan 17, 2017

All callers of rt6_fill_node pass 0 for nowait arg. Remove the arg and
simplify rt6_fill_node accordingly.

rt6_fill_node passes the nowait of 0 to ip6mr_get_route. Remove the
nowait arg from it as well.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fd61c6ba

Merge branch 'sctp-sender-side-stream-reconf-ssn-reset-request-chunk' · 1ce463dd

David S. Miller authored Jan 18, 2017

Xin Long says:

====================
sctp: add sender-side procedures for stream reconf ssn reset request chunk

Patch 6/6 is to implement sender-side procedures for the Outgoing
and Incoming SSN Reset Request Parameter described in rfc6525
section 5.1.2 and 5.1.3

Patches 1-5/6 are ahead of it to define some apis and asoc members
for it.

Note that with this patchset, asoc->reconf_enable has no chance yet to
be set, until the patch "sctp: add get and set sockopt for reconf_enable"
is applied in the future. As we can not just enable it when sctp is not
capable of processing reconf chunk yet.

v1->v2:
  - put these into a smaller group.
  - rename some temporary variables in the codes.
  - rename the titles of the commits and improve some changelogs.
v2->v3:
  - re-split the patchset and make sure it has no dead codes for review.
v3->v4:
  - move sctp_make_reconf() into patch 1/6 to avoid kbuild warning.
  - drop unused struct sctp_strreset_req.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1ce463dd