Commits · 54e9e0874938ba5b112b9352e280b4962590a57d · Kirill Smelkov / linux

18 May, 2017 17 commits

net/wan/fsl_ucc_hdlc: call qe_setbrg only for loopback mode · 54e9e087

Holger Brunck authored May 17, 2017

We can't assume that we are always in loopback mode if rx and tx clock
have the same clock source. If we want to use HDLC busmode we also have
the same clock source but we are not in loopback mode. So move the
setting of the baudrate generator after the check for property for the
loopback mode.
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

54e9e087

net/wan/fsl_ucc_hdlc: fix incorrect memory allocation · 5b8aad93

Holger Brunck authored May 17, 2017

We need space for the struct qe_bd and not for a pointer to this struct.
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b8aad93

net/wan/fsl_ucc_hdlc: fix wrong indentation · 10515db5

Holger Brunck authored May 17, 2017

Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10515db5

net/wan/fsl_ucc_hdlc: fix unitialized variable warnings · 66bb144b

Holger Brunck authored May 17, 2017

This fixes the following compiler warnings:
drivers/net/wan/fsl_ucc_hdlc.c: In function 'ucc_hdlc_poll':
warning: 'skb' may be used uninitialized in this function
[-Wmaybe-uninitialized]
  skb->mac_header = skb->data - skb->head;

and

drivers/net/wan/fsl_ucc_hdlc.c: In function 'ucc_hdlc_probe':
drivers/net/wan/fsl_ucc_hdlc.c:1127:3: warning: 'utdm' may be used
uninitialized in this function [-Wmaybe-uninitialized]
   kfree(utdm);
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66bb144b

net/wan/fsl_ucc_hdlc: cleanup debug traces · f0897854

Holger Brunck authored May 17, 2017

Some of the tracing seems to be remaining traces for basic driver
development. They can be removed now, as they cause noisy printouts.
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f0897854

net: make struct net_device::tx_queue_len unsigned int · 0cd29503

Alexey Dobriyan authored May 17, 2017

4 billion packet queue is something unthinkable so use 32-bit value
for now.

Space savings on x86_64:

	add/remove: 0/0 grow/shrink: 3/70 up/down: 16/-131 (-115)
	function                                     old     new   delta
	change_tx_queue_len                           94     108     +14
	qdisc_create                                1176    1177      +1
	alloc_netdev_mqs                            1124    1125      +1
	xenvif_alloc                                 533     532      -1
	x25_asy_setup                                167     166      -1
			...
	tun_queue_resize                             945     940      -5
	pfifo_fast_enqueue                           167     162      -5
	qfq_init_qdisc                               168     158     -10
	tap_queue_resize                             810     799     -11
	transmit                                     719     698     -21
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0cd29503

udp: make function udp_skb_dtor_locked static · 64f5102d

Colin Ian King authored May 17, 2017

Function udp_skb_dtor_locked does not need to be in global scope
so make it static to fix sparse warning:

net/ipv4/udp.c: warning: symbol 'udp_skb_dtor_locked' was not
declared. Should it be static?

Fixes: 6dfb4367 ("udp: keep the sk_receive_queue held when splicing")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

64f5102d

Merge branch 'vhost_net-rx-batch-dequeuing' · f646c75b

David S. Miller authored May 18, 2017

Jason Wang says:

====================
vhost_net rx batch dequeuing

This series tries to implement rx batching for vhost-net. This is done
by batching the dequeuing from skb_array which was exported by
underlayer socket and pass the sbk back through msg_control to finish
userspace copying. This is also the requirement for more batching
implemention on rx path.

Tests shows at most 7.56% improvment bon rx pps on top of batch
zeroing and no obvious changes for TCP_STREAM/TCP_RR result.

Please review.

Thanks

Changes from V4:
- drop batch zeroing patch
- renew the performance numbers
- move skb pointer array out of vhost_net structure

Changes from V3:
- add batch zeroing patch to fix the build warnings

Changes from V2:
- rebase to net-next HEAD
- use unconsume helpers to put skb back on releasing
- introduce and use vhost_net internal buffer helpers
- renew performance numbers on top of batch zeroing

Changes from V1:
- switch to use for() in __ptr_ring_consume_batched()
- rename peek_head_len_batched() to fetch_skbs()
- use skb_array_consume_batched() instead of
  skb_array_consume_batched_bh() since no consumer run in bh
- drop the lockless peeking patch since skb_array could be resized, so
  it's not safe to call lockless one
====================
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f646c75b

vhost_net: try batch dequing from skb array · c67df11f

Jason Wang authored May 17, 2017

We used to dequeue one skb during recvmsg() from skb_array, this could
be inefficient because of the bad cache utilization and spinlock
touching for each packet. This patch tries to batch them by calling
batch dequeuing helpers explicitly on the exported skb array and pass
the skb back through msg_control for underlayer socket to finish the
userspace copying. Batch dequeuing is also the requirement for more
batching improvement on receive path.

Tests were done by pktgen on tap with XDP1 in guest. Host is Intel(R)
Xeon(R) CPU E5-2650 0 @ 2.00GHz.

rx batch | pps

0   2.25Mpps
1   2.33Mpps (+3.56%)
4   2.33Mpps (+3.56%)
16  2.35Mpps (+4.44%)
64  2.42Mpps (+7.56%) <- Default rx batching
128 2.40Mpps (+6.67%)
256 2.38Mpps (+5.78%)
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c67df11f

tap: support receiving skb from msg_control · 3b4ba04a

Jason Wang authored May 17, 2017

This patch makes tap_recvmsg() can receive from skb from its caller
through msg_control. Vhost_net will be the first user.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3b4ba04a

tun: support receiving skb through msg_control · ac77cfd4

Jason Wang authored May 17, 2017

This patch makes tun_recvmsg() can receive from skb from its caller
through msg_control. Vhost_net will be the first user.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac77cfd4

tap: export skb_array · 49f96fd0

Jason Wang authored May 17, 2017

This patch exports skb_array through tap_get_skb_array(). Caller can
then manipulate skb array directly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

49f96fd0

tun: export skb_array · 83339c6b

Jason Wang authored May 17, 2017

This patch exports skb_array through tun_get_skb_array(). Caller can
then manipulate skb array directly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

83339c6b

skb_array: introduce batch dequeuing · 3528c1a5

Jason Wang authored May 17, 2017

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3528c1a5

ptr_ring: introduce batch dequeuing · 728fc8d5

Jason Wang authored May 17, 2017

This patch introduce a batched version of consuming, consumer can
dequeue more than one pointers from the ring at a time. We don't care
about the reorder of reading here so no need for compiler barrier.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

728fc8d5

skb_array: introduce skb_array_unconsume · 3acb6960

Jason Wang authored May 17, 2017

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3acb6960

ptr_ring: add ptr_ring_unconsume · 197a5212

Michael S. Tsirkin authored May 17, 2017

Applications that consume a batch of entries in one go
can benefit from ability to return some of them back
into the ring.

Add an API for that - assuming there's space. If there's no space
naturally can't do this and have to drop entries, but this implies ring
is full so we'd likely drop some anyway.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

197a5212

17 May, 2017 23 commits

Merge branch 'phy-marvell-cleanups' · 1fc4d180

David S. Miller authored May 17, 2017

Andrew Lunn says:

====================
net: phy: marvell: Checkpatch cleanup

I will be contributing a few new features to the Marvell PHY driver
soon. Start by making the code mostly checkpatch clean. There should
not be any functional changes. Just comments set into the correct
format, missing blank lines, turn some comparisons around, and
refactoring to reduce indentation depth.

There is still one camel in the code, but it actually makes sense, so
leave it in piece.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1fc4d180

net: phy: marvell: checkpatch - Fix remaining long lines · 23beb38f

Andrew Lunn authored May 17, 2017

Fold lines longer than 80 characters
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

23beb38f

net: phy: marvell: Add helpers to get/set page · 6427bb2d

Andrew Lunn authored May 17, 2017

Makes the code a bit more readable, and solves quite a few checkpatch
warnings of lines longer than 80 characters.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

6427bb2d

net: phy: marvell: Refactor some bigger functions · e1dde8dc

Andrew Lunn authored May 17, 2017

Break big functions up by using a number of smaller helper
function. Solves some of the over 80 lines warnings, by reducing the
indentation level.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1dde8dc

net: phy: marvell: Checkpatch - assignments and comparisons · 4f48ed32

Andrew Lunn authored May 17, 2017

Avoid multiple assignments
Comparisons should place the constant on the right side of the test
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f48ed32

net: phy: marvell: Checkpatch - Missing or extra blank lines · e69d9ed4

Andrew Lunn authored May 17, 2017

Remove the extra blank lines, add one in where recommended.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

e69d9ed4

net: phy: Marvell: checkpatch - Comments · 0c3439bc

Andrew Lunn authored May 17, 2017

Use net style comment blocks, and wrap one block with long lines.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c3439bc

Merge branch 'tcp-TCP-TS-option-use-1-ms-clock' · e26925ec

David S. Miller authored May 17, 2017

Eric Dumazet says:

====================
tcp: TCP TS option use 1 ms clock

TCP Timestamps option is defined in RFC 7323

Traditionally on linux, it has been tied to the internal
'jiffy' variable, because it had been a cheap and good enough
generator.

Unfortunately some distros use HZ=250 or even HZ=100 leading
to not very useful TCP timestamps.

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].
RCVBUF autotuning is more precise.

This series converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

Kathleen Nichols [2] and others advocate for 1ms TS clocks for
network analysis. (1ms being the lowest value supported by RFC 7323.)

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
[2] http://netseminar.stanford.edu/seminars/02_02_17.pdf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e26925ec

tcp: switch TCP TS option (RFC 7323) to 1ms clock · 9a568de4

Eric Dumazet authored May 16, 2017

TCP Timestamps option is defined in RFC 7323

Traditionally on linux, it has been tied to the internal
'jiffies' variable, because it had been a cheap and good enough
generator.

For TCP flows on the Internet, 1 ms resolution would be much better
than 4ms or 10ms (HZ=250 or HZ=100 respectively)

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1]

Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdfSigned-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9a568de4

tcp: replace misc tcp_time_stamp to tcp_jiffies32 · ac9517fc

Eric Dumazet authored May 16, 2017

After this patch, all uses of tcp_time_stamp will require
a change when we introduce 1 ms and/or 1 us TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac9517fc

tcp_lp: cache tcp_time_stamp · 46bf466f

Eric Dumazet authored May 16, 2017

tcp_time_stamp will become slightly more expensive soon,
cache its value.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

46bf466f

tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp · ad5ad69e

Eric Dumazet authored May 16, 2017

This CC does not need 1 ms tcp_time_stamp and can use
the jiffy based 'timestamp'.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad5ad69e

tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() · 594208af

Eric Dumazet authored May 16, 2017

This place wants to use tcp_jiffies32, this is good enough.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

594208af

tcp: uses jiffies_32 to feed tp->chrono_start · 628174cc

Eric Dumazet authored May 16, 2017

tcp_time_stamp will no longer be tied to jiffies.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

628174cc

tcp: use tcp_jiffies32 to feed probe_timestamp · c74df29a

Eric Dumazet authored May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c74df29a

tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime · 70eabf0e

Eric Dumazet authored May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70eabf0e

tcp: bic, cubic: use tcp_jiffies32 instead of tcp_time_stamp · ac35f562

Eric Dumazet authored May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac35f562

tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp · 2660bfa8

Eric Dumazet authored May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp, since
tcp_time_stamp will soon be only used for TCP TS option.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2660bfa8

tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp · c2203cf7

Eric Dumazet authored May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->snd_cwnd_stamp.

tcp_time_stamp will soon be a litle bit more expensive
than simply reading 'jiffies'.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c2203cf7

tcp: use tcp_jiffies32 to feed tp->lsndtime · d635fbe2

Eric Dumazet authored May 16, 2017

Use tcp_jiffies32 instead of tcp_time_stamp to feed
tp->lsndtime.

tcp_time_stamp will soon be a litle bit more expensive
than simply reading 'jiffies'.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d635fbe2

dccp: do not use tcp_time_stamp · d011b9a4

Eric Dumazet authored May 16, 2017

Use our own macro instead of abusing tcp_time_stamp
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d011b9a4

tcp: introduce tcp_jiffies32 · ec66eda8

Eric Dumazet authored May 16, 2017

We abuse tcp_time_stamp for two different cases :

1) base to generate TCP Timestamp options (RFC 7323)

2) A 32bit version of jiffies since some TCP fields
   are 32bit wide to save memory.

Since we want in the future to have 1ms TCP TS clock,
regardless of HZ value, we want to cleanup things.

tcp_jiffies32 is the truncated jiffies value,
which will be used only in places where we want a 'host'
timestamp.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec66eda8

tcp: use tp->tcp_mstamp in output path · 385e2070

Eric Dumazet authored May 16, 2017

Idea is to later convert tp->tcp_mstamp to a full u64 counter
using usec resolution, so that we can later have fine
grained TCP TS clock (RFC 7323), regardless of HZ value.

We try to refresh tp->tcp_mstamp only when necessary.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

385e2070