Commit 02288248, authored by Jon Maloy, committed by David S. Miller

tipc: eliminate gap indicator from ACK messages

When we increase the link send window we sometimes observe the
following scenario:

1) A packet #N arrives out of order far ahead of a sequence of older
   packets which are still under way. The packet is added to the
   deferred queue.
2) The missing packets arrive in sequence, and for every 16th of them
   an ACK is sent back to the sender, as it should be.
3) When those ACK messages are built, the code checks whether there is
   a gap between the link's 'rcv_nxt' and the first packet in the
   deferred queue. There always is one until packet #N-1 arrives, so a
   'gap' indicator is added, effectively turning the ACKs into NACK
   messages.
4) When those NACKs arrive at the sender, all the requested
   retransmissions are done, since it is a first-time request.
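
The scenario above can be illustrated with a small stand-alone model. This is only a sketch, not kernel code: the function name `count_gap_acks`, the constant `ACK_INTERVAL` and the parameters are hypothetical, and the model covers just the rule that every 16th in-order packet triggers an ACK, which carries a gap for as long as packet #N is still waiting in the deferred queue.

```c
#include <stdint.h>

#define ACK_INTERVAL 16 /* one ACK per 16 in-sequence packets (step 2) */

/*
 * Hypothetical model: packet 'n' already sits in the deferred queue and
 * packets first..n-1 now arrive in order. Returns how many of the ACKs
 * sent along the way would carry a non-zero gap, i.e. act as NACKs
 * under the old behaviour.
 */
unsigned int count_gap_acks(uint16_t first, uint16_t n)
{
	uint16_t rcv_nxt = first;	/* next in-sequence packet expected */
	unsigned int received = 0;
	unsigned int gap_acks = 0;

	while (rcv_nxt != n) {
		rcv_nxt++;		/* accept one in-order packet */
		received++;
		if (received % ACK_INTERVAL == 0) {
			/* 16-bit subtraction, so sequence wrap-around is handled */
			uint16_t gap = (uint16_t)(n - rcv_nxt);

			if (gap)
				gap_acks++;
		}
	}
	return gap_acks;
}
```

With `first = 100` and `n = 1000`, 900 packets arrive and 56 ACKs are sent. In this model every one of them carries a gap, so each would have requested retransmission of packets that were merely still in flight.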

This sometimes leads to a huge number of redundant retransmissions,
causing a drop in max throughput. The problem gets worse when a later
commit introduces variable window congestion control, since it then
drops the link back to 'fast recovery' much more often than necessary.

We now fix this by not sending any 'gap' indicator in regular ACK
messages. We already have a mechanism for sending explicit NACKs
in place, and this is sufficient to keep up the packet flow.
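
The rule described above can be sketched as a free-standing function. It is a simplified model of the check in `tipc_link_build_proto_msg()`, not the kernel function itself: the name `state_msg_gap` and its parameter list are hypothetical, and it assumes 16-bit sequence numbers, as the `u16 rcvgap` in the patch suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the gap value a STATE message carries after this
 * change. 'caller_gap' is whatever the caller passed in (non-zero only
 * for an explicit NACK). The deferred queue is consulted only for probes
 * and probe replies, so a regular ACK no longer picks up a gap.
 */
uint16_t state_msg_gap(bool probe, bool probe_reply, uint16_t caller_gap,
		       bool deferdq_empty, uint16_t first_deferred_seqno,
		       uint16_t rcv_nxt)
{
	uint16_t gap = caller_gap;

	if ((probe || probe_reply) && !deferdq_empty)
		gap = (uint16_t)(first_deferred_seqno - rcv_nxt);
	return gap;
}
```

A regular ACK (no probe flags, `caller_gap` of 0) now returns 0 even when a packet is waiting in the deferred queue, while an explicit NACK keeps the gap its caller computed.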

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 08cbc75f
@@ -1521,7 +1521,8 @@ static int tipc_link_build_nack_msg(struct tipc_link *l,
 					struct sk_buff_head *xmitq)
 {
 	u32 def_cnt = ++l->stats.deferred_recv;
-	u32 defq_len = skb_queue_len(&l->deferdq);
+	struct sk_buff_head *dfq = &l->deferdq;
+	u32 defq_len = skb_queue_len(dfq);
 	int match1, match2;
 	if (link_is_bc_rcvlink(l)) {
@@ -1532,8 +1533,12 @@ static int tipc_link_build_nack_msg(struct tipc_link *l,
 		return 0;
 	}
-	if (defq_len >= 3 && !((defq_len - 3) % 16))
-		tipc_link_build_proto_msg(l, STATE_MSG, 0, 0, 0, 0, 0, xmitq);
+	if (defq_len >= 3 && !((defq_len - 3) % 16)) {
+		u16 rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt;
+		tipc_link_build_proto_msg(l, STATE_MSG, 0, 0,
+					  rcvgap, 0, 0, xmitq);
+	}
 	return 0;
 }
@@ -1631,7 +1636,7 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe,
 	if (!tipc_link_is_up(l) && (mtyp == STATE_MSG))
 		return;
-	if (!skb_queue_empty(dfq))
+	if ((probe || probe_reply) && !skb_queue_empty(dfq))
 		rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt;
 	skb = tipc_msg_create(LINK_PROTOCOL, mtyp, INT_H_SIZE,
@@ -2079,7 +2084,6 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
 		if (rcvgap || reply)
 			tipc_link_build_proto_msg(l, STATE_MSG, 0, reply,
 						  rcvgap, 0, 0, xmitq);
 		rc |= tipc_link_advance_transmq(l, ack, gap, ga, xmitq);
 		/* If NACK, retransmit will now start at right position */
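
The condition guarding the explicit NACK in `tipc_link_build_nack_msg()` fires only at certain deferred-queue lengths. The sketch below isolates that arithmetic so the pattern is easy to check. `nack_due` is a hypothetical name, not a function in the TIPC sources, and other checks in the real function that are not visible in the diff may return before this test is reached.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same arithmetic as the patched condition: true when the deferred
 * queue holds 3, 19, 35, ... packets, i.e. first at length 3 and then
 * once per 16 further deferred packets. The '>= 3' test comes first so
 * the unsigned subtraction never underflows.
 */
bool nack_due(uint32_t defq_len)
{
	return defq_len >= 3 && !((defq_len - 3) % 16);
}
```

Under this rule the explicit NACK rate is tied to the growth of the deferred queue, not to the arrival of in-sequence packets, which is what lets the regular ACKs drop the gap indicator.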