Commit 901546fd authored by John Fastabend's avatar John Fastabend Committed by Daniel Borkmann

bpf, sockmap: Handle fin correctly

The sockmap code is returning EAGAIN after a FIN packet is received and no
more data is on the receive queue. Correct behavior is to return 0 to the
user and the user can then close the socket. The EAGAIN causes many apps
to retry which masks the problem. Eventually the socket is evicted from
the sockmap because its released from sockmap sock free handling. The
issue creates a delay and can cause some errors on application side.

To fix this check on sk_msg_recvmsg side if length is zero and FIN flag
is set then set return to zero. A selftest will be added to check this
condition.

Fixes: 04919bed ("tcp: Introduce tcp_read_skb()")
Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
Tested-by: default avatarWilliam Findlay <will@isovalent.com>
Reviewed-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-6-john.fastabend@gmail.com
parent 405df89d
...@@ -174,6 +174,24 @@ static int tcp_msg_wait_data(struct sock *sk, struct sk_psock *psock, ...@@ -174,6 +174,24 @@ static int tcp_msg_wait_data(struct sock *sk, struct sk_psock *psock,
return ret; return ret;
} }
static bool is_next_msg_fin(struct sk_psock *psock)
{
struct scatterlist *sge;
struct sk_msg *msg_rx;
int i;
msg_rx = sk_psock_peek_msg(psock);
i = msg_rx->sg.start;
sge = sk_msg_elem(msg_rx, i);
if (!sge->length) {
struct sk_buff *skb = msg_rx->skb;
if (skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
return true;
}
return false;
}
static int tcp_bpf_recvmsg_parser(struct sock *sk, static int tcp_bpf_recvmsg_parser(struct sock *sk,
struct msghdr *msg, struct msghdr *msg,
size_t len, size_t len,
...@@ -196,6 +214,19 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk, ...@@ -196,6 +214,19 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
lock_sock(sk); lock_sock(sk);
msg_bytes_ready: msg_bytes_ready:
copied = sk_msg_recvmsg(sk, psock, msg, len, flags); copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
/* The typical case for EFAULT is the socket was gracefully
* shutdown with a FIN pkt. So check here the other case is
* some error on copy_page_to_iter which would be unexpected.
* On fin return correct return code to zero.
*/
if (copied == -EFAULT) {
bool is_fin = is_next_msg_fin(psock);
if (is_fin) {
copied = 0;
goto out;
}
}
if (!copied) { if (!copied) {
long timeo; long timeo;
int data; int data;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment