• Peilin Ye's avatar
    ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode · 31c417c9
    Peilin Ye authored
    As pointed out by Jakub Kicinski, currently using TUNNEL_SEQ in
    collect_md mode is racy for [IP6]GRE[TAP] devices.  Consider the
    following sequence of events:
    
    1. An [IP6]GRE[TAP] device is created in collect_md mode using "ip link
       add ... external".  "ip" ignores "[o]seq" if "external" is specified,
       so TUNNEL_SEQ is off, and the device is marked as NETIF_F_LLTX (i.e.
       it uses lockless TX);
    2. Someone sets TUNNEL_SEQ on outgoing skb's, using e.g.
       bpf_skb_set_tunnel_key() in an eBPF program attached to this device;
    3. gre_fb_xmit() or __gre6_xmit() processes these skb's:
    
    	gre_build_header(skb, tun_hlen,
    			 flags, protocol,
    			 tunnel_id_to_key32(tun_info->key.tun_id),
    			 (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++)
    					      : 0);   ^^^^^^^^^^^^^^^^^
    
    Since we are not using the TX lock (&txq->_xmit_lock), multiple CPUs may
    try to do this tunnel->o_seqno++ in parallel, which is racy.  Fix it by
    making o_seqno atomic_t.
    
    As mentioned by Eric Dumazet in commit b790e01a ("ip_gre: lockless
    xmit"), making o_seqno atomic_t increases "chance for packets being out
    of order at receiver" when NETIF_F_LLTX is on.
    
    Maybe a better fix would be:
    
    1. Do not ignore "oseq" in external mode.  Users MUST specify "oseq" if
       they want the kernel to allow sequencing of outgoing packets;
    2. Reject all outgoing TUNNEL_SEQ packets if the device was not created
       with "oseq".
    
    Unfortunately, that would break userspace.
    
    We could now make [IP6]GRE[TAP] devices always NETIF_F_LLTX, but let us
    do it in separate patches to keep this fix minimal.
    Suggested-by: default avatarJakub Kicinski <kuba@kernel.org>
    Fixes: 77a5196a ("gre: add sequence number for collect md mode.")
    Signed-off-by: default avatarPeilin Ye <peilin.ye@bytedance.com>
    Acked-by: default avatarWilliam Tu <u9012063@gmail.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    31c417c9
ip_tunnels.h 13.7 KB