    netfilter: flowtable: fix TCP flow teardown · e5eaac2b
    Pablo Neira Ayuso authored
    This patch addresses three possible problems:
    
    1. ct gc may race to undo the timeout adjustment of the packet path, leaving
       the conntrack entry in place with the internal offload timeout (one day).
    
    2. ct gc removes the ct because the IPS_OFFLOAD_BIT is not set and the CLOSE
       timeout is reached before the flow offload del.
    
    3. tcp ct is always set to ESTABLISHED with a very long timeout
       in flow offload teardown/delete even though the state might be already
       CLOSED. Also as a remark we cannot assume that the FIN or RST packet
       is hitting flow table teardown as the packet might get bumped to the
       slow path in nftables.
    
    This patch resets IPS_OFFLOAD_BIT from flow_offload_teardown(), so
    conntrack handles the tcp rst/fin packet which triggers the CLOSE/FIN
    state transition.
    
    Moreover, return the connection's ownership to conntrack upon teardown
    by clearing the offload flag and fixing the established timeout value.
    The flow table GC thread will asynchronously free the flow table and
    hardware offload entries.
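
The teardown behaviour described above can be sketched as a small user-space model. This is not the kernel code: the `model_*` types, the timeout table and its values are hypothetical stand-ins, and the timeout fixup is assumed to be a simple clamp to the timeout of the current TCP state. The point it illustrates is that teardown clears the offload flag and shortens the timeout, but leaves the TCP state alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum model_tcp_state {
    MODEL_TCP_SYN_RECV,
    MODEL_TCP_ESTABLISHED,
    MODEL_TCP_FIN_WAIT,
    MODEL_TCP_CLOSE,
    MODEL_TCP_STATE_MAX
};

struct model_ct {
    bool offloaded;             /* stands in for IPS_OFFLOAD_BIT */
    enum model_tcp_state state; /* tracked TCP state */
    int64_t expires;            /* absolute expiry time, in seconds */
};

struct model_flow {
    struct model_ct *ct;
    bool teardown;              /* marks the entry for the flow table GC */
};

/* Illustrative per-state timeouts in seconds; not the kernel defaults. */
static const int64_t model_timeouts[MODEL_TCP_STATE_MAX] = {
    [MODEL_TCP_SYN_RECV]    = 60,
    [MODEL_TCP_ESTABLISHED] = 3600,
    [MODEL_TCP_FIN_WAIT]    = 120,
    [MODEL_TCP_CLOSE]       = 10,
};

void model_flow_teardown(struct model_flow *flow, int64_t now)
{
    struct model_ct *ct = flow->ct;
    int64_t limit;

    assert(ct != NULL);

    /* Hand the connection back to conntrack. */
    ct->offloaded = false;
    /* Leave the actual freeing to the (asynchronous) GC. */
    flow->teardown = true;

    /*
     * Do not touch ct->state: the slow path may already have moved it
     * to CLOSE or FIN_WAIT. Only pull the long offload expiry back to
     * what the current state allows (assumed clamp, see lead-in).
     */
    limit = now + model_timeouts[ct->state];
    if (ct->expires > limit)
        ct->expires = limit;
}
```

In this model a connection that is already in CLOSE keeps that state after teardown and expires after the short CLOSE timeout instead of the one-day offload timeout.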
    
    Before this patch, the IPS_OFFLOAD_BIT remained set for expired flows,
    which is also misleading since the flow is back on the classic
    conntrack path.
    
    If nf_ct_delete() removes the entry from the conntrack table, then it
    calls nf_ct_put() which decrements the refcnt. This is not a problem
    because the flowtable holds a reference to the conntrack object from
    flow_offload_alloc() path which is released via flow_offload_free().
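
The reference counting argument above can be walked through with a toy model. The `model_*` names are hypothetical and the struct is deliberately minimal; it only assumes that the conntrack table holds one reference and that the flowtable takes a second one when the flow is allocated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted conntrack object. */
struct model_ct_ref {
    int refcnt;     /* number of holders */
    bool in_table;  /* still linked in the conntrack table */
    bool freed;     /* set once the last reference is dropped */
};

/* Models nf_ct_put(): drop one reference, free on the last one. */
void model_ct_put(struct model_ct_ref *ct)
{
    assert(ct->refcnt > 0);
    if (--ct->refcnt == 0)
        ct->freed = true;
}

/* Models the reference taken on the flow_offload_alloc() path. */
void model_flow_alloc(struct model_ct_ref *ct)
{
    ct->refcnt++;
}

/* Models nf_ct_delete(): unlink from the table, drop the table's ref. */
void model_ct_delete(struct model_ct_ref *ct)
{
    if (!ct->in_table)
        return;
    ct->in_table = false;
    model_ct_put(ct);
}

/* Models flow_offload_free(): drop the flowtable's reference. */
void model_flow_free(struct model_ct_ref *ct)
{
    model_ct_put(ct);
}
```

Calling `model_ct_delete()` while the flow still exists leaves one reference behind, so the object is only released once `model_flow_free()` runs.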
    
    This patch also updates nft_flow_offload to skip packets in SYN_RECV
    state. Since packets might be missed or bumped to the slow path, we do
    not know what will happen there while the connection is still in
    SYN_RECV. This patch therefore postpones offload until the next packet,
    which also aligns with the existing behaviour in tc-ct.
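
A rough illustration of the postponed offload decision follows. The `model_*` names are hypothetical, and the sketch assumes that "not in SYN_RECV" amounts to "conntrack already sees the connection as ESTABLISHED"; the real check in nft_flow_offload may look at more than the state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the conntrack TCP states. */
enum model_tcp_state {
    MODEL_TCP_SYN_SENT,
    MODEL_TCP_SYN_RECV,
    MODEL_TCP_ESTABLISHED,
    MODEL_TCP_CLOSE
};

/* Assumed rule: only an established connection may be offloaded. */
bool model_may_offload(enum model_tcp_state state)
{
    return state == MODEL_TCP_ESTABLISHED;
}

/*
 * Given the conntrack state observed for each packet in order, return
 * the index of the first packet on which the flow would be offloaded,
 * or -1 if none qualifies.
 */
int model_first_offload_packet(const enum model_tcp_state *states, size_t n)
{
    size_t i;

    assert(states != NULL);

    for (i = 0; i < n; i++) {
        if (model_may_offload(states[i]))
            return (int)i;
    }
    return -1;
}
```

With a packet sequence seen in SYN_RECV, ESTABLISHED, ESTABLISHED, the first packet is skipped and the flow is offloaded on the second one.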
    
    flow_offload_teardown() no longer resets the existing tcp state to
    ESTABLISHED via flow_offload_fixup_tcp(), since packets bumped to the
    slow path might have already updated the state to CLOSE/FIN.
    
    Joint work with Oz and Sven.
    
    Fixes: 1e5b2471 ("netfilter: nf_flow_table: teardown flow timeout race")
    Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
    Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nft_flow_offload.c 12.8 KB