1. 06 Jul, 2021 3 commits
    • Manfred Spraul's avatar
      netfilter: conntrack: Mark access for KCSAN · cf4466ea
      Manfred Spraul authored
      KCSAN detected an data race with ipc/sem.c that is intentional.
      
      As nf_conntrack_lock() uses the same algorithm: Update
      nf_conntrack_core as well:
      
      nf_conntrack_lock() contains
        a1) spin_lock()
        a2) smp_load_acquire(nf_conntrack_locks_all).
      
      a1) actually accesses one lock from an array of locks.
      
      nf_conntrack_locks_all() contains
        b1) nf_conntrack_locks_all=true (normal write)
        b2) spin_lock()
        b3) spin_unlock()
      
      b2 and b3 are done for every lock.
      
      This guarantees that nf_conntrack_locks_all() prevents any
      concurrent nf_conntrack_lock() owners:
      If a thread past a1), then b2) will block until that thread releases
      the lock.
      If the threat is before a1, then b3)+a1) ensure the write b1) is
      visible, thus a2) is guaranteed to see the updated value.
      
      But: This is only the latest time when b1) becomes visible.
      It may also happen that b1) is visible an undefined amount of time
      before the b3). And thus KCSAN will notice a data race.
      
      In addition, the compiler might be too clever.
      
      Solution: Use WRITE_ONCE().
      Signed-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      cf4466ea
    • Ali Abdallah's avatar
      netfilter: conntrack: add new sysctl to disable RST check · 1da4cd82
      Ali Abdallah authored
      This patch adds a new sysctl tcp_ignore_invalid_rst to disable marking
      out of segments RSTs as INVALID.
      Signed-off-by: default avatarAli Abdallah <aabdallah@suse.de>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      1da4cd82
    • Ali Abdallah's avatar
      netfilter: conntrack: improve RST handling when tuple is re-used · c4edc3cc
      Ali Abdallah authored
      If we receive a SYN packet in original direction on an existing
      connection tracking entry, we let this SYN through because conntrack
      might be out-of-sync.
      
      Conntrack gets back in sync when server responds with SYN/ACK and state
      gets updated accordingly.
      
      However, if server replies with RST, this packet might be marked as
      INVALID because td_maxack value reflects the *old* conntrack state
      and not the state of the originator of the RST.
      
      Avoid td_maxack-based checks if previous packet was a SYN.
      
      Unfortunately that is not be enough: an out of order ACK in original
      direction updates last_index, so we still end up marking valid RST.
      
      Thus disable the sequence check when we are not in established state and
      the received RST has a sequence of 0.
      
      Because marking RSTs as invalid usually leads to unwanted timeouts,
      also skip RST sequence checks if a conntrack entry is already closing.
      
      Such entries can already be evicted via GC in case the table is full.
      Co-developed-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarAli Abdallah <aabdallah@suse.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      c4edc3cc
  2. 02 Jul, 2021 5 commits
    • Vasily Averin's avatar
      netfilter: ctnetlink: suspicious RCU usage in ctnetlink_dump_helpinfo · c23a9fd2
      Vasily Averin authored
      Two patches listed below removed ctnetlink_dump_helpinfo call from under
      rcu_read_lock. Now its rcu_dereference generates following warning:
      =============================
      WARNING: suspicious RCU usage
      5.13.0+ #5 Not tainted
      -----------------------------
      net/netfilter/nf_conntrack_netlink.c:221 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      rcu_scheduler_active = 2, debug_locks = 1
      stack backtrace:
      CPU: 1 PID: 2251 Comm: conntrack Not tainted 5.13.0+ #5
      Call Trace:
       dump_stack+0x7f/0xa1
       ctnetlink_dump_helpinfo+0x134/0x150 [nf_conntrack_netlink]
       ctnetlink_fill_info+0x2c2/0x390 [nf_conntrack_netlink]
       ctnetlink_dump_table+0x13f/0x370 [nf_conntrack_netlink]
       netlink_dump+0x10c/0x370
       __netlink_dump_start+0x1a7/0x260
       ctnetlink_get_conntrack+0x1e5/0x250 [nf_conntrack_netlink]
       nfnetlink_rcv_msg+0x613/0x993 [nfnetlink]
       netlink_rcv_skb+0x50/0x100
       nfnetlink_rcv+0x55/0x120 [nfnetlink]
       netlink_unicast+0x181/0x260
       netlink_sendmsg+0x23f/0x460
       sock_sendmsg+0x5b/0x60
       __sys_sendto+0xf1/0x160
       __x64_sys_sendto+0x24/0x30
       do_syscall_64+0x36/0x70
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 49ca022b ("netfilter: ctnetlink: don't dump ct extensions of unconfirmed conntracks")
      Fixes: 0b35f603 ("netfilter: Remove duplicated rcu_read_lock.")
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Reviewed-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      c23a9fd2
    • Vasily Averin's avatar
      netfilter: conntrack: nf_ct_gre_keymap_flush() removal · a23f89a9
      Vasily Averin authored
      nf_ct_gre_keymap_flush() is useless.
      It is called from nf_conntrack_cleanup_net_list() only and tries to remove
      nf_ct_gre_keymap entries from pernet gre keymap list. Though:
      a) at this point the list should already be empty, all its entries were
      deleted during the conntracks cleanup, because
      nf_conntrack_cleanup_net_list() executes nf_ct_iterate_cleanup(kill_all)
      before nf_conntrack_proto_pernet_fini():
       nf_conntrack_cleanup_net_list
        +- nf_ct_iterate_cleanup
        |   nf_ct_put
        |    nf_conntrack_put
        |     nf_conntrack_destroy
        |      destroy_conntrack
        |       destroy_gre_conntrack
        |        nf_ct_gre_keymap_destroy
        `- nf_conntrack_proto_pernet_fini
            nf_ct_gre_keymap_flush
      
      b) Let's say we find that the keymap list is not empty. This means netns
      still has a conntrack associated with gre, in which case we should not free
      its memory, because this will lead to a double free and related crashes.
      However I doubt it could have gone unnoticed for years, obviously
      this does not happen in real life. So I think we can remove
      both nf_ct_gre_keymap_flush() and nf_conntrack_proto_pernet_fini().
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      a23f89a9
    • Colin Ian King's avatar
      netfilter: nf_tables: Fix dereference of null pointer flow · 4ca041f9
      Colin Ian King authored
      In the case where chain->flags & NFT_CHAIN_HW_OFFLOAD is false then
      nft_flow_rule_create is not called and flow is NULL. The subsequent
      error handling execution via label err_destroy_flow_rule will lead
      to a null pointer dereference on flow when calling nft_flow_rule_destroy.
      Since the error path to err_destroy_flow_rule has to cater for null
      and non-null flows, only call nft_flow_rule_destroy if flow is non-null
      to fix this issue.
      
      Addresses-Coverity: ("Explicity null dereference")
      Fixes: 3c5e4462 ("netfilter: nf_tables: memleak in hw offload abort path")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      4ca041f9
    • Florian Westphal's avatar
      netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state · e15d4cdf
      Florian Westphal authored
      Consider:
        client -----> conntrack ---> Host
      
      client sends a SYN, but $Host is unreachable/silent.
      Client eventually gives up and the conntrack entry will time out.
      
      However, if the client is restarted with same addr/port pair, it
      may prevent the conntrack entry from timing out.
      
      This is noticeable when the existing conntrack entry has no NAT
      transformation or an outdated one and port reuse happens either
      on client or due to a NAT middlebox.
      
      This change prevents refresh of the timeout for SYN retransmits,
      so entry is going away after nf_conntrack_tcp_timeout_syn_sent
      seconds (default: 60).
      
      Entry will be re-created on next connection attempt, but then
      nat rules will be evaluated again.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      e15d4cdf
    • Florian Westphal's avatar
      selftest: netfilter: add test case for unreplied tcp connections · 37d220b5
      Florian Westphal authored
      TCP connections in UNREPLIED state (only SYN seen) can be kept alive
      indefinitely, as each SYN re-sets the timeout.
      
      This means that even if a peer has closed its socket the entry
      never times out.
      
      This also prevents re-evaluation of configured NAT rules.
      Add a test case that sets SYN timeout to 10 seconds, then check
      that the nat redirection added later eventually takes effect.
      
      This is based off a repro script from Antonio Ojea.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      37d220b5
  3. 01 Jul, 2021 32 commits