    netfilter: conntrack: resched in nf_ct_iterate_cleanup · d93c6258
    Florian Westphal authored
    Ulrich reports a soft lockup with the following (shortened) callchain:
    
    NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
    __netif_receive_skb_core+0x6e4/0x774
    process_backlog+0x94/0x160
    net_rx_action+0x88/0x178
    call_do_softirq+0x24/0x3c
    do_softirq+0x54/0x6c
    __local_bh_enable_ip+0x7c/0xbc
    nf_ct_iterate_cleanup+0x11c/0x22c [nf_conntrack]
    masq_inet_event+0x20/0x30 [nf_nat_masquerade_ipv6]
    atomic_notifier_call_chain+0x1c/0x2c
    ipv6_del_addr+0x1bc/0x220 [ipv6]
    
    The problem is that nf_ct_iterate_cleanup can run for a very long
    time, since it can be interrupted by softirq processing.
    Moreover, atomic_notifier_call_chain runs with the RCU read lock
    held, so the callback cannot reschedule from that context.
    
    So let's call cond_resched() in nf_ct_iterate_cleanup and defer
    the call to a work queue for the atomic_notifier_call_chain case.
    
    We also need another cond_resched() in get_next_corpse, since we
    have to deal with iter() always returning false; in that case
    get_next_corpse will walk the entire conntrack table.
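    
    The two yield points described above can be sketched in userspace
    roughly as follows. This is an illustration only, not the kernel
    change itself: struct table, struct entry, yield_point(),
    get_next_match(), iterate_cleanup(), match_all() and match_none()
    are hypothetical names, and sched_yield() merely stands in for
    cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 1024 /* hypothetical bucket count */

struct entry {
	int id;
	bool dead;
};

/* Hypothetical stand-in for the conntrack hash: at most one entry per bucket. */
struct table {
	struct entry *buckets[TABLE_SIZE];
};

static unsigned long yield_count;

/* Userspace stand-in for cond_resched(): let other runnable tasks proceed. */
static void yield_point(void)
{
	yield_count++;
	sched_yield();
}

/*
 * Return the next live entry at or after *bucket that iter() accepts.
 * Yields once per rejected bucket, so a walk of the whole table
 * (iter() always returning false) cannot hog the CPU.
 */
static struct entry *get_next_match(struct table *t,
				    bool (*iter)(struct entry *, void *),
				    void *data, size_t *bucket)
{
	assert(t != NULL && iter != NULL && bucket != NULL);

	for (; *bucket < TABLE_SIZE; (*bucket)++) {
		struct entry *e = t->buckets[*bucket];

		if (e && !e->dead && iter(e, data))
			return e;
		yield_point();
	}
	return NULL;
}

/* Mark every accepted entry dead; yield again after each removal. */
static size_t iterate_cleanup(struct table *t,
			      bool (*iter)(struct entry *, void *),
			      void *data)
{
	size_t bucket = 0, removed = 0;
	struct entry *e;

	while ((e = get_next_match(t, iter, data, &bucket)) != NULL) {
		e->dead = true;
		removed++;
		yield_point();
	}
	return removed;
}

/* Example predicates: accept everything, or nothing at all. */
static bool match_all(struct entry *e, void *data)
{
	(void)e; (void)data;
	return true;
}

static bool match_none(struct entry *e, void *data)
{
	(void)e; (void)data;
	return false;
}
```

    With match_none() the outer loop never gets a chance to yield, so
    only the per-bucket yield inside get_next_match() keeps the full
    table walk preemptible. That is the case the second cond_resched()
    is meant to cover.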
    
    Reported-by: Ulrich Weber <uw@ocedo.com>
    Tested-by: Ulrich Weber <uw@ocedo.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_conntrack_core.c 49.5 KB