    tipc: fix crash during node removal · d25a0125
    Jon Paul Maloy authored
    When the TIPC module is unloaded, we have identified a race condition
    that allows a node's reference counter to drop to zero, and the node
    instance to be freed, before the node timer has finished accessing it.
    This leads to occasional crashes, especially in multi-namespace
    environments.
    
    The scenario goes as follows:
    
    CPU0:(node_stop)                       CPU1:(node_timeout)  // ref == 2
    
    1:                                          if(!mod_timer())
    2: if (del_timer())
    3:   tipc_node_put()                                        // ref -> 1
    4: tipc_node_put()                                          // ref -> 0
    5:   kfree_rcu(node);
    6:                                               tipc_node_get(node)
    7:                                               // BOOM!
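
    The interleaving above can be replayed deterministically in a small
    userspace model. This is only a sketch: every model_* name is
    hypothetical, no kernel API is called, and the node is flagged as
    freed rather than actually released. It assumes mod_timer() and
    del_timer() report whether the timer was pending, as they do in the
    kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a node; not the kernel's struct tipc_node. */
struct model_node {
    int  refcnt;          /* models the node reference counter */
    bool timer_pending;   /* models whether the node timer is armed */
    bool freed;           /* set instead of really freeing memory */
    bool use_after_free;  /* set when a freed node is touched */
};

static void model_node_get(struct model_node *n)
{
    if (n->freed)
        n->use_after_free = true;   /* step 7: "BOOM!" */
    else
        n->refcnt++;
}

static void model_node_put(struct model_node *n)
{
    if (--n->refcnt == 0)
        n->freed = true;            /* stands in for kfree_rcu(node) */
}

/* Like mod_timer(): arms the timer, returns true if it was already pending. */
static bool model_mod_timer(struct model_node *n)
{
    bool was_pending = n->timer_pending;

    n->timer_pending = true;
    return was_pending;
}

/* Like del_timer(): returns true if a pending timer was deactivated. */
static bool model_del_timer(struct model_node *n)
{
    bool was_pending = n->timer_pending;

    n->timer_pending = false;
    return was_pending;
}

/* Replays steps 1-7 in order; returns true if the freed node is touched. */
static bool replay_old_teardown(void)
{
    /* ref == 2; the handler is running, so the timer is not pending. */
    struct model_node n = { .refcnt = 2 };

    bool rearmed = !model_mod_timer(&n);   /* 1: CPU1 re-arms the timer */

    if (model_del_timer(&n))               /* 2: CPU0 finds it pending */
        model_node_put(&n);                /* 3: ref -> 1 */
    model_node_put(&n);                    /* 4-5: ref -> 0, node "freed" */
    if (rearmed)
        model_node_get(&n);                /* 6: CPU1 touches the freed node */
    return n.use_after_free;               /* 7 */
}
```

    Calling replay_old_teardown() from a test harness reports whether the
    late tipc_node_get() in step 6 lands on an already freed node.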
    
    We now clean up this functionality as follows:
    
    1) We remove the node pointer from the node lookup table before we
       attempt deactivating the timer. This way, we reduce the risk that
       tipc_node_find() may obtain a valid pointer to an instance marked
       for deletion; a harmless but undesirable situation.
    
    2) We use del_timer_sync() instead of del_timer() to safely deactivate
       the node timer without any risk that it might be reactivated by the
       timeout handler. There is no risk of deadlock here, since the two
       functions never touch the same spinlocks.
    
    3) We remove a pointless tipc_node_get() + tipc_node_put() from the
       timeout handler.
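
    The ordering described in the three points can be sketched in a
    single-threaded userspace model. All fix_* names are hypothetical,
    and the model assumes that the lookup table and the timer each hold
    one reference. The waiting done by del_timer_sync() is modeled by
    running an in-flight handler to completion before the timer is
    deactivated. This illustrates the ordering only and is not the code
    of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a node; not the kernel's struct tipc_node. */
struct fix_node {
    int  refcnt;           /* assumed: one ref for the table, one for the timer */
    bool timer_pending;    /* models whether the node timer is armed */
    bool handler_running;  /* models a timeout handler in flight on another CPU */
    bool freed;            /* set instead of really freeing memory */
    bool use_after_free;   /* set when a freed node is touched */
};

static struct fix_node *fix_table;  /* one-slot stand-in for the lookup table */

static void fix_node_put(struct fix_node *n)
{
    if (--n->refcnt == 0)
        n->freed = true;   /* stands in for kfree_rcu(node) */
}

/* Timeout handler after change 3: re-arms the timer, no get/put pair. */
static void fix_node_timeout(struct fix_node *n)
{
    if (n->freed) {
        n->use_after_free = true;
        return;
    }
    n->timer_pending = true;
    n->handler_running = false;
}

/* Models del_timer_sync(): let a running handler finish, then deactivate. */
static void fix_del_timer_sync(struct fix_node *n)
{
    if (n->handler_running)
        fix_node_timeout(n);
    n->timer_pending = false;
}

static void fix_node_delete(struct fix_node *n)
{
    fix_table = NULL;        /* 1) unpublish the node before stopping the timer */
    fix_node_put(n);         /*    drop the table's reference: ref -> 1 */
    fix_del_timer_sync(n);   /* 2) the timer can no longer run or be re-armed */
    fix_node_put(n);         /*    drop the timer's reference: ref -> 0 */
}

/* Returns true if teardown frees the node without touching it afterwards. */
static bool replay_fixed_teardown(void)
{
    struct fix_node n = { .refcnt = 2, .handler_running = true };

    fix_table = &n;
    fix_node_delete(&n);
    return n.freed && !n.use_after_free && !n.timer_pending &&
           fix_table == NULL;
}
```

    In this model the last reference is dropped only after the handler
    has completed, so no access can follow the free.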

    Reported-by: Zhijiang Hu <huzhijiang@gmail.com>
    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>