    io-wq: fix handling of NUMA node IDs · 3fc50ab5
    Jann Horn authored
    There are several things that can go wrong in the current code on NUMA
    systems, especially if not all nodes are online all the time:
    
     - If the identifiers of the online nodes do not form a single contiguous
       block starting at zero, wq->wqes will be too small, and OOB memory
       accesses will occur e.g. in the loop in io_wq_create().
     - If a node comes online between the call to num_online_nodes() and the
       for_each_node() loop in io_wq_create(), an OOB write will occur.
     - If a node comes online between io_wq_create() and io_wq_enqueue(), a
       lookup is performed for an element that doesn't exist, and an OOB read
       will probably occur.
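
The first failure mode above can be illustrated in userspace. This is a minimal sketch, not kernel code: the bitmask stands in for the kernel's node masks, and `count_online()` and `index_in_bounds()` are hypothetical helpers invented for this illustration. It shows that an array sized by the number of online nodes is too small as soon as the online IDs are not contiguous from zero.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a node mask: bit n set means node n is online.
 * Returns the number of set bits, i.e. what a count of online nodes yields.
 */
static size_t count_online(unsigned long online_mask)
{
	size_t n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; online_mask; online_mask &= online_mask - 1)
		n++;
	return n;
}

/* True if indexing an array of `len` elements by `node` stays in bounds. */
static bool index_in_bounds(size_t len, unsigned int node)
{
	return node < len;
}
```

With nodes 0 and 2 online (mask `0x5`), `count_online()` returns 2, so an array of that length cannot be indexed by node ID 2; an array of length 3 (highest possible ID plus one) can.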
    
    Fix it by:
    
     - using nr_node_ids instead of num_online_nodes() for the allocation size;
       nr_node_ids is calculated by setup_nr_node_ids() to be bigger than the
       highest node ID that could possibly come online at some point, even if
       those nodes' identifiers are not a contiguous block
     - creating workers for all possible nodes, not just all online ones
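
The two steps of the fix can be sketched in userspace as follows. This is an illustration under stated assumptions, not the actual patch: `struct wq_sketch`, `struct wqe_sketch`, `wq_sketch_create()` and `wq_sketch_destroy()` are hypothetical names, the `possible_mask` bitmask stands in for the kernel's possible-node mask, and `nr_node_ids` is assumed to be no larger than the number of bits in an `unsigned long`.

```c
#include <stdlib.h>

/* Hypothetical per-node state; the real structure holds far more. */
struct wqe_sketch {
	int node;
};

struct wq_sketch {
	unsigned int nr_node_ids;   /* highest possible node ID + 1 */
	struct wqe_sketch **wqes;   /* indexed directly by node ID */
};

static void wq_sketch_destroy(struct wq_sketch *wq)
{
	if (!wq)
		return;
	if (wq->wqes) {
		for (unsigned int node = 0; node < wq->nr_node_ids; node++)
			free(wq->wqes[node]);   /* free(NULL) is a no-op */
		free(wq->wqes);
	}
	free(wq);
}

/*
 * Size the array by nr_node_ids, then populate a slot for every
 * possible node (bit set in possible_mask), online or not.
 */
static struct wq_sketch *wq_sketch_create(unsigned long possible_mask,
					  unsigned int nr_node_ids)
{
	struct wq_sketch *wq = calloc(1, sizeof(*wq));

	if (!wq)
		return NULL;
	wq->nr_node_ids = nr_node_ids;
	wq->wqes = calloc(nr_node_ids, sizeof(*wq->wqes));
	if (!wq->wqes)
		goto err;

	for (unsigned int node = 0; node < nr_node_ids; node++) {
		if (!(possible_mask & (1UL << node)))
			continue;   /* this ID can never come online */
		wq->wqes[node] = malloc(sizeof(**wq->wqes));
		if (!wq->wqes[node])
			goto err;
		wq->wqes[node]->node = (int)node;
	}
	return wq;
err:
	wq_sketch_destroy(wq);
	return NULL;
}
```

Because every possible node ID below `nr_node_ids` has a valid slot from creation onward, a later lookup by node ID stays in bounds regardless of which nodes come online afterwards.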
    
    This is basically what the normal workqueue code also does, as far as I can
    tell.
    
    Signed-off-by: Jann Horn <jannh@google.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>