    bcache: make bch_sectors_dirty_init() to be multithreaded · b144e45f
    Coly Li authored
    When attaching a cached device (a.k.a backing device) to a cache
    device, bch_sectors_dirty_init() is called to count dirty sectors
    and stripes (see what bcache_dev_sectors_dirty_add() does) on the
    cache device.
    
    The counting is done by a single-threaded recursive function,
    bch_btree_map_keys(), which iterates over all the bcache btree nodes.
    If the btree has a huge number of nodes, bch_sectors_dirty_init() will
    take quite a long time. In my testing, if the cache set being registered
    has an existing UUID which matches an already registered cached device,
    the automatic attachment during registration may take more than
    55 minutes. This is too long to wait for bcache to start working in a
    real deployment.
    
    Fortunately, when bch_sectors_dirty_init() is called, no other thread
    accesses the btree yet, so it is safe to count dirty sectors in
    parallel with multiple read-only threads.
    
    This patch creates multiple threads. Each thread counts dirty sectors,
    one sub-tree at a time, from the sub-tree indexed by a root node key
    which the thread fetched. After the sub-tree is counted, the counting
    thread fetches another root node key, until the fetched key is NULL.
    The number of parallel threads depends on the number of keys in the
    btree root node and the number of online CPU cores: it is the smaller
    of the two, capped at BCH_DIRTY_INIT_THRD_MAX. If there are only 2 keys
    in the root node, this patch can make the counting at most 2x faster.
    But if there are 10 keys in the root node, it can be up to 10x faster.
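
The work-fetching scheme described above can be illustrated with a minimal user-space sketch. This is not the kernel code from the patch: it uses POSIX threads and C11 atomics as stand-ins, and every name in it (struct root_key, dirty_init_worker(), count_dirty_parallel(), the DIRTY_INIT_THRD_MAX value) is hypothetical. Each root node key is reduced to a precomputed dirty-sector count for its sub-tree, so only the thread-count selection and the "fetch the next root key until none is left" loop are shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical cap; stands in for BCH_DIRTY_INIT_THRD_MAX, value assumed. */
#define DIRTY_INIT_THRD_MAX 64

/* Hypothetical stand-in for a root node key: the dirty sectors of its sub-tree. */
struct root_key {
	unsigned long subtree_dirty_sectors;
};

struct dirty_init_state {
	const struct root_key *keys;
	size_t nr_keys;
	atomic_size_t next_key;    /* index of the next unclaimed root key */
	atomic_ulong total_dirty;  /* dirty sectors summed over all sub-trees */
};

/* Thread count: the smaller of key count and CPU count, capped at the maximum. */
static size_t pick_thread_count(size_t nr_keys, size_t nr_cpus)
{
	size_t n = nr_keys < nr_cpus ? nr_keys : nr_cpus;

	return n > DIRTY_INIT_THRD_MAX ? DIRTY_INIT_THRD_MAX : n;
}

/* Claim root keys one by one until none is left, counting each sub-tree. */
static void *dirty_init_worker(void *arg)
{
	struct dirty_init_state *st = arg;

	for (;;) {
		size_t idx = atomic_fetch_add(&st->next_key, 1);

		if (idx >= st->nr_keys)
			break;  /* corresponds to "the fetched key is NULL" */
		atomic_fetch_add(&st->total_dirty,
				 st->keys[idx].subtree_dirty_sectors);
	}
	return NULL;
}

static unsigned long count_dirty_parallel(const struct root_key *keys,
					  size_t nr_keys, size_t nr_cpus)
{
	struct dirty_init_state st = { .keys = keys, .nr_keys = nr_keys };
	pthread_t threads[DIRTY_INIT_THRD_MAX];
	size_t nr_threads = pick_thread_count(nr_keys, nr_cpus);
	size_t started = 0;

	assert(nr_threads <= DIRTY_INIT_THRD_MAX);
	atomic_init(&st.next_key, 0);
	atomic_init(&st.total_dirty, 0);

	for (size_t i = 0; i < nr_threads; i++) {
		if (pthread_create(&threads[i], NULL, dirty_init_worker, &st))
			break;
		started++;
	}
	/* If no thread could be started, count in the calling thread. */
	if (started == 0)
		dirty_init_worker(&st);
	for (size_t i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return atomic_load(&st.total_dirty);
}
```

Because every worker claims keys from one shared atomic index, a thread that finishes a small sub-tree immediately picks up the next unclaimed key, and the speedup is bounded by the number of root node keys, as the 2-key and 10-key cases above suggest.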
    Signed-off-by: Coly Li <colyli@suse.de>
    Cc: Christoph Hellwig <hch@infradead.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
writeback.c 25.7 KB