• Coly Li's avatar
    bcache: stop dc->writeback_rate_update properly · 3fd47bfe
    Coly Li authored
    struct delayed_work writeback_rate_update in struct cache_dev is a delayed
    worker to call function update_writeback_rate() in period (the interval is
    defined by dc->writeback_rate_update_seconds).
    
    When a metadate I/O error happens on cache device, bcache error handling
    routine bch_cache_set_error() will call bch_cache_set_unregister() to
    retire whole cache set. On the unregister code path, this delayed work is
    stopped by calling cancel_delayed_work_sync(&dc->writeback_rate_update).
    
    dc->writeback_rate_update is a special delayed work from others in bcache.
    In its routine update_writeback_rate(), this delayed work is re-armed
    itself. That means when cancel_delayed_work_sync() returns, this delayed
    work can still be executed after several seconds defined by
    dc->writeback_rate_update_seconds.
    
    The problem is, after cancel_delayed_work_sync() returns, the cache set
    unregister code path will continue and release memory of struct cache set.
    Then the delayed work is scheduled to run, __update_writeback_rate()
    will reference the already released cache_set memory, and trigger a NULL
    pointer deference fault.
    
    This patch introduces two more bcache device flags,
    - BCACHE_DEV_WB_RUNNING
      bit set:  bcache device is in writeback mode and running, it is OK for
                dc->writeback_rate_update to re-arm itself.
      bit clear:bcache device is trying to stop dc->writeback_rate_update,
                this delayed work should not re-arm itself and quit.
    - BCACHE_DEV_RATE_DW_RUNNING
      bit set:  routine update_writeback_rate() is executing.
      bit clear: routine update_writeback_rate() quits.
    
    This patch also adds a function cancel_writeback_rate_update_dwork() to
    wait for dc->writeback_rate_update quits before cancel it by calling
    cancel_delayed_work_sync(). In order to avoid a deadlock by unexpected
    quit dc->writeback_rate_update, after time_out seconds this function will
    give up and continue to call cancel_delayed_work_sync().
    
    And here I explain how this patch stops self re-armed delayed work properly
    with the above stuffs.
    
    update_writeback_rate() sets BCACHE_DEV_RATE_DW_RUNNING at its beginning
    and clears BCACHE_DEV_RATE_DW_RUNNING at its end. Before calling
    cancel_writeback_rate_update_dwork() clear flag BCACHE_DEV_WB_RUNNING.
    
    Before calling cancel_delayed_work_sync() wait utill flag
    BCACHE_DEV_RATE_DW_RUNNING is clear. So when calling
    cancel_delayed_work_sync(), dc->writeback_rate_update must be already re-
    armed, or quite by seeing BCACHE_DEV_WB_RUNNING cleared. In both cases
    delayed work routine update_writeback_rate() won't be executed after
    cancel_delayed_work_sync() returns.
    
    Inside update_writeback_rate() before calling schedule_delayed_work(), flag
    BCACHE_DEV_WB_RUNNING is checked before. If this flag is cleared, it means
    someone is about to stop the delayed work. Because flag
    BCACHE_DEV_RATE_DW_RUNNING is set already and cancel_delayed_work_sync()
    has to wait for this flag to be cleared, we don't need to worry about race
    condition here.
    
    If update_writeback_rate() is scheduled to run after checking
    BCACHE_DEV_RATE_DW_RUNNING and before calling cancel_delayed_work_sync()
    in cancel_writeback_rate_update_dwork(), it is also safe. Because at this
    moment BCACHE_DEV_WB_RUNNING is cleared with memory barrier. As I mentioned
    previously, update_writeback_rate() will see BCACHE_DEV_WB_RUNNING is clear
    and quit immediately.
    
    Because there are more dependences inside update_writeback_rate() to struct
    cache_set memory, dc->writeback_rate_update is not a simple self re-arm
    delayed work. After trying many different methods (e.g. hold dc->count, or
    use locks), this is the only way I can find which works to properly stop
    dc->writeback_rate_update delayed work.
    
    Changelog:
    v3: change values of BCACHE_DEV_WB_RUNNING and BCACHE_DEV_RATE_DW_RUNNING
        to bit index, for test_bit().
    v2: Try to fix the race issue which is pointed out by Junhui.
    v1: The initial version for review
    Signed-off-by: default avatarColy Li <colyli@suse.de>
    Reviewed-by: default avatarJunhui Tang <tang.junhui@zte.com.cn>
    Reviewed-by: default avatarMichael Lyle <mlyle@lyle.org>
    Cc: Michael Lyle <mlyle@lyle.org>
    Cc: Hannes Reinecke <hare@suse.com>
    Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
    3fd47bfe
super.c 52.5 KB