    block: fix that blk_time_get_ns() doesn't update time after schedule · 3ec48489
    Yu Kuai authored
    While monitoring how long iocost throttles IO, the measured throttle
    time was found to always be zero across the io_schedule() in
    ioc_rqos_throttle(). For example, with the following debug patch:
    
    +       printk("%s-%d: %s enter %llu\n", current->comm, current->pid, __func__, blk_time_get_ns());
            while (true) {
                    set_current_state(TASK_UNINTERRUPTIBLE);
                    if (wait.committed)
                            break;
                    io_schedule();
            }
    +       printk("%s-%d: %s exit  %llu\n", current->comm, current->pid, __func__, blk_time_get_ns());
    
    It can be observed that blk_time_get_ns() always returns the same time:
    
    [ 1068.096579] fio-1268: ioc_rqos_throttle enter 1067901962288
    [ 1068.272587] fio-1268: ioc_rqos_throttle exit  1067901962288
    [ 1068.274389] fio-1268: ioc_rqos_throttle enter 1067901962288
    [ 1068.472690] fio-1268: ioc_rqos_throttle exit  1067901962288
    [ 1068.474485] fio-1268: ioc_rqos_throttle enter 1067901962288
    [ 1068.672656] fio-1268: ioc_rqos_throttle exit  1067901962288
    [ 1068.674451] fio-1268: ioc_rqos_throttle enter 1067901962288
    [ 1068.872655] fio-1268: ioc_rqos_throttle exit  1067901962288
    
    And I think the root cause is that 'PF_BLOCK_TS' is always cleared
    by blk_flush_plug() before schedule(), hence blk_plug_invalidate_ts()
    will never be called:
    
    blk_time_get_ns
     plug->cur_ktime = ktime_get_ns();
     current->flags |= PF_BLOCK_TS;
    
    io_schedule:
     io_schedule_prepare
      blk_flush_plug
       __blk_flush_plug
        /* the flag is cleared, while time is not */
        current->flags &= ~PF_BLOCK_TS;
     schedule
     sched_update_worker
      /* the flag is not set, hence plug->cur_ktime is not cleared */
      if (tsk->flags & PF_BLOCK_TS)
       blk_plug_invalidate_ts()
    
    blk_time_get_ns
     /* got the time stashed before schedule */
     return plug->cur_ktime;
    
    Fix the problem by clearing cached time in __blk_flush_plug().
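
The sequence above can be reproduced in a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct task_model`, `struct plug_model`, `fake_clock_ns`, the `model_*` helpers and the `PF_BLOCK_TS` value are all hypothetical stand-ins chosen for illustration. The `clear_time` parameter switches between the behaviour described as buggy (flag cleared, time kept) and the fix described here (cached time cleared in the flush path).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; names mirror the commit message, values are made up. */
#define PF_BLOCK_TS 0x1u

struct plug_model { uint64_t cur_ktime; };   /* 0 means "no cached time" */
struct task_model { unsigned int flags; struct plug_model plug; };

static uint64_t fake_clock_ns;               /* stands in for ktime_get_ns() */

/* Models blk_time_get_ns(): cache the time on first use, then reuse it. */
static uint64_t model_time_get_ns(struct task_model *t)
{
    if (!t->plug.cur_ktime) {
        t->plug.cur_ktime = fake_clock_ns;
        t->flags |= PF_BLOCK_TS;
    }
    return t->plug.cur_ktime;
}

/* Models __blk_flush_plug(): the flag is always cleared; the cached time
 * is cleared only when clear_time is true (the fix described above). */
static void model_flush_plug(struct task_model *t, bool clear_time)
{
    if (clear_time)
        t->plug.cur_ktime = 0;
    t->flags &= ~PF_BLOCK_TS;
}

/* Models io_schedule(): flush the plug, sleep, then invalidate the cached
 * time only if the flag is still set (the sched_update_worker() step). */
static void model_io_schedule(struct task_model *t, bool clear_time,
                              uint64_t sleep_ns)
{
    model_flush_plug(t, clear_time);
    fake_clock_ns += sleep_ns;               /* time passes while sleeping */
    if (t->flags & PF_BLOCK_TS)
        t->plug.cur_ktime = 0;               /* never reached after a flush */
}

/* Returns the throttle time as seen through the cached timestamp. */
static uint64_t model_throttle_elapsed(bool clear_time)
{
    struct task_model t = {0};
    uint64_t enter, leave;

    fake_clock_ns = 1000;                    /* nonzero: 0 means "unset" */
    enter = model_time_get_ns(&t);
    model_io_schedule(&t, clear_time, 200);
    leave = model_time_get_ns(&t);
    return leave - enter;
}
```

In this model, `model_throttle_elapsed(false)` reports zero elapsed time because the stale cached value survives the sleep, while `model_throttle_elapsed(true)` reports the full 200 ns sleep.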
    
    Fixes: 06b23f92 ("block: update cached timestamp post schedule/preemption")
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Link: https://lore.kernel.org/r/20240411032349.3051233-2-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-core.c 34.1 KB