    dm mpath: provide high-resolution timer to HST for bio-based · c06dfd12
    Gabriel Krisman Bertazi authored
    Reading the IO start_time with jiffies_to_nsecs instead of a
    high-resolution timer loses precision, which degrades HST path
    prediction for BIO-based mpath under high-load workloads.
    
    Below, I show the per-disk utilization (as a fraction of the total) of
    a 10-disk multipath with asymmetrical disk access cost, exercised by a
    randwrite FIO benchmark with a high submission queue depth (depth=64).
    HST path selection degrades heavily at high IOPS in BIO-mpath,
    underutilizing the slower paths far more than expected.  This seems to
    be caused by the start_time truncation, which makes some IO seem much
    slower than it actually is.  In this scenario ST outperforms HST for
    bio-mpath, but not for mq-mpath, which already uses ktime_get_ns().
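
The effect of the truncation can be illustrated with a small userspace
model. This is only a sketch and not kernel code: the helper names are
hypothetical, and it assumes the start time is rounded down to a tick
boundary while the completion time is read at nanosecond resolution.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical model: round a nanosecond timestamp down to the tick
 * granularity implied by hz, as a jiffies-derived start time would be.
 */
static uint64_t truncate_to_tick_ns(uint64_t ns, unsigned int hz)
{
    assert(hz > 0 && NSEC_PER_SEC % hz == 0);
    uint64_t tick_ns = NSEC_PER_SEC / hz;

    return (ns / tick_ns) * tick_ns;
}

/*
 * Service time as it appears when the start is tick-truncated but the
 * completion is read with full nanosecond resolution (an assumption of
 * this sketch).
 */
static uint64_t observed_service_time_ns(uint64_t start_ns, uint64_t end_ns,
                                         unsigned int hz)
{
    return end_ns - truncate_to_tick_ns(start_ns, hz);
}
```

With hz=250 (4 ms ticks), an IO that starts at 7,900,000 ns and completes
at 8,000,000 ns really took 100 us, but the model reports 4,000,000 ns,
because the start is rounded down to the 4,000,000 ns tick boundary.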
    
    The third column shows utilization with this patch applied.  It is easy
    to see that now HST prediction is much closer to the ideal distribution
    (calculated considering the real cost of each path).
    
    |     |   ST | HST (orig) | HST(ktime) | Best |
    | sdd | 0.17 |       0.20 |       0.17 | 0.18 |
    | sde | 0.17 |       0.20 |       0.17 | 0.18 |
    | sdf | 0.17 |       0.20 |       0.17 | 0.18 |
    | sdg | 0.06 |       0.00 |       0.06 | 0.04 |
    | sdh | 0.03 |       0.00 |       0.03 | 0.02 |
    | sdi | 0.03 |       0.00 |       0.03 | 0.02 |
    | sdj | 0.02 |       0.00 |       0.01 | 0.01 |
    | sdk | 0.02 |       0.00 |       0.01 | 0.01 |
    | sdl | 0.17 |       0.20 |       0.17 | 0.18 |
    | sdm | 0.17 |       0.20 |       0.17 | 0.18 |
    
    This issue was originally discussed [1] when we first merged HST, and
    this patch was left as low-hanging fruit to be solved later.
    
    Regarding the implementation, as suggested by Mike in that mail thread,
    in order to avoid the overhead of ktime_get_ns for other selectors, this
    patch adds a flag for the selector code to request the high-resolution
    timer.
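
The opt-in idea can be sketched in userspace as follows. All names here
(PS_USE_HR_TIMER, struct selector_type, io_start_time_ns) are
hypothetical and chosen for illustration; they are not claimed to match
the identifiers used in the patch. clock_gettime() stands in for
ktime_get_ns().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical feature bit: the selector wants a high-resolution start time. */
#define PS_USE_HR_TIMER 0x1u

/* Hypothetical, simplified stand-in for a path selector type. */
struct selector_type {
    const char *name;
    unsigned int features;
};

static bool ps_wants_hr_timer(const struct selector_type *t)
{
    return (t->features & PS_USE_HR_TIMER) != 0;
}

/* Userspace stand-in for a nanosecond-resolution monotonic clock read. */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Pay for the clock read only when the selector asked for it; 0 means
 * "no high-resolution start time recorded" in this sketch.
 */
static uint64_t io_start_time_ns(const struct selector_type *t)
{
    return ps_wants_hr_timer(t) ? monotonic_ns() : 0;
}
```

In this sketch, a selector declared with features = PS_USE_HR_TIMER gets
a nanosecond start time on every IO, while one declared with features = 0
skips the clock read entirely.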
    
    I tested this using the same benchmark used in the original HST submission.
    
    Full test and benchmark scripts are available here:
    
      https://people.collabora.com/~krisman/HST-BIO-MPATH/
    
    [1] https://lore.kernel.org/lkml/85tv0am9de.fsf@collabora.com/T/

    Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
    [snitzer: cleaned up various implementation details]
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
dm-ps-historical-service-time.c 13.7 KB