Commit a34b0244 authored by Paolo Valente, committed by Jens Axboe

block, bfq: consider also past I/O in soft real-time detection

BFQ privileges the I/O of soft real-time applications, such as video
players, to guarantee these applications high bandwidth and low
latency. However, it is not easy to detect correctly when an
application is soft real-time. A particularly nasty false positive is
that of an I/O-bound application that occasionally happens to meet all
the requirements to be deemed soft real-time. After being detected as
soft real-time, such an application monopolizes the device.
Fortunately, BFQ soon realizes that the application is actually not
soft real-time and suspends every privilege. Yet the application may
again be wrongly detected as soft real-time, and so on.

As highlighted by our tests, this problem causes BFQ to occasionally
fail to guarantee high responsiveness in the presence of heavy
background I/O workloads. The reason is that the background workload
happens to be detected as soft real-time, more or less frequently,
during the execution of the interactive task under test. For example,
because of this problem, LibreOffice Writer occasionally takes 8
seconds, instead of 3, to start up if there are sequential reads and
writes in the background on a Kingston SSDNow V300.

This commit addresses this issue by leveraging the following facts.

The reason why some applications are detected as soft real-time,
despite all BFQ checks to avoid false positives, is simply that,
during high CPU or storage-device load, I/O-bound applications may
happen to do I/O slowly enough to meet all soft real-time
requirements and pass all the extra BFQ checks. Yet this happens only
for limited time periods: slow-speed time intervals are usually
interspersed between other time intervals during which these
applications do I/O at a very high speed. To exploit these facts, this
commit introduces a small change in the detection of soft real-time
behavior, so that the recent past is systematically considered too:
the higher the speed was in the recent past, the later the next I/O
must arrive for the application to be considered soft real-time. At
the beginning of a slow-speed interval, the minimum arrival time
allowed for the next I/O is usually still so high that it falls
*after* the end of the slow-speed interval itself. As a consequence,
the application does not risk being deemed soft real-time during the
slow-speed interval. Then, during the next high-speed interval, the
application evidently cannot be deemed soft real-time (exactly
because of its speed), and so on.
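
The effect of taking the recent past into account can be illustrated
with a small user-space sketch. It is not kernel code: the sim_* names,
the tick rate, the idle window and the rate threshold are all
assumptions chosen for illustration, and jiffies wraparound is ignored.
The sketch only mirrors the three-way maximum that the patch introduces
in bfq_bfqq_softrt_next_start().

```c
/* Hypothetical user-space stand-ins; none of these are kernel definitions. */
#define SIM_HZ			250UL	/* assumed ticks per second */
#define SIM_SLICE_IDLE_TICKS	2UL	/* assumed bfq_slice_idle, in ticks */
#define SIM_EXTRA_TICKS		4UL	/* the "+ 4" extra jiffies of the patch */
#define SIM_MAX_SOFTRT_RATE	7000UL	/* assumed rate threshold, sectors/s */

struct sim_queue {
	unsigned long soft_rt_next_start;	/* last value computed */
	unsigned long last_idle_bklogged;	/* tick of last idle->backlogged */
	unsigned long service_from_backlogged;	/* sectors served since then */
};

static unsigned long sim_max3(unsigned long a, unsigned long b,
			      unsigned long c)
{
	unsigned long m = a > b ? a : b;

	return m > c ? m : c;
}

/*
 * Earliest tick at which the next request may arrive for the queue to
 * be considered soft real-time. With use_history == 0 the previous
 * value is ignored (pre-patch behavior); with use_history != 0 it acts
 * as a lower bound (post-patch behavior).
 */
static unsigned long sim_next_start(const struct sim_queue *q,
				    unsigned long now, int use_history)
{
	unsigned long bw_bound = q->last_idle_bklogged +
		SIM_HZ * q->service_from_backlogged / SIM_MAX_SOFTRT_RATE;
	unsigned long idle_bound = now + SIM_SLICE_IDLE_TICKS + SIM_EXTRA_TICKS;
	unsigned long past = use_history ? q->soft_rt_next_start : 0;

	return sim_max3(past, bw_bound, idle_bound);
}

/* Simplified check: the request must arrive strictly after next_start. */
static int sim_is_soft_rt(unsigned long next_start, unsigned long arrival)
{
	return arrival > next_start;
}
```

With these assumed values, a queue that became backlogged at tick 1000
and received 280000 sectors by tick 1100 gets a bound of 11000. If it
then enters a slow phase (backlogged at tick 1200, 1400 sectors served
by tick 1300), the memoryless variant returns 1306, so a request
arriving at tick 1310 would pass the check. With history, the bound
stays at 11000 and the same request is rejected.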

This extra filtering proved to be rather effective: in the above test,
the frequency of false positives became so low that the start-up time
was 3 seconds in all iterations (apart from occasional outliers,
caused by page-cache-management issues, which are out of the scope of
this commit, and cannot be solved by an I/O scheduler).

Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent 4403e4e4
@@ -2940,42 +2940,84 @@ static bool bfq_bfqq_is_slow(struct bfq_data *bfqd, struct bfq_queue *bfqq,
  * whereas soft_rt_next_start is set to infinity for applications that do
  * not.
  *
- * Unfortunately, even a greedy application may happen to behave in an
- * isochronous way if the CPU load is high. In fact, the application may
- * stop issuing requests while the CPUs are busy serving other processes,
- * then restart, then stop again for a while, and so on. In addition, if
- * the disk achieves a low enough throughput with the request pattern
- * issued by the application (e.g., because the request pattern is random
- * and/or the device is slow), then the application may meet the above
- * bandwidth requirement too. To prevent such a greedy application to be
- * deemed as soft real-time, a further rule is used in the computation of
- * soft_rt_next_start: soft_rt_next_start must be higher than the current
- * time plus the maximum time for which the arrival of a request is waited
- * for when a sync queue becomes idle, namely bfqd->bfq_slice_idle.
- * This filters out greedy applications, as the latter issue instead their
- * next request as soon as possible after the last one has been completed
- * (in contrast, when a batch of requests is completed, a soft real-time
- * application spends some time processing data).
+ * Unfortunately, even a greedy (i.e., I/O-bound) application may
+ * happen to meet, occasionally or systematically, both the above
+ * bandwidth and isochrony requirements. This may happen at least in
+ * the following circumstances. First, if the CPU load is high. The
+ * application may stop issuing requests while the CPUs are busy
+ * serving other processes, then restart, then stop again for a while,
+ * and so on. The other circumstances are related to the storage
+ * device: the storage device is highly loaded or reaches a low-enough
+ * throughput with the I/O of the application (e.g., because the I/O
+ * is random and/or the device is slow). In all these cases, the
+ * I/O of the application may be simply slowed down enough to meet
+ * the bandwidth and isochrony requirements. To reduce the probability
+ * that greedy applications are deemed as soft real-time in these
+ * corner cases, a further rule is used in the computation of
+ * soft_rt_next_start: the return value of this function is forced to
+ * be higher than the maximum between the following two quantities.
  *
- * Unfortunately, the last filter may easily generate false positives if
- * only bfqd->bfq_slice_idle is used as a reference time interval and one
- * or both the following cases occur:
- * 1) HZ is so low that the duration of a jiffy is comparable to or higher
- *    than bfqd->bfq_slice_idle. This happens, e.g., on slow devices with
- *    HZ=100.
+ * (a) Current time plus: (1) the maximum time for which the arrival
+ *     of a request is waited for when a sync queue becomes idle,
+ *     namely bfqd->bfq_slice_idle, and (2) a few extra jiffies. We
+ *     postpone for a moment the reason for adding a few extra
+ *     jiffies; we get back to it after next item (b). Lower-bounding
+ *     the return value of this function with the current time plus
+ *     bfqd->bfq_slice_idle tends to filter out greedy applications,
+ *     because the latter issue their next request as soon as possible
+ *     after the last one has been completed. In contrast, a soft
+ *     real-time application spends some time processing data, after a
+ *     batch of its requests has been completed.
+ *
+ * (b) Current value of bfqq->soft_rt_next_start. As pointed out
+ *     above, greedy applications may happen to meet both the
+ *     bandwidth and isochrony requirements under heavy CPU or
+ *     storage-device load. In more detail, in these scenarios, these
+ *     applications happen, only for limited time periods, to do I/O
+ *     slowly enough to meet all the requirements described so far,
+ *     including the filtering in above item (a). These slow-speed
+ *     time intervals are usually interspersed between other time
+ *     intervals during which these applications do I/O at a very high
+ *     speed. Fortunately, exactly because of the high speed of the
+ *     I/O in the high-speed intervals, the values returned by this
+ *     function happen to be so high, near the end of any such
+ *     high-speed interval, to be likely to fall *after* the end of
+ *     the low-speed time interval that follows. These high values are
+ *     stored in bfqq->soft_rt_next_start after each invocation of
+ *     this function. As a consequence, if the last value of
+ *     bfqq->soft_rt_next_start is constantly used to lower-bound the
+ *     next value that this function may return, then, from the very
+ *     beginning of a low-speed interval, bfqq->soft_rt_next_start is
+ *     likely to be constantly kept so high that any I/O request
+ *     issued during the low-speed interval is considered as arriving
+ *     to soon for the application to be deemed as soft
+ *     real-time. Then, in the high-speed interval that follows, the
+ *     application will not be deemed as soft real-time, just because
+ *     it will do I/O at a high speed. And so on.
+ *
+ * Getting back to the filtering in item (a), in the following two
+ * cases this filtering might be easily passed by a greedy
+ * application, if the reference quantity was just
+ * bfqd->bfq_slice_idle:
+ * 1) HZ is so low that the duration of a jiffy is comparable to or
+ *    higher than bfqd->bfq_slice_idle. This happens, e.g., on slow
+ *    devices with HZ=100. The time granularity may be so coarse
+ *    that the approximation, in jiffies, of bfqd->bfq_slice_idle
+ *    is rather lower than the exact value.
  * 2) jiffies, instead of increasing at a constant rate, may stop increasing
  *    for a while, then suddenly 'jump' by several units to recover the lost
  *    increments. This seems to happen, e.g., inside virtual machines.
- * To address this issue, we do not use as a reference time interval just
- * bfqd->bfq_slice_idle, but bfqd->bfq_slice_idle plus a few jiffies. In
- * particular we add the minimum number of jiffies for which the filter
- * seems to be quite precise also in embedded systems and KVM/QEMU virtual
- * machines.
+ * To address this issue, in the filtering in (a) we do not use as a
+ * reference time interval just bfqd->bfq_slice_idle, but
+ * bfqd->bfq_slice_idle plus a few jiffies. In particular, we add the
+ * minimum number of jiffies for which the filter seems to be quite
+ * precise also in embedded systems and KVM/QEMU virtual machines.
  */
 static unsigned long bfq_bfqq_softrt_next_start(struct bfq_data *bfqd,
 						struct bfq_queue *bfqq)
 {
-	return max(bfqq->last_idle_bklogged +
+	return max3(bfqq->soft_rt_next_start,
+		   bfqq->last_idle_bklogged +
 		   HZ * bfqq->service_from_backlogged /
 		   bfqd->bfq_wr_max_softrt_rate,
 		   jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
@@ -4014,10 +4056,15 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	bfqq->split_time = bfq_smallest_from_now();
 
 	/*
-	 * Set to the value for which bfqq will not be deemed as
-	 * soft rt when it becomes backlogged.
+	 * To not forget the possibly high bandwidth consumed by a
+	 * process/queue in the recent past,
+	 * bfq_bfqq_softrt_next_start() returns a value at least equal
+	 * to the current value of bfqq->soft_rt_next_start (see
+	 * comments on bfq_bfqq_softrt_next_start). Set
+	 * soft_rt_next_start to now, to mean that bfqq has consumed
+	 * no bandwidth so far.
 	 */
-	bfqq->soft_rt_next_start = bfq_greatest_from_now();
+	bfqq->soft_rt_next_start = jiffies;
 
 	/* first request is almost certainly seeky */
 	bfqq->seek_history = 1;