Commit 84c124da authored by Divyesh Shah's avatar Divyesh Shah Committed by Jens Axboe

blkio: Changes to IO controller additional stats patches

that include some minor fixes and addresses all comments.

Changelog: (most based on Vivek Goyal's comments)
o renamed blkiocg_reset_write to blkiocg_reset_stats
o more clarification in the documentation on io_service_time and io_wait_time
o Initialize blkg->stats_lock
o rename io_add_stat to blkio_add_stat and declare it static
o use bool for direction and sync
o derive direction and sync info from existing rq methods
o use 12 for major:minor string length
o define io_service_time better to cover the NCQ case
o add a separate reset_stats interface
o make the indexed stats a 2d array to simplify macro and function pointer code
o blkio.time now exports in jiffies as before
o Added stats description in patch description and
  Documentation/cgroup/blkio-controller.txt
o Prefix all stats functions with blkio and make them static as applicable
o replace IO_TYPE_MAX with IO_TYPE_TOTAL
o Moved #define constant to top of blk-cgroup.c
o Pass dev_t around instead of char *
o Add note to documentation file about resetting stats
o use BLK_CGROUP_MODULE in addition to BLK_CGROUP config option in #ifdef
  statements
o Avoid struct request specific knowledge in blk-cgroup. blk-cgroup.h now has
  rq_direction() and rq_sync() functions which are used by CFQ and when using
  io-controller at a higher level, bio_* functions can be added.

Signed-off-by: Divyesh Shah<dpshah@google.com>
Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
parent 31373d09
...@@ -77,7 +77,6 @@ Details of cgroup files ...@@ -77,7 +77,6 @@ Details of cgroup files
======================= =======================
- blkio.weight - blkio.weight
- Specifies per cgroup weight. - Specifies per cgroup weight.
Currently allowed range of weights is from 100 to 1000. Currently allowed range of weights is from 100 to 1000.
- blkio.time - blkio.time
...@@ -92,6 +91,49 @@ Details of cgroup files ...@@ -92,6 +91,49 @@ Details of cgroup files
third field specifies the number of sectors transferred by the third field specifies the number of sectors transferred by the
group to/from the device. group to/from the device.
- blkio.io_service_bytes
- Number of bytes transferred to/from the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
specifies the number of bytes.
- blkio.io_serviced
- Number of IOs completed to/from the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
specifies the number of IOs.
- blkio.io_service_time
- Total amount of time between request dispatch and request completion
for the IOs done by this cgroup. This is in nanoseconds to make it
meaningful for flash devices too. For devices with queue depth of 1,
this time represents the actual service time. When queue_depth > 1,
that is no longer true as requests may be served out of order. This
may cause the service time for a given IO to include the service time
of multiple IOs when served out of order which may result in total
io_service_time > actual time elapsed. This time is further divided by
the type of operation - read or write, sync or async. First two fields
specify the major and minor number of the device, third field
specifies the operation type and the fourth field specifies the
io_service_time in ns.
- blkio.io_wait_time
- Total amount of time the IOs for this cgroup spent waiting in the
scheduler queues for service. This can be greater than the total time
elapsed since it is cumulative io_wait_time for all IOs. It is not a
measure of total time the cgroup spent waiting but rather a measure of
the wait_time for its individual IOs. For devices with queue_depth > 1
this metric does not include the time spent waiting for service once
the IO is dispatched to the device but till it actually gets serviced
(there might be a time lag here due to re-ordering of requests by the
device). This is in nanoseconds to make it meaningful for flash
devices too. This time is further divided by the type of operation -
read or write, sync or async. First two fields specify the major and
minor number of the device, third field specifies the operation type
and the fourth field specifies the io_wait_time in ns.
- blkio.dequeue - blkio.dequeue
- Debugging aid only enabled if CONFIG_DEBUG_CFQ_IOSCHED=y. This - Debugging aid only enabled if CONFIG_DEBUG_CFQ_IOSCHED=y. This
gives the statistics about how many a times a group was dequeued gives the statistics about how many a times a group was dequeued
...@@ -99,6 +141,10 @@ Details of cgroup files ...@@ -99,6 +141,10 @@ Details of cgroup files
and minor number of the device and third field specifies the number and minor number of the device and third field specifies the number
of times a group was dequeued from a particular device. of times a group was dequeued from a particular device.
- blkio.reset_stats
- Writing an int to this file will result in resetting all the stats
for that cgroup.
CFQ sysfs tunable CFQ sysfs tunable
================= =================
/sys/block/<disk>/queue/iosched/group_isolation /sys/block/<disk>/queue/iosched/group_isolation
......
This diff is collapsed.
...@@ -23,12 +23,31 @@ extern struct cgroup_subsys blkio_subsys; ...@@ -23,12 +23,31 @@ extern struct cgroup_subsys blkio_subsys;
#define blkio_subsys_id blkio_subsys.subsys_id #define blkio_subsys_id blkio_subsys.subsys_id
#endif #endif
enum io_type { enum stat_type {
IO_READ = 0, /* Total time spent (in ns) between request dispatch to the driver and
IO_WRITE, * request completion for IOs doen by this cgroup. This may not be
IO_SYNC, * accurate when NCQ is turned on. */
IO_ASYNC, BLKIO_STAT_SERVICE_TIME = 0,
IO_TYPE_MAX /* Total bytes transferred */
BLKIO_STAT_SERVICE_BYTES,
/* Total IOs serviced, post merge */
BLKIO_STAT_SERVICED,
/* Total time spent waiting in scheduler queue in ns */
BLKIO_STAT_WAIT_TIME,
/* All the single valued stats go below this */
BLKIO_STAT_TIME,
BLKIO_STAT_SECTORS,
#ifdef CONFIG_DEBUG_BLK_CGROUP
BLKIO_STAT_DEQUEUE
#endif
};
enum stat_sub_type {
BLKIO_STAT_READ = 0,
BLKIO_STAT_WRITE,
BLKIO_STAT_SYNC,
BLKIO_STAT_ASYNC,
BLKIO_STAT_TOTAL
}; };
struct blkio_cgroup { struct blkio_cgroup {
...@@ -42,13 +61,7 @@ struct blkio_group_stats { ...@@ -42,13 +61,7 @@ struct blkio_group_stats {
/* total disk time and nr sectors dispatched by this group */ /* total disk time and nr sectors dispatched by this group */
uint64_t time; uint64_t time;
uint64_t sectors; uint64_t sectors;
/* Total disk time used by IOs in ns */ uint64_t stat_arr[BLKIO_STAT_WAIT_TIME + 1][BLKIO_STAT_TOTAL];
uint64_t io_service_time[IO_TYPE_MAX];
uint64_t io_service_bytes[IO_TYPE_MAX]; /* Total bytes transferred */
/* Total IOs serviced, post merge */
uint64_t io_serviced[IO_TYPE_MAX];
/* Total time spent waiting in scheduler queue in ns */
uint64_t io_wait_time[IO_TYPE_MAX];
#ifdef CONFIG_DEBUG_BLK_CGROUP #ifdef CONFIG_DEBUG_BLK_CGROUP
/* How many times this group has been removed from service tree */ /* How many times this group has been removed from service tree */
unsigned long dequeue; unsigned long dequeue;
...@@ -65,7 +78,7 @@ struct blkio_group { ...@@ -65,7 +78,7 @@ struct blkio_group {
char path[128]; char path[128];
#endif #endif
/* The device MKDEV(major, minor), this group has been created for */ /* The device MKDEV(major, minor), this group has been created for */
dev_t dev; dev_t dev;
/* Need to serialize the stats in the case of reset/update */ /* Need to serialize the stats in the case of reset/update */
spinlock_t stats_lock; spinlock_t stats_lock;
...@@ -128,21 +141,21 @@ extern void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg, ...@@ -128,21 +141,21 @@ extern void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
extern int blkiocg_del_blkio_group(struct blkio_group *blkg); extern int blkiocg_del_blkio_group(struct blkio_group *blkg);
extern struct blkio_group *blkiocg_lookup_group(struct blkio_cgroup *blkcg, extern struct blkio_group *blkiocg_lookup_group(struct blkio_cgroup *blkcg,
void *key); void *key);
void blkio_group_init(struct blkio_group *blkg);
void blkiocg_update_timeslice_used(struct blkio_group *blkg, void blkiocg_update_timeslice_used(struct blkio_group *blkg,
unsigned long time); unsigned long time);
void blkiocg_update_request_dispatch_stats(struct blkio_group *blkg, void blkiocg_update_dispatch_stats(struct blkio_group *blkg, uint64_t bytes,
struct request *rq); bool direction, bool sync);
void blkiocg_update_request_completion_stats(struct blkio_group *blkg, void blkiocg_update_completion_stats(struct blkio_group *blkg,
struct request *rq); uint64_t start_time, uint64_t io_start_time, bool direction, bool sync);
#else #else
struct cgroup; struct cgroup;
static inline struct blkio_cgroup * static inline struct blkio_cgroup *
cgroup_to_blkio_cgroup(struct cgroup *cgroup) { return NULL; } cgroup_to_blkio_cgroup(struct cgroup *cgroup) { return NULL; }
static inline void blkio_group_init(struct blkio_group *blkg) {}
static inline void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg, static inline void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
struct blkio_group *blkg, void *key, dev_t dev) struct blkio_group *blkg, void *key, dev_t dev) {}
{
}
static inline int static inline int
blkiocg_del_blkio_group(struct blkio_group *blkg) { return 0; } blkiocg_del_blkio_group(struct blkio_group *blkg) { return 0; }
...@@ -151,9 +164,10 @@ static inline struct blkio_group * ...@@ -151,9 +164,10 @@ static inline struct blkio_group *
blkiocg_lookup_group(struct blkio_cgroup *blkcg, void *key) { return NULL; } blkiocg_lookup_group(struct blkio_cgroup *blkcg, void *key) { return NULL; }
static inline void blkiocg_update_timeslice_used(struct blkio_group *blkg, static inline void blkiocg_update_timeslice_used(struct blkio_group *blkg,
unsigned long time) {} unsigned long time) {}
static inline void blkiocg_update_request_dispatch_stats( static inline void blkiocg_update_dispatch_stats(struct blkio_group *blkg,
struct blkio_group *blkg, struct request *rq) {} uint64_t bytes, bool direction, bool sync) {}
static inline void blkiocg_update_request_completion_stats( static inline void blkiocg_update_completion_stats(struct blkio_group *blkg,
struct blkio_group *blkg, struct request *rq) {} uint64_t start_time, uint64_t io_start_time, bool direction,
bool sync) {}
#endif #endif
#endif /* _BLK_CGROUP_H */ #endif /* _BLK_CGROUP_H */
...@@ -955,6 +955,7 @@ cfq_find_alloc_cfqg(struct cfq_data *cfqd, struct cgroup *cgroup, int create) ...@@ -955,6 +955,7 @@ cfq_find_alloc_cfqg(struct cfq_data *cfqd, struct cgroup *cgroup, int create)
for_each_cfqg_st(cfqg, i, j, st) for_each_cfqg_st(cfqg, i, j, st)
*st = CFQ_RB_ROOT; *st = CFQ_RB_ROOT;
RB_CLEAR_NODE(&cfqg->rb_node); RB_CLEAR_NODE(&cfqg->rb_node);
blkio_group_init(&cfqg->blkg);
/* /*
* Take the initial reference that will be released on destroy * Take the initial reference that will be released on destroy
...@@ -1865,7 +1866,8 @@ static void cfq_dispatch_insert(struct request_queue *q, struct request *rq) ...@@ -1865,7 +1866,8 @@ static void cfq_dispatch_insert(struct request_queue *q, struct request *rq)
elv_dispatch_sort(q, rq); elv_dispatch_sort(q, rq);
cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]++; cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]++;
blkiocg_update_request_dispatch_stats(&cfqq->cfqg->blkg, rq); blkiocg_update_dispatch_stats(&cfqq->cfqg->blkg, blk_rq_bytes(rq),
rq_data_dir(rq), rq_is_sync(rq));
} }
/* /*
...@@ -3286,7 +3288,9 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq) ...@@ -3286,7 +3288,9 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
WARN_ON(!cfqq->dispatched); WARN_ON(!cfqq->dispatched);
cfqd->rq_in_driver--; cfqd->rq_in_driver--;
cfqq->dispatched--; cfqq->dispatched--;
blkiocg_update_request_completion_stats(&cfqq->cfqg->blkg, rq); blkiocg_update_completion_stats(&cfqq->cfqg->blkg, rq_start_time_ns(rq),
rq_io_start_time_ns(rq), rq_data_dir(rq),
rq_is_sync(rq));
cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]--; cfqd->rq_in_flight[cfq_cfqq_sync(cfqq)]--;
......
...@@ -1209,9 +1209,27 @@ static inline void set_io_start_time_ns(struct request *req) ...@@ -1209,9 +1209,27 @@ static inline void set_io_start_time_ns(struct request *req)
{ {
req->io_start_time_ns = sched_clock(); req->io_start_time_ns = sched_clock();
} }
static inline uint64_t rq_start_time_ns(struct request *req)
{
return req->start_time_ns;
}
static inline uint64_t rq_io_start_time_ns(struct request *req)
{
return req->io_start_time_ns;
}
#else #else
static inline void set_start_time_ns(struct request *req) {} static inline void set_start_time_ns(struct request *req) {}
static inline void set_io_start_time_ns(struct request *req) {} static inline void set_io_start_time_ns(struct request *req) {}
static inline uint64_t rq_start_time_ns(struct request *req)
{
return 0;
}
static inline uint64_t rq_io_start_time_ns(struct request *req)
{
return 0;
}
#endif #endif
#define MODULE_ALIAS_BLOCKDEV(major,minor) \ #define MODULE_ALIAS_BLOCKDEV(major,minor) \
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment