    blk-cgroup: reimplement basic IO stats using cgroup rstat · f7331648
    Tejun Heo authored
    blk-cgroup has been using blkg_rwstat to track basic IO stats.
    Unfortunately, reading recursive stats scales badly as it involves
    walking all descendants.  On systems with a huge number of cgroups
    (dead or alive), this can lead to substantial CPU cost when reading IO
    stats.
    
    This patch reimplements basic IO stats using cgroup rstat which uses
    more memory but makes recursive stat reading O(# descendants which
    have been active since last reading) instead of O(# descendants).
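To illustrate the complexity claim above, here is a minimal user-space sketch of the idea, not the kernel's cgroup rstat code. `struct node`, `node_add_child()`, `node_account()` and `node_flush()` are hypothetical names invented for this example. An update marks the path to the root, and a flush descends only into marked subtrees, so idle descendants are skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy node; this is not the kernel's cgroup or blkcg_gq. */
struct node {
    struct node *parent;
    struct node *children[4];   /* fixed fan-out keeps the sketch short */
    size_t nr_children;
    bool updated;               /* subtree has counts not yet flushed */
    uint64_t pending;           /* bytes counted since the last flush */
    uint64_t total;             /* recursive total as of the last flush */
};

static void node_add_child(struct node *parent, struct node *child)
{
    assert(parent->nr_children < 4);
    child->parent = parent;
    parent->children[parent->nr_children++] = child;
}

/*
 * Record IO and mark the path to the root. A marked node always has
 * marked ancestors, so the walk stops at the first marked one.
 */
static void node_account(struct node *n, uint64_t bytes)
{
    n->pending += bytes;
    for (; n && !n->updated; n = n->parent)
        n->updated = true;
}

/* Returns the delta collected from n's subtree; idle subtrees are skipped. */
static uint64_t flush_subtree(struct node *n, size_t *visited)
{
    uint64_t delta;

    if (!n->updated)
        return 0;
    (*visited)++;
    delta = n->pending;
    n->pending = 0;
    for (size_t i = 0; i < n->nr_children; i++)
        delta += flush_subtree(n->children[i], visited);
    n->updated = false;
    n->total += delta;
    return delta;
}

/* Bring n->total up to date; returns how many nodes had to be visited. */
static size_t node_flush(struct node *n)
{
    size_t visited = 0;
    uint64_t delta = flush_subtree(n, &visited);

    /* A still-marked parent picks the delta up on its own flush. */
    if (n->parent)
        n->parent->pending += delta;
    return visited;
}
```

In this toy, a flush still tests the flag of each child of a visited node. The sketch does not model how the kernel tracks updated cgroups, only the fact that untouched subtrees cost nothing to read.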
    
    * blk-cgroup core no longer uses sync/async stats.  Introduce new stat
      enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.
    
    * Add blkg_iostat[_set] which encapsulates byte and io stats, last
      values for propagation delta calculation and u64_stats_sync for
      correctness on 32bit archs.
    
    * Update the new percpu stat counters directly and add
      blkcg_rstat_flush() to handle propagation.
    
    * blkg_print_stat() can now bring the stats up to date by calling
      cgroup_rstat_flush() and print them instead of directly summing up
      all descendants.
    
    * It now allocates 96 bytes per cpu.  It used to be 40 bytes.
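
The "last values for propagation delta calculation" item above can be sketched in user space as follows. This is an illustrative model, not the patch's code: `struct iostat`, `struct iostat_set`, `iostat_account()` and `iostat_propagate()` are hypothetical names, and the model leaves out u64_stats_sync and all percpu handling. Two snapshots of three byte counters and three io counters come to 96 bytes, which is consistent with the figure quoted above if the sync field takes no space.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the READ/WRITE/DISCARD split named above; IOSTAT_NR is an assumption. */
enum { IOSTAT_READ, IOSTAT_WRITE, IOSTAT_DISCARD, IOSTAT_NR };

struct iostat {
    uint64_t bytes[IOSTAT_NR];
    uint64_t ios[IOSTAT_NR];
};

/* cur holds everything counted so far, last holds what was already propagated. */
struct iostat_set {
    struct iostat cur;
    struct iostat last;
};

/* Two snapshots of six 64-bit counters each. */
_Static_assert(sizeof(struct iostat_set) == 96, "2 * 6 * 8 bytes");

/* Count one IO of the given type against s. */
static void iostat_account(struct iostat_set *s, int type, uint64_t bytes)
{
    assert(type >= 0 && type < IOSTAT_NR);
    s->cur.bytes[type] += bytes;
    s->cur.ios[type] += 1;
}

/*
 * Add only what changed since the previous propagation to the parent,
 * then remember the propagated values so nothing is counted twice.
 */
static void iostat_propagate(struct iostat_set *parent, struct iostat_set *child)
{
    for (int i = 0; i < IOSTAT_NR; i++) {
        parent->cur.bytes[i] += child->cur.bytes[i] - child->last.bytes[i];
        parent->cur.ios[i] += child->cur.ios[i] - child->last.ios[i];
    }
    child->last = child->cur;
}
```

Because `last` is refreshed after every propagation, calling `iostat_propagate()` twice in a row with no new IO leaves the parent unchanged.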
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Dan Schatzberg <dschatzberg@fb.com>
    Cc: Daniel Xu <dlxu@fb.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>