    blkcg: fix blkcg_policy_data allocation bug · 06b285bd
    Tejun Heo authored
    e48453c3 ("block, cgroup: implement policy-specific per-blkcg
    data") updated per-blkcg policy data to be dynamically allocated.
    When a policy is registered, its policy data aren't created.  Instead,
    when the policy is activated on a queue, the policy data are allocated
    if there are blkg's (blkcg_gq's) which are attached to a given blkcg.
    This is buggy.  Consider the following scenario.
    
    1. A blkcg is created.  No blkg's attached yet.
    
    2. The policy is registered.  No policy data is allocated.
    
    3. The policy is activated on a queue.  As the above blkcg doesn't
       have any blkg's, it won't allocate the matching blkcg_policy_data.
    
    4. An IO is issued from the blkcg and a blkg is created, but the
       blkcg still doesn't have the matching policy data allocated.
    
    With cfq-iosched, this leads to an oops.
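
The four steps above can be sketched as a small user-space model. This is only an illustration under stated assumptions: `model_blkcg`, `lazy_activate`, `issue_io` and `lazy_scenario_has_pd` are hypothetical names, not kernel APIs, and a boolean stands in for the per-blkcg policy data pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_blkcg {
    int nr_blkgs;   /* blkg's currently attached to this blkcg */
    bool has_pd;    /* stands in for the per-blkcg policy data pointer */
};

/*
 * Models the old activation path: policy data is allocated only for
 * blkcg's that already have at least one blkg attached.
 */
static void lazy_activate(struct model_blkcg *cgs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (cgs[i].nr_blkgs > 0)
            cgs[i].has_pd = true;
}

/*
 * Models the first IO from a blkcg: a blkg is created, but nothing
 * allocates policy data.  Returns whether policy data is present.
 */
static bool issue_io(struct model_blkcg *cg)
{
    assert(cg != NULL);
    cg->nr_blkgs++;
    return cg->has_pd;
}

/* Replays steps 1-4 and reports whether the blkcg ended up with policy data. */
static bool lazy_scenario_has_pd(void)
{
    struct model_blkcg cg = { 0, false };  /* 1. blkcg created, no blkg's */
    /* 2. policy registered: nothing is allocated in this model */
    lazy_activate(&cg, 1);                 /* 3. activation skips this blkcg */
    return issue_io(&cg);                  /* 4. blkg created, still no data */
}
```

In this model `lazy_scenario_has_pd()` returns false: the blkcg has a blkg but no policy data, which is the state a policy would then dereference.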
    
    The commit also doesn't free policy data on policy unregistration,
    assuming that freeing all policy data on blkcg destruction takes care
    of it; however, this is incorrect as well.
    
    1. A blkcg has policy data.
    
    2. The policy gets unregistered but the policy data remains.
    
    3. Another policy gets registered on the same slot.
    
    4. Later, the new policy tries to allocate policy data on the previous
       blkcg but the slot is already occupied and gets skipped.  The
       policy ends up operating on the policy data of the previous policy.
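
The stale-slot sequence above can be illustrated with a similar user-space sketch. The names (`model_pd`, `model_alloc_pd`, `stale_slot_owner`) and the `owner_id` field are assumptions made for the example; the only behaviour taken from the description is that allocation is skipped when the slot is already occupied.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_POLS 2   /* hypothetical number of policy slots */

/* Hypothetical policy data; owner_id records which policy created it. */
struct model_pd {
    int owner_id;
};

struct model_blkcg {
    struct model_pd *pd[MODEL_MAX_POLS];
};

/*
 * Models the allocation path described above: if the slot is already
 * occupied, allocation is skipped and the existing data is used.
 */
static struct model_pd *model_alloc_pd(struct model_blkcg *cg, int slot,
                                       struct model_pd *fresh, int owner)
{
    assert(slot >= 0 && slot < MODEL_MAX_POLS);
    if (cg->pd[slot])
        return cg->pd[slot];   /* occupied: skipped */
    fresh->owner_id = owner;
    cg->pd[slot] = fresh;
    return fresh;
}

/* Replays steps 1-4 and returns the owner of the data the new policy sees. */
static int stale_slot_owner(void)
{
    struct model_blkcg cg = { { NULL, NULL } };
    struct model_pd a, b;

    model_alloc_pd(&cg, 0, &a, 1);   /* 1. blkcg has policy 1's data */
    /* 2. policy 1 is unregistered, but slot 0 is not cleared */
    /* 3. policy 2 is registered on slot 0 */
    return model_alloc_pd(&cg, 0, &b, 2)->owner_id;   /* 4. skipped */
}
```

Here `stale_slot_owner()` returns 1 rather than 2: the second policy is handed the first policy's data.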
    
    There's no reason to manage blkcg_policy_data lazily.  The reason we
    do lazy allocation of blkg's is that the number of all possible blkg's
    is the product of cgroups and block devices which can reach a
    surprising level.  blkcg_policy_data is constrained by the number of
    cgroups and shouldn't be a problem.
    
    This patch allocates blkcg_policy_data for all existing blkcg's on
    policy registration, frees it on unregistration, and removes
    blkcg_policy_data handling from the policy [de]activation paths.
    blkcg_policy_data is thus created and removed together with the
    policy it belongs to, which fixes both problems described above.
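
A minimal user-space sketch of the eager scheme, assuming hypothetical helpers `model_register` and `model_unregister` (these are not the kernel's registration functions, and the rollback on allocation failure is an assumption of the sketch): registration allocates data for every existing blkcg, and unregistration frees it and clears the slot.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_POLS 2   /* hypothetical number of policy slots */

struct model_pd {
    int owner_id;          /* which policy created this data */
};

struct model_blkcg {
    struct model_pd *pd[MODEL_MAX_POLS];
};

/*
 * Registration: allocate policy data for every existing blkcg.
 * Returns 0 on success; on allocation failure, frees what was
 * allocated so far and returns -1.
 */
static int model_register(struct model_blkcg *cgs, size_t n, int slot, int owner)
{
    assert(slot >= 0 && slot < MODEL_MAX_POLS);
    for (size_t i = 0; i < n; i++) {
        struct model_pd *pd = malloc(sizeof(*pd));

        if (!pd) {
            while (i-- > 0) {
                free(cgs[i].pd[slot]);
                cgs[i].pd[slot] = NULL;
            }
            return -1;
        }
        pd->owner_id = owner;
        cgs[i].pd[slot] = pd;
    }
    return 0;
}

/* Unregistration: free the data and clear the slot in every blkcg. */
static void model_unregister(struct model_blkcg *cgs, size_t n, int slot)
{
    assert(slot >= 0 && slot < MODEL_MAX_POLS);
    for (size_t i = 0; i < n; i++) {
        free(cgs[i].pd[slot]);
        cgs[i].pd[slot] = NULL;
    }
}

/* Registers policy 1, unregisters it, then registers policy 2 on the same slot. */
static int eager_scenario_owner(void)
{
    struct model_blkcg cgs[2] = { { { NULL, NULL } }, { { NULL, NULL } } };
    int owner;

    if (model_register(cgs, 2, 0, 1))
        return -1;
    model_unregister(cgs, 2, 0);
    if (model_register(cgs, 2, 0, 2))
        return -1;
    owner = cgs[0].pd[0]->owner_id;
    model_unregister(cgs, 2, 0);
    return owner;
}
```

In this model every blkcg has data as soon as the policy is registered, whether or not it has blkg's, and a policy reusing a slot always starts from a cleared one, so `eager_scenario_owner()` returns 2.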
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Fixes: e48453c3 ("block, cgroup: implement policy-specific per-blkcg data")
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>