    [PATCH] cpuset exit NULL dereference fix · 2efe86b8
    Paul Jackson authored
    There is a race in the kernel cpuset code, between the code
    to handle notify_on_release, and the code to remove a cpuset.
    The notify_on_release code can end up trying to access a
    cpuset that has been removed.  In the most common case, this
    causes a NULL pointer dereference from the routine cpuset_path.
    However, all manner of bad things are possible, in theory at least.
    
    The existing code decrements the cpuset use count, and if the
    count goes to zero, processes the notify_on_release request,
    if appropriate.  However, once the count goes to zero, unless we
    are holding the global cpuset_sem semaphore, there is nothing to
    stop another task from immediately removing the cpuset entirely,
    and recycling its memory.
    
    The obvious fix would be to always hold the cpuset_sem
    semaphore while decrementing the use count and dealing with
    notify_on_release.  However, we don't want to force a global
    semaphore into the mainline task exit path, as that might create
    a scaling problem.
    
    The actual fix is almost as easy.  Since this is only an issue
    for cpusets using notify_on_release, which the big top-level
    cpusets don't normally need to use, take the cpuset_sem only
    for cpusets using notify_on_release.
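
    The locking rule above can be sketched in user space roughly as
    follows.  This is only an illustrative model, not the patch itself:
    struct model_cpuset, model_sem, model_sem_acquisitions and
    model_cpuset_exit are hypothetical names, a pthread mutex stands in
    for cpuset_sem, and a counter stands in for the real
    notify_on_release processing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model of a cpuset; not the kernel structure. */
struct model_cpuset {
	atomic_int count;        /* number of tasks still using this cpuset */
	bool notify_on_release;  /* whether release notification is requested */
	int notifications;       /* stand-in for the notify_on_release work */
};

/* Stand-in for the global cpuset_sem semaphore (assumption: a mutex). */
static pthread_mutex_t model_sem = PTHREAD_MUTEX_INITIALIZER;

/* Demonstration only: how often the global lock was taken. */
static int model_sem_acquisitions;

/* Drop one reference, as a task would on exit. */
static void model_cpuset_exit(struct model_cpuset *cs)
{
	assert(atomic_load(&cs->count) > 0);

	if (cs->notify_on_release) {
		/*
		 * Slow path: hold the global lock across both the
		 * decrement and the notification.  A remover that also
		 * takes model_sem cannot free cs in between.
		 */
		pthread_mutex_lock(&model_sem);
		model_sem_acquisitions++;
		if (atomic_fetch_sub(&cs->count, 1) == 1)
			cs->notifications++;
		pthread_mutex_unlock(&model_sem);
	} else {
		/*
		 * Fast path: no global lock, and cs is not touched
		 * again after the count has been decremented.
		 */
		atomic_fetch_sub(&cs->count, 1);
	}
}
```

    In this model only cpusets with notify_on_release set pay for the
    global lock, so the common exit path stays lock-free.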
    
    This code has been run for hours without a hiccup, while running
    a cpuset create/destroy stress test that could crash the existing
    kernel in seconds.  This patch applies to the current -linus
    git kernel.
    Signed-off-by: Paul Jackson <pj@sgi.com>
    Acked-by: Simon Derr <simon.derr@bull.net>
    Acked-by: Dinakar Guniguntala <dino@in.ibm.com>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cpuset.c 40.1 KB