• KOSAKI Motohiro's avatar
    mempolicy: remove mempolicy sharing · 869833f2
    KOSAKI Motohiro authored
    Dave Jones' system call fuzz testing tool "trinity" triggered the
    following bug error with slab debugging enabled
    
        =============================================================================
        BUG numa_policy (Not tainted): Poison overwritten
        -----------------------------------------------------------------------------
    
        INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b
        INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
         __slab_alloc+0x3d3/0x445
         kmem_cache_alloc+0x29d/0x2b0
         mpol_new+0xa3/0x140
         sys_mbind+0x142/0x620
         system_call_fastpath+0x16/0x1b
    
        INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
         __slab_free+0x2e/0x1de
         kmem_cache_free+0x25a/0x260
         __mpol_put+0x27/0x30
         remove_vma+0x68/0x90
         exit_mmap+0x118/0x140
         mmput+0x73/0x110
         exit_mm+0x108/0x130
         do_exit+0x162/0xb90
         do_group_exit+0x4f/0xc0
         sys_exit_group+0x17/0x20
         system_call_fastpath+0x16/0x1b
    
        INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x          (null) flags=0x20000000004080
        INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0
    
    The problem is that the structure is being prematurely freed due to a
    reference count imbalance. In the following case mbind(addr, len) should
    replace the memory policies of both vma1 and vma2 and thus they will
    become to share the same mempolicy and the new mempolicy will have the
    MPOL_F_SHARED flag.
    
      +-------------------+-------------------+
      |     vma1          |     vma2(shmem)   |
      +-------------------+-------------------+
      |                                       |
     addr                                 addr+len
    
    alloc_pages_vma() uses get_vma_policy() and mpol_cond_put() pair for
    maintaining the mempolicy reference count.  The current rule is that
    get_vma_policy() only increments refcount for shmem VMA and
    mpol_conf_put() only decrements refcount if the policy has
    MPOL_F_SHARED.
    
    In above case, vma1 is not shmem vma and vma->policy has MPOL_F_SHARED!
    The reference count will be decreased even though was not increased
    whenever alloc_page_vma() is called.  This has been broken since commit
    [52cd3b07: mempolicy: rework mempolicy Reference Counting] in 2008.
    
    There is another serious bug with the sharing of memory policies.
    Currently, mempolicy rebind logic (it is called from cpuset rebinding)
    ignores a refcount of mempolicy and override it forcibly.  Thus, any
    mempolicy sharing may cause mempolicy corruption.  The bug was
    introduced by commit [68860ec1: cpusets: automatic numa mempolicy
    rebinding].
    
    Ideally, the shared policy handling would be rewritten to either
    properly handle COW of the policy structures or at least reference count
    MPOL_F_SHARED based exclusively on information within the policy.
    However, this patch takes the easier approach of disabling any policy
    sharing between VMAs.  Each new range allocated with sp_alloc will
    allocate a new policy, set the reference count to 1 and drop the
    reference count of the old policy.  This increases the memory footprint
    but is not expected to be a major problem as mbind() is unlikely to be
    used for fine-grained ranges.  It is also inefficient because it means
    we allocate a new policy even in cases where mbind_range() could use the
    new_policy passed to it.  However, it is more straight-forward and the
    change should be invisible to the user.
    
    [mgorman@suse.de: Edited changelog]
    Reported-by: Dave Jones <davej@redhat.com>,
    Cc: Christoph Lameter <cl@linux.com>,
    Reviewed-by: default avatarChristoph Lameter <cl@linux.com>
    Signed-off-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    869833f2
mempolicy.c 66.1 KB