    hugetlb: force allocating surplus hugepages on mempolicy allowed nodes · 003af997
    Aristeu Rozanski authored
    When a hugepage is requested and no reserved ones are free, the allocation
    may still be allowed if a number of overcommit hugepages was configured
    (using /proc/sys/vm/nr_overcommit_hugepages) and that number has not been
    reached.  This allows extra hugepages to be allocated dynamically when
    resources are available.  Some sysadmins even prefer not to reserve any
    hugepages and instead set a large number of overcommit hugepages.
    
    But when allocating overcommit hugepages on a multi-node system (either
    NUMA or mempolicy/cpuset), the allocation might fail seemingly at random
    even when resources are available for it.
    
    This happens because allowed_mems_nr() only accounts for the free
    hugepages in the nodes the current process belongs to, while the surplus
    hugepage allocation can be satisfied from any node.  If one or more of
    the requested surplus hugepages are allocated in a different node, the
    whole allocation fails because allowed_mems_nr() returns a lower value.
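
    The per-node view behind this check can be approximated from user space.
    The sketch below is not the kernel's allowed_mems_nr(); it only sums the
    free_hugepages counters exported under /sys/devices/system/node for a
    chosen list of nodes.  The helper name sum_free_hugepages and the 2048 kB
    page size in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# User-space approximation only: sums free_hugepages over a chosen set of
# nodes, similar in spirit to what the text says allowed_mems_nr() counts.
# sum_free_hugepages is a hypothetical helper name, not a kernel interface.
# Usage: sum_free_hugepages SYSFS_NODE_ROOT PAGE_SIZE_KB NODE...
sum_free_hugepages() {
	root=$1
	size_kb=$2
	shift 2
	total=0
	for node in "$@"; do
		f="$root/node$node/hugepages/hugepages-${size_kb}kB/free_hugepages"
		# Skip nodes that do not expose this hugepage size.
		[ -r "$f" ] || continue
		total=$((total + $(cat "$f")))
	done
	echo "$total"
}

# Example (assumes 2 MB hugepages and a process bound to node 0):
#   sum_free_hugepages /sys/devices/system/node 2048 0
```

    Comparing the sum for the bound nodes with the sum over all nodes shows
    how a free hugepage sitting on another node is not counted for the
    current process.  A failed attempt returns its surplus pages, so the
    counters may only show the mismatch briefly.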
    
    So allocate surplus hugepages in one of the nodes the current process
    belongs to.
    
    An easy way to reproduce this issue is on a system with 2+ NUMA nodes:
    
    	# echo 0 >/proc/sys/vm/nr_hugepages
    	# echo 1 >/proc/sys/vm/nr_overcommit_hugepages
    	# numactl -m0 ./tools/testing/selftests/mm/map_hugetlb 2
    
    Repeated runs of the map_hugetlb test application will eventually fail
    when the hugepage ends up allocated in a different node.
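
    Since the failure only shows up on some runs, a small loop can repeat the
    test until it fails.  The sketch below assumes that map_hugetlb exits
    with a non-zero status when the mapping fails; run_until_fail is a
    hypothetical helper name, and the attempt limit of 100 in the usage
    comment is arbitrary.

```shell
#!/bin/sh
# run_until_fail is a hypothetical helper: it reruns a command up to MAX
# times and reports the first attempt that exits with a non-zero status.
# Usage: run_until_fail MAX COMMAND [ARG...]
run_until_fail() {
	max=$1
	shift
	i=1
	while [ "$i" -le "$max" ]; do
		# Discard the command's own output; only its exit status matters.
		if ! "$@" >/dev/null 2>&1; then
			echo "failed on attempt $i"
			return 1
		fi
		i=$((i + 1))
	done
	echo "no failure in $max attempts"
	return 0
}

# Example, as root on a 2+ node system after the two echo commands above:
#   run_until_fail 100 numactl -m0 ./tools/testing/selftests/mm/map_hugetlb 2
```

    On a kernel without this change the loop would be expected to report a
    failing attempt sooner or later; how many attempts it takes depends on
    where the surplus hugepage happens to be allocated.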
    
    [aris@ruivo.org: v2]
      Link: https://lkml.kernel.org/r/20240701212343.GG844599@cathedrallabs.org
    Link: https://lkml.kernel.org/r/20240621190050.mhxwb65zn37doegp@redhat.com
    Signed-off-by: Aristeu Rozanski <aris@redhat.com>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Aristeu Rozanski <aris@ruivo.org>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Vishal Moola <vishal.moola@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
hugetlb.c 218 KB