    mm/mempolicy: don't handle MPOL_LOCAL like a fake MPOL_PREFERRED policy · 7858d7bc
    Feng Tang authored
    MPOL_LOCAL has been set up as a real policy, but it is still handled
    like a fake MPOL_PREFERRED policy with the internal MPOL_F_LOCAL flag bit
    set. Many places have to distinguish a real 'prefer' policy from a
    'local' one, which is quite confusing.
    
    In the current code, there are 4 cases in which MPOL_LOCAL is used:
    
    1. user specifies 'local' policy
    
    2. user specifies 'prefer' policy, but with empty nodemask
    
    3. system 'default' policy is used
    
    4. 'prefer' policy + valid 'preferred' node with MPOL_F_STATIC_NODES
       flag set: when it is rebound to a nodemask which doesn't contain
       the 'preferred' node, it behaves as a 'local' policy
    
    So make 'local' a real policy instead of a fake 'prefer' one, and kill
    the MPOL_F_LOCAL bit, which greatly reduces confusion when reading the code.
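
    As a rough illustration of case 2 once 'local' is a mode of its own, the
    userspace sketch below normalizes a 'prefer' request with an empty
    nodemask into 'local' at creation time, so later code only has to check
    the mode. All names here (sketch_mode, sketch_policy, sketch_policy_new,
    sketch_is_local) and the 64-bit nodemask are hypothetical stand-ins for
    illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's real types. */
enum sketch_mode {
	SKETCH_PREFERRED,
	SKETCH_LOCAL,
};

typedef uint64_t sketch_nodemask;	/* assumption: one bit per node, max 64 */

struct sketch_policy {
	enum sketch_mode mode;
	sketch_nodemask nodes;		/* unused (zero) for SKETCH_LOCAL */
};

/* Build a policy; 'prefer' with an empty nodemask becomes a real 'local'. */
struct sketch_policy sketch_policy_new(enum sketch_mode mode,
				       sketch_nodemask nodes)
{
	struct sketch_policy pol = { .mode = mode, .nodes = nodes };

	if (mode == SKETCH_PREFERRED && nodes == 0)
		pol.mode = SKETCH_LOCAL;

	/* A 'local' policy carries no nodemask in this sketch. */
	if (pol.mode == SKETCH_LOCAL)
		pol.nodes = 0;

	return pol;
}

/* With no flag bit to consult, checking the mode is enough. */
bool sketch_is_local(const struct sketch_policy *pol)
{
	return pol->mode == SKETCH_LOCAL;
}
```

    With this shape, callers never need to ask whether a 'prefer' policy is
    secretly 'local'; the mode alone answers that.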
    
    For case 4, the logic of mpol_rebind_preferred() is confusing, as Michal
    Hocko pointed out:
    
    : I do believe that rebinding preferred policy is just bogus and it should
    : be dropped altogether on the ground that a preference is a mere hint from
    : userspace where to start the allocation.  Unless I am missing something
    : cpusets will be always authoritative for the final placement.  The
    : preferred node just acts as a starting point and it should be really
    : preserved when cpusets changes.  Otherwise we have a very subtle behavior
    : corner cases.
    
    So drop all the tricky transformations between 'prefer' and 'local', and
    just record the new nodemask on rebind.
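
    A minimal userspace sketch of that simplified rebind, assuming a policy
    that stores the user's preferred node separately from the last allowed
    nodemask. The names (rb_mode, rb_policy, rb_rebind_preferred) and field
    layout are hypothetical and only model the idea; they are not the
    kernel's structures.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's real types. */
typedef uint64_t rb_nodemask;		/* assumption: one bit per node, max 64 */

enum rb_mode {
	RB_PREFERRED,
	RB_LOCAL,
};

struct rb_policy {
	enum rb_mode mode;
	rb_nodemask preferred;		/* the user's hint, kept across rebinds */
	rb_nodemask cpuset_allowed;	/* nodemask recorded by the last rebind */
};

/*
 * Rebind a 'prefer' policy: only remember the new allowed nodemask.
 * The mode and the preferred node are left alone, even when the
 * preferred node is not part of new_allowed.
 */
void rb_rebind_preferred(struct rb_policy *pol, rb_nodemask new_allowed)
{
	pol->cpuset_allowed = new_allowed;
}
```

    In this model the preference stays a hint: a rebind to a nodemask that
    excludes the preferred node no longer turns the policy into 'local'.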
    
    [feng.tang@intel.com: fix a problem in mpol_set_nodemask(), per Michal Hocko]
      Link: https://lkml.kernel.org/r/1622560492-1294-3-git-send-email-feng.tang@intel.com
    [feng.tang@intel.com: refine code and comments of mpol_set_nodemask(), per Michal]
      Link: https://lkml.kernel.org/r/20210603081807.GE56979@shbuild999.sh.intel.com
    
    Link: https://lkml.kernel.org/r/1622469956-82897-3-git-send-email-feng.tang@intel.com
    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Suggested-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Ben Widawsky <ben.widawsky@intel.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Randy Dunlap <rdunlap@infradead.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mempolicy.c 74.7 KB