    zsmalloc: fine-grained inuse ratio based fullness grouping · 4c7ac972
    Sergey Senozhatsky authored
    Each zspage maintains ->inuse counter which keeps track of the number of
    objects stored in the zspage.  The ->inuse counter also determines the
    zspage's "fullness group" which is calculated as the ratio of the "inuse"
    objects to the total number of objects the zspage can hold
    (objs_per_zspage).  The closer the ->inuse counter is to objs_per_zspage,
    the better.
    
    Each size class maintains several fullness lists that keep track of
    zspages of particular "fullness".  Zspages within each fullness list are
    stored in random order with regard to the ->inuse counter.  This is
    because sorting the zspages by ->inuse counter each time obj_malloc() or
    obj_free() is called would be too expensive.  However, the ->inuse counter
    is still a crucial factor in many situations.
    
    For the two major zsmalloc operations, zs_malloc() and zs_compact(), we
    typically select the head zspage from the corresponding fullness list as
    the best candidate zspage.  However, this assumption is not always
    accurate.
    
    For the zs_malloc() operation, the optimal candidate zspage should have
    the highest ->inuse counter.  This is because the goal is to maximize the
    number of ZS_FULL zspages and make full use of all allocated memory.
    
    For the zs_compact() operation, the optimal source zspage should have the
    lowest ->inuse counter.  This is because compaction needs to move objects
    in use to another page before it can release the zspage and return its
    physical pages to the buddy allocator.  The fewer objects in use, the
    quicker compaction can release the zspage.  Additionally, compaction is
    measured by the number of pages it releases.
    
    This patch reworks the fullness grouping mechanism.  Instead of having two
    groups - ZS_ALMOST_EMPTY (usage ratio below 3/4) and ZS_ALMOST_FULL (usage
    ratio above 3/4) - that result in too many zspages being included in the
    ALMOST_EMPTY group for specific classes, size classes maintain a larger
    number of fullness lists that give strict guarantees on the minimum and
    maximum ->inuse values within each group.  Each group represents a 10%
    change in the ->inuse ratio compared to neighboring groups.  In essence,
    there are groups for zspages with 0%, 10%, 20% usage ratios, and so on, up
    to 100%.
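
    The mapping from ->inuse to a fullness group can be sketched in plain
    user-space C.  This is an illustration under stated assumptions, not
    the code from the patch: the FG_* names and get_fullness_group_sketch()
    are hypothetical, and the 12-group layout (0%, 10%, ..., 90%, 99%,
    100%) is inferred from the printout in this message.

```c
#include <assert.h>

/* Hypothetical group indices: 0%, 10%, ..., 90%, 99%, 100% (12 groups). */
enum fullness_group_sketch {
    FG_0 = 0,     /* no objects in use */
    FG_10 = 1,    /* first group for partially used zspages */
    FG_99 = 10,   /* almost full, but at least one free slot */
    FG_100 = 11,  /* every object slot is in use */
    NR_FG = 12
};

/* Map an ->inuse count to a group index (assumed layout, see above). */
static int get_fullness_group_sketch(int inuse, int objs_per_zspage)
{
    int ratio;

    assert(objs_per_zspage > 0);
    assert(inuse >= 0 && inuse <= objs_per_zspage);

    if (inuse == 0)
        return FG_0;
    if (inuse == objs_per_zspage)
        return FG_100;

    /* Integer percentage in the range 0..99 for partially used zspages. */
    ratio = 100 * inuse / objs_per_zspage;

    /*
     * Adding 1 keeps a partially used zspage out of FG_0: one object out
     * of 127 gives ratio 0, yet the zspage is not empty, so it is placed
     * in FG_10.
     */
    return ratio / 10 + 1;
}
```

    Under this sketch, with objs_per_zspage equal to 16, ->inuse values of
    8 and 9 both land in group 6 (the "60%" row) and values of 5 and 6 in
    group 4 (the "40%" row), so group k holds usage ratios from 10*(k-1)%
    up to, but not including, 10*k%.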
    
    This enhances the selection of candidate zspages for both zs_malloc() and
    zs_compact().  A printout of the ->inuse counters of the first 7 zspages
    per (random) class fullness group:
    
     class-768 objs_per_zspage 16:
       fullness 100%:  empty
       fullness  99%:  empty
       fullness  90%:  empty
       fullness  80%:  empty
       fullness  70%:  empty
       fullness  60%:  8  8  9  9  8  8  8
       fullness  50%:  empty
       fullness  40%:  5  5  6  5  5  5  5
       fullness  30%:  4  4  4  4  4  4  4
       fullness  20%:  2  3  2  3  3  2  2
       fullness  10%:  1  1  1  1  1  1  1
       fullness   0%:  empty
    
    The zs_malloc() function searches through the groups of pages starting
    with the one having the highest usage ratio.  This means that it always
    selects a zspage from the group with the least internal fragmentation
    (highest usage ratio) and makes it even less fragmented by increasing its
    usage ratio.
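
    The allocation-side scan described above might look roughly like the
    following user-space sketch.  The struct layouts, the FG_* indices and
    find_alloc_zspage_sketch() are hypothetical stand-ins; the sketch also
    assumes that completely full zspages are skipped because they have no
    free slot, and that the scan runs all the way down to the 0% group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group indices: 0%, 10%, ..., 90%, 99%, 100%. */
enum { FG_0 = 0, FG_10 = 1, FG_99 = 10, FG_100 = 11, NR_FG = 12 };

/* Simplified stand-in for a zspage: only what the sketch needs. */
struct zspage_sketch {
    int inuse;                      /* number of objects in use */
    struct zspage_sketch *next;     /* next zspage in the same list */
};

/* Simplified stand-in for a size class: one list head per group. */
struct size_class_sketch {
    struct zspage_sketch *fullness_list[NR_FG];
};

/*
 * Return the head zspage of the non-empty group with the highest usage
 * ratio that can still accept an object, or NULL if there is none.
 */
static struct zspage_sketch *
find_alloc_zspage_sketch(struct size_class_sketch *sc)
{
    int fg;

    assert(sc != NULL);

    /* Start just below FG_100: full zspages cannot take new objects. */
    for (fg = FG_99; fg >= FG_0; fg--) {
        if (sc->fullness_list[fg] != NULL)
            return sc->fullness_list[fg];
    }
    return NULL;
}
```

    Because every group bounds the ->inuse ratio of its members, taking
    the list head is good enough in this sketch: any zspage in the chosen
    group is within one 10% step of the best possible candidate.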
    
    The zs_compact() function, on the other hand, begins by scanning the group
    with the highest fragmentation (lowest usage ratio) to locate the source
    page.  The first available zspage is selected, and then the function moves
    downward to find a destination zspage in the group with the lowest
    internal fragmentation (highest usage ratio).
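
    The two compaction scans can be sketched the same way.  Again this is
    a user-space illustration with hypothetical names (pop_head(),
    isolate_src_sketch(), isolate_dst_sketch()) and simplified types; it
    assumes that a selected zspage is unlinked from its list, so the
    source can never be picked as its own destination, and that empty and
    completely full zspages are not candidates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group indices: 0%, 10%, ..., 90%, 99%, 100%. */
enum { FG_0 = 0, FG_10 = 1, FG_99 = 10, FG_100 = 11, NR_FG = 12 };

struct zspage_sketch {
    int inuse;                      /* number of objects in use */
    struct zspage_sketch *next;     /* next zspage in the same list */
};

struct size_class_sketch {
    struct zspage_sketch *fullness_list[NR_FG];
};

/* Unlink and return the head of one fullness list, or NULL if empty. */
static struct zspage_sketch *pop_head(struct size_class_sketch *sc, int fg)
{
    struct zspage_sketch *zspage = sc->fullness_list[fg];

    if (zspage != NULL) {
        sc->fullness_list[fg] = zspage->next;
        zspage->next = NULL;
    }
    return zspage;
}

/* Source: scan upward from the lowest usage ratio (fewest objects). */
static struct zspage_sketch *isolate_src_sketch(struct size_class_sketch *sc)
{
    int fg;

    assert(sc != NULL);
    for (fg = FG_10; fg <= FG_99; fg++) {
        struct zspage_sketch *zspage = pop_head(sc, fg);

        if (zspage != NULL)
            return zspage;
    }
    return NULL;
}

/* Destination: scan downward from the highest usage ratio with room. */
static struct zspage_sketch *isolate_dst_sketch(struct size_class_sketch *sc)
{
    int fg;

    assert(sc != NULL);
    for (fg = FG_99; fg >= FG_10; fg--) {
        struct zspage_sketch *zspage = pop_head(sc, fg);

        if (zspage != NULL)
            return zspage;
    }
    return NULL;
}
```

    Picking the source first and unlinking it means that, when only one
    candidate zspage remains, the destination scan returns NULL and there
    is nothing left to compact.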
    
    Link: https://lkml.kernel.org/r/20230304034835.2082479-3-senozhatsky@chromium.org
    Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Cc: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>