    lib: zstd: Don't inline functions in zstd_opt.c · 1974990c
    Nick Terrell authored
    `zstd_opt.c` contains the match finder for the highest compression
    levels. These levels are already very slow, and are unlikely to be used
    in the kernel. If they are used, they shouldn't be used in
    latency-sensitive workloads, so slowing them down shouldn't be a big deal.
    
    This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
    I've also opened an issue upstream [1] so that we can properly tackle
    the code size issue in `zstd_opt.c` for all users, and can hopefully
    remove this hack in the next zstd version we import.
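
As a rough illustration of the kind of hack described above, here is a minimal sketch of a forced-inline "template" macro that a single translation unit can switch off. The names `DEMO_NO_INLINE`, `DEMO_INLINE_TEMPLATE`, and the `demo_*` functions are hypothetical and are not taken from the zstd sources or from this patch; the sketch also assumes a GCC-compatible compiler for the `always_inline` attribute.

```c
#include <stddef.h>

/*
 * Hypothetical switch: building one file with -DDEMO_NO_INLINE drops the
 * forced inlining, so the generic body below is emitted once instead of
 * being duplicated into every specialized wrapper.
 */
#if defined(DEMO_NO_INLINE)
#  define DEMO_INLINE_TEMPLATE static
#elif defined(__GNUC__)
#  define DEMO_INLINE_TEMPLATE static inline __attribute__((always_inline))
#else
#  define DEMO_INLINE_TEMPLATE static inline
#endif

/*
 * Generic body. Each wrapper passes a constant 'mode', so with forced
 * inlining the compiler produces one specialized copy per wrapper
 * (faster, but larger code).
 */
DEMO_INLINE_TEMPLATE size_t demo_match_length(const unsigned char *a,
                                              const unsigned char *b,
                                              size_t limit, int mode)
{
    /* mode 1 only scans the first half of the window. */
    size_t stop = (mode == 1) ? limit / 2 : limit;
    size_t n = 0;

    while (n < stop && a[n] == b[n])
        n++;
    return n;
}

/* Specialized entry points, analogous in spirit to per-strategy wrappers. */
size_t demo_match_full(const unsigned char *a, const unsigned char *b,
                       size_t limit)
{
    return demo_match_length(a, b, limit, 0);
}

size_t demo_match_half(const unsigned char *a, const unsigned char *b,
                       size_t limit)
{
    return demo_match_length(a, b, limit, 1);
}
```

The results are identical with or without `DEMO_NO_INLINE`; only the size/speed trade-off changes, which is the trade this commit makes for the slowest compression levels.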
    
    Bloat-o-meter output on x86-64:
    
    ```
    > ../scripts/bloat-o-meter vmlinux.old vmlinux
    add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
    Function                                     old     new   delta
    ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
    ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
    ZSTD_insertBt1                                 -    1731   +1731
    ZSTD_storeSeq                                  -     693    +693
    ZSTD_BtGetAllMatches                           -     255    +255
    ZSTD_updateRep                                 -     128    +128
    ZSTD_updateTree                               96      99      +3
    ZSTD_insertAndFindFirstIndexHash3             81       -     -81
    ZSTD_setBasePrices.constprop                  98       -     -98
    ZSTD_litLengthPrice.constprop                138       -    -138
    ZSTD_count                                   362     181    -181
    ZSTD_count_2segments                        1407     938    -469
    ZSTD_insertBt1.constprop                    2689       -   -2689
    ZSTD_compressBlock_btultra2                19990     423  -19567
    ZSTD_compressBlock_btultra                 19633      15  -19618
    ZSTD_initStats_ultra                       19825       -  -19825
    ZSTD_compressBlock_btopt                   20374      12  -20362
    ZSTD_compressBlock_btopt_extDict           29984      12  -29972
    ZSTD_compressBlock_btultra_extDict         30718      15  -30703
    ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
    ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
    Total: Before=6611828, After=6418562, chg -2.92%
    ```
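
The totals above can be cross-checked with a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, and it assumes the "188 KB" figure in the message means the net delta divided by 1024 and rounded down.

```c
/* Net size change: bloat-o-meter reports growth and shrinkage separately. */
long demo_net_delta(long up, long down)
{
    return up + down; /* 'down' is already negative in the report */
}

/* Relative change in percent between two total sizes. */
double demo_percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}

/* Whole KiB contained in a byte count. */
long demo_whole_kib(long bytes)
{
    return bytes / 1024;
}
```

With the reported values, `demo_net_delta(16673, -209939)` gives -193266, which matches `After - Before` (6418562 - 6611828), is about -2.92% of the old total, and corresponds to 188 whole KiB.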
    
    [0] https://lkml.org/lkml/2021/11/14/189
    [1] https://github.com/facebook/zstd/issues/2862
    
    Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
    Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/
    Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Signed-off-by: Nick Terrell <terrelln@fb.com>
compiler.h 5.75 KB