    Btrfs: heuristic: add Shannon entropy calculation · 19562430
    Timofey Titovets authored
    The byte distribution check in the heuristic filters edge data cases
    and sometimes fails to classify input data.
    
    Let's fix that by adding a Shannon entropy calculation, which covers
    classification of most other data types.
    
    As Shannon entropy needs log2 with some precision to work, let's use
    ilog2(N) and, for increased precision, compute ilog2(pow(N, 4)).
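The precision trick above can be sketched in userspace C. This is only an illustration, not the kernel code: `ilog2_u64` is a hypothetical stand-in for the kernel's `ilog2()`, and `ilog2_pow4` is a hypothetical name. Because floor(log2(N^4)) equals floor(4 * log2(N)), the result carries two extra fractional bits of log2(N). The sketch assumes N < 65536 so that N^4 fits in 64 bits.

```c
#include <stdint.h>

/*
 * Integer floor(log2(n)) for n > 0.
 * Hypothetical userspace stand-in for the kernel's ilog2().
 */
static uint32_t ilog2_u64(uint64_t n)
{
	uint32_t r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/*
 * floor(log2(n^4)) == floor(4 * log2(n)), i.e. log2(n) in units of 1/4.
 * Assumes 0 < n < 65536 so that n^4 does not overflow 64 bits.
 */
static uint32_t ilog2_pow4(uint64_t n)
{
	return ilog2_u64(n * n * n * n);
}
```

For example, plain `ilog2_u64(3)` yields 1, while `ilog2_pow4(3)` yields 6, which corresponds to 6/4 = 1.5 and is closer to the real value log2(3) ≈ 1.585.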
    
    Shannon entropy has been slightly changed to avoid signed numbers and
    division.
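One way to read the change described above: with p_i = c_i / N, the entropy -sum(p_i * log2(p_i)) can be rewritten as (1 / N) * sum(c_i * (log2(N) - log2(c_i))). Every term is non-negative because c_i <= N, so unsigned arithmetic suffices and no per-symbol probability has to be formed by division. The following is a minimal userspace sketch under that assumption, not the kernel implementation; `log2_x4` and `entropy_percent` are hypothetical names, and the sketch still performs a final normalising division.

```c
#include <stddef.h>
#include <stdint.h>

/* floor(log2(n^4)) for 0 < n < 65536 (hypothetical helper). */
static uint32_t log2_x4(uint64_t n)
{
	uint64_t v = n * n * n * n;
	uint32_t r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Entropy of a byte histogram as a percentage (0..100) of the
 * maximum of 8 bits per byte. "total" must be the sum of all
 * counts and must be non-zero. Zero counts are skipped.
 */
static uint32_t entropy_percent(const uint32_t *counts, size_t n,
				uint32_t total)
{
	const uint32_t entropy_max = 8 * log2_x4(2); /* 8 bits, scaled by 4 */
	const uint32_t log_total = log2_x4(total);
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (counts[i] == 0)
			continue;
		/* counts[i] <= total, so the difference never goes negative */
		sum += (uint64_t)counts[i] * (log_total - log2_x4(counts[i]));
	}

	sum /= total;
	return (uint32_t)(sum * 100 / entropy_max);
}
```

With 256 buckets that each hold one byte, the sketch reports 100; with a single bucket holding every byte, it reports 0.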
    
    The entropy is computed directly from the formula, superseding the
    precalculated table or chains of if-else.
    
    The accuracy errors of ilog2 are compensated by lowering the
    thresholds:

    @ENTROPY_LVL_ACEPTABLE 70 -> 65
    @ENTROPY_LVL_HIGH      85 -> 80
    
    Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    [ update comments ]
    Signed-off-by: David Sterba <dsterba@suse.com>
compression.c 39.2 KB