    bcache: Clean up bch_get_congested() · 3a394727
    George Spelvin authored
    There are a few nits in this function.  They could in theory all
    be separate patches, but that's probably taking small commits
    too far.
    
    1) I added a brief comment saying what it does.
    
    2) I like to declare pointer parameters "const" where possible
       for documentation reasons.
    
    3) It uses bitmap_weight(&rand, BITS_PER_LONG) to compute the Hamming
    weight of a 32-bit random number (giving a random integer with
    mean 16 and variance 8).  Passing the value by reference in a
    64-bit variable is silly; just use hweight32().
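
As a rough userspace illustration of item 3 (not kernel code): the sketch below uses the GCC/Clang builtin __builtin_popcount() as an assumed stand-in for hweight32(), and checks the stated moments exactly. The names hweight32_sketch() and weight_moments16() are hypothetical. It enumerates one 16-bit half (mean 8, variance 4); since the two halves of a 32-bit random word are independent, the full word has mean 16 and variance 8.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for hweight32(); assumes a GCC- or Clang-compatible compiler. */
static unsigned int hweight32_sketch(uint32_t w)
{
	return (unsigned int)__builtin_popcount(w);
}

/* Sum the weights and squared weights of every 16-bit value. */
static void weight_moments16(uint64_t *sum, uint64_t *sum_sq)
{
	uint32_t v;

	*sum = 0;
	*sum_sq = 0;
	for (v = 0; v < 65536; v++) {
		uint64_t w = hweight32_sketch(v);

		*sum += w;
		*sum_sq += w * w;
	}
}

/* Mean 8 and E[w^2] = 68 over 16 bits, i.e. variance 68 - 8*8 = 4. */
static void check_moments(void)
{
	uint64_t sum, sum_sq;

	weight_moments16(&sum, &sum_sq);
	assert(sum == 8u * 65536u);
	assert(sum_sq == 68u * 65536u);
}
```

The value is passed directly in a register-sized argument, with no need to store it in an unsigned long and take its address.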
    
    4) Its helper function fract_exp_two is unnecessarily tangled.
    Gcc can optimize the multiply by (1 << x) to a shift, but it can
    be written in a much more straightforward way at the cost of one
    more bit of internal precision.  Some analysis reveals that this
    bit is always available.
    
    This shrinks the object code for fract_exp_two(x, 6) from 23 bytes:
    
    0000000000000000 <foo1>:
       0:   89 f9                   mov    %edi,%ecx
       2:   c1 e9 06                shr    $0x6,%ecx
       5:   b8 01 00 00 00          mov    $0x1,%eax
       a:   d3 e0                   shl    %cl,%eax
       c:   83 e7 3f                and    $0x3f,%edi
       f:   d3 e7                   shl    %cl,%edi
      11:   c1 ef 06                shr    $0x6,%edi
      14:   01 f8                   add    %edi,%eax
      16:   c3                      retq
    
    to 19 bytes:
    
    0000000000000017 <foo2>:
      17:   89 f8                   mov    %edi,%eax
      19:   83 e0 3f                and    $0x3f,%eax
      1c:   83 c0 40                add    $0x40,%eax
      1f:   89 f9                   mov    %edi,%ecx
      21:   c1 e9 06                shr    $0x6,%ecx
      24:   d3 e0                   shl    %cl,%eax
      26:   c1 e8 06                shr    $0x6,%eax
      29:   c3                      retq
    
    (Verified with 0 <= frac_bits <= 8, 0 <= x < 16<<frac_bits;
    both versions produce the same output.)
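
For readers who prefer C to disassembly, the two listings in item 4 can be read back into roughly the following pair of functions. This is a reconstruction from the object code, not a copy of the patch; the _old/_new suffixes and check_equivalence() are hypothetical names. The second form adds the implicit leading bit to the mantissa first, which is where the one extra bit of internal precision goes.

```c
#include <assert.h>

/* Tangled form: compute 2^e, then add the scaled fraction separately. */
static unsigned int fract_exp_two_old(unsigned int x, unsigned int fract_bits)
{
	unsigned int fract = x & ~(~0u << fract_bits);

	x >>= fract_bits;		/* integer part: the exponent e */
	x = 1u << x;			/* 2^e */
	x += (x * fract) >> fract_bits;	/* plus 2^e * fract / 2^fract_bits */
	return x;
}

/* Straightforward form: (implicit bit + fraction) shifted by the exponent. */
static unsigned int fract_exp_two_new(unsigned int x, unsigned int fract_bits)
{
	unsigned int mantissa = 1u << fract_bits;	/* implicit leading bit */

	mantissa += x & (mantissa - 1);			/* add the fraction bits */
	x >>= fract_bits;				/* the exponent e */
	return mantissa << x >> fract_bits;
}

/* Exhaustive comparison over the range quoted in the commit message. */
static void check_equivalence(void)
{
	unsigned int bits, x;

	for (bits = 0; bits <= 8; bits++)
		for (x = 0; x < (16u << bits); x++)
			assert(fract_exp_two_old(x, bits) ==
			       fract_exp_two_new(x, bits));
}
```

In that range the largest intermediate value in the new form is below 2^9 << 15 = 2^24, so the extra bit fits comfortably in a 32-bit unsigned int.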
    
    5) And finally, the call to bch_get_congested() in check_should_bypass()
    is separated from the use of the value by multiple tests which
    could moot the need to compute it.  Move the computation down to
    where it's needed.  This also saves a local register to hold the
    computed value.
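
The shape of the change in item 5 (and the const pointer parameter from item 2) can be sketched in userspace as below. Everything here is hypothetical: struct req_sketch, its fields, get_congested_sketch() and the bypass condition are illustrative stand-ins, not the logic of check_should_bypass(). The point is only that the threshold is computed after the cheap early-exit tests, so it is skipped whenever they decide the outcome.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the costly helper actually runs. */
static unsigned int congested_calls;

/* Hypothetical stand-in for bch_get_congested(); returns a fixed threshold. */
static unsigned int get_congested_sketch(void)
{
	congested_calls++;
	return 8;
}

/* Hypothetical request descriptor; the fields are illustrative only. */
struct req_sketch {
	bool detaching;
	bool unaligned;
	unsigned int sectors;
};

/* Cheap tests first; the threshold is computed only if none of them fires. */
static bool should_bypass_sketch(const struct req_sketch *r)
{
	unsigned int congested;

	assert(r);
	if (r->detaching || r->unaligned)
		return true;	/* early exit: threshold never computed */

	congested = get_congested_sketch();
	return congested && r->sectors >= congested;
}
```

Declaring the local at the top but assigning it only at the point of use also means the value does not have to stay live across the earlier tests.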
    
    Signed-off-by: George Spelvin <lkml@sdf.org>
    Signed-off-by: Coly Li <colyli@suse.de>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>