    GFS2: glock statistics gathering · a245769f
    Steven Whitehouse authored
    The stats are divided into two sets: those relating to the
    super block and those relating to an individual glock. The
    super block stats are maintained on a per-cpu basis in order
    to reduce the overhead of gathering them. They are also
    further divided by glock type.
    
    The same information is gathered for both the super block
    and the glock statistics. The super block statistics are
    used to provide default values for most of the glock
    statistics, so that newly created glocks should have, as far
    as possible, a sensible starting point.
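
    For illustration, seeding a new glock's statistics from the
    per-cpu super block stats might look like the following
    standalone sketch (the struct layout and names here are
    illustrative assumptions based on the description above, not
    necessarily the patch's own):

        #include <stdint.h>
        #include <string.h>

        #define GFS2_NR_LKSTATS 8   /* three mean/variance pairs + two counters */
        #define LM_TYPE_MAX     12  /* illustrative number of glock types */

        struct gfs2_lkstats {
            int64_t stats[GFS2_NR_LKSTATS];  /* all times in nanoseconds */
        };

        /* Per-cpu super block stats: one set per glock type. */
        struct gfs2_pcpu_lkstats {
            struct gfs2_lkstats lkstats[LM_TYPE_MAX];
        };

        /* Copy the local CPU's current per-type stats into a newly
         * created glock, giving it a sensible starting point. */
        static void glock_seed_stats(struct gfs2_lkstats *gl_stats,
                                     const struct gfs2_pcpu_lkstats *cpu_stats,
                                     unsigned int gltype)
        {
            memcpy(gl_stats, &cpu_stats->lkstats[gltype], sizeof(*gl_stats));
        }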
    
    The statistics are divided into three pairs of mean and
    variance, plus two counters. The mean/variance pairs are
    smoothed exponential estimates, and the algorithm used will
    be very familiar to anyone used to calculating round trip
    times in network code.
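
    Concretely, this is the familiar 1/8-gain mean and 1/4-gain
    variance estimator (the gains follow from the "8 samples, 4
    for the variance" note below). A minimal standalone sketch,
    with illustrative names:

        #include <stdint.h>
        #include <stdlib.h>

        /* Smoothed exponential estimate, network-RTT style:
         *   mean += (sample - mean) / 8
         *   var  += (|sample - mean| - var) / 4
         * implemented with shifts; values are in nanoseconds. */
        static void lkstat_update(int64_t *mean, int64_t *var, int64_t sample)
        {
            int64_t delta = sample - *mean;

            *mean += delta >> 3;                  /* gain 1/8 */
            *var  += (llabs(delta) - *var) >> 2;  /* gain 1/4 */
        }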
    
    The three pairs of mean/variance measure the following
    things:
    
     1. DLM lock time (non-blocking requests)
     2. DLM lock time (blocking requests)
     3. Inter-request time (again to the DLM)
    
    A non-blocking request is one which will complete right
    away, whatever the state of the DLM lock in question. That
    currently means any request for which (a) the current state
    of the lock is exclusive, (b) the requested state is either
    null or unlocked, or (c) the "try lock" flag is set. A
    blocking request covers all the other lock requests.
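
    In other words, the test amounts to something like the
    following sketch (the enum values stand in for GFS2's real
    lock states and flags; the helper name is hypothetical):

        #include <stdbool.h>

        enum lock_state { ST_UNLOCKED, ST_SHARED, ST_DEFERRED, ST_EXCLUSIVE };
        #define FLAG_TRY 0x01  /* stand-in for the "try lock" flag */

        /* A request is non-blocking when it can complete right away,
         * whatever the state of the DLM lock in question. */
        static bool request_is_nonblocking(enum lock_state cur,
                                           enum lock_state req, int flags)
        {
            return cur == ST_EXCLUSIVE ||  /* (a) already exclusive */
                   req == ST_UNLOCKED  ||  /* (b) null/unlocked requested */
                   (flags & FLAG_TRY);     /* (c) try lock */
        }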
    
    There are two counters. The first is there primarily to show
    how many lock requests have been made, and thus how much data
    has gone into the mean/variance calculations. The other counts
    the queueing of holders at the top layer of the glock code.
    Hopefully that number will be a lot larger than the number
    of DLM lock requests issued.
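
    Taken together, each stats block then holds eight 64-bit
    values. A sketch of the layout implied by the description
    (the slot names are modelled on the kernel's GFS2_LKS_*
    constants, but treat the details here as illustrative):

        enum {
            GFS2_LKS_SRTT = 0,   /* Non-blocking smoothed round trip time */
            GFS2_LKS_SRTTVAR,    /* Non-blocking smoothed variance */
            GFS2_LKS_SRTTB,      /* Blocking smoothed round trip time */
            GFS2_LKS_SRTTVARB,   /* Blocking smoothed variance */
            GFS2_LKS_SIRT,       /* Smoothed inter-request time */
            GFS2_LKS_SIRTVAR,    /* Smoothed inter-request variance */
            GFS2_LKS_DCOUNT,     /* Count of DLM requests */
            GFS2_LKS_QCOUNT,     /* Count of holder queue events */
            GFS2_NR_LKSTATS
        };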
    
    So why gather these statistics? There are several reasons
    we'd like to get a better idea of these timings:
    
    1. To be able to better set the glock "min hold time"
    2. To spot performance issues more easily
    3. To improve the algorithm for selecting resource groups for
    allocation (to base it on lock wait time, rather than blindly
    using a "try lock")

    Due to the smoothing action of the updates, a step change in
    some input quantity being sampled will only be fully taken
    into account after about 8 samples (or 4 for the variance),
    and this needs to be carefully considered when interpreting
    the results.
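
    A tiny user-space demo (hypothetical, reusing the update rule
    sketched earlier) makes the lag visible by feeding a step
    change through the 1/8-gain mean update:

        #include <stdio.h>
        #include <stdint.h>

        int main(void)
        {
            int64_t mean = 1000;  /* old steady state, nanoseconds */

            /* The sampled quantity steps to 2000ns; watch the
             * smoothed estimate chase the new value. */
            for (int i = 1; i <= 8; i++) {
                mean += (2000 - mean) >> 3;
                printf("sample %d: mean = %lld ns\n", i, (long long)mean);
            }
            return 0;
        }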
    
    Knowing both the time it takes a lock request to complete and
    the average time between lock requests for a glock means we
    can compute the percentage of time for which the node is able
    to use a glock vs. the time during which the rest of the
    cluster has its share. That will be very useful when setting
    the lock min hold time.
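
    For example (one possible reading, an illustration rather
    than a formula from the patch): if we treat the smoothed
    blocking lock time as time during which another node holds
    the lock, and the smoothed inter-request time as time
    available to this node, then:

        #include <stdint.h>

        /* Rough share of time this node can use the glock, as a
         * percentage; sirt and srtt_b are nanosecond estimates. */
        static int64_t local_use_percent(int64_t sirt, int64_t srtt_b)
        {
            if (sirt + srtt_b == 0)
                return 0;
            return (100 * sirt) / (sirt + srtt_b);
        }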
    
    The other point to remember is that all times are in
    nanoseconds. Great care has been taken to ensure that we
    measure exactly the quantities that we want, as accurately
    as possible. There are always inaccuracies in any
    measuring system, but I hope this is as accurate as we
    can reasonably make it.
    Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>