    Don't record duration for errors · 789dc771
    Bob Van Landuyt authored
    This makes the request duration histograms consistent across both
    metrics: a duration is only recorded when the request was successful.
    
    It also makes sure that the way the error reached the middleware no
    longer matters. Before this change, an error that bubbled up to the
    middleware would count towards a group's error budget, but not towards
    the service availability. With this change, neither metric includes
    the duration of a request that resulted in a response with a 5xx
    status code.
    
    If a request failed, it's not very important for users how fast it
    failed.
    
    For our metrics it could also skew results: a very fast 500 would have
    less impact on a service's availability or a group's budget spend than
    a slow one, while they should weigh the same. Similarly, a slow
    failure should not count double against availability.
    
    This results in the following scoring for requests:
    
    |         | Fast | Slow |
    |---------|------|------|
    | Success | 2/2  | 1/2  |
    | Error   | 0/1  | 0/1  |
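
The scoring table can be read as a small function. The sketch below is an illustration only, not code from this commit: it assumes each cell is "good measurements / total measurements", that a successful request is measured twice (once for success, once for duration), and that an error is measured once because its duration is not recorded. The names `score_request` and `availability` are hypothetical.

```python
from typing import Iterable, Tuple


def score_request(success: bool, fast: bool) -> Tuple[int, int]:
    """Return (good, total) measurements for one request (assumed reading of the table)."""
    if not success:
        # Errors are counted once; their duration is not recorded,
        # so a fast and a slow failure weigh the same.
        return (0, 1)
    # Successes are counted for the success measurement and the duration measurement.
    return (2, 2) if fast else (1, 2)


def availability(requests: Iterable[Tuple[bool, bool]]) -> float:
    """Aggregate (success, fast) pairs into a good/total ratio."""
    good = total = 0
    for success, fast in requests:
        g, t = score_request(success, fast)
        good += g
        total += t
    return good / total if total else 1.0


if __name__ == "__main__":
    # One fast success, one slow success, one fast error.
    print(availability([(True, True), (True, False), (False, True)]))
```

Under this reading, swapping the fast error for a slow one leaves the ratio unchanged, which is the behaviour the commit message describes.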
gitlab_metrics.md 50.4 KB