@@ -132,8 +132,8 @@ general guidelines around how to collect those, due to the individual nature of
There are several types of counters which are all found in `usage_data.rb`:
-**Ordinary Batch Counters:** Simple count of a given ActiveRecord_Relation
-**Distinct Batch Counters:** Distinct count of a given ActiveRecord_Relation on given column
-**Sum Batch Counters:** Sum the values of a given ActiveRecord_Relation on given column
-**Distinct Batch Counters:** Distinct count of a given ActiveRecord_Relation in a given column
-**Sum Batch Counters:** Sum the values of a given ActiveRecord_Relation in a given column
-**Alternative Counters:** Used for settings and configurations
-**Redis Counters:** Used for in-memory counts.
...
...
@@ -153,7 +153,15 @@ For GitLab.com, there are extremely large tables with 15 second query timeouts,
| `merge_request_diff_files` | 1082 |
| `events` | 514 |
There are two batch counting methods provided, `Ordinary Batch Counters` and `Distinct Batch Counters`. Batch counting requires indexes on columns to calculate max, min, and range queries. In some cases, a specialized index may need to be added on the columns involved in a counter.
We have several batch counting methods available:
-`Ordinary Batch Counters`
-`Distinct Batch Counters`
-`Sum Batch Counters`
-`Estimated Batch Counters`
Batch counting requires indexes on columns to calculate max, min, and range queries. In some cases,
you may need to add a specialized index on the columns involved in a counter.
1. Simple execution of estimated batch counter, with only relation provided, returned value will represent estimated
number of unique values in `id` column (which is the primary key) of `Project` relation:
```ruby
estimate_batch_distinct_count(::Project)
```
1. Execution of estimated batch counter, where provided relation has applied additional filter (`.where(time_period)`), number of unique values is going to be estimated in custom column (`:author_id`), and parameters: `start` and `finish` together apply boundaries that defines range of provided relation that is going to be analyzed
Handles `::Redis::CommandError` and `Gitlab::UsageDataCounters::BaseCounter::UnknownEvent`
...
...
@@ -309,6 +387,10 @@ Examples of implementation:
#### Redis HLL Counters
WARNING:
HyperLogLog (HLL) is a probabilistic algorithm and its **results always includes some small error**. According to [Redis documentation](https://redis.io/commands/pfcount), data from
used HLL implementation is "approximated with a standard error of 0.81%".
With `Gitlab::UsageDataCounters::HLLRedisCounter` we have available data structures used to count unique values.
Implemented using Redis methods [PFADD](https://redis.io/commands/pfadd) and [PFCOUNT](https://redis.io/commands/pfcount).
...
...
@@ -783,8 +865,6 @@ appear to be associated to any of the services running, since they all appear to
## Aggregated metrics
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/45979) in GitLab 13.6.
> - It's [deployed behind a feature flag](../../user/feature_flags.md), disabled by default.
> - It's enabled on GitLab.com.
WARNING:
This feature is intended solely for internal GitLab use.