@@ -537,13 +537,13 @@ Data that was created on the primary while the secondary was paused will be lost
...
@@ -537,13 +537,13 @@ Data that was created on the primary while the secondary was paused will be lost
1. Update the existing cluster configuration.
1. Update the existing cluster configuration.
You can retrieve the existing config with Helm:
You can retrieve the existing configuration with Helm:
```shell
```shell
helm --namespace gitlab get values gitlab-geo > gitlab.yaml
helm --namespace gitlab get values gitlab-geo > gitlab.yaml
```
```
The existing config will contain a section for Geo that should resemble:
The existing configuration will contain a section for Geo that should resemble:
```yaml
```yaml
geo:
geo:
...
@@ -562,7 +562,7 @@ Data that was created on the primary while the secondary was paused will be lost
...
@@ -562,7 +562,7 @@ Data that was created on the primary while the secondary was paused will be lost
You can remove the entire `psql` section if the cluster will remain as a primary site, this refers to the tracking database and will be ignored whilst the cluster is acting as a primary site.
You can remove the entire `psql` section if the cluster will remain as a primary site, this refers to the tracking database and will be ignored whilst the cluster is acting as a primary site.
@@ -196,7 +196,7 @@ This list of limitations only reflects the latest version of GitLab. If you are
...
@@ -196,7 +196,7 @@ This list of limitations only reflects the latest version of GitLab. If you are
- Object pools for forked project deduplication work only on the **primary** site, and are duplicated on the **secondary** site.
- Object pools for forked project deduplication work only on the **primary** site, and are duplicated on the **secondary** site.
- GitLab Runners cannot register with a **secondary** site. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294).
- GitLab Runners cannot register with a **secondary** site. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294).
- Configuring Geo **secondary** sites to [use high-availability configurations of PostgreSQL](https://gitlab.com/groups/gitlab-org/-/epics/2536) is currently in **alpha** support.
- Configuring Geo **secondary** sites to [use high-availability configurations of PostgreSQL](https://gitlab.com/groups/gitlab-org/-/epics/2536) is currently in **alpha** support.
- [Selective synchronization](replication/configuration.md#selective-synchronization) only limits what repositories are replicated. The entire PostgreSQL data is still replicated. Selective synchronization is not built to accomodate compliance / export control use cases.
- [Selective synchronization](replication/configuration.md#selective-synchronization) only limits what repositories are replicated. The entire PostgreSQL data is still replicated. Selective synchronization is not built to accommodate compliance / export control use cases.
Components marked with * can be optionally run on reputable
Components marked with * can be optionally run on reputable
third party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work.
third party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work.
Components marked with ** can be optionally run on reputable
Components marked with ** can be optionally run on reputable
third party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
third party external PaaS Redis solutions. Google Memorystore and AWS ElastiCache are known to work.
```plantuml
```plantuml
@startuml 3k
@startuml 3k
...
@@ -1213,7 +1213,7 @@ Praefect requires several secret tokens to secure communications across the Clus
...
@@ -1213,7 +1213,7 @@ Praefect requires several secret tokens to secure communications across the Clus
Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
Gitaly Cluster nodes are configured in Praefect via a `virtual storage`. Each storage contains
the details of each Gitaly node that makes up the cluster. Each storage is also given a name
the details of each Gitaly node that makes up the cluster. Each storage is also given a name
and this name is used in several areas of the config. In this guide, the name of the storage will be
and this name is used in several areas of the configuration. In this guide, the name of the storage will be
`default`. Also, this guide is geared towards new installs, if upgrading an existing environment
`default`. Also, this guide is geared towards new installs, if upgrading an existing environment
to use Gitaly Cluster, you may need to use a different name.
to use Gitaly Cluster, you may need to use a different name.
Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more info.
Refer to the [Praefect documentation](../gitaly/praefect.md#praefect) for more info.
...
@@ -2074,7 +2074,7 @@ but with smaller performance requirements, several modifications can be consider
...
@@ -2074,7 +2074,7 @@ but with smaller performance requirements, several modifications can be consider
- PostgreSQL: Can be run on reputable Cloud PaaS solutions such as Google Cloud SQL or AWS RDS. In this setup, the PgBouncer and Consul nodes are no longer required:
- PostgreSQL: Can be run on reputable Cloud PaaS solutions such as Google Cloud SQL or AWS RDS. In this setup, the PgBouncer and Consul nodes are no longer required:
  - Consul may still be desired if [Prometheus](../monitoring/prometheus/index.md) auto discovery is a requirement, otherwise you would need to [manually add scrape configurations](../monitoring/prometheus/index.md#adding-custom-scrape-configurations) for all nodes.
  - Consul may still be desired if [Prometheus](../monitoring/prometheus/index.md) auto discovery is a requirement, otherwise you would need to [manually add scrape configurations](../monitoring/prometheus/index.md#adding-custom-scrape-configurations) for all nodes.
  - As Redis Sentinel runs on the same box as Consul in this architecture, it may need to be run on a separate box if Redis is still being run via Omnibus.
  - As Redis Sentinel runs on the same box as Consul in this architecture, it may need to be run on a separate box if Redis is still being run via Omnibus.
- Redis: Can be run on reputable Cloud PaaS solutions such as Google Memorystore and AWS Elasticache. In this setup, the Redis Sentinel is no longer required.
- Redis: Can be run on reputable Cloud PaaS solutions such as Google Memorystore and AWS ElastiCache. In this setup, the Redis Sentinel is no longer required.
We expect to see 20M builds created daily on GitLab.com in the first half of
We expect to see 20M builds created daily on GitLab.com in the first half of
2024.
2024.
![ci_builds cumulative with forecast](ci_builds_cumulative_forecast.png)
![CI builds cumulative with forecast](ci_builds_cumulative_forecast.png)
## Goals
## Goals
...
@@ -46,9 +46,9 @@ Historically, Rails used to use [integer](https://www.postgresql.org/docs/9.1/da
...
@@ -46,9 +46,9 @@ Historically, Rails used to use [integer](https://www.postgresql.org/docs/9.1/da
type when creating primary keys for a table. We did use the default when we
type when creating primary keys for a table. We did use the default when we
[created the `ci_builds` table in 2012](https://gitlab.com/gitlab-org/gitlab/-/blob/046b28312704f3131e72dcd2dbdacc5264d4aa62/db/ci/migrate/20121004165038_create_builds.rb).
[created the `ci_builds` table in 2012](https://gitlab.com/gitlab-org/gitlab/-/blob/046b28312704f3131e72dcd2dbdacc5264d4aa62/db/ci/migrate/20121004165038_create_builds.rb).
[The behavior of Rails has changed](https://github.com/rails/rails/pull/26266)
[The behavior of Rails has changed](https://github.com/rails/rails/pull/26266)
since the release of Rails 5. The framework is now using bigint type that is 8
since the release of Rails 5. The framework is now using `bigint` type that is 8
bytes long, however we have not migrated primary keys for `ci_builds` table to
bytes long, however we have not migrated primary keys for `ci_builds` table to
bigint yet.
`bigint` yet.
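To give a sense of the headroom involved, the arithmetic can be sketched in Ruby. The 4-byte and 8-byte widths are those of PostgreSQL's signed `integer` and `bigint` types; the constant names are illustrative only and do not come from the GitLab codebase.

```ruby
# Largest value a signed 4-byte integer column can hold.
INTEGER_MAX = 2**31 - 1

# Largest value a signed 8-byte bigint column can hold.
BIGINT_MAX = 2**63 - 1

puts INTEGER_MAX # 2147483647
puts BIGINT_MAX  # 9223372036854775807
```

Once a primary key sequence passes the first value, inserts into an `integer` column fail, which is why the migration to `bigint` matters.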
We will run out of the capacity of the integer type to store primary keys in
We will run out of the capacity of the integer type to store primary keys in
`ci_builds` table before December 2021. When it happens without a viable
`ci_builds` table before December 2021. When it happens without a viable
...
@@ -89,7 +89,7 @@ Prophet](https://facebook.github.io/prophet/) shows that in the first half of
...
@@ -89,7 +89,7 @@ Prophet](https://facebook.github.io/prophet/) shows that in the first half of
to around 2M we see created today, this is 10x growth our product might need to
to around 2M we see created today, this is 10x growth our product might need to
@@ -41,7 +41,7 @@ With pagination, the data is split into equal pieces (pages). On the first visit
...
@@ -41,7 +41,7 @@ With pagination, the data is split into equal pieces (pages). On the first visit
### Pick the right approach
### Pick the right approach
Let the database handle the pagination, filtering, and data retrieval. Implementing in-memory pagination on the backend (`paginate_array` from kaminari) or on the frontend (JavaScript) might work for a few hundreds of records. If application limits are not defined, things can get out of control quickly.
Let the database handle the pagination, filtering, and data retrieval. Implementing in-memory pagination on the backend (`paginate_array` from Kaminari) or on the frontend (JavaScript) might work for a few hundreds of records. If application limits are not defined, things can get out of control quickly.
### Reduce complexity
### Reduce complexity
...
@@ -78,7 +78,7 @@ Infinite scroll can use keyset pagination without affecting the user experience
...
@@ -78,7 +78,7 @@ Infinite scroll can use keyset pagination without affecting the user experience
### Offset pagination
### Offset pagination
The most common way to paginate lists is using offset-based pagination (UI and REST API). It's backed by the popular [kaminari](https://github.com/kaminari/kaminari) Ruby gem, which provides convenient helper methods to implement pagination on ActiveRecord queries.
The most common way to paginate lists is using offset-based pagination (UI and REST API). It's backed by the popular [Kaminari](https://github.com/kaminari/kaminari) Ruby gem, which provides convenient helper methods to implement pagination on ActiveRecord queries.
Offset-based pagination is leveraging the `LIMIT` and `OFFSET` SQL clauses to take out a specific slice from the table.
Offset-based pagination is leveraging the `LIMIT` and `OFFSET` SQL clauses to take out a specific slice from the table.
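As a rough illustration of how a page number maps onto those two clauses, here is a minimal sketch. The helper name `offset_page_sql` and its defaults are hypothetical, and the string interpolation is for demonstration only: it is not how Kaminari or ActiveRecord build queries, and it is not safe for user input.

```ruby
# Hypothetical helper: translate a 1-based page number into LIMIT/OFFSET.
def offset_page_sql(table, page:, per_page: 20)
  raise ArgumentError, "page must be >= 1" if page < 1

  # Rows on earlier pages are skipped by the database on every request.
  offset = (page - 1) * per_page
  "SELECT * FROM #{table} ORDER BY id LIMIT #{per_page} OFFSET #{offset}"
end

puts offset_page_sql("issues", page: 3)
# SELECT * FROM issues ORDER BY id LIMIT 20 OFFSET 40
```

The offset grows with the page number, so later pages require the database to walk past more rows before returning the slice.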
...
@@ -97,9 +97,9 @@ Notice that the query also orders the rows by the primary key (`id`). When pagin
...
@@ -97,9 +97,9 @@ Notice that the query also orders the rows by the primary key (`id`). When pagin
Example pagination bar:
Example pagination bar:
![Page selector rendered by kaminari](../img/offset_pagination_ui_v13_11.jpg)
![Page selector rendered by Kaminari](../img/offset_pagination_ui_v13_11.jpg)
The kaminari gem renders a nice pagination bar on the UI with page numbers and, optionally, shortcut buttons for the next, previous, first, and last pages. To render these buttons, kaminari needs to know the number of rows, and for that, a count query is executed.
The Kaminari gem renders a nice pagination bar on the UI with page numbers and, optionally, shortcut buttons for the next, previous, first, and last pages. To render these buttons, Kaminari needs to know the number of rows, and for that, a count query is executed.
```sql
```sql
SELECT COUNT(*) FROM issues WHERE project_id = 1
SELECT COUNT(*) FROM issues WHERE project_id = 1
...
@@ -158,7 +158,7 @@ Here we're leveraging the ordered property of the b-tree database index. Values
...
@@ -158,7 +158,7 @@ Here we're leveraging the ordered property of the b-tree database index. Values
Kaminari by default executes a count query to determine the number of pages for rendering the page links. Count queries can be quite expensive for a large table, in an unfortunate scenario the queries will simply time out.
Kaminari by default executes a count query to determine the number of pages for rendering the page links. Count queries can be quite expensive for a large table, in an unfortunate scenario the queries will simply time out.
To work around this, we can run kaminari without invoking the count SQL query.
To work around this, we can run Kaminari without invoking the count SQL query.
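One common way to paginate without a count query is to fetch one row more than the page size and use the extra row only as a signal that another page exists. The sketch below does not call Kaminari; `RECORDS` and `page_without_count` are hypothetical stand-ins that model the idea on an in-memory array.

```ruby
# Hypothetical in-memory stand-in for a table ordered by id.
RECORDS = (1..45).to_a

# Fetch per_page + 1 rows. The extra row is never displayed; its presence
# only tells us whether a "next" link is needed, so no COUNT(*) is required.
def page_without_count(records, page:, per_page: 20)
  offset = (page - 1) * per_page
  rows = records[offset, per_page + 1] || []
  { rows: rows.first(per_page), next_page: rows.size > per_page }
end

p page_without_count(RECORDS, page: 1)[:next_page] # true
p page_without_count(RECORDS, page: 3)[:next_page] # false
```

The trade-off is that the total number of pages is unknown, so only "previous" and "next" links can be rendered.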
@@ -311,5 +311,5 @@ Using keyset pagination outside of GraphQL is not straightforward. We have the l
...
@@ -311,5 +311,5 @@ Using keyset pagination outside of GraphQL is not straightforward. We have the l
Keyset pagination provides stable performance regardless of the number of pages we moved forward. To achieve this performance, the paginated query needs an index that covers all the columns in the `ORDER BY` clause, similarly to the offset pagination.
Keyset pagination provides stable performance regardless of the number of pages we moved forward. To achieve this performance, the paginated query needs an index that covers all the columns in the `ORDER BY` clause, similarly to the offset pagination.
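The reason for the stable performance can be sketched with an in-memory model. Here a sorted array stands in for the index, and a binary search stands in for the index seek; `IDS` and `keyset_page` are hypothetical names used only for this illustration.

```ruby
# Hypothetical in-memory stand-in for an index ordered by id.
IDS = (1..100).to_a

# Keyset pagination: rather than skipping OFFSET rows, continue right after
# the last id seen (roughly WHERE id > after_id ORDER BY id LIMIT per_page).
def keyset_page(ids, after_id: 0, per_page: 20)
  # Binary search locates the starting position without scanning earlier rows.
  start = ids.bsearch_index { |id| id > after_id }
  start ? ids[start, per_page] : []
end

first_page = keyset_page(IDS)
second_page = keyset_page(IDS, after_id: first_page.last)
p second_page.first # 21
```

Because each request seeks directly to the position after the last seen value, the cost does not grow with the number of pages already visited.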
### General performance guidelines
### General performance guidelines
See the [pagination general performance guidelines page](pagination_performance_guidelines.md).
See the [pagination general performance guidelines page](pagination_performance_guidelines.md).