1. [PostgreSQL](database.md#postgresql-in-a-scaled-environment) with [PGBouncer](https://docs.gitlab.com/ee/administration/high_availability/pgbouncer.html)
1. [Redis](redis.md#redis-in-a-scaled-environment)
1. [Gitaly](gitaly.md) (recommended) and/or [NFS](nfs.md)[^4]
1. [GitLab application nodes](gitlab.md)
   - With [Object Storage service enabled](../gitaly/index.md#eliminating-nfs-altogether)[^3]
1. [Load Balancer(s)](load_balancer.md)[^2]
1. [Monitoring node (Prometheus and Grafana)](monitoring_node.md)

### Full Scaling
In a fully-scaled architecture, the application node is split into separate Sidekiq and Unicorn/Workhorse nodes. One indication that this architecture is required is if Sidekiq queues begin to periodically increase in size, indicating that there is contention or there are not enough resources.
- 1 or more PostgreSQL nodes
- 1 or more Redis nodes
- 1 or more Gitaly storage servers
- 1 or more Object Storage services[^3] and/or NFS storage servers[^4]
- 2 or more Sidekiq nodes
This setup forms the basis of the GitLab.com architecture. While it scales well, it also comes with the added complexity of many more nodes to configure, manage, and monitor.
- 3 PostgreSQL nodes
- 1 or more PgBouncer nodes (with associated internal load balancers)
- 4 or more Redis nodes (2 separate clusters for persistent and cache data)
- 3 Consul nodes
- 3 Sentinel nodes
The recommended configuration for PostgreSQL HA requires:
- `repmgrd` - A service to monitor and handle failover in case of a failure
- `Consul` agent - Used for service discovery, to alert other nodes when failover occurs
- A minimum of three `Consul` server nodes
- A minimum of one `pgbouncer` service node, but it's recommended to have one per database node
- An internal load balancer (TCP) is required when there is more than one `pgbouncer` service node
You also need to take into consideration the underlying network topology, making sure you have redundant connectivity between all Database and GitLab instances, otherwise the networks will become a single point of failure.
Database nodes run two services with PostgreSQL:

- Repmgrd. Monitors the cluster and handles failover when issues with the master occur. On failure, the old master node is automatically evicted from the cluster, and should be rejoined manually once recovered.
- Consul. Monitors the status of each node in the database cluster and tracks its health in a service definition on the Consul cluster.

Alongside each PgBouncer, there is a Consul agent that watches the status of the PostgreSQL service. If that status changes, Consul runs a script which updates the configuration and reloads PgBouncer.
##### Connection flow
Each service in the package comes with a set of [default ports](https://docs.gitlab.com/omnibus/package-information/defaults.html#ports). You may need to make specific firewall rules for the connections listed below (a sketch of example rules follows the list):

- Application servers connect to either PgBouncer directly via its [default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#pgbouncer) or via a configured Internal Load Balancer (TCP) that serves multiple PgBouncers.
- PgBouncer connects to the primary database server's [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
- Repmgr connects to the database servers' [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
- Postgres secondaries connect to the primary database server's [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
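As a sketch of what such rules might look like, assuming the documented defaults (PgBouncer on `6432`, PostgreSQL on `5432`), the example `10.6.0.0/16` network used later in this document, and plain `iptables` (your tooling, ports, and source ranges may differ):

```sh
# Hypothetical examples; tighten the source ranges to your actual topology.
# On a PgBouncer node: allow application servers to reach PgBouncer (6432).
iptables -A INPUT -p tcp -s 10.6.0.0/16 --dport 6432 -j ACCEPT
# On a database node: allow PgBouncer, Repmgr, and secondaries to reach
# PostgreSQL (5432).
iptables -A INPUT -p tcp -s 10.6.0.0/16 --dport 5432 -j ACCEPT
```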
Here we'll show you some fully expanded example configurations.
##### Example recommended setup

This example uses 3 Consul servers, 3 PgBouncer servers (with an associated internal load balancer), 3 PostgreSQL servers, and 1 application node.

We start with all servers on the same 10.6.0.0/16 private network range; they can connect to each other freely on those addresses.
Here is a list and description of each machine and the assigned IP:
- `10.6.0.11`: Consul 1
- `10.6.0.12`: Consul 2
- `10.6.0.13`: Consul 3
- `10.6.0.20`: Internal Load Balancer
- `10.6.0.21`: PgBouncer 1
- `10.6.0.22`: PgBouncer 2
- `10.6.0.23`: PgBouncer 3
- `10.6.0.31`: PostgreSQL master
- `10.6.0.32`: PostgreSQL secondary
- `10.6.0.33`: PostgreSQL secondary
- `10.6.0.41`: GitLab application

All passwords are set to `toomanysecrets`; please do not use this password or derived hashes. The `external_url` for GitLab is `http://gitlab.example.com`.
Please note that after the initial configuration, if a failover occurs, the PostgreSQL master will change to one of the available secondaries until it is failed back.
##### Example recommended setup for Consul servers

On each server edit `/etc/gitlab/gitlab.rb`:

```ruby
consul['configuration'] = {
  server: true,
  retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
consul['monitoring_service_discovery'] = true
```
[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
##### Example recommended setup for PgBouncer servers

On each server edit `/etc/gitlab/gitlab.rb`:

```ruby
# Disable all components except PgBouncer and Consul agent
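# The remaining settings below are a sketch of the typical shape of this
# configuration; attribute names are assumed from the bundled
# `pgbouncer_role` and should be verified against your version's docs.
roles ['pgbouncer_role']

# Configure PgBouncer and its admin users
pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
pgbouncer['users'] = {
  'gitlab-consul': {
    # Hash generated with: gitlab-ctl pg-password-md5 gitlab-consul
    password: 'GITLAB_CONSUL_PASSWORD_HASH'
  },
  'pgbouncer': {
    # Hash generated with: gitlab-ctl pg-password-md5 pgbouncer
    password: 'PGBOUNCER_PASSWORD_HASH'
  }
}

# Configure the Consul agent to watch PostgreSQL and join the Consul servers
consul['watchers'] = %w(postgresql)
consul['configuration'] = {
  retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
}
```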
[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
##### Internal load balancer setup

An internal load balancer (TCP) must then be set up to serve each PgBouncer node (in this example on the IP `10.6.0.20`). An example of how to do this can be found in the [PgBouncer Configure Internal Load Balancer](pgbouncer.md#configure-the-internal-load-balancer) section.
##### Example recommended setup for PostgreSQL servers

After deploying the configuration, follow these steps:
1. On `10.6.0.31`, our primary database

   Enable the `pg_trgm` extension:
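   To open a database console, a sketch using the bundled `gitlab-psql` wrapper (invocation assumed from the Omnibus install; adjust to your setup):

   ```sh
   gitlab-psql -d gitlabhq_production
   ```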
   Then, in the database console:

   ```sql
   CREATE EXTENSION pg_trgm;
   ```
1. On `10.6.0.32`, our first standby database

   Make this node a standby of the primary:
   ```sh
   gitlab-ctl repmgr standby setup 10.6.0.31
   ```
1. On `10.6.0.33`, our second standby database

   Make this node a standby of the primary:
   ```sh
   gitlab-ctl repmgr standby setup 10.6.0.31
   ```
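   At this point you can sanity-check the cluster from any database node. A sketch using the bundled repmgr integration (subcommand assumed; verify with `gitlab-ctl repmgr --help` on your version):

   ```sh
   gitlab-ctl repmgr cluster show
   ```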
1. On `10.6.0.41`, our application server

   Set the `gitlab-consul` user's PgBouncer password to `toomanysecrets`:
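   A sketch of how this can be done with the bundled helper (flags assumed; check `gitlab-ctl write-pgpass --help` on your install):

   ```sh
   gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer \
     --user pgbouncer --hostuser gitlab-consul
   # When prompted, enter the password (toomanysecrets in this example)
   ```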
#### Example minimal setup

This example uses 3 PostgreSQL servers and 1 application node (with PgBouncer set up alongside).

It differs from the [recommended setup](#example-recommended-setup) by moving the Consul servers into the same servers we use for PostgreSQL. The trade-off is a reduced server count, at the cost of the increased operational complexity of handling PostgreSQL [failover](#failover-procedure) and [restore](#restore-procedure) procedures, in addition to [Consul outage recovery](consul.md#outage-recovery), on the same set of machines.
As part of its High Availability stack, GitLab Premium includes a bundled version of [PgBouncer](https://pgbouncer.github.io/) that can be managed through `/etc/gitlab/gitlab.rb`. PgBouncer is used to seamlessly migrate database connections between servers in a failover scenario. Additionally, it can be used in a non-HA setup to pool connections, speeding up response time while reducing resource usage.

In an HA setup, it's recommended to run a PgBouncer node separately for each database node, with an internal load balancer (TCP) serving each accordingly.
## Operations
1. Make sure you collect [`CONSUL_SERVER_NODES`](database.md#consul-information), [`CONSUL_PASSWORD_HASH`](database.md#consul-information), and [`PGBOUNCER_PASSWORD_HASH`](database.md#pgbouncer-information) before executing the next step.
1. On each node, edit the `/etc/gitlab/gitlab.rb` config file and replace the values noted in the `# START user configuration` section as below:
   ```ruby
   # Disable all components except PgBouncer and Consul agent
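   # The settings below sketch the typical shape of this configuration;
   # attribute names are assumed from the bundled `pgbouncer_role` and
   # should be verified against your version's documentation.
   roles ['pgbouncer_role']

   # Configure PgBouncer
   pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)

   # Configure the Consul agent to watch PostgreSQL
   consul['watchers'] = %w(postgresql)

   # START user configuration
   # Replace CONSUL_PASSWORD_HASH and PGBOUNCER_PASSWORD_HASH with the values
   # collected in the previous step, and CONSUL_SERVER_NODES with the IPs of
   # your Consul servers.
   pgbouncer['users'] = {
     'gitlab-consul': {
       password: 'CONSUL_PASSWORD_HASH'
     },
     'pgbouncer': {
       password: 'PGBOUNCER_PASSWORD_HASH'
     }
   }

   consul['configuration'] = {
     retry_join: %w(CONSUL_SERVER_NODES)
   }
   # END user configuration
   ```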
#### PgBouncer Checkpoint

1. Ensure each node is talking to the current master:
   ```sh
   gitlab-ctl pgb-console # You will be prompted for PGBOUNCER_PASSWORD
   ```
   Once connected, the listing of configured databases in the console (for example, from `show databases ;`) should end with:

   ```
   (2 rows)
   ```
#### Configure the internal load balancer

If you're running more than one PgBouncer node as recommended, then you'll need to set up a TCP internal load balancer to serve each correctly. This can be done with any reputable TCP load balancer.

As an example, here's how you could do it with [HAProxy](https://www.haproxy.org/):
```
global
    log /dev/log local0
    log localhost local1 notice
    log stdout format raw local0

defaults
    log global
    default-server inter 10s fall 3 rise 2
    balance leastconn

frontend internal-pgbouncer-tcp-in
    bind *:6432
    mode tcp
    option tcplog

    default_backend pgbouncer

backend pgbouncer
    mode tcp
    option tcp-check

    server pgbouncer1 <ip>:6432 check
    server pgbouncer2 <ip>:6432 check
    server pgbouncer3 <ip>:6432 check
```
Refer to your preferred load balancer's documentation for further guidance.
### Running PgBouncer as part of a non-HA GitLab installation

1. Generate `PGBOUNCER_USER_PASSWORD_HASH` with the command `gitlab-ctl pg-password-md5 pgbouncer`
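   For illustration, the interaction is assumed to look roughly like the following (the helper prompts for the password and prints the hash; exact prompts may vary by version):

   ```sh
   gitlab-ctl pg-password-md5 pgbouncer
   # Enter password:
   # Confirm password:
   # <PGBOUNCER_USER_PASSWORD_HASH is printed here>
   ```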