Commit 3948f4d2 authored by Achilleas Pipinellis's avatar Achilleas Pipinellis

Merge branch 'docs-reference-architecture-patroni' into 'master'

Update reference architecture docs to use Patroni

See merge request gitlab-org/gitlab!50573
parents 131d7f0b e5abd2fc
......@@ -422,9 +422,9 @@ install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
#### PostgreSQL primary node
#### PostgreSQL nodes
1. SSH in to the PostgreSQL primary node.
1. SSH in to one of the PostgreSQL nodes.
1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
username of `gitlab` (recommended). The command will request a password
and confirmation. Use the value that is output by this command in the next
......@@ -452,23 +452,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
sudo gitlab-ctl pg-password-md5 gitlab-consul
```
1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
# Disable all components except PostgreSQL, Patroni, and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Enable Patroni
patroni['enable'] = true
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
patroni['postgresql']['max_wal_senders'] = 4
patroni['postgresql']['max_replication_slots'] = 4
# Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
patroni['postgresql']['max_connections'] = 500
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['enable'] = true
consul['services'] = %w(postgresql)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# START user configuration
# Please set the real values as explained in Required Information section
......@@ -477,18 +487,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
......@@ -503,70 +504,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# END user configuration
```
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
add the file from your Consul server to this server.
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
#### PostgreSQL secondary nodes
1. On both the secondary nodes, add the same configuration specified above for the primary node
with an additional setting (`repmgr['master_on_initialization'] = false`) that will inform `gitlab-ctl` that they are standby nodes initially
and there's no need to attempt to register them as a primary node:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['services'] = %w(postgresql)
# Specify if a node should attempt to be primary on initialization.
repmgr['master_on_initialization'] = false
# Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace with your network addresses
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
```
PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
Like most failover handling methods, this has a small chance of leading to data loss.
Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
......@@ -601,84 +541,25 @@ SSH in to the **primary node**:
1. Exit the database prompt by typing `\q` and Enter.
1. Verify the cluster is initialized with one node:
1. Check the status of the leader and cluster:
```shell
gitlab-ctl repmgr cluster show
gitlab-ctl patroni members
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+----------|----------|----------------------------------------
* master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
| Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart |
|---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
| postgresql-ha | <PostgreSQL primary hostname> | 10.6.0.21 | Leader | running | 175 | | * |
| postgresql-ha | <PostgreSQL secondary 1 hostname> | 10.6.0.22 | | running | 175 | 0 | * |
| postgresql-ha | <PostgreSQL secondary 2 hostname> | 10.6.0.23 | | running | 175 | 0 | * |
```
1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
refer to the hostname in the next section as `<primary_node_name>`. If the value
is not an IP address, it will need to be a resolvable name (via DNS or
`/etc/hosts`)
SSH in to the **secondary node**:
1. Set up the repmgr standby:
```shell
gitlab-ctl repmgr standby setup <primary_node_name>
```
Do note that this will remove the existing data on the node. The command
has a wait time.
The output should be similar to the following:
```console
Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
If this is not what you want, hit Ctrl-C now to exit
To skip waiting, rerun with the -w option
Sleeping for 30 seconds
Stopping the database
Removing the data
Cloning the data
Starting the database
Registering the node with the cluster
ok: run: repmgrd: (pid 19068) 0s
```
Before moving on, make sure the databases are configured correctly. Run the
following command on the **primary** node to verify that replication is working
properly and the secondary nodes appear in the cluster:
```shell
gitlab-ctl repmgr cluster show
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+---------|-----------|------------------------------------------------
* master | MASTER | | host=<primary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
```
If the 'Role' column for any node says "FAILED", check the
If the 'State' column for any node doesn't say "running", check the
[Troubleshooting section](troubleshooting.md) before proceeding.
Also, check that the `repmgr-check-master` command works successfully on each node:
```shell
su - gitlab-consul
gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
```
This command relies on exit codes to tell Consul whether a particular node is a master
or secondary. The most important thing here is that this command does not produce errors.
If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
Check the [Troubleshooting section](troubleshooting.md) before proceeding.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
......@@ -696,7 +577,7 @@ The following IPs will be used as an example:
1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
`<consul_password_hash>` and `<pgbouncer_password_hash>` with the
password hashes you [set up previously](#postgresql-primary-node):
password hashes you [set up previously](#postgresql-nodes):
```ruby
# Disable all components except Pgbouncer and Consul agent
......@@ -704,15 +585,16 @@ The following IPs will be used as an example:
# Configure PgBouncer
pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
pgbouncer['users'] = {
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
}
# Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
pgbouncer['max_db_connections'] = 150
# Configure Consul agent
consul['watchers'] = %w(postgresql)
......
......@@ -422,9 +422,9 @@ install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
#### PostgreSQL primary node
#### PostgreSQL nodes
1. SSH in to the PostgreSQL primary node.
1. SSH in to one of the PostgreSQL nodes.
1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
username of `gitlab` (recommended). The command will request a password
and confirmation. Use the value that is output by this command in the next
......@@ -452,23 +452,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
sudo gitlab-ctl pg-password-md5 gitlab-consul
```
1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
# Disable all components except PostgreSQL, Patroni, and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Enable Patroni
patroni['enable'] = true
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
patroni['postgresql']['max_wal_senders'] = 4
patroni['postgresql']['max_replication_slots'] = 4
# Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
patroni['postgresql']['max_connections'] = 500
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['enable'] = true
consul['services'] = %w(postgresql)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# START user configuration
# Please set the real values as explained in Required Information section
......@@ -477,18 +487,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
......@@ -503,70 +504,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# END user configuration
```
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
add the file from your Consul server to this server.
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
#### PostgreSQL secondary nodes
1. On both the secondary nodes, add the same configuration specified above for the primary node
with an additional setting (`repmgr['master_on_initialization'] = false`) that will inform `gitlab-ctl` that they are standby nodes initially
and there's no need to attempt to register them as a primary node:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['services'] = %w(postgresql)
# Specify if a node should attempt to be primary on initialization.
repmgr['master_on_initialization'] = false
# Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace with your network addresses
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
```
PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
Like most failover handling methods, this has a small chance of leading to data loss.
Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
......@@ -601,84 +541,25 @@ SSH in to the **primary node**:
1. Exit the database prompt by typing `\q` and Enter.
1. Verify the cluster is initialized with one node:
1. Check the status of the leader and cluster:
```shell
gitlab-ctl repmgr cluster show
gitlab-ctl patroni members
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+----------|----------|----------------------------------------
* master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
| Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart |
|---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
| postgresql-ha | <PostgreSQL primary hostname> | 10.6.0.21 | Leader | running | 175 | | * |
| postgresql-ha | <PostgreSQL secondary 1 hostname> | 10.6.0.22 | | running | 175 | 0 | * |
| postgresql-ha | <PostgreSQL secondary 2 hostname> | 10.6.0.23 | | running | 175 | 0 | * |
```
1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
refer to the hostname in the next section as `<primary_node_name>`. If the value
is not an IP address, it will need to be a resolvable name (via DNS or
`/etc/hosts`)
SSH in to the **secondary node**:
1. Set up the repmgr standby:
```shell
gitlab-ctl repmgr standby setup <primary_node_name>
```
Do note that this will remove the existing data on the node. The command
has a wait time.
The output should be similar to the following:
```console
Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
If this is not what you want, hit Ctrl-C now to exit
To skip waiting, rerun with the -w option
Sleeping for 30 seconds
Stopping the database
Removing the data
Cloning the data
Starting the database
Registering the node with the cluster
ok: run: repmgrd: (pid 19068) 0s
```
Before moving on, make sure the databases are configured correctly. Run the
following command on the **primary** node to verify that replication is working
properly and the secondary nodes appear in the cluster:
```shell
gitlab-ctl repmgr cluster show
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+---------|-----------|------------------------------------------------
* master | MASTER | | host=<primary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
```
If the 'Role' column for any node says "FAILED", check the
If the 'State' column for any node doesn't say "running", check the
[Troubleshooting section](troubleshooting.md) before proceeding.
Also, check that the `repmgr-check-master` command works successfully on each node:
```shell
su - gitlab-consul
gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
```
This command relies on exit codes to tell Consul whether a particular node is a master
or secondary. The most important thing here is that this command does not produce errors.
If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
Check the [Troubleshooting section](troubleshooting.md) before proceeding.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
......@@ -696,7 +577,7 @@ The following IPs will be used as an example:
1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
`<consul_password_hash>` and `<pgbouncer_password_hash>` with the
password hashes you [set up previously](#postgresql-primary-node):
password hashes you [set up previously](#postgresql-nodes):
```ruby
# Disable all components except Pgbouncer and Consul agent
......@@ -704,15 +585,16 @@ The following IPs will be used as an example:
# Configure PgBouncer
pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
pgbouncer['users'] = {
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
}
# Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
pgbouncer['max_db_connections'] = 150
# Configure Consul agent
consul['watchers'] = %w(postgresql)
......
......@@ -271,7 +271,7 @@ further configuration steps.
```ruby
# Disable all components except PostgreSQL
roles ['postgres_role']
repmgr['enable'] = false
patroni['enable'] = false
consul['enable'] = false
prometheus['enable'] = false
alertmanager['enable'] = false
......
......@@ -672,9 +672,9 @@ install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
#### PostgreSQL primary node
#### PostgreSQL nodes
1. SSH in to the PostgreSQL primary node.
1. SSH in to one of the PostgreSQL nodes.
1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
username of `gitlab` (recommended). The command will request a password
and confirmation. Use the value that is output by this command in the next
......@@ -702,114 +702,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
sudo gitlab-ctl pg-password-md5 gitlab-consul
```
1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
# Disable all components except PostgreSQL, Patroni, and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['services'] = %w(postgresql)
# START user configuration
# Please set the real values as explained in Required Information section
#
# Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Enable Patroni
patroni['enable'] = true
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['enable'] = true
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
postgres_exporter['dbname'] = 'gitlabhq_production'
postgres_exporter['password'] = '<postgresql_password_hash>'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
#
# END user configuration
```
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
1. You can list the current PostgreSQL primary, secondary nodes status via:
```shell
sudo /opt/gitlab/bin/gitlab-ctl repmgr cluster show
```
1. Verify the GitLab services are running:
```shell
sudo gitlab-ctl status
```
The output should be similar to the following:
```plaintext
run: consul: (pid 30593) 77133s; run: log: (pid 29912) 77156s
run: logrotate: (pid 23449) 3341s; run: log: (pid 29794) 77175s
run: node-exporter: (pid 30613) 77133s; run: log: (pid 29824) 77170s
run: postgres-exporter: (pid 30620) 77132s; run: log: (pid 29894) 77163s
run: postgresql: (pid 30630) 77132s; run: log: (pid 29618) 77181s
run: repmgrd: (pid 30639) 77132s; run: log: (pid 29985) 77150s
```
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
#### PostgreSQL secondary nodes
1. On both the secondary nodes, add the same configuration specified above for the primary node
with an additional setting that will inform `gitlab-ctl` that they are standby nodes initially
and there's no need to attempt to register them as a primary node:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
patroni['postgresql']['max_wal_senders'] = 4
patroni['postgresql']['max_replication_slots'] = 4
# Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
patroni['postgresql']['max_connections'] = 500
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['enable'] = true
consul['services'] = %w(postgresql)
# Specify if a node should attempt to be primary on initialization.
repmgr['master_on_initialization'] = false
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# START user configuration
# Please set the real values as explained in Required Information section
......@@ -818,34 +737,31 @@ in the second step, do not supply the `EXTERNAL_URL` value.
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['enable'] = true
consul['monitoring_service_discovery'] = true
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
postgres_exporter['dbname'] = 'gitlabhq_production'
postgres_exporter['password'] = '<postgresql_password_hash>'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
#
# END user configuration
```
PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
Like most failover handling methods, this has a small chance of leading to data loss.
Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
add the file from your Consul server to this server.
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
Advanced [configuration options](https://docs.gitlab.com/omnibus/settings/database.html)
......@@ -876,84 +792,25 @@ SSH in to the **primary node**:
1. Exit the database prompt by typing `\q` and Enter.
1. Verify the cluster is initialized with one node:
1. Check the status of the leader and cluster:
```shell
gitlab-ctl repmgr cluster show
gitlab-ctl patroni members
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+----------|----------|----------------------------------------
* master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
| Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart |
|---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
| postgresql-ha | <PostgreSQL primary hostname> | 10.6.0.31 | Leader | running | 175 | | * |
| postgresql-ha | <PostgreSQL secondary 1 hostname> | 10.6.0.32 | | running | 175 | 0 | * |
| postgresql-ha | <PostgreSQL secondary 2 hostname> | 10.6.0.33 | | running | 175 | 0 | * |
```
1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
refer to the hostname in the next section as `<primary_node_name>`. If the value
is not an IP address, it will need to be a resolvable name (via DNS or
`/etc/hosts`)
SSH in to the **secondary node**:
1. Set up the repmgr standby:
```shell
gitlab-ctl repmgr standby setup <primary_node_name>
```
Do note that this will remove the existing data on the node. The command
has a wait time.
The output should be similar to the following:
```console
Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
If this is not what you want, hit Ctrl-C now to exit
To skip waiting, rerun with the -w option
Sleeping for 30 seconds
Stopping the database
Removing the data
Cloning the data
Starting the database
Registering the node with the cluster
ok: run: repmgrd: (pid 19068) 0s
```
Before moving on, make sure the databases are configured correctly. Run the
following command on the **primary** node to verify that replication is working
properly and the secondary nodes appear in the cluster:
```shell
gitlab-ctl repmgr cluster show
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+---------|-----------|------------------------------------------------
* master | MASTER | | host=<primary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
```
If the 'Role' column for any node says "FAILED", check the
If the 'State' column for any node doesn't say "running", check the
[Troubleshooting section](troubleshooting.md) before proceeding.
Also, check that the `repmgr-check-master` command works successfully on each node:
```shell
su - gitlab-consul
gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
```
This command relies on exit codes to tell Consul whether a particular node is a master
or secondary. The most important thing here is that this command does not produce errors.
If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
Check the [Troubleshooting section](troubleshooting.md) before proceeding.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
......@@ -971,7 +828,7 @@ The following IPs will be used as an example:
1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
`<consul_password_hash>` and `<pgbouncer_password_hash>` with the
password hashes you [set up previously](#postgresql-primary-node):
password hashes you [set up previously](#postgresql-nodes):
```ruby
# Disable all components except Pgbouncer and Consul agent
......@@ -979,15 +836,16 @@ The following IPs will be used as an example:
# Configure PgBouncer
pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
pgbouncer['users'] = {
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
}
# Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
pgbouncer['max_db_connections'] = 150
# Configure Consul agent
consul['watchers'] = %w(postgresql)
......
......@@ -422,9 +422,9 @@ install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
#### PostgreSQL primary node
#### PostgreSQL nodes
1. SSH in to the PostgreSQL primary node.
1. SSH in to one of the PostgreSQL nodes.
1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
username of `gitlab` (recommended). The command will request a password
and confirmation. Use the value that is output by this command in the next
......@@ -452,23 +452,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
sudo gitlab-ctl pg-password-md5 gitlab-consul
```
1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
# Disable all components except PostgreSQL, Patroni, and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Enable Patroni
patroni['enable'] = true
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
patroni['postgresql']['max_wal_senders'] = 4
patroni['postgresql']['max_replication_slots'] = 4
# Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
patroni['postgresql']['max_connections'] = 500
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['enable'] = true
consul['services'] = %w(postgresql)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# START user configuration
# Please set the real values as explained in Required Information section
......@@ -477,18 +487,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
......@@ -503,70 +504,9 @@ in the second step, do not supply the `EXTERNAL_URL` value.
# END user configuration
```
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
add the file from your Consul server to this server.
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
#### PostgreSQL secondary nodes
1. On both the secondary nodes, add the same configuration specified above for the primary node
with an additional setting (`repmgr['master_on_initialization'] = false`) that will inform `gitlab-ctl` that they are standby nodes initially
and there's no need to attempt to register them as a primary node:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['services'] = %w(postgresql)
# Specify if a node should attempt to be primary on initialization.
repmgr['master_on_initialization'] = false
# Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace with your network addresses
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
```
PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
Like most failover handling methods, this has a small chance of leading to data loss.
Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
......@@ -601,84 +541,25 @@ SSH in to the **primary node**:
1. Exit the database prompt by typing `\q` and Enter.
1. Verify the cluster is initialized with one node:
1. Check the status of the leader and cluster:
```shell
gitlab-ctl repmgr cluster show
gitlab-ctl patroni members
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+----------|----------|----------------------------------------
* master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
| Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart |
|---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
| postgresql-ha | <PostgreSQL primary hostname> | 10.6.0.21 | Leader | running | 175 | | * |
| postgresql-ha | <PostgreSQL secondary 1 hostname> | 10.6.0.22 | | running | 175 | 0 | * |
| postgresql-ha | <PostgreSQL secondary 2 hostname> | 10.6.0.23 | | running | 175 | 0 | * |
```
1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
refer to the hostname in the next section as `<primary_node_name>`. If the value
is not an IP address, it will need to be a resolvable name (via DNS or
`/etc/hosts`)
SSH in to the **secondary node**:
1. Set up the repmgr standby:
```shell
gitlab-ctl repmgr standby setup <primary_node_name>
```
Do note that this will remove the existing data on the node. The command
has a wait time.
The output should be similar to the following:
```console
Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
If this is not what you want, hit Ctrl-C now to exit
To skip waiting, rerun with the -w option
Sleeping for 30 seconds
Stopping the database
Removing the data
Cloning the data
Starting the database
Registering the node with the cluster
ok: run: repmgrd: (pid 19068) 0s
```
Before moving on, make sure the databases are configured correctly. Run the
following command on the **primary** node to verify that replication is working
properly and the secondary nodes appear in the cluster:
```shell
gitlab-ctl repmgr cluster show
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+---------|-----------|------------------------------------------------
* master | MASTER | | host=<primary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
```
If the 'Role' column for any node says "FAILED", check the
If the 'State' column for any node doesn't say "running", check the
[Troubleshooting section](troubleshooting.md) before proceeding.
Also, check that the `repmgr-check-master` command works successfully on each node:
```shell
su - gitlab-consul
gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
```
This command relies on exit codes to tell Consul whether a particular node is a master
or secondary. The most important thing here is that this command does not produce errors.
If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
Check the [Troubleshooting section](troubleshooting.md) before proceeding.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
......@@ -696,7 +577,7 @@ The following IPs will be used as an example:
1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
`<consul_password_hash>` and `<pgbouncer_password_hash>` with the
password hashes you [set up previously](#postgresql-primary-node):
password hashes you [set up previously](#postgresql-nodes):
```ruby
# Disable all components except Pgbouncer and Consul agent
......@@ -704,15 +585,16 @@ The following IPs will be used as an example:
# Configure PgBouncer
pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
pgbouncer['users'] = {
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
}
# Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
pgbouncer['max_db_connections'] = 150
# Configure Consul agent
consul['watchers'] = %w(postgresql)
......
......@@ -671,9 +671,9 @@ install the necessary dependencies from step 1, and add the
GitLab package repository from step 2. When installing GitLab
in the second step, do not supply the `EXTERNAL_URL` value.
#### PostgreSQL primary node
#### PostgreSQL nodes
1. SSH in to the PostgreSQL primary node.
1. SSH in to one of the PostgreSQL nodes.
1. Generate a password hash for the PostgreSQL username/password pair. This assumes you will use the default
username of `gitlab` (recommended). The command will request a password
and confirmation. Use the value that is output by this command in the next
......@@ -701,114 +701,33 @@ in the second step, do not supply the `EXTERNAL_URL` value.
sudo gitlab-ctl pg-password-md5 gitlab-consul
```
1. On the primary database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
1. On every database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
# Disable all components except PostgreSQL, Patroni, and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['services'] = %w(postgresql)
# START user configuration
# Please set the real values as explained in Required Information section
#
# Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Enable Patroni
patroni['enable'] = true
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['enable'] = true
consul['monitoring_service_discovery'] = true
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
postgres_exporter['dbname'] = 'gitlabhq_production'
postgres_exporter['password'] = '<postgresql_password_hash>'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
#
# END user configuration
```
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
1. You can list the current PostgreSQL primary, secondary nodes status via:
```shell
sudo /opt/gitlab/bin/gitlab-ctl repmgr cluster show
```
1. Verify the GitLab services are running:
```shell
sudo gitlab-ctl status
```
The output should be similar to the following:
```plaintext
run: consul: (pid 30593) 77133s; run: log: (pid 29912) 77156s
run: logrotate: (pid 23449) 3341s; run: log: (pid 29794) 77175s
run: node-exporter: (pid 30613) 77133s; run: log: (pid 29824) 77170s
run: postgres-exporter: (pid 30620) 77132s; run: log: (pid 29894) 77163s
run: postgresql: (pid 30630) 77132s; run: log: (pid 29618) 77181s
run: repmgrd: (pid 30639) 77132s; run: log: (pid 29985) 77150s
```
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
</a>
</div>
#### PostgreSQL secondary nodes
1. On both the secondary nodes, add the same configuration specified above for the primary node
with an additional setting that will inform `gitlab-ctl` that they are standby nodes initially
and there's no need to attempt to register them as a primary node:
```ruby
# Disable all components except PostgreSQL and Repmgr and Consul
roles ['postgres_role']
# PostgreSQL configuration
postgresql['listen_address'] = '0.0.0.0'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
patroni['postgresql']['max_wal_senders'] = 4
patroni['postgresql']['max_replication_slots'] = 4
# Incoming recommended value for max connections is 500. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
patroni['postgresql']['max_connections'] = 500
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
# Configure the Consul agent
consul['enable'] = true
consul['services'] = %w(postgresql)
# Specify if a node should attempt to be primary on initialization.
repmgr['master_on_initialization'] = false
## Enable service discovery for Prometheus
consul['monitoring_service_discovery'] = true
# START user configuration
# Please set the real values as explained in Required Information section
......@@ -817,34 +736,31 @@ in the second step, do not supply the `EXTERNAL_URL` value.
postgresql['pgbouncer_user_password'] = '<pgbouncer_password_hash>'
# Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
postgresql['sql_user_password'] = '<postgresql_password_hash>'
# Set `max_wal_senders` to one more than the number of database nodes in the cluster.
# This is used to prevent replication from using up all of the
# available database connections.
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
# Replace XXX.XXX.XXX.XXX/YY with Network Address
postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.6.0.0/24)
## Enable service discovery for Prometheus
consul['enable'] = true
consul['monitoring_service_discovery'] = true
postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24)
# Set the network addresses that the exporters will listen on for monitoring
node_exporter['listen_address'] = '0.0.0.0:9100'
postgres_exporter['listen_address'] = '0.0.0.0:9187'
postgres_exporter['dbname'] = 'gitlabhq_production'
postgres_exporter['password'] = '<postgresql_password_hash>'
## The IPs of the Consul server nodes
## You can also use FQDNs and intermix them with IPs
consul['configuration'] = {
retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13),
}
#
# END user configuration
```
PostgreSQL, with Patroni managing its failover, will default to use `pg_rewind` by default to handle conflicts.
Like most failover handling methods, this has a small chance of leading to data loss.
Learn more about the various [Patroni replication methods](../postgresql/replication_and_failover.md#selecting-the-appropriate-patroni-replication-method).
1. Copy the `/etc/gitlab/gitlab-secrets.json` file from your Consul server, and replace
the file of the same name on this server. If that file is not on this server,
add the file from your Consul server to this server.
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
Advanced [configuration options](https://docs.gitlab.com/omnibus/settings/database.html)
......@@ -874,84 +790,25 @@ SSH in to the **primary node**:
1. Exit the database prompt by typing `\q` and Enter.
1. Verify the cluster is initialized with one node:
1. Check the status of the leader and cluster:
```shell
gitlab-ctl repmgr cluster show
gitlab-ctl patroni members
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+----------|----------|----------------------------------------
* master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
| Cluster | Member | Host | Role | State | TL | Lag in MB | Pending restart |
|---------------|-----------------------------------|-----------|--------|---------|-----|-----------|-----------------|
| postgresql-ha | <PostgreSQL primary hostname> | 10.6.0.31 | Leader | running | 175 | | * |
| postgresql-ha | <PostgreSQL secondary 1 hostname> | 10.6.0.32 | | running | 175 | 0 | * |
| postgresql-ha | <PostgreSQL secondary 2 hostname> | 10.6.0.33 | | running | 175 | 0 | * |
```
1. Note down the hostname or IP address in the connection string: `host=HOSTNAME`. We will
refer to the hostname in the next section as `<primary_node_name>`. If the value
is not an IP address, it will need to be a resolvable name (via DNS or
`/etc/hosts`)
SSH in to the **secondary node**:
1. Set up the repmgr standby:
```shell
gitlab-ctl repmgr standby setup <primary_node_name>
```
Do note that this will remove the existing data on the node. The command
has a wait time.
The output should be similar to the following:
```console
Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
If this is not what you want, hit Ctrl-C now to exit
To skip waiting, rerun with the -w option
Sleeping for 30 seconds
Stopping the database
Removing the data
Cloning the data
Starting the database
Registering the node with the cluster
ok: run: repmgrd: (pid 19068) 0s
```
Before moving on, make sure the databases are configured correctly. Run the
following command on the **primary** node to verify that replication is working
properly and the secondary nodes appear in the cluster:
```shell
gitlab-ctl repmgr cluster show
```
The output should be similar to the following:
```plaintext
Role | Name | Upstream | Connection String
----------+---------|-----------|------------------------------------------------
* master | MASTER | | host=<primary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
standby | STANDBY | MASTER | host=<secondary_node_name> user=gitlab_repmgr dbname=gitlab_repmgr
```
If the 'Role' column for any node says "FAILED", check the
If the 'State' column for any node doesn't say "running", check the
[Troubleshooting section](troubleshooting.md) before proceeding.
Also, check that the `repmgr-check-master` command works successfully on each node:
```shell
su - gitlab-consul
gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
```
This command relies on exit codes to tell Consul whether a particular node is a master
or secondary. The most important thing here is that this command does not produce errors.
If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
Check the [Troubleshooting section](troubleshooting.md) before proceeding.
<div align="right">
<a type="button" class="btn btn-default" href="#setup-components">
Back to setup components <i class="fa fa-angle-double-up" aria-hidden="true"></i>
......@@ -969,7 +826,7 @@ The following IPs will be used as an example:
1. On each PgBouncer node, edit `/etc/gitlab/gitlab.rb`, and replace
`<consul_password_hash>` and `<pgbouncer_password_hash>` with the
password hashes you [set up previously](#postgresql-primary-node):
password hashes you [set up previously](#postgresql-nodes):
```ruby
# Disable all components except Pgbouncer and Consul agent
......@@ -977,15 +834,16 @@ The following IPs will be used as an example:
# Configure PgBouncer
pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
pgbouncer['users'] = {
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
'gitlab-consul': {
password: '<consul_password_hash>'
},
'pgbouncer': {
password: '<pgbouncer_password_hash>'
}
}
# Incoming recommended value for max db connections is 150. See https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5691.
pgbouncer['max_db_connections'] = 150
# Configure Consul agent
consul['watchers'] = %w(postgresql)
......
......@@ -514,39 +514,24 @@ See the suggested fix [in Geo documentation](../geo/replication/troubleshooting.
See the suggested fix [in Geo documentation](../geo/replication/troubleshooting.md#message-log--invalid-ip-mask-md5-name-or-service-not-known).
## Troubleshooting PostgreSQL
## Troubleshooting PostgreSQL with Patroni
In case you are experiencing any issues connecting through PgBouncer, the first place to check is always the logs:
In case you are experiencing any issues connecting through PgBouncer, the first place to check is always the logs for PostgreSQL (which is run through Patroni):
```shell
sudo gitlab-ctl tail postgresql
sudo gitlab-ctl tail patroni
```
### Consul and PostgreSQL changes not taking effect
### Consul and PostgreSQL with Patroni changes not taking effect
Due to the potential impacts, `gitlab-ctl reconfigure` only reloads Consul and PostgreSQL, it will not restart the services. However, not all changes can be activated by reloading.
To restart either service, run `gitlab-ctl restart SERVICE`
To restart either service, run `gitlab-ctl restart consul` or `gitlab-ctl restart patroni` respectively.
For PostgreSQL, it is usually safe to restart the master node by default. Automatic failover defaults to a 1 minute timeout. Provided the database returns before then, nothing else needs to be done. To be safe, you can stop `repmgrd` on the standby nodes first with `gitlab-ctl stop repmgrd`, then start afterwards with `gitlab-ctl start repmgrd`.
For PostgreSQL with Patroni, to prevent the primary node from being failed over automatically, it's safest to stop all secondaries first, then restart the primary and finally restart the secondaries again.
On the Consul server nodes, it is important to restart the Consul service in a controlled fashion. Read our [Consul documentation](../consul.md#restart-consul) for instructions on how to restart the service.
### `gitlab-ctl repmgr-check-master` command produces errors
If this command displays errors about database permissions it is likely that something failed during
install, resulting in the `gitlab-consul` database user getting incorrect permissions. Follow these
steps to fix the problem:
1. On the master database node, connect to the database prompt - `gitlab-psql -d template1`
1. Delete the `gitlab-consul` user - `DROP USER "gitlab-consul";`
1. Exit the database prompt - `\q`
1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) and the user will be re-added with the proper permissions.
1. Change to the `gitlab-consul` user - `su - gitlab-consul`
1. Try the check command again - `gitlab-ctl repmgr-check-master`.
Now there should not be errors. If errors still occur then there is another problem.
### PgBouncer error `ERROR: pgbouncer cannot connect to server`
You may get this error when running `gitlab-rake gitlab:db:configure` or you
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment