Commit 93e04700 authored by Stan Hu

Add environment variables to override backup/restore DB settings

In the latest versions of PostgreSQL, using `pg_dump` on a PgBouncer
connection can cause a full GitLab outage. This happens because
`pg_dump` clears the search path and explicitly sets the schema for
every SQL query it runs. When PgBouncer is used in transaction pooling
mode, these connection settings persist and cause queries made by Rails
to fail since the `public` schema is not searched.
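
The failure mode described above can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyServerConnection` and its methods are hypothetical, nothing here talks to PostgreSQL or PgBouncer, and name resolution is reduced to walking a list of schemas. It shows only that a session-level setting left on a shared server connection affects the next client that reuses it.

```ruby
# Toy model only: this class is hypothetical and exists purely to show
# session state leaking between clients that share one server connection.
class ToyServerConnection
  attr_accessor :search_path

  def initialize(tables_by_schema)
    @tables_by_schema = tables_by_schema
    @search_path = ['public'] # stand-in for the usual default search path
  end

  # Resolve an unqualified table name by walking the search path in order.
  def resolve(table)
    schema = search_path.find { |s| @tables_by_schema.fetch(s, []).include?(table) }
    raise KeyError, "relation \"#{table}\" does not exist" unless schema

    "#{schema}.#{table}"
  end
end

# One server connection handed to successive clients, as in transaction pooling.
shared = ToyServerConnection.new({ 'public' => ['users'] })

# A Rails-style unqualified lookup succeeds while the default path is in place.
puts shared.resolve('users')

# A pg_dump-style client empties the search path and the setting persists.
shared.search_path = []

# The next client's unqualified lookup now fails on the same connection.
begin
  shared.resolve('users')
rescue KeyError => e
  puts e.message
end
```

In the toy model, a schema-qualified name would still be found, which is why `pg_dump` itself keeps working while unqualified Rails queries do not.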

Currently there is no way to tell whether a connection is using
PgBouncer, so there is no way to prevent `pg_dump` from running.

To avoid causing an outage, we provide admins with a way to override the
database settings for the backup and restore task via environment
variables:

* `GITLAB_BACKUP_PGHOST`
* `GITLAB_BACKUP_PGUSER`
* `GITLAB_BACKUP_PGPORT`
* `GITLAB_BACKUP_PGPASSWORD`
* `GITLAB_BACKUP_PGSSLMODE`
* `GITLAB_BACKUP_PGSSLKEY`
* `GITLAB_BACKUP_PGSSLCERT`
* `GITLAB_BACKUP_PGSSLROOTCERT`
* `GITLAB_BACKUP_PGSSLCRL`
* `GITLAB_BACKUP_PGSSLCOMPRESSION`

Relates to:

* https://gitlab.com/gitlab-org/gitlab/-/issues/23211
* https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3470
parent dbcf6f37
---
title: Add environment variables to override backup/restore DB settings
merge_request: 45855
author:
type: added
@@ -940,9 +940,7 @@ message. Install the [correct GitLab version](https://packages.gitlab.com/gitlab
 and then try again.
 NOTE: **Note:**
-There is a known issue with restore not working with `pgbouncer`. The [workaround is to bypass
-`pgbouncer` and connect directly to the primary database node](../administration/postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer).
-[Read more about backup and restore with `pgbouncer`](#backup-and-restore-for-installations-using-pgbouncer).
+There is a known issue with restore not working with `pgbouncer`. [Read more about backup and restore with `pgbouncer`](#backup-and-restore-for-installations-using-pgbouncer).
 ### Restore for Docker image and GitLab Helm chart installations
@@ -1039,26 +1037,60 @@ practical use.
 ## Backup and restore for installations using PgBouncer
-PgBouncer can cause the following errors when performing backups and restores:
+Do NOT backup or restore GitLab through a PgBouncer connection. These
+tasks must [bypass PgBouncer and connect directly to the PostgreSQL primary database node](#bypassing-pgbouncer),
+or they will cause a GitLab outage.
+When the GitLab backup or restore task is used with PgBouncer, the
+following error message is shown:
 ```ruby
 ActiveRecord::StatementInvalid: PG::UndefinedTable
 ```
-There is a [known issue](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3470) for restore not working
-with `pgbouncer`.
+This happens because the task uses `pg_dump`, which [sets a null search
+path and explicitly includes the schema in every SQL query](https://gitlab.com/gitlab-org/gitlab/-/issues/23211)
+to address [CVE-2018-1058](https://www.postgresql.org/about/news/postgresql-103-968-9512-9417-and-9322-released-1834/).
+Since connections are reused with PgBouncer in transaction pooling mode,
+PostgreSQL fails to search the default `public` schema. As a result,
+this clearing of the search path causes tables and columns to appear
+missing.
+### Bypassing PgBouncer
+There are two ways to fix this:
+1. [Use environment variables to override the database settings](#environment-variable-overrides) for the backup task.
+1. Reconfigure a node to [connect directly to the PostgreSQL primary database node](../administration/postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer).
+#### Environment variable overrides
-To workaround this issue, the GitLab server will need to bypass `pgbouncer` and
-[connect directly to the primary database node](../administration/postgresql/pgbouncer.md#procedure-for-bypassing-pgbouncer)
-to perform the database restore.
+By default, GitLab uses the database configuration stored in a
+configuration file (`database.yml`). However, you can override the database settings
+for the backup and restore task by setting environment
+variables that are prefixed with `GITLAB_BACKUP_`:
+- `GITLAB_BACKUP_PGHOST`
+- `GITLAB_BACKUP_PGUSER`
+- `GITLAB_BACKUP_PGPORT`
+- `GITLAB_BACKUP_PGPASSWORD`
+- `GITLAB_BACKUP_PGSSLMODE`
+- `GITLAB_BACKUP_PGSSLKEY`
+- `GITLAB_BACKUP_PGSSLCERT`
+- `GITLAB_BACKUP_PGSSLROOTCERT`
+- `GITLAB_BACKUP_PGSSLCRL`
+- `GITLAB_BACKUP_PGSSLCOMPRESSION`
+For example, to override the database host and port to use 192.168.1.10
+and port 5432 with the Omnibus package:
+```shell
+sudo GITLAB_BACKUP_PGHOST=192.168.1.10 GITLAB_BACKUP_PGPORT=5432 /opt/gitlab/bin/gitlab-backup create
+```
-There is also a [known issue](https://gitlab.com/gitlab-org/gitlab/-/issues/23211)
-with PostgreSQL 9 and running a database backup through PgBouncer that can cause
-an outage to GitLab. If you're still on PostgreSQL 9 and upgrading PostgreSQL isn't
-an option, workarounds include having a dedicated application node just for backups,
-configured to connect directly the primary database node as noted above. You're
-advised to upgrade your PostgreSQL version though, GitLab 11.11 shipped with PostgreSQL
-10.7, and that is the recommended version for GitLab 12+.
+See the [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-envars.html)
+for more details on what these parameters do.
 ## Additional notes
@@ -140,7 +140,14 @@ module Backup
         'sslcrl' => 'PGSSLCRL',
         'sslcompression' => 'PGSSLCOMPRESSION'
       }
-      args.each { |opt, arg| ENV[arg] = config[opt].to_s if config[opt] }
+      args.each do |opt, arg|
+        # This enables the use of different PostgreSQL settings in
+        # case PgBouncer is used. PgBouncer clears the search path,
+        # which wreaks havoc on Rails if connections are reused.
+        override = "GITLAB_BACKUP_#{arg}"
+        val = ENV[override].presence || config[opt].to_s.presence
+        ENV[arg] = val if val
+      end
     end
     def report_success(success)
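
The precedence rule in the hunk above can be tried outside Rails with a small sketch. It is an illustration under stated assumptions, not GitLab's implementation: `PG_ENV_NAMES`, `presence`, and `resolve_pg_env` are hypothetical names, the `presence` helper only approximates the ActiveSupport method, the `host`, `port`, and `username` mappings are assumed (the hunk shows only the `sslcrl` and `sslcompression` entries), and the sample host `pgbouncer.example.com` and port 6432 are made up. Unlike the real code, it returns a hash and does not modify `ENV`.

```ruby
# Assumed subset of the database.yml key => libpq variable mapping.
PG_ENV_NAMES = {
  'username' => 'PGUSER',
  'host' => 'PGHOST',
  'port' => 'PGPORT',
  'password' => 'PGPASSWORD'
}.freeze

# Rough stand-in for ActiveSupport's `presence`: nil for nil or blank values,
# otherwise the value as a string.
def presence(value)
  str = value.to_s
  str.strip.empty? ? nil : str
end

# Returns the PG* variables that would be exported. A GITLAB_BACKUP_* entry in
# `env` wins; otherwise the config value is used; otherwise the key is omitted.
def resolve_pg_env(config, env)
  PG_ENV_NAMES.each_with_object({}) do |(opt, name), out|
    val = presence(env["GITLAB_BACKUP_#{name}"]) || presence(config[opt])
    out[name] = val if val
  end
end

# Hypothetical config that points at PgBouncer, with overrides that bypass it.
config = { 'host' => 'pgbouncer.example.com', 'port' => 6432, 'username' => 'gitlab' }
env = { 'GITLAB_BACKUP_PGHOST' => '192.168.1.10', 'GITLAB_BACKUP_PGPORT' => '5432' }

resolve_pg_env(config, env).each { |name, value| puts "#{name}=#{value}" }
```

Because a variable is set only when an override or a config value is present, anything already in the environment (such as the `PGPASSWORD` entry in the spec below) stays untouched when neither source supplies a value.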
@@ -48,5 +48,26 @@ RSpec.describe Backup::Database do
         expect(output).to include(visible_error)
       end
     end
+    context 'with PostgreSQL settings defined in the environment' do
+      let(:cmd) { %W[#{Gem.ruby} -e] + ["$stderr.puts ENV.to_h.select { |k, _| k.start_with?('PG') }"] }
+      let(:config) { YAML.load_file(File.join(Rails.root, 'config', 'database.yml'))['test'] }
+      before do
+        stub_const 'ENV', ENV.to_h.merge({
+          'GITLAB_BACKUP_PGHOST' => 'test.example.com',
+          'PGPASSWORD' => 'donotchange'
+        })
+      end
+      it 'overrides default config values' do
+        subject.restore
+        expect(output).to include(%("PGHOST"=>"test.example.com"))
+        expect(output).to include(%("PGPASSWORD"=>"donotchange"))
+        expect(output).to include(%("PGPORT"=>"#{config['port']}")) if config['port']
+        expect(output).to include(%("PGUSER"=>"#{config['username']}")) if config['username']
+      end
+    end
   end
 end