# Praefect: High Availability

NOTE: **Note:** Praefect is an experimental service, and data loss is likely.

Praefect is an optional reverse-proxy for [Gitaly](../index.md) to manage a
cluster of Gitaly nodes for high availability. Initially, high availability
will be implemented through asynchronous replication. If a Gitaly node becomes
unavailable, it will be possible to fail over to a warm Gitaly replica.

The first minimal version will support:

- Eventual consistency of the secondary replicas.
- Automatic failover from the primary to the secondary.
- Reporting of possible data loss if the replication queue is not empty.

Follow the [HA Gitaly epic](https://gitlab.com/groups/gitlab-org/-/epics/1489)
for updates and roadmap.

## Requirements for configuring Gitaly for High Availability

NOTE: **Note:** this reference architecture is not highly available because
Praefect is a single point of failure.

The minimal [alpha](https://about.gitlab.com/handbook/product/#alpha-beta-ga)
reference architecture additionally requires:

- 1 Praefect node
- 1 PostgreSQL server (PostgreSQL 9.6 or newer)
- 3 Gitaly nodes (1 primary, 2 secondary)

![Alpha architecture diagram](img/praefect_architecture_v12_10.png)

See the [design
document](https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/design_ha.md)
for implementation details.

## Setup Instructions

If you [installed](https://about.gitlab.com/install/) GitLab using the Omnibus
package (highly recommended), follow the steps below:

1. [Preparation](#preparation)
1. [Configuring the Praefect database](#postgresql)
1. [Configuring the Praefect proxy/router](#praefect)
1. [Configuring each Gitaly node](#gitaly) (once for each Gitaly node)
1. [Updating the GitLab server configuration](#gitlab)
1. [Configuring Grafana](#grafana)

### Preparation

Before beginning, you should already have a working GitLab instance. [Learn how
to install GitLab](https://about.gitlab.com/install/).

Provision a PostgreSQL server (PostgreSQL 9.6 or newer). Configuration through
the GitLab Omnibus distribution is not yet supported. Follow this
[issue](https://gitlab.com/gitlab-org/gitaly/issues/2476) for updates.

Prepare all your new nodes by [installing
GitLab](https://about.gitlab.com/install/).

- 1 Praefect node (minimal storage required)
- 3 Gitaly nodes (high CPU, high memory, fast storage)
- 1 GitLab server

You will need the IP/host address for each node.

1. `POSTGRESQL_SERVER_ADDRESS`: the IP/host address of the PostgreSQL server
1. `PRAEFECT_HOST`: the IP/host address of the Praefect server
1. `GITALY_HOST`: the IP/host address of each Gitaly server
1. `GITLAB_HOST`: the IP/host address of the GitLab server

If you are using a cloud provider, you can look up the addresses for each server through your cloud provider's management console.

If you are using Google Cloud Platform, SoftLayer, or any other vendor that provides a virtual private cloud (VPC) you can use the private addresses for each cloud instance (corresponds to “internal address” for Google Cloud Platform) for `PRAEFECT_HOST`, `GITALY_HOST`, and `GITLAB_HOST`.

#### Secrets

The communication between components is secured with different secrets, which
are described below. Before you begin, generate a unique secret for each, and
make note of it. This will make it easy to replace these placeholder tokens
with secure tokens as you complete the setup process.

1. `GITLAB_SHELL_SECRET_TOKEN`: this is used by Git hooks to make callback HTTP
   API requests to GitLab when accepting a Git push. This secret is shared with
   GitLab Shell for legacy reasons.
1. `PRAEFECT_EXTERNAL_TOKEN`: repositories hosted on your Praefect cluster can
   only be accessed by Gitaly clients that carry this token.
1. `PRAEFECT_INTERNAL_TOKEN`: this token is used for replication traffic inside
   your Praefect cluster. This is distinct from `PRAEFECT_EXTERNAL_TOKEN`
   because Gitaly clients must not be able to access internal nodes of the
   Praefect cluster directly; that could lead to data loss.
1. `PRAEFECT_SQL_PASSWORD`: this password is used by Praefect to connect to
   PostgreSQL.
1. `GRAFANA_PASSWORD`: this password is used to access the `admin`
   account in the Grafana dashboards.

We will note in the instructions below where these secrets are required.
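
One way to generate the secrets listed above is with Ruby's standard `SecureRandom` module. This is a minimal sketch rather than an official procedure: the `generate_secrets` helper and the 32-byte length are assumptions made for illustration, and any method that produces long random strings works equally well.

```ruby
require 'securerandom'

# Placeholder names used throughout this guide.
SECRET_NAMES = %w[
  GITLAB_SHELL_SECRET_TOKEN
  PRAEFECT_EXTERNAL_TOKEN
  PRAEFECT_INTERNAL_TOKEN
  PRAEFECT_SQL_PASSWORD
  GRAFANA_PASSWORD
].freeze

# Hypothetical helper: maps each placeholder name to a fresh random hex string.
# The default of 32 random bytes (64 hex characters) is an assumption.
def generate_secrets(names = SECRET_NAMES, bytes: 32)
  names.to_h { |name| [name, SecureRandom.hex(bytes)] }
end

secrets = generate_secrets
secrets.each { |name, value| puts "#{name}=#{value}" }
```

Store the printed values somewhere safe, such as a password manager, because they are needed again in the sections that follow.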

### PostgreSQL

NOTE: **Note:** don't reuse the GitLab application database for the Praefect
database.

To complete this section you will need:

- 1 Praefect node
- 1 PostgreSQL server (PostgreSQL 9.6 or newer)
  - An SQL user with permissions to create databases

During this section, we will configure the PostgreSQL server, from the Praefect
node, using `psql` which is installed by GitLab Omnibus.

1. SSH into the **Praefect** node and log in as root:

   ```shell
   sudo -i
   ```

1. Connect to the PostgreSQL server with administrative access. This is likely
   the `postgres` user. The database `template1` is used because it is created
   by default on all PostgreSQL servers.

   ```shell
   /opt/gitlab/embedded/bin/psql -U postgres -d template1 -h POSTGRESQL_SERVER_ADDRESS
   ```

   Create a new user `praefect` which will be used by Praefect. Replace
   `PRAEFECT_SQL_PASSWORD` with the strong password you generated in the
   preparation step.

   ```sql
   CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD 'PRAEFECT_SQL_PASSWORD';
   ```

1. Reconnect to the PostgreSQL server, this time as the `praefect` user:

   ```shell
   /opt/gitlab/embedded/bin/psql -U praefect -d template1 -h POSTGRESQL_SERVER_ADDRESS
   ```

   Create a new database `praefect_production`. By creating the database while
   connected as the `praefect` user, you can be confident that this user has
   access to it.

   ```sql
   CREATE DATABASE praefect_production WITH ENCODING=UTF8;
   ```

The database used by Praefect is now configured.

### Praefect

To complete this section you will need:

- [Configured PostgreSQL server](#postgresql), including:
  - IP/host address (`POSTGRESQL_SERVER_ADDRESS`)
  - password (`PRAEFECT_SQL_PASSWORD`)

Praefect should be run on a dedicated node. Do not run Praefect on the
application server, or a Gitaly node.

1. SSH into the **Praefect** node and log in as root:

   ```shell
   sudo -i
   ```

1. Disable all other services by editing `/etc/gitlab/gitlab.rb`:

   ```ruby
   # Disable all other services on the Praefect node
   postgresql['enable'] = false
   redis['enable'] = false
   nginx['enable'] = false
   prometheus['enable'] = false
   grafana['enable'] = false
   unicorn['enable'] = false
   sidekiq['enable'] = false
   gitlab_workhorse['enable'] = false
   gitaly['enable'] = false

   # Enable only the Praefect service
   praefect['enable'] = true

   # Prevent database connections during 'gitlab-ctl reconfigure'
   gitlab_rails['rake_cache_clear'] = false
   gitlab_rails['auto_migrate'] = false
   ```

1. Configure **Praefect** to listen on network interfaces by editing
   `/etc/gitlab/gitlab.rb`:

   You will need to replace:

   - `PRAEFECT_HOST` with the IP address or hostname of the Praefect node

   ```ruby
   praefect['listen_addr'] = 'PRAEFECT_HOST:2305'

   # Enable Prometheus metrics access to Praefect. You must use firewalls
   # to restrict access to this address/port.
   praefect['prometheus_listen_addr'] = 'PRAEFECT_HOST:9652'
   ```

1. Configure a strong `auth_token` for **Praefect** by editing
   `/etc/gitlab/gitlab.rb`. This will be needed by clients outside the cluster
   (like GitLab Shell) to communicate with the Praefect cluster:

   ```ruby
   praefect['auth_token'] = 'PRAEFECT_EXTERNAL_TOKEN'
   ```

1. Configure **Praefect** to connect to the PostgreSQL database by editing
   `/etc/gitlab/gitlab.rb`.

   You will need to replace `POSTGRESQL_SERVER_ADDRESS` with the IP/host address
   of the database, and `PRAEFECT_SQL_PASSWORD` with the strong password set
   above.

   ```ruby
   praefect['database_host'] = 'POSTGRESQL_SERVER_ADDRESS'
   praefect['database_port'] = 5432
   praefect['database_user'] = 'praefect'
   praefect['database_password'] = 'PRAEFECT_SQL_PASSWORD'
   praefect['database_dbname'] = 'praefect_production'
   ```

   If you want to use a TLS client certificate, the options below can be used:

   ```ruby
   # Connect to PostgreSQL using a TLS client certificate
   # praefect['database_sslcert'] = '/path/to/client-cert'
   # praefect['database_sslkey'] = '/path/to/client-key'

   # Trust a custom certificate authority
   # praefect['database_sslrootcert'] = '/path/to/rootcert'
   ```

   By default Praefect will refuse to make an unencrypted connection to
   PostgreSQL. You can override this by uncommenting the following line:

   ```ruby
   # praefect['database_sslmode'] = 'disable'
   ```

1. Configure the **Praefect** cluster to connect to each Gitaly node in the
   cluster by editing `/etc/gitlab/gitlab.rb`.

   In the example below we have configured one cluster named `praefect`. This
   cluster has three Gitaly nodes `gitaly-1`, `gitaly-2`, and `gitaly-3`, which
   will be replicas of each other.

   Replace `PRAEFECT_INTERNAL_TOKEN` with a strong secret, which will be used by
   Praefect when communicating with Gitaly nodes in the cluster. This token is
   distinct from the `PRAEFECT_EXTERNAL_TOKEN`.

   Replace `GITALY_HOST` with the IP/host address of each Gitaly node.

   More Gitaly nodes can be added to the cluster to increase the number of
   replicas. More clusters can also be added for very large GitLab instances.

   NOTE: **Note:** The `gitaly-1` node is currently designated as the primary.
   Changing which node is marked as primary can be used to manually fail over
   from one node to another. This setting will be removed in the future to
   allow for automatic failover.

   ```ruby
   # Name of storage hash must match storage name in git_data_dirs on GitLab
   # server ('praefect') and in git_data_dirs on Gitaly nodes ('gitaly-1')
   praefect['virtual_storages'] = {
     'praefect' => {
       'gitaly-1' => {
         'address' => 'tcp://GITALY_HOST:8075',
         'token'   => 'PRAEFECT_INTERNAL_TOKEN',
         'primary' => true
       },
       'gitaly-2' => {
         'address' => 'tcp://GITALY_HOST:8075',
         'token'   => 'PRAEFECT_INTERNAL_TOKEN'
       },
       'gitaly-3' => {
         'address' => 'tcp://GITALY_HOST:8075',
         'token'   => 'PRAEFECT_INTERNAL_TOKEN'
       }
     }
   }
   ```

1. Save the changes to `/etc/gitlab/gitlab.rb` and [reconfigure Praefect](../restart_gitlab.md#omnibus-gitlab-reconfigure):

   ```shell
   gitlab-ctl reconfigure
   ```

1. Verify that Praefect can reach PostgreSQL:

   ```shell
   sudo -u git /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml sql-ping
   ```

   If the check fails, make sure you have followed the steps correctly. If you
   edit `/etc/gitlab/gitlab.rb`, remember to run `sudo gitlab-ctl reconfigure`
   again before trying the `sql-ping` command.
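
Before reconfiguring, it can help to sanity-check the `praefect['virtual_storages']` hash from the steps above. The following is a minimal sketch, assuming the layout shown in this guide (one manually designated primary per virtual storage, and an `address` and `token` for every node). The `virtual_storage_problems` helper is hypothetical and is not part of Praefect or Omnibus GitLab.

```ruby
# Hypothetical helper: returns a list of problems found in a
# praefect['virtual_storages']-style hash (an empty list means none found).
def virtual_storage_problems(virtual_storages)
  problems = []
  virtual_storages.each do |virtual_name, nodes|
    # This guide marks exactly one node per virtual storage as primary.
    primaries = nodes.count { |_name, node| node['primary'] == true }
    unless primaries == 1
      problems << "#{virtual_name}: expected exactly 1 primary, found #{primaries}"
    end

    # Every node needs an address and a token.
    nodes.each do |node_name, node|
      %w[address token].each do |key|
        next unless node[key].to_s.empty?

        problems << "#{virtual_name}/#{node_name}: '#{key}' is missing"
      end
    end
  end
  problems
end

# Same shape as the example configuration, placeholders left in place.
virtual_storages = {
  'praefect' => {
    'gitaly-1' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN', 'primary' => true },
    'gitaly-2' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN' },
    'gitaly-3' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN' }
  }
}

puts virtual_storage_problems(virtual_storages).inspect
```

A check like this only catches structural mistakes. The `sql-ping` command shown above remains the way to confirm database connectivity.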

### Gitaly

NOTE: **Note:** Complete these steps for **each** Gitaly node.

To complete this section you will need:

- [Configured Praefect node](#praefect)
- 3 (or more) servers, with GitLab installed, to be configured as Gitaly nodes.
  These should be dedicated nodes; do not run other services on these nodes.

Every Gitaly server assigned to the Praefect cluster needs to be configured. The
configuration is the same as a normal [standalone Gitaly server](../index.md),
except:

- the storage names are exposed to Praefect, not GitLab
- the secret token is shared with Praefect, not GitLab

The configuration of all Gitaly nodes in the Praefect cluster can be identical,
because we rely on Praefect to route operations correctly.

Pay particular attention to the following:

- the `gitaly['auth_token']` configured in this section must match the `token`
  value under `praefect['virtual_storages']` on the Praefect node. This was set
  in the [previous section](#praefect). This document uses the placeholder
  `PRAEFECT_INTERNAL_TOKEN` throughout.
- the storage names in `git_data_dirs` configured in this section must match the
  storage names under `praefect['virtual_storages']` on the Praefect node. This
  was set in the [previous section](#praefect). This document uses `gitaly-1`,
  `gitaly-2`, and `gitaly-3` as Gitaly storage names.

For more information on Gitaly server configuration, see our [Gitaly
documentation](index.md#3-gitaly-server-configuration).

1. SSH into the **Gitaly** node and log in as root:

   ```shell
   sudo -i
   ```

1. Disable all other services by editing `/etc/gitlab/gitlab.rb`:

   ```ruby
   # Disable all other services on the Gitaly node
   postgresql['enable'] = false
   redis['enable'] = false
   nginx['enable'] = false
   prometheus['enable'] = false
   grafana['enable'] = false
   unicorn['enable'] = false
   sidekiq['enable'] = false
   gitlab_workhorse['enable'] = false
   prometheus_monitoring['enable'] = false

   # Enable only the Gitaly service
   gitaly['enable'] = true

   # Prevent database connections during 'gitlab-ctl reconfigure'
   gitlab_rails['rake_cache_clear'] = false
   gitlab_rails['auto_migrate'] = false
   ```

1. Configure **Gitaly** to listen on network interfaces by editing
   `/etc/gitlab/gitlab.rb`:

   You will need to replace:

   - `GITALY_HOST` with the IP address or hostname of the Gitaly node

   ```ruby
   # Make Gitaly accept connections on all network interfaces.
   # Use firewalls to restrict access to this address/port.
   gitaly['listen_addr'] = 'GITALY_HOST:8075'

   # Enable Prometheus metrics access to Gitaly. You must use firewalls
   # to restrict access to this address/port.
   gitaly['prometheus_listen_addr'] = 'GITALY_HOST:9236'
   ```

1. Configure a strong `auth_token` for **Gitaly** by editing
   `/etc/gitlab/gitlab.rb`. This will be needed by clients to communicate with
   this Gitaly node. Typically, this token will be the same for all Gitaly
   nodes.

   ```ruby
   gitaly['auth_token'] = 'PRAEFECT_INTERNAL_TOKEN'
   ```

1. Configure the GitLab Shell `secret_token` and `internal_api_url`, which are
   needed for `git push` operations.

   If you have already configured [Gitaly on its own server](../index.md)

   ```ruby
   gitlab_shell['secret_token'] = 'GITLAB_SHELL_SECRET_TOKEN'

   # Configure the gitlab-shell API callback URL. Without this, `git push` will
   # fail. This can be your front door GitLab URL or an internal load balancer.
   # Examples: 'https://example.gitlab.com', 'http://1.2.3.4'
   gitlab_rails['internal_api_url'] = 'http://GITLAB_HOST'
   ```

1. Configure the storage location for Git data by setting `git_data_dirs` in
   `/etc/gitlab/gitlab.rb`. Each Gitaly node should have a unique storage name
   (for example, `gitaly-1`).

   Instead of configuring `git_data_dirs` uniquely for each Gitaly node, it is
   often easier to include the configuration for all Gitaly nodes on every
   Gitaly node. This is supported because the Praefect `virtual_storages`
   configuration maps each storage name (for example, `gitaly-1`) to a specific
   node, and requests are routed accordingly. This means every Gitaly node in
   your fleet can share the same configuration.

   ```ruby
   # You can include the data dirs for all nodes in the same config, because
   # Praefect will only route requests according to the addresses provided in the
   # prior step.
   git_data_dirs({
     "gitaly-1" => {
       "path" => "/var/opt/gitlab/git-data"
     },
     "gitaly-2" => {
       "path" => "/var/opt/gitlab/git-data"
     },
     "gitaly-3" => {
       "path" => "/var/opt/gitlab/git-data"
     }
   })
   ```

1. Save the changes to `/etc/gitlab/gitlab.rb` and [reconfigure Gitaly](../restart_gitlab.md#omnibus-gitlab-reconfigure):

   ```shell
   gitlab-ctl reconfigure
   ```

1. To ensure that Gitaly [has updated its Prometheus listen address](https://gitlab.com/gitlab-org/gitaly/-/issues/2521), [restart Gitaly](../restart_gitlab.md#omnibus-gitlab-restart):

   ```shell
   gitlab-ctl restart gitaly
   ```

**The steps above must be completed for each Gitaly node!**
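
The two matching rules described at the start of this section (the token and the storage names) can be checked mechanically. This is a minimal sketch, assuming you copy the relevant values out of `/etc/gitlab/gitlab.rb` on the Praefect node and on one Gitaly node. The `gitaly_mismatches` helper and the sample values are hypothetical.

```ruby
# Hypothetical helper: compares the Praefect node's view of the cluster with
# the settings of one Gitaly node and returns a list of mismatches.
def gitaly_mismatches(virtual_storages, gitaly_auth_token, git_data_dirs)
  mismatches = []
  virtual_storages.each_value do |nodes|
    nodes.each do |storage_name, node|
      # The token Praefect presents must equal gitaly['auth_token'].
      unless node['token'] == gitaly_auth_token
        mismatches << "token for #{storage_name} differs from gitaly['auth_token']"
      end
      # The storage name must exist in git_data_dirs on the Gitaly node.
      unless git_data_dirs.key?(storage_name)
        mismatches << "storage #{storage_name} is missing from git_data_dirs"
      end
    end
  end
  mismatches
end

# Values as they appear in this guide, placeholders left in place.
virtual_storages = {
  'praefect' => {
    'gitaly-1' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN', 'primary' => true },
    'gitaly-2' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN' },
    'gitaly-3' => { 'address' => 'tcp://GITALY_HOST:8075', 'token' => 'PRAEFECT_INTERNAL_TOKEN' }
  }
}
git_data_dirs = {
  'gitaly-1' => { 'path' => '/var/opt/gitlab/git-data' },
  'gitaly-2' => { 'path' => '/var/opt/gitlab/git-data' },
  'gitaly-3' => { 'path' => '/var/opt/gitlab/git-data' }
}

puts gitaly_mismatches(virtual_storages, 'PRAEFECT_INTERNAL_TOKEN', git_data_dirs).inspect
```

An empty list means the names and tokens line up. It does not prove the nodes are reachable, which is what the connection checker below is for.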

After all Gitaly nodes are configured, you can run the Praefect connection
checker to verify Praefect can connect to all Gitaly servers in the Praefect
config.

1. SSH into the **Praefect** node and run the Praefect connection checker:

   ```shell
   sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dial-nodes
   ```

1. Enable automatic failover by editing `/etc/gitlab/gitlab.rb`:

   ```ruby
   praefect['failover_enabled'] = true
   ```

   When automatic failover is enabled, Praefect checks the health of internal
   Gitaly nodes. If a certain number of health checks fail for the primary,
   Praefect will promote one of the secondaries to be primary, and demote the
   primary to be a secondary.

   Manual failover is possible by updating `praefect['virtual_storages']` and
   nominating a new primary node.

   NOTE: **Note:** Automatic failover is not yet supported for setups with
   multiple Praefect nodes. There is currently no coordination between Praefect
   nodes, which could result in two Praefect instances thinking two different
   Gitaly nodes are the primary. Follow issue
   [#2547](https://gitlab.com/gitlab-org/gitaly/-/issues/2547) for
   updates.

1. Save the changes to `/etc/gitlab/gitlab.rb` and [reconfigure
   Praefect](../restart_gitlab.md#omnibus-gitlab-reconfigure):

   ```shell
   gitlab-ctl reconfigure
   ```

### GitLab

To complete this section you will need:

- [Configured Praefect node](#praefect)
- [Configured Gitaly nodes](#gitaly)

The Praefect cluster needs to be exposed as a storage location to the GitLab
application. This is done by updating the `git_data_dirs`.

Pay particular attention to the following:

- the storage name added to `git_data_dirs` in this section must match the
  storage name under `praefect['virtual_storages']` on the Praefect node. This
  was set in the [Praefect](#praefect) section of this guide. This document uses
  `praefect` as the Praefect storage name.

1. SSH into the **GitLab** node and log in as root:

   ```shell
   sudo -i
   ```

1. Add the Praefect cluster as a storage location by editing
   `/etc/gitlab/gitlab.rb`.

   You will need to replace:

   - `PRAEFECT_HOST` with the IP address or hostname of the Praefect node
   - `GITLAB_HOST` with the IP address or hostname of the GitLab server
   - `PRAEFECT_EXTERNAL_TOKEN` with the real secret

   ```ruby
   git_data_dirs({
     "default" => {
       "gitaly_address" => "tcp://GITLAB_HOST:8075"
     },
     "praefect" => {
       "gitaly_address" => "tcp://PRAEFECT_HOST:2305",
       "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN'
     }
   })
   ```

1. Allow Gitaly to listen on a TCP port by editing
   `/etc/gitlab/gitlab.rb`:

   ```ruby
   gitaly['listen_addr'] = 'GITLAB_HOST:8075'
   ```

1. Configure the `gitlab_shell['secret_token']` so that callbacks from Gitaly
   nodes during a `git push` are properly authenticated by editing
   `/etc/gitlab/gitlab.rb`:

   You will need to replace `GITLAB_SHELL_SECRET_TOKEN` with the real secret.

   ```ruby
   gitlab_shell['secret_token'] = 'GITLAB_SHELL_SECRET_TOKEN'
   ```

1. Configure the `external_url` so that GitLab can serve files through the
   correct endpoint by editing `/etc/gitlab/gitlab.rb`:

   You will need to replace `GITLAB_SERVER_URL` with the real external-facing
   URL on which the current GitLab instance is serving:

   ```ruby
   external_url 'GITLAB_SERVER_URL'
   ```

1. Add Prometheus monitoring settings by editing `/etc/gitlab/gitlab.rb`.

   You will need to replace:

   - `PRAEFECT_HOST` with the IP address or hostname of the Praefect node
   - `GITALY_HOST` with the IP address or hostname of each Gitaly node

   ```ruby
   prometheus['scrape_configs'] = [
     {
       'job_name' => 'praefect',
       'static_configs' => [
         'targets' => [
           'PRAEFECT_HOST:9652' # praefect
         ]
       ]
     },
     {
       'job_name' => 'praefect-gitaly',
       'static_configs' => [
         'targets' => [
           'GITALY_HOST:9236', # gitaly-1
           'GITALY_HOST:9236', # gitaly-2
           'GITALY_HOST:9236', # gitaly-3
         ]
       ]
     }
   ]
   ```

1. Save the changes to `/etc/gitlab/gitlab.rb` and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure):

   ```shell
   gitlab-ctl reconfigure
   ```

1. Verify that `gitlab-shell` on each Gitaly instance can reach GitLab. On each Gitaly instance run:

   ```shell
   /opt/gitlab/embedded/service/gitlab-shell/bin/check -config /opt/gitlab/embedded/service/gitlab-shell/config.yml
   ```

1. Verify that GitLab can reach Praefect:

   ```shell
   gitlab-rake gitlab:gitaly:check
   ```

1. Update the **Repository storage** settings from **Admin Area > Settings >
   Repository > Repository storage** to make the newly configured Praefect
   cluster the storage location for new Git repositories.

   - Deselect the **default** storage location
   - Select the **praefect** storage location

   ![Update repository storage](img/praefect_storage_v12_10.png)

1. Verify everything is still working by creating a new project. Check the
   "Initialize repository with a README" box so that there is content in the
   repository that can be viewed. If the project is created, and you can see the
   README file, it works!
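
For clusters with more nodes, the Prometheus `scrape_configs` value from the monitoring step above can be generated from host lists instead of being written out by hand. This is a minimal sketch: the `praefect_scrape_configs` helper and the example IP addresses are hypothetical, while the job names and ports (9652 for Praefect, 9236 for Gitaly) are the ones used in this guide.

```ruby
# Hypothetical helper: builds the two scrape jobs used in this guide from
# lists of Praefect and Gitaly host addresses.
def praefect_scrape_configs(praefect_hosts, gitaly_hosts)
  [
    {
      'job_name' => 'praefect',
      'static_configs' => [
        { 'targets' => praefect_hosts.map { |host| "#{host}:9652" } }
      ]
    },
    {
      'job_name' => 'praefect-gitaly',
      'static_configs' => [
        { 'targets' => gitaly_hosts.map { |host| "#{host}:9236" } }
      ]
    }
  ]
end

# Example addresses are made up; substitute your own PRAEFECT_HOST and
# GITALY_HOST values.
configs = praefect_scrape_configs(
  ['10.0.0.10'],
  ['10.0.0.21', '10.0.0.22', '10.0.0.23']
)

configs.each do |job|
  puts "#{job['job_name']}: #{job['static_configs'].first['targets'].join(', ')}"
end
```

The returned array has the same shape as the hand-written example, so it could be assigned to `prometheus['scrape_configs']`.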

### Grafana

Grafana is included with GitLab, and can be used to monitor your Praefect
cluster. See [Grafana Dashboard
Service](https://docs.gitlab.com/omnibus/settings/grafana.html)
for detailed documentation.

To get started quickly:

1. SSH into the **GitLab** node and log in as root:

   ```shell
   sudo -i
   ```

1. Enable the Grafana login form by editing `/etc/gitlab/gitlab.rb`.

   ```ruby
   grafana['disable_login_form'] = false
   ```

1. Save the changes to `/etc/gitlab/gitlab.rb` and [reconfigure
   GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure):

   ```shell
   gitlab-ctl reconfigure
   ```

1. Set the Grafana admin password. This command will prompt you to enter a new
   password:

   ```shell
   gitlab-ctl set-grafana-password
   ```

1. In your web browser, open `/-/grafana` (e.g.
   `https://gitlab.example.com/-/grafana`) on your GitLab server.

   Log in using the username `admin` and the password you set.

1. Go to **Explore** and query `gitlab_build_info` to verify that you are
   getting metrics from all your machines.

Congratulations! You've configured an observable highly available Praefect
cluster.

## Automatic failover and leader election

Praefect regularly checks the health of each backend Gitaly node. This
information can be used to automatically fail over to a new primary node if the
current primary node is found to be unhealthy.

- **Manual:** Automatic failover is disabled. The primary node can be
  reconfigured in `/etc/gitlab/gitlab.rb` on the Praefect node. Modify the
  `praefect['virtual_storages']` field by moving the `'primary' => true` entry
  to promote a different Gitaly node to primary. In the steps above, `gitaly-1`
  was set as the primary.
- **Memory:** Enabled by setting `praefect['failover_enabled'] = true` in
  `/etc/gitlab/gitlab.rb` on the Praefect node. If a sufficient number of health
  checks fail for the current primary backend Gitaly node, a new primary will
  be elected. **Do not use with multiple Praefect nodes!** Using with multiple
  Praefect nodes is likely to result in a split brain.
- **PostgreSQL:** Coming soon. See issue
  [#2547](https://gitlab.com/gitlab-org/gitaly/-/issues/2547) for updates.

It is likely that we will implement support for Consul, and a cloud native
strategy in the future.
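
To make the **Memory** strategy more concrete, here is a conceptual sketch of an in-memory elector. It is not Praefect's implementation: the `InMemoryElector` class, the threshold of three consecutive failures, and the "first healthy node wins" rule are all assumptions chosen to illustrate the idea of promoting a secondary after repeated failed health checks.

```ruby
# Conceptual sketch only. Names, threshold, and election rule are assumptions.
class InMemoryElector
  attr_reader :primary

  def initialize(nodes, primary:, failure_threshold: 3)
    # Consecutive failed health checks per node, kept only in process memory.
    @failures = nodes.to_h { |name| [name, 0] }
    @primary = primary
    @failure_threshold = failure_threshold
  end

  # Record one health check result, then re-elect if the primary is unhealthy.
  def record(node, healthy)
    @failures[node] = healthy ? 0 : @failures[node] + 1
    elect if unhealthy?(@primary)
    @primary
  end

  private

  def unhealthy?(node)
    @failures[node] >= @failure_threshold
  end

  # Promote the first node that is still considered healthy, if any.
  def elect
    candidate = @failures.keys.find { |name| !unhealthy?(name) }
    @primary = candidate if candidate
  end
end

elector = InMemoryElector.new(%w[gitaly-1 gitaly-2 gitaly-3], primary: 'gitaly-1')
3.times { elector.record('gitaly-1', false) }
puts elector.primary # prints "gitaly-2"
```

Because the failure counters live in a single process, a second elector instance would observe its own health check results and could pick a different primary. That is the split-brain risk described for multiple Praefect nodes.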

## Backend Node Recovery

When a Praefect backend node fails and is no longer able to
replicate changes, the backend node will start to drift from the primary. If
that node eventually recovers, it will need to be reconciled with the current
primary. The primary node is considered the single source of truth for the
state of a shard. The Praefect `reconcile` subcommand allows for the manual
reconciliation between a backend node and the current primary.

Run the following command on the Praefect server after all placeholders
(`<virtual-storage>` and `<target-storage>`) have been replaced:

```shell
sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml reconcile -virtual <virtual-storage> -target <target-storage>
```

- Replace the placeholder `<virtual-storage>` with the virtual storage containing the backend node storage to be checked.
- Replace the placeholder `<target-storage>` with the backend storage name.

The command will return a list of repositories that were found to be
inconsistent against the current primary. Each of these inconsistencies will
also be logged with an accompanying replication job ID.
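
If several backend nodes need to be checked, the `reconcile` command line can be generated for each of them. This is a minimal sketch that only prints the commands, using the binary path, config path, and flags shown above. The `reconcile_command` helper and the choice of `gitaly-2` and `gitaly-3` as targets are assumptions for illustration.

```ruby
require 'shellwords'

PRAEFECT_BIN = '/opt/gitlab/embedded/bin/praefect'
PRAEFECT_CONFIG = '/var/opt/gitlab/praefect/config.toml'

# Hypothetical helper: builds a shell-escaped reconcile command line for one
# backend storage within a virtual storage.
def reconcile_command(virtual_storage, target_storage)
  [
    'sudo', PRAEFECT_BIN,
    '-config', PRAEFECT_CONFIG,
    'reconcile',
    '-virtual', virtual_storage,
    '-target', target_storage
  ].shelljoin
end

# Print one command per secondary in the example cluster from this guide.
%w[gitaly-2 gitaly-3].each do |target|
  puts reconcile_command('praefect', target)
end
```

Printing the commands instead of executing them leaves room to review each one before running it on the Praefect server.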

## Migrating existing repositories to Praefect

If your GitLab instance already has repositories, these won't be migrated
automatically.

Repositories may be moved from one storage location to another using the
[Repository API](../../api/projects.html#edit-project):

```shell
curl --request PUT --header "PRIVATE-TOKEN: <your_access_token>" --data "repository_storage=praefect" https://example.gitlab.com/api/v4/projects/123
```
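
To move more than one project, the same request can be scripted. The following is a minimal sketch using Ruby's standard `net/http` library that mirrors the `curl` call above. The `storage_move_request` and `move_project` helpers and the list of project IDs are hypothetical, and the example only builds and prints the requests without sending them.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds the PUT request equivalent to the curl example.
def storage_move_request(base_url, project_id, token, storage: 'praefect')
  uri = URI.join(base_url, "/api/v4/projects/#{project_id}")
  request = Net::HTTP::Put.new(uri)
  request['PRIVATE-TOKEN'] = token
  request.set_form_data('repository_storage' => storage)
  request
end

# Hypothetical helper: sends the request and returns the HTTP response.
def move_project(base_url, project_id, token)
  request = storage_move_request(base_url, project_id, token)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end

# Build (but do not send) requests for a made-up list of project IDs.
[123, 124].each do |project_id|
  request = storage_move_request('https://example.gitlab.com', project_id, '<your_access_token>')
  puts "#{request.method} #{request.uri} #{request.body}"
end
```

Calling `move_project` with a real URL and access token would perform the move for a single project. Checking the response code before continuing to the next project is advisable.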

## Debugging Praefect

If you receive an error, check `/var/log/gitlab/gitlab-rails/production.log`.

Here are common errors and potential causes:

- 500 response code
  - **ActionView::Template::Error (7:permission denied)**
    - `praefect['auth_token']` and `gitlab_rails['gitaly_token']` do not match on the GitLab server.
  - **Unable to save project. Error: 7:permission denied**
    - Secret token in `praefect['virtual_storages']` on the Praefect node does not match the
      value in `gitaly['auth_token']` on one or more Gitaly servers.
- 503 response code
  - **GRPC::Unavailable (14:failed to connect to all addresses)**
    - GitLab was unable to reach Praefect.
  - **GRPC::Unavailable (14:all SubCons are in TransientFailure...)**
    - Praefect cannot reach one or more of its child Gitaly nodes. Try running
      the Praefect connection checker to diagnose.
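
The error list above can also be expressed as a small lookup table for triaging log lines. This is a minimal sketch: the `KNOWN_ERRORS` table and `diagnose` helper are hypothetical, and the matching is a plain substring check against the messages listed in this section.

```ruby
# Error message fragments from this section mapped to their likely cause.
KNOWN_ERRORS = {
  'ActionView::Template::Error (7:permission denied)' =>
    "praefect['auth_token'] and gitlab_rails['gitaly_token'] do not match",
  'Unable to save project. Error: 7:permission denied' =>
    "the token Praefect uses for a Gitaly node does not match gitaly['auth_token']",
  '14:failed to connect to all addresses' =>
    'GitLab was unable to reach Praefect',
  '14:all SubCons are in TransientFailure' =>
    'Praefect cannot reach one or more of its Gitaly nodes'
}.freeze

# Hypothetical helper: returns the likely causes for a single log line.
def diagnose(log_line)
  KNOWN_ERRORS.select { |fragment, _cause| log_line.include?(fragment) }.values
end

puts diagnose('GRPC::Unavailable (14:failed to connect to all addresses)').inspect

# To scan the Rails log on the GitLab server (path from this section):
# File.foreach('/var/log/gitlab/gitlab-rails/production.log') do |line|
#   diagnose(line).each { |cause| puts cause }
# end
```

A line that matches none of the fragments returns an empty list, in which case the log needs to be read in full.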