# Working with the bundled Consul service **[PREMIUM ONLY]**
## Overview
As part of its High Availability stack, GitLab Premium includes a bundled version of [Consul](http://consul.io) that can be managed through `/etc/gitlab/gitlab.rb`.
A Consul cluster consists of multiple server agents, as well as client agents that run on other nodes which need to talk to the consul cluster.
## Operations
### Checking cluster membership
To see which nodes are part of the cluster, run the following on any member in the cluster
```
# /opt/gitlab/embedded/bin/consul members
Node Address Status Type Build Protocol DC
consul-b XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul
consul-c XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul
consul-c XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul
Ideally all nodes will have a `Status` of `alive`.
### Restarting the server cluster
**Note**: This section only applies to server agents. It is safe to restart client agents whenever needed.
If it is necessary to restart the server cluster, it is important to do this in a controlled fashion in order to maintain quorum. If quorum is lost, you will need to follow the consul [outage recovery](#outage-recovery) process to recover the cluster.
To be safe, we recommend you only restart one server agent at a time to ensure the cluster remains intact.
For larger clusters, it is possible to restart multiple agents at a time. See the [Consul consensus document](https://www.consul.io/docs/internals/consensus.html#deployment-table) for how many failures it can tolerate. This will be the number of simulateneous restarts it can sustain.
## Troubleshooting
### Consul server agents unable to communicate
By default, the server agents will attempt to [bind](https://www.consul.io/docs/agent/options.html#_bind) to '0.0.0.0', but they will advertise the first private IP address on the node for other agents to communicate with them. If the other nodes cannot communicate with a node on this address, then the cluster will have a failed status.
You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue:
```
2017-09-25_19:53:39.90821 2017/09/25 19:53:39 [WARN] raft: no known peers, aborting election
2017-09-25_19:53:41.74356 2017/09/25 19:53:41 [ERR] agent: failed to sync remote state: No cluster leader
```
To fix this:
1. Pick an address on each node that all of the other nodes can reach this node through.
1. Update your `/etc/gitlab/gitlab.rb`
```ruby
consul['configuration'] = {
...
bind_addr: 'IP ADDRESS'
}
```
1. Run `gitlab-ctl reconfigure`
If you still see the errors, you may have to [erase the consul database and reinitialize](#recreate-from-scratch) on the affected node.
### Consul agents do not start - Multiple private IPs
In the case that a node has multiple private IPs the agent be confused as to which of the private addresses to advertise, and then immediately exit on start.
You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue:
```
2017-11-09_17:41:45.52876 ==> Starting Consul agent...
2017-11-09_17:41:45.53057 ==> Error creating agent: Failed to get advertise address: Multiple private IPs found. Please configure one.
```
To fix this:
1. Pick an address on the node that all of the other nodes can reach this node through.
1. Update your `/etc/gitlab/gitlab.rb`
```ruby
consul['configuration'] = {
...
bind_addr: 'IP ADDRESS'
}
```
1. Run `gitlab-ctl reconfigure`
### Outage recovery
If you lost enough server agents in the cluster to break quorum, then the cluster is considered failed, and it will not function without manual intervenetion.
#### Recreate from scratch
By default, GitLab does not store anything in the consul cluster that cannot be recreated. To erase the consul database and reinitialize
```
# gitlab-ctl stop consul
# rm -rf /var/opt/gitlab/consul/data
# gitlab-ctl start consul
```
After this, the cluster should start back up, and the server agents rejoin. Shortly after that, the client agents should rejoin as well.
#### Recover a failed cluster
If you have taken advantage of consul to store other data, and want to restore the failed cluster, please follow the [Consul guide](https://www.consul.io/docs/guides/outage.html) to recover a failed cluster.
# Configuring Gitaly for Scaled and High Availability
Gitaly does not yet support full high availability. However, Gitaly is quite
stable and is in use on GitLab.com. Scaled and highly available GitLab environments
should consider using Gitaly on a separate node.
See the [Gitaly HA Epic](https://gitlab.com/groups/gitlab-org/-/epics/289) to
track plans and progress toward high availability support.
This document is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples)
environments and [High Availability Architecture](./README.md#high-availability-architecture-examples).
## Running Gitaly on its own server
Starting with GitLab 11.4, Gitaly is a replacement for NFS except
when the [Elastic Search indexer](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer)
is used.
NOTE: **Note:** While Gitaly can be used as a replacement for NFS, we do not recommend using EFS as it may impact GitLab's performance. Please review the [relevant documentation](nfs.md#avoid-using-awss-elastic-file-system-efs) for more details.
NOTE: **Note:** Gitaly network traffic is unencrypted so we recommend a firewall to
restrict access to your Gitaly server.
The steps below are the minimum necessary to configure a Gitaly server with
Omnibus:
1. SSH into the Gitaly server.
1.[Download/install](https://about.gitlab.com/installation) the Omnibus GitLab
package you want using **steps 1 and 2** from the GitLab downloads page.
- Do not complete any other steps on the download page.
1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
Gitaly must trigger some callbacks to GitLab via GitLab Shell. As a result,
the GitLab Shell secret must be the same between the other GitLab servers and
the Gitaly server. The easiest way to accomplish this is to copy `/etc/gitlab/gitlab-secrets.json`
from an existing GitLab server to the Gitaly server. Without this shared secret,
Git operations in GitLab will result in an API error.
> **NOTE:** In most or all cases the storage paths below end in `repositories` which is
different than `path` in `git_data_dirs` of Omnibus installations. Check the
directory layout on your Gitaly server to be sure.
```ruby
# Enable Gitaly
gitaly['enable'] = true
## Disable all other services
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
unicorn['enable'] = false
postgresql['enable'] = false
nginx['enable'] = false
prometheus['enable'] = false
alertmanager['enable'] = false
pgbouncer_exporter['enable'] = false
redis_exporter['enable'] = false
gitlab_monitor['enable'] = false
# Prevent database connections during 'gitlab-ctl reconfigure'
gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false
# Configure the gitlab-shell API callback URL. Without this, `git push` will
# fail. This can be your 'front door' GitLab URL or an internal load
Setting up NFS for a GitLab HA setup allows all applications nodes in a cluster
to share the same files and maintain data consistency. Application nodes in an HA
setup act as clients while the NFS server plays host.
> Note: The instructions provided in this documentation allow for setting a quick
proof of concept but will leave NFS as potential single point of failure and
therefore not recommended for use in production. Explore options such as [Pacemaker
and Corosync](http://clusterlabs.org/) for highly available NFS in production.
Below are instructions for setting up an application node(client) in an HA cluster
to read from and write to a central NFS server(host).
NOTE: **Note:**
Using EFS may negatively impact performance. Please review the [relevant documentation](nfs.md#avoid-using-awss-elastic-file-system-efs) for additional details.
## NFS Server Setup
> Follow the instructions below to set up and configure your NFS server.
### Step 1 - Install NFS Server on Host
Installing the nfs-kernel-server package allows you to share directories with the clients running the GitLab application.
```sh
apt-get update
apt-get install nfs-kernel-server
```
### Step 2 - Export Host's Home Directory to Client
In this setup we will share the home directory on the host with the client. Edit the exports file as below to share the host's home directory with the client. If you have multiple clients running GitLab you must enter the client IP addresses in line in the `/etc/exports` file.
As part of its High Availability stack, GitLab Premium includes a bundled version of [Pgbouncer](https://pgbouncer.github.io/) that can be managed through `/etc/gitlab/gitlab.rb`.
In a High Availability setup, Pgbouncer is used to seamlessly migrate database connections between servers in a failover scenario.
Additionally, it can be used in a non-HA setup to pool connections, speeding up response time while reducing resource usage.
It is recommended to run pgbouncer alongside the `gitlab-rails` service, or on its own dedicated node in a cluster.
## Operations
### Running Pgbouncer as part of an HA GitLab installation
See our [HA documentation for PostgreSQL](database.md) for information on running pgbouncer as part of a HA setup
### Running Pgbouncer as part of a non-HA GitLab installation
1. Generate PGBOUNCER_USER_PASSWORD_HASH with the command `gitlab-ctl pg-password-md5 pgbouncer`
1. Generate SQL_USER_PASSWORD_HASH with the command `gitlab-ctl pg-password-md5 gitlab`. We'll also need to enter the plaintext SQL_USER_PASSWORD later
1. On your database node, ensure the following is set in your `/etc/gitlab/gitlab.rb`
postgresql['listen_address'] = 'XX.XX.XX.Y' # Where XX.XX.XX.Y is the ip address on the node postgresql should listen on
postgresql['md5_auth_cidr_addresses'] = %w(AA.AA.AA.B/32) # Where AA.AA.AA.B is the IP address of the pgbouncer node
```
1. Run `gitlab-ctl reconfigure`
**Note:** If the database was already running, it will need to be restarted after reconfigure by running `gitlab-ctl restart postgresql`.
1. On the node you are running pgbouncer on, make sure the following is set in `/etc/gitlab/gitlab.rb`
```ruby
pgbouncer['enable'] = true
pgbouncer['databases'] = {
gitlabhq_production: {
host: 'DATABASE_HOST',
user: 'pgbouncer',
password: 'PGBOUNCER_USER_PASSWORD_HASH'
}
}
```
1. Run `gitlab-ctl reconfigure`
1. On the node running unicorn, make sure the following is set in `/etc/gitlab/gitlab.rb`
```ruby
gitlab_rails['db_host'] = 'PGBOUNCER_HOST'
gitlab_rails['db_port'] = '6432'
gitlab_rails['db_password'] = 'SQL_USER_PASSWORD'
```
1. Run `gitlab-ctl reconfigure`
1. At this point, your instance should connect to the database through pgbouncer. If you are having issues, see the [Troubleshooting](#troubleshooting) section
### Interacting with pgbouncer
#### Administrative console
As part of omnibus-gitlab, we provide a command `gitlab-ctl pgb-console` to automatically connect to the pgbouncer administrative console. Please see the [pgbouncer documentation](https://pgbouncer.github.io/usage.html#admin-console) for detailed instructions on how to interact with the console.
To start a session, run
```shell
# gitlab-ctl pgb-console
Password for user pgbouncer:
psql (9.6.8, server 1.7.2/bouncer)
Type "help"for help.
pgbouncer=#
```
The password you will be prompted for is the PGBOUNCER_USER_PASSWORD
To get some basic information about the instance, run
```shell
pgbouncer=# show databases; show clients; show servers;
name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections
In case you are experiencing any issues connecting through pgbouncer, the first place to check is always the logs:
```shell
# gitlab-ctl tail pgbouncer
```
Additionally, you can check the output from `show databases` in the [Administrative console](#administrative-console). In the output, you would expect to see values in the `host` field for the `gitlabhq_production` database. Additionally, `current_connections` should be greater than 1.