Commit 508ba699 authored by Nick Thomas's avatar Nick Thomas

Merge branch 'jramsay-3400-add-secondary-before-db-replication-docs' into 'master'

Geo: add secondary before db replication

See merge request gitlab-org/gitlab-ee!3274
parents 8af6d0d0 33576745
......@@ -119,6 +119,12 @@ sensitive data in the database. Any secondary node must have the
gitlab-ctl reconfigure
```
Once reconfigured, the secondary will start automatically
replicating missing data from the primary in a process known as backfill.
Meanwhile, the primary node will start to notify changes to the secondary, which
will act on those notifications immediately. Make sure the secondary instance is
running and accessible.
### Step 3. Enabling hashed storage (from GitLab 10.0)
1. Visit the **primary** node's **Admin Area ➔ Settings**
......@@ -131,27 +137,18 @@ Using hashed storage significantly improves Geo replication - project and group
renames no longer require synchronization between nodes - so we recommend it is
used for all GitLab Geo installations.
### Step 4. Enabling the secondary GitLab node
### Step 4. Managing the secondary GitLab node
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`)
in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Added in GitLab 9.5: Choose which namespaces should be replicated by the secondary node. Leave blank to replicate all. Read more in [selective replication](#selective-replication).
1. Click the **Add node** button.
1. Restart GitLab on the secondary:
You can monitor the status of the syncing process on a secondary node
by visiting the primary node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`)
in your browser.
```
gitlab-ctl restart
```
Please note that if `git_data_dirs` is customized on the primary for multiple
repository shards you must duplicate the same configuration on the secondary.
---
![GitLab Geo dashboard](img/geo-node-dashboard.png)
After the **Add Node** button is pressed, the secondary will start automatically
replicating missing data from the primary in a process known as backfill.
Meanwhile, the primary node will start to notify changes to the secondary, which
will act on those notifications immediately. Make sure the secondary instance is
running and accessible.
Disabling a secondary node stops the syncing process.
The two most obvious issues that replication can have here are:
......@@ -170,17 +167,6 @@ Currently, this is what is synced:
* Issue, merge request, snippet and comment attachments
* User, group, and project avatars
You can monitor the status of the syncing process on a secondary node
by visiting the primary node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`)
in your browser.
Please note that if `git_data_dirs` is customized on the primary for multiple
repository shards you must duplicate the same configuration on the secondary.
![GitLab Geo dashboard](img/geo-node-dashboard.png)
Disabling a secondary node stops the syncing process.
## Next steps
Your nodes should now be ready to use. You can login to the secondary node
......
......@@ -111,6 +111,11 @@ sensitive data in the database. Any secondary node must have the
1. Save and close the file.
The secondary will start automatically replicating missing data from the
primary in a process known as backfill. Meanwhile, the primary node will start
to notify changes to the secondary, which will act on those notifications
immediately. Make sure the secondary instance is running and accessible.
### Step 3. Enabling hashed storage (from GitLab 10.0)
1. Visit the **primary** node's **Admin Area ➔ Settings**
......@@ -124,27 +129,18 @@ renames no longer require synchronization between nodes - so we recommend it is
used for all GitLab Geo installations.
### Step 4. Enabling the secondary GitLab node
### Step 4. Managing the secondary GitLab node
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`)
in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Added in GitLab 9.5: Choose which namespaces should be replicated by the secondary node. Leave blank to replicate all. Read more in [selective replication](#selective-replication).
1. Click the **Add node** button.
1. Restart GitLab on the secondary:
You can monitor the status of the syncing process on a secondary node
by visiting the primary node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`)
in your browser.
```
gitlab-ctl restart
```
Please note that if `git_data_dirs` is customized on the primary for multiple
repository shards you must duplicate the same configuration on the secondary.
---
![GitLab Geo dashboard](img/geo-node-dashboard.png)
After the **Add Node** button is pressed, the secondary will start automatically
replicating missing data from the primary in a process known as backfill.
Meanwhile, the primary node will start to notify changes to the secondary, which
will act on those notifications immediately. Make sure the secondary instance is
running and accessible.
Disabling a secondary node stops the syncing process.
The two most obvious issues that replication can have here are:
......@@ -167,13 +163,6 @@ You can monitor the status of the syncing process on a secondary node
by visiting the primary node's **Admin Area ➔ Geo Nodes** (`/admin/geo_nodes`)
in your browser.
Please note that if `git_data_dirs` is customized on the primary for multiple
repository shards you must duplicate the same configuration on the secondary.
![GitLab Geo dashboard](img/geo-node-dashboard.png)
Disabling a secondary node stops the syncing process.
## Next steps
Your nodes should now be ready to use. You can login to the secondary node
......
......@@ -161,9 +161,23 @@ The following guide assumes that:
1. Now that the PostgreSQL server is set up to accept remote connections, run
`netstat -plnt` to make sure that PostgreSQL is listening to the server's
public IP.
1. Continue to [set up the secondary server](#step-2-configure-the-secondary-server).
### Step 2. Configure the secondary server
### Step 2. Add the secondary GitLab node
To prevent the secondary geo node trying to act as the primary once the
database is replicated, the secondary geo node must be configured on the
primary before the database is replicated.
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
(`/admin/geo_nodes`) in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Added in GitLab 9.5: Choose which namespaces should be replicated by the
secondary node. Leave blank to replicate all. Read more in
[selective replication](#selective-replication).
1. Click the **Add node** button.
### Step 3. Configure the secondary server
1. SSH into your GitLab **secondary** server and login as root:
......@@ -204,7 +218,7 @@ The following guide assumes that:
same time zone, but when the respective times are converted to UTC time,
the clocks must be synchronized to within 60 seconds of each other.
### Step 3. Initiate the replication process
### Step 4. Initiate the replication process
Below we provide a script that connects to the primary server, replicates the
database and creates the needed files for replication.
......
......@@ -138,9 +138,23 @@ The following guide assumes that:
1. Now that the PostgreSQL server is set up to accept remote connections, run
`netstat -plnt` to make sure that PostgreSQL is listening to the server's
public IP.
1. Continue to [set up the secondary server](#step-2-configure-the-secondary-server).
### Step 2. Configure the secondary server
### Step 2. Add the secondary GitLab node
To prevent the secondary geo node trying to act as the primary once the
database is replicated, the secondary geo node must be configured on the
primary before the database is replicated.
1. Visit the **primary** node's **Admin Area ➔ Geo Nodes**
(`/admin/geo_nodes`) in your browser.
1. Add the secondary node by providing its full URL. **Do NOT** check the box
'This is a primary node'.
1. Added in GitLab 9.5: Choose which namespaces should be replicated by the
secondary node. Leave blank to replicate all. Read more in
[selective replication](#selective-replication).
1. Click the **Add node** button.
### Step 3. Configure the secondary server
1. SSH into your GitLab **secondary** server and login as root:
......@@ -232,7 +246,7 @@ the clocks must be synchronized to within 60 seconds of each other.
bundle exec rake geo:db:migrate
```
### Step 3. Initiate the replication process
### Step 4. Initiate the replication process
Below we provide a script that connects to the primary server, replicates the
database and creates the needed files for replication.
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment