Commit 8893c7c4 authored by Catalin Irimie's avatar Catalin Irimie Committed by Marcel Amirault

Add Geo proxying for secondary sites docs

parent 2397a248
...@@ -148,6 +148,18 @@ NOTE: ...@@ -148,6 +148,18 @@ NOTE:
When using HTTPS protocol for port 443, you need to add an SSL certificate to the load balancers. When using HTTPS protocol for port 443, you need to add an SSL certificate to the load balancers.
If you wish to terminate SSL at the GitLab application server instead, use TCP protocol. If you wish to terminate SSL at the GitLab application server instead, use TCP protocol.
#### Internal URL
HTTP requests from any Geo secondary site to the primary Geo site use the Internal URL of the primary
Geo site. If this is not explicitly defined in the primary Geo site settings in the Admin Area, the
public URL of the primary site will be used.
To update the internal URL of the primary Geo site:
1. On the top bar, go to **Menu > Admin > Geo > Sites**.
1. Select **Edit** on the primary site.
1. Change the **Internal URL**, then select **Save changes**.
### LDAP ### LDAP
We recommend that if you use LDAP on your **primary** site, you also set up secondary LDAP servers on each **secondary** site. Otherwise, users are unable to perform Git operations over HTTP(s) on the **secondary** site using HTTP Basic Authentication. However, Git via SSH and personal access tokens still works. We recommend that if you use LDAP on your **primary** site, you also set up secondary LDAP servers on each **secondary** site. Otherwise, users are unable to perform Git operations over HTTP(s) on the **secondary** site using HTTP Basic Authentication. However, Git via SSH and personal access tokens still works.
...@@ -258,6 +270,10 @@ For information on using Geo in disaster recovery situations to mitigate data-lo ...@@ -258,6 +270,10 @@ For information on using Geo in disaster recovery situations to mitigate data-lo
For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** site](replication/docker_registry.md). For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** site](replication/docker_registry.md).
### Geo secondary proxy
For more information on using Geo proxying on secondary nodes, see [Geo proxying for secondary sites](secondary_proxy/index.md).
### Security Review ### Security Review
For more information on Geo security, see [Geo security review](replication/security_review.md). For more information on Geo security, see [Geo security review](replication/security_review.md).
......
---
stage: Enablement
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
type: howto
---
# Geo proxying for secondary sites **(PREMIUM SELF)**
> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/5914) in GitLab 14.4 [with a flag](../../feature_flags.md) named `geo_secondary_proxy`. Disabled by default.
FLAG:
On self-managed GitLab, by default this feature is not available. See below to [Set up a unified URL for Geo sites](#set-up-a-unified-url-for-geo-sites).
The feature is not ready for production use.
Use Geo proxying to:
- Have secondary sites serve read-write traffic by proxying to the primary site.
- Selectively accelerate replicated data types by directing read-only operations to the local site instead.
When enabled, users of the secondary site can use the WebUI as if they were accessing the
primary site's UI. This significantly improves the overall user experience of secondary sites.
With secondary proxying, web requests to secondary Geo sites are
proxied directly to the primary, and appear to act as a read-write site.
Proxying is done by the [`gitlab-workhorse`](https://gitlab.com/gitlab-org/gitlab-workhorse) component.
Traffic usually sent to the Rails application on the Geo secondary site is proxied
to the [internal URL](../index.md#internal-url) of the primary Geo site instead.
Use secondary proxying for use-cases including:
- Having all Geo sites behind a single URL.
- Geographically load-balancing traffic without worrying about write access.
## Set up a unified URL for Geo sites
Secondary sites can transparently serve read-write traffic. You can
use a single external URL so that requests can hit either the primary Geo site
or any secondary Geo sites that use Geo proxying.
### Configure an external URL to send traffic to both sites
Follow the [Location-aware public URL](location_aware_external_url.md) steps to create
a single URL used by all Geo sites, including the primary.
### Update the Geo sites to use the same external URL
1. On your Geo sites, SSH **into each node running Rails (Puma, Sidekiq, Log-Cursor)
and change the `external_url` to that of the single URL:
```shell
sudo editor /etc/gitlab/gitlab.rb
```
1. Reconfigure the updated nodes for the change to take effect if the URL was different than the one already set:
```shell
gitlab-ctl reconfigure
```
1. To match the new external URL set on the secondary Geo sites, the primary database
needs to reflect this change.
In the Geo administration page of the **primary** site, edit each Geo secondary that
is using the secondary proxying and set the `URL` field to the single URL.
Make sure the primary site is also using this URL.
### Enable secondary proxying
1. SSH into each application node (serving user traffic directly) on your secondary Geo site
and add the following environment variable:
```shell
sudo editor /etc/gitlab/gitlab.rb
```
```ruby
gitlab_workhorse['env'] = {
"GEO_SECONDARY_PROXY" => "1"
}
```
1. Reconfigure the updated nodes for the change to take effect:
```shell
gitlab-ctl reconfigure
```
1. SSH into one node running Rails on your primary Geo site and enable the Geo secondary proxy feature flag:
```shell
sudo gitlab-rails runner "Feature.enable(:geo_secondary_proxy)"
```
## Enable Geo proxying with Separate URLs
The ability to use proxying with separate URLs is still in development. You can follow the
["Geo secondary proxying with separate URLs" epic](https://gitlab.com/groups/gitlab-org/-/epics/6865)
for progress.
## Features accelerated by secondary Geo sites
Most HTTP traffic sent to a secondary Geo site can be proxied to the primary Geo site. With this architecture,
secondary Geo sites are able to support write requests. Certain requests are handled locally by secondary
sites for improved latency and bandwidth nearby.
The following table details the components currently tested through the Geo secondary site Workhorse proxy.
It does not cover all data types, more will be added in the future as they are tested.
| Feature / component | Proxied? |
|:----------------------------------------------------|:-----------------------|
| Project, wiki, design repository (using the web UI) | **{check-circle}** Yes |
| Project, wiki repository (using Git) | **{dotted-circle}** Partly <sup>1</sup> |
| Project, Personal Snippet (using the web UI) | **{check-circle}** Yes |
| Project, Personal Snippet (using Git) | **{dotted-circle}** Partly <sup>1</sup> |
| Group wiki repository (using the web UI) | **{check-circle}** Yes |
| Group wiki repository (using Git) | **{dotted-circle}** Partly <sup>1</sup> |
| User uploads | **{check-circle}** Yes |
| LFS objects (using the web UI) | **{check-circle}** Yes |
| LFS objects (using Git) | **{check-circle}** Yes |
| Pages | **{dotted-circle}** No <sup>2</sup> |
| Advanced search (using the web UI) | **{check-circle}** Yes |
1. Git reads are served from the local secondary while pushes get proxied to the primary.
Selective sync or cases where repositories don't exist locally on the Geo secondary throw a "not found" error.
1. Pages can use the same URL (without access control), but must be configured separately and are not proxied.
---
stage: Enablement
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
type: howto
---
# Location-aware public URL **(PREMIUM SELF)**
With [Geo proxying for secondary sites](index.md), you can provide GitLab users
with a single URL that automatically uses the Geo site closest to them.
Users don't need to use different URLs or worry about read-only operations to take
advantage of closer Geo sites as they move.
With [Geo proxying for secondary sites](index.md) web and Git requests are proxied
from **secondary** sites to the **primary**.
Though these instructions use [AWS Route53](https://aws.amazon.com/route53/),
other services such as [Cloudflare](https://www.cloudflare.com/) can be used
as well.
## Prerequisites
This example creates a `gitlab.example.com` subdomain that automatically directs
requests:
- From Europe to a **secondary** site.
- From all other locations to the **primary** site.
The URLs to access each node by itself are:
- `primary.example.com` as a Geo **primary** site.
- `secondary.example.com` as a Geo **secondary** site.
For this example, you need:
- A working GitLab **primary** site that is accessible at `gitlab.example.com` _and_ `primary.example.com`.
- A working GitLab **secondary** site.
- A Route53 Hosted Zone managing your domain for the Route53 setup.
If you haven't yet set up a Geo _primary_ site and _secondary_ site, see the
[Geo setup instructions](../index.md#setup-instructions).
## AWS Route53
### Create a traffic policy
In a Route53 Hosted Zone, traffic policies can be used to set up a variety of
routing configurations.
1. Go to the
[Route53 dashboard](https://console.aws.amazon.com/route53/home) and select
**Traffic policies**.
1. Select **Create traffic policy**.
1. Fill in the **Policy Name** field with `Single Git Host` and select **Next**.
1. Leave **DNS type** as `A: IP Address in IPv4 format`.
1. Select **Connect to...**, then **Geolocation rule**.
1. For the first **Location**:
1. Leave it as `Default`.
1. Select **Connect to...**, then **New endpoint**.
1. Choose **Type** `value` and fill it in with `<your **primary** IP address>`.
1. For the second **Location**:
1. Choose `Europe`.
1. Select **Connect to...** and select **New endpoint**.
1. Choose **Type** `value` and fill it in with `<your **secondary** IP address>`.
![Add traffic policy endpoints](img/single_url_add_traffic_policy_endpoints.png)
1. Select **Create traffic policy**.
1. Fill in **Policy record DNS name** with `gitlab`.
![Create policy records with traffic policy](img/single_url_create_policy_records_with_traffic_policy.png)
1. Select **Create policy records**.
You have successfully set up a single host, like `gitlab.example.com`, which
distributes traffic to your Geo sites by geolocation.
## Enable Geo proxying for secondary sites
After setting up a single URL to use for all Geo sites, continue with the [steps to enable Geo proxying for secondary sites](index.md).
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment