Merge branch 'geo-terminology-update-index' into 'master'

Replaced node with site to align with our defined terms on setup page See merge request gitlab-org/gitlab!53483

Merge branch 'geo-terminology-update-index' into 'master'
Replaced node with site to align with our defined terms on setup page See merge request gitlab-org/gitlab!53483
dc248dd7 · Achilleas Pipinellis · 45071e31 · 402a2afe · dc248dd7
Commit dc248dd7 authored May 12, 2021 by Achilleas Pipinellis
Hide whitespace changes
Inline Side-by-side

Showing with 49 additions and 47 deletions

doc/administration/geo/index.md doc/administration/geo/index.md +49 -47

No files found.
--- a/doc/administration/geo/index.md
+++ b/doc/administration/geo/index.md
@@ -22,12 +22,14 @@ Geo undergoes significant changes from release to release. Upgrades **are** supp

 Fetching large repositories can take a long time for teams located far from a single GitLab instance.

-Geo provides local, read-only instances of your GitLab instances. This can reduce the time it takes
+Geo provides local, read-only sites of your GitLab instances. This can reduce the time it takes
 to clone and fetch large repositories, speeding up development.

 For a video introduction to Geo, see [Introduction to GitLab Geo - GitLab Features](https://www.youtube.com/watch?v=-HDLxSjEh6w).

-To make sure you're using the right version of the documentation, navigate to [the Geo page on GitLab.com](https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/geo/index.md) and choose the appropriate release from the **Switch branch/tag** dropdown. For example, [`v11.2.3-ee`](https://gitlab.com/gitlab-org/gitlab/blob/v11.2.3-ee/doc/administration/geo/index.md).
+To make sure you're using the right version of the documentation, navigate to [the Geo page on GitLab.com](https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/geo/index.md) and choose the appropriate release from the **Switch branch/tag** dropdown. For example, [`v13.7.6-ee`](https://gitlab.com/gitlab-org/gitlab/-/blob/v13.7.6-ee/doc/administration/geo/index.md).
+
+Geo uses a set of defined terms that is described in the [Geo Glossary](glossary.md), please familiarize yourself with those terms.

 ## Use cases

@@ -35,21 +37,21 @@ Implementing Geo provides the following benefits:

 - Reduce from minutes to seconds the time taken for your distributed developers to clone and fetch large repositories and projects.
 - Enable all of your developers to contribute ideas and work in parallel, no matter where they are.
- Balance the read-only load between your **primary** and **secondary** nodes.
+- Balance the read-only load between your **primary** and **secondary** sites.

 In addition, it:

 - Can be used for cloning and fetching projects, in addition to reading any data available in the GitLab web interface (see [limitations](#limitations)).
 - Overcomes slow connections between distant offices, saving time by improving speed for distributed teams.
 - Helps reducing the loading time for automated tasks, custom integrations, and internal workflows.
- Can quickly fail over to a **secondary** node in a [disaster recovery](disaster_recovery/index.md) scenario.
- Allows [planned failover](disaster_recovery/planned_failover.md) to a **secondary** node.
+- Can quickly fail over to a **secondary** site in a [disaster recovery](disaster_recovery/index.md) scenario.
+- Allows [planned failover](disaster_recovery/planned_failover.md) to a **secondary** site.

 Geo provides:

- Read-only **secondary** nodes: Maintain one **primary** GitLab node while still enabling read-only **secondary** nodes for each of your distributed teams.
- Authentication system hooks: **Secondary** nodes receives all authentication data (like user accounts and logins) from the **primary** instance.
- An intuitive UI: **Secondary** nodes use the same web interface your team has grown accustomed to. In addition, there are visual notifications that block write operations and make it clear that a user is on a **secondary** node.
+- Read-only **secondary** sites: Maintain one **primary** GitLab site while still enabling read-only **secondary** sites for each of your distributed teams.
+- Authentication system hooks: **Secondary** sites receives all authentication data (like user accounts and logins) from the **primary** instance.
+- An intuitive UI: **Secondary** sites use the same web interface your team has grown accustomed to. In addition, there are visual notifications that block write operations and make it clear that a user is on a **secondary** sites.

 ### Gitaly Cluster

@@ -64,16 +66,16 @@ Your Geo instance can be used for cloning and fetching projects, in addition to

 When Geo is enabled, the:

- Original instance is known as the **primary** node.
- Replicated read-only nodes are known as **secondary** nodes.
+- Original instance is known as the **primary** site.
+- Replicated read-only sites are known as **secondary** sites.

 Keep in mind that:

- **Secondary** nodes talk to the **primary** node to:
+- **Secondary** sites talk to the **primary** site to:
  - Get user data for logins (API).
  - Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT).
- In GitLab Premium 10.0 and later, the **primary** node no longer talks to **secondary** nodes to notify for changes (API).
- Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+- In GitLab Premium 10.0 and later, the **primary** site no longer talks to **secondary** sites to notify for changes (API).
+- Pushing directly to a **secondary** site (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
 - There are [limitations](#limitations) when using Geo.

 ### Architecture
@@ -84,31 +86,31 @@ The following diagram illustrates the underlying architecture of Geo.

 In this diagram:

- There is the **primary** node and the details of one **secondary** node.
- Writes to the database can only be performed on the **primary** node. A **secondary** node receives database
+- There is the **primary** site and the details of one **secondary** site.
+- Writes to the database can only be performed on the **primary** site. A **secondary** site receives database
  updates via PostgreSQL streaming replication.
 - If present, the [LDAP server](#ldap) should be configured to replicate for [Disaster Recovery](disaster_recovery/index.md) scenarios.
- A **secondary** node performs different type of synchronizations against the **primary** node, using a special
+- A **secondary** site performs different type of synchronizations against the **primary** site, using a special
  authorization protected by JWT:
  - Repositories are cloned/updated via Git over HTTPS.
  - Attachments, LFS objects, and other files are downloaded via HTTPS using a private API endpoint.

 From the perspective of a user performing Git operations:

- The **primary** node behaves as a full read-write GitLab instance.
- **Secondary** nodes are read-only but proxy Git push operations to the **primary** node. This makes **secondary** nodes appear to support push operations themselves.
+- The **primary** site behaves as a full read-write GitLab instance.
+- **Secondary** sites are read-only but proxy Git push operations to the **primary** site. This makes **secondary** sites appear to support push operations themselves.

 To simplify the diagram, some necessary components are omitted. Note that:

 - Git over SSH requires [`gitlab-shell`](https://gitlab.com/gitlab-org/gitlab-shell) and OpenSSH.
 - Git over HTTPS required [`gitlab-workhorse`](https://gitlab.com/gitlab-org/gitlab-workhorse).

-Note that a **secondary** node needs two different PostgreSQL databases:
+Note that a **secondary** site needs two different PostgreSQL databases:

 - A read-only database instance that streams data from the main GitLab database.
- [Another database instance](#geo-tracking-database) used internally by the **secondary** node to record what data has been replicated.
+- [Another database instance](#geo-tracking-database) used internally by the **secondary** site to record what data has been replicated.

-In **secondary** nodes, there is an additional daemon: [Geo Log Cursor](#geo-log-cursor).
+In **secondary** sites, there is an additional daemon: [Geo Log Cursor](#geo-log-cursor).

 ## Requirements for running Geo

@@ -122,7 +124,7 @@ The following are required to run Geo:
 - PostgreSQL 11+ with [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
 - Git 2.9+
 - Git-lfs 2.4.2+ on the user side when using LFS
- All nodes must run the same GitLab version.
+- All sites must run the same GitLab version.

 Additionally, check the GitLab [minimum requirements](../../install/requirements.md),
 and we recommend you use:
@@ -132,9 +134,9 @@ and we recommend you use:

 ### Firewall rules

-The following table lists basic ports that must be open between the **primary** and **secondary** nodes for Geo.
+The following table lists basic ports that must be open between the **primary** and **secondary** sites for Geo.

-| **Primary** node | **Secondary** node | Protocol     |
+| **Primary** site | **Secondary** site | Protocol     |
 |:-----------------|:-------------------|:-------------|
 | 80               | 80                 | HTTP         |
 | 443              | 443                | TCP or HTTPS |
@@ -153,10 +155,10 @@ If you wish to terminate SSL at the GitLab application server instead, use TCP p

 ### LDAP

-We recommend that if you use LDAP on your **primary** node, you also set up secondary LDAP servers on each **secondary** node. Otherwise, users will not be able to perform Git operations over HTTP(s) on the **secondary** node using HTTP Basic Authentication. However, Git via SSH and personal access tokens will still work.
+We recommend that if you use LDAP on your **primary** site, you also set up secondary LDAP servers on each **secondary** site. Otherwise, users will not be able to perform Git operations over HTTP(s) on the **secondary** site using HTTP Basic Authentication. However, Git via SSH and personal access tokens will still work.

 NOTE:
-It is possible for all **secondary** nodes to share an LDAP server, but additional latency can be an issue. Also, consider what LDAP server will be available in a [disaster recovery](disaster_recovery/index.md) scenario if a **secondary** node is promoted to be a **primary** node.
+It is possible for all **secondary** sites to share an LDAP server, but additional latency can be an issue. Also, consider what LDAP server will be available in a [disaster recovery](disaster_recovery/index.md) scenario if a **secondary** site is promoted to be a **primary** site.

 Check for instructions on how to set up replication in your LDAP service. Instructions will be different depending on the software or service used. For example, OpenLDAP provides [these instructions](https://www.openldap.org/doc/admin24/replication.html).

@@ -168,32 +170,32 @@ The tracking database instance is used as metadata to control what needs to be u
 - Fetch new LFS Objects.
 - Fetch changes from a repository that has recently been updated.

-Because the replicated database instance is read-only, we need this additional database instance for each **secondary** node.
+Because the replicated database instance is read-only, we need this additional database instance for each **secondary** site.

 ### Geo Log Cursor

 This daemon:

- Reads a log of events replicated by the **primary** node to the **secondary** database instance.
+- Reads a log of events replicated by the **primary** site to the **secondary** database instance.
 - Updates the Geo Tracking Database instance with changes that need to be executed.

-When something is marked to be updated in the tracking database instance, asynchronous jobs running on the **secondary** node will execute the required operations and update the state.
+When something is marked to be updated in the tracking database instance, asynchronous jobs running on the **secondary** site will execute the required operations and update the state.

-This new architecture allows GitLab to be resilient to connectivity issues between the nodes. It doesn't matter how long the **secondary** node is disconnected from the **primary** node as it will be able to replay all the events in the correct order and become synchronized with the **primary** node again.
+This new architecture allows GitLab to be resilient to connectivity issues between the sites. It doesn't matter how long the **secondary** site is disconnected from the **primary** site as it will be able to replay all the events in the correct order and become synchronized with the **primary** site again.

 ## Limitations

 WARNING:
 This list of limitations only reflects the latest version of GitLab. If you are using an older version, extra limitations may be in place.

- Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/-/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected. Support for the **secondary** node to use an OAuth provider independent from the primary is [being planned](https://gitlab.com/gitlab-org/gitlab/-/issues/208465).
+- Pushing directly to a **secondary** site redirects (for HTTP) or proxies (for SSH) the request to the **primary** site instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/-/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
+- The **primary** site has to be online for OAuth login to happen. Existing sessions and Git are not affected. Support for the **secondary** site to use an OAuth provider independent from the primary is [being planned](https://gitlab.com/gitlab-org/gitlab/-/issues/208465).
 - The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [Omnibus GitLab issue #2978](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/2978) for details.
- Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** node.
- [Selective synchronization](replication/configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism.
- Object pools for forked project deduplication work only on the **primary** node, and are duplicated on the **secondary** node.
- GitLab Runners cannot register with a **secondary** node. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294).
- Geo **secondary** nodes can not be configured to [use high-availability configurations of PostgreSQL](https://gitlab.com/groups/gitlab-org/-/epics/2536).
+- Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** site.
+- [Selective synchronization](replication/configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** site in full, making it inappropriate for use as an access control mechanism.
+- Object pools for forked project deduplication work only on the **primary** site, and are duplicated on the **secondary** site.
+- GitLab Runners cannot register with a **secondary** site. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294).
+- Configuring Geo **secondary** sites to [use high-availability configurations of PostgreSQL](https://gitlab.com/groups/gitlab-org/-/epics/2536) is currently in **alpha** support.
 - [Selective synchronization](replication/configuration.md#selective-synchronization) only limits what repositories are replicated. The entire PostgreSQL data is still replicated. Selective synchronization is not built to accomodate compliance / export control use cases.

 ### Limitations on replication/verification
@@ -206,7 +208,7 @@ For setup instructions, see [Setting up Geo](setup/index.md).

 ## Post-installation documentation

-After installing GitLab on the **secondary** nodes and performing the initial configuration, see the following documentation for post-installation information.
+After installing GitLab on the **secondary** site(s) and performing the initial configuration, see the following documentation for post-installation information.

 ### Configuring Geo

@@ -214,16 +216,16 @@ For information on configuring Geo, see [Geo configuration](replication/configur

 ### Updating Geo

-For information on how to update your Geo nodes to the latest GitLab version, see [Updating the Geo nodes](replication/updating_the_geo_nodes.md).
+For information on how to update your Geo site(s) to the latest GitLab version, see [Updating the Geo sites](replication/updating_the_geo_nodes.md).

 ### Pausing and resuming replication

 > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/35913) in [GitLab Premium](https://about.gitlab.com/pricing/) 13.2.

 WARNING:
-In GitLab 13.2 and 13.3, promoting a secondary node to a primary while the
+In GitLab 13.2 and 13.3, promoting a secondary site to a primary while the
 secondary is paused fails. Do not pause replication before promoting a
-secondary. If the node is paused, be sure to resume before promoting. This
+secondary. If the site is paused, be sure to resume before promoting. This
 issue has been fixed in GitLab 13.4 and later.

 WARNING:
@@ -232,7 +234,7 @@ Omnibus GitLab-managed database. External databases are currently not supported.

 In some circumstances, like during [upgrades](replication/updating_the_geo_nodes.md) or a [planned failover](disaster_recovery/planned_failover.md), it is desirable to pause replication between the primary and secondary.

-Pausing and resuming replication is done via a command line tool from the secondary node where the `postgresql` service is enabled.
+Pausing and resuming replication is done via a command line tool from the a node in the secondary site where the `postgresql` service is enabled.

 If `postgresql` is on a standalone database node, ensure that `gitlab.rb` on that node contains the configuration line `gitlab_rails['geo_node_name'] = 'node_name'`, where `node_name` is the same as the `geo_name_name` on the application node.

@@ -262,7 +264,7 @@ For information on using Geo in disaster recovery situations to mitigate data-lo

 ### Replicating the Container Registry

-For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** node](replication/docker_registry.md).
+For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** site](replication/docker_registry.md).

 ### Security Review

@@ -278,9 +280,9 @@ For an example of how to set up a location-aware Git remote URL with AWS Route53

 ### Backfill

-Once a **secondary** node is set up, it will start replicating missing data from
-the **primary** node in a process known as **backfill**. You can monitor the
-synchronization process on each Geo node from the **primary** node's **Geo Nodes**
+Once a **secondary** site is set up, it will start replicating missing data from
+the **primary** site in a process known as **backfill**. You can monitor the
+synchronization process on each Geo site from the **primary** site's **Geo Nodes**
 dashboard in your browser.

 Failures that happen during a backfill are scheduled to be retried at the end
@@ -288,7 +290,7 @@ of the backfill.

 ## Remove Geo site

-For more information on removing a Geo node, see [Removing **secondary** Geo nodes](replication/remove_geo_site.md).
+For more information on removing a Geo site, see [Removing **secondary** Geo sites](replication/remove_geo_site.md).

 ## Disable Geo