nexedi / gitlab-ce
Commit 5e739f02, authored Sep 13, 2017 by Gabriel Mazetto
Updated Geo documentation and added Log Cursor and Tracking Database details
Parent: fd326fc3

Showing 2 changed files with 34 additions and 6 deletions:
doc/gitlab-geo/README.md (+34 -6)
doc/gitlab-geo/img/geo-architecture.png (+0 -0)
doc/gitlab-geo/README.md
...
...
@@ -32,14 +32,15 @@ and the replicated read-only ones as **secondaries**.
 Keep in mind that:

-- Secondaries talk to primary to get user data for logins (API), and to
-  clone/pull from repositories (HTTP(S)/SSH).
-- Primary talks to secondaries to notify for changes (API).
+- Secondaries talk to primary to get user data for logins (API), to
+  clone/pull from repositories (SSH) and to retrieve LFS Objects and Attachments
+  (HTTPS + JWT).
+- Since 10.0 Primary no longer talks to secondaries to notify for changes (API).

 ## Use-cases

 - Can be used for cloning and fetching projects, in addition
-  to reading any data
+  to reading any data
   available in the GitLab web interface
 - Overcomes slow connection between distant offices, saving time by
   improving speed for distributed teams
 - Helps reducing the loading time for automated tasks,
...
...
@@ -51,11 +52,12 @@ The following diagram illustrates the underlying architecture of GitLab Geo:
 ![GitLab Geo architecture](img/geo-architecture.png)

-[Source diagram](https://docs.google.com/drawings/d/1VQIcj6jyE3idWKyt9MRUAaE3XXrkwx8g-Ne4pmURmwI/edit)
+[Source diagram](https://docs.google.com/drawings/d/1L44flo2Mxng928yAcHduaCJyGtKNEjk2WQkxaCU_cT8/edit)

 In this diagram, there is one Geo primary node and one secondary. The
 secondary clones repositories via git over SSH. Attachments, LFS objects, and
-other files are downloaded via HTTPS using a GitLab API to authenticate.
+other files are downloaded via HTTPS using the GitLab API to authenticate,
+with a special endpoint protected by JWT.

 Writes to the database and Git repositories can only be performed on the Geo
 primary node. The secondary node receives database updates via PostgreSQL
...
...
@@ -65,6 +67,8 @@ Note that the secondary needs two different PostgreSQL databases: a read-only
 instance that streams data from the main GitLab database and another used
 internally by the secondary node to record what data has been replicated.

+In the secondary nodes there is an additional daemon: Geo Log Cursor.
+
 ### LDAP

 We recommend that if you use LDAP on your primary that you also set up a
...
...
@@ -77,6 +81,30 @@ Check with your LDAP provider for instructions on how to set up
 replication. For example, OpenLDAP provides [these
 instructions](https://www.openldap.org/doc/admin24/replication.html).

+### Geo Tracking Database
+
+We use the tracking database as metadata to control what needs to be
+updated on the disk of the local instance (for example, download new assets,
+fetch new LFS Objects or fetch changes from a repository that has recently been
+updated).
+
+Because the replicated instance is read-only we need this additional instance
+per secondary location.
+
+### Geo Log Cursor
+
+This daemon reads a log of events replicated by the primary node to the secondary
+database and updates the Geo Tracking Database with changes that need to be
+executed.
+
+When something is marked to be updated in the tracking database, asynchronous
+jobs running on the secondary node will execute the required operations and
+update the state.
+
+This new architecture allows us to be resilient to connectivity issues between the
+nodes. It doesn't matter if it was just a few minutes or days. The secondary
+instance will be able to replay all the events and get in sync again.
+
 ## Setup instructions

 In order to set up one or more GitLab Geo instances, follow the steps below in
...
...
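The Geo Log Cursor text added in the README diff above describes an event-replay pattern: a daemon reads replicated events, marks work in the tracking database, and asynchronous jobs carry it out. The Python sketch below illustrates that flow under stated assumptions; it is not GitLab's implementation. The names `Event`, `TrackingDB`, `run_cursor`, and `run_sync_jobs` are hypothetical, and in-memory structures stand in for the two PostgreSQL databases.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One entry in the event log replicated from the primary (hypothetical shape)."""
    id: int
    kind: str      # e.g. "repository_updated" (illustrative value)
    resource: str  # identifier of the affected repository or file


@dataclass
class TrackingDB:
    """In-memory stand-in for the secondary's writable tracking database."""
    last_event_id: int = 0
    state: dict = field(default_factory=dict)  # resource -> "pending" | "synced"


def run_cursor(event_log, tracking):
    """Mark every resource touched by a not-yet-processed event as pending.

    Returns the number of events consumed in this pass.
    """
    new_events = [e for e in event_log if e.id > tracking.last_event_id]
    for event in sorted(new_events, key=lambda e: e.id):
        tracking.state[event.resource] = "pending"
        tracking.last_event_id = event.id
    return len(new_events)


def run_sync_jobs(tracking, fetch):
    """Stand-in for the asynchronous jobs: fetch pending resources, then update state."""
    synced = []
    for resource, status in list(tracking.state.items()):
        if status == "pending":
            fetch(resource)
            tracking.state[resource] = "synced"
            synced.append(resource)
    return synced


if __name__ == "__main__":
    log = [Event(1, "repository_updated", "group/project-a")]
    db = TrackingDB()
    run_cursor(log, db)
    run_sync_jobs(db, fetch=lambda resource: None)

    # Simulate an outage: events pile up, then one cursor pass replays them all.
    log.append(Event(2, "lfs_object_added", "lfs/abc"))
    log.append(Event(3, "repository_updated", "group/project-a"))
    run_cursor(log, db)
    print(db.state)
```

Because the sketch stores only the last processed event id, the length of the outage does not matter: the next pass picks up every event with a higher id, which mirrors the resilience claim in the added README text.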
doc/gitlab-geo/img/geo-architecture.png
Binary file replaced: 64.8 KB at fd326fc3, 59.3 KB at 5e739f02.