GitLab CI/CD is one of the most data and compute intensive components features.
GitLab CI/CD is one of the most data- and compute-intensive components of GitLab.
Since its [initial release in November 2012](https://about.gitlab.com/blog/2012/11/13/continuous-integration-server-from-gitlab/),
the CI/CD subsystem has evolved significantly. It was [integrated into GitLab in September 2015](https://about.gitlab.com/releases/2015/09/22/gitlab-8-0-released/)
and has become [one of the most beloved CI/CD solutions](https://about.gitlab.com/blog/2017/09/27/gitlab-leader-continuous-integration-forrester-wave/).
...
...
@@ -22,19 +22,23 @@ we are reaching database limits that are slowing our development velocity down.
On February 1st, 2021, a billionth CI/CD job was created and the number of
builds is growing exponentially. We will run out of the available primary keys
before December 2021 unless we improve the database model used to store CI/CD
builds.
for builds before December 2021 unless we improve the database model used to
store CI/CD data.
We expect to see 20M builds created daily on GitLab.com in the first half of
2024.
![ci_builds cumulative with forecast](ci_builds_cumulative_forecast.png)
## Goals
## Goal
1. Transition primary key for `ci_builds` to 64-bit integer
1. Reduce the amount of data stored in `ci_builds` table
1. Devise a database partitioning strategy for `ci_builds` table
**Enable future growth by making processing 20M builds in a day possible.**
## Challenges
The current state of CI/CD product architecture needs to be updated if we want
to sustain future growth.
### We are running out of the capacity to store primary keys
The primary key in the `ci_builds` table is an integer generated in a sequence.
...
...
@@ -48,16 +52,21 @@ bigint yet.
We will run out of the capacity of the integer type to store primary keys in
`ci_builds` table before December 2021. When it happens without a viable
workaround, GitLab.com will go down.
workaround or an emergency plan, GitLab.com will go down.
`ci_builds` is just one of the tables that are running out of the primary keys
available in an Int4 sequence.
The primary key problem will be tackled by our Database Team.
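
The capacity limit can be illustrated with a small sketch. A signed 32-bit integer (Int4) tops out at 2,147,483,647, while a 64-bit integer (bigint) tops out at 2^63 - 1. The helper names and the growth parameters below are hypothetical and chosen only for illustration. The real sequence value and growth rate on GitLab.com are not stated here, so the sketch takes them as inputs rather than assuming them.

```python
# Illustrative sketch only: function names and parameters are assumptions,
# not part of GitLab's codebase.

INT4_MAX = 2**31 - 1  # largest value of a signed 32-bit integer: 2,147,483,647
INT8_MAX = 2**63 - 1  # largest value of a signed 64-bit integer (bigint)


def remaining_ids(current_max_id: int, limit: int = INT4_MAX) -> int:
    """Return how many primary key values are still available below `limit`."""
    return max(limit - current_max_id, 0)


def days_until_exhaustion(
    current_max_id: int,
    daily_rate: float,
    daily_growth: float = 1.0,
    limit: int = INT4_MAX,
) -> int:
    """Estimate the number of days until the sequence reaches `limit`.

    `daily_rate` is the number of ids consumed on the first day and
    `daily_growth` is a multiplicative factor applied to that rate each day
    (1.0 means constant usage, values above 1.0 model exponential growth).
    """
    if daily_rate <= 0:
        raise ValueError("daily_rate must be positive")
    remaining = float(remaining_ids(current_max_id, limit))
    rate = float(daily_rate)
    days = 0
    while remaining > 0:
        remaining -= rate
        rate *= daily_growth
        days += 1
    return days
```

For example, `remaining_ids(1_000_000_000)` reports how much Int4 headroom is left once a sequence has passed one billion. Passing `limit=INT8_MAX` shows how much larger the headroom becomes after a bigint migration.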
### The table is too large
There are more than a billion rows in the `ci_builds` table. We store more than 2
terabytes of data in that table, and the total size of indexes is more than 1
terabyte.
terabyte (as of February 2021).
This amount of data contributes to significant problems related to having
this table in our database.
This amount of data contributes to significant performance problems we
experience on our primary PostgreSQL database.
Most of the problems are related to how the PostgreSQL database works internally,
and how it makes use of resources on the node the database runs on. We are at
...
...
@@ -71,20 +80,49 @@ seem fine in the development environment may not work on GitLab.com. The
difference in the dataset size between the environments makes it difficult to
predict the performance of even the simplest queries.
### Background migrations are not reliable
We also expect significant, exponential growth in the upcoming years.
One of the forecasts done using [Facebook's
Prophet](https://facebook.github.io/prophet/) shows that in the first half of
2024 we expect to see 20M builds created on GitLab.com each day. Compared
to the roughly 2M created daily today, this is 10x growth that our product might need to