Commit 3c9766d6 authored by Jacob Vosmaer

Document that gitlab-org/gitlab no longer uses CI_PRE_CLONE_SCRIPT

This updates the developer documentation to reflect the fact that we
stopped using CI_PRE_CLONE_SCRIPT to make the gitlab-org/gitlab CI Git
fetch workload manageable.
parent a09edca1
@@ -250,12 +250,15 @@ concurrent = 4
 This makes the cloning configuration to be part of the given runner
 and does not require us to update each `.gitlab-ci.yml`.
 
-## Pre-clone step
+## Git fetch caching or pre-clone step
 
-> [An issue exists](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/463) to remove the need for this optimization.
+For very active repositories with a large number of references and files, you can either (or both):
 
-For very active repositories with a large number of references and files, you can also
-optimize your CI jobs by seeding repository data with GitLab Runner's [`pre_clone_script`](https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-section).
-
-See [our development documentation](../../development/pipelines.md#pre-clone-step) for
-an overview of how we implemented this approach on GitLab.com for the main GitLab repository.
+- Consider using the [Gitaly pack-objects cache](../../administration/gitaly/configure_gitaly.md#pack-objects-cache) instead of a
+  pre-clone step. This is easier to set up and it benefits all repositories on your GitLab server, unlike the pre-clone step that
+  must be configured per-repository. The pack-objects cache also automatically works for forks. For `gitlab-org/gitlab` development
+  on GitLab.com, we stopped using a pre-clone step.
+- Optimize your CI/CD jobs by seeding repository data in a pre-clone step with the
+  [`pre_clone_script`](https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-section) of GitLab Runner. See our
+  [development documentation](../../development/pipelines.md#pre-clone-step) for an overview of how we used to implement this approach on
+  GitLab.com for the main GitLab repository.
@@ -791,19 +791,30 @@ request, be sure to start the `dont-interrupt-me` job before pushing.
 
 We limit the artifacts that are saved and retrieved by jobs to the minimum in order to reduce the upload/download time and costs, as well as the artifacts storage.
 
-### Pre-clone step
+### Git fetch caching
 
-The `gitlab-org/gitlab` project on GitLab.com uses a [pre-clone step](https://gitlab.com/gitlab-org/gitlab/-/issues/39134)
-to seed the project with a recent archive of the repository. This is done for
-several reasons:
+Because GitLab.com uses the [pack-objects cache](../administration/gitaly/configure_gitaly.md#pack-objects-cache),
+concurrent Git fetches of the same pipeline ref are deduplicated on
+the Gitaly server (always) and served from cache (when available).
+
+This works well for the following reasons:
 
-- It speeds up builds because a 800 MB download only takes seconds, as opposed to a full Git clone.
-- It significantly reduces load on the file server, as smaller deltas mean less time spent in `git pack-objects`.
+- The pack-objects cache is enabled on all Gitaly servers on GitLab.com.
+- The CI/CD [Git strategy setting](../ci/pipelines/settings.md#choose-the-default-git-strategy) for `gitlab-org/gitlab` is **Git clone**,
+  causing all jobs to fetch the same data, which maximizes the cache hit ratio.
+- We use [shallow clone](../ci/pipelines/settings.md#limit-the-number-of-changes-fetched-during-clone) to avoid downloading the full Git
+  history for every job.
+
+### Pre-clone step
+
+NOTE:
+We no longer use this optimization for `gitlab-org/gitlab` because the [pack-objects cache](../administration/gitaly/configure_gitaly.md#pack-objects-cache)
+allows Gitaly to serve the full CI/CD fetch traffic now. See [Git fetch caching](#git-fetch-caching).
 
 The pre-clone step works by using the `CI_PRE_CLONE_SCRIPT` variable
 [defined by GitLab.com shared runners](../ci/runners/build_cloud/linux_build_cloud.md#pre-clone-script).
 
-The `CI_PRE_CLONE_SCRIPT` is currently defined as a project CI/CD variable:
+The `CI_PRE_CLONE_SCRIPT` is defined as a project CI/CD variable:
 
 ```shell
 (
......