Commit 26bac1a3 authored by Marcel Amirault, committed by Evan Read

Update caching documentation

parent 3c07ce6e
@@ -23,61 +23,55 @@ how it is defined in `.gitlab-ci.yml`.
 NOTE: **Note:**
 Be careful if you use cache and artifacts to store the same path in your jobs
-as **caches are restored before artifacts** and the content would be overwritten.
+as **caches are restored before artifacts** and the content could be overwritten.

-Don't mix the caching with passing artifacts between stages. Caching is not
-designed to pass artifacts between stages. Cache is for runtime dependencies
-needed to compile the project:
+Don't use caching for passing artifacts between stages, as it is designed to store
+runtime dependencies needed to compile the project:

-- `cache`: **Use for temporary storage for project dependencies.** Not useful
-for keeping intermediate build results, like `jar` or `apk` files.
-Cache was designed to be used to speed up invocations of subsequent runs of a
-given job, by keeping things like dependencies (e.g., npm packages, Go vendor
-packages, etc.) so they don't have to be re-fetched from the public internet.
-While the cache can be abused to pass intermediate build results between
-stages, there may be cases where artifacts are a better fit.
+- `cache`: **For storing project dependencies**
+
+Caches are used to speed up runs of a given job in **subsequent pipelines**, by
+storing downloaded dependencies so that they don't have to be fetched from the
+internet again (like npm packages, Go vendor packages, etc.) While the cache could
+be configured to pass intermediate build results between stages, this should be
+done with artifacts instead.

 - `artifacts`: **Use for stage results that will be passed between stages.**
-Artifacts were designed to upload some compiled/generated bits of the build,
-and they can be fetched by any number of concurrent Runners. They are
-guaranteed to be available and are there to pass data between jobs. They are
-also exposed to be downloaded from the UI. **Artifacts can only exist in
-directories relative to the build directory** and specifying paths which don't
-comply to this rule trigger an unintuitive and illogical error message (an
-enhancement is discussed at
-[https://gitlab.com/gitlab-org/gitlab-foss/issues/15530](https://gitlab.com/gitlab-org/gitlab-foss/issues/15530)
-). Artifacts need to be uploaded to the GitLab instance (not only the GitLab
-runner) before the next stage job(s) can start, so you need to evaluate
-carefully whether your bandwidth allows you to profit from parallelization
-with stages and shared artifacts before investing time in changes to the
-setup.
+
+Artifacts are files generated by a job which are stored and uploaded, and can then
+be fetched and used by jobs in later stages of the **same pipeline**. This data
+will not be available in different pipelines, but is available to be downloaded
+from the UI.
+
+The name `artifacts` sounds like it's only useful outside of the job, like for downloading
+a final image, but artifacts are also available in later stages within a pipeline.
+So if you build your application by downloading all the required modules, you might
+want to declare them as artifacts so that subsequent stages can use them. There are
+some optimizations like declaring an [expiry time](../yaml/README.md#artifactsexpire_in)
+so you don't keep artifacts around too long, or using [dependencies](../yaml/README.md#dependencies)
+to control which jobs fetch the artifacts.

-It's sometimes confusing because the name artifact sounds like something that
-is only useful outside of the job, like for downloading a final image. But
-artifacts are also available in between stages within a pipeline. So if you
-build your application by downloading all the required modules, you might want
-to declare them as artifacts so that each subsequent stage can depend on them
-being there. There are some optimizations like declaring an
-[expiry time](../yaml/README.md#artifactsexpire_in) so you don't keep artifacts
-around too long, and using [dependencies](../yaml/README.md#dependencies) to
-control exactly where artifacts are passed around.
+Caches:
+
+- Are disabled if not defined globally or per job (using `cache:`).
+- Are available for all jobs in your `.gitlab-ci.yml` if enabled globally.
+- Can be used in subsequent pipelines by the same job in which the cache was created (if not defined globally).
+- Are stored where the Runner is installed **and** uploaded to S3 if [distributed cache is enabled](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching).
+- If defined per job, are used:
+  - By the same job in a subsequent pipeline.
+  - By subsequent jobs in the same pipeline, if the they have identical dependencies.

-In summary:
+Artifacts:

-- Caches are disabled if not defined globally or per job (using `cache:`).
-- Caches are available for all jobs in your `.gitlab-ci.yml` if enabled globally.
-- Caches can be used by subsequent pipelines of that same job (a script in
-a stage) in which the cache was created (if not defined globally).
-- Caches are stored where the Runner is installed **and** uploaded to S3 if
-[distributed cache is enabled](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching).
-- Caches defined per job are only used, either:
-  - For the next pipeline of that job.
-  - If that same cache is also defined in a subsequent job of the same pipeline.
-- Artifacts are disabled if not defined per job (using `artifacts:`).
-- Artifacts can only be enabled per job, not globally.
-- Artifacts are created during a pipeline and can be used by the subsequent
-jobs of that currently active pipeline.
-- Artifacts are always uploaded to GitLab (known as coordinator).
-- Artifacts can have an expiration value for controlling disk usage (30 days by default).
+- Are disabled if not defined per job (using `artifacts:`).
+- Can only be enabled per job, not globally.
+- Are created during a pipeline and can be used by the subsequent jobs of that currently active pipeline.
+- Are always uploaded to GitLab (known as coordinator).
+- Can have an expiration value for controlling disk usage (30 days by default).
+
+NOTE: **Note:**
+Both artifacts and caches define their paths relative to the project directory, and
+can't link to files outside it.

 ## Good caching practices
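
For illustration, the cache/artifacts split described in this hunk might look roughly like the following in a `.gitlab-ci.yml`. This is only a sketch: the job names, the npm commands, and the `node_modules/` and `dist/` paths are assumptions and do not come from the commit.

```yaml
# Hypothetical pipeline: job names, commands and paths are illustrative assumptions.
stages:
  - build
  - test

build:
  stage: build
  script:
    - npm install
    - npm run build          # assumes the project defines a "build" script
  cache:
    # Downloaded dependencies, reused by this job in subsequent pipelines.
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
  artifacts:
    # Build output, passed to later stages of the same pipeline.
    paths:
      - dist/
    expire_in: 1 week

test:
  stage: test
  # Fetch artifacts only from the build job.
  dependencies:
    - build
  script:
    - ls dist/
```

The cache and the artifacts deliberately use different paths, in line with the note above about caches being restored before artifacts.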
@@ -1514,8 +1514,10 @@ globally and all jobs will use that definition.
 #### `cache:paths`

-Use the `paths` directive to choose which files or directories will be cached. You can only specify paths within your `$CI_PROJECT_DIR`.
-Wildcards can be used that follow the [glob](https://en.wikipedia.org/wiki/Glob_(programming)) patterns and [filepath.Match](https://golang.org/pkg/path/filepath/#Match).
+Use the `paths` directive to choose which files or directories will be cached. Paths
+are relative to the project directory (`$CI_PROJECT_DIR`) and cannot directly link outside it.
+Wildcards can be used that follow the [glob](https://en.wikipedia.org/wiki/Glob_(programming))
+patterns and [filepath.Match](https://golang.org/pkg/path/filepath/#Match).

 Cache all files in `binaries` that end in `.apk` and the `.config` file:
@@ -1744,8 +1746,9 @@ be available for download in the GitLab UI.
 #### `artifacts:paths`

-You can only use paths that are within the local working copy.
-Wildcards can be used that follow the [glob](https://en.wikipedia.org/wiki/Glob_(programming)) patterns and [filepath.Match](https://golang.org/pkg/path/filepath/#Match).
+Paths are relative to the project directory (`$CI_PROJECT_DIR`) and cannot directly
+link outside it. Wildcards can be used that follow the [glob](https://en.wikipedia.org/wiki/Glob_(programming))
+patterns and [filepath.Match](https://golang.org/pkg/path/filepath/#Match).

 To restrict which jobs a specific job will fetch artifacts from, see [dependencies](#dependencies).
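
As a rough illustration of the path rule stated in both `paths` hunks, a job might declare project-relative paths and wildcards as below. The job name, the `make package` command, and the `vendor/` directory are hypothetical; `binaries/*.apk` and `.config` echo the example wording in the `cache:paths` hunk.

```yaml
# Hypothetical job: the name, command and vendor/ path are illustrative assumptions.
package:
  script:
    - make package
  cache:
    paths:
      - vendor/            # resolved relative to $CI_PROJECT_DIR
  artifacts:
    paths:
      - binaries/*.apk     # wildcard, matched relative to the project directory
      - .config
```

By the wording of the hunks, a path that points outside the project directory (for example an absolute path elsewhere on the Runner) would not be usable in either list.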