Commit 569ede33 authored by Evan Read

Merge branch 'implement-dag-docs' into 'master'

Docs for DAG based on `needs:`

See merge request gitlab-org/gitlab-ce!31337
parents f4ce990b 23c9fe8f
---
type: reference
---
# Directed Acyclic Graph
> [Introduced](https://gitlab.com/gitlab-org/gitlab-ce/issues/47063) in GitLab 12.2 (enabled by `ci_dag_support` feature flag).
A [directed acyclic graph](https://www.techopedia.com/definition/5739/directed-acyclic-graph-dag) can be
used in the context of a CI/CD pipeline to build relationships between jobs such that
execution is performed in the quickest possible manner, regardless of how stages may
be set up.
For example, you may have a specific tool or separate website that is built
as part of your main project. Using a DAG, you can specify the relationship between
these jobs and GitLab will then execute the jobs as soon as possible instead of waiting
for each stage to complete.
Unlike other DAG solutions for CI/CD, GitLab does not require you to choose one or the
other. You can implement a hybrid combination of DAG and traditional
stage-based operation within a single pipeline. Configuration is kept very simple,
requiring a single keyword to enable the feature for any job.
Consider a monorepo as follows:
```
./service_a
./service_b
./service_c
./service_d
```
It has a pipeline that looks like the following:
| build | test | deploy |
| ----- | ---- | ------ |
| build_a | test_a | deploy_a |
| build_b | test_b | deploy_b |
| build_c | test_c | deploy_c |
| build_d | test_d | deploy_d |
Using a DAG, you can relate the `_a` jobs to each other separately from the `_b` jobs,
and even if service `a` takes a very long time to build, service `b` will not
wait for it and will finish as quickly as it can. In this very same pipeline, `_c` and
`_d` can be left alone and will run together in staged sequence just like any normal
GitLab pipeline.
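
As a minimal sketch of the hybrid pipeline described above, the fragment below relates the `_a` jobs to each other with `needs:` and leaves the `_c` jobs on plain stage ordering. The `script:` commands are placeholders invented for illustration; service `b` would follow the same pattern as `a`, and `d` the same pattern as `c`.

```yaml
stages:
  - build
  - test
  - deploy

# Service a: each job starts as soon as the job it needs has finished.
build_a:
  stage: build
  # Placeholder command, not part of the original docs.
  script: echo "build service_a"

test_a:
  stage: test
  needs: [build_a]
  script: echo "test service_a"

deploy_a:
  stage: deploy
  needs: [test_a]
  script: echo "deploy service_a"

# Service c: no `needs:`, so these jobs wait for the whole previous stage.
build_c:
  stage: build
  script: echo "build service_c"

test_c:
  stage: test
  script: echo "test service_c"

deploy_c:
  stage: deploy
  script: echo "deploy service_c"
```

In this sketch, `test_a` does not wait for `build_c`, while `test_c` still waits for every job in the `build` stage.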
## Use cases
A DAG can help solve several different kinds of relationships between jobs within
a CI/CD pipeline. Most typically this would cover when jobs need to fan in or out,
and/or merge back together (diamond dependencies). This can happen when you're
handling multi-platform builds or complex webs of dependencies as in something like
an operating system build or a complex deployment graph of independently deployable
but related microservices.
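
A diamond dependency of the kind mentioned above could be sketched as follows. All job names and `script:` commands here are hypothetical: `compile` fans out to two test jobs, which fan back in to `package_app`, and none of them wait for the unrelated `build_docs` job.

```yaml
stages:
  - build
  - test
  - package

compile:
  stage: build
  script: echo "compile the application"

# Unrelated job in the same stage; nothing below waits for it.
build_docs:
  stage: build
  script: echo "build the documentation"

# Fan out: both test jobs need only `compile`.
unit_tests:
  stage: test
  needs: [compile]
  script: echo "run unit tests"

integration_tests:
  stage: test
  needs: [compile]
  script: echo "run integration tests"

# Fan in: starts once both test jobs have finished.
package_app:
  stage: package
  needs: [unit_tests, integration_tests]
  script: echo "package the application"
```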
Additionally, a DAG can help pipelines run faster and deliver feedback sooner.
By creating dependency relationships that don't unnecessarily
block each other, your pipelines run as quickly as possible regardless of
pipeline stages, so that output (including errors) is available to developers
as early as possible.
## Usage
Relationships are defined between jobs using the [`needs:` keyword](../yaml/README.md#needs).
Note that `needs:` also works with the [parallel](../yaml/README.md#parallel) keyword,
giving you powerful options for parallelization within your pipeline.
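
As an illustrative sketch of combining the two keywords, the fragment below splits a hypothetical `rspec` job into three parallel instances and has `coverage_report` need it. Job names and commands are assumptions made for this example.

```yaml
stages:
  - test
  - report

rspec:
  stage: test
  # Creates three instances of this job.
  parallel: 3
  script: echo "run one slice of the test suite"

# Unrelated slow job; `coverage_report` does not wait for it.
slow_e2e:
  stage: test
  script: echo "run end-to-end tests"

coverage_report:
  stage: report
  # Depends on all parallel instances of `rspec`.
  needs: [rspec]
  script: echo "merge coverage results"
```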
## Limitations
A directed acyclic graph is a complicated feature, and as of the initial MVC there
are certain use cases that you may need to work around. For more information:
- [`needs` requirements and limitations](../yaml/README.md#requirements-and-limitations).
- Related epic [gitlab-org#1716](https://gitlab.com/groups/gitlab-org/-/epics/1716).
@@ -1665,6 +1665,84 @@ You can ask your administrator to
[flip this switch](../../administration/job_artifacts.md#validation-for-dependencies)
and bring back the old behavior.
### `needs`
> Introduced in GitLab 12.2.
The `needs:` keyword enables executing jobs out-of-order, allowing you to implement
a [directed acyclic graph](../directed_acyclic_graph/index.md) in your `.gitlab-ci.yml`.
This lets you run some jobs without waiting for other ones, disregarding stage ordering
so you can have multiple stages running concurrently.
Let's consider the following example:
```yaml
linux:build:
  stage: build

mac:build:
  stage: build

linux:rspec:
  stage: test
  needs: [linux:build]

linux:rubocop:
  stage: test
  needs: [linux:build]

mac:rspec:
  stage: test
  needs: [mac:build]

mac:rubocop:
  stage: test
  needs: [mac:build]

production:
  stage: deploy
```
This example creates three paths of execution:
- Linux path: the `linux:rspec` and `linux:rubocop` jobs will be run as soon
  as the `linux:build` job finishes, without waiting for `mac:build` to finish.
- macOS path: the `mac:rspec` and `mac:rubocop` jobs will be run as soon
  as the `mac:build` job finishes, without waiting for `linux:build` to finish.
- The `production` job will be executed as soon as all previous jobs
  finish; in this case: `linux:build`, `linux:rspec`, `linux:rubocop`,
  `mac:build`, `mac:rspec`, `mac:rubocop`.
#### Requirements and limitations
1. If `needs:` is set to point to a job that is not instantiated
   because of `only/except` rules or otherwise does not exist, the
   job will fail.
1. Note that on day one of the launch, we are temporarily limiting the
   maximum number of jobs that a single job can need in the `needs:` array. Track
   our [infrastructure issue](https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7541)
   for details on the current limit.
1. If you use `dependencies:` with `needs:`, it's important that you
   do not mark a job as having a dependency on something that won't
   have been run at the time it needs it. It's better to use both
   keywords in this case so that GitLab handles the ordering appropriately.
1. It is currently impossible to have `needs: []` (empty needs).
   A job must always depend on something, unless it is a job
   in the first stage (see [gitlab-ce#65504](https://gitlab.com/gitlab-org/gitlab-ce/issues/65504)).
1. If `needs:` refers to a job that is marked as `parallel:`,
   the current job will depend on all parallel jobs created.
1. `needs:` is similar to `dependencies:` in that it needs to use jobs from
   prior stages. This means that it is impossible to create circular
   dependencies or depend on jobs in the current stage (see [gitlab-ce#65505](https://gitlab.com/gitlab-org/gitlab-ce/issues/65505)).
1. Related to the above, stages must be explicitly defined for all jobs
   that have the keyword `needs:` or are referred to by one.
1. For self-managed users, the feature must be turned on using the `ci_dag_support`
   feature flag. The `ci_dag_limit_needs` option, if set, will limit the number of
   jobs that a single job can need to `5`. If unset, the limit is `50`.
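
As a rough sketch of the advice above on combining `dependencies:` with `needs:`, the fragment below lists the same job under both keywords, so the job whose artifacts are requested is also one the current job waits for. The job names, artifact path, and commands are hypothetical.

```yaml
stages:
  - build
  - test

build_a:
  stage: build
  script: echo "build service_a"
  artifacts:
    paths:
      # Hypothetical output directory.
      - service_a/dist/

test_a:
  stage: test
  # Wait for `build_a` only, not for the whole `build` stage.
  needs: [build_a]
  # Fetch artifacts from the same job that `needs:` waits for.
  dependencies: [build_a]
  script: echo "test service_a"
```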
### `coverage`
> [Introduced][ce-7447] in GitLab 8.17.
......