Commit 8e81d12e authored by Marcia Ramos

Merge branch 'ash2k/remove-agent-dev-docs' into 'master'

Remove agent dev docs

See merge request gitlab-org/gitlab!62433
parents 398ddd20 50aee24e
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/gitops.md'
remove_date: '2022-06-24'
---
# GitOps with the Kubernetes Agent **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/gitops.md).
The [GitLab Kubernetes Agent](../../user/clusters/agent/index.md) supports the
[pull-based version](https://www.gitops.tech/#pull-based-deployments) of
[GitOps](https://www.gitops.tech/). To be useful, the feature must be able to perform these tasks:
- Connect one or more Kubernetes clusters to a GitLab project or group.
- Synchronize cluster-wide state from a Git repository.
- Synchronize namespace-scoped state from a Git repository.
- Control the following settings:
  - The kinds of objects an agent can manage.
  - Enabling the namespaced mode of operation for managing objects only in a specific namespace.
  - Enabling the non-namespaced mode of operation for managing objects in any namespace, and
    managing non-namespaced objects.
- Synchronize state from one or more Git repositories into a cluster.
- Configure multiple agents running in different clusters to synchronize state
from the same repository.
## GitOps architecture
In this architecture, `agentk`, running in the Kubernetes cluster, periodically fetches
configuration from `kas` and spawns a goroutine for each configured GitOps
repository. Each goroutine makes a streaming `GetObjectsToSynchronize()` gRPC call.
`kas` accepts these requests, then checks if this agent is authorized to access
this GitLab repository. If authorized, `kas` polls Gitaly for repository updates
and sends the latest manifests to the agent.
Before each poll, `kas` verifies with GitLab that the agent's token is still valid.
When `agentk` receives an updated manifest, it performs a synchronization using
[`gitops-engine`](https://github.com/argoproj/gitops-engine).
If a repository is removed from the list, `agentk` stops the `GetObjectsToSynchronize()`
calls to that repository.
```mermaid
graph TB
agentk -- fetch configuration --> kas
agentk -- fetch GitOps manifests --> kas
subgraph "GitLab"
kas[kas]
GitLabRoR[GitLab RoR]
Gitaly[Gitaly]
kas -- poll GitOps repositories --> Gitaly
kas -- authZ for agentk --> GitLabRoR
kas -- fetch configuration --> Gitaly
end
subgraph "Kubernetes cluster"
agentk[agentk]
end
```
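The per-repository lifecycle described above (one worker per configured repository, stopped when the repository leaves the list) can be sketched as follows. This is a minimal illustration in Python, not the `agentk` implementation: threads stand in for goroutines, and `RepoWorkers`, `sync_once`, and `poll_interval` are hypothetical names. The loop is a stand-in for the streaming `GetObjectsToSynchronize()` call.

```python
import threading


class RepoWorkers:
    """Keeps one worker thread per configured GitOps repository."""

    def __init__(self, sync_once, poll_interval=0.01):
        self._sync_once = sync_once        # hypothetical callback: sync_once(repo_id)
        self._poll_interval = poll_interval
        self._workers = {}                 # repo_id -> (thread, stop_event)

    def apply_configuration(self, repo_ids):
        """Start workers for new repositories and stop workers for removed ones."""
        desired = set(repo_ids)
        for repo_id in list(self._workers):
            if repo_id not in desired:
                thread, stop = self._workers.pop(repo_id)
                stop.set()
                thread.join()
        for repo_id in desired:
            if repo_id not in self._workers:
                stop = threading.Event()
                thread = threading.Thread(
                    target=self._run, args=(repo_id, stop), daemon=True
                )
                self._workers[repo_id] = (thread, stop)
                thread.start()

    def _run(self, repo_id, stop):
        # Stand-in for the streaming call: synchronize until told to stop.
        while True:
            self._sync_once(repo_id)
            if stop.wait(self._poll_interval):
                break

    def active(self):
        """Return the repositories that currently have a worker."""
        return sorted(self._workers)
```

Calling `apply_configuration()` again with a shorter list stops the workers for the repositories that were dropped and leaves the others running.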
## Architecture considered but not implemented
As part of the implementation process, this architecture was considered, but ultimately
not implemented.
In this architecture, `agentk` periodically fetches configuration from `kas`. For each
configured GitOps repository, it spawns a goroutine. Each goroutine then spawns a
copy of [`git-sync`](https://github.com/kubernetes/git-sync). It polls a particular
repository and invokes a corresponding webhook on `agentk` when it changes. When that
happens, `agentk` performs a synchronization using
[`gitops-engine`](https://github.com/argoproj/gitops-engine).
For repositories no longer in the list, `agentk` stops corresponding goroutines
and `git-sync` copies, also deleting their cloned repositories from disk:
```mermaid
graph TB
agentk -- fetch configuration --> kas
git-sync -- poll GitOps repositories --> GitLabRoR
subgraph "GitLab"
kas[kas]
GitLabRoR[GitLab RoR]
kas -- authZ for agentk --> GitLabRoR
kas -- fetch configuration --> Gitaly[Gitaly]
end
subgraph "Kubernetes cluster"
agentk[agentk]
git-sync[git-sync]
agentk -- control --> git-sync
git-sync -- notify about changes --> agentk
end
```
## Comparing implemented and non-implemented architectures
Both architectures attempt to answer the same question: how to grant an agent
access to a non-public repository?
In the **implemented** architecture:
- Favorable: Fewer moving parts, as `git-sync` and `git` are not used, making this
design more reliable.
- Favorable: Uses existing connectivity and authentication mechanisms (gRPC + `agentk` token).
- Favorable: No polling through external infrastructure. Saves traffic and avoids
noise in access logs.
In the **unimplemented** architecture:
- Favorable: `agentk` uses `git-sync` to access repositories with standard protocols
(either HTTPS, or SSH and Git) with accepted authentication and authorization methods.
- Unfavorable: The user must put credentials into a `secret`. GitLab doesn't have
a mechanism for per-repository tokens for robots.
- Unfavorable: Rotating all credentials is more work than rotating a single `agentk` token.
- Unfavorable: A dependency on an external component (`git-sync`) that can be avoided.
- Unfavorable: More network traffic and connections than the implemented design.
### Ideas considered for the unimplemented design
As part of the design process, these ideas were considered, and discarded:
- Running `git-sync` and `gitops-engine` as part of `kas`.
  - Favorable: More code and infrastructure under our control for GitLab.com.
  - Unfavorable: Running an arbitrary number of `git-sync` processes would require
    an unbounded amount of RAM and disk space.
  - Unfavorable: Unclear which `kas` replica is responsible for which agent and
    repository synchronization. If done as part of `agentk`, leader election can be
    done using [client-go](https://pkg.go.dev/k8s.io/client-go/tools/leaderelection?tab=doc).
- Running `git-sync` and a "`gitops-engine` driver" helper program as a separate
  Kubernetes `Deployment`.
  - Favorable: Better isolation and higher resiliency. For example, if the node
    with `agentk` dies, not all synchronization stops.
  - Favorable: Each deployment has its own memory and disk limits.
  - Favorable: Per-repository synchronization identity (distinct `ServiceAccount`)
    can be implemented.
  - Unfavorable: Time consuming to implement properly:
    - Each `Deployment` needs CRUD (create, read, update, and delete) permissions.
    - Users may want to customize a `Deployment`, or add and remove satellite objects
      like `PodDisruptionBudget`, `HorizontalPodAutoscaler`, and `PodSecurityPolicy`.
    - Metrics, monitoring, logs for the `Deployment`.
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/identity_and_auth.md'
remove_date: '2022-06-24'
---
# Kubernetes Agent identity and authentication **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/identity_and_auth.md).
This page uses the word `agent` to describe the concept of the
GitLab Kubernetes Agent. The program that implements the concept is called `agentk`.
Read the
[architecture page](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/architecture.md)
for more information.
## Agent identity and name
In a GitLab installation, each agent must have a unique, immutable name. This
name must be unique in the project the agent is attached to, and this name must
follow the [DNS label standard from RFC 1123](https://tools.ietf.org/html/rfc1123).
The name must:
- Contain at most 63 characters.
- Contain only lowercase alphanumeric characters or `-`.
- Start with an alphanumeric character.
- End with an alphanumeric character.
Kubernetes uses the
[same naming restriction](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-label-names)
for some names.
The regex for names is: `/\A[a-z0-9]([-a-z0-9]*[a-z0-9])?\z/`.
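As an illustration, the naming rules above can be checked with a small helper. This is a sketch in Python rather than the validation code GitLab uses; `is_valid_agent_name` and the constant names are hypothetical. The pattern is the one given above, with `fullmatch()` providing the anchoring that `\A` and `\z` provide in the Ruby regex.

```python
import re

# Same pattern as the regex above; fullmatch() anchors it at both ends.
AGENT_NAME_RE = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_AGENT_NAME_LENGTH = 63


def is_valid_agent_name(name: str) -> bool:
    """Return True if name satisfies the DNS label rules listed above."""
    if len(name) > MAX_AGENT_NAME_LENGTH:
        return False
    return AGENT_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_agent_name("my-agent-1")` is true, while names such as `-agent`, `agent-`, or `My-Agent` are rejected.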
## Multiple agents in a cluster
A Kubernetes cluster may have 0 or more agents running in it. Each agent likely
has a different configuration. Some may enable features A and B, and some may
enable features B and C. This flexibility enables different groups of people to
use different features of the agent in the same cluster.
For example, [Priyanka (Platform Engineer)](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#priyanka-platform-engineer)
may want to use cluster-wide features of the agent, while
[Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#sasha-software-developer)
uses the agent that only has access to a particular namespace.
Each agent is likely running using a
[`ServiceAccount`](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/),
a distinct Kubernetes identity, with a distinct set of permissions attached to it.
These permissions enable the agent administrator to follow the
[principle of least privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege)
and minimize the permissions each particular agent needs.
## Kubernetes Agent authentication
When adding a new agent, GitLab provides the user with a bearer access token. The
agent uses this token to authenticate with GitLab. This token is a random string
and does not encode any information in it, but it is secret and must
be treated with care. Store it as a `Secret` in Kubernetes.
Each agent can have 0 or more tokens in a GitLab database. Having several valid
tokens helps you rotate tokens without needing to re-register an agent. Each token
record in the database has the following fields:
- Agent identity it belongs to.
- Token value. Encrypted at rest.
- Creation time.
- Who created it.
- Revocation flag to mark token as revoked.
- Revocation time.
- Who revoked it.
- A text field to store any comments the administrator may want to make about the token for future reference.
Tokens can be managed by users with the `maintainer` or a higher level of
[permissions](../../user/permissions.md).
Tokens are immutable, except for the following fields:
- Revocation flag. Can only be updated to `true` once, but immutable after that.
- Revocation time. Set to the current time when revocation flag is set, but immutable after that.
- Comments field. Can be updated any number of times, including after the token has been revoked.
The agent sends its token, along with each request, to GitLab to authenticate itself.
For each request, GitLab checks the token's validity:
- Does the token exist in the database?
- Has the token been revoked?
This information may be cached for some time to reduce load on the database.
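The revocation rules and the cached validity check can be sketched as follows. This is an illustrative Python model, not GitLab's implementation: `TokenRecord`, `TokenChecker`, the `ttl` parameter, and the dictionary keyed by the plain token value are all assumptions made for the example (the text above says token values are encrypted at rest).

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class TokenRecord:
    agent_id: int
    revoked: bool = False
    revoked_at: Optional[float] = None
    comment: str = ""

    def revoke(self, now: float) -> None:
        """Set the revocation flag and time once; later calls change nothing."""
        if not self.revoked:
            self.revoked = True
            self.revoked_at = now


class TokenChecker:
    """Checks tokens against a store and caches each answer for ttl seconds."""

    def __init__(self, store: Dict[str, TokenRecord], ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self._store = store
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}  # token -> (valid, expires_at)

    def is_valid(self, token: str) -> bool:
        now = self._clock()
        cached = self._cache.get(token)
        if cached is not None and cached[1] > now:
            return cached[0]
        record = self._store.get(token)
        # Valid only if the token exists and has not been revoked.
        valid = record is not None and not record.revoked
        self._cache[token] = (valid, now + self._ttl)
        return valid
```

The sketch also shows the trade-off of caching: a revoked token can still be reported as valid until its cache entry expires.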
## Kubernetes Agent authorization
GitLab provides the following information in its response for a given Agent access token:
- Agent configuration Git repository. (The agent doesn't support per-folder authorization.)
- Agent name.
## Create an agent
You can create an agent by following the [user documentation](../../user/clusters/agent/index.md#create-an-agent-record-in-gitlab), or via Rails console:
```ruby
project = ::Project.find_by_full_path("path-to/your-configuration-project")
# agent-name should be the same as specified above in the config.yaml
agent = ::Clusters::Agent.create(name: "<agent-name>", project: project)
token = ::Clusters::AgentToken.create(agent: agent)
token.token # this will print out the token you need to use on the next step
```
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/architecture.md'
remove_date: '2022-06-24'
---
# Kubernetes Agent development **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/architecture.md).
This page contains developer-specific information about the GitLab Kubernetes Agent.
[End-user documentation about the GitLab Kubernetes Agent](../../user/clusters/agent/index.md)
is also available.
The agent can help you perform tasks like these:
- Integrate a cluster, located behind a firewall or NAT, with GitLab. To
learn more, read [issue #212810, Invert the model GitLab.com uses for Kubernetes integration by leveraging long lived reverse tunnels](https://gitlab.com/gitlab-org/gitlab/-/issues/212810).
- Access API endpoints in a cluster in real time. For an example use case, read
[issue #218220, Allow Prometheus in K8s cluster to be installed manually](https://gitlab.com/gitlab-org/gitlab/-/issues/218220#note_348729266).
- Enable real-time features by pushing information about events happening in a cluster.
For example, you could build a cluster view dashboard to visualize changes in progress
in a cluster. For more information about these efforts, read about the
[Real-Time Working Group](https://about.gitlab.com/company/team/structure/working-groups/real-time/).
- Enable a [cache of Kubernetes objects through informers](https://github.com/kubernetes/client-go/blob/ccd5becdffb7fd8006e31341baaaacd14db2dcb7/tools/cache/shared_informer.go#L34-L183),
kept up-to-date with very low latency. This cache helps you:
  - Reduce or eliminate information propagation latency by avoiding Kubernetes API calls
    and polling, and only fetching data from an up-to-date cache.
  - Lower the load placed on the Kubernetes API by removing polling.
  - Eliminate any rate-limiting errors by removing polling.
  - Simplify backend code by replacing polling code with cache access. While it's another
    API call, no polling is needed. This example describes [fetching cached data synchronously from the front end](https://gitlab.com/gitlab-org/gitlab/-/issues/217792#note_348582537) instead of fetching data from the Kubernetes API.
## Architecture of the Kubernetes Agent
The GitLab Kubernetes Agent and the GitLab Kubernetes Agent Server use
[bidirectional streaming](https://grpc.io/docs/what-is-grpc/core-concepts/#bidirectional-streaming-rpc)
to allow the connection acceptor (the gRPC server, GitLab Kubernetes Agent Server) to
act as a client. The connection acceptor sends requests as gRPC replies. The client-server
relationship is inverted because the connection must be initiated from inside the
Kubernetes cluster to bypass any firewall or NAT the cluster may be located behind.
To learn more about this inversion, read
[issue #212810](https://gitlab.com/gitlab-org/gitlab/-/issues/212810).
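The inverted relationship can be modeled with a toy example. The sketch below uses in-process queues instead of gRPC, so it only illustrates the direction of the messages: the agent side opens the connection and then waits, while the accepting side issues requests over it. `Tunnel`, `agent_loop`, and `call_agent` are hypothetical names.

```python
import queue
import threading


class Tunnel:
    """One agent-initiated connection, modeled as two in-process queues."""

    def __init__(self):
        self.to_agent = queue.Queue()   # acceptor -> agent: requests, sent as stream replies
        self.to_server = queue.Queue()  # agent -> acceptor: responses, sent as stream requests


def agent_loop(tunnel, handlers):
    """Runs inside the cluster: waits for requests on the connection it opened."""
    while True:
        message = tunnel.to_agent.get()
        if message is None:             # sentinel: the acceptor closed the stream
            return
        request_id, method, payload = message
        tunnel.to_server.put((request_id, handlers[method](payload)))


def call_agent(tunnel, request_id, method, payload, timeout=1.0):
    """Runs on the acceptor: acts as a client although it accepted the connection."""
    tunnel.to_agent.put((request_id, method, payload))
    response_id, result = tunnel.to_server.get(timeout=timeout)
    if response_id != request_id:
        raise ValueError("response does not match the request")
    return result
```

A thread running `agent_loop()` plays the role of `agentk`, and `call_agent()` plays the role of `kas` sending a request through the already-open connection.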
This diagram describes how GitLab (`GitLab RoR`), the GitLab Kubernetes Agent (`agentk`), and the GitLab Kubernetes Agent Server (`kas`) work together.
```mermaid
graph TB
agentk -- gRPC bidirectional streaming --> kas
subgraph "GitLab"
kas[kas]
GitLabRoR[GitLab RoR] -- gRPC --> kas
kas -- gRPC --> Gitaly[Gitaly]
kas -- REST API --> GitLabRoR
end
subgraph "Kubernetes cluster"
agentk[agentk]
end
```
- `GitLab RoR` is the main GitLab application. It uses gRPC to talk to `kas`.
- `agentk` is the GitLab Kubernetes Agent. It keeps a connection established to a
`kas` instance, waiting for requests to process. It may also actively send information
about things happening in the cluster.
- `kas` is the GitLab Kubernetes Agent Server, and is responsible for:
  - Accepting requests from `agentk`.
  - [Authentication of requests](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/identity_and_auth.md) from `agentk` by querying `GitLab RoR`.
  - Fetching the agent's configuration from a corresponding Git repository by querying Gitaly.
  - Matching incoming requests from `GitLab RoR` with existing connections from
    the right `agentk`, forwarding requests to it and forwarding responses back.
  - (Optional) Sending notifications through ActionCable for events received from `agentk`.
  - Polling manifest repositories for [GitOps support](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/gitops.md) by communicating with Gitaly.
<i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
To learn more about how the repository is structured, see
[GitLab Kubernetes Agent repository overview](https://www.youtube.com/watch?v=j8CyaCWroUY).
## Guiding principles
GitLab prefers to add logic into `kas` rather than `agentk`. `agentk` should be kept
streamlined and small to minimize the need for upgrades. On GitLab.com, `kas` is
managed by GitLab, so upgrades and features can be added without requiring you
to upgrade `agentk` in your clusters.
`agentk` can't be viewed as a dumb reverse proxy because features are planned to be built
[on top of the cache with informers](https://github.com/kubernetes/client-go/blob/ccd5becdffb7fd8006e31341baaaacd14db2dcb7/tools/cache/shared_informer.go#L34-L183).
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/local.md'
remove_date: '2022-06-24'
---
# Run the Kubernetes Agent locally **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/local.md).
You can run `kas` and `agentk` locally to test the [Kubernetes Agent](index.md) yourself.
1. Create a `cfg.yaml` file from the contents of
[`config_example.yaml`](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/pkg/kascfg/config_example.yaml), or this example:
```yaml
agent:
  listen:
    network: tcp
    address: 127.0.0.1:8150
    websocket: false
  gitops:
    poll_period: "10s"
gitlab:
  address: http://localhost:3000
  authentication_secret_file: /Users/tkuah/code/ee-gdk/gitlab/.gitlab_kas_secret
```
1. Create a `token.txt`. This is the token for
[the agent you created](../../user/clusters/agent/index.md#create-an-agent-record-in-gitlab). This file must not contain a newline character. You can create the file with this command:
```shell
echo -n "<TOKEN>" > token.txt
```
1. Start the binaries with the following commands:
```shell
# Need GitLab to start
gdk start
# Stop GDK's version of kas
gdk stop gitlab-k8s-agent
# Start kas
bazel run //cmd/kas -- --configuration-file="$(pwd)/cfg.yaml"
```
1. In a new terminal window, run this command to start `agentk`:
```shell
bazel run //cmd/agentk -- --kas-address=grpc://127.0.0.1:8150 --token-file="$(pwd)/token.txt"
```
You can also inspect the
[Makefile](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/Makefile)
for more targets.
<i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
To learn more about how the repository is structured, see
[GitLab Kubernetes Agent repository overview](https://www.youtube.com/watch?v=j8CyaCWroUY).
## Run tests locally
You can run all tests, or a subset of tests, locally.
- **To run all tests**: Run the command `make test`.
- **To run all test targets in the directory**: Run the command
`bazel test //internal/module/gitops/server:all`.
You can use `*` in the command, instead of `all`, but it must be quoted to
avoid shell expansion: `bazel test '//internal/module/gitops/server:*'`.
- **To run all tests in a directory and its subdirectories**: Run the command
`bazel test //internal/module/gitops/server/...`.
### Run specific test scenarios
To run only a specific test scenario, you need the directory name and the target
name of the test. For example, to run the tests at
`internal/module/gitops/server/module_test.go`, the `BUILD.bazel` file that
defines the test's target name lives at `internal/module/gitops/server/BUILD.bazel`.
In the latter, the target name is defined like:
```bazel
go_test(
    name = "server_test",
    size = "small",
    srcs = [
        "module_test.go",
```
The target name is `server_test` and the directory is `internal/module/gitops/server/`.
Run the test scenario with this command:
```shell
bazel test //internal/module/gitops/server:server_test
```
### Additional resources
- Bazel documentation about [specifying targets to build](https://docs.bazel.build/versions/master/guide.html#specifying-targets-to-build).
- [The Bazel query](https://docs.bazel.build/versions/master/query.html)
- [Bazel query how to](https://docs.bazel.build/versions/master/query-how-to.html)
## KAS QA tests
This section describes how to run KAS tests against different GitLab environments based on the
[GitLab QA orchestrator](https://gitlab.com/gitlab-org/gitlab-qa).
### Status
The `kas` QA tests currently have some limitations. You can run them manually on GDK, but they don't
run automatically with the nightly jobs against the live environment. See the section below
to learn how to run them against different environments.
### Prepare
Before performing any of these tests, if you have a `k3s` instance running, make sure to
stop it manually before running them. Otherwise, the tests might fail with the message
`failed to remove k3s cluster`.
You might need to specify the correct Agent image version that matches the `kas` image version. You can use the `GITLAB_AGENTK_VERSION` environment variable for this.
### Against `staging`
1. Go to your local `qa/qa/service/cluster_provider/k3s.rb` and comment out
[this line](https://gitlab.com/gitlab-org/gitlab/-/blob/5b15540ea78298a106150c3a1d6ed26416109b9d/qa/qa/service/cluster_provider/k3s.rb#L8) and
[this line](https://gitlab.com/gitlab-org/gitlab/-/blob/5b15540ea78298a106150c3a1d6ed26416109b9d/qa/qa/service/cluster_provider/k3s.rb#L36).
We don't allow local connections on `staging` as they require an admin user.
1. Ensure you don't have an `EE_LICENSE` environment variable set as this would force an admin login.
1. Go to your GDK root folder and `cd gitlab/qa`.
1. Log in to staging with your user and create a group to use as a sandbox,
such as `username-qa-sandbox`.
1. Create an access token for your user with the `api` permission.
1. Replace the values given below with your own and run:
```shell
GITLAB_SANDBOX_NAME="<THE GROUP ID YOU CREATED ON STEP 4>" \
GITLAB_QA_ACCESS_TOKEN="<THE ACCESS TOKEN YOU CREATED ON STEP 5>" \
GITLAB_USERNAME="<YOUR STAGING USERNAME>" \
GITLAB_PASSWORD="<YOUR STAGING PASSWORD>" \
bundle exec bin/qa Test::Instance::All https://staging.gitlab.com -- --tag quarantine qa/specs/features/ee/api/7_configure/kubernetes/kubernetes_agent_spec.rb
```
### Against GDK
1. Go to your `qa/qa/fixtures/kubernetes_agent/agentk-manifest.yaml.erb` and comment out [this line](https://gitlab.com/gitlab-org/gitlab/-/blob/a55b78532cfd29426cf4e5b4edda81407da9d449/qa/qa/fixtures/kubernetes_agent/agentk-manifest.yaml.erb#L27) and uncomment [this line](https://gitlab.com/gitlab-org/gitlab/-/blob/a55b78532cfd29426cf4e5b4edda81407da9d449/qa/qa/fixtures/kubernetes_agent/agentk-manifest.yaml.erb#L28).
GDK's `kas` listens on `grpc`, not on `wss`.
1. Go to the GDK's root folder and `cd gitlab/qa`.
1. Unlike on staging, run the QA test in GDK as an admin, which is the default. To do so, use the default sandbox group and run the command below. Adjust the credentials if necessary; otherwise, the test might fail:
```shell
GITLAB_USERNAME=root \
GITLAB_PASSWORD='5iveL!fe' \
GITLAB_ADMIN_USERNAME=root \
GITLAB_ADMIN_PASSWORD='5iveL!fe' \
bundle exec bin/qa Test::Instance::All http://gdk.test:3000 -- --tag quarantine qa/specs/features/ee/api/7_configure/kubernetes/kubernetes_agent_spec.rb
```
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/repository_overview.md'
remove_date: '2022-06-24'
---
# Kubernetes Agent repository overview **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/repository_overview.md).
This page describes the subfolders of the Kubernetes Agent repository.
[Development information](index.md) and
[end-user documentation](../../user/clusters/agent/index.md) are both available.
<i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
For a video overview, see
[GitLab Kubernetes Agent repository overview](https://www.youtube.com/watch?v=j8CyaCWroUY).
## `build`
Various files for the build process.
### `build/deployment`
A [`kpt`](https://googlecontainertools.github.io/kpt/) package that bundles some
[Kustomize](https://kustomize.io/) layers and components. Can be used as-is, or
to create a custom package to install `agentk`.
## `cmd`
Commands are binaries that this repository produces. They are:
- `kas` is the GitLab Kubernetes Agent Server binary.
- `agentk` is the GitLab Kubernetes Agent binary.
Each of these directories contains application bootstrap code for:
- Reading configuration.
- Applying defaults to it.
- Constructing the dependency graph of objects that constitute the program.
- Running it.
### `cmd/agentk`
- `agentk` initialization logic.
- Implementation of the agent modules API.
### `cmd/kas`
- `kas` initialization logic.
- Implementation of the server modules API.
## `examples`
Git submodules for the example projects.
## `internal`
The main code of both `gitlab-kas` and `agentk`, and various supporting building blocks.
### `internal/api`
Structs that represent some important pieces of data.
### `internal/gitaly`
Items to work with [Gitaly](../../administration/gitaly/index.md).
### `internal/gitlab`
GitLab REST client.
### `internal/module`
Modules that implement server and agent-side functionality.
### `internal/tool`
Various building blocks. `internal/tool/testing` contains mocks and helpers
for testing. Mocks are generated with [`gomock`](https://pkg.go.dev/github.com/golang/mock).
## `it`
Contains scaffolding for integration tests. Unused at the moment.
## `pkg`
Contains exported packages.
### `pkg/agentcfg`
Contains protobuf definitions of the `agentk` configuration file. Used to configure
the agent through a configuration repository.
### `pkg/kascfg`
Contains protobuf definitions of the `gitlab-kas` configuration file. Contains an
example of that configuration file along with the test for it. The test ensures
the configuration file example is in sync with the protobuf definitions of the
file and defaults, which are applied when the file is loaded.
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/kas_request_routing.md'
remove_date: '2022-06-24'
---
# Routing `kas` requests in the Kubernetes Agent **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/kas_request_routing.md).
This document describes how `kas` routes requests to concrete `agentk` instances.
GitLab must talk to GitLab Kubernetes Agent Server (`kas`) to:
- Get information about connected agents. [Read more](https://gitlab.com/gitlab-org/gitlab/-/issues/249560).
- Interact with agents. [Read more](https://gitlab.com/gitlab-org/gitlab/-/issues/230571).
- Interact with Kubernetes clusters. [Read more](https://gitlab.com/gitlab-org/gitlab/-/issues/240918).
Each agent connects to an instance of `kas` and keeps an open connection. When
GitLab must talk to a particular agent, a `kas` instance connected to this agent must
be found, and the request routed to it.
## System design
For an architecture overview please see
[architecture.md](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/architecture.md).
```mermaid
flowchart LR
subgraph "Kubernetes 1"
agentk1p1["agentk 1, Pod1"]
agentk1p2["agentk 1, Pod2"]
end
subgraph "Kubernetes 2"
agentk2p1["agentk 2, Pod1"]
end
subgraph "Kubernetes 3"
agentk3p1["agentk 3, Pod1"]
end
subgraph kas
kas1["kas 1"]
kas2["kas 2"]
kas3["kas 3"]
end
GitLab["GitLab Rails"]
Redis
GitLab -- "gRPC to any kas" --> kas
kas1 -- register connected agents --> Redis
kas2 -- register connected agents --> Redis
kas1 -- lookup agent --> Redis
agentk1p1 -- "gRPC" --> kas1
agentk1p2 -- "gRPC" --> kas2
agentk2p1 -- "gRPC" --> kas1
agentk3p1 -- "gRPC" --> kas2
```
For this architecture, this diagram shows a request to `agentk 3, Pod1` for the list of pods:
```mermaid
sequenceDiagram
GitLab->>+kas1: Get list of running<br />Pods from agentk<br />with agent_id=3
Note right of kas1: kas1 checks for<br />agent connected with agent_id=3.<br />It finds none.<br />Queries Redis
kas1->>+Redis: Get list of connected agents<br />with agent_id=3
Redis-->>-kas1: List of connected agents<br />with agent_id=3
Note right of kas1: kas1 picks a specific agentk instance<br />to address and talks to<br />the corresponding kas instance,<br />specifying which agentk instance<br />to route the request to.
kas1->>+kas2: Get the list of running Pods<br />from agentk 3, Pod1
kas2->>+agentk 3 Pod1: Get list of Pods
agentk 3 Pod1-->>-kas2: List of Pods
kas2-->>-kas1: List of running Pods<br />from agentk 3, Pod1
kas1-->>-GitLab: List of running Pods<br />from agentk with agent_id=3
```
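The request flow in the diagram above can be sketched in a few lines. This is a toy Python model under stated assumptions, not the `kas` code: a shared dictionary stands in for Redis, direct method calls stand in for `kas -> kas` gRPC, and the `Kas` class with its `connect`, `handle_external`, and `handle_internal` methods is hypothetical.

```python
class Kas:
    """Toy kas instance; `registry` is a shared dict standing in for Redis."""

    def __init__(self, name, registry, peers):
        self.name = name
        self.registry = registry  # (agent_id, connection_id) -> kas name
        self.peers = peers        # kas name -> Kas, stands in for kas -> kas gRPC
        self.local = {}           # (agent_id, connection_id) -> request handler
        peers[name] = self

    def connect(self, agent_id, connection_id, handler):
        """Record an agentk connection locally and in the shared registry."""
        self.local[(agent_id, connection_id)] = handler
        self.registry[(agent_id, connection_id)] = self.name

    def handle_external(self, agent_id, request):
        """Entry point for GitLab: performs discovery and picks the route."""
        for (a_id, _), handler in self.local.items():
            if a_id == agent_id:
                return handler(request)
        for (a_id, c_id), kas_name in sorted(self.registry.items()):
            if a_id == agent_id:
                return self.peers[kas_name].handle_internal(a_id, c_id, request)
        raise LookupError(f"no connected agent with agent_id={agent_id}")

    def handle_internal(self, agent_id, connection_id, request):
        """Entry point for other kas instances: follows the request, no discovery."""
        return self.local[(agent_id, connection_id)](request)
```

With the topology from the first diagram, a request for `agent_id=3` that arrives at `kas 1` is forwarded to `kas 2`, which holds the connection to `agentk 3, Pod1`.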
Each `kas` instance tracks the agents connected to it in Redis. For each agent, it
stores a serialized protobuf object with information about the agent. When an agent
disconnects, `kas` removes all corresponding information from Redis. For both events,
`kas` publishes a notification to a Redis [pub-sub channel](https://redis.io/topics/pubsub).
Each agent, while logically a single entity, can have multiple replicas (multiple pods)
in a cluster. `kas` accommodates that and records per-replica (generally per-connection)
information. Each open `GetConfiguration()` streaming request is given
a unique identifier which, combined with agent ID, identifies an `agentk` instance.
gRPC can keep multiple TCP connections open for a single target host. `agentk` only
runs one `GetConfiguration()` streaming request. `kas` uses that connection, and
doesn't see idle TCP connections because they are handled by the gRPC framework.
Each `kas` instance provides information to Redis, so other `kas` instances can discover and access it.
Information is stored in Redis with an [expiration time](https://redis.io/commands/expire),
to expire information for `kas` instances that become unavailable. To prevent
information from expiring too quickly, `kas` periodically updates the expiration time
for valid entries. Before terminating, `kas` cleans up the information it adds into Redis.
When `kas` must atomically update multiple data structures in Redis, it uses
[transactions](https://redis.io/topics/transactions) to ensure data consistency.
Grouped data items must have the same expiration time.
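The expiration and refresh behavior described above can be illustrated with an in-memory stand-in for Redis. This is a sketch only: `ExpiringRegistry`, its methods, and the `ttl` value are hypothetical, and no Redis client is involved.

```python
import time


class ExpiringRegistry:
    """In-memory stand-in for Redis keys that carry an expiration time."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def put(self, key, value):
        """Store a value that expires ttl seconds from now."""
        self._entries[key] = (value, self._clock() + self._ttl)

    def refresh(self, keys):
        """Push back the expiration of entries that are still valid."""
        now = self._clock()
        for key in keys:
            entry = self._entries.get(key)
            if entry is not None and entry[1] > now:
                self._entries[key] = (entry[0], now + self._ttl)

    def remove(self, key):
        """Explicit cleanup, as done on agent disconnect or before shutdown."""
        self._entries.pop(key, None)

    def get(self, key):
        """Return the value, or None if the key is missing or expired."""
        entry = self._entries.get(key)
        if entry is None or entry[1] <= self._clock():
            self._entries.pop(key, None)
            return None
        return entry[0]
```

An entry that is refreshed periodically stays visible, while an entry whose owner stopped refreshing it disappears after `ttl` seconds.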
In addition to the existing `agentk -> kas` gRPC endpoint, `kas` exposes two new,
separate gRPC endpoints for GitLab and for `kas -> kas` requests. Each endpoint
is a separate network listener, making it easier to control network access to endpoints
and allowing separate configuration for each endpoint.
Databases, like PostgreSQL, aren't used because the data is transient, with no need
to reliably persist it.
### `GitLab : kas` external endpoint
GitLab authenticates with `kas` using JWT and the same shared secret used by the
`kas -> GitLab` communication. The JWT issuer should be `gitlab` and the audience
should be `gitlab-kas`.
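As a rough illustration of the issuer and audience check, the sketch below signs and verifies a token with the Python standard library only. The HS256 algorithm, the `iat` and `exp` claims, the 60-second lifetime, and the function names are assumptions made for this example; the text above specifies only the shared secret, the `gitlab` issuer, and the `gitlab-kas` audience.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(secret: bytes, now: float, lifetime: int = 60) -> str:
    """Build a token as GitLab might (assumed HS256, assumed lifetime)."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iss": "gitlab", "aud": "gitlab-kas",
              "iat": int(now), "exp": int(now) + lifetime}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode("utf-8"))
             for p in (header, claims)]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(secret, signing_input))


def verify_token(secret: bytes, token: str, now: float) -> dict:
    """Check signature, algorithm, issuer, audience, and expiry."""
    header_b64, claims_b64, signature_b64 = token.split(".")
    expected = _sign(secret, header_b64 + "." + claims_b64)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims.get("iss") != "gitlab" or claims.get("aud") != "gitlab-kas":
        raise ValueError("unexpected issuer or audience")
    if now >= claims.get("exp", 0):
        raise ValueError("token expired")
    return claims
```

A production system would normally rely on a maintained JWT library rather than hand-written verification. The sketch only shows which claims are compared.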
When accessed through this endpoint, `kas` plays the role of request router.
If a request arrives from GitLab but no connected agent can handle it, `kas` blocks
and waits for a suitable agent to connect to it or to another `kas` instance. It
stops waiting when the client disconnects or when a long timeout, such as the
client timeout, expires. `kas` is notified of new agent connections through a
[pub-sub channel](https://redis.io/topics/pubsub) to avoid frequent polling.
When a suitable agent connects, `kas` routes the request to it.
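The blocking wait can be sketched with a condition variable. This is an assumption-laden illustration: `AgentWaiter`, `on_agent_connected`, and `wait_for_agent` are hypothetical names, the pub-sub notification is modeled as a direct method call, and client disconnection is not modeled, only the timeout.

```python
import threading
from typing import Dict, Optional


class AgentWaiter:
    """Blocks a request until a suitable agent is known or a timeout expires."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        # agent_id -> name of the kas instance the agent is connected to.
        self._connected: Dict[int, str] = {}

    def on_agent_connected(self, agent_id: int, kas_name: str) -> None:
        # Stand-in for handling one notification from the pub-sub channel.
        with self._cond:
            self._connected[agent_id] = kas_name
            self._cond.notify_all()

    def wait_for_agent(self, agent_id: int, timeout: float) -> Optional[str]:
        # Sleeps until notified instead of polling the store in a loop.
        with self._cond:
            found = self._cond.wait_for(
                lambda: agent_id in self._connected, timeout=timeout
            )
            return self._connected[agent_id] if found else None
```

Because waiters are woken by notifications, no request needs to poll for the agent while it is waiting.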
### `kas : kas` internal endpoint
This endpoint is an implementation detail, an internal API, and should not be used
by any other system. It's protected by JWT using a secret shared among all `kas`
instances. No other system should have access to this secret.
When accessed through this endpoint, `kas` uses the request itself to determine
which `agentk` to send the request to. It prevents request cycles by only following
the instructions in the request, rather than doing discovery. It's the responsibility
of the `kas` receiving the request from the _external_ endpoint to retry and re-route
requests. This method ensures a single central component for each request can determine
how a request is routed, rather than distributing the decision across several `kas` instances.
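The split of responsibilities between the two endpoints might look like the sketch below. All names (`Kas`, `handle_internal`, `handle_external`, `AgentNotConnected`) are hypothetical, and the candidate list is passed in directly instead of being discovered through Redis.

```python
from typing import Callable, Dict, List, Tuple


class AgentNotConnected(Exception):
    """Raised when the addressed agentk connection is not on this kas."""


class Kas:
    def __init__(self, name: str,
                 local_agents: Dict[Tuple[int, int], Callable[[str], str]]) -> None:
        self.name = name
        # (agent_id, connection_id) -> handler for that agentk connection.
        self.local_agents = local_agents

    def handle_internal(self, agent_id: int, connection_id: int, request: str) -> str:
        # Internal endpoint: follow the instruction in the request exactly.
        # No discovery and no forwarding, so request cycles cannot form.
        handler = self.local_agents.get((agent_id, connection_id))
        if handler is None:
            raise AgentNotConnected(f"{agent_id}/{connection_id} not on {self.name}")
        return handler(request)


def handle_external(candidates: List[Tuple[Kas, int]], agent_id: int, request: str) -> str:
    # External endpoint: the single place that decides routing and retries.
    for kas, connection_id in candidates:
        try:
            return kas.handle_internal(agent_id, connection_id, request)
        except AgentNotConnected:
            continue  # Stale information; try the next candidate.
    raise AgentNotConnected(f"no kas can reach agent {agent_id}")
```

Only `handle_external` loops over candidates. `handle_internal` either serves the request locally or fails, which keeps the routing decision in one component.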
### Reverse gRPC tunnel
This section explains how the `agentk` -> `kas` reverse gRPC tunnel is implemented.
<i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
For a video overview of how some of the blocks map to code, see
[GitLab Kubernetes Agent reverse gRPC tunnel architecture and code overview](https://www.youtube.com/watch?v=9pnQF76hyZc).
#### High level schema
In this example, `Server side of module A` exposes its API to get the `Pod` list
on the `Public API gRPC server`. When it receives a request, it must determine
the agent ID from it, then call the proxying code which forwards the request to
a suitable `agentk` that can handle it.
The `Agent side of module A` exposes the same API on the `Internal gRPC server`.
When it receives the request, it must handle it, for example by retrieving and
returning the `Pod` list.
This schema describes how reverse tunneling is handled fully transparently
for modules, so you can add new features:
```mermaid
graph TB
subgraph kas
server-internal-grpc-server[Internal gRPC server]
server-api-grpc-server[Public API gRPC server]
server-module-a[Server side of module A]
server-module-b[Server side of module B]
end
subgraph agentk
agent-internal-grpc-server[Internal gRPC server]
agent-module-a[Agent side of module A]
agent-module-b[Agent side of module B]
end
agent-internal-grpc-server -- request --> agent-module-a
agent-internal-grpc-server -- request --> agent-module-b
server-module-a-. expose API on .-> server-internal-grpc-server
server-module-b-. expose API on .-> server-api-grpc-server
server-internal-grpc-server -- proxy request --> agent-internal-grpc-server
server-api-grpc-server -- proxy request --> agent-internal-grpc-server
```
#### Implementation schema
`HandleTunnelConnection()` is called with the server-side interface of the reverse
tunnel. It registers the connection and blocks, waiting for a request to proxy
through the connection.
`HandleIncomingConnection()` is called with the server-side interface of the incoming
connection. It registers the connection and blocks, waiting for a matching tunnel
to proxy the connection through.
After it has two connections that match, `Connection registry` starts bi-directional
data streaming:
```mermaid
graph TB
subgraph kas
server-tunnel-module[Server tunnel module]
connection-registry[Connection registry]
server-internal-grpc-server[Internal gRPC server]
server-api-grpc-server[Public API gRPC server]
server-module-a[Server side of module A]
server-module-b[Server side of module B]
end
subgraph agentk
agent-internal-grpc-server[Internal gRPC server]
agent-tunnel-module[Agent tunnel module]
agent-module-a[Agent side of module A]
agent-module-b[Agent side of module B]
end
server-tunnel-module -- "HandleTunnelConnection()" --> connection-registry
server-internal-grpc-server -- "HandleIncomingConnection()" --> connection-registry
server-api-grpc-server -- "HandleIncomingConnection()" --> connection-registry
server-module-a-. expose API on .-> server-internal-grpc-server
server-module-b-. expose API on .-> server-api-grpc-server
agent-tunnel-module -- "establish tunnel, receive request" --> server-tunnel-module
agent-tunnel-module -- make request --> agent-internal-grpc-server
agent-internal-grpc-server -- request --> agent-module-a
agent-internal-grpc-server -- request --> agent-module-b
```
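The matching step performed by the `Connection registry` can be sketched as follows. This is a simplified model, not the actual code: the real functions block until a match is found, while this sketch returns `None` and queues the connection instead. Matching by agent ID, representing connections as strings, and the Python method names are assumptions for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class ConnectionRegistry:
    """Pairs idle reverse tunnels with incoming connections per agent."""

    def __init__(self) -> None:
        self._idle_tunnels: Dict[int, Deque[str]] = defaultdict(deque)
        self._waiting_incoming: Dict[int, Deque[str]] = defaultdict(deque)

    def handle_tunnel_connection(self, agent_id: int,
                                 tunnel: str) -> Optional[Tuple[str, str]]:
        # A tunnel from agentk arrived: serve the oldest waiting request, if any.
        waiting = self._waiting_incoming[agent_id]
        if waiting:
            return (waiting.popleft(), tunnel)
        self._idle_tunnels[agent_id].append(tunnel)
        return None

    def handle_incoming_connection(self, agent_id: int,
                                   incoming: str) -> Optional[Tuple[str, str]]:
        # A request to proxy arrived: use an idle tunnel, if one is registered.
        idle = self._idle_tunnels[agent_id]
        if idle:
            return (incoming, idle.popleft())
        self._waiting_incoming[agent_id].append(incoming)
        return None
```

Each returned pair is the point where bi-directional data streaming between the incoming connection and the tunnel would start.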
### API definitions
- [`agent_tracker/agent_tracker.proto`](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/internal/module/agent_tracker/agent_tracker.proto)
- [`agent_tracker/rpc/rpc.proto`](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/internal/module/agent_tracker/rpc/rpc.proto)
- [`reverse_tunnel/rpc/rpc.proto`](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/internal/module/reverse_tunnel/rpc/rpc.proto)
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
---
stage: Configure
group: Configure
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
redirect_to: 'https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/user_stories.md'
remove_date: '2022-06-24'
---
# Kubernetes Agent user stories **(PREMIUM SELF)**
This file was moved to [another location](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/doc/user_stories.md).
The [personas in action](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#user-personas)
for the Kubernetes Agent are:
- [Sasha, the Software Developer](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#sasha-software-developer).
- [Allison, the Application Operator](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#allison-application-ops).
- [Priyanka, the Platform Engineer](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#priyanka-platform-engineer).
[Devon, the DevOps engineer](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#devon-devops-engineer)
is intentionally excluded here, as DevOps is more of a role than a persona.
There are various workflows to support, so some user stories might seem to contradict each other. They don't.
## Software Developer user stories
<!-- vale gitlab.FirstPerson = NO -->
- As a Software Developer, I want to push my code, and move to the next development task,
to work on business applications.
- As a Software Developer, I want to set necessary dependencies and resource requirements
together with my application code, so my code runs fine after deployment.
<!-- vale gitlab.FirstPerson = YES -->
## Application Operator user stories
<!-- vale gitlab.FirstPerson = NO -->
- As an Application Operator, I want to standardize the deployments used by my teams,
so I can support all teams with minimal effort.
- As an Application Operator, I want to have a single place to define all the deployments,
so I can ensure security fixes are applied everywhere.
- As an Application Operator, I want to offer a set of predefined templates to
Software Developers, so they can get started quickly and can deploy to production
without my intervention, and I am not a bottleneck.
- As an Application Operator, I want to know exactly what changes are being deployed,
so I can fulfill my SLAs.
- As an Application Operator, I want deep insights into what versions of my applications
are running and want to be able to debug them, so I can fix operational issues.
- As an Application Operator, I want application code to be automatically deployed
to staging environments when new versions are available.
- As an Application Operator, I want to follow my preferred deployment strategy,
so I can move code into production in a reliable way.
- As an Application Operator, I want to review all code before it's deployed into production,
so I can fulfill my SLAs.
- As an Application Operator, I want to be notified before deployment when new code needs my attention,
so I can review it swiftly.
<!-- vale gitlab.FirstPerson = YES -->
## Platform Engineer user stories
<!-- vale gitlab.FirstPerson = NO -->
- As a Platform Engineer, I want to restrict customizations to preselected values
for Operators, so I can fulfill my SLAs.
- As a Platform Engineer, I want to allow some level of customization to Operators,
so I don't become a bottleneck.
- As a Platform Engineer, I want to define all deployments in a single place, so
I can ensure security fixes are applied everywhere.
- As a Platform Engineer, I want to define the infrastructure by code, so my
infrastructure management is testable, reproducible, traceable, and scalable.
- As a Platform Engineer, I want to define various policies that applications must
follow, so that I can fulfill my SLAs.
- As a Platform Engineer, I want approved tooling for log management and persistent storage,
so I can scale, secure, and manage them as needed.
- As a Platform Engineer, I want to be alerted when my infrastructure differs from
its definition, so I can make sure that everything is configured as expected.
<!-- vale gitlab.FirstPerson = YES -->
<!-- This redirect file can be deleted after <2022-06-24>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
......@@ -23,7 +23,7 @@ tasks in a secure and cloud-native way. It enables:
- [CI/CD Tunnel](ci_cd_tunnel.md) that enables users to access Kubernetes clusters from GitLab CI/CD jobs even if there is no network connectivity between GitLab Runner and a cluster.
Many more features are planned. Please review [our roadmap](https://gitlab.com/groups/gitlab-org/-/epics/3329)
and [our development documentation](../../../development/agent/index.md).
and [our development documentation](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/tree/master/doc).
## GitLab Agent GitOps workflow
......