# gitlab-workhorse

Gitlab-workhorse is a smart reverse proxy for GitLab. It handles
"large" HTTP requests such as file downloads, file uploads, Git
push/pull and Git archive downloads.

## Quick facts (how does Workhorse work)

-   Workhorse can handle some requests without involving Rails at all:
    for example, JavaScript files and CSS files are served straight
    from disk.
-   Workhorse can modify responses sent by Rails: for example if you use
    `send_file` in Rails then gitlab-workhorse will open the file on
    disk and send its contents as the response body to the client.
-   Workhorse can take over requests after asking permission from Rails.
    Example: handling `git clone`.
-   Workhorse can modify requests before passing them to Rails. Example:
    when handling a Git LFS upload Workhorse first asks permission from
    Rails, then it stores the request body in a tempfile, then it sends
    a modified request containing the tempfile path to Rails.
-   Workhorse can manage long-lived WebSocket connections for Rails.
    Example: handling the terminal websocket for environments.
-   Workhorse does not connect to Postgres, only to Rails and (optionally) Redis.
-   We assume that all requests that reach Workhorse pass through an
    upstream proxy such as NGINX or Apache first.
-   Workhorse does not accept HTTPS connections.
-   Workhorse does not clean up idle client connections.
-   We assume that all requests to Rails pass through Workhorse.

For more information see ['A brief history of
gitlab-workhorse'][brief-history-blog].

## Usage

```
  gitlab-workhorse [OPTIONS]

Options:
  -apiCiLongPollingDuration duration
      Long polling duration for job requesting for runners (default 50s - enabled) (default 50ns)
  -apiLimit uint
      Number of API requests allowed at single time
  -apiQueueDuration duration
      Maximum queueing duration of requests (default 30s)
  -apiQueueLimit uint
      Number of API requests allowed to be queued
  -authBackend string
      Authentication/authorization backend (default "http://localhost:8080")
  -authSocket string
      Optional: Unix domain socket to dial authBackend at
  -cableBackend string
      Optional: ActionCable backend (default authBackend)
  -cableSocket string
      Optional: Unix domain socket to dial cableBackend at (default authSocket)
  -config string
      TOML file to load config from
  -developmentMode
      Allow the assets to be served from Rails app
  -documentRoot string
      Path to static files content (default "public")
  -listenAddr string
      Listen address for HTTP server (default "localhost:8181")
  -listenNetwork string
      Listen 'network' (tcp, tcp4, tcp6, unix) (default "tcp")
  -listenUmask int
      Umask for Unix socket
  -logFile string
      Log file location
  -logFormat string
      Log format to use defaults to text (text, json, structured, none) (default "text")
  -pprofListenAddr string
      pprof listening address, e.g. 'localhost:6060'
  -prometheusListenAddr string
      Prometheus listening address, e.g. 'localhost:9229'
  -proxyHeadersTimeout duration
      How long to wait for response headers when proxying the request (default 5m0s)
  -secretPath string
      File with secret key to authenticate with authBackend (default "./.gitlab_workhorse_secret")
  -version
      Print version and exit
```

The 'auth backend' refers to the GitLab Rails application. The name is
a holdover from when gitlab-workhorse only handled Git push/pull over
HTTP.

Gitlab-workhorse can listen on either a TCP or a Unix domain socket. It
can also open a second listening TCP listening socket with the Go
[net/http/pprof profiler server](http://golang.org/pkg/net/http/pprof/).

Gitlab-workhorse can listen on redis events (currently only builds/register
for runners). This requires you to pass a valid TOML config file via
`-config` flag.
For regular setups it only requires the following (replacing the string
with the actual socket)

### Redis

Gitlab-workhorse integrates with Redis to do long polling for CI build
requests. This is configured via two things:

-   Redis settings in the TOML config file
-   The `-apiCiLongPollingDuration` command line flag to control polling
    behavior for CI build requests

It is OK to enable Redis in the config file but to leave CI polling
disabled; this just results in an idle Redis pubsub connection. The
opposite is not possible: CI long polling requires a correct Redis
configuration.

Below we discuss the options for the `[redis]` section in the config
file.

```
[redis]
URL = "unix:///var/run/gitlab/redis.sock"
Password = "my_awesome_password"
Sentinel = [ "tcp://sentinel1:23456", "tcp://sentinel2:23456" ]
SentinelMaster = "mymaster"
```

- `URL` takes a string in the format `unix://path/to/redis.sock` or
`tcp://host:port`.
- `Password` is only required if your redis instance is password-protected
- `Sentinel` is used if you are using Sentinel.
  *NOTE* that if both `Sentinel` and `URL` are given, only `Sentinel` will be used

Optional fields are as follows:
```
[redis]
DB = 0
ReadTimeout = "1s"
KeepAlivePeriod = "5m"
MaxIdle = 1
MaxActive = 1
```

- `DB` is the Database to connect to. Defaults to `0`
- `ReadTimeout` is how long a redis read-command can take. Defaults to `1s`
- `KeepAlivePeriod` is how long the redis connection is to be kept alive without anything flowing through it. Defaults to `5m`
- `MaxIdle` is how many idle connections can be in the redis-pool at once. Defaults to 1
- `MaxActive` is how many connections the pool can keep. Defaults to 1

### Relative URL support

If you are mounting GitLab at a relative URL, e.g.
`example.com/gitlab`, then you should also use this relative URL in
the `authBackend` setting:

```
gitlab-workhorse -authBackend http://localhost:8080/gitlab
```

### Interaction of authBackend and authSocket

The interaction between `authBackend` and `authSocket` can be a bit
confusing. It comes down to: if `authSocket` is set it overrides the
_host_ part of `authBackend` but not the relative path.

In table form:

|authBackend|authSocket|Workhorse connects to?|Rails relative URL|
|---|---|---|---|
|unset|unset|`localhost:8080`|`/`|
|`http://localhost:3000`|unset|`localhost:3000`|`/`|
|`http://localhost:3000/gitlab`|unset|`localhost:3000`|`/gitlab`|
|unset|`/path/to/socket`|`/path/to/socket`|`/`|
|`http://localhost:3000`|`/path/to/socket`|`/path/to/socket`|`/`|
|`http://localhost:3000/gitlab`|`/path/to/socket`|`/path/to/socket`|`/gitlab`|

The same applies to `cableBackend` and `cableSocket`.

## Installation

To install gitlab-workhorse you need [Go 1.13 or
newer](https://golang.org/dl) and [GNU
Make](https://www.gnu.org/software/make/).

To install into `/usr/local/bin` run `make install`.

```
make install
```

To install into `/foo/bin` set the PREFIX variable.

```
make install PREFIX=/foo
```

On some operating systems, such as FreeBSD, you may have to use
`gmake` instead of `make`.

## Dependencies

### Exiftool

Workhorse uses [exiftool](https://www.sno.phy.queensu.ca/~phil/exiftool/) for
removing EXIF data (which may contain sensitive information) from uploaded
images. If you installed GitLab:

-   Using the Omnibus package, you're all set.
    *NOTE* that if you are using CentOS Minimal, you may need to install `perl`
    package: `yum install perl`
-   From source, make sure `exiftool` is installed:

    ```sh
    # Debian/Ubuntu
    sudo apt-get install libimage-exiftool-perl

    # RHEL/CentOS
    sudo yum install perl-Image-ExifTool
    ```

### GraphicsMagick (**experimental**)

Workhorse has an experimental feature that allows us to rescale images on-the-fly.
If you do not run Workhorse in a container where the `gm` tool is already installed,
you will have to install it on your host machine instead:

#### macOS

```sh
brew install graphicsmagick
```

#### Debian/Ubuntu

```sh
sudo apt-get install graphicsmagick
```

For installation on other platforms, please consult http://www.graphicsmagick.org/README.html.

Note that Omnibus containers already come with `gm` installed.

## Error tracking

GitLab-Workhorse supports remote error tracking with
[Sentry](https://sentry.io). To enable this feature set the
`GITLAB_WORKHORSE_SENTRY_DSN` environment variable.
You can also set the `GITLAB_WORKHORSE_SENTRY_ENVIRONMENT` environment variable to
use the Sentry environment functionality to separate staging, production and
development.

Omnibus (`/etc/gitlab/gitlab.rb`):

```
gitlab_workhorse['env'] = {
    'GITLAB_WORKHORSE_SENTRY_DSN' => 'https://foobar'
    'GITLAB_WORKHORSE_SENTRY_ENVIRONMENT' => 'production'
}
```

Source installations (`/etc/default/gitlab`):

```
export GITLAB_WORKHORSE_SENTRY_DSN='https://foobar'
export GITLAB_WORKHORSE_SENTRY_ENVIRONMENT='production'
```

## Tests

Run the tests with:

```
make clean test
```

### Coverage / what to test

Each feature in gitlab-workhorse should have an integration test that
verifies that the feature 'kicks in' on the right requests and leaves
other requests unaffected. It is better to also have package-level tests
for specific behavior but the high-level integration tests should have
the first priority during development.

It is OK if a feature is only covered by integration tests.

## Distributed Tracing

Workhorse supports distributed tracing through [LabKit][] using [OpenTracing APIs](https://opentracing.io).

By default, no tracing implementation is linked into the binary, but different OpenTracing providers can be linked in using [build tags][build-tags]/[build constraints][build-tags]. This can be done by setting the `BUILD_TAGS` make variable.

For more details of the supported providers, see LabKit, but as an example, for Jaeger tracing support, include the tags: `BUILD_TAGS="tracer_static tracer_static_jaeger"`.

```shell
make BUILD_TAGS="tracer_static tracer_static_jaeger"
```

Once Workhorse is compiled with an opentracing provider, the tracing configuration is configured via the `GITLAB_TRACING` environment variable.

For example:

```shell
GITLAB_TRACING=opentracing://jaeger ./gitlab-workhorse
```

## Continuous Profiling

Workhorse supports continuous profiling through [LabKit][] using [Stackdriver Profiler](https://cloud.google.com/profiler).

By default, the Stackdriver Profiler implementation is linked in the binary using [build tags][build-tags], though it's not
required and can be skipped.

For example:

```shell
make BUILD_TAGS=""
```

Once Workhorse is compiled with Continuous Profiling, the profiler configuration can be set via `GITLAB_CONTINUOUS_PROFILING`
environment variable.

For example:

```shell
GITLAB_CONTINUOUS_PROFILING="stackdriver?service=workhorse&service_version=1.0.1&project_id=test-123 ./gitlab-workhorse"
```

More information about see the [LabKit monitoring docs](https://gitlab.com/gitlab-org/labkit/-/blob/master/monitoring/doc.go).

## License

This code is distributed under the MIT license, see the [LICENSE](LICENSE) file.

[brief-history-blog]: https://about.gitlab.com/2016/04/12/a-brief-history-of-gitlab-workhorse/
[LabKit]: https://gitlab.com/gitlab-org/labkit/
[build-tags]: https://golang.org/pkg/go/build/#hdr-Build_Constraints