Commit ea4ad773 authored by Craig Norris's avatar Craig Norris

Merge branch '354845-edit-workhorse-page' into 'master'

Polish the new-features Workhorse page

See merge request gitlab-org/gitlab!84762
parents ae14c73f f841b505
......@@ -7,40 +7,72 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Adding new features to Workhorse
GitLab Workhorse is a smart reverse proxy for GitLab. It handles
"long" HTTP requests such as file downloads, file uploads, Git
push/pull and Git archive downloads.
Workhorse itself is not a feature, but there are [several features in GitLab](gitlab_features.md) that would not work efficiently without Workhorse.
At a first glance, it may look like Workhorse is just a pipeline for processing HTTP streams so that you can reduce the amount of logic in your Ruby on Rails controller, but there are good reasons to avoid treating it like that.
Engineers embarking on the quest of offloading a feature to Workhorse often find that the endeavor is much higher than what originally anticipated. In part because of the new programming language (only a few engineers at GitLab are Go developers), in part because of the demanding requirements for Workhorse. Workhorse is stateless, memory and disk usage must be kept under tight control, and the request should not be slowed down in the process.
## Can I add a new feature to Workhorse?
We suggest to follow this route only if absolutely necessary and no other options are available.
Splitting a feature between the Rails code-base and Workhorse is deliberately choosing to introduce technical debt. It adds complexity to the system and coupling between the two components.
- Building features using Workhorse has a considerable complexity cost, so you should prefer designs based on Rails requests and Sidekiq jobs.
- Even when using Rails+Sidekiq is "more work" than using Rails+Workhorse, Rails+Sidekiq is easier to maintain in the long term because Workhorse is unique to GitLab while Rails+Sidekiq is an industry standard.
- For "global" behaviors around web requests consider using a Rack middleware instead of Workhorse.
- Generally speaking, we should only use Rails+Workhorse if the HTTP client expects behavior that is not reasonable to implement in Rails, like "long" requests.
## What is a "long" request?
There is one order of magnitude between Workhorse and Puma RAM usage. Having connection open for a period longer than milliseconds is a problem because of the amount of RAM it monopolizes once it reaches the Ruby on Rails controller.
So far we identified two classes of "long" requests: data transfers and HTTP long polling.
`git push`, `git pull`, uploading or downloading an artifact, the CI runner waiting for a new job are all good examples of long requests.
With the rise of cloud-native installations, Workhorse's feature-set was extended to add object storage direct-upload, to get rid of the shared Network File System (NFS) drives.
In 2020 @nolith presented at FOSDEM a talk titled [_Speed up the monolith. Building a smart reverse proxy in Go_](https://archive.fosdem.org/2020/schedule/event/speedupmonolith/).
You can watch the recording for more details on the history of Workhorse and the NFS removal.
[Uploads development documentation](../uploads.md)
contains the most common use-cases for adding a new type of upload and may answer all of your questions.
If you still think we should add a new feature to Workhorse, please open an issue explaining **what you want to implement** and **why it can't be implemented in our Ruby code-base**. Workhorse maintainers will be happy to help you assessing the situation.
[long HTTP requests](#what-are-long-requests), such as:
- File downloads.
- File uploads.
- Git pushes and pulls.
- Git archive downloads.
Workhorse itself is not a feature, but [several features in GitLab](gitlab_features.md)
would not work efficiently without Workhorse.
At a first glance, Workhorse appears to be just a pipeline for processing HTTP
streams to reduce the amount of logic in your Ruby on Rails controller. However,
don't treat it that way. Engineers trying to offload a feature to Workhorse often
find it takes more work than originally anticipated:
- It's a new programming language, and only a few engineers at GitLab are Go developers.
- Workhorse has demanding requirements:
- It's stateless.
- Memory and disk usage must be kept under tight control.
- The request should not be slowed down in the process.
## Avoid adding new features
We suggest adding new features only if absolutely necessary and no other options exist.
Splitting a feature between the Rails codebase and Workhorse is a deliberate choice
to introduce technical debt. It adds complexity to the system, and coupling between
the two components:
- Building features using Workhorse has a considerable complexity cost, so you should
prefer designs based on Rails requests and Sidekiq jobs.
- Even when using Rails-and-Sidekiq is more work than using Rails-and-Workhorse,
Rails-and-Sidekiq is easier to maintain in the long term. Workhorse is unique
to GitLab, while Rails-and-Sidekiq is an industry standard.
- For global behaviors around web requests, consider using a Rack middleware
instead of Workhorse.
- Generally speaking, use Rails-and-Workhorse only if the HTTP client expects
behavior reasonable to implement in Rails, like long requests.
## What are long requests?
One order of magnitude exists between Workhorse and Puma RAM usage. Having a connection
open for longer than milliseconds is problematic due to the amount of RAM
it monopolizes after it reaches the Ruby on Rails controller. We've identified two classes
of long requests: data transfers and HTTP long polling. Some examples:
- `git push`.
- `git pull`.
- Uploading or downloading an artifact.
- A CI runner waiting for a new job.
With the rise of cloud-native installations, Workhorse's feature set was extended
to add object storage direct-upload. This change removed the need for the shared
Network File System (NFS) drives.
If you still think we should add a new feature to Workhorse, open an issue for the
Workhorse maintainers and explain:
1. What you want to implement.
1. Why it can't be implemented in our Ruby codebase.
The Workhorse maintainers can help you assess the situation.
## Related topics
- In 2020, `@nolith` presented the talk
["Speed up the monolith. Building a smart reverse proxy in Go"](https://archive.fosdem.org/2020/schedule/event/speedupmonolith/)
at FOSDEM. The talk includes more details on the history of Workhorse and the NFS removal.
- The [uploads development documentation](../uploads.md) contains the most common
use cases for adding a new type of upload.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment