Commit dc7a5fb6 authored by GitLab Bot

Automatic merge of gitlab-org/gitlab master

parents 843b689e cf057764
---
title: Apply GZip compression to discussion diffs
merge_request: 40778
author:
type: performance
......@@ -238,12 +238,13 @@ this, replace value of the `POSTGRESQL_SERVER_ADDRESS` with corresponding IP or
address of the PgBouncer instance.
This documentation doesn't provide PgBouncer installation instructions,
you can:
but you can:
- Find instructions on the [official website](https://www.pgbouncer.org/install.html).
- Use a [Docker image](https://hub.docker.com/r/edoburu/pgbouncer/).
In addition to base PgBouncer configuration options, set the following values:
In addition to the base PgBouncer configuration options, set the following values in
your `pgbouncer.ini` file:
- The [Praefect PostgreSQL database](#postgresql) in the `[databases]` section:
......
......@@ -18,6 +18,7 @@ GitLab has been tested on a number of object storage providers:
- [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces/)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
- [Openstack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
- On-premises hardware and appliances from various storage vendors.
- MinIO. We have [a guide to deploying this](https://docs.gitlab.com/charts/advanced/external-object-storage/minio.html) within our Helm Chart documentation.
......@@ -158,7 +159,6 @@ See the section on [ETag mismatch errors](#etag-mismatch) for more details.
```toml
[object_storage]
enabled = true
provider = "AWS"
[object_storage.s3]
......@@ -272,6 +272,61 @@ gitlab_rails['object_store']['connection'] = {
}
```
#### Azure Blob storage
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/25877) in GitLab 13.4.
Although Azure uses the word `container` to denote a collection of
blobs, GitLab standardizes on the term `bucket`. Be sure to configure
Azure container names in the `bucket` settings.
The following are the valid connection parameters for Azure. Read the
[Azure Blob storage documentation](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
to learn more.
| Setting | Description | Example |
|---------|-------------|---------|
| `provider` | Provider name | `AzureRM` |
| `azure_storage_account_name` | Name of the Azure Blob Storage account used to access the storage | `azuretest` |
| `azure_storage_access_key` | Storage account access key used to access the container. This is typically a secret, 512-bit encryption key encoded in base64. | `"czV2OHkvQj9FKEgrTWJRZVRoV21ZcTN0Nnc5eiRDJkYpSkBOY1JmVWpYbjJy\nNHU3eCFBJUQqRy1LYVBkU2dWaw==\n"` |
| `azure_storage_domain` | Domain name used to contact the Azure Blob Storage API (optional). Defaults to `blob.core.windows.net`. Set this if you are using Azure China, Azure Germany, Azure US Government, or some other custom Azure domain. | `blob.core.windows.net` |
##### Azure example (consolidated form)
For Omnibus installations, this is an example of the `connection` setting:
```ruby
gitlab_rails['object_store']['connection'] = {
'provider' => 'AzureRM',
'azure_storage_account_name' => '<AZURE STORAGE ACCOUNT NAME>',
'azure_storage_access_key' => '<AZURE STORAGE ACCESS KEY>',
'azure_storage_domain' => '<AZURE STORAGE DOMAIN>',
}
```
###### Azure Workhorse settings (source installs only)
NOTE: **Note:**
For source installations, Workhorse needs to be configured with the
Azure credentials as well. This is not needed in Omnibus installs because
the Workhorse settings are populated from the settings above.
1. Edit `/home/git/gitlab-workhorse/config.toml` and add or amend the following lines:
```toml
[object_storage]
provider = "AzureRM"
[object_storage.azurerm]
azure_storage_account_name = "<AZURE STORAGE ACCOUNT NAME>"
azure_storage_access_key = "<AZURE STORAGE ACCESS KEY>"
```
If you are using a custom Azure storage domain, note that
`azure_storage_domain` does **not** have to be set in the Workhorse
configuration. This information is exchanged in an API call between
GitLab Rails and Workhorse.
#### OpenStack-compatible connection settings
NOTE: **Note:**
......
......@@ -33,7 +33,6 @@ bundle exec rake gitlab:doctor:secrets RAILS_ENV=production
**Example output**
<!-- vale gitlab.SentenceSpacing = NO -->
```plaintext
I, [2020-06-11T17:17:54.951815 #27148] INFO -- : Checking encrypted values in the database
I, [2020-06-11T17:18:12.677708 #27148] INFO -- : - ApplicationSetting failures: 0
......@@ -45,7 +44,6 @@ I, [2020-06-11T17:18:15.575533 #27148] INFO -- : - ScimOauthAccessToken failure
I, [2020-06-11T17:18:15.575678 #27148] INFO -- : Total: 1 row(s) affected
I, [2020-06-11T17:18:15.575711 #27148] INFO -- : Done!
```
<!-- vale gitlab.SentenceSpacing = YES -->
### Verbose mode
......
......@@ -72,7 +72,7 @@ Use these instructions for exploring the GitLab database while developing with t
1. **Port number to connect to**: `5432` (default).
1. <!-- vale gitlab.Spelling = NO -->
**Use an ssl connection?**
<!-- vale gitlab.rulename = NO --> This depends on your installation. Options are:
<!-- vale gitlab.Spelling = YES --> This depends on your installation. Options are:
- **Use Secure Connection**
- **Standard Connection** (default)
1. **(Optional) The database to connect to**: `gitlabhq_development`.
......
......@@ -701,9 +701,9 @@ To configure markdownlint within your editor, install one of the following as ap
To configure Vale within your editor, install one of the following as appropriate:
- The Sublime Text [`SublimeLinter-contrib-vale` plugin](https://packagecontrol.io/packages/SublimeLinter-contrib-vale)
- The Visual Studio Code [`testthedocs.vale` extension](https://marketplace.visualstudio.com/items?itemName=testthedocs.vale)
- [Vim](https://github.com/dense-analysis/ale)
- The Sublime Text [`SublimeLinter-contrib-vale` plugin](https://packagecontrol.io/packages/SublimeLinter-contrib-vale).
- The Visual Studio Code [`errata-ai.vale-server` extension](https://marketplace.visualstudio.com/items?itemName=errata-ai.vale-server). You don't need Vale Server to use the plugin.
- [Vim](https://github.com/dense-analysis/ale).
We don't use [Vale Server](https://errata-ai.github.io/vale/#using-vale-with-a-text-editor-or-another-third-party-application).
......@@ -736,9 +736,7 @@ document:
- To disable all Vale linting rules, add a `<!-- vale off -->` tag before the text, and a
`<!-- vale on -->` tag after the text.
Whenever possible, exclude only the problematic rule and line(s). In some cases, such as list items,
you may need to disable linting for the entire list until
[Vale issue #175](https://github.com/errata-ai/vale/issues/175) is resolved.
Whenever possible, exclude only the problematic rule and line(s).
For more information, see
[Vale's documentation](https://errata-ai.gitbook.io/vale/getting-started/markup#markup-based-configuration).
......
......@@ -497,7 +497,7 @@ tenses, words, and phrases:
- Instead of "e.g.," use "for example," "such as," "for instance," or "like."
- Instead of "etc.," either use "and so on" or consider editing it out, since
it can be vague.
<!-- vale gitlab.rulename = NO -->
<!-- vale gitlab.LatinTerms = YES -->
- Avoid using the word *currently* when talking about the product or its
features. The documentation describes the product as it is, and not as it
will be at some indeterminate point in the future.
......@@ -534,6 +534,9 @@ tenses, words, and phrases:
[user interfaces](https://design.gitlab.com/content/punctuation/#contractions).
(Tested in [`Contractions.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/Contractions.yml).)
<!-- vale gitlab.ContractionsKeep = NO -->
<!-- vale gitlab.ContractionsDiscard = NO -->
<!-- vale gitlab.FutureTense = NO -->
| Do | Don't |
|----------|-----------|
| it's | it is |
......@@ -582,7 +585,9 @@ tenses, words, and phrases:
| Requests to localhost are not allowed | Requests to localhost aren't allowed |
| Specified URL cannot be used | Specified URL can't be used |
<!-- vale on -->
<!-- vale gitlab.ContractionsKeep = YES -->
<!-- vale gitlab.ContractionsDiscard = YES -->
<!-- vale gitlab.FutureTense = YES -->
## Text
......
......@@ -244,10 +244,23 @@ class GeoNode < ApplicationRecord
ContainerRepository.project_id_in(projects)
end
def container_repositories_include?(container_repository_id)
return false unless Geo::ContainerRepositoryRegistry.replication_enabled?
return true unless selective_sync?
container_repositories.where(id: container_repository_id).exists?
end
def designs
projects.with_designs
end
def designs_include?(project_id)
return true unless selective_sync?
designs.where(id: project_id).exists?
end
def lfs_objects
return LfsObject.all unless selective_sync?
......
---
title: 'Geo: Fix design repository failures with selective sync, and make container repository updates more robust'
merge_request: 40643
author:
type: fixed
......@@ -37,8 +37,14 @@ module Gitlab
yield if healthy_shard_for?(event)
end
def replicable_project?(project_id)
strong_memoize(:"replicable_project_#{project_id}") do
def replicable_project?
memoize_and_short_circuit_if_registry_is_persisted(:"replicable_project_#{event.project_id}", registry) do
Gitlab::Geo.current_node.projects_include?(event.project_id)
end
end
def memoize_and_short_circuit_if_registry_is_persisted(memoize_key, registry, &block)
strong_memoize(memoize_key) do
# If a registry exists, then it *should* be replicated. The
# registry will be removed by the delete event or
# RegistryConsistencyWorker if it should no longer be replicated.
......@@ -47,7 +53,7 @@ module Gitlab
# for repository updates which are a large proportion of events.
next true if registry.persisted?
Gitlab::Geo.current_node.projects_include?(project_id)
yield
end
end
......
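The memoize-and-short-circuit helper in the hunk above can be illustrated outside of GitLab. This is a minimal sketch, not the project's implementation: `EventSketch`, its nested `Registry` struct, and the `lookups` counter are hypothetical names, and a plain hash stands in for `strong_memoize`. It shows the two properties the comments describe: a persisted registry answers `true` without running the block, and the block runs at most once per key even when it returns `false`.

```ruby
# Hypothetical plain-Ruby sketch; StrongMemoize and the Geo registry are simulated.
class EventSketch
  # Stand-in for a registry record that may or may not be saved yet.
  Registry = Struct.new(:persisted) do
    def persisted?
      persisted
    end
  end

  attr_reader :lookups

  def initialize(registry, included)
    @registry = registry
    @included = included
    @memo = {}
    @lookups = 0
  end

  def replicable?
    memoize_and_short_circuit(:replicable) do
      @lookups += 1 # stands in for the node-level inclusion query
      @included
    end
  end

  private

  def memoize_and_short_circuit(key)
    # Hash#key? lets a cached false be reused, unlike `@memo[key] ||= ...`.
    return @memo[key] if @memo.key?(key)

    # A persisted registry is treated as replicable without running the block.
    @memo[key] = @registry.persisted? ? true : yield
  end
end

persisted = EventSketch.new(EventSketch::Registry.new(true), false)
persisted.replicable? # true, and the block never runs

fresh = EventSketch.new(EventSketch::Registry.new(false), false)
2.times { fresh.replicable? } # false both times, block runs once
```

Passing the expensive check as a block is what lets each event class supply its own query while sharing the short-circuit.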
......@@ -8,8 +8,9 @@ module Gitlab
include BaseEvent
def process
if should_sync?
if replicable_container_repository?
registry.repository_updated!
registry.save
job_id = ::Geo::ContainerRepositorySyncWorker.perform_async(event.container_repository_id)
end
......@@ -19,11 +20,21 @@ module Gitlab
private
def should_sync?
strong_memoize(:should_sync) do
::Geo::ContainerRepositoryRegistry.replication_enabled? &&
registry.container_repository &&
replicable_project?(registry.container_repository.project_id)
def replicable_container_repository?
id = event.container_repository_id
strong_memoize(:"replicable_container_repository_#{id}") do
next false unless ::Geo::ContainerRepositoryRegistry.replication_enabled?
# If a registry exists, then it *should* be replicated. The
# registry will be removed by the delete event or
# RegistryConsistencyWorker if it should no longer be replicated.
#
# This early exit helps keep processing of update events
# efficient.
next true if registry.persisted?
Gitlab::Geo.current_node.container_repositories_include?(id)
end
end
......@@ -38,10 +49,9 @@ module Gitlab
def log_event(job_id)
super(
'Docker Repository update',
container_repository_id: registry.container_repository_id,
should_sync: should_sync?,
container_repository_id: event.container_repository_id,
replication_enabled: ::Geo::ContainerRepositoryRegistry.replication_enabled?,
replicable_project: replicable_project?(registry.container_repository.project_id),
replicable_container_repository: replicable_container_repository?,
project_id: registry.container_repository.project_id,
job_id: job_id)
end
......
......@@ -9,7 +9,7 @@ module Gitlab
def process
job_id =
if replicable_project?(event.project_id)
if replicable_design?
registry.repository_updated!
registry.save
......@@ -25,6 +25,12 @@ module Gitlab
@registry ||= ::Geo::DesignRegistry.find_or_initialize_by(project_id: event.project_id) # rubocop: disable CodeReuse/ActiveRecord
end
def replicable_design?
memoize_and_short_circuit_if_registry_is_persisted(:"replicable_design_#{event.project_id}", registry) do
Gitlab::Geo.current_node.designs_include?(event.project_id)
end
end
def schedule_job(event)
enqueue_job_if_shard_healthy(event) do
::Geo::DesignRepositorySyncWorker.perform_async(event.project_id)
......@@ -36,7 +42,7 @@ module Gitlab
'Design repository update',
project_id: event.project_id,
scheduled_at: Time.now,
replicable_project: replicable_project?(event.project_id),
replicable_design: replicable_design?,
job_id: job_id)
end
end
......
......@@ -8,7 +8,7 @@ module Gitlab
include BaseEvent
def process
if replicable_project?(event.project_id)
if replicable_project?
registry.repository_created!(event)
job_id = nil
......@@ -31,7 +31,7 @@ module Gitlab
wiki_path: event.wiki_path,
resync_repository: registry.resync_repository,
resync_wiki: registry.resync_wiki,
replicable_project: replicable_project?(event.project_id),
replicable_project: replicable_project?,
job_id: job_id)
end
end
......
......@@ -8,7 +8,7 @@ module Gitlab
include BaseEvent
def process
if replicable_project?(event.project_id)
if replicable_project?
registry.repository_updated!(event.source, scheduled_at)
job_id = enqueue_job_if_shard_healthy(event) do
......@@ -33,7 +33,7 @@ module Gitlab
resync_repository: registry.resync_repository,
resync_wiki: registry.resync_wiki,
scheduled_at: scheduled_at,
replicable_project: replicable_project?(event.project_id),
replicable_project: replicable_project?,
job_id: job_id)
end
......
......@@ -30,25 +30,11 @@ RSpec.describe Gitlab::Geo::LogCursor::Events::ContainerRepositoryUpdatedEvent,
stub_config(geo: { registry_replication: { enabled: true } })
end
context "when the container repository's project is not excluded by selective sync" do
# TODO: Fix https://gitlab.com/gitlab-org/gitlab/-/issues/233514 and
# use this test, and remove the other tests in this context.
# it_behaves_like 'event should trigger a sync'
context 'when a registry does not yet exist' do
it_behaves_like 'event schedules a sync worker'
it_behaves_like 'logs event source info'
end
context 'when a registry exists' do
let!(:registry) { create(registry_factory, *registry_factory_args) }
it_behaves_like 'event schedules a sync worker'
it_behaves_like 'logs event source info'
end
context "when the container repository is not excluded by selective sync" do
it_behaves_like 'event should trigger a sync'
end
context "when the container repository's project is excluded by selective sync" do
context "when the container repository is excluded by selective sync" do
before do
stub_current_geo_node(secondary_excludes_all_projects)
end
......
......@@ -30,11 +30,24 @@ RSpec.describe Gitlab::Geo::LogCursor::Events::DesignRepositoryUpdatedEvent, :cl
allow(Gitlab::ShardHealthCache).to receive(:healthy_shard?).with('default').and_return(true)
end
context "when the container repository's project is not excluded by selective sync" do
context 'when the design repository is not excluded by selective sync' do
it_behaves_like 'event should trigger a sync'
context 'when the project is included in selective sync but there is no design' do
before do
node = create(:geo_node, selective_sync_type: 'shards', selective_sync_shards: [project.repository_storage])
stub_current_geo_node(node)
end
context 'when a registry does not yet exist' do
it_behaves_like 'event does not create a registry'
it_behaves_like 'event does not schedule a sync worker'
it_behaves_like 'logs event source info'
end
end
end
context "when the container repository's project is excluded by selective sync" do
context "when the design repository is excluded by selective sync" do
before do
stub_current_geo_node(secondary_excludes_all_projects)
end
......
......@@ -3,6 +3,7 @@
module Gitlab
module Diff
class HighlightCache
include Gitlab::Utils::Gzip
include Gitlab::Utils::StrongMemoize
EXPIRATION = 1.week
......@@ -83,7 +84,7 @@ module Gitlab
redis.hset(
key,
diff_file_id,
compose_data(highlighted_diff_lines_hash.to_json)
gzip_compress(highlighted_diff_lines_hash.to_json)
)
end
......@@ -145,35 +146,12 @@ module Gitlab
end
results.map! do |result|
Gitlab::Json.parse(extract_data(result), symbolize_names: true) unless result.nil?
Gitlab::Json.parse(gzip_decompress(result), symbolize_names: true) unless result.nil?
end
file_paths.zip(results).to_h
end
def compose_data(json_data)
# #compress returns ASCII-8BIT, so we need to force the encoding to
# UTF-8 before caching it in redis, else we risk encoding mismatch
# errors.
#
ActiveSupport::Gzip.compress(json_data).force_encoding("UTF-8")
rescue Zlib::GzipFile::Error
json_data
end
def extract_data(data)
# Since we could be dealing with an already populated cache full of data
# that isn't gzipped, we want to also check to see if the data is
# gzipped before we attempt to #decompress it, thus we check the first
# 2 bytes for "\x1F\x8B" to confirm it is a gzipped string. While a
# non-gzipped string will raise a Zlib::GzipFile::Error, which we're
# rescuing, we don't want to count on rescue for control flow.
#
data[0..1] == "\x1F\x8B" ? ActiveSupport::Gzip.decompress(data) : data
rescue Zlib::GzipFile::Error
data
end
def cacheable?(diff_file)
diffable.present? && diff_file.text? && diff_file.diffable?
end
......
......@@ -3,6 +3,8 @@
module Gitlab
module DiscussionsDiff
class HighlightCache
extend Gitlab::Utils::Gzip
class << self
VERSION = 1
EXPIRATION = 1.week
......@@ -17,7 +19,7 @@ module Gitlab
mapping.each do |raw_key, value|
key = cache_key_for(raw_key)
multi.set(key, value.to_json, ex: EXPIRATION)
multi.set(key, gzip_compress(value.to_json), ex: EXPIRATION)
end
end
end
......@@ -44,7 +46,7 @@ module Gitlab
content.map! do |lines|
next unless lines
Gitlab::Json.parse(lines).map! do |line|
Gitlab::Json.parse(gzip_decompress(lines)).map! do |line|
Gitlab::Diff::Line.safe_init_from_hash(line)
end
end
......
# frozen_string_literal: true
module Gitlab
module Utils
module Gzip
def gzip_compress(data)
# .compress returns ASCII-8BIT, so we need to force the encoding to
# UTF-8 before caching it in redis, else we risk encoding mismatch
# errors.
#
ActiveSupport::Gzip.compress(data).force_encoding("UTF-8")
rescue Zlib::GzipFile::Error
data
end
def gzip_decompress(data)
# Since we could be dealing with an already populated cache full of data
# that isn't gzipped, we want to also check to see if the data is
# gzipped before we attempt to .decompress it, thus we check the first
# 2 bytes for "\x1F\x8B" to confirm it is a gzipped string. While a
# non-gzipped string will raise a Zlib::GzipFile::Error, which we're
# rescuing, we don't want to count on rescue for control flow.
#
data[0..1] == "\x1F\x8B" ? ActiveSupport::Gzip.decompress(data) : data
rescue Zlib::GzipFile::Error
data
end
end
end
end
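The backward-compatible read path described in the comments above can be tried without Rails. The following is a sketch under stated assumptions: `GzipSketch` is a hypothetical stand-in for `Gitlab::Utils::Gzip`, it uses the standard library's `Zlib.gzip` and `Zlib.gunzip` in place of `ActiveSupport::Gzip`, and it compares the first two bytes numerically so that the check does not depend on string encodings.

```ruby
require 'zlib'
require 'json'

# Hypothetical stand-in for Gitlab::Utils::Gzip, built on stdlib Zlib
# instead of ActiveSupport::Gzip so that it runs without Rails.
module GzipSketch
  GZIP_MAGIC = [0x1F, 0x8B].freeze

  def gzip_compress(data)
    # Zlib.gzip returns a binary (ASCII-8BIT) string; relabel it as UTF-8,
    # mirroring the original helper.
    Zlib.gzip(data).force_encoding('UTF-8')
  rescue Zlib::GzipFile::Error
    data
  end

  def gzip_decompress(data)
    # Values cached before compression was introduced are returned untouched.
    return data unless data.byteslice(0, 2).bytes == GZIP_MAGIC

    Zlib.gunzip(data).force_encoding('UTF-8')
  rescue Zlib::GzipFile::Error
    data
  end
end

helper = Object.new.extend(GzipSketch)
payload = JSON.generate('lines' => ['+ added line'] * 40)

compressed = helper.gzip_compress(payload)
restored = helper.gzip_decompress(compressed) # same JSON as payload
legacy = helper.gzip_decompress(payload)      # uncompressed input passes through
```

Because JSON text never starts with the bytes `0x1F 0x8B`, the magic-byte check is enough to tell old uncompressed cache entries from new compressed ones.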
......@@ -33,9 +33,9 @@ RSpec.describe Gitlab::DiscussionsDiff::HighlightCache, :clean_gitlab_redis_cach
mapping.each do |key, value|
full_key = described_class.cache_key_for(key)
found = Gitlab::Redis::Cache.with { |r| r.get(full_key) }
found_key = Gitlab::Redis::Cache.with { |r| r.get(full_key) }
expect(found).to eq(value.to_json)
expect(described_class.gzip_decompress(found_key)).to eq(value.to_json)
end
end
end
......
# frozen_string_literal: true
require 'fast_spec_helper'
RSpec.describe Gitlab::Utils::Gzip do
before do
example_class = Class.new do
include Gitlab::Utils::Gzip
def lorem_ipsum
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod "\
"tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim "\
"veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea "\
"commodo consequat. Duis aute irure dolor in reprehenderit in voluptate "\
"velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat "\
"cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id "\
"est laborum."
end
end
stub_const('ExampleClass', example_class)
end
subject { ExampleClass.new }
let(:sample_string) { subject.lorem_ipsum }
let(:compressed_string) { subject.gzip_compress(sample_string) }
describe "#gzip_compress" do
it "compresses data passed to it" do
expect(compressed_string.length).to be < sample_string.length
end
it "returns uncompressed data when encountering Zlib::GzipFile::Error" do
expect(ActiveSupport::Gzip).to receive(:compress).and_raise(Zlib::GzipFile::Error)
expect(compressed_string.length).to eq sample_string.length
end
end
describe "#gzip_decompress" do
let(:decompressed_string) { subject.gzip_decompress(compressed_string) }
it "decompresses encoded data" do
expect(decompressed_string).to eq sample_string
end
it "returns compressed data when encountering Zlib::GzipFile::Error" do
expect(ActiveSupport::Gzip).to receive(:decompress).and_raise(Zlib::GzipFile::Error)
expect(decompressed_string).to eq compressed_string
end
it "returns unmodified data when it is determined to be uncompressed" do
expect(subject.gzip_decompress(sample_string)).to eq sample_string
end
end
end