Commit 27b2b7df authored by Adam Hegyi's avatar Adam Hegyi

Merge branch '339811-enable-2pc-check-ci' into 'master'

Enable 2PC check on CI [RUN AS-IF-FOSS]

See merge request gitlab-org/gitlab!70795
parents 4d3d2bb4 d8fa8f6b
......@@ -425,6 +425,92 @@ to fix the cross-join. If the cross-join is being used in a migration, we do not
need to fix the code. See <https://gitlab.com/gitlab-org/gitlab/-/issues/340017>
for more details.
### Removing cross-database transactions
When dealing with multiple databases, it's important to pay close attention to data modification
that affects more than one database.
[Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/339811) GitLab 14.4, an automated check
prevents cross-database modifications.
When at least two different databases are modified during a transaction initiated on any database
server, the application triggers a cross-database modification error (only in test environment).
Example:
```ruby
# Open transaction on Main DB
ApplicationRecord.transaction do
ci_build.update!(updated_at: Time.current) # UPDATE on CI DB
ci_build.project.update!(updated_at: Time.current) # UPDATE on Main DB
end
# raises error: Cross-database data modification of 'main, ci' were detected within
# a transaction modifying the 'ci_build, projects' tables
```
The code example above updates the timestamp for two records within a transaction. With the
ongoing work on the CI database decomposition, we cannot ensure the schematics of a database
transaction.
If the second update query fails, the first update query will not be
rolled back because the `ci_build` record is located on a different database server. For
more information, look at the
[transaction guidelines](transaction_guidelines.md#dangerous-example-third-party-api-calls)
page.
#### Fixing cross-database errors
##### Removing the transaction block
Without an open transaction, the cross-database modification check cannot raise an error.
By making this change, we sacrifice consistency. In case of an application failure after the
first `UPDATE` query, the second `UPDATE` query will never execute.
The same code without the `transaction` block:
```ruby
ci_build.update!(updated_at: Time.current) # CI DB
ci_build.project.update!(updated_at: Time.current) # Main DB
```
##### Async processing
If we need more guarantee that an operation finishes the work consistently we can execute it
within a background job. A background job is scheduled asynchronously and retried several times
in case of an error. There is still a very small chance of introducing inconsistency.
Example:
```ruby
current_time = Time.current
MyAsyncConsistencyJob.perform_async(cu_build.id)
ci_build.update!(updated_at: current_time)
ci_build.project.update!(updated_at: current_time)
```
The `MyAsyncConsistencyJob` would also attempt to update the timestamp if they differ.
##### Aiming for perfect consistency
At this point, we don't have the tooling (we might not even need it) to ensure similar consistency
characteristics as we had with one database. If you think that the code you're working on requires
these properties, then you can disable the cross-database modification check by wrapping to
offending database queries with a block and create a follow-up issue mentioning the sharding group
(`gitlab-org/sharding-group`).
```ruby
Gitlab::Database.allow_cross_joins_across_databases(url: 'gitlab issue URL') do
ApplicationRecord.transaction do
ci_build.update!(updated_at: Time.current) # UPDATE on CI DB
ci_build.project.update!(updated_at: Time.current) # UPDATE on Main DB
end
end
```
Don't hesitate to reach out to the
[sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding)
for advice.
## `config/database.yml`
GitLab will support running multiple databases in the future, for example to [separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167) from the main database. In order to prepare for this change, we [validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877) in `database.yml` to ensure that only known databases are used.
......
......@@ -18,6 +18,10 @@ FactoryBot.define do
transient { child_of { nil } }
transient { upstream_of { nil } }
before(:create) do |pipeline, evaluator|
pipeline.ensure_project_iid!
end
after(:build) do |pipeline, evaluator|
if evaluator.child_of
pipeline.project = evaluator.child_of.project
......@@ -48,7 +52,6 @@ FactoryBot.define do
transient { ci_ref_presence { true } }
before(:create) do |pipeline, evaluator|
pipeline.ensure_project_iid!
pipeline.ensure_ci_ref! if evaluator.ci_ref_presence && pipeline.ci_ref_id.nil?
end
......
......@@ -33,8 +33,10 @@ module Database
end
def cleanup_with_cross_database_modification_prevented
ActiveSupport::Notifications.unsubscribe(PreventCrossDatabaseModification.cross_database_context[:subscriber])
PreventCrossDatabaseModification.cross_database_context[:enabled] = false
if PreventCrossDatabaseModification.cross_database_context
ActiveSupport::Notifications.unsubscribe(PreventCrossDatabaseModification.cross_database_context[:subscriber])
PreventCrossDatabaseModification.cross_database_context[:enabled] = false
end
end
end
......@@ -55,8 +57,11 @@ module Database
end
def self.prevent_cross_database_modification!(connection, sql)
return unless cross_database_context
return unless cross_database_context[:enabled]
return if connection.pool.instance_of?(ActiveRecord::ConnectionAdapters::NullPool)
database = connection.pool.db_config.name
if sql.start_with?('SAVEPOINT')
......@@ -87,7 +92,7 @@ module Database
if schemas.many?
raise Database::PreventCrossDatabaseModification::CrossDatabaseModificationAcrossUnsupportedTablesError,
"Cross-database data modification of '#{schemas.to_a.join(", ")}' were detected within " \
"a transaction modifying the '#{all_tables.to_a.join(", ")}'"
"a transaction modifying the '#{all_tables.to_a.join(", ")}' tables"
end
end
end
......@@ -96,16 +101,20 @@ end
Gitlab::Database.singleton_class.prepend(
Database::PreventCrossDatabaseModification::GitlabDatabaseMixin)
CROSS_DB_MODIFICATION_ALLOW_LIST = Set.new(YAML.load_file(File.join(__dir__, 'cross-database-modification-allowlist.yml'))).freeze
RSpec.configure do |config|
config.include(::Database::PreventCrossDatabaseModification::SpecHelpers)
# Using before and after blocks because the around block causes problems with the let_it_be
# record creations. It makes an extra savepoint which breaks the transaction count logic.
config.before(:each, :prevent_cross_database_modification) do
with_cross_database_modification_prevented
config.before do |example_file|
if CROSS_DB_MODIFICATION_ALLOW_LIST.exclude?(example_file.file_path)
with_cross_database_modification_prevented
end
end
config.after(:each, :prevent_cross_database_modification) do
config.after do |example_file|
cleanup_with_cross_database_modification_prevented
end
end
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment