Commit a2bb950f authored by Andreas Brandl's avatar Andreas Brandl

Merge branch 'pb-remove-bg-migration-schema-helpers' into 'master'

Remove migration helpers using BG migrations

See merge request gitlab-org/gitlab!77768
parents 66b1717d 0aa5b237
...@@ -228,100 +228,9 @@ can take a very long time to complete, preventing a deployment from proceeding. ...@@ -228,100 +228,9 @@ can take a very long time to complete, preventing a deployment from proceeding.
They can also produce a lot of pressure on the database due to it rapidly They can also produce a lot of pressure on the database due to it rapidly
updating many rows in sequence. updating many rows in sequence.
To reduce database pressure you should instead use To reduce database pressure you should instead use a background migration
`change_column_type_using_background_migration` or `rename_column_using_background_migration` when migrating a column in a large table (for example, `issues`). This will
when migrating a column in a large table (for example, `issues`). These methods work spread the work / load over a longer time period, without slowing down deployments.
similarly to the concurrent counterparts but uses background migration to spread
the work / load over a longer time period, without slowing down deployments.
For example, to change the column type using a background migration:
```ruby
class ExampleMigration < Gitlab::Database::Migration[1.0]
disable_ddl_transaction!
class Issue < ActiveRecord::Base
self.table_name = 'issues'
include EachBatch
def self.to_migrate
where('closed_at IS NOT NULL')
end
end
def up
change_column_type_using_background_migration(
Issue.to_migrate,
:closed_at,
:datetime_with_timezone
)
end
def down
change_column_type_using_background_migration(
Issue.to_migrate,
:closed_at,
:datetime
)
end
end
```
This would change the type of `issues.closed_at` to `timestamp with time zone`.
Keep in mind that the relation passed to
`change_column_type_using_background_migration` _must_ include `EachBatch`,
otherwise it will raise a `TypeError`.
This migration then needs to be followed in a separate release (_not_ a patch
release) by a cleanup migration, which should steal from the queue and handle
any remaining rows. For example:
```ruby
class MigrateRemainingIssuesClosedAt < Gitlab::Database::Migration[1.0]
disable_ddl_transaction!
class Issue < ActiveRecord::Base
self.table_name = 'issues'
include EachBatch
end
def up
Gitlab::BackgroundMigration.steal('CopyColumn')
Gitlab::BackgroundMigration.steal('CleanupConcurrentTypeChange')
migrate_remaining_rows if migrate_column_type?
end
def down
# Previous migrations already revert the changes made here.
end
def migrate_remaining_rows
Issue.where('closed_at_for_type_change IS NULL AND closed_at IS NOT NULL').each_batch do |batch|
batch.update_all('closed_at_for_type_change = closed_at')
end
cleanup_concurrent_column_type_change(:issues, :closed_at)
end
def migrate_column_type?
# Some environments may have already executed the previous version of this
# migration, thus we don't need to migrate those environments again.
column_for('issues', 'closed_at').type == :datetime # rubocop:disable Migration/Datetime
end
end
```
The same applies to `rename_column_using_background_migration`:
1. Create a migration using the helper, which will schedule background
migrations to spread the writes over a longer period of time.
1. In the next monthly release, create a clean-up migration to steal from the
Sidekiq queues, migrate any missing rows, and cleanup the rename. This
migration should skip the steps after stealing from the Sidekiq queues if the
column has already been renamed.
For more information, see [the documentation on cleaning up background For more information, see [the documentation on cleaning up background
migrations](background_migrations.md#cleaning-up). migrations](background_migrations.md#cleaning-up).
......
...@@ -778,186 +778,6 @@ module Gitlab ...@@ -778,186 +778,6 @@ module Gitlab
install_rename_triggers(table, old, new) install_rename_triggers(table, old, new)
end end
# Changes the column type of a table using a background migration.
#
# Because this method uses a background migration it's more suitable for
# large tables. For small tables it's better to use
# `change_column_type_concurrently` since it can complete its work in a
# much shorter amount of time and doesn't rely on Sidekiq.
#
# Example usage:
#
# class Issue < ActiveRecord::Base
# self.table_name = 'issues'
#
# include EachBatch
#
# def self.to_migrate
# where('closed_at IS NOT NULL')
# end
# end
#
# change_column_type_using_background_migration(
# Issue.to_migrate,
# :closed_at,
# :datetime_with_timezone
# )
#
# Reverting a migration like this is done exactly the same way, just with
# a different type to migrate to (e.g. `:datetime` in the above example).
#
# relation - An ActiveRecord relation to use for scheduling jobs and
# figuring out what table we're modifying. This relation _must_
# have the EachBatch module included.
#
# column - The name of the column for which the type will be changed.
#
# new_type - The new type of the column.
#
# batch_size - The number of rows to schedule in a single background
# migration.
#
# interval - The time interval between every background migration.
def change_column_type_using_background_migration(
relation,
column,
new_type,
batch_size: 10_000,
interval: 10.minutes
)
unless relation.model < EachBatch
raise TypeError, 'The relation must include the EachBatch module'
end
temp_column = "#{column}_for_type_change"
table = relation.table_name
max_index = 0
add_column(table, temp_column, new_type)
install_rename_triggers(table, column, temp_column)
# Schedule the jobs that will copy the data from the old column to the
# new one. Rows with NULL values in our source column are skipped since
# the target column is already NULL at this point.
relation.where.not(column => nil).each_batch(of: batch_size) do |batch, index|
start_id, end_id = batch.pluck('MIN(id), MAX(id)').first
max_index = index
migrate_in(
index * interval,
'CopyColumn',
[table, column, temp_column, start_id, end_id]
)
end
# Schedule the renaming of the column to happen (initially) 1 hour after
# the last batch finished.
migrate_in(
(max_index * interval) + 1.hour,
'CleanupConcurrentTypeChange',
[table, column, temp_column]
)
if perform_background_migration_inline?
# To ensure the schema is up to date immediately we perform the
# migration inline in dev / test environments.
Gitlab::BackgroundMigration.steal('CopyColumn')
Gitlab::BackgroundMigration.steal('CleanupConcurrentTypeChange')
end
end
# Renames a column using a background migration.
#
# Because this method uses a background migration it's more suitable for
# large tables. For small tables it's better to use
# `rename_column_concurrently` since it can complete its work in a much
# shorter amount of time and doesn't rely on Sidekiq.
#
# Example usage:
#
# rename_column_using_background_migration(
# :users,
# :feed_token,
# :rss_token
# )
#
# table - The name of the database table containing the column.
#
# old - The old column name.
#
# new - The new column name.
#
# type - The type of the new column. If no type is given the old column's
# type is used.
#
# batch_size - The number of rows to schedule in a single background
# migration.
#
# interval - The time interval between every background migration.
def rename_column_using_background_migration(
table,
old_column,
new_column,
type: nil,
batch_size: 10_000,
interval: 10.minutes
)
check_trigger_permissions!(table)
old_col = column_for(table, old_column)
new_type = type || old_col.type
max_index = 0
add_column(table, new_column, new_type,
limit: old_col.limit,
precision: old_col.precision,
scale: old_col.scale)
# We set the default value _after_ adding the column so we don't end up
# updating any existing data with the default value. This isn't
# necessary since we copy over old values further down.
change_column_default(table, new_column, old_col.default) if old_col.default
install_rename_triggers(table, old_column, new_column)
model = Class.new(ActiveRecord::Base) do
self.table_name = table
include ::EachBatch
end
# Schedule the jobs that will copy the data from the old column to the
# new one. Rows with NULL values in our source column are skipped since
# the target column is already NULL at this point.
model.where.not(old_column => nil).each_batch(of: batch_size) do |batch, index|
start_id, end_id = batch.pluck('MIN(id), MAX(id)').first
max_index = index
migrate_in(
index * interval,
'CopyColumn',
[table, old_column, new_column, start_id, end_id]
)
end
# Schedule the renaming of the column to happen (initially) 1 hour after
# the last batch finished.
migrate_in(
(max_index * interval) + 1.hour,
'CleanupConcurrentRename',
[table, old_column, new_column]
)
if perform_background_migration_inline?
# To ensure the schema is up to date immediately we perform the
# migration inline in dev / test environments.
Gitlab::BackgroundMigration.steal('CopyColumn')
Gitlab::BackgroundMigration.steal('CleanupConcurrentRename')
end
end
def convert_to_bigint_column(column) def convert_to_bigint_column(column)
"#{column}_convert_to_bigint" "#{column}_convert_to_bigint"
end end
......
...@@ -152,10 +152,6 @@ module Gitlab ...@@ -152,10 +152,6 @@ module Gitlab
delete_job_tracking(class_name, status: delete_tracking_jobs) if delete_tracking_jobs delete_job_tracking(class_name, status: delete_tracking_jobs) if delete_tracking_jobs
end end
def perform_background_migration_inline?
Rails.env.test? || Rails.env.development?
end
def migrate_in(*args, coordinator: coordinator_for_tracking_database) def migrate_in(*args, coordinator: coordinator_for_tracking_database)
with_migration_context do with_migration_context do
coordinator.perform_in(*args) coordinator.perform_in(*args)
......
...@@ -1752,116 +1752,6 @@ RSpec.describe Gitlab::Database::MigrationHelpers do ...@@ -1752,116 +1752,6 @@ RSpec.describe Gitlab::Database::MigrationHelpers do
end end
end end
describe '#change_column_type_using_background_migration' do
let!(:issue) { create(:issue, :closed, closed_at: Time.zone.now) }
let(:issue_model) do
Class.new(ActiveRecord::Base) do
self.table_name = 'issues'
include EachBatch
end
end
it 'changes the type of a column using a background migration' do
expect(model)
.to receive(:add_column)
.with('issues', 'closed_at_for_type_change', :datetime_with_timezone)
expect(model)
.to receive(:install_rename_triggers)
.with('issues', :closed_at, 'closed_at_for_type_change')
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
10.minutes,
'CopyColumn',
['issues', :closed_at, 'closed_at_for_type_change', issue.id, issue.id]
)
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
1.hour + 10.minutes,
'CleanupConcurrentTypeChange',
['issues', :closed_at, 'closed_at_for_type_change']
)
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CopyColumn')
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CleanupConcurrentTypeChange')
model.change_column_type_using_background_migration(
issue_model.all,
:closed_at,
:datetime_with_timezone
)
end
end
describe '#rename_column_using_background_migration' do
let!(:issue) { create(:issue, :closed, closed_at: Time.zone.now) }
it 'renames a column using a background migration' do
expect(model)
.to receive(:add_column)
.with(
'issues',
:closed_at_timestamp,
:datetime_with_timezone,
limit: anything,
precision: anything,
scale: anything
)
expect(model)
.to receive(:install_rename_triggers)
.with('issues', :closed_at, :closed_at_timestamp)
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
10.minutes,
'CopyColumn',
['issues', :closed_at, :closed_at_timestamp, issue.id, issue.id]
)
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
1.hour + 10.minutes,
'CleanupConcurrentRename',
['issues', :closed_at, :closed_at_timestamp]
)
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CopyColumn')
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CleanupConcurrentRename')
model.rename_column_using_background_migration(
'issues',
:closed_at,
:closed_at_timestamp
)
end
end
describe '#convert_to_bigint_column' do describe '#convert_to_bigint_column' do
it 'returns the name of the temporary column used to convert to bigint' do it 'returns the name of the temporary column used to convert to bigint' do
expect(model.convert_to_bigint_column(:id)).to eq('id_convert_to_bigint') expect(model.convert_to_bigint_column(:id)).to eq('id_convert_to_bigint')
...@@ -2065,8 +1955,6 @@ RSpec.describe Gitlab::Database::MigrationHelpers do ...@@ -2065,8 +1955,6 @@ RSpec.describe Gitlab::Database::MigrationHelpers do
t.integer :other_id t.integer :other_id
t.timestamps t.timestamps
end end
allow(model).to receive(:perform_background_migration_inline?).and_return(false)
end end
context 'when the target table does not exist' do context 'when the target table does not exist' do
......
...@@ -438,26 +438,6 @@ RSpec.describe Gitlab::Database::Migrations::BackgroundMigrationHelpers do ...@@ -438,26 +438,6 @@ RSpec.describe Gitlab::Database::Migrations::BackgroundMigrationHelpers do
it_behaves_like 'helpers that enqueue background migrations', BackgroundMigrationWorker, 'main' it_behaves_like 'helpers that enqueue background migrations', BackgroundMigrationWorker, 'main'
end end
describe '#perform_background_migration_inline?' do
it 'returns true in a test environment' do
stub_rails_env('test')
expect(model.perform_background_migration_inline?).to eq(true)
end
it 'returns true in a development environment' do
stub_rails_env('development')
expect(model.perform_background_migration_inline?).to eq(true)
end
it 'returns false in a production environment' do
stub_rails_env('production')
expect(model.perform_background_migration_inline?).to eq(false)
end
end
describe '#delete_job_tracking' do describe '#delete_job_tracking' do
let!(:job_class_name) { 'TestJob' } let!(:job_class_name) { 'TestJob' }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment