Commit 0aa5b237 authored by pbair's avatar pbair

Remove migration helpers using BG migrations

Remove two older and not-recommended migration helpers which use
background migrations to rename or change type of a column. Generally we
prefer to use a custom BG migration for the time being, since these
migration helpers have a few issues with their implementation.

These migration helpers are also a good candidate to be reimplemented
using batched background migrations, following current "best practices."
parent 538543c5
......@@ -228,100 +228,9 @@ can take a very long time to complete, preventing a deployment from proceeding.
They can also produce a lot of pressure on the database due to it rapidly
updating many rows in sequence.
To reduce database pressure you should instead use
`change_column_type_using_background_migration` or `rename_column_using_background_migration`
when migrating a column in a large table (for example, `issues`). These methods work
similarly to the concurrent counterparts but uses background migration to spread
the work / load over a longer time period, without slowing down deployments.
For example, to change the column type using a background migration:
```ruby
class ExampleMigration < Gitlab::Database::Migration[1.0]
disable_ddl_transaction!
class Issue < ActiveRecord::Base
self.table_name = 'issues'
include EachBatch
def self.to_migrate
where('closed_at IS NOT NULL')
end
end
def up
change_column_type_using_background_migration(
Issue.to_migrate,
:closed_at,
:datetime_with_timezone
)
end
def down
change_column_type_using_background_migration(
Issue.to_migrate,
:closed_at,
:datetime
)
end
end
```
This would change the type of `issues.closed_at` to `timestamp with time zone`.
Keep in mind that the relation passed to
`change_column_type_using_background_migration` _must_ include `EachBatch`,
otherwise it will raise a `TypeError`.
This migration then needs to be followed in a separate release (_not_ a patch
release) by a cleanup migration, which should steal from the queue and handle
any remaining rows. For example:
```ruby
class MigrateRemainingIssuesClosedAt < Gitlab::Database::Migration[1.0]
disable_ddl_transaction!
class Issue < ActiveRecord::Base
self.table_name = 'issues'
include EachBatch
end
def up
Gitlab::BackgroundMigration.steal('CopyColumn')
Gitlab::BackgroundMigration.steal('CleanupConcurrentTypeChange')
migrate_remaining_rows if migrate_column_type?
end
def down
# Previous migrations already revert the changes made here.
end
def migrate_remaining_rows
Issue.where('closed_at_for_type_change IS NULL AND closed_at IS NOT NULL').each_batch do |batch|
batch.update_all('closed_at_for_type_change = closed_at')
end
cleanup_concurrent_column_type_change(:issues, :closed_at)
end
def migrate_column_type?
# Some environments may have already executed the previous version of this
# migration, thus we don't need to migrate those environments again.
column_for('issues', 'closed_at').type == :datetime # rubocop:disable Migration/Datetime
end
end
```
The same applies to `rename_column_using_background_migration`:
1. Create a migration using the helper, which will schedule background
migrations to spread the writes over a longer period of time.
1. In the next monthly release, create a clean-up migration to steal from the
Sidekiq queues, migrate any missing rows, and cleanup the rename. This
migration should skip the steps after stealing from the Sidekiq queues if the
column has already been renamed.
To reduce database pressure you should instead use a background migration
when migrating a column in a large table (for example, `issues`). This will
spread the work / load over a longer time period, without slowing down deployments.
For more information, see [the documentation on cleaning up background
migrations](background_migrations.md#cleaning-up).
......
......@@ -778,186 +778,6 @@ module Gitlab
install_rename_triggers(table, old, new)
end
# Changes the column type of a table using a background migration.
#
# Because this method uses a background migration it's more suitable for
# large tables. For small tables it's better to use
# `change_column_type_concurrently` since it can complete its work in a
# much shorter amount of time and doesn't rely on Sidekiq.
#
# Example usage:
#
# class Issue < ActiveRecord::Base
# self.table_name = 'issues'
#
# include EachBatch
#
# def self.to_migrate
# where('closed_at IS NOT NULL')
# end
# end
#
# change_column_type_using_background_migration(
# Issue.to_migrate,
# :closed_at,
# :datetime_with_timezone
# )
#
# Reverting a migration like this is done exactly the same way, just with
# a different type to migrate to (e.g. `:datetime` in the above example).
#
# relation - An ActiveRecord relation to use for scheduling jobs and
# figuring out what table we're modifying. This relation _must_
# have the EachBatch module included.
#
# column - The name of the column for which the type will be changed.
#
# new_type - The new type of the column.
#
# batch_size - The number of rows to schedule in a single background
# migration.
#
# interval - The time interval between every background migration.
def change_column_type_using_background_migration(
relation,
column,
new_type,
batch_size: 10_000,
interval: 10.minutes
)
unless relation.model < EachBatch
raise TypeError, 'The relation must include the EachBatch module'
end
temp_column = "#{column}_for_type_change"
table = relation.table_name
max_index = 0
add_column(table, temp_column, new_type)
install_rename_triggers(table, column, temp_column)
# Schedule the jobs that will copy the data from the old column to the
# new one. Rows with NULL values in our source column are skipped since
# the target column is already NULL at this point.
relation.where.not(column => nil).each_batch(of: batch_size) do |batch, index|
start_id, end_id = batch.pluck('MIN(id), MAX(id)').first
max_index = index
migrate_in(
index * interval,
'CopyColumn',
[table, column, temp_column, start_id, end_id]
)
end
# Schedule the renaming of the column to happen (initially) 1 hour after
# the last batch finished.
migrate_in(
(max_index * interval) + 1.hour,
'CleanupConcurrentTypeChange',
[table, column, temp_column]
)
if perform_background_migration_inline?
# To ensure the schema is up to date immediately we perform the
# migration inline in dev / test environments.
Gitlab::BackgroundMigration.steal('CopyColumn')
Gitlab::BackgroundMigration.steal('CleanupConcurrentTypeChange')
end
end
# Renames a column using a background migration.
#
# Because this method uses a background migration it's more suitable for
# large tables. For small tables it's better to use
# `rename_column_concurrently` since it can complete its work in a much
# shorter amount of time and doesn't rely on Sidekiq.
#
# Example usage:
#
# rename_column_using_background_migration(
# :users,
# :feed_token,
# :rss_token
# )
#
# table - The name of the database table containing the column.
#
# old - The old column name.
#
# new - The new column name.
#
# type - The type of the new column. If no type is given the old column's
# type is used.
#
# batch_size - The number of rows to schedule in a single background
# migration.
#
# interval - The time interval between every background migration.
def rename_column_using_background_migration(
table,
old_column,
new_column,
type: nil,
batch_size: 10_000,
interval: 10.minutes
)
check_trigger_permissions!(table)
old_col = column_for(table, old_column)
new_type = type || old_col.type
max_index = 0
add_column(table, new_column, new_type,
limit: old_col.limit,
precision: old_col.precision,
scale: old_col.scale)
# We set the default value _after_ adding the column so we don't end up
# updating any existing data with the default value. This isn't
# necessary since we copy over old values further down.
change_column_default(table, new_column, old_col.default) if old_col.default
install_rename_triggers(table, old_column, new_column)
model = Class.new(ActiveRecord::Base) do
self.table_name = table
include ::EachBatch
end
# Schedule the jobs that will copy the data from the old column to the
# new one. Rows with NULL values in our source column are skipped since
# the target column is already NULL at this point.
model.where.not(old_column => nil).each_batch(of: batch_size) do |batch, index|
start_id, end_id = batch.pluck('MIN(id), MAX(id)').first
max_index = index
migrate_in(
index * interval,
'CopyColumn',
[table, old_column, new_column, start_id, end_id]
)
end
# Schedule the renaming of the column to happen (initially) 1 hour after
# the last batch finished.
migrate_in(
(max_index * interval) + 1.hour,
'CleanupConcurrentRename',
[table, old_column, new_column]
)
if perform_background_migration_inline?
# To ensure the schema is up to date immediately we perform the
# migration inline in dev / test environments.
Gitlab::BackgroundMigration.steal('CopyColumn')
Gitlab::BackgroundMigration.steal('CleanupConcurrentRename')
end
end
def convert_to_bigint_column(column)
"#{column}_convert_to_bigint"
end
......
......@@ -152,10 +152,6 @@ module Gitlab
delete_job_tracking(class_name, status: delete_tracking_jobs) if delete_tracking_jobs
end
def perform_background_migration_inline?
Rails.env.test? || Rails.env.development?
end
def migrate_in(*args, coordinator: coordinator_for_tracking_database)
with_migration_context do
coordinator.perform_in(*args)
......
......@@ -1752,116 +1752,6 @@ RSpec.describe Gitlab::Database::MigrationHelpers do
end
end
describe '#change_column_type_using_background_migration' do
let!(:issue) { create(:issue, :closed, closed_at: Time.zone.now) }
let(:issue_model) do
Class.new(ActiveRecord::Base) do
self.table_name = 'issues'
include EachBatch
end
end
it 'changes the type of a column using a background migration' do
expect(model)
.to receive(:add_column)
.with('issues', 'closed_at_for_type_change', :datetime_with_timezone)
expect(model)
.to receive(:install_rename_triggers)
.with('issues', :closed_at, 'closed_at_for_type_change')
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
10.minutes,
'CopyColumn',
['issues', :closed_at, 'closed_at_for_type_change', issue.id, issue.id]
)
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
1.hour + 10.minutes,
'CleanupConcurrentTypeChange',
['issues', :closed_at, 'closed_at_for_type_change']
)
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CopyColumn')
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CleanupConcurrentTypeChange')
model.change_column_type_using_background_migration(
issue_model.all,
:closed_at,
:datetime_with_timezone
)
end
end
describe '#rename_column_using_background_migration' do
let!(:issue) { create(:issue, :closed, closed_at: Time.zone.now) }
it 'renames a column using a background migration' do
expect(model)
.to receive(:add_column)
.with(
'issues',
:closed_at_timestamp,
:datetime_with_timezone,
limit: anything,
precision: anything,
scale: anything
)
expect(model)
.to receive(:install_rename_triggers)
.with('issues', :closed_at, :closed_at_timestamp)
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
10.minutes,
'CopyColumn',
['issues', :closed_at, :closed_at_timestamp, issue.id, issue.id]
)
expect(BackgroundMigrationWorker)
.to receive(:perform_in)
.ordered
.with(
1.hour + 10.minutes,
'CleanupConcurrentRename',
['issues', :closed_at, :closed_at_timestamp]
)
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CopyColumn')
expect(Gitlab::BackgroundMigration)
.to receive(:steal)
.ordered
.with('CleanupConcurrentRename')
model.rename_column_using_background_migration(
'issues',
:closed_at,
:closed_at_timestamp
)
end
end
describe '#convert_to_bigint_column' do
it 'returns the name of the temporary column used to convert to bigint' do
expect(model.convert_to_bigint_column(:id)).to eq('id_convert_to_bigint')
......@@ -2065,8 +1955,6 @@ RSpec.describe Gitlab::Database::MigrationHelpers do
t.integer :other_id
t.timestamps
end
allow(model).to receive(:perform_background_migration_inline?).and_return(false)
end
context 'when the target table does not exist' do
......
......@@ -438,26 +438,6 @@ RSpec.describe Gitlab::Database::Migrations::BackgroundMigrationHelpers do
it_behaves_like 'helpers that enqueue background migrations', BackgroundMigrationWorker, 'main'
end
describe '#perform_background_migration_inline?' do
it 'returns true in a test environment' do
stub_rails_env('test')
expect(model.perform_background_migration_inline?).to eq(true)
end
it 'returns true in a development environment' do
stub_rails_env('development')
expect(model.perform_background_migration_inline?).to eq(true)
end
it 'returns false in a production environment' do
stub_rails_env('production')
expect(model.perform_background_migration_inline?).to eq(false)
end
end
describe '#delete_job_tracking' do
let!(:job_class_name) { 'TestJob' }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment