Commit 10ceb33b authored by Sean McGivern's avatar Sean McGivern

Extend CTE search optimisation to projects

When we use the `search` param on an `IssuableFinder`, we can run into
issues. We have trigram indexes to support these searches. On
GitLab.com, we often see Postgres's optimiser prioritise the (global)
trigram indexes over the index on `project_id`. For group and project
searches, we know that it will be quicker to filter by `project_id`
first, as it returns fewer rows in most cases.

For group issues search, we ran into this issue previously, and went
through the following iterations:

1. Use a CTE on the project IDs as an optimisation fence. This prevents
   the planner from disregarding the index on `project_id`.
   Unfortunately it breaks some types of sorting, like priority and
   popularity, as they sort on a joined table.
2. Use a subquery for listing issues, and a CTE for counts. The subquery
   - in the case of group lists - didn't help as much as the CTE, but
   was faster than not including it. We can safely use a CTE for counts
   as they don't have sorting.

Now, however, we're seeing the same issue in a project context. The
subquery doesn't help at all there (it would only return one row, after
all). In an attempt to keep total code complexity under control, this
commit removes the subquery optimisation and applies the CTE
optimisation only for sorts we know that are safe.

This means that for more complicated sorts (like priority and
popularity), the search will continue to be very slow. If this is a
high-priority issue, we can consider introducing further optimisations,
but this finder is already very complicated and additional complexity
has a cost.

The group CTE optimisation is controlled by the same feature flag as
before, `attempt_group_search_optimizations`, which is enabled by
default. The new project CTE optimisation is controlled by a new feature
flag, `attempt_project_search_optimizations`, which is disabled by
default.
parent f87b7fe3
...@@ -100,6 +100,7 @@ module IssuableCollections ...@@ -100,6 +100,7 @@ module IssuableCollections
if @project if @project
options[:project_id] = @project.id options[:project_id] = @project.id
options[:attempt_project_search_optimizations] = true
elsif @group elsif @group
options[:group_id] = @group.id options[:group_id] = @group.id
options[:include_subgroups] = true options[:include_subgroups] = true
......
...@@ -84,7 +84,7 @@ class IssuableFinder ...@@ -84,7 +84,7 @@ class IssuableFinder
# https://www.postgresql.org/docs/current/static/queries-with.html # https://www.postgresql.org/docs/current/static/queries-with.html
items = by_search(items) items = by_search(items)
items = sort(items) unless use_cte_for_count? items = sort(items)
items items
end end
...@@ -92,7 +92,6 @@ class IssuableFinder ...@@ -92,7 +92,6 @@ class IssuableFinder
def filter_items(items) def filter_items(items)
items = by_project(items) items = by_project(items)
items = by_group(items) items = by_group(items)
items = by_subquery(items)
items = by_scope(items) items = by_scope(items)
items = by_created_at(items) items = by_created_at(items)
items = by_updated_at(items) items = by_updated_at(items)
...@@ -132,10 +131,12 @@ class IssuableFinder ...@@ -132,10 +131,12 @@ class IssuableFinder
# This does not apply when we are using a CTE for the search, as the labels # This does not apply when we are using a CTE for the search, as the labels
# GROUP BY is inside the subquery in that case, so we set labels_count to 1. # GROUP BY is inside the subquery in that case, so we set labels_count to 1.
# #
# We always use CTE when searching in Groups if the feature flag is enabled, # Groups and projects have separate feature flags to suggest the use
# but never when searching in Projects. # of a CTE. The CTE will not be used if the sort doesn't support it,
# but will always be used for the counts here as we ignore sorting
# anyway.
labels_count = label_names.any? ? label_names.count : 1 labels_count = label_names.any? ? label_names.count : 1
labels_count = 1 if use_cte_for_count? labels_count = 1 if use_cte_for_search?
finder.execute.reorder(nil).group(:state).count.each do |key, value| finder.execute.reorder(nil).group(:state).count.each do |key, value|
counts[count_key(key)] += value / labels_count counts[count_key(key)] += value / labels_count
...@@ -309,15 +310,14 @@ class IssuableFinder ...@@ -309,15 +310,14 @@ class IssuableFinder
end end
# rubocop: enable CodeReuse/ActiveRecord # rubocop: enable CodeReuse/ActiveRecord
def use_subquery_for_search? def use_cte_for_search?
strong_memoize(:use_subquery_for_search) do strong_memoize(:use_cte_for_search) do
!force_cte? && attempt_group_search_optimizations? next false unless search
end next false unless Gitlab::Database.postgresql?
end # Only simple unsorted & simple sorts can use CTE
next false if params[:sort].present? && !params[:sort].in?(klass.simple_sorts.keys)
def use_cte_for_count? attempt_group_search_optimizations? || attempt_project_search_optimizations?
strong_memoize(:use_cte_for_count) do
force_cte? && attempt_group_search_optimizations?
end end
end end
...@@ -332,12 +332,15 @@ class IssuableFinder ...@@ -332,12 +332,15 @@ class IssuableFinder
end end
def attempt_group_search_optimizations? def attempt_group_search_optimizations?
search &&
Gitlab::Database.postgresql? &&
params[:attempt_group_search_optimizations] && params[:attempt_group_search_optimizations] &&
Feature.enabled?(:attempt_group_search_optimizations, default_enabled: true) Feature.enabled?(:attempt_group_search_optimizations, default_enabled: true)
end end
def attempt_project_search_optimizations?
params[:attempt_project_search_optimizations] &&
Feature.enabled?(:attempt_project_search_optimizations)
end
def count_key(value) def count_key(value)
Array(value).last.to_sym Array(value).last.to_sym
end end
...@@ -408,20 +411,11 @@ class IssuableFinder ...@@ -408,20 +411,11 @@ class IssuableFinder
end end
# rubocop: enable CodeReuse/ActiveRecord # rubocop: enable CodeReuse/ActiveRecord
# Wrap projects and groups in a subquery if the conditions are met.
def by_subquery(items)
if use_subquery_for_search?
klass.where(id: items.select(:id)) # rubocop: disable CodeReuse/ActiveRecord
else
items
end
end
# rubocop: disable CodeReuse/ActiveRecord # rubocop: disable CodeReuse/ActiveRecord
def by_search(items) def by_search(items)
return items unless search return items unless search
if use_cte_for_count? if use_cte_for_search?
cte = Gitlab::SQL::RecursiveCTE.new(klass.table_name) cte = Gitlab::SQL::RecursiveCTE.new(klass.table_name)
cte << items cte << items
......
...@@ -172,6 +172,10 @@ module Issuable ...@@ -172,6 +172,10 @@ module Issuable
fuzzy_search(query, matched_columns) fuzzy_search(query, matched_columns)
end end
def simple_sorts
super.except('name_asc', 'name_desc')
end
def sort_by_attribute(method, excluded_labels: []) def sort_by_attribute(method, excluded_labels: [])
sorted = sorted =
case method.to_s case method.to_s
......
...@@ -21,19 +21,21 @@ module Sortable ...@@ -21,19 +21,21 @@ module Sortable
class_methods do class_methods do
def order_by(method) def order_by(method)
case method.to_s simple_sorts.fetch(method.to_s, -> { all }).call
when 'created_asc' then order_created_asc
when 'created_date' then order_created_desc
when 'created_desc' then order_created_desc
when 'id_asc' then order_id_asc
when 'id_desc' then order_id_desc
when 'name_asc' then order_name_asc
when 'name_desc' then order_name_desc
when 'updated_asc' then order_updated_asc
when 'updated_desc' then order_updated_desc
else
all
end end
def simple_sorts
{
'created_asc' => -> { order_created_asc },
'created_date' => -> { order_created_desc },
'created_desc' => -> { order_created_desc },
'id_asc' => -> { order_id_asc },
'id_desc' => -> { order_id_desc },
'name_asc' => -> { order_name_asc },
'name_desc' => -> { order_name_desc },
'updated_asc' => -> { order_updated_asc },
'updated_desc' => -> { order_updated_desc }
}
end end
private private
......
---
title: Speed up filtering issues in a project when searching
merge_request:
author:
type: performance
...@@ -719,7 +719,7 @@ describe IssuesFinder do ...@@ -719,7 +719,7 @@ describe IssuesFinder do
end end
end end
describe '#use_subquery_for_search?' do describe '#use_cte_for_search?' do
let(:finder) { described_class.new(nil, params) } let(:finder) { described_class.new(nil, params) }
before do before do
...@@ -731,7 +731,7 @@ describe IssuesFinder do ...@@ -731,7 +731,7 @@ describe IssuesFinder do
let(:params) { { attempt_group_search_optimizations: true } } let(:params) { { attempt_group_search_optimizations: true } }
it 'returns false' do it 'returns false' do
expect(finder.use_subquery_for_search?).to be_falsey expect(finder.use_cte_for_search?).to be_falsey
end end
end end
...@@ -743,15 +743,15 @@ describe IssuesFinder do ...@@ -743,15 +743,15 @@ describe IssuesFinder do
end end
it 'returns false' do it 'returns false' do
expect(finder.use_subquery_for_search?).to be_falsey expect(finder.use_cte_for_search?).to be_falsey
end end
end end
context 'when the attempt_group_search_optimizations param is falsey' do context 'when the force_cte param is falsey' do
let(:params) { { search: 'foo' } } let(:params) { { search: 'foo' } }
it 'returns false' do it 'returns false' do
expect(finder.use_subquery_for_search?).to be_falsey expect(finder.use_cte_for_search?).to be_falsey
end end
end end
...@@ -763,80 +763,39 @@ describe IssuesFinder do ...@@ -763,80 +763,39 @@ describe IssuesFinder do
end end
it 'returns false' do it 'returns false' do
expect(finder.use_subquery_for_search?).to be_falsey expect(finder.use_cte_for_search?).to be_falsey
end end
end end
context 'when force_cte? is true' do context 'when attempt_group_search_optimizations is unset and attempt_project_search_optimizations is set' do
let(:params) { { search: 'foo', attempt_group_search_optimizations: true, force_cte: true } } let(:params) { { search: 'foo', attempt_project_search_optimizations: true } }
it 'returns false' do
expect(finder.use_subquery_for_search?).to be_falsey
end
end
context 'when all conditions are met' do
let(:params) { { search: 'foo', attempt_group_search_optimizations: true } }
it 'returns true' do
expect(finder.use_subquery_for_search?).to be_truthy
end
end
end
describe '#use_cte_for_count?' do
let(:finder) { described_class.new(nil, params) }
context 'and the corresponding feature flag is disabled' do
before do before do
allow(Gitlab::Database).to receive(:postgresql?).and_return(true) stub_feature_flags(attempt_project_search_optimizations: false)
stub_feature_flags(attempt_group_search_optimizations: true)
end end
context 'when there is no search param' do
let(:params) { { attempt_group_search_optimizations: true, force_cte: true } }
it 'returns false' do it 'returns false' do
expect(finder.use_cte_for_count?).to be_falsey expect(finder.use_cte_for_search?).to be_falsey
end end
end end
context 'when the database is not Postgres' do context 'and the corresponding feature flag is enabled' do
let(:params) { { search: 'foo', force_cte: true, attempt_group_search_optimizations: true } }
before do before do
allow(Gitlab::Database).to receive(:postgresql?).and_return(false) stub_feature_flags(attempt_project_search_optimizations: true)
end
it 'returns false' do
expect(finder.use_cte_for_count?).to be_falsey
end
end
context 'when the force_cte param is falsey' do
let(:params) { { search: 'foo' } }
it 'returns false' do
expect(finder.use_cte_for_count?).to be_falsey
end
end end
context 'when the attempt_group_search_optimizations flag is disabled' do it 'returns true' do
let(:params) { { search: 'foo', force_cte: true, attempt_group_search_optimizations: true } } expect(finder.use_cte_for_search?).to be_truthy
before do
stub_feature_flags(attempt_group_search_optimizations: false)
end end
it 'returns false' do
expect(finder.use_cte_for_count?).to be_falsey
end end
end end
context 'when all conditions are met' do context 'when all conditions are met' do
let(:params) { { search: 'foo', force_cte: true, attempt_group_search_optimizations: true } } let(:params) { { search: 'foo', attempt_group_search_optimizations: true } }
it 'returns true' do it 'returns true' do
expect(finder.use_cte_for_count?).to be_truthy expect(finder.use_cte_for_search?).to be_truthy
end end
end end
end end
......
shared_examples 'issuables list meta-data' do |issuable_type, action = nil| shared_examples 'issuables list meta-data' do |issuable_type, action = nil|
include ProjectForksHelper include ProjectForksHelper
def get_action(action, project) def get_action(action, project, extra_params = {})
if action if action
get action, params: { author_id: project.creator.id } get action, params: { author_id: project.creator.id }.merge(extra_params)
else else
get :index, params: { namespace_id: project.namespace, project_id: project } get :index, params: { namespace_id: project.namespace, project_id: project }.merge(extra_params)
end end
end end
...@@ -17,23 +17,44 @@ shared_examples 'issuables list meta-data' do |issuable_type, action = nil| ...@@ -17,23 +17,44 @@ shared_examples 'issuables list meta-data' do |issuable_type, action = nil|
end end
end end
before do let!(:issuables) do
@issuable_ids = %w[fix improve/awesome].map do |source_branch| %w[fix improve/awesome].map do |source_branch|
create_issuable(issuable_type, project, source_branch: source_branch).id create_issuable(issuable_type, project, source_branch: source_branch)
end end
end end
let(:issuable_ids) { issuables.map(&:id) }
it "creates indexed meta-data object for issuable notes and votes count" do it "creates indexed meta-data object for issuable notes and votes count" do
get_action(action, project) get_action(action, project)
meta_data = assigns(:issuable_meta_data) meta_data = assigns(:issuable_meta_data)
aggregate_failures do aggregate_failures do
expect(meta_data.keys).to match_array(@issuable_ids) expect(meta_data.keys).to match_array(issuables.map(&:id))
expect(meta_data.values).to all(be_kind_of(Issuable::IssuableMeta)) expect(meta_data.values).to all(be_kind_of(Issuable::IssuableMeta))
end end
end end
context 'searching' do
let(:result_issuable) { issuables.first }
let(:search) { result_issuable.title }
before do
stub_feature_flags(attempt_project_search_optimizations: true)
end
# .simple_sorts is the same across all Sortable classes
sorts = ::Issue.simple_sorts.keys + %w[popularity priority label_priority]
sorts.each do |sort|
it "works when sorting by #{sort}" do
get_action(action, project, search: search, sort: sort)
expect(assigns(:issuable_meta_data).keys).to include(result_issuable.id)
end
end
end
it "avoids N+1 queries" do it "avoids N+1 queries" do
control = ActiveRecord::QueryRecorder.new { get_action(action, project) } control = ActiveRecord::QueryRecorder.new { get_action(action, project) }
issuable = create_issuable(issuable_type, project, source_branch: 'csv') issuable = create_issuable(issuable_type, project, source_branch: 'csv')
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment