Commit f04094a4 authored by Yorick Peterse

Constrain environment deployments to project IDs

When querying the deployments of an environment the query Rails produces
will be along the lines of the following:

    SELECT *
    FROM deployments
    WHERE environment_id = X

For queries such as this (or queries that use this as their base and add
more conditions) there is no meaningful index that can be used as long
as deployments.project_id is not part of a WHERE clause.

To work around this we change the "has_many :deployments" relation to
always add a "WHERE project_id = X" condition. This means that queries
filtering deployments can make better use of the existing indexes. For
example, when filtering by deployments.iid this will result in the
following query:

    SELECT *
    FROM deployments
    WHERE environment_id = X
    AND project_id = Y
    AND iid = Z

This means PostgreSQL can use the existing index on
(project_id, environment_id, iid) instead of having to use a different
index (or none at all) and having to scan over a large amount of data.
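
As a rough illustration of the relation change described above, the
following plain-Ruby sketch mimics how a scope that receives the owning
environment adds a project_id condition to every deployments query. It
does not use ActiveRecord: `ToyEnvironment` and `deployments_sql` are
hypothetical names made up for this sketch, and the SQL string only
approximates what Rails generates.

```ruby
# Toy stand-in for the Environment model; NOT ActiveRecord.
ToyEnvironment = Struct.new(:id, :project_id)

# Hypothetical helper: builds the WHERE clause for an environment's
# deployments. The scope lambda receives the owning environment, much like
# the lambda passed to has_many, and returns extra conditions to merge in.
def deployments_sql(env, scope: nil, filters: {})
  conditions = { environment_id: env.id }
  conditions.merge!(scope.call(env)) if scope
  conditions.merge!(filters)

  where = conditions.map { |column, value| "#{column} = #{Integer(value)}" }
  "SELECT * FROM deployments WHERE #{where.join(' AND ')}"
end

env = ToyEnvironment.new(5, 1292351)
project_scope = ->(e) { { project_id: e.project_id } }

# Without the scope only environment_id and the filter are present.
puts deployments_sql(env, filters: { iid: 225 })
# With the scope the project_id condition is always added.
puts deployments_sql(env, scope: project_scope, filters: { iid: 225 })
```

Because the condition comes from the relation itself, callers that add
their own filters do not have to remember to pass the project ID.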

In terms of query plans, this means that instead of this query and plan:

    EXPLAIN (BUFFERS, ANALYZE)
    SELECT deployments.*
    FROM deployments
    WHERE deployments.environment_id = 5
    AND deployments.iid = 225;

    Index Scan using index_deployments_on_project_id_and_iid on deployments  (cost=0.42..14465.75 rows=1 width=117) (actual time=6.394..38.048 rows=1 loops=1)
      Index Cond: (iid = 225)
      Filter: (environment_id = 5)
      Rows Removed by Filter: 839
      Buffers: shared hit=4534
    Planning time: 0.076 ms
    Execution time: 38.073 ms

We produce the following query and plan:

    EXPLAIN (BUFFERS, ANALYZE)
    SELECT deployments.*
    FROM deployments
    WHERE deployments.environment_id = 5
    AND deployments.iid = 225
    AND deployments.project_id = 1292351;

    Index Scan using index_deployments_on_project_id_and_iid on deployments  (cost=0.42..4.45 rows=1 width=117) (actual time=0.018..0.018 rows=1 loops=1)
      Index Cond: ((project_id = 1292351) AND (iid = 225))
      Filter: (environment_id = 5)
      Buffers: shared hit=4
    Planning time: 0.088 ms
    Execution time: 0.039 ms
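
The difference between the two plans comes down to whether the leading
column of the index appears in the condition. The sketch below is a toy
model in plain Ruby, not how PostgreSQL implements its indexes: it treats
an index on (project_id, iid) as a sorted array of tuples and counts how
many entries each lookup inspects. The data and the helpers `build_index`,
`scan_by_iid`, and `seek_by_project_and_iid` are all made up for
illustration.

```ruby
# Toy model of a composite index on (project_id, iid): an array of
# [project_id, iid, row_id] tuples kept in sorted order.
def build_index(rows)
  rows.map { |r| [r[:project_id], r[:iid], r[:id]] }.sort
end

# Only iid is known: the sort order does not help, so every entry is
# inspected. Returns [matching row ids, entries inspected].
def scan_by_iid(index, iid)
  matches = index.select { |(_, entry_iid, _)| entry_iid == iid }
  [matches.map(&:last), index.size]
end

# project_id and iid are known: binary search finds the first candidate
# entry, then we walk forward while the (project_id, iid) prefix matches.
# Returns [matching row ids, entries inspected after the seek].
def seek_by_project_and_iid(index, project_id, iid)
  start = index.bsearch_index { |(p, i, _)| ([p, i] <=> [project_id, iid]) >= 0 }
  return [[], 0] if start.nil?

  ids = []
  inspected = 0
  index.drop(start).each do |(p, i, id)|
    inspected += 1
    break unless p == project_id && i == iid
    ids << id
  end
  [ids, inspected]
end

# Made-up data: 50 projects with 20 deployments each (iid 1..20).
rows = (1..50).flat_map do |project_id|
  (1..20).map { |iid| { id: (project_id - 1) * 20 + iid, project_id: project_id, iid: iid } }
end
index = build_index(rows)

_, scanned = scan_by_iid(index, 7)
_, sought = seek_by_project_and_iid(index, 3, 7)
puts "iid only: #{scanned} entries inspected"
puts "project_id + iid: #{sought} entries inspected"
```

In this toy data set the lookup without project_id inspects all 1000
entries, while the lookup with project_id inspects 2. The real numbers in
the plans above differ, but the direction of the effect is the same.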

On GitLab.com these changes result in a (roughly) 11x improvement in SQL
timings for the CI environment status endpoint.

Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/36877
parent 46bdcd64
@@ -6,7 +6,10 @@ class Environment < ActiveRecord::Base
   belongs_to :project, required: true, validate: true
-  has_many :deployments, dependent: :destroy # rubocop:disable Cop/ActiveRecordDependent
+  has_many :deployments,
+    -> (env) { where(project_id: env.project_id) },
+    dependent: :destroy # rubocop:disable Cop/ActiveRecordDependent
   has_one :last_deployment, -> { order('deployments.id DESC') }, class_name: 'Deployment'
   before_validation :nullify_external_url
---
title: Constrain environment deployments to project IDs
merge_request:
author:
type: other