    Faster way of obtaining latest event update time · 054f2f98
    Yorick Peterse authored
    Instead of using MAX(events.updated_at) we can simply sort the events in
    descending order by the "id" column and grab the first row. In other
    words, instead of this:
    
        SELECT max(events.updated_at) AS max_id
        FROM events
        LEFT OUTER JOIN projects   ON projects.id   = events.project_id
        LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id
        WHERE events.author_id IS NOT NULL
        AND events.project_id IN (13083);
    
    we can use this:
    
        SELECT events.updated_at AS max_id
        FROM events
        LEFT OUTER JOIN projects   ON projects.id   = events.project_id
        LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id
        WHERE events.author_id IS NOT NULL
        AND events.project_id IN (13083)
        ORDER BY events.id DESC
        LIMIT 1;
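
The two queries only agree when the matching event with the highest "id" is also the most recently updated one. The sketch below illustrates that assumption on a few in-memory rows; the `Event` struct, the sample data and both helper methods are hypothetical stand-ins, not code from the application.

```ruby
# Hypothetical in-memory stand-in for rows of the "events" table.
Event = Struct.new(:id, :project_id, :author_id, :updated_at)

EVENTS = [
  Event.new(1, 13083, 7,   Time.utc(2016, 1, 1, 10, 0, 0)),
  Event.new(2, 42,    7,   Time.utc(2016, 1, 1, 11, 0, 0)),
  Event.new(3, 13083, nil, Time.utc(2016, 1, 1, 12, 0, 0)),
  Event.new(4, 13083, 9,   Time.utc(2016, 1, 1, 13, 0, 0))
].freeze

# Mirrors the WHERE clause: author_id IS NOT NULL AND project_id = ?
def event_matches?(event, project_id)
  event.project_id == project_id && !event.author_id.nil?
end

# First query: look at every matching row and take MAX(updated_at).
def latest_by_max(events, project_id)
  events.select { |e| event_matches?(e, project_id) }.map(&:updated_at).max
end

# Second query: walk the rows by descending id and stop at the first
# match (ORDER BY id DESC LIMIT 1).
def latest_by_id(events, project_id)
  events.sort_by { |e| -e.id }
        .find { |e| event_matches?(e, project_id) }
        &.updated_at
end

puts latest_by_max(EVENTS, 13083) == latest_by_id(EVENTS, 13083)
```

If an older event could be updated after a newer one was created, the two helpers (and the two queries) could return different timestamps.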
    
    This has the benefit that on PostgreSQL a backwards index scan can be
    used, which due to the "LIMIT 1" stops as soon as the first matching row
    is found instead of reading every event of the project.
    This in turn greatly speeds up the process of grabbing the latest update
    time. This can be confirmed by looking at the query plans. The first
    query produces the following plan:
    
        Aggregate  (cost=43779.84..43779.85 rows=1 width=12) (actual time=2142.462..2142.462 rows=1 loops=1)
          ->  Index Scan using index_events_on_project_id on events  (cost=0.43..43704.69 rows=30060 width=12) (actual time=0.033..2138.086 rows=32769 loops=1)
                Index Cond: (project_id = 13083)
                Filter: (author_id IS NOT NULL)
        Planning time: 1.248 ms
        Execution time: 2142.548 ms
    
    The second query in turn produces the following plan:
    
        Limit  (cost=0.43..41.65 rows=1 width=16) (actual time=1.394..1.394 rows=1 loops=1)
          ->  Index Scan Backward using events_pkey on events  (cost=0.43..1238907.96 rows=30060 width=16) (actual time=1.394..1.394 rows=1 loops=1)
                Filter: ((author_id IS NOT NULL) AND (project_id = 13083))
                Rows Removed by Filter: 2104
        Planning time: 0.166 ms
        Execution time: 1.408 ms
    
    According to the above plans the 2nd query is around 1500 times faster.
    However, re-running the first query produces timings of around 80 ms,
    making the 2nd query "only" around 55 times faster.
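
The speed-up figures follow directly from the execution times in the plans. A small sketch of the arithmetic, where the `speedup` helper is a hypothetical name and 80 ms is the approximate re-run timing quoted above:

```ruby
# Ratio between a slow and a fast execution time, both in milliseconds.
def speedup(slow_ms, fast_ms)
  slow_ms / fast_ms.to_f
end

first_run = speedup(2142.548, 1.408) # first query, initial run
re_run    = speedup(80.0, 1.408)     # first query, approximate re-run timing

puts format('first run: %.0fx, re-run: %.0fx', first_run, re_run)
```

With these inputs the ratios come out at roughly 1520 and 57, in line with the rounded figures given in the text.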
event_spec.rb 2.17 KB