Products.CMFActivity: Fix poor performance with many family-bound activities
When many simultaneously-pending activities are attached to any processing node family, the node>=0 subquery becomes dominant (taking hundreds of times longer than the other subqueries). As a consequence, this starves processing nodes of activities and increases the CPU needs of the mariadb process hosting the activity tables.

So, move this subquery out of the regular codepath, and only run it if no other subquery found any activity:
- there is no activity preferentially targeting the current node
- there is no activity bound to any of the current node's families
- there is no activity without any node preference at all

Also, simplify the content of that subquery: the effective priority can only be 3 * priority + 1 when this query is run, and node=0 rows can be excluded (they should not exist in the current database view).

Also, factorise the logic producing "node=processing_node" and "node IN node_set" subqueries, for simplicity. In turn, this makes all family-dependent subqueries use a simple equality test, ensuring a stable query plan independently of the number of families the current node is a member of.

Also, always use "UNION ALL", as now:
- all subqueries have strictly distinct result sets
- as per mariadb documentation, "UNION [DISTINCT] applies to all UNIONs on the left", so the original comment about where ALL is used was incorrect in assuming it was improving the effective query performance

Also, line-split SQL queries as visible in the python source to be more readable, without effect on the produced SQL.

Also, line-split a few non-trivial python expressions to make their internal structure immediately apparent.

Another effect of this change is to reduce activity theft (activities to be preferentially executed by one node being executed by another), potentially improving object cache hit-rate and hence decreasing I/O pressure on the ZODB.
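
The two-stage lookup described above can be sketched as follows. This is a minimal illustration, not the actual CMFActivity code: the `message` table layout, the `pick_activity` helper, the use of sqlite3 instead of mariadb, and the representation of families as negative node values are all assumptions made for this example. It also leaves out the effective-priority ranking between subqueries.

```python
import sqlite3


def pick_activity(conn, processing_node, family_ids):
    """Return the uid of the next activity to run, or None (hypothetical helper)."""
    # Regular codepath: one subquery per node value, each a plain equality
    # test, covering the current node, each of its families, and node=0
    # (no node preference). The node values are distinct, so the result
    # sets cannot overlap and UNION ALL is safe.
    node_values = [processing_node] + list(family_ids) + [0]
    subqueries = " UNION ALL ".join(
        "SELECT uid, priority, date FROM message WHERE node = ?"
        for _ in node_values
    )
    row = conn.execute(
        "SELECT uid FROM (" + subqueries + ") "
        "ORDER BY priority, date, uid LIMIT 1",
        node_values,
    ).fetchone()
    if row is not None:
        return row[0]
    # Fallback, run only when the regular codepath found nothing: take an
    # activity that prefers another node. node=0 rows are excluded because
    # the query above would already have returned them.
    row = conn.execute(
        "SELECT uid FROM message WHERE node > 0 "
        "ORDER BY priority, date, uid LIMIT 1"
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message ("
        "uid INTEGER PRIMARY KEY, node INTEGER, priority INTEGER, date INTEGER)"
    )
    conn.executemany(
        "INSERT INTO message VALUES (?, ?, ?, ?)",
        [(1, 2, 1, 10), (2, 0, 5, 20)],
    )
    # Node 1 takes the unbound activity rather than the one preferring node 2.
    print(pick_activity(conn, 1, [-1]))
```

In this sketch the expensive "any other node" query is only issued when the current node would otherwise stay idle, which is also why an activity preferring another node is never picked while a node=0 activity is pending.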