product/CMFActivity/Activity/SQLBase.py · a42da4def6754ad5361a00a356d064d98dac4238 · Ayush Tiwari / erp5

CMFActivity: Do not use offset for scanning messages to validate. · a42da4de

Vincent Pelletier authored Oct 12, 2018

This was inefficient for two reasons:
- any message we could validate during the current iteration means a message
  we did not consider is now in the range we just scanned. It will not be
  considered until the validation node starts over and scans this same range
  again.
- the "LIMIT x,1000" pattern on >1000 messages causes a quickly growing number
  of extra rows scanned by the SQL database just to skip the first "x" rows:
  at 2000 rows present it must scan 1000 + 2000 = 3000 rows for a complete
  loop over all pending activities. At 3k rows it must scan 6k rows.
  At 4k, 10k.
  While this is an overestimation (some rows should be possible to
  validate, so these would be scanned once only), this overhead grows so
  large that this overestimation can become negligible.
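
The row counts above can be reproduced with a short sketch. The function name
is hypothetical and is not part of SQLBase.py; it assumes the worst case where
no message gets validated, so the table does not shrink between queries:

```python
def rows_scanned_with_offset(total_rows, batch_size=1000):
    """Rows the database reads for one full pass using "LIMIT offset,batch_size".

    Assumes no message is validated during the pass (worst case).
    """
    scanned = 0
    for offset in range(0, total_rows, batch_size):
        # The database reads and discards `offset` rows, then reads the batch.
        scanned += min(offset + batch_size, total_rows)
    return scanned


for total in (2000, 3000, 4000):
    print(total, rows_scanned_with_offset(total))
# 2000 3000
# 3000 6000
# 4000 10000
```

The total grows roughly with the square of the number of pending messages,
which is why the overhead quickly dominates.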

Instead, use a range condition consistent with the query's "ORDER BY", which
is already efficiently materialised by an index: the SQL database just has to
dive into the existing index to start just above the last message from the
previous iteration, and resume scanning from there, solving both issues
listed above.
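
A minimal sketch of the two scanning strategies, for illustration only. It is
not the code from SQLBase.py: the "message" table, its "priority" and "uid"
columns, and all function names are assumptions, and an in-memory SQLite
database stands in for the real one. Each batch "validates" (deletes) its
first message, to show how the offset scan then skips rows while the range
scan does not:

```python
import sqlite3


def make_db(count):
    """Create a hypothetical message table with an index matching the sort order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE message (uid INTEGER PRIMARY KEY, priority INTEGER)")
    conn.execute("CREATE INDEX message_order ON message (priority, uid)")
    conn.executemany(
        "INSERT INTO message (uid, priority) VALUES (?, ?)",
        [(uid, uid % 3) for uid in range(1, count + 1)],
    )
    return conn


def validate_first(conn, rows):
    """Simulate validation: the first message of the batch leaves the table."""
    conn.execute("DELETE FROM message WHERE uid = ?", (rows[0][1],))


def scan_by_offset(conn, batch_size, on_batch):
    """Old approach: skip `offset` rows on every query."""
    visited, offset = [], 0
    while True:
        rows = conn.execute(
            "SELECT priority, uid FROM message "
            "ORDER BY priority, uid LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            return visited
        visited.extend(rows)
        offset += batch_size
        on_batch(conn, rows)


def scan_by_range(conn, batch_size, on_batch):
    """New approach: resume just above the last (priority, uid) already seen."""
    visited, last = [], None
    while True:
        if last is None:
            rows = conn.execute(
                "SELECT priority, uid FROM message "
                "ORDER BY priority, uid LIMIT ?",
                (batch_size,),
            ).fetchall()
        else:
            # Range condition consistent with ORDER BY priority, uid.
            rows = conn.execute(
                "SELECT priority, uid FROM message "
                "WHERE priority > ? OR (priority = ? AND uid > ?) "
                "ORDER BY priority, uid LIMIT ?",
                (last[0], last[0], last[1], batch_size),
            ).fetchall()
        if not rows:
            return visited
        visited.extend(rows)
        last = rows[-1]
        on_batch(conn, rows)


print(len(scan_by_offset(make_db(10), 4, validate_first)))  # fewer than 10
print(len(scan_by_range(make_db(10), 4, validate_first)))   # all 10 messages
```

Because the range condition follows the same column order as the index, the
database can seek directly to the resume point instead of reading and
discarding the skipped rows.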


SQLBase.py 33.3 KB
