Remove recursiveImmediateReindexObject, simplify recursiveReindexObject
@arnau @jerome @jm @romain @seb @tb (please invite more if applicable)
recursiveImmediateReindexObject
recursiveImmediateReindexObject
is a generic method (implemented on Base
and Folder
) which, by contract, walks its entire subtree withing one transaction.
These 3 facts together (generic, recursive and immediate) make this method a risk of memory exhaustion as it can be called on unexpected documents, in turn threatening availability.
Genericity and recusion can hardly be avoided: we want to have a common way of reindexing sets of documents based on a common ancestor (ex: reindexing an entire module). So it is immediateness must go away: we should not need reindexations or arbitrarily-large trees to happen in a single transaction. So this patch set removed recursiveImmediateReindexObject
method, and all uses shouldgo through recursiveReindexObject
instead.
There is one case though where recursion and immediateness are needed: the (normally rare) cases where immediate indexation of a newly-created document is desired. Such newly-created document could spawn a (limited) number of subdocuments, so I made such method purpose-specific: reindexOnCreation
. Until a few days ago, I thought this would not be needed, but it seem it is. So we still have a dangerous-ish method, only now with a more specific name - so misuses would be harder to justify. Ideas on how to make misuses even harder are very welcome.
recursiveReindexObject
recursiveReindexObject
is a generic method (declared on Base
and Folder
again) which calls recursiveImmediateReindexObject
in some cases (based on the number of immediate subdocuments). As described above, recursiveImmediateReindexObject
is now gone, so another solution is needed.
So I change recursiveReindexObject
so that it calls immediateReindexObject
instead (immediate, generic but non-recursive), and to do that I fixed the recursion limits of recursiveReindexObject
by making it call the existing recurseCallMethod
(_recurseCallMethod
actually, the unrestricted variant). This way, we have precise size-limiting capabilities, independently from tree depth.
In turn, this means recursiveReindexObject
can now be used on any document: cells, lines, top-level documents but also modules. So as a bonus, we can get rid of more complexity by dropping Folder_reindex{All,Objectlist,TreeObjectList}
and simplifying ERP5Site_reindexAll
.
Other commits
Most of the commits are preparatory work to reach the above. These are touching other important ERP5 components:
-
ActivityTool
becomes itself indexable while only its content was previously indexable. This was inconsistent (it meansparent_uid
cannot be used on immediateportal_activities
subdocuments) and in turn hurt catalog reproductibility. -
CategoryTool
becomes aBaseTool
subclass so it inherits fromERP5Type/Core/Folder
so it can be reindex the same way as everything else -
ActivityTool
becomes aBaseTool
subclass just to simplify inheritance) so that they behave consistently with other tools
A big discovery while working on this patch set was that our indexation tests (intended to check that a site is properly indexed right after its creation) were utterly broken: the test would check that after a full-site reindexation the catalog was a superset of its initial state. Why ? Because initial state was already tainted by previous tests in the same test class... Which basically cleared the catalog, indexing only a handful of document (~10) out of the thousand expected. Fixing this test is also what these preparatory commits are for: I needed a reliable way of testing catalog completeness.
One change I'm especially not sure about is ERP5Site: Do reindex the whole site after creation.
. The rationale is only that for whatever reason(s) we make site object non-indexable during bootstrap. As a result, many indexation activities are not even spawned. I believe there should be no fundamental reason to make it non-indexable, "only" bootstrap issues. Which I did not investigate, falling back to such easy change. Insights on bootstrap considerations is very welcome, as I would prefer "natural" on-installation indexations to be relied upon rather than such "reindex-all-after" hack (which , modulo the fact we do not clear catalog,(EDIT: we do clear catalog in both codes) reduces the relevance of catalog reproducibility tests).
Desired review process
I would like this merge request, if it does not achieve unanimous approval (which seems unlikely), to happen in progressive steps: all commits up to CategoryTool: Become a BaseTool subclass.
could conceivable go in master while the rest is still discussed, and in about any order. So please ACK/NACK individual commits, to help with this approach. I will then reorder commits to push ACK'ed ones so they get out of the way ASAP.