Commit b41dc8e0 authored by Vincent Pelletier

SQLCatalog_deferFullTextIndex{,Activity}: Use serialization_tag.

Just like regular indexations, fulltext indexations are subject to last-commit-wins.
This means that it is possible to reach a state where the fulltext table is
persistently desynchronised from ZODB:
- start fulltext indexation activity on many documents (typically: 100)
- modify one of the documents being indexed
- start fulltext indexation activity caused by this edit, and assume indexation only
happens for this object
- commit the single-object indexation (because it is very fast to retrieve fulltext
data from just one document)
- commit the many-objects indexation later (because it is much slower to
retrieve 100 fulltext representations)
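
The sequence above can be sketched as a small simulation. This is only an illustration: `zodb`, `fulltext_table`, `read_fulltext` and `commit` are hypothetical in-memory stand-ins, not ERP5 or ZODB APIs, and the document paths are made up.

```python
# Hypothetical in-memory stand-ins for ZODB and the fulltext table.
zodb = {"doc/%d" % i: "v1" for i in range(100)}
fulltext_table = {}


def read_fulltext(path_list):
    """Snapshot the text of each document, as an indexation does when it starts."""
    return {path: zodb[path] for path in path_list}


def commit(snapshot):
    """Write a snapshot to the fulltext table; the last writer wins."""
    fulltext_table.update(snapshot)


# 1. The many-documents indexation starts and reads its data.
batch_snapshot = read_fulltext(sorted(zodb))
# 2. One of the documents being indexed is modified.
zodb["doc/42"] = "v2"
# 3. The single-document indexation caused by this edit starts.
single_snapshot = read_fulltext(["doc/42"])
# 4. It commits first, because reading one document is fast.
commit(single_snapshot)
# 5. The slow batch commits later and overwrites the fresh row with stale data.
commit(batch_snapshot)

is_desynchronised = fulltext_table["doc/42"] != zodb["doc/42"]
print(is_desynchronised)
```

Nothing re-indexes the edited document afterwards, which is why the stale row persists until the document is modified again.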

As a consequence, code must spawn one fulltext indexation activity per
document, each with the appropriate serialisation tag. This tag must not
conflict with the one used by regular indexation, so use a fixed prefix.
Since this means spawning one activity per document, use a grouping
method to still index in batches and amortise transaction overhead.
Keep the same method_id as before for backward-compatibility (dependencies
on this value may exist, even though relying on it is bad practice).
Rewrite SQLCatalog_deferFullTextIndexActivity so it works as a grouping
method, simplifying it in the process:
- build parameter_dict with all entries, as we already know all needed keys
- None is not callable, so a single callable() test also covers the "is not None" check
- remove whitespace at end of line
- use GroupedMessage API
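
A minimal sketch of the grouping-method shape described above, runnable outside ERP5. `FakeGroupedMessage` is a hypothetical stand-in that only mimics the `object`, `result` and `raised()` members used in this commit; `index_group`, `Doc` and `Broken` are likewise invented for illustration, and the fallback to `get<Property>` accessors is left out for brevity.

```python
class FakeGroupedMessage(object):
    """Hypothetical stand-in for one entry handed to a grouping method."""

    def __init__(self, obj):
        self.object = obj
        self.result = "unset"
        self.failed = False

    def raised(self):
        # Assumption: the real API records the current exception;
        # this stand-in only flags the failure.
        self.failed = True


def index_group(object_list, property_list, method):
    """Collect one value per property and per document, then index in one call."""
    # All keys are known up front, so every list can be created immediately.
    parameter_dict = {name: [] for name in property_list}
    for group_object in object_list:
        obj = group_object.object
        try:
            tmp_dict = {name: getattr(obj, name)() for name in property_list}
        except Exception:
            # Only this document is reported as failed; the rest of the batch goes on.
            group_object.raised()
        else:
            for name, value in tmp_dict.items():
                parameter_dict[name].append(value)
            group_object.result = None
    if parameter_dict:
        return method(**parameter_dict)


class Doc(object):
    def __init__(self, uid, text):
        self._uid, self._text = uid, text

    def uid(self):
        return self._uid

    def SearchableText(self):
        return self._text


class Broken(object):
    def uid(self):
        raise ValueError("boom")

    def SearchableText(self):
        return ""


messages = [FakeGroupedMessage(Doc(1, "a")),
            FakeGroupedMessage(Broken()),
            FakeGroupedMessage(Doc(2, "b"))]
# The lambda stands in for the catalog method and just echoes its arguments.
indexed = index_group(messages, ["uid", "SearchableText"], lambda **kw: kw)
```

Values are buffered in `tmp_dict` first so that a document failing halfway through its properties contributes nothing to the batch, keeping all lists in `parameter_dict` the same length.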
parent 43d16f61
# This script is called to defer fulltext indexing in a lower priority.
context.activate(activity='SQLQueue', priority=4, group_method_id=None).SQLCatalog_deferFullTextIndexActivity(path_list=list(getPath))
GROUP_METHOD_ID = context.getPath() + '/SQLCatalog_deferFullTextIndexActivity'
for document_value, root_document_path in zip(getObject, getRootDocumentPath):
document_value.activate(
activity='SQLQueue',
priority=4,
group_method_id=GROUP_METHOD_ID,
serialization_tag='full_text_' + root_document_path,
).SQLCatalog_deferFullTextIndexActivity()
......@@ -50,7 +50,13 @@
</item>
<item>
<key> <string>_params</string> </key>
<value> <string>getPath</string> </value>
<value> <string>getObject, getRootDocumentPath</string> </value>
</item>
<item>
<key> <string>description</string> </key>
<value>
<none/>
</value>
</item>
<item>
<key> <string>id</string> </key>
......
......@@ -4,42 +4,28 @@ from zExceptions import Unauthorized
method = context.z_catalog_fulltext_list
property_list = method.arguments_src.split()
parameter_dict = {}
failed_path_list = []
restrictedTraverse = context.getPortalObject().restrictedTraverse
for path in path_list:
if not path: # should happen in tricky testERP5Catalog tests only
continue
obj = restrictedTraverse(path, None)
if obj is None:
continue
parameter_dict = {x: [] for x in property_list}
for group_object in object_list:
obj = group_object.object
tmp_dict = {}
try:
tmp_dict = {}
for property in property_list:
getter = getattr(obj, property, None)
if getter is not None and callable(getter):
if callable(getter):
value = getter()
else:
value = getattr(obj, 'get%s' % UpperCase(property))()
tmp_dict[property] = value
except ConflictError:
raise
except Unauthorized: # should happen in tricky testERP5Catalog tests only
except Unauthorized: # should happen in tricky testERP5Catalog tests only
continue
except Exception, e:
exception = e
failed_path_list.append(path)
else:
for property, value in tmp_dict.items():
parameter_dict.setdefault(property, []).append(value)
if len(failed_path_list):
if len(parameter_dict):
# reregister activity for failed objects only
context.activate(activity='SQLQueue', priority=5).SQLCatalog_deferFullTextIndexActivity(path_list=failed_path_list)
group_object.raised()
else:
# if all objects are failed one, just raise an exception to avoid infinite loop.
raise AttributeError, 'exception %r raised in indexing %r' % (exception, failed_path_list)
for property, value in tmp_dict.iteritems():
parameter_dict[property].append(value)
group_object.result = None
if parameter_dict:
return method(**parameter_dict)
......@@ -50,7 +50,7 @@
</item>
<item>
<key> <string>_params</string> </key>
<value> <string>path_list</string> </value>
<value> <string>object_list</string> </value>
</item>
<item>
<key> <string>_proxy_roles</string> </key>
......@@ -60,6 +60,12 @@
</tuple>
</value>
</item>
<item>
<key> <string>description</string> </key>
<value>
<none/>
</value>
</item>
<item>
<key> <string>id</string> </key>
<value> <string>SQLCatalog_deferFullTextIndexActivity</string> </value>
......
  • mentioned in commit fc2d07de
  • This only fixes erp5_full_text_mroonga_catalog. What about other full_text BT ?

    /cc @kazuhiko

  • Other fulltext BTs should rather be deleted. They are inferior to mroonga in at least CJK support, so I see supporting them as a waste of time.
