Commits · 08be2f8cc936fbdc3d5bafb4355c3a475198abc1 · Nikola Balog / erp5


29 Apr, 2021 1 commit
- apply raiser fixer & absolute import fixer · 08be2f8c
  Aurel authored 4 years ago
  
18 Jan, 2021 1 commit

CMFActivity: Optimise validation queries. · a016ed04

See SQLBase._getExecutableMessageSet for operation principle.
Removes the notion of order_validation_text: activity validation is no
longer evaluated per-activity, but per-dependency for multiple activities
at a time. In this context, order_validation_text does not make sense as
it flattens all dependency types for a given activity.
Rework activity-dependency-to-SQL methods: use a dict rather than
dynamically-generated method names.
Based on initial work by Julien Muchembled.
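
As a rough illustration of the two ideas above (grouping by dependency type across several activities, and looking handlers up in a dict instead of building method names dynamically), here is a minimal sketch. The handler functions, the `DEPENDENCY_HANDLERS` name and the returned `(column, values)` pairs are hypothetical and are not taken from SQLBase; only `after_tag` and `after_path` are existing activate() dependency names.

```python
from collections import defaultdict

# Hypothetical handlers: each one turns all values of a single dependency type
# into a (column, sorted values) pair standing in for an SQL condition.
def _tag_condition(values):
    return ('tag', sorted(values))

def _path_condition(values):
    return ('path', sorted(values))

# Dict lookup instead of getattr() on a dynamically-generated method name.
DEPENDENCY_HANDLERS = {
    'after_tag': _tag_condition,
    'after_path': _path_condition,
}

def build_conditions(activity_list):
    """Build one condition per dependency type for many activities at once.

    activity_list holds (uid, {dependency name: value}) pairs.
    An unknown dependency name raises KeyError.
    """
    values_by_type = defaultdict(set)
    for _uid, dependencies in activity_list:
        for name, value in dependencies.items():
            values_by_type[name].add(value)
    return {
        name: DEPENDENCY_HANDLERS[name](values)
        for name, values in values_by_type.items()
    }

conditions = build_conditions([
    (1, {'after_tag': 'x'}),
    (2, {'after_tag': 'y', 'after_path': '/a'}),
    (3, {'after_tag': 'x'}),
])
```

Three activities yield two conditions here, one per dependency type, rather than one evaluation per activity.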


21 May, 2019 1 commit

CMFActivity: Implement node families. · bf35001b

Vincent Pelletier authored 5 years ago

The intent is to be able to tell that an independently-defined group of
activity nodes may execute a given activity, and no other node.
This allows more flexible parallelism control than serialization_tag.


21 Feb, 2019 1 commit

CMFActivity: new activate() parameter to prefer executing on the same node · 301962ad

Julien Muchembled authored 6 years ago

The goal is to make better use of the ZODB Storage cache. It is common to
process a data set in several sequential transactions: in such a case, by
continuing execution of these messages on the same node, data is loaded from
ZODB only once. Without this, and if there are many other messages to process,
processing always continues on a random node, causing much more load from ZODB.

To prevent nodes from having too much work to do, or too little compared to
other nodes, this new parameter is only a hint for CMFActivity. It remains
possible for a node to execute a message that was intended for another node.

Before this commit, a processing node selects the first message(s) according to
the following ordering:

  priority, date

and now:

  priority, node_preference, date

where node_preference is:

  -1 -> same node
   0 -> no preferred node
   1 -> another node
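
The new ordering can be sketched in plain Python as a sort key. This is only an illustration of the ordering rule, not the SQL that CMFActivity runs; the `Message` record and its field names are hypothetical.

```python
from collections import namedtuple

# Hypothetical message record; field names are illustrative only.
Message = namedtuple('Message', 'uid priority node date')

def node_preference(message_node, current_node):
    """Return -1 for the same node, 0 for no preferred node, 1 for another node."""
    if message_node is None:
        return 0
    return -1 if message_node == current_node else 1

def selection_order(messages, current_node):
    """Sort messages by (priority, node_preference, date)."""
    return sorted(
        messages,
        key=lambda m: (m.priority, node_preference(m.node, current_node), m.date),
    )

messages = [
    Message(uid=1, priority=1, node=None, date=10),
    Message(uid=2, priority=1, node='node-b', date=5),
    Message(uid=3, priority=1, node='node-a', date=20),
    Message(uid=4, priority=0, node='node-b', date=30),
]
ordered_uids = [m.uid for m in selection_order(messages, 'node-a')]
# ordered_uids == [4, 3, 1, 2]
```

Priority still dominates: message 4 comes first although it prefers another node, and the node hint only breaks ties between messages of equal priority.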

The implementation is tricky for 2 reasons:
- MariaDB can't order this way in a single simple query, so we have 1
  subquery for each case, potentially getting 3 times the wanted maximum of
  messages, then order/filter on the resulting union.
- MariaDB also can't filter efficiently messages for other nodes, so the 3rd
  subquery returns messages for any node, potentially duplicating results from
  the first 2 subqueries. This works because they'll be ordered last.
  Unfortunately, this requires extra indices.
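
The three-subquery approach described above can be simulated in plain Python to see why the duplicates are harmless. This is a sketch under assumptions: messages are plain dicts with hypothetical keys, and each "subquery" is a sorted, truncated list rather than real SQL.

```python
def reserve_candidates(messages, current_node, limit):
    """Pick up to `limit` uids ordered by (priority, node_preference, date)."""
    def subquery(predicate, preference):
        # Each subquery orders by (priority, date) only and is capped at `limit`.
        rows = sorted(
            (m for m in messages if predicate(m)),
            key=lambda m: (m['priority'], m['date']),
        )[:limit]
        return [(m['priority'], preference, m['date'], m['uid']) for m in rows]

    same = subquery(lambda m: m['node'] == current_node, -1)
    unset = subquery(lambda m: m['node'] is None, 0)
    # Third subquery: any node, so it may repeat rows from the first two.
    any_node = subquery(lambda m: True, 1)

    best = {}
    for row in sorted(same + unset + any_node):
        # A duplicate from the third subquery sorts after the original row
        # (same priority and date, larger preference), so it is ignored.
        best.setdefault(row[3], row)
    return list(best)[:limit]

messages = [
    {'uid': 1, 'priority': 1, 'node': None, 'date': 1},
    {'uid': 2, 'priority': 1, 'node': 'a', 'date': 2},
    {'uid': 3, 'priority': 1, 'node': 'b', 'date': 0},
    {'uid': 4, 'priority': 1, 'node': 'a', 'date': 3},
]
reserved = reserve_candidates(messages, 'a', 2)
# reserved == [2, 4]
```

The union holds up to three times `limit` rows before the final ordering and truncation, which matches the cost mentioned in the first bullet.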

In any case, message reservation must be very efficient, or MariaDB deadlocks
quickly happen, and locking an activity table during reservation reduces
parallelism too much.

In addition to better cache efficiency, this new feature can be used as a
workaround for a bug affecting serialization_tag, causing IntegrityError when
reindexing many new objects. If you have 2 recursive reindexations for both a
document and one of its lines, and if you have so many messages that grouping
is split between these 2 messages, then you end up with 2 nodes indexing the
same line in parallel: for some tables, the pattern DELETE+INSERT conflicts
since InnoDB does not take any lock when deleting a non-existent row.

If you have many activities creating such documents, you can combine with
grouping and appropriate priority to make sure that such a pair of messages won't
be executed on different nodes, except maybe at the end (when there's no
document to create anymore; then activity reexecution may be enough).
For example:

  from Products.CMFActivity.ActivityTool import getCurrentNode
  portal.setPlacelessDefaultReindexParameters(
    activate_kw={'node': 'same', 'priority': priority},
    group_id=getCurrentNode())

where `priority` is the same as the activity containing the above code, which
can also use grouping without increasing the probability of IntegrityError.


13 Feb, 2019 2 commits
- CMFActivity: some optimization and clean-up in the code reserving messages · cee3e728
  Julien Muchembled authored 6 years ago
  
- CMFActivity: limit insertion by size in bytes instead of number of rows · 17dc7e23
  Julien Muchembled authored 6 years ago
```
This fixes the issue that a transaction with many big messages failed to
commit. By dynamically finding the maximum allowed size of a query, it also
speeds up insertion by minimizing the number of queries.
```
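
Batching by size rather than by row count can be sketched as follows. This is a generic illustration, not the code from the commit: `batch_by_size` and its parameters are hypothetical, and `max_bytes` stands for whatever limit the database reports (for MariaDB, presumably something like `max_allowed_packet`, which is an assumption here).

```python
def batch_by_size(rows, max_bytes, overhead=0):
    """Yield lists of rows whose total size stays within max_bytes.

    rows are already-encoded byte strings; overhead is the fixed size of the
    statement around them. A single row larger than the limit still gets its
    own batch, so the caller has to handle that case separately.
    """
    batch, size = [], overhead
    for row in rows:
        row_size = len(row)
        if batch and size + row_size > max_bytes:
            yield batch
            batch, size = [], overhead
        batch.append(row)
        size += row_size
    if batch:
        yield batch

rows = [b'a' * 40, b'b' * 40, b'c' * 40, b'd' * 10]
batch_lengths = [len(batch) for batch in batch_by_size(rows, max_bytes=100)]
# batch_lengths == [2, 2]
```

Filling each query as close to the limit as possible gives the smallest number of queries for a given set of messages.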
05 Feb, 2019 2 commits

CMFActivity: remove useless 'priority' index · 4b7acaa7
Julien Muchembled authored 6 years ago


CMFActivity: drop DTML completely and use consecutive uids when possible · d64887cb

Julien Muchembled authored 6 years ago

This moves the remaining DTML queries to Python, dropping the 'activity' skin.

Dealing with conflicts of uids is easier if the inserted uids are consecutive:
now, only 1 random value is generated, as base uid. This also preserves the
order of insertion, which is wanted for performance reasons:
- No more random write in the primary index.
- When modifying several lines of several documents, 1 document being processed
  at a time, we'd like that any grouped activity (usually indexation) follows
  the same order, so that a processing node prefers many lines from a few
  documents instead of mixing lines from too many documents at the same time.
  This is usually better for caches.
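
A minimal sketch of allocating consecutive uids from a single random base, assuming a signed 64-bit uid column. `MAX_UID` and `allocate_uids` are hypothetical names, and conflict handling (retrying with another base when a uid is already taken) is left out.

```python
import random

MAX_UID = 2 ** 63 - 1  # assumption: uids are stored as signed 64-bit integers

def allocate_uids(count, rng=random):
    """Return `count` consecutive uids starting from one random base value."""
    # Only one random value is drawn; the upper bound keeps the whole run
    # within the allowed range.
    base = rng.randint(1, MAX_UID - count + 1)
    return range(base, base + count)

uids = list(allocate_uids(5, random.Random(0)))
```

Because the uids grow with insertion order, rows inserted together land next to each other in the primary index.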


18 Jan, 2019 1 commit
- CMFActivity: move most SQL queries from DTML to Python · ad7ee9aa
  Julien Muchembled authored 6 years ago
  
26 Mar, 2018 1 commit

CMFActivity: Use a random value for activity uids · e1549361

Boxiang Sun authored 7 years ago

Sequential number generators stored in a fixed-size format eventually run
out of values. But activity queues only care about what activities are
currently present: any uid can be reused as soon as it is available.
So stop using a sequential id generator for activity uids, and instead use
random values.

Vincent Pelletier:
- Commit message.
- Minor formatting changes.
- Do probability computations, and increase activity uid storage size to
  64-bit integers, up from 32. Table schema migration happens on the first
  activity node which starts on upgraded code.
- Apply to SQLJobLib too.
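
The commit message does not show the probability computations themselves. As a rough sketch of the kind of estimate involved, assuming uids are drawn uniformly and every value of the given width is usable, the chance that one new random uid collides with an existing one is the number of live uids divided by the size of the uid space:

```python
def collision_probability(live_count, bits):
    """Chance that one uniformly random uid hits one of live_count live uids."""
    return live_count / (2.0 ** bits)

# With about a million live activities (2**20), hypothetical figure:
p32 = collision_probability(2 ** 20, 32)  # 2**-12, roughly 1 in 4096
p64 = collision_probability(2 ** 20, 64)  # 2**-44
```

Under these assumptions, going from 32 to 64 bits divides the per-insertion collision probability by 2**32.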


15 Mar, 2018 1 commit
- CMFActivity.Activity.SQLJobLib: Drop unused globals. · 0ffa37d7
  Vincent Pelletier authored 6 years ago
  
06 Mar, 2018 1 commit

CMFActivity: Use CMFActivity as a backend for joblib · d2c88bd6

Hardik Juneja authored 8 years ago

This commit:

- Adds a new Activity called "SQLJoblib"
- Adds a Backend to be used by joblib
- Uses OOBTree to store results instead of ConflictFreeLog
- Adds a getResultDict API to fetch the result dict

It uses the original work from rafael@nexedi.com and loic.esteve@inria.fr
