Commit 7ab965ec authored by Klaus Wölfel, committed by Xiaowu Zhang

composition: Allow to define default models

Models with workflow state "default" are only considered in composition if no
other valid model is defined. This makes it possible to implement an
"ingest anything" policy in wendelin.
parent 9a146471
@@ -54,28 +54,36 @@ def _getEffectiveModel(self, start_date, stop_date):
   if not reference:
     return self
-  query_list = [Query(reference=reference),
-                Query(portal_type=self.getPortalType()),
-                Query(validation_state=('deleted', 'invalidated'),
+  def get_model_list(excluded_validation_state_list):
+    query_list = [Query(reference=reference),
+                  Query(portal_type=self.getPortalType()),
+                  Query(validation_state=excluded_validation_state_list,
                         operator='NOT')]
-  if start_date is not None:
-    query_list.append(ComplexQuery(Query(effective_date=None),
-                                   Query(effective_date=start_date,
+    if start_date is not None:
+      query_list.append(ComplexQuery(Query(effective_date=None),
+                                     Query(effective_date=start_date,
                                            range='<='),
-                                   logical_operator='OR'))
-  if stop_date is not None:
-    query_list.append(ComplexQuery(Query(expiration_date=None),
-                                   Query(expiration_date=stop_date,
-                                         range='>'),
-                                   logical_operator='OR'))
-  # XXX What to do the catalog returns nothing (either because 'self' was just
-  #     created and not yet indexed, or because it was invalidated) ?
-  #     For the moment, we return self if self is invalidated and we raise otherwise.
-  #     This way, if this happens in activity it may succeed when activity is retried.
-  model_list = self.getPortalObject().portal_catalog.unrestrictedSearchResults(
-    query=ComplexQuery(logical_operator='AND', *query_list),
-    sort_on=(('version', 'descending'),))
+                                     logical_operator='OR'))
+    if stop_date is not None:
+      query_list.append(ComplexQuery(Query(expiration_date=None),
+                                     Query(expiration_date=stop_date,
+                                           range='>'),
+                                     logical_operator='OR'))
+    # XXX What to do the catalog returns nothing (either because 'self' was just
+    #     created and not yet indexed, or because it was invalidated) ?
+    #     For the moment, we return self if self is invalidated and we raise otherwise.
+    #     This way, if this happens in activity it may succeed when activity is retried.
+    return self.getPortalObject().portal_catalog.unrestrictedSearchResults(
+      query=ComplexQuery(logical_operator='AND', *query_list),
+      sort_on=(('version', 'descending'),))
+  # Only include default model if no other valid model found
+  model_list = get_model_list(
+    excluded_validation_state_list=('deleted', 'invalidated', 'default'))
+  if not model_list:
+    model_list = get_model_list(
+      excluded_validation_state_list=('deleted', 'invalidated'))
   if not model_list:
     if self.getValidationState() == 'invalidated':
       return self
......
  • Is this modification related to portal_callables? If so, how? I could not understand the intention of this commit.

  • @tatuya This modification is not related to portal_callables, but it is needed for merging the wendelin data model to master: it allows defining a default data supply in case no other valid data supply is found. This is useful, for example, when you would like to implement an "ingest everything" policy in wendelin: if no data supply for a fluentd tag is found, we can still have a default data supply for all unknown ingestions. Or if no data supply for a particular project is found, we can still have a default data supply per sensor.
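
A minimal sketch of the fallback described above, using plain dictionaries instead of catalog queries. The name `get_model_list` and the state tuples follow the diff; `select_effective_model`, the dictionary keys, and returning the first (highest-version) result are assumptions made for illustration, not the ERP5 implementation.

```python
from typing import Optional, Sequence


def get_model_list(models: Sequence[dict], excluded_states: Sequence[str]) -> list:
    """Return the models not in an excluded state, highest version first."""
    kept = [m for m in models if m["validation_state"] not in excluded_states]
    return sorted(kept, key=lambda m: m["version"], reverse=True)


def select_effective_model(models: Sequence[dict]) -> Optional[dict]:
    """Two-pass lookup: 'default' models are considered only as a last resort."""
    # First pass: ignore default models entirely.
    model_list = get_model_list(models, ("deleted", "invalidated", "default"))
    if not model_list:
        # Second pass: nothing else matched, so default models are allowed.
        model_list = get_model_list(models, ("deleted", "invalidated"))
    # Assumption: the first (highest-version) result is the one that is used.
    return model_list[0] if model_list else None


default_supply = {"id": "default", "validation_state": "default", "version": 1}
tag_supply = {"id": "tag", "validation_state": "validated", "version": 1}

print(select_effective_model([default_supply, tag_supply])["id"])
print(select_effective_model([default_supply])["id"])
```

With both supplies present the validated one wins; the default supply is returned only when it is the sole remaining candidate.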

  • Thank you for your quick reply. Now I understand the intention. In my understanding, this modification should be reviewed carefully because this method is extremely important in ERP5. Also, I would prefer that this modification be merged to master separately from the other portal_callables branch modifications. I will probably work on this merge first. I plan to:

    • implement a test in testTradeCondition
    • ask Julien or Kazuhiko or Jerome or someone to review this (or fully understand the usage of getEffectiveModel myself)
  • I agree with your plan, thanks for the effort.

  • Where is this used in wendelin? My first guess was ERP5Site_handleDefaultFluentdIngestion.py, but it does not seem to use it, so I could not find it.

  • @tatuya, I think Klaus uses it for the new model, which I am working on making generic and the default for Wendelin.

  • Thank you Ivan. But anyway I would like to find the example. I have also checked klaus/wendelin.git/master but I could not find it. In my understanding, there must be a usage of getEffectiveModel(), or something similar.

    At the same time, IMO we need to define or implement the 'default' concept in a generic way before merging this commit. Otherwise, developers cannot understand what the 'default' state means for effective models, because the 'default' state does not exist in erp5.git. Therefore I think we need, for example, to:

    • create a concrete 'default' state in a concrete workflow (I know it is in data_supply_validation_workflow in wendelin.git but it is too specific)
    • implement an abstract interface to define the 'default' concept, for example in IVersionable or in IComposition (this does not exist). Then implement isDefault()? in ERP5/Document/Document.py, or getDefaultModel()? in composition.py.

    I would like to see the example in order to understand this kind of background.

    If 'default' is not generic, we should not touch mixin/composition.py. If it is generic, we need more work than just putting the 'default' state in composition.py.

  • @tatuya, I agree completely. It goes beyond just Wendelin and should be considered and worked on as a 'generic' addition (which I think is the case).

    I think Klaus can show some example (project specific).

  • @tatuya @Tyagov the wendelin case is: when data comes in, wendelin tries to find a movement (data ingestion) which matches the data (tag). If no such data ingestion exists, wendelin creates one by specialising a data supply. Sometimes one wants to implement an "ingest everything" policy. In this case, if no validated data supply is found, wendelin will search for a default data supply to ingest the data. This is the reason for "default" state.

    A similar case in erp5 outside wendelin: Automatically applying a trade condition: If no validated trade condition for a sale order is found, use a "default" trade condition.

  • @klaus Thank you for your comment. Sorry, my question was not clear. Could you give me the link to the source code where it is used? If it is in a closed repository, could you send me an e-mail?

    Edited by Tatuya Kamada
  • Edited by Klaus Wölfel
  • @klaus Thank you! This is what I wanted to find. This also means that the modification in composition.py has not been used yet.

  • @klaus, I see. If so, it is very difficult to notice (I am surprised). Anyway, I will check this out. Thank you.

  • @tatuya To prove that asComposedDocument() returns the correct result only with the patch applied, I must add a test.

  • @klaus I am back to this task.

    For this task, I studied Business Process again (which took some time), and my current understanding is that we do not need this commit, because asComposedDocument() and _getEffectiveModel() are designed to use the effective date and the version (and implicitly the reference), and these are sufficient. (see: https://lab.nexedi.com/nexedi/erp5/blob/6bd1ebbe58dc0c4c1fa7c641e91127b1a340facd/product/ERP5/mixin/composition.py#L48)

    With versioning, we can replace IngestionPolicy_getIngestionOperationAndParameterDict.py with the following commit:

    tatuya/wendelin@0a053876

    (I am OK to add the test for this change in testWendelin or somewhere else, if you want)

    So in my opinion, we have no reason to introduce 'default' into Path, Effective Model, or specialise. In other words, "IVersionable + composition" already has the power to express default, which is defined as:

    default  ≡  version = ∞  

    Here are my questions:

    • Is default_state for Path {Sale Trade Condition, Business Process, Supply, Data Supply, Data Transformation} really needed? Why?
    • Do you think default_state in the validation workflow makes sense? Honestly, I cannot understand the workflow transition:

    default -> validated -> invalidated

    I could understand it if it were 'planned', 'reviewed', 'designed' or some other intermediate state before 'validated'. But I do not understand, or honestly I am opposed to, having 'default' before 'validated'. I also find it very strange that 'default' is not validated.

    I could understand having something like a priority_state and a priority_workflow, such as:

    draft -> {default -> normal -> priority}

    But in validation I do not understand it, and I am opposed to having it.

    If there is still a strong reason to have 'default' in Path, I can add the definition, namely that default is the infinite version on IVersionable, and implement getLatestVersionValue() in mixin/version.py or in mixin/composition.py, but personally I could not find a strong reason to do it explicitly.

    To rephrase: if you have a reason to introduce a not-validated default Path, could you please tell me the reason?
  • My last sentence is a bit confusing.

    My opinion in one sentence:

    We need to add default only if we need not-validated-default-Path.

  • @klaus This is very off-topic, but after reading developer-Business.Process I feel that Data Supply and Data Transformation {can be, should be} unified in the same fashion as Business Process. For example, they could be unified into Data Process, with the sub-objects Data Link and Data Path.

    Also, a minor remark: we {can, should} have the causality from Data Analysis to Data Ingestion, and we should add proper jumps between Data * documents.

    Edited by Tatuya Kamada
  • @tatuya for version versus "default" state: if we used version, default would be "version == 0" or "version == -∞" in my understanding, because default is only used if no normal data supply is found. Actually, using version == 0 was my first approach, but @jp was against it. I don't remember the reason; maybe he can remind us.

    For Data Link and Data Path: Link and Path will be used in wendelin in addition to Data Supply and Data Transformation once we implement Business Process in wendelin, the same as we have in Trade / Production: Sale Supply, Transformation, Link and Path.
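
The version-only alternative mentioned in this comment can be sketched as follows. It assumes the fallback model is kept in a usable state with the lowest version (0 here), so a single "highest version wins" pass never prefers it over a normal model. The function name and dictionary layout are hypothetical.

```python
from typing import Optional, Sequence


def select_by_version(models: Sequence[dict]) -> Optional[dict]:
    """Single pass: drop unusable states, then take the highest version."""
    usable = [m for m in models
              if m["validation_state"] not in ("deleted", "invalidated")]
    # version == 0 marks the fallback model, so it is chosen only when alone.
    return max(usable, key=lambda m: m["version"], default=None)


fallback = {"id": "fallback", "validation_state": "validated", "version": 0}
normal = {"id": "normal", "validation_state": "validated", "version": 2}
retired = {"id": "retired", "validation_state": "invalidated", "version": 3}

print(select_by_version([fallback, normal, retired])["id"])
print(select_by_version([fallback, retired])["id"])
```

In this sketch the fallback needs no dedicated workflow state, at the cost of reserving a version number for it.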

  • If default is 0 or -∞, I completely misunderstood the concept of default. I must die. Thank you for notifying me. This is a fallback model, isn't it?

    Even so, it feels very strange to me to have default_state in validation_workflow, but this is only a naming or psychological problem. If 'default' is an archetype and an incomplete model, I can understand having it in the validation workflow before it is validated.

    "Because default is only used if no normal data supply is found"

    Sorry, I am not 100% sure. Could you please give concrete examples?

    Let me try to make an example. Is my understanding right?

    data_supply_module/1   (This is a default model)
      Reference:  model_a
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      version: 0
      validation_state: default
    
    data_supply_module/2  (This is a specific model)
      Reference:  model_a
      specialize:  data_supply_module/1
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      version: 1
      validation_state: invalidated
    
    data_supply_module/3  (This is more specific model)
      Reference:  model_a
      specialize:  data_supply_module/2
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      version: 2
      validation_state: invalidated

    In this example, we have no validated model, so default should be selected.

    If this understanding is correct, default is always the root model.

  • Yes, your understanding is correct. I also agree that, from a workflow point of view, it is strange to have default_state in the validation workflow. Default is a concept which is orthogonal to validation. So maybe the best solution is indeed to add another workflow, such as the priority workflow you were proposing.

  • @klaus, thank you for your confirmation.

    "the wendelin case is: when data comes in, wendelin tries to find a movement (data ingestion) which matches the data (tag). If no such data ingestion exists, wendelin creates one by specialising a data supply. Sometimes one wants to implement an "ingest everything" policy. In this case, if no validated data supply is found, wendelin will search for a default data supply to ingest the data. This is the reason for "default" state."

    If my understanding is correct, I am not sure why data_ingestion is related to 'default' in your previous example.

    Here is an example.

    = case 1. no data ingestion
    
    data reference: model_a
    data_ingestion_module: empty   # to simplify the situation
    
    data_supply_module/1
      reference: model_a
      validation_state: default
    
    data_supply_module/2
      specialize: data_supply_module/1
      reference: model_a
      validation_state: draft

    = case 2. there is a data ingestion
    
    data reference: model_a
    data_ingestion_module/1
      specialize: data_supply_module/1
    
    data_supply_module/1
      reference: model_a
      validation_state: default
    
    data_supply_module/2
      specialize: data_supply_module/1
      reference: model_a
      validation_state: draft

    In both cases, the way to select the model is the same. So I do not understand why data_ingestion is involved in your explanation. I could understand it if we needed to find the model only when we cannot find a matching data ingestion.

    Edited by Tatuya Kamada
  • Yes, in the current wendelin use case we need to find the model only when we cannot find a matching data ingestion.

  • @klaus, thank you again for your confirmation.

    If my understanding is right, we still do not need default, because using the validation state in validation_workflow is enough to get equivalent functionality. We can even remove the versions in this example.

    = case 4. express default using only validation_state.
    
    data_supply_module/1   (This is a default model)
      Reference:  model_a
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      validation_state: validated
    
    data_supply_module/2  (This is a specific model)
      Reference:  model_a
      specialize:  data_supply_module/1
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      validation_state: invalidated
    
    data_supply_module/3  (This is more specific model)
      Reference:  model_a
      specialize:  data_supply_module/2
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      validation_state: invalidated

    I removed the version and replaced the validation_state of "data_supply_module/1" (s/default/validated/).
    Even so, I can express the default model.

    I guess there is something missing in the example. Or there must be a counterexample that I cannot imagine.

  • Consider this example:

    = case 5, default and validated model
    
    data_supply_module/1   (This is a default model)
      Reference:  model_a
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      validation_state: validated
    
    data_supply_module/2  (This is a specific model)
      Reference:  model_a
      specialize:  data_supply_module/1
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      validation_state: invalidated
    
    data_supply_module/3  (This is more specific model)
      Reference:  model_a
      specialize:  data_supply_module/2
      Start_date:  2017-01-01
      Stop_date:  2017-12-31
      validation_state: validated

    Here we have a validated model plus a default model. In this case we want the validated model to be chosen, not the default one. If we use neither version nor the "default" validation state, then the chosen model is random. But in this configuration we want data_supply_module/3 to always be chosen.

    Edited by Klaus Wölfel
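
The ambiguity in case 5 can be made concrete with a small sketch. It only mirrors the state filter from the diff on in-memory dictionaries; the `candidates` helper and the data layout are hypothetical.

```python
CASE_5 = [
    {"id": "data_supply_module/1", "validation_state": "validated"},
    {"id": "data_supply_module/2", "validation_state": "invalidated"},
    {"id": "data_supply_module/3", "validation_state": "validated"},
]


def candidates(models, excluded_states):
    """Return the ids of the models that survive the validation-state filter."""
    return [m["id"] for m in models
            if m["validation_state"] not in excluded_states]


# Without version or a 'default' state, two models pass the filter and
# nothing orders them relative to each other.
without_default = candidates(CASE_5, ("deleted", "invalidated"))

# Marking data_supply_module/1 as 'default' removes it from the first pass.
marked = [dict(m, validation_state="default")
          if m["id"] == "data_supply_module/1" else m
          for m in CASE_5]
with_default = candidates(marked, ("deleted", "invalidated", "default"))

print(without_default)
print(with_default)
```

Only the second filter leaves a single candidate, which is the behaviour asked for in this comment.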
  • Yes. But a question arises: "why not version?"

    Now I understand. "Why not version?" is the essential question. I guess you feel the same. I will ask Jean-Paul. Thank you!

  • Yes, "why not version?" is the essential question. One answer I could imagine is: if we only use version, then it will be more difficult to have several versions of default models. We would need to find a versioning scheme to decide which version numbers are for default models (e.g. all negative numbers) and which ones are for normal models.
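
One possible versioning scheme of the kind described here, sketched under explicit assumptions: negative version numbers are reserved for default models, and among defaults the larger magnitude is treated as the newer revision. Both conventions and all names are hypothetical; nothing like this exists in the commit.

```python
from typing import Optional, Sequence


def select_with_signed_versions(models: Sequence[dict]) -> Optional[dict]:
    """Prefer normal models (version >= 0); fall back to defaults (version < 0)."""
    usable = [m for m in models
              if m["validation_state"] not in ("deleted", "invalidated")]
    normal = [m for m in usable if m["version"] >= 0]
    defaults = [m for m in usable if m["version"] < 0]
    if normal:
        return max(normal, key=lambda m: m["version"])
    # Assumed convention: -2 is a newer default revision than -1.
    return max(defaults, key=lambda m: abs(m["version"]), default=None)


old_default = {"id": "default-v1", "validation_state": "validated", "version": -1}
new_default = {"id": "default-v2", "validation_state": "validated", "version": -2}
normal_model = {"id": "normal-v1", "validation_state": "validated", "version": 1}

print(select_with_signed_versions([old_default, new_default])["id"])
print(select_with_signed_versions([old_default, new_default, normal_model])["id"])
```

The extra sign convention is exactly the kind of implicit rule the comment warns about: a plain descending sort on version would pick -1 over -2 among the defaults.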

  • @jp Why do we need a default model instead of using version?
