Bk sqlcatalog
- non-transaction-bounded caches replaced by transaction / cluster friendly caches
- make sure that properties of catalog can not be changed whenever connection to database is absent.
The work on this patch was done in July 2015, the discussion related to it is bellow:
From: Julien Muchembled jm@nexedi.com To: Boris Kocherov bk@raskon.org
CC: Vincent Pelletier vincent@nexedi.com, Romain Courteaud romain@nexedi.com
Date: Thu, 16 Jul 2015 12:21:05 +0200
Subject: [bk_sqlcatalog] improve database error handling: if database is not connected user sees erp5 designed exception
Le 07/16/15 11:18, Boris Kocherov a écrit:
+def database_exception_pass(default_value=None):
- @simple_decorator
- def decorator(function):
- def wrapped(*args, **kw):
try:
return function(*args, **kw)
except DatabaseError:
exc_info = sys.exc_info()
if str(exc_info[1]).endswith(" is not connected to a database"):
return default_value
else:
raise exc_info[0], exc_info[1], exc_info[2]
- return wrapped
- return decorator
I'm afraid that this would mask exceptions if a MySQL transaction was already started. That would be bad if tcp_finish fails.
The bare except: in DA.call (Products/ERP5Type/patches/DA.py) is really ugly.
This makes me realize that DB._query in Products.ZMySQLDA.db should not catch exceptions if the TM is registered.
----- End forwarded message -----
From: Boris Kocherov bk@raskon.org
To: Julien Muchembled jm@nexedi.com
Cc: Vincent Pelletier vincent@nexedi.com, Romain Courteaud romain@nexedi.com
Date: Mon, 20 Jul 2015 01:21:27 +0300
Subject: Re: [bk_sqlcatalog] improve database error handling: if database is not connected user sees erp5 designed exception
Hello, Julien, As i see, my code masks only 1 error — when ERP5 can not connect to mySQL while starting. If the connection is lost after ERP5 is loaded then MySql client generates other errors during timeout period. This timeout is defined by wait_timeout variable which is 8 hours by default. The masking of exceptions by DB._query is done for transparent connection restart after 8 hours of inactivity (that is impossible in current ERP5 implementation). In code around DB._query i don't understand why SQL commit is done in tpc_finish. Isnt he following situation be possible: all mySql Queries are done, the datas are saved in ZopeODB but if MySql connection is lost then the tpc_finish fails, so datas will not be saved in MySql DB? May be it's better to use tpc_vote instead of tpc_finish?
In fact i don't understand what is the purpose of monkey patch DA.call .
Boris.
----- End forwarded message -----
From: Vincent Pelletier vincent@nexedi.com
To: Boris Kocherov bk@raskon.org
Cc: Julien Muchembled jm@nexedi.com, Romain Courteaud romain@nexedi.com
Date: Mon, 20 Jul 2015 10:47:20 +0200
Subject: Re: [bk_sqlcatalog] improve database error handling: if database is not connected user sees erp5 designed exception
Hello Boris,
On Mon, 20 Jul 2015 01:21:27 +0300, Boris Kocherov bk@raskon.org wrote:
As i see, my code masks only 1 error — when ERP5 can not connect to mySQL while starting. If the connection is lost after ERP5 is loaded then MySql client generates other errors during timeout period. This timeout is defined by wait_timeout variable which is 8 hours by default. The masking of exceptions by DB._query is done for transparent connection restart after 8 hours of inactivity (that is impossible in current ERP5 implementation).
I do not understand exactly the (dis)connection scenario which is not possible in current implementation.
At the risk of missing your point, I would like to describe the problem with silent reconnections. Silent re-connection outside of transaction boundaries would be a serious bug: with mysql, transactions are bound to connections. If connection is lost and reestablished without aborting zope transaction:
- transaction isolation is violated (ie, Zope transaction would suddenly see a more recent database snapshot)
- data durability is violated, as any data modification query already executed is lost (----- Forwarded message from Vincent Pelletier vincent@nexedi.com -----as we never requested a commit, mysql can reasonably rollback everything) So everything around SQL connectors must be very careful to propagate exceptions (and more generally preventing current transaction from committing, which may not be done in current code, but which exists in ZODB). Ideally, tpc_vote (discussed next) should poke ("SELECT 1" ?) the database to check one last time that connection is still alive. I did not check lately, maybe it does just that.
In code around DB._query i don't understand why SQL commit is done in tpc_finish. Isn't he following situation be possible: all mySql Queries are done, the datas are saved in ZopeODB but if MySql connection is lost then the tpc_finish fails, so datas will not be saved in MySql DB? May be it's better to use tpc_vote instead of tpc_finish?
The scenario you describe is possible.
But committing SQL transaction in tpc_votes would make things worse: The purpose of tpc_vote is to ask all transactional connectors's approval to commit the transaction. The refusal of a single connector cause the transaction to be aborted. So the following would be possible:
- tpc_vote mysql -> commits SQL
- tpc_vote zodb, zodb refuses commit (typical: conflict error)
- tpc_abort, trying to rollback all change: fails to rollback SQL transaction as it's already committed. Here, without any network incident, we get inconsistent databases.
Inter-database inconsistencies risks are just unavoidable in a multi-connector (or multi-base) scenario, because of loss of atomicity (tpc_vote is "test" and tpc_finish is "set" from typical test-and-set atomic operations). I think current implementation makes it as unlikely as possible.
In fact i don't understand what is the purpose of monkey patch DA.call .
Most of the code comes from original method. Judging from the coding style (more than one statement per line), the "except:" statement I see very likely comes from Zope code.
As for the specific features added by the patch, I cannot tell much more than the commit log.
Regards,
Vincent Pelletier ERP5 - open source ERP/CRM for flexible enterprises
----- End forwarded message -----
From: Boris Kocherov bk@raskon.org
To: Vincent Pelletier vincent@nexedi.com
Cc: Julien Muchembled jm@nexedi.com, Romain Courteaud romain@nexedi.com
Date: Mon, 20 Jul 2015 13:38:33 +0300
Subject: Re: [bk_sqlcatalog] improve database error handling: if database is not connected user sees erp5 designed exception
On 20/07/15 10:47, Vincent Pelletier wrote:
Hello Boris,
On Mon, 20 Jul 2015 01:21:27 +0300, Boris Kocherov bk@raskon.org wrote:
As i see, my code masks only 1 error — when ERP5 can not connect to mySQL while starting. If the connection is lost after ERP5 is loaded then MySql client generates other errors during timeout period. This timeout is defined by wait_timeout variable which is 8 hours by default. The masking of exceptions by DB._query is done for transparent connection restart after 8 hours of inactivity (that is impossible in current ERP5 implementation).
I do not understand exactly the (dis)connection scenario which is not possible in current implementation.
i am sorry for wrong description of MySql client behaviour during disconnection: the error types don't depend on timeout. In fact my code which we discuss can be removed because the result of the exception masking in this code is the possibility to show for user the standard ERP5 page view while SQL was not connected during ERP5 loading. If this code will be removed then the ERP5 standard page can not be rendered because of inaccessibility of erp5 workflow list, prefrence and count of activities.
At the risk of missing your point, I would like to describe the problem with silent reconnections. Silent re-connection outside of transaction boundaries would be a serious bug: with mysql, transactions are bound to connections. If connection is lost and reestablished without aborting zope transaction:
- transaction isolation is violated (ie, Zope transaction would suddenly see a more recent database snapshot)
- data durability is violated, as any data modification query already executed is lost (as we never requested a commit, mysql can reasonably rollback everything) So everything around SQL connectors must be very careful to propagate exceptions (and more generally preventing current transaction from committing, which may not be done in current code, but which exists in ZODB).
I totally agree.
Ideally, tpc_vote (discussed next) should poke ("SELECT 1" ?) the database to check one last time that connection is still alive. I did not check lately, maybe it does just that. No, currently it does not.
In code around DB._query i don't understand why SQL commit is done in tpc_finish. Isn't he following situation be possible: all mySql Queries are done, the datas are saved in ZopeODB but if MySql connection is lost then the tpc_finish fails, so datas will not be saved in MySql DB? May be it's better to use tpc_vote instead of tpc_finish?
The scenario you describe is possible.
But committing SQL transaction in tpc_votes would make things worse: The purpose of tpc_vote is to ask all transactional connectors's approval to commit the transaction. The refusal of a single connector cause the transaction to be aborted. So the following would be possible:
- tpc_vote mysql -> commits SQL
- tpc_vote zodb, zodb refuses commit (typical: conflict error)
- tpc_abort, trying to rollback all change: fails to rollback SQL transaction as it's already committed. Here, without any network incident, we get inconsistent databases.
Inter-database inconsistencies risks are just unavoidable in a multi-connector (or multi-base) scenario, because of loss of atomicity (tpc_vote is "test" and tpc_finish is "set" from typical test-and-set atomic operations). I think current implementation makes it as unlikely as possible.
May be the using of mysql XA transaction (https://dev.mysql.com/doc/refman/5.0/en/xa.html) can help to solve the problem. Anyway this problem is out of range of my task given by JPS while it's interesting problem :)
Boris
----- End forwarded message -----
From: Vincent Pelletier vincent@nexedi.com
To: Boris Kocherov bk@raskon.org
Cc: Julien Muchembled jm@nexedi.com, Romain Courteaud romain@nexedi.com
Date: Mon, 20 Jul 2015 17:09:10 +0200
Subject: Re: [bk_sqlcatalog] improve database error handling: if database is not connected user sees erp5 designed exception
On Mon, 20 Jul 2015 13:38:33 +0300, Boris Kocherov bk@raskon.org wrote:
i am sorry for wrong description of MySql client behaviour during disconnection: the error types don't depend on timeout. In fact my code which we discuss can be removed because the result of the exception masking in this code is the possibility to show for user the standard ERP5 page view while SQL was not connected during ERP5 loading. If this code will be removed then the ERP5 standard page can not be rendered because of inaccessibility of erp5 workflow list, prefrence and count of activities.
I think it's acceptable for ERP5 UI to be broken when MySQL cannot be reached. What would be problematic though would be ZMI unavailability because it then prevents editing connection strings, hence fixing the error.
Ideally, tpc_vote (discussed next) should poke ("SELECT 1" ?) the database to check one last time that connection is still alive. I did not check lately, maybe it does just that. No, currently it does not. [...] May be the using of mysql XA transaction (https://dev.mysql.com/doc/refman/5.0/en/xa.html) can help to solve the problem. Anyway this problem is out of range of my task given by JPS while it's interesting problem :)
Indeed, investigating XA usage would be very interesting - and would obsolete the "SELECT 1" hack I proposed above.
BTW, I forgot to mention that, in addition to Romain whom you should have received an "away" automatic mail, Julien is also away. He will be back a few days before Romain. Sorry for the delay in reviewing your work.
Regards,
Vincent Pelletier
ERP5 - open source ERP/CRM for flexible enterprises