- 07 Nov, 2018 1 commit
Julien Muchembled authored
Without this new mechanism to detect oids that are not write-locked, a transaction could be committed successfully without conflicts being detected. In the added test, the resulting value was 2, whereas it should have been 5 if there had been no node failure.
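As a hedged sketch of the idea (hypothetical names, not NEO's actual classes): at vote time the storage re-checks that every oid written by the transaction is still write-locked by that same transaction, so a lock lost after a node failure cannot lead to a silent commit of conflicting data.

    class WriteLockTable(object):
        def __init__(self):
            self._locks = {}  # oid -> ttid currently holding the write lock

        def lock(self, oid, ttid):
            # Grant the write lock if it is free; keep the current holder otherwise.
            return self._locks.setdefault(oid, ttid) == ttid

        def not_locked_by(self, oids, ttid):
            # Oids whose write lock is no longer held by this transaction.
            return [oid for oid in oids if self._locks.get(oid) != ttid]

    def vote(lock_table, stored_oids, ttid):
        missing = lock_table.not_locked_by(stored_oids, ttid)
        if missing:
            # Conflict detection was bypassed for these oids: refuse to vote.
            raise ValueError('oids not write-locked at vote time: %r' % (missing,))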
- 05 Nov, 2018 1 commit
Julien Muchembled authored
- 30 May, 2018 1 commit
Julien Muchembled authored
/reviewed-on !9
- 31 Mar, 2017 1 commit
Julien Muchembled authored
This is a follow-up of commit 64afd7d2, which focused on read accesses when there is no transaction activity. This commit also includes a test to check a simpler scenario than the one described in the previous commit.
- 23 Mar, 2017 1 commit
Julien Muchembled authored
In the worst case, with many clients trying to lock the same oids, the cluster could enter an infinite cascade of deadlocks. Here is an overview with 3 storage nodes and 3 transactions:

    S1     S2     S3     order of locking tids          # abbreviations:
    l1     l1     l2     123                            # l: lock
    q23    q23    d1q3   231                            # d: deadlock triggered
    r1:l3  r1:l2  (r1)   # for S3, we still have l2     # q: queued
    d2q1   q13    q13    312                            # r: rebase

Above, we show what happens when a random transaction gets a lock just after another is rebased. Here, the result is that the last 2 lines are a permutation of the first 2, and this can repeat indefinitely with bad luck.

This commit reduces the probability of deadlock by processing delayed stores/checks in the order of their locking tid. In the above example, S1 would give the lock to 2 when 1 is rebased, and 2 would vote successfully.
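A minimal sketch of that ordering policy (hypothetical names, not NEO's actual EventQueue): when a write lock is released, delayed stores/checks are retried in the order of the locking tid that queued them rather than in FIFO order, so the oldest waiter gets the lock first.

    import heapq

    class DelayedQueue(object):
        def __init__(self):
            self._heap = []  # (locking_tid, seq, operation)
            self._seq = 0    # tie-breaker keeping insertion order for equal tids

        def delay(self, locking_tid, operation):
            heapq.heappush(self._heap, (locking_tid, self._seq, operation))
            self._seq += 1

        def retry_all(self):
            # Re-attempt delayed stores/checks, oldest locking tid first.
            while self._heap:
                _, _, operation = heapq.heappop(self._heap)
                operation()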
- 27 Feb, 2017 1 commit
Julien Muchembled authored
This happened in 2 cases:
- Commit a4c06242 ("Review aborting of transactions") introduced a race condition causing oids to remain write-locked forever after the transaction modifying them is aborted.
- An unfinished transaction is not locked/unlocked during tpc_finish: oids must be unlocked when being notified that the transaction is finished.
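The common theme of both cases, as a hedged illustration (hypothetical names only): every write lock still held by a transaction must be dropped as soon as the storage learns that the transaction is over, whether it was aborted or finished, otherwise the oids stay locked forever.

    class LockManager(object):
        def __init__(self):
            self._write_locks = {}  # oid -> ttid of the lock holder

        def lock(self, oid, ttid):
            self._write_locks[oid] = ttid

        def release_all(self, ttid):
            # Called on both abort and finish notifications: drop every lock
            # still held by this transaction so that no oid leaks a write lock.
            for oid, holder in list(self._write_locks.items()):
                if holder == ttid:
                    del self._write_locks[oid]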
- 21 Feb, 2017 3 commits
Julien Muchembled authored
This is a first version with several optimizations possible:
- improve EventQueue (or implement a specific queue) to minimize deadlocks
- turn the RebaseObject packet into a notification

Sorting oids could also be useful to reduce the probability of deadlocks, but that would never be enough to avoid them completely, even if there's a single storage. For example:
1. C1 does a first store (x or y)
2. C2 stores x and y; one is delayed
3. C1 stores the other -> deadlock

When solving the deadlock, the data of the first store may only exist on the storage.

2 functional tests are removed because they're redundant, either with ZODB tests or with the new threaded tests.
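As a hedged aside on the oid-sorting idea mentioned above (plain Python illustration, not NEO code): two writers that always acquire locks in sorted oid order cannot each end up holding one lock while waiting for the other. As the message notes, in NEO this would only lower the probability, because a client does not know all the oids of a transaction in advance.

    import threading

    locks = {b'oid-x': threading.Lock(), b'oid-y': threading.Lock()}

    def store_all(oids):
        ordered = sorted(oids)
        for oid in ordered:          # always lock in sorted oid order
            locks[oid].acquire()
        try:
            pass                     # ... send the stores to the storage ...
        finally:
            for oid in reversed(ordered):
                locks[oid].release()

    t1 = threading.Thread(target=store_all, args=([b'oid-x', b'oid-y'],))
    t2 = threading.Thread(target=store_all, args=([b'oid-y', b'oid-x'],))
    t1.start(); t2.start(); t1.join(); t2.join()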
Julien Muchembled authored
- Make sure that errors while processing a delayed packet are reported to the connection that sent this packet.
- Provide a mechanism to process events for the same connection in chronological order.
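A minimal sketch of both points (hypothetical names, not NEO's actual event machinery): delayed events keep a reference to the connection that produced them, are replayed per connection in FIFO (chronological) order, and any failure is reported back to that same connection.

    from collections import deque

    class ConnectionEventQueue(object):
        def __init__(self):
            self._queues = {}  # connection -> deque of delayed callables

        def delay(self, conn, event):
            self._queues.setdefault(conn, deque()).append(event)

        def process(self, conn):
            queue = self._queues.get(conn, deque())
            while queue:
                event = queue.popleft()
                try:
                    event()
                except Exception as e:
                    # Report the failure to the connection that sent the packet
                    # (send_error is a hypothetical method on the connection).
                    conn.send_error(str(e))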
Julien Muchembled authored
- 14 Feb, 2017 3 commits
Julien Muchembled authored
Fix conflict handling after a successful store to a node being disconnected for having missed a transaction
Julien Muchembled authored
Julien Muchembled authored
- 02 Feb, 2017 4 commits
Julien Muchembled authored
Now that we do inequality comparisons between timestamps, the master must use a monotonic clock, to avoid issues when the clock is turned back. Before, the probability that time.time() returned the same value again was probably negligible.
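A hedged sketch of one way to do this (not NEO's actual TID allocator; assumes Python 3's time.monotonic() is available): anchor wall-clock time once and advance it with the monotonic clock, so the values used for comparisons never go backwards even if the system clock is turned back.

    import time

    class MonotonicWallClock(object):
        def __init__(self):
            self._wall0 = time.time()
            self._mono0 = time.monotonic()
            self._last = self._wall0

        def now(self):
            # Wall-clock anchor plus monotonic elapsed time.
            t = self._wall0 + (time.monotonic() - self._mono0)
            # Never return a smaller value than previously returned.
            self._last = max(self._last, t)
            return self._last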
Julien Muchembled authored
It was disabled a long time ago, and NEO has evolved in such a way that the new implementation will be completely different.
Julien Muchembled authored
It's dead code, because 1 year after it was introduced, something else was implemented to detect deadlocks immediately. Anyway, it would be an unacceptable way to detect them.
Julien Muchembled authored
- 18 Jan, 2017 1 commit
Julien Muchembled authored
- 23 Dec, 2016 1 commit
Julien Muchembled authored
- 27 Nov, 2016 2 commits
Julien Muchembled authored
Therefore, a client node in the node manager is always RUNNING.
Julien Muchembled authored
- 15 Nov, 2016 1 commit
Kirill Smelkov authored
A backup cluster for tids <= backup_tid has all data to provide regular read-only ZODB service. Having regular ZODB access to the data can be handy e.g. for externally verifying data for consistency between main and backup clusters. Peeking around without disturbing the main cluster might also be useful sometimes.

In this patch:
- master & storage nodes are taught:
  * to instantiate a read-only or regular client service handler depending on cluster state:
      RUNNING   -> regular
      BACKINGUP -> read-only
  * in the read-only client handler:
      + to reject write-related operations
      + to provide read operations but adjust semantics as if last_tid in the database were = backup_tid
- a new READ_ONLY_ACCESS protocol error code is introduced so that the client can raise POSException.ReadOnlyError upon receiving it.

I have not implemented a back-channel for invalidations in read-only mode (yet ?). This way, once a client connects to a cluster in backup state, it won't see new data fetched by the backup cluster from upstream after the client connected. The reason invalidations are not implemented is that for now (imho) there is no off-hand ready infrastructure to get updates from the replicating node on a transaction-by-transaction basis (it currently only notifies when a whole batch is done). For consistency verification (the main reason for this patch) we also don't need invalidations to work, as in that task we always connect afresh to the backup. So I simply only put relevant TODOs about invalidations for now.

The patch is not very polished but should work.

/reviewed-on nexedi/neoppod!4
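A hedged sketch of the handler split (hypothetical class and method names; only READ_ONLY_ACCESS and POSException.ReadOnlyError come from the message above): the read-only handler rejects writes with the new error code and caps last_tid at backup_tid, while the client maps that code to ZODB's ReadOnlyError.

    from ZODB.POSException import ReadOnlyError

    READ_ONLY_ACCESS = 'READ_ONLY_ACCESS'   # stand-in for the protocol error code

    class ReadOnlyClientHandler(object):
        def __init__(self, backup_tid):
            self.backup_tid = backup_tid

        def askStoreObject(self, conn, *args):
            # Write-related operations are rejected with the new error code.
            conn.answer_error(READ_ONLY_ACCESS, 'cluster is backing up')

        def askLastTransaction(self, conn):
            # Read operations work, but last_tid is capped at backup_tid.
            conn.answer(self.backup_tid)

    def on_error(code, message):
        # Client side: translate the protocol error into the ZODB exception.
        if code == READ_ONLY_ACCESS:
            raise ReadOnlyError(message)
        raise RuntimeError('%s: %s' % (code, message))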
- 01 Aug, 2016 1 commit
Julien Muchembled authored
- 22 Mar, 2016 1 commit
Julien Muchembled authored
- 25 Jan, 2016 1 commit
Julien Muchembled authored
- 30 Nov, 2015 1 commit
Julien Muchembled authored
NEO did not ensure that all data and metadata are written on disk before tpc_finish, and it was for example vulnerable to ENOSPC errors. In other words, some work had to be moved to tpc_vote:

- In tpc_vote, all involved storage nodes are now asked to write all metadata to ttrans/tobj and _commit_. Because the final tid is not known yet, the tid column of ttrans and tobj now contains NULL and the ttid respectively.
- In tpc_finish, AskLockInformation is still required for read locking; ttrans.tid is updated with the final value and this change is _committed_.
- The verification phase is greatly simplified, more reliable and faster. For all voted transactions, we can know if a tpc_finish was started by getting the final tid from the ttid, either from ttrans or from trans. And we know that such transactions can't be partial, so we don't need to check oids.

So in addition to minimizing the risk of failures during tpc_finish, we also fix a bug causing the verification phase to discard transactions with objects for which readCurrent was called.

On the performance side:
- Although tpc_vote now asks all involved storages, instead of only those storing the transaction metadata, the client has been improved to do this in parallel. The additional commits are also all done in parallel.
- A possible improvement to compensate for the additional commits is to delay the commit done by the unlock.
- By minimizing the time to lock transactions, objects are read-locked for a much shorter period. This is even more important given that locked transactions must be unlocked in the same order.

Transactions with too many modified objects will now time out inside tpc_vote instead of tpc_finish. Of course, such transactions may still cause other transactions to time out in tpc_finish.
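A hedged sketch of the vote/finish split described above (hypothetical backend API, not NEO's DatabaseManager): everything that can fail on disk (e.g. ENOSPC) is written and committed during tpc_vote under the ttid, so tpc_finish only has to bind the final tid and commit that small update.

    def tpc_vote(db, ttid, metadata, objects):
        # Durably record the transaction under its temporary id (ttid).
        db.write_ttrans(ttid, metadata)    # tid column left NULL for now
        db.write_tobj(ttid, objects)       # rows keyed by the ttid
        db.commit()                        # any disk-full error surfaces here

    def tpc_finish(db, ttid, tid):
        # Cheap and unlikely to fail: bind the final tid to the voted rows.
        db.set_final_tid(ttid, tid)        # e.g. UPDATE ttrans SET tid=... WHERE ttid=...
        db.commit()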
- 28 Aug, 2015 1 commit
Julien Muchembled authored
This fixes a random failure in testClientReconnection:

    Traceback (most recent call last):
      File "neo/tests/threaded/test.py", line 754, in testClientReconnection
        self.assertTrue(cluster.client.history(x1._p_oid))
    failureException: None is not true
- 15 Jun, 2015 1 commit
Julien Muchembled authored
Limiting the size of the data.value column to 16 MB saves 1 byte by switching to MEDIUMBLOB, and it avoids the need for big redo logs in InnoDB.
- 21 May, 2015 1 commit
Julien Muchembled authored
- 24 Jun, 2014 1 commit
Julien Muchembled authored
- 07 Jan, 2014 1 commit
Julien Muchembled authored
- 28 Oct, 2013 2 commits
Vincent Pelletier authored
This fixes a bug causing a crash during the tpc_finish phase when a storage involved in a transaction does not receive any object but receives at least one CheckCurrentSerial request: no transaction was registered, and the storage would fail to lock the transaction when requested by the master during tpc_finish.
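A hedged sketch of the underlying idea (hypothetical names, not NEO's actual transaction manager): the transaction must be registered as soon as any request referencing it arrives, whether it is a store or only a CheckCurrentSerial, so that the master's lock request during tpc_finish always finds it.

    class TransactionRegistry(object):
        def __init__(self):
            self._transactions = {}

        def register(self, ttid):
            return self._transactions.setdefault(ttid, {'stored': [], 'checked': []})

        def store_object(self, ttid, oid, data):
            self.register(ttid)['stored'].append((oid, data))

        def check_current_serial(self, ttid, oid, serial):
            # Registering here is the point of the fix: a transaction made of
            # readCurrent checks only would otherwise be unknown at lock time.
            self.register(ttid)['checked'].append((oid, serial))

        def lock(self, ttid):
            if ttid not in self._transactions:
                raise KeyError('unknown transaction %r' % (ttid,))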
Vincent Pelletier authored
- 20 Mar, 2012 1 commit
Julien Muchembled authored
- 13 Mar, 2012 1 commit
Julien Muchembled authored
- 24 Feb, 2012 1 commit
Julien Muchembled authored
Replication is also fully reimplemented:
- It is not done anymore on whole partitions.
- It runs at the lowest priority so as not to degrade performance for client nodes.

The schema of the MySQL table is changed to optimize the storage layout: rows are now grouped by age, for good partial replication performance. This certainly also speeds up simple loads/stores.
- 07 Feb, 2012 1 commit
Vincent Pelletier authored
- 17 Jan, 2012 1 commit
Julien Muchembled authored
- 26 Oct, 2011 1 commit
Julien Muchembled authored
- 11 Oct, 2011 2 commits
Julien Muchembled authored
- Change protocol to use SHA1 for all checksums:
  - Use SHA1 instead of CRC32 for data checksums.
  - Use SHA1 instead of MD5 for replication.
- Change DatabaseManager API so that backends can store raw data separately from object metadata:
  - When processing AskStoreObject, call the backend to store the data immediately, instead of keeping it in RAM or in the temporary object table. Data is then referenced only by its checksum. Without such change, the storage could fail to store the transaction due to lack of RAM, or it could make the tpc_finish step very slow.
  - Backends have to store data in a separate space, and remove entries as soon as they get unreferenced. So they must have an index of checksums in object metadata space. A new '_uncommitted_data' backend attribute keeps references of uncommitted data.
  - New methods: _pruneData, _storeData, storeData, unlockData
  - MySQL: change vertical partitioning of 'obj' by having data in a separate 'data' table instead of using a shortened 'obj_short' table.
  - BTree: data is moved from '_obj' to a new '_data' btree.
- Undo is optimized so that backpointers are not required anymore to fetch data:
  - The checksum of an object is None only when creation is undone.
  - Removed DatabaseManager methods: _getObjectData, _getDataTIDFromData
  - DatabaseManager: move some code from _getDataTID to findUndoTID so that _getDataTID only has what's specific to backend.
- Removed because already covered by ZODB tests:
  - neo.tests.storage.testStorageDBTests.StorageDBTests.test__getDataTID
  - neo.tests.storage.testStorageDBTests.StorageDBTests.test__getDataTIDFromData
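A hedged sketch of the "data referenced only by its checksum" idea (hypothetical in-memory backend, not NEO's DatabaseManager): raw data is stored once, keyed by its SHA1 digest, object metadata keeps only the checksum, and entries are pruned as soon as nothing references them anymore.

    import hashlib

    class DataStore(object):
        def __init__(self):
            self._data = {}      # sha1 digest -> raw value
            self._refcount = {}  # sha1 digest -> number of references

        def store(self, value):
            checksum = hashlib.sha1(value).digest()
            self._data.setdefault(checksum, value)
            self._refcount[checksum] = self._refcount.get(checksum, 0) + 1
            return checksum

        def prune(self, checksum):
            # Called when an object row referencing this checksum is removed.
            count = self._refcount.get(checksum, 0) - 1
            if count <= 0:
                self._data.pop(checksum, None)
                self._refcount.pop(checksum, None)
            else:
                self._refcount[checksum] = count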
Julien Muchembled authored
This changes how NEO stores undo information and how it is transmitted on the network.