- 12 Jun, 2017 5 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
The most important change is that it no longer discards readable cells too quickly. A partition can now have multiple FEEDING cells, to avoid going below the wanted level of replication. The new algorithm is also better at minimizing the amount of replication.
-
Julien Muchembled authored
The MySQL implementation is written to work around the issue reported at https://jira.mariadb.org/browse/MDEV-12867
-
Julien Muchembled authored
-
- 12 May, 2017 5 commits
-
-
Julien Muchembled authored
Since it is no longer worth keeping track of the last connection activity (which, btw, ignored TCP ACKs, i.e. timeouts could theoretically be triggered before all the data were actually sent), the semantics of closeClient have also changed. Before this commit, the 1-minute timeout was reset whenever there was activity (connection still used as server). Now, it happens exactly 1 minute after the connection is not used anymore as client.
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 11 May, 2017 1 commit
-
-
Julien Muchembled authored
The next line (MTClientConnection) already logs new connections and the storage node is necessarily in RUNNING state.
-
- 10 May, 2017 2 commits
-
-
Julien Muchembled authored
Now, the primary master is the running master with None displayed in the last column. Before, this column could show the id timestamp from when it was secondary, which was obsolete information.
-
Julien Muchembled authored
This fixes up commit 23b6a66a, which reimplements election.

  poll raised, retrying
  Traceback (most recent call last):
    ...
    File "neo/client/handlers/master.py", line 41, in notPrimaryMaster
      super(PrimaryNotificationsHandler, self).notPrimaryMaster(*args)
    File "neo/lib/handler.py", line 157, in notPrimaryMaster
      assert primary != self.app.server
    File "neo/client/app.py", line 109, in __getattr__
      return self.__getattribute__(attr)
  AttributeError: 'Application' object has no attribute 'server'
-
- 04 May, 2017 1 commit
-
-
Julien Muchembled authored
-
- 02 May, 2017 1 commit
-
-
Julien Muchembled authored
This fixes the following crash:

  Traceback (most recent call last):
    ...
    File "neo/master/handlers/identification.py", line 94, in requestIdentification
      uuid = app.getNewUUID(uuid, address, node_type)
    File "neo/master/app.py", line 449, in getNewUUID
      assert uuid != self.uuid
  AssertionError
-
- 28 Apr, 2017 3 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
This really fixes the bug described in commit 40bac312, which could probably be reverted. It only reduced the probability of failure.

What happened is that the second conflict on 'a' for t3 was first reported by an answer to the first store with:
- a base serial at which a=0
- a conflict serial at which a=7

However, the cached data is not 8 anymore but 12, since a second store already occurred after the first conflict (reported by the other storage node). When this conflict was resolved before receiving the conflict for the second store, it gave:

  resolve(old=0, saved=7, new=12) -> 19

instead of:

  resolve(old=4, saved=7, new=12) -> 15

(if we still had the data of the first store, we could also do resolve(old=0, saved=7, new=8) but that would be inefficient from a memory point of view)

The bug was difficult to reproduce. testNotifyReplicated had to be run many times before race conditions triggered it. The test was changed to enforce some of these race conditions, and the above scenario now happens almost always.
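The numbers above are consistent with a counter-like object whose conflict resolution applies the local increment (new - old) on top of the committed value (saved). The following is only a sketch under that assumption. The resolve function below is hypothetical and is not the code used by the test.

```python
def resolve(old, saved, new):
    """Assumed counter-style resolution: replay our delta on the saved value.

    old:   value at the base serial our change started from
    saved: value committed by the conflicting transaction
    new:   value we are trying to store
    """
    return saved + (new - old)


# Wrong base (a=0) combined with the data of the second store (12):
wrong = resolve(old=0, saved=7, new=12)

# Correct base (a=4) for the data of the second store:
right = resolve(old=4, saved=7, new=12)

# Equivalent result if the data of the first store (8) were still cached:
alternative = resolve(old=0, saved=7, new=8)
```

With these definitions, wrong is 19 while right and alternative are both 15, which matches the values quoted in the commit message.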
-
- 27 Apr, 2017 7 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
- atomic write to disk to avoid corruption
- update when the address changes (not only when a node is removed/added)
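A common way to get an atomic write to disk is to write to a temporary file in the same directory and then rename it over the target. The sketch below illustrates that general technique only. The atomic_write helper is hypothetical and nothing here is taken from the commit itself.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes), all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reached the disk
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A reader of the file sees either the old content or the new one, never a partially written file.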
-
Julien Muchembled authored
-
Julien Muchembled authored
This fixes 2 issues:
- Because neoctl connects to admin nodes without requesting identification, the protocol version was not checked, which could even be dangerous (think of a user asking for information, but the packet sent by neoctl could be decoded as a packet to alter data, like Truncate).
- In case of mismatched protocol version, the error was not logged on the node that initiated the connection.

Compatibility is handled as follows:
- For an old node receiving data from a new node, the 2 high order bytes of the packet id, which are always 0 for the first packet, are decoded as the packet code. Packet 0 has never existed, which results in PacketMalformedError.
- For a new node receiving data from an old node, the id of the first packet, which is always 0, is decoded as the version, which results in a version mismatch error.

This new protocol also guarantees that there's no conflict with SSL. For simplification, the packet length does not count the header anymore.
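The two compatibility cases can be illustrated with the struct module. This is a sketch under assumptions that are not stated in the commit message: a packet header laid out as a 4-byte id, a 2-byte code and a 4-byte length, a 4-byte version sent before the first packet, and a version value of 1. All names below are hypothetical.

```python
import struct

HEADER = struct.Struct("!LHL")  # assumed layout: packet id, code, length
VERSION = struct.Struct("!L")   # assumed: 4-byte version sent first
PROTOCOL_VERSION = 1            # hypothetical value


def first_bytes_from_new_node(code, payload):
    # New protocol: version first, and the length excludes the header.
    return (VERSION.pack(PROTOCOL_VERSION)
            + HEADER.pack(0, code, len(payload)) + payload)


def first_bytes_from_old_node(code, payload):
    # Old protocol: no version, and the length counted the header.
    return HEADER.pack(0, code, HEADER.size + len(payload)) + payload


def old_node_decode_code(data):
    # An old node reads a header straight away: the version lands in the
    # id field and the 2 high order bytes of the real id land in the code.
    _msg_id, code, _length = HEADER.unpack_from(data)
    return code


def new_node_version_matches(data):
    # A new node reads the first 4 bytes as the version.
    (version,) = VERSION.unpack_from(data)
    return version == PROTOCOL_VERSION


code_seen_by_old = old_node_decode_code(first_bytes_from_new_node(0x1234, b"x"))
accepted_by_new = new_node_version_matches(first_bytes_from_old_node(0x1234, b"x"))
```

Under these assumptions, code_seen_by_old is 0 (a packet code that never existed) and accepted_by_new is False (the id 0 is read as version 0), so both sides reject the connection.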
-
- 25 Apr, 2017 4 commits
-
-
Julien Muchembled authored
When using network byte order ('!'), the size of struct items is independent of the platform. These sizes have never changed from one version of Python to another.
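This can be checked with struct.calcsize. The format strings below are arbitrary examples chosen for illustration, not formats taken from the commit.

```python
import struct

# With '!' (network byte order), sizes are standard and no padding is added.
sizes = {fmt: struct.calcsize(fmt) for fmt in ("!B", "!H", "!L", "!Q", "!LHL")}

# With the default native mode ('@'), sizes and alignment depend on the
# platform and compiler, so no fixed value can be assumed here.
native_size = struct.calcsize("LHL")
```

In standard mode, "!LHL" always takes 4 + 2 + 4 = 10 bytes, whereas the native size may include alignment padding and a wider "L".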
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 24 Apr, 2017 6 commits
-
-
Julien Muchembled authored
The election is not a separate process anymore. It happens during the RECOVERING phase, and there's no use of timeouts anymore. Each master node keeps a timestamp of when it started to play the primary role, and the node with the smallest timestamp is elected. The election stops when the cluster is started: as long as it is operational, the primary master can't be deposed. An election must happen whenever the cluster is not operational anymore, to handle the case of a network cut between a primary master and all other nodes: then another master node (secondary) takes over and when the initial primary master is back, it loses against the new primary master if the cluster is already started.
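The rule described above can be sketched as follows. This is an illustration only: the Master tuple and the elect function are hypothetical, and the real election exchanges this information over the network. What happens when timestamps are equal is not specified here.

```python
from collections import namedtuple

# primary_since: when the node started to play the primary role
# cluster_started: whether the cluster it leads is already started
Master = namedtuple("Master", "name primary_since cluster_started")


def elect(masters):
    """Return the master that wins the election among `masters`."""
    # A primary master whose cluster is started can't be deposed.
    started = [m for m in masters if m.cluster_started]
    candidates = started or masters
    # Otherwise the node with the smallest timestamp is elected.
    return min(candidates, key=lambda m: m.primary_since)


# Normal case during RECOVERING: the oldest primary wins.
winner = elect([Master("m1", 10.0, False), Master("m2", 20.0, False)])

# Network cut: m2 took over and started the cluster, then m1 comes back.
winner_after_cut = elect([Master("m1", 10.0, False), Master("m2", 20.0, True)])
```

In the first case m1 wins. In the second, m1 loses against m2 even though its timestamp is smaller, because the cluster is already started under m2.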
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
In order to do that correctly, this commit contains several other changes:

When connecting to a primary master, a full node list always follows the identification. For storage nodes, this means that they now know all nodes during the RECOVERING phase.

The initial full node list now always contains a node tuple for:
- the server-side node (i.e. the primary master): on a master, this is done by always having a node describing itself in its node manager.
- the client-side node, to make sure it gets an id timestamp: now an admin node also receives a node for itself.
-
Julien Muchembled authored
This keeps the connection fully functional when a handler raises an exception.
-
- 19 Apr, 2017 2 commits
-
-
Julien Muchembled authored
Commits like 7eb7cf1b ("Minimize the amount of work during tpc_finish") dropped what was done in commit 07b48079 ("Ignore some requests, based on connection state") to protect request handlers when they respond. This commit fixes this in a generic way.
-
Julien Muchembled authored
-
- 18 Apr, 2017 3 commits
-
-
Julien Muchembled authored
The initial intention was to rely on stable sorting when several events have the same key. For this to happen, sorting must not continue the comparison with the second item of events.

This could lead to data corruption (conflict resolution with wrong base):

  FAIL: testNotifyReplicated (neo.tests.threaded.test.Test)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "neo/tests/threaded/__init__.py", line 1093, in wrapper
      return wrapped(self, cluster, *args, **kw)
    File "neo/tests/threaded/test.py", line 2019, in testNotifyReplicated
      self.assertEqual([15, 11, 13, 16], [r[x].value for x in 'abcd'])
    File "neo/tests/__init__.py", line 187, in assertEqual
      return super(NeoTestBase, self).assertEqual(first, second, msg=msg)
  failureException: Lists differ: [15, 11, 13, 16] != [19, 11, 13, 16]

  First differing element 0:
  15
  19

  - [15, 11, 13, 16]
  ?   ^

  + [19, 11, 13, 16]
  ?   ^
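The difference between the two ways of sorting can be shown with plain tuples. The event list below is made up for illustration and does not reflect the real event structure.

```python
from operator import itemgetter

# (key, payload) pairs, listed in arrival order.
events = [(2, "b"), (1, "z"), (1, "a")]

# Comparing whole tuples: on equal keys, the comparison continues with
# the second item, so arrival order is lost.
by_tuple = sorted(events)

# Comparing only the key: Python's sort is stable, so events with the
# same key keep their arrival order.
by_key = sorted(events, key=itemgetter(0))
```

Here by_tuple is [(1, "a"), (1, "z"), (2, "b")] whereas by_key is [(1, "z"), (1, "a"), (2, "b")]. Only the second form preserves the order in which the events with key 1 arrived.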
-
Julien Muchembled authored
-
Julien Muchembled authored
'aborted' could appear twice.
-