- 30 Jun, 2017 1 commit
-
-
Julien Muchembled authored
-
- 29 Jun, 2017 1 commit
-
-
Julien Muchembled authored
The explanation became wrong during a git-rebase, when it was decided to keep the old code that drops partitions. The new code needs more work and is kept in a branch.
-
- 16 Jun, 2017 1 commit
-
-
Julien Muchembled authored
-
- 15 Jun, 2017 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
Using NEO 0.9.1 with partitioning enabled, I could reproduce the issue with MariaDB 5.3.5, but not with MariaDB 5.3.12 and 5.5.23, so I suppose it was fixed in MariaDB. In testOudatedCellsOnDownStorage, 'select count(*) from obj' returned a wrong value (always 1). Strangely, 'select count(*) from test_neo0.obj' was always correct (102).
-
- 14 Jun, 2017 1 commit
-
-
Julien Muchembled authored
When 'neo.tests.cluster' is loaded (usually when functional tests are run), __builtin__.pdb is replaced by an extended pdb, which should behave the same way if it is used like the former. winpdb is so slow that a console pdb is often preferred.
-
- 13 Jun, 2017 1 commit
-
-
Julien Muchembled authored
-
- 12 Jun, 2017 7 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
The most important change is that it does not discard readable cells too quickly anymore. A partition can now have multiple FEEDING cells, to avoid going below the wanted level of replication. The new algorithm is also better at minimizing the amount of replication.
-
Julien Muchembled authored
The MySQL implementation is written to work around the issue reported at https://jira.mariadb.org/browse/MDEV-12867
-
Julien Muchembled authored
-
- 12 May, 2017 5 commits
-
-
Julien Muchembled authored
Since it is no longer worth keeping track of the last connection activity (which, btw, ignored TCP ACKs, i.e. timeouts could theoretically be triggered before all the data were actually sent), the semantics of closeClient have also changed. Before this commit, the 1-minute timeout was reset whenever there was activity (connection still used as server). Now, it happens exactly 100 seconds after the connection is not used anymore as client.
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 11 May, 2017 1 commit
-
-
Julien Muchembled authored
The next line (MTClientConnection) already logs new connections and the storage node is necessarily in RUNNING state.
-
- 10 May, 2017 2 commits
-
-
Julien Muchembled authored
Now, the primary master is the running master with None displayed in the last column. Before, it could be the id timestamp of when it was secondary, which was obsolete information.
-
Julien Muchembled authored
This fixes up commit 23b6a66a, which reimplements election.

  poll raised, retrying
  Traceback (most recent call last):
    ...
    File "neo/client/handlers/master.py", line 41, in notPrimaryMaster
      super(PrimaryNotificationsHandler, self).notPrimaryMaster(*args)
    File "neo/lib/handler.py", line 157, in notPrimaryMaster
      assert primary != self.app.server
    File "neo/client/app.py", line 109, in __getattr__
      return self.__getattribute__(attr)
  AttributeError: 'Application' object has no attribute 'server'
-
- 04 May, 2017 1 commit
-
-
Julien Muchembled authored
-
- 02 May, 2017 1 commit
-
-
Julien Muchembled authored
This fixes the following crash:

  Traceback (most recent call last):
    ...
    File "neo/master/handlers/identification.py", line 94, in requestIdentification
      uuid = app.getNewUUID(uuid, address, node_type)
    File "neo/master/app.py", line 449, in getNewUUID
      assert uuid != self.uuid
  AssertionError
-
- 28 Apr, 2017 3 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
This really fixes the bug described in commit 40bac312, which could probably be reverted: it only reduced the probability of failure.

What happened is that the second conflict on 'a' for t3 was first reported by an answer to the first store, with:
- a base serial at which a=0
- a conflict serial at which a=7

However, the cached data is not 8 anymore but 12, since a second store already occurred after the first conflict (reported by the other storage node). When this conflict was resolved before receiving the conflict for the second store, it gave:
  resolve(old=0, saved=7, new=12) -> 19
instead of:
  resolve(old=4, saved=7, new=12) -> 15
(if we still had the data of the first store, we could also do resolve(old=0, saved=7, new=8), but that would be inefficient from a memory point of view)

The bug was difficult to reproduce: testNotifyReplicated had to be run many times before the race conditions triggered it. The test was changed to enforce some of them, and the above scenario now happens almost always.
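The arithmetic in this message can be checked with a small sketch. It assumes that the test object resolves conflicts like an additive counter, i.e. it applies the local delta (new - old) on top of the saved value; this resolve function is a stand-in written for illustration, not the code used by the test.

```python
def resolve(old, saved, new):
    """Counter-style conflict resolution (assumed for illustration).

    old:   value at the base serial the local change started from
    saved: value committed by the concurrent transaction
    new:   value the local transaction wants to store
    """
    return saved + (new - old)


# Stale base (a=0) combined with the data of the second store (12):
wrong = resolve(old=0, saved=7, new=12)
# Base matching the data actually cached (a=4):
right = resolve(old=4, saved=7, new=12)
# Same result if the data of the first store (8) had been kept:
also_right = resolve(old=0, saved=7, new=8)

print(wrong, right, also_right)
```

With a mismatched base, the delta of the first store is counted twice, which is where the extra 4 in the wrong result comes from.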
-
- 27 Apr, 2017 7 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
- atomic write to disk to avoid corruption
- update when the address changes (not only when a node is removed/added)
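An atomic write to disk is commonly done by writing to a temporary file in the same directory and renaming it over the target. The sketch below shows that general pattern under this assumption; atomic_write is a hypothetical helper and is not taken from the commit.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the content of `path` with `data` (bytes) atomically.

    Readers see either the old content or the new one, never a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reached the disk
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Do not leave the temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

If the process dies before the rename, the previous file is left untouched, which is the property that avoids corruption.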
-
Julien Muchembled authored
-
Julien Muchembled authored
This fixes 2 issues:
- Because neoctl connects to admin nodes without requesting identification, the protocol version was not checked, which could even be dangerous (think of a user asking for information, but the packet sent by neoctl could be decoded as a packet to alter data, like Truncate).
- In case of mismatched protocol version, the error was not logged on the node that initiated the connection.

Compatibility is handled as follows:
- For an old node receiving data from a new node, the 2 high order bytes of the packet id, which are always 0 for the first packet, are decoded as the packet code. Packet 0 has never existed, which results in PacketMalformedError.
- For a new node receiving data from an old node, the id of the first packet, which is always 0, is decoded as the version, which results in a version mismatch error.

This new protocol also guarantees that there's no conflict with SSL. For simplification, the packet length does not count the header anymore.
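The two compatibility cases can be illustrated with a small sketch. The layout used here is an assumption made for the example: a 4-byte version sent first by new nodes, followed by a header made of a 4-byte packet id, a 2-byte packet code and a 4-byte length. All names and the version value are hypothetical.

```python
import struct

# Assumed layout, for illustration only.
PROTOCOL_VERSION = 1             # hypothetical value, must be non-zero
VERSION = struct.Struct("!L")
HEADER = struct.Struct("!LHL")   # packet id, packet code, length


class VersionMismatch(Exception):
    pass


def check_version(data):
    """New node: decode the first 4 bytes received as the peer's version."""
    (peer_version,) = VERSION.unpack_from(data)
    if peer_version != PROTOCOL_VERSION:
        raise VersionMismatch(peer_version)
    return data[VERSION.size:]


def new_node_first_bytes(code, payload=b""):
    # New protocol: version first, and the length excludes the header.
    return (VERSION.pack(PROTOCOL_VERSION)
            + HEADER.pack(0, code, len(payload)) + payload)


def old_node_first_bytes(code, payload=b""):
    # Old protocol: no version, and the length includes the header.
    return HEADER.pack(0, code, HEADER.size + len(payload)) + payload


def old_node_decode(data):
    """Old node: decode the first bytes received directly as a header."""
    return HEADER.unpack_from(data)


# Old node reading a new node: the version lands in the id field and the
# 2 high order bytes of the real packet id (0) land in the code field.
msg_id, code, _ = old_node_decode(new_node_first_bytes(code=5))
print(msg_id, code)

# New node reading an old node: the packet id (0) is taken as the version.
try:
    check_version(old_node_first_bytes(code=5))
except VersionMismatch as e:
    print("version mismatch:", e.args[0])
```

In both directions the first bytes fail to decode as something valid, so neither side can misinterpret the other's first packet.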
-
- 25 Apr, 2017 4 commits
-
-
Julien Muchembled authored
When using network byte order ('!'), the size of struct items is independent of the platform. They have never changed from one version of Python to another.
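This can be verified with the standard struct module. The format strings below are chosen only as examples and are not necessarily the ones used by the project.

```python
import struct

# With '!' (network byte order), standard sizes are used and no padding
# is inserted, whatever the platform.
STANDARD_SIZES = {"!B": 1, "!H": 2, "!L": 4, "!Q": 8, "!d": 8}

for fmt, size in STANDARD_SIZES.items():
    assert struct.calcsize(fmt) == size, fmt

# No alignment padding between items: 1 + 2 + 4 bytes.
print(struct.calcsize("!BHL"))

# By contrast, native mode ('@') depends on the platform's C types and
# alignment rules, so this value may differ from one machine to another.
print(struct.calcsize("@L"))
```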
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 24 Apr, 2017 2 commits
-
-
Julien Muchembled authored
The election is not a separate process anymore. It happens during the RECOVERING phase, and there's no use of timeouts anymore. Each master node keeps a timestamp of when it started to play the primary role, and the node with the smallest timestamp is elected. The election stops when the cluster is started: as long as it is operational, the primary master can't be deposed. An election must happen whenever the cluster is not operational anymore, to handle the case of a network cut between a primary master and all other nodes: then another master node (secondary) takes over and when the initial primary master is back, it loses against the new primary master if the cluster is already started.
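The selection rule described above can be sketched as follows. This is a simplified reading of the message, not the actual implementation: the Master record and its fields are hypothetical, and the rule "a primary master of a started cluster can't be deposed" is modelled by restricting the candidates to masters whose cluster is already started, when there are any.

```python
from collections import namedtuple

# Hypothetical record; the field names are illustrative only.
# primary_since: timestamp of when the node started to play the primary role
# cluster_started: whether this node already leads a started cluster
Master = namedtuple("Master", "name primary_since cluster_started")


def elect(masters):
    """Return the master that wins the election (simplified sketch)."""
    # A primary master whose cluster is started cannot be deposed.
    started = [m for m in masters if m.cluster_started]
    candidates = started or masters
    # Otherwise, the node with the smallest timestamp is elected.
    return min(candidates, key=lambda m: m.primary_since)


# During recovery, the oldest primary candidate wins.
print(elect([Master("m1", 10.0, False), Master("m2", 12.0, False)]).name)

# After a network cut, m2 took over and started the cluster: when m1 comes
# back, it loses despite its smaller timestamp.
print(elect([Master("m1", 10.0, False), Master("m2", 12.0, True)]).name)
```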
-
Julien Muchembled authored
-