- 28 Aug, 2015 1 commit
-
-
Julien Muchembled authored
Deadlocks mainly happened while stopping a cluster, hence the complete review of NEOCluster.stop(). A major change is to make the client node handle its lock like other nodes (i.e. in the polling thread itself), to better know when to call Serialized.background(): there was a race condition with the test of 'self.poll_thread.isAlive()' in ClientApplication.close.
-
- 14 Aug, 2015 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
For example, a backup storage node that was rejected because the upstream cluster was not ready could reconnect in a loop without delay, using 100% CPU and flooding logs. A new 'setReconnectionNoDelay' method on Connection can be used for cases where it's legitimate to reconnect quickly. With this new delayed reconnection, it's possible to remove the remaining time.sleep().
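The delayed reconnection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of the commit: the `ReconnectPolicy` class, its method names and the 1-second default delay are hypothetical. Only the idea of a minimum delay between attempts, with an explicit "no delay" override, comes from the message.

```python
import time


class ReconnectPolicy(object):
    """Hypothetical helper enforcing a minimum delay between connection attempts."""

    def __init__(self, delay=1.0, clock=time.monotonic):
        self.delay = delay                  # assumed default, not taken from the commit
        self.clock = clock                  # injectable clock, convenient for tests
        self.next_attempt = float('-inf')   # earliest time a new attempt is allowed
        self.no_delay = False

    def set_reconnection_no_delay(self):
        # One-shot override, for cases where reconnecting at once is legitimate.
        self.no_delay = True

    def time_until_next_attempt(self):
        # Seconds the caller should still wait before trying to connect again.
        if self.no_delay:
            return 0.0
        return max(0.0, self.next_attempt - self.clock())

    def attempt_started(self):
        # Called when a connection attempt begins: clears the override
        # and arms the delay for the following attempt.
        self.no_delay = False
        self.next_attempt = self.clock() + self.delay
```

Because the remaining wait is returned as a number rather than slept on, an event loop can fold it into its poll timeout instead of blocking in time.sleep().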
-
- 12 Aug, 2015 16 commits
-
-
Julien Muchembled authored
This kind of test has never helped to detect regressions, and any bug in EpollEventManager would be quickly reported by other tests. testConnection may go the same way if it keeps annoying me too much.
-
Julien Muchembled authored
This is currently not an issue because the 'time.sleep(1)' calls in iterateForObject (storage) and _connectToPrimaryNode (master) leave enough time. What could happen is a new connection attempt for a node that already has a connection (causing an assertion failure in Node.setConnection).
-
Julien Muchembled authored
This could happen if a file descriptor was reallocated by the kernel.
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
With this patch, the epoll object is no longer woken up every second to check whether a timeout has expired. The API of Connection is changed to get the smallest timeout.
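As a rough sketch of the idea (not the actual Connection API, which is not shown here), a tickless loop derives its poll timeout from the nearest pending deadline instead of waking up at a fixed interval. The function name `poll_timeout` and the representation of deadlines as absolute monotonic times are assumptions made for illustration.

```python
import time


def poll_timeout(deadlines, now=None):
    """Return the timeout to pass to a blocking poll call.

    `deadlines` is an iterable of absolute times in seconds, with None
    for connections that have no timeout armed (hypothetical convention).
    Returns None when nothing is pending, meaning "block until an I/O
    event arrives", otherwise a non-negative number of seconds.
    """
    if now is None:
        now = time.monotonic()
    pending = [d for d in deadlines if d is not None]
    if not pending:
        return None
    # A deadline that is already in the past yields 0: poll without blocking.
    return max(0.0, min(pending) - now)
```

The result would then be handed to the poll call of the event manager. In Python 3, for instance, select.epoll.poll() accepts a timeout in seconds and blocks indefinitely when given None.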
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
This is a prerequisite for tickless poll loops.
-
Julien Muchembled authored
-
Julien Muchembled authored
This mainly changes several methods to lock automatically instead of asserting that the caller did it. This removes any overhead for non-MT classes, and the use of 'with' instead of lock/unlock methods also simplifies the API.
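One possible way to get this behaviour is sketched below, under the assumption that thread-safe variants are subclasses wrapping selected methods. The `locked` decorator and the `Counter`/`MTCounter` classes are hypothetical names chosen for illustration and are not taken from the code base.

```python
import functools
import threading


def locked(method):
    """Wrap a method so that it takes self.lock itself, via 'with'."""
    @functools.wraps(method)
    def wrapper(self, *args, **kw):
        with self.lock:
            return method(self, *args, **kw)
    return wrapper


class Counter(object):
    """Non-MT class: no lock attribute, no wrapper, hence no locking overhead."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


class MTCounter(Counter):
    """Thread-safe variant: callers no longer have to lock before calling add()."""

    def __init__(self):
        super(MTCounter, self).__init__()
        # Reentrant, so a locked method may call another locked method.
        self.lock = threading.RLock()

    add = locked(Counter.add)
```

With this layout, only the multi-threaded subclass pays for acquiring the lock, and the 'with' statement guarantees the release even when the wrapped method raises.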
-
Julien Muchembled authored
-
Julien Muchembled authored
Shutdown is implicit because we don't duplicate sockets.
-
Julien Muchembled authored
-
Julien Muchembled authored
- For all threads except the main one, the id is displayed instead of the name, because the latter is not always unique.
- Outputs may be interleaved by concurrent threads, so tracebacks are also prefixed by their thread idents.
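The two points above could look roughly like this. It is a minimal sketch assuming Python 3's threading module, and the helper names `thread_prefix` and `prefix_lines` are hypothetical. The real logging code is not shown in this log.

```python
import threading


def thread_prefix():
    """Name for the main thread, numeric ident for every other thread."""
    thread = threading.current_thread()
    if thread is threading.main_thread():
        return thread.name
    # Names of other threads are not always unique, idents of live threads are.
    return str(thread.ident)


def prefix_lines(text, prefix):
    """Prefix every line, so interleaved tracebacks can be told apart."""
    return "\n".join("%s: %s" % (prefix, line) for line in text.splitlines())
```

A traceback formatted with traceback.format_exc() could then be passed through prefix_lines(..., thread_prefix()) before being written to the log.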
-
Julien Muchembled authored
-
- 28 Jul, 2015 1 commit
-
-
Julien Muchembled authored
-
- 13 Jul, 2015 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 10 Jul, 2015 1 commit
-
-
Julien Muchembled authored
-
- 09 Jul, 2015 1 commit
-
-
Julien Muchembled authored
-
- 03 Jul, 2015 3 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 01 Jul, 2015 1 commit
-
-
Julien Muchembled authored
-
- 30 Jun, 2015 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 29 Jun, 2015 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 24 Jun, 2015 8 commits
-
-
Julien Muchembled authored
When the connection to the primary master node is lost, the node manager no longer has a reliable list of running nodes, so iterateForObject() must not retry any cell.
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
Since transactions have metadata like a description, it may be useful to allow empty ones. But the behaviour of FileStorage is to silently drop them, so we may have to do the same in the future. An application that is not supposed to commit empty transactions should write its own unit test to prevent this.
-
Julien Muchembled authored
This happened between storage nodes of different clusters because they're not informed about their state, e.g. a dead upstream storage node. In any case, logs were flooded, at 100% CPU usage.
-