Commit df7fc4d8 authored by Tim Peters

_ConnectionPool._reduce_size(): when forgetting a Connection
due to exceeding pool_size available connections, clear its
cache right away.  Because such a connection can never be in
the open state again, hanging on to resources in its cache is
just wasteful.  This was reported as "a problem" on zodb-dev
recently, although it's unclear how the poster got into a state
where it mattered so much.
parent bcf80d21
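
The behavior described in the commit message can be sketched with a toy model. This is not the ZODB implementation: `ToyConnection`, `ToyPool`, `push()` and `repush()` are hypothetical stand-ins introduced only for illustration, and only the `available`/`all` lists, `pool_size`, `_reduce_size()` and `_resetCache()` names are borrowed from the diff below.

```python
class ToyConnection:
    """Hypothetical stand-in for a Connection (not the real ZODB class)."""

    def __init__(self, name):
        self.name = name
        self._cache = {}          # a plain dict plays the role of the cache

    def _resetCache(self):
        # Replace the cache with a fresh, empty one.
        self._cache = {}


class ToyPool:
    """Hypothetical, simplified pool; assumes one pool and no threading."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.all = []             # every connection the pool still tracks
        self.available = []       # closed connections, oldest first

    def push(self, conn):
        # Register a newly opened connection.
        self.all.append(conn)

    def repush(self, conn):
        # Called when `conn` is closed: make it available, then trim.
        self.available.append(conn)
        self._reduce_size(self.pool_size)

    def _reduce_size(self, target):
        while len(self.available) > target:
            c = self.available.pop(0)   # forget the connection closed first
            self.all.remove(c)
            c._resetCache()             # release cached resources right away


pool = ToyPool(pool_size=2)
conns = [ToyConnection(i) for i in range(3)]
for c in conns:
    pool.push(c)
conns[0]._cache["root"] = object()      # pretend conn 0 cached something

for c in conns:                          # close all three, in order
    pool.repush(c)

print(len(pool.all), len(pool.available), len(conns[0]._cache))
```

In this sketch the third close pushes the number of available connections past `pool_size`, so the connection closed first is dropped from both lists and its cache is emptied, even though `conns` still holds a reference to it.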
@@ -80,6 +80,20 @@ Commit hooks
``beforeCommitHook()`` is now deprecated, and will be removed in ZODB 3.8.
Thanks to Julien Anguenot for contributing code and tests.
Connection management
---------------------
- (3.6b6) When more than ``pool_size`` connections have been closed,
``DB`` forgets the excess (over ``pool_size``) connections closed first.
Python's cyclic garbage collection can take "a long time" to reclaim them
(and may in fact never reclaim them if application code keeps strong
references to them), but such forgotten connections can never be opened
again, so their caches are now cleared at the time ``DB`` forgets them.
Most applications won't notice a difference, but applications that open
many connections, and/or store many large objects in connection caches,
and/or store limited resources (such as RDB connections) in connection
caches may benefit.
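
Why waiting for garbage collection can take "a long time" can be shown with a small, self-contained sketch. It assumes CPython's reference counting plus cyclic collector, and the `Conn` and `Cache` classes are hypothetical: they only mimic the connection/cache reference cycle that the code comments in this commit mention.

```python
import gc
import weakref


class Cache:
    """Hypothetical cache that points back at its owner, forming a cycle."""

    def __init__(self, owner):
        self.owner = owner
        self.data = {}


class Conn:
    """Hypothetical connection holding a cache that refers back to it."""

    def __init__(self):
        self._cache = Cache(self)


was_enabled = gc.isenabled()
gc.disable()                      # keep the cyclic collector from running on its own

c = Conn()
ref = weakref.ref(c)              # lets us observe whether the object still exists
del c                             # drop the only outside reference

# The Conn <-> Cache cycle keeps both objects alive under pure refcounting.
alive_before = ref() is not None

gc.collect()                      # only the cyclic collector can reclaim the pair
alive_after = ref() is not None

if was_enabled:
    gc.enable()

print(alive_before, alive_after)
```

Until a cyclic collection runs, everything reachable from the cache stays allocated, which is the cost the NEWS entry describes. Emptying the cache when the connection is forgotten releases those resources without waiting for the collector.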
ZEO
---
@@ -118,6 +118,19 @@ class _ConnectionPool(object):
        while len(self.available) > target:
            c = self.available.pop(0)
            self.all.remove(c)
            # While application code may still hold a reference to `c`,
            # there's little useful that can be done with this Connection
            # anymore. Its cache may be holding on to limited resources,
            # and we replace the cache with an empty one now so that we
            # don't have to wait for gc to reclaim it. Note that it's not
            # possible for DB.open() to return `c` again: `c` can never
            # be in an open state again.
            # TODO: Perhaps it would be better to break the reference
            # cycles between `c` and `c._cache`, so that refcounting reclaims
            # both right now. But if user code _does_ have a strong
            # reference to `c` now, breaking the cycle would not reclaim `c`
            # now, and `c` would be left in a user-visible crazy state.
            c._resetCache()

    # Pop an available connection and return it, or return None if none are
    # available. In the latter case, the caller should create a new
@@ -527,7 +540,7 @@ class DB(object):
# Tell the connection it belongs to self.
result.open(transaction_manager, mvcc, synch)
# A good time to do some cache cleanup.
self._connectionMap(lambda c: c.cacheGC())
@@ -272,6 +272,54 @@ first popped:
>>> len(pool.available), len(pool.all)
(0, 2)
Next: when a closed Connection is removed from .available due to exceeding
pool_size, that Connection's cache is cleared (this behavior was new in
ZODB 3.6b6). While user code may still hold a reference to that
Connection, once it vanishes from .available it's really not usable for
anything sensible (it can never be in the open state again). Waiting for
gc to reclaim the Connection and its cache eventually works, but that can
take "a long time" and caches can hold on to many objects, and limited
resources (like RDB connections), for the duration.
>>> st.close()
>>> st = Storage()
>>> db = DB(st, pool_size=2)
>>> conn0 = db.open()
>>> len(conn0._cache) # empty now
0
>>> import transaction
>>> conn0.root()['a'] = 1
>>> transaction.commit()
>>> len(conn0._cache) # but now the cache holds the root object
1
Now open more connections so that the total exceeds pool_size (2):
>>> conn1 = db.open()
>>> conn2 = db.open()
>>> pool = db._pools['']
>>> len(pool.all), len(pool.available) # all Connections are in use
(3, 0)
Return pool_size (2) Connections to the pool:
>>> conn0.close()
>>> conn1.close()
>>> len(pool.all), len(pool.available)
(3, 2)
>>> len(conn0._cache) # nothing relevant has changed yet
1
When we close the third connection, conn0 will be booted from .all, and
we expect its cache to be cleared then:
>>> conn2.close()
>>> len(pool.all), len(pool.available)
(2, 2)
>>> len(conn0._cache) # conn0's cache is empty again
0
>>> del conn0, conn1, conn2
Clean up.
>>> st.close()