Commit b63a4b46 authored by Chris McDonough

Merge from 2.7 branch.

New "Transience" implementation.  This implementation offers no new features,
but it is vastly simpler, which makes it easier to maintain.  The older version
used all sorts of (questionable) tricks to attempt to avoid conflicts and to
improve performance, such as using Python Queue-module queues to store action
lists, using an index to quickly look up which "bucket" a transient object
was stored within, and several other persistent objects which attempted to keep
pointers into the data.  The older version also had a lot of "voodoo" code in
it which papered over problems that were apparently caused by its complexity.
This code is now removed/replaced and the implementation is fairly
straightforward.

The newer version is probably much slower (due to the lack of an index, it
needs to scan all "current" buckets to attempt to find a value), but it 
operates reliably under high load. 
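
The index-free lookup described above can be pictured with a minimal sketch.
It is not the code from this commit: the function names, the plain dict
standing in for the _data mapping, and the way timeslices are derived from a
period and timeout are all assumptions made for illustration.

```python
def current_timeslices(now, period, timeout):
    """Return the timeslice keys still within the timeout, newest first.

    Hypothetical helper: a timeslice is assumed to be the current time
    rounded down to a multiple of `period` seconds.
    """
    newest = int(now) - (int(now) % period)
    count = timeout // period + 1
    return [newest - i * period for i in range(count)]


def find_value(data, key, now, period=20, timeout=60):
    """Scan every "current" bucket for `key`, most recent bucket first.

    `data` stands in for the _data mapping of timeslice -> bucket.
    There is no index, so each lookup may touch every current bucket.
    """
    for timeslice in current_timeslices(now, period, timeout):
        bucket = data.get(timeslice)
        if bucket is not None and key in bucket:
            return bucket[key]
    # Not found in any current bucket: treat the key as expired or absent.
    return None


# Buckets keyed by timeslice; the bucket at 20 is older than the timeout.
data = {100: {'a': 1}, 80: {'b': 2}, 20: {'c': 3}}
found_a = find_value(data, 'a', now=105)
found_b = find_value(data, 'b', now=105)
found_c = find_value(data, 'c', now=105)
```

The cost of a lookup in this sketch grows with the number of current buckets
(timeout divided by period), which is the trade-off the message describes:
slower reads in exchange for having no shared index to conflict on.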

This implementation removes backwards compatibility support for transient
object containers persisted via the Zope 2.5.X implementation.  It can be
used against instances created in Zope 2.6.X and later.  After using it
against a database created under one of these versions, it is possible to
move back to an "older" Zope in this range, although it is likely that data
in the TOC will be silently lost when this is done.
parent efe70c9e
......@@ -43,21 +43,8 @@ Data Structures Maintained by a Transient Object Container
"current" bucket, which is the bucket that is contained within the
_data structured with a key equal to the "current" timeslice.
- An "index" which is an OOBTree mapping transient object "key" to
"timeslice", letting us quickly figure out which element in the _data
mapping contains the transient object related to the key. It is
stored as the attribute "_index" of the TOC. When calling code
wants to obtain a Transient Object, its key is looked up in the
index, which returns a timeslice. We ask the _data structure for the
bucket it has stored under that timeslice. Then the bucket is asked
for the object stored under the key. This returns the Transient Object.
- A "last timeslice" integer, which is equal to the "last" timeslice
under which TOC housekeeping operations were performed.
- A "next to deindex" integer, which is a timeslice
representing the next bucket which requires "deindexing"
(the removal of all the keys of the bucket from the index).
- A "max_timeslice" integer, which is equal to the "largest" timeslice
for which there exists a bucket in the _data structure.
When a Transient Object is created via new_or_existing, it is added
to the "current" bucket. As time goes by, the bucket to which the
......@@ -69,41 +56,35 @@ Data Structures Maintained by a Transient Object Container
During the course of normal operations, a TransientObject will move
from an "old" bucket to the "current" bucket many times, as long as
it continues to be accessed. It is possible for a TransientObject
to *never* expire, as long as it is called up out of its TOC often
to never expire, as long as it is called up out of its TOC often
enough.
If a TransientObject is not accessed in the period of time defined by
the TOC's "timeout", it is deindexed and eventually garbage collected.
the TOC's "timeout", it is eventually garbage collected.
How the TransientObjectContainer Determines if a TransientObject is "Current"
A TO is current if it has an entry in the "index". When a TO has an
entry in the index, it implies that the TO resides in a bucket that
is no "older" than the TOC timeout period, based on the bucket's
timeslice.
All "current" timeslice buckets (as specified by the timeout) are
searched for the transient object, most recent bucket first.
Housekeeping: Finalization, Notification, Garbage Collection, and
Bucket Replentishing
Housekeeping: Notification, Garbage Collection, and Bucket
Replentishing
The TOC performs "deindexing", "notification", "garbage
collection", and "bucket replentishing". It performs these tasks
"in-band". This means that the TOC does not maintain a separate
thread that wakes up every so often to do these housekeeping tasks.
Instead, during the course of normal operations, the TOC
opportunistically performs them.
Deindexing is defined as the act of making an "expired" TO
inaccessible (by deleting it from the "index"). After a TO is
deindexed, it may not be used by application code any longer,
although it may "stick around" in a bucket for a while until the
bucket is eventually garbage collected.
Notification is defined as optionally calling a function at TOC
finalization time. The optional function call is user-defined, but
it is managed by the "notifyDestruct" method of the TOC.
The TOC performs "notification", "garbage collection", and "bucket
replentishing". It performs these tasks "in-band". This means that
the TOC does not maintain a separate thread that wakes up every so
often to do these housekeeping tasks. Instead, during the course of
normal operations, the TOC opportunistically performs them.
Garbage collection is defined as deleting "expired" buckets in the
_data structure (the _data structure maps a timeslice to a bucket).
Typically this is done by throwing away one or more buckets in the
_data structure after they expire.
Notification is defined as optionally calling a function at TOC
finalization time against individual transient object contained
within a bucket. The optional function call is user-defined, but it
is managed by the "notifyDel" method of the TOC.
Bucket replentishing is defined as the action of (opportunistically)
creating more buckets to insert into the the _data structure,
......@@ -112,6 +93,9 @@ Bucket Replentishing
will be immediately created thereafter. We create new buckets in
batches to reduce the possibility of conflicts.
Housekeeping is performed on a somewhat random basis to avoid
unnecessary conflicts.
Goals
- A low number of ZODB conflict errors (which reduce performance).
......
......@@ -58,6 +58,7 @@ class TestBase(TestCase):
def setUp(self):
Products.Transience.Transience.time = fauxtime
Products.Transience.TransientObject.time = fauxtime
Products.Transience.Transience.setStrict(1)
self.app = makerequest.makerequest(_getApp())
timeout = self.timeout = 1
sm=TransientObjectContainer(
......@@ -72,6 +73,7 @@ class TestBase(TestCase):
del self.app
Products.Transience.Transience.time = oldtime
Products.Transience.TransientObject.time = oldtime
Products.Transience.Transience.setStrict(0)
class TestLastAccessed(TestBase):
def testLastAccessed(self):
......@@ -92,7 +94,7 @@ class TestLastAccessed(TestBase):
# to get to the next Windows time.time() tick.
fauxtime.sleep(WRITEGRANULARITY + 0.06 * 60)
sdo = self.app.sm.get('TempObject')
assert sdo.getLastAccessed() > la1, (sdo.getLastAccessed(), la1)
self.assert_(sdo.getLastAccessed() > la1)
class TestNotifications(TestBase):
def testAddNotification(self):
......@@ -100,8 +102,8 @@ class TestNotifications(TestBase):
sdo = self.app.sm.new_or_existing('TempObject')
now = fauxtime.time()
k = sdo.get('starttime')
assert type(k) == type(now)
assert k <= now
self.assertEqual(type(k), type(now))
self.assert_(k <= now)
def testDelNotification(self):
self.app.sm.setDelNotificationTarget(delNotificationTarget)
......@@ -110,12 +112,11 @@ class TestNotifications(TestBase):
fauxtime.sleep(timeout + (timeout * .75))
sdo1 = self.app.sm.get('TempObject')
# force the sdm to do housekeeping
self.app.sm._housekeep(self.app.sm._deindex_next() -
self.app.sm._period)
self.app.sm._gc()
now = fauxtime.time()
k = sdo.get('endtime')
assert (type(k) == type(now)), type(k)
assert k <= now, (k, now)
self.assertEqual(type(k), type(now))
self.assert_(k <= now)
def addNotificationTarget(item, context):
item['starttime'] = fauxtime.time()
......
......@@ -25,6 +25,7 @@ class TestTransientObject(TestCase):
def setUp(self):
Products.Transience.Transience.time = fauxtime
Products.Transience.TransientObject.time = fauxtime
Products.Transience.Transience.setStrict(1)
self.errmargin = .20
self.timeout = 60
self.t = TransientObjectContainer('sdc', timeout_mins=self.timeout/60)
......@@ -32,55 +33,56 @@ class TestTransientObject(TestCase):
def tearDown(self):
Products.Transience.Transience.time = oldtime
Products.Transience.TransientObject.time = oldtime
Products.Transience.Transience.setStrict(0)
self.t = None
del self.t
def test_id(self):
t = self.t.new('xyzzy')
assert t.getId() != 'xyzzy'
assert t.getContainerKey() == 'xyzzy'
self.failIfEqual(t.getId(), 'xyzzy') # dont acquire
self.assertEqual(t.getContainerKey(), 'xyzzy')
def test_validate(self):
t = self.t.new('xyzzy')
assert t.isValid()
self.assert_(t.isValid())
t.invalidate()
assert not t.isValid()
self.failIf(t.isValid())
def test_getLastAccessed(self):
t = self.t.new('xyzzy')
ft = fauxtime.time()
assert t.getLastAccessed() <= ft
self.assert_(t.getLastAccessed() <= ft)
def test_getCreated(self):
t = self.t.new('xyzzy')
ft = fauxtime.time()
assert t.getCreated() <= ft
self.assert_(t.getCreated() <= ft)
def test_getLastModifiedUnset(self):
t = self.t.new('xyzzy')
assert t.getLastModified() == None
self.assertEqual(t.getLastModified(), None)
def test_getLastModifiedSet(self):
t = self.t.new('xyzzy')
t['a'] = 1
assert t.getLastModified() is not None
self.failIfEqual(t.getLastModified(), None)
def testSetLastModified(self):
t = self.t.new('xyzzy')
ft = fauxtime.time()
t.setLastModified()
assert t.getLastModified() is not None
self.failIfEqual(t.getLastModified(), None)
def test_setLastAccessed(self):
t = self.t.new('xyzzy')
ft = fauxtime.time()
assert t.getLastAccessed() <= ft
self.assert_(t.getLastAccessed() <= ft)
fauxtime.sleep(self.timeout) # go to sleep past the granuarity
ft2 = fauxtime.time()
t.setLastAccessed()
ft3 = fauxtime.time()
assert t.getLastAccessed() <= ft3
assert t.getLastAccessed() >= ft2
self.assert_(t.getLastAccessed() <= ft3)
self.assert_(t.getLastAccessed() >= ft2)
def _genKeyError(self, t):
return t.get('foobie')
......@@ -91,27 +93,27 @@ class TestTransientObject(TestCase):
def test_dictionaryLike(self):
t = self.t.new('keytest')
t.update(data)
assert t.keys() == data.keys()
assert t.values() == data.values()
assert t.items() == data.items()
self.assertEqual(t.keys(), data.keys())
self.assertEqual(t.values(), data.values())
self.assertEqual(t.items(), data.items())
for k in data.keys():
assert t.get(k) == data.get(k)
assert t.get('foobie') is None
self.assertEqual(t.get(k), data.get(k))
self.assertEqual(t.get('foobie'), None)
self.assertRaises(AttributeError, self._genLenError, t)
assert t.get('foobie',None) is None
assert t.has_key('a')
assert not t.has_key('foobie')
self.assertEqual(t.get('foobie',None), None)
self.assert_(t.has_key('a'))
self.failIf(t.has_key('foobie'))
t.clear()
assert not len(t.keys())
self.assertEqual(len(t.keys()), 0)
def test_TTWDictionary(self):
t = self.t.new('mouthfultest')
t.set('foo', 'bar')
assert t['foo'] == 'bar'
assert t.get('foo') == 'bar'
self.assertEqual(t['foo'], 'bar')
self.assertEqual(t.get('foo'), 'bar')
t.set('foobie', 'blech')
t.delete('foobie')
assert t.get('foobie') is None
self.assertEqual(t.get('foobie'), None)
def test_suite():
......