Commit b2d53a75 authored by Tim Peters

The cache simulation seems good enough to be useful now,

although the toy app I wrote to generate a 500MB trace
file doesn't use MVCC in an essential way (some of the
MVCC simulation code is nevertheless exercised, since
an invalidation of current data in the presence of MVCC
always creates a cache record for the newly non-current
revision).

Still puzzling over what to do when the trace file records
a load hit but the simulated cache gets a miss.  The
old simulation code seemed to assume that a store for the same
oid would show up in the trace file next, and it could get
the info it needed about the missing object from the store
trace.  But that isn't true:  precisely because the load was
a hit in the trace file, the object isn't going to be stored
again "soon" in the trace file.

Here are some actual-vs-simulated hit rate results, for a
20MB cache, with a trace file covering about 9 million
loads, over 3 ZEO client (re)starts:

actual   simulated
------   ---------
  93.1        92.7
  79.8        79.0
  68.0        69.1
      
  81.4        81.1  overall

Since the simulated hit rates are both higher and lower
than the actual hit rates, that argues against a gross
systematic bias in the simulation (although there may
be several systematic biases in opposite directions).
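
The signed gaps behind that observation can be recomputed from the table with a few lines of Python. This is only a sketch: the numbers are copied from the table above, and the variable names are illustrative, not taken from simul.py.

```python
# Hit rates in percent, copied from the table above: (actual, simulated).
rates = [(93.1, 92.7), (79.8, 79.0), (68.0, 69.1)]
overall = (81.4, 81.1)

# Signed error per client (re)start: positive means the simulation
# reported a higher hit rate than the real cache achieved.
errors = [round(sim - act, 1) for act, sim in rates]
overall_error = round(overall[1] - overall[0], 1)

for (act, sim), err in zip(rates, errors):
    print(f"actual {act:5.1f}  simulated {sim:5.1f}  error {err:+.1f}")
print(f"overall error {overall_error:+.1f}")
```

The per-restart errors carry mixed signs, which is the basis for the "no gross systematic bias" reading; three data points cannot rule out offsetting biases.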
parent 9c47277a
...@@ -14,22 +14,22 @@ Enabling Cache Tracing ...@@ -14,22 +14,22 @@ Enabling Cache Tracing
---------------------- ----------------------
To enable cache tracing, you must use a persistent cache (specify a ``client`` To enable cache tracing, you must use a persistent cache (specify a ``client``
name), and set the environment variable ZEO_CACHE_TRACE. The path to the name), and set the environment variable ZEO_CACHE_TRACE to a non-empty
trace file is derived from the path to the persistent cache file by appending value. The path to the trace file is derived from the path to the persistent
".trace". If the file doesn't exist, ZEO will try to create it. If the file cache file by appending ".trace". If the file doesn't exist, ZEO will try to
does exist, it's opened for appending (previous trace information is not create it. If the file does exist, it's opened for appending (previous trace
overwritten). If there are problems with the file, a warning message is information is not overwritten). If there are problems with the file, a
logged. To start or stop tracing, the ZEO client process (typically a Zope warning message is logged. To start or stop tracing, the ZEO client process
application server) must be restarted. (typically a Zope application server) must be restarted.
The trace file can grow pretty quickly; on a moderately loaded server, we The trace file can grow pretty quickly; on a moderately loaded server, we
observed it growing by 5 MB per hour. The file consists of binary records, observed it growing by 7 MB per hour. The file consists of binary records,
each 34 bytes long if 8-byte oids are in use; a detailed description of the each 34 bytes long if 8-byte oids are in use; a detailed description of the
record lay-out is given in stats.py. No sensitive data is logged: data record lay-out is given in stats.py. No sensitive data is logged: data
record sizes and binary object and transaction ids are logged, but no record sizes (but not data records), and binary object and transaction ids
information about object types or names, user names, version names, are logged, but no object pickles, object types or names, user names,
transaction comments, access paths, or machine information such as machine transaction comments, access paths, or machine information (such as machine
name or IP address. name or IP address) are logged.
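
As a rough illustration of the rules described in this section, the sketch below sets the environment variable, derives the trace file name, and estimates how many records a given growth rate implies. The helper names and the cache file path are hypothetical, "1" is an arbitrary non-empty value, and 1 MB is assumed to mean 2**20 bytes.

```python
import os

# Per the revised text, any non-empty value enables tracing; "1" is arbitrary.
# It has to be in the environment before the ZEO client process starts.
os.environ["ZEO_CACHE_TRACE"] = "1"

RECORD_SIZE = 34  # bytes per trace record when 8-byte oids are in use


def trace_path_for(cache_path):
    """Hypothetical helper mirroring the documented rule: append '.trace'."""
    return cache_path + ".trace"


def records_per_hour(mb_per_hour, bytes_per_mb=2**20):
    """Approximate record count implied by a file growth rate."""
    return mb_per_hour * bytes_per_mb // RECORD_SIZE


print(trace_path_for("/tmp/example-1.zec"))  # hypothetical cache file name
print(records_per_hour(7))
```

Under these assumptions, 7 MB per hour corresponds to a little over 200,000 trace records per hour.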
Analyzing a Cache Trace Analyzing a Cache Trace
----------------------- -----------------------
...@@ -40,31 +40,29 @@ essential statistics for each segment of 15 minutes, interspersed with lines ...@@ -40,31 +40,29 @@ essential statistics for each segment of 15 minutes, interspersed with lines
indicating client restarts, followed by a more detailed summary of overall indicating client restarts, followed by a more detailed summary of overall
statistics. statistics.
The most important statistic is probably the "hit rate", a percentage The most important statistic is the "hit rate", a percentage indicating how
indicating how many requests to load an object could be satisfied from many requests to load an object could be satisfied from the cache. Hit rates
the cache. Hit rates around 70% are good. 90% is probably close to around 70% are good. 90% is excellent. If you see a hit rate under 60% you
the theoretical maximum. If you see a hit rate under 60% you can can probably improve the cache performance (and hence your Zope application
probably improve the cache performance (and hence your Zope server's performance) by increasing the ZEO cache size. This is normally
application server's performance) by increasing the ZEO cache size. configured using key ``cache_size`` in the ``zeoclient`` section of your
This is normally configured using cache_size key in the ``zeoclient`` configuration file. The default cache size is 20 MB, which is very small.
section of your configuration file. The default cache size is 20 MB, which
is very small.
The stats.py tool shows its command line syntax when invoked without The stats.py tool shows its command line syntax when invoked without
arguments. The tracefile argument can be a gzipped file if it has a arguments. The tracefile argument can be a gzipped file if it has a .gz
.gz extension. It will read from stdin (assuming uncompressed data) extension. It will read from stdin (assuming uncompressed data) if the
if the tracefile argument is '-'. tracefile argument is '-'.
Simulating Different Cache Sizes Simulating Different Cache Sizes
-------------------------------- --------------------------------
Based on a cache trace file, you can make a prediction of how well the Based on a cache trace file, you can make a prediction of how well the cache
cache might do with a different cache size. The simul.py tool runs an might do with a different cache size. The simul.py tool runs a simulation of
accurate simulation of the ZEO client cache implementation based upon the ZEO client cache implementation based upon the events read from a trace
the events read from a trace file. A new simulation is started each file. A new simulation is started each time the trace file records a client
time the trace file records a client restart event; if a trace file restart event; if a trace file contains more than one restart event, a
contains more than one restart event, a separate line is printed for separate line is printed for each simulation, and a line with overall
each simulation, and line with overall statistics is added at the end. statistics is added at the end.
Example, assuming the trace file is in /tmp/cachetrace.log:: Example, assuming the trace file is in /tmp/cachetrace.log::
...@@ -98,3 +96,26 @@ the cache only helps if an object is loaded more than once. ...@@ -98,3 +96,26 @@ the cache only helps if an object is loaded more than once.
The simul.py tool also supports simulating different cache The simul.py tool also supports simulating different cache
strategies. Since none of these are implemented, these are not strategies. Since none of these are implemented, these are not
further documented here. further documented here.
Simulation Limitations
----------------------
The cache simulation is an approximation, and actual hit rate may be higher
or lower than the simulated result. These are some factors that inhibit
exact simulation:
- The simulator doesn't try to emulate versions. If the trace file contains
loads and stores of objects in versions, the simulator treats them as if
they were loads and stores of non-version data.
- Each time a load of an object O in the trace file was a cache hit, but the
simulated cache has evicted O, the simulated cache has no way to repair its
knowledge about O. This is more frequent when simulating caches smaller
than the cache used to produce the trace file. When a real cache suffers a
cache miss, it asks the ZEO server for the needed information about O, and
saves O in the client cache. The simulated cache doesn't have a ZEO server
to ask, and O continues to be absent in the simulated cache. Further
requests for O will continue to be simulated cache misses, although in a
real cache they'll likely be cache hits. On the other hand, the
simulated cache doesn't need to evict any objects to make room for O, so it
may enjoy further cache hits on objects a real cache would need to evict.
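
The second limitation can be reproduced with a toy replay. The sketch below is not the simul.py algorithm and not the real ZEO cache layout: it assumes a simple LRU policy and a made-up event format, purely to show why a simulated miss on a traced hit cannot be repaired.

```python
from collections import OrderedDict


def simulate(trace, capacity):
    """Toy LRU replay of ("load", oid) and ("store", oid, size) events."""
    cache = OrderedDict()  # oid -> size, least recently used first
    used = hits = loads = 0
    for event in trace:
        if event[0] == "load":
            loads += 1
            oid = event[1]
            if oid in cache:
                hits += 1
                cache.move_to_end(oid)
            # On a simulated miss nothing can be done: the load record
            # carries no data size, and no server exists to ask.
        else:
            _, oid, size = event
            if oid in cache:
                used -= cache.pop(oid)
            while cache and used + size > capacity:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
            cache[oid] = size
            used += size
    return hits, loads


# Trace as a large real cache would record it: A stays cached, so its
# later loads are hits and no further store of A is ever logged.
trace = [("store", "A", 10), ("store", "B", 10),
         ("load", "A"), ("load", "A"), ("load", "A")]

print(simulate(trace, capacity=20))  # both objects fit
print(simulate(trace, capacity=10))  # A is evicted and never comes back
```

With the smaller capacity every load of A is a simulated miss, whereas a real cache of that size would miss once, refetch A from the server, and then hit on the remaining loads.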
...@@ -210,7 +210,7 @@ class ClientCache(object): ...@@ -210,7 +210,7 @@ class ClientCache(object):
# than any comparable non-None object in recent Pythons. # than any comparable non-None object in recent Pythons.
i = bisect.bisect_left(L, (tid, None)) i = bisect.bisect_left(L, (tid, None))
# Now L[i-1] < (tid, None) < L[i], and the start_tid for everything in # Now L[i-1] < (tid, None) < L[i], and the start_tid for everything in
# L[:i} is < tid, and the start_tid for everything in L[i:] is >= tid. # L[:i] is < tid, and the start_tid for everything in L[i:] is >= tid.
# Therefore the largest start_tid < tid must be at L[i-1]. If i is 0, # Therefore the largest start_tid < tid must be at L[i-1]. If i is 0,
# there is no start_tid < tid: we don't have any data old enougn. # there is no start_tid < tid: we don't have any data old enougn.
if i == 0: if i == 0:
......
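
The bisect step in the hunk above can be tried in isolation. This is a minimal sketch, assuming L is a sorted list of (start_tid, end_tid) pairs; the function name is hypothetical, and whatever the real method does after locating L[i-1] is not shown in the diff and is not reproduced here. The original probes with (tid, None), which depends on None sorting below other objects; the sketch probes with the 1-tuple (tid,) instead, which sorts before every pair starting with tid and so avoids comparing None at all.

```python
import bisect


def newest_before(L, tid):
    """Return the pair with the largest start_tid < tid, or None if absent.

    L must be sorted. A tuple that is a strict prefix of another sorts
    before it, so (tid,) < (tid, anything) without comparing the second
    elements.
    """
    i = bisect.bisect_left(L, (tid,))
    # Everything in L[:i] has start_tid < tid; everything in L[i:] has
    # start_tid >= tid.
    if i == 0:
        return None  # no data old enough
    return L[i - 1]


L = [(1, 3), (3, 5), (5, 8)]
print(newest_before(L, 5))
print(newest_before(L, 1))
```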
This diff is collapsed.