Commit edf0bf07 authored by Jason Madden, committed by GitHub

Merge pull request #45 from NextThought/cache-docs

Update zeo-client-cache-tracing to use accurate script names.
parents 6b390906 4fb51de6
@@ -34,11 +34,12 @@ name or IP address) are logged.
Analyzing a Cache Trace
-----------------------
The stats.py command-line tool is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with lines
indicating client restarts, followed by a more detailed summary of overall
statistics.
The cache_stats.py command-line tool (``python -m
ZEO.scripts.cache_stats``) is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with
lines indicating client restarts, followed by a more detailed summary
of overall statistics.
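For example, a first look at a trace might be taken like this (the trace file path is just an assumption, matching the simulation examples below)::
$ python -m ZEO.scripts.cache_stats /tmp/cachetrace.log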
The most important statistic is the "hit rate", a percentage indicating how
many requests to load an object could be satisfied from the cache. Hit rates
@@ -48,7 +49,7 @@ server's performance) by increasing the ZEO cache size. This is normally
configured using key ``cache_size`` in the ``zeoclient`` section of your
configuration file. The default cache size is 20 MB, which is small.
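As a rough sketch only, the client part of the configuration might look something like the following; the exact option spelling (``cache_size`` vs. ``cache-size``), the size value, and the server address are assumptions that depend on your ZEO version and setup::
<zeoclient>
  server localhost:8100
  cache-size 100MB
</zeoclient>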
The stats.py tool shows its command line syntax when invoked without
The cache_stats.py tool shows its command line syntax when invoked without
arguments. The tracefile argument can be a gzipped file if it has a .gz
extension. It will be read from stdin (assuming uncompressed data) if the
tracefile argument is '-'.
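For instance (file names assumed), a gzipped trace can be analyzed directly, or uncompressed data can be piped in on stdin::
$ python -m ZEO.scripts.cache_stats /tmp/cachetrace.log.gz
$ gunzip -c /tmp/cachetrace.log.gz | python -m ZEO.scripts.cache_stats -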
@@ -57,7 +58,7 @@ Simulating Different Cache Sizes
--------------------------------
Based on a cache trace file, you can make a prediction of how well the cache
might do with a different cache size. The simul.py tool runs a simulation of
might do with a different cache size. The cache_simul.py tool runs a simulation of
the ZEO client cache implementation based upon the events read from a trace
file. A new simulation is started each time the trace file records a client
restart event; if a trace file contains more than one restart event, a
@@ -66,7 +67,7 @@ statistics is added at the end.
Example, assuming the trace file is in /tmp/cachetrace.log::
$ python simul.py -s 4 /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul -s 4 /tmp/cachetrace.log
CircularCacheSimulation, cache size 4,194,304 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 1429329 24046 41517 44.4% 40776 99.8
@@ -80,7 +81,7 @@ by object eviction and not yet reused to hold another object's state).
Let's try this again with an 8 MB cache::
$ python simul.py -s 8 /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul -s 8 /tmp/cachetrace.log
CircularCacheSimulation, cache size 8,388,608 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2182722 31315 41517 67.8% 40016 100.0
@@ -89,7 +90,7 @@ That's a huge improvement in hit rate, which isn't surprising since these are
very small cache sizes. The default cache size is 20 MB, which is still on
the small side::
$ python simul.py /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul /tmp/cachetrace.log
CircularCacheSimulation, cache size 20,971,520 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2982589 37922 41517 92.7% 37761 99.9
@@ -97,7 +98,7 @@ the small side::
Again a very nice improvement in hit rate, and there's not a lot of room left
for improvement. Let's try 100 MB::
$ python simul.py -s 100 /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul -s 100 /tmp/cachetrace.log
CircularCacheSimulation, cache size 104,857,600 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 3218741 39572 41517 100.0% 22778 100.0
@@ -115,7 +116,7 @@ never loaded again. If, for example, a third of the objects are loaded only
once, it's quite possible for the theoretical maximum hit rate to be 67%, no
matter how large the cache.
The simul.py script also contains code to simulate different cache
The cache_simul.py script also contains code to simulate different cache
strategies. Since none of these are implemented, and only the default cache
strategy's code has been updated to be aware of MVCC, these are not further
documented here.