Commit 787cb615 authored by Jim Fulton

Merge branch 'master' into ZEO5

Conflicts:
	.travis.yml
	CHANGES.rst
	setup.py
	src/ZEO/StorageServer.py
	src/ZEO/cache.py
parents f4eeb598 bc92bdd7
......@@ -84,8 +84,6 @@ Dropped features:
- Server support for clients older than ZEO 4.2.0
4.2.0 (2016-06-15)
------------------
......
......@@ -34,11 +34,12 @@ name or IP address) are logged.
Analyzing a Cache Trace
-----------------------
The stats.py command-line tool is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with lines
indicating client restarts, followed by a more detailed summary of overall
statistics.
The cache_stats.py command-line tool (``python -m
ZEO.scripts.cache_stats``) is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with
lines indicating client restarts, followed by a more detailed summary
of overall statistics.
The most important statistic is the "hit rate", a percentage indicating how
many requests to load an object could be satisfied from the cache. Hit rates
......@@ -48,7 +49,7 @@ server's performance) by increasing the ZEO cache size. This is normally
configured using key ``cache_size`` in the ``zeoclient`` section of your
configuration file. The default cache size is 20 MB, which is small.
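For reference, a minimal sketch of such a configuration (the server address is hypothetical, and ZConfig schemas spell the key with a hyphen, ``cache-size``):

```
<zodb>
  <zeoclient>
    server localhost:8100
    cache-size 100MB
  </zeoclient>
</zodb>
```

The ``cache-size`` value is a ZConfig byte-size, so suffixes like ``MB`` are accepted; omitting the key leaves the 20 MB default in effect.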
The stats.py tool shows its command line syntax when invoked without
The cache_stats.py tool shows its command line syntax when invoked without
arguments. The tracefile argument can be a gzipped file if it has a .gz
extension. It will be read from stdin (assuming uncompressed data) if the
tracefile argument is '-'.
......@@ -57,7 +58,7 @@ Simulating Different Cache Sizes
--------------------------------
Based on a cache trace file, you can make a prediction of how well the cache
might do with a different cache size. The simul.py tool runs a simulation of
might do with a different cache size. The cache_simul.py tool runs a simulation of
the ZEO client cache implementation based upon the events read from a trace
file. A new simulation is started each time the trace file records a client
restart event; if a trace file contains more than one restart event, a
......@@ -66,7 +67,7 @@ statistics is added at the end.
Example, assuming the trace file is in /tmp/cachetrace.log::
$ python simul.py -s 4 /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul -s 4 /tmp/cachetrace.log
CircularCacheSimulation, cache size 4,194,304 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 1429329 24046 41517 44.4% 40776 99.8
......@@ -80,7 +81,7 @@ by object eviction and not yet reused to hold another object's state).
Let's try this again with an 8 MB cache::
$ python simul.py -s 8 /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul -s 8 /tmp/cachetrace.log
CircularCacheSimulation, cache size 8,388,608 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2182722 31315 41517 67.8% 40016 100.0
......@@ -89,7 +90,7 @@ That's a huge improvement in hit rate, which isn't surprising since these are
very small cache sizes. The default cache size is 20 MB, which is still on
the small side::
$ python simul.py /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul /tmp/cachetrace.log
CircularCacheSimulation, cache size 20,971,520 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2982589 37922 41517 92.7% 37761 99.9
......@@ -97,7 +98,7 @@ the small side::
Again a very nice improvement in hit rate, and there's not a lot of room left
for improvement. Let's try 100 MB::
$ python simul.py -s 100 /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul -s 100 /tmp/cachetrace.log
CircularCacheSimulation, cache size 104,857,600 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 3218741 39572 41517 100.0% 22778 100.0
......@@ -115,7 +116,7 @@ never loaded again. If, for example, a third of the objects are loaded only
once, it's quite possible for the theoretical maximum hit rate to be 67%, no
matter how large the cache.
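The arithmetic behind that ceiling is simple enough to sketch (the numbers below are hypothetical, chosen to match the one-third example above):

```python
def max_hit_rate(total_loads, one_shot_loads):
    """Upper bound on the cache hit rate, in percent.

    Loads of objects that are never loaded again can never be
    satisfied from the cache, no matter how large it is.  (This
    ignores the compulsory first miss on repeated objects, so the
    real ceiling is slightly lower still.)
    """
    return 100.0 * (total_loads - one_shot_loads) / total_loads

# If a third of all loads touch objects that are loaded exactly once:
print("%.1f%%" % max_hit_rate(3000000, 1000000))  # ~66.7%
```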
The simul.py script also contains code to simulate different cache
The cache_simul.py script also contains code to simulate different cache
strategies. Since none of these are implemented, and only the default cache
strategy's code has been updated to be aware of MVCC, these are not further
documented here.
......
......@@ -12,8 +12,6 @@
#
##############################################################################
"""Monitor behavior of ZEO server and record statistics.
$Id$
"""
from __future__ import print_function
......
......@@ -43,7 +43,6 @@ import socket
import logging
import ZConfig.datatypes
import ZEO
from zdaemon.zdoptions import ZDOptions
logger = logging.getLogger('ZEO.runzeo')
......@@ -115,7 +114,7 @@ class ZEOOptions(ZDOptions, ZEOOptionsMixin):
__doc__ = __doc__
logsectionname = "eventlog"
schemadir = os.path.dirname(ZEO.__file__)
schemadir = os.path.dirname(__file__)
def __init__(self):
ZDOptions.__init__(self)
......@@ -337,7 +336,7 @@ class ZEOServer:
def create_server(storages, options):
from ZEO.StorageServer import StorageServer
from .StorageServer import StorageServer
return StorageServer(
options.address,
storages,
......
......@@ -12,14 +12,9 @@
# FOR A PARTICULAR PURPOSE
#
##############################################################################
"""Cache simulation.
Usage: simul.py [-s size] tracefile
"""
Cache simulation.
Options:
-s size: cache size in MB (default 20 MB)
-i: summarizing interval in minutes (default 15; max 60)
-r: rearrange factor
Note:
......@@ -27,102 +22,52 @@ Note:
- The simulation will be far off if the trace file
was created starting with a non-empty cache
"""
from __future__ import print_function
from __future__ import print_function, absolute_import
import bisect
import getopt
import struct
import re
import sys
import ZEO.cache
import argparse
from ZODB.utils import z64
from ZODB.utils import z64, u64
from .cache_stats import add_interval_argument
from .cache_stats import add_tracefile_argument
# we assign ctime locally to facilitate test replacement!
from time import ctime
import six
def usage(msg):
print(msg, file=sys.stderr)
print(__doc__, file=sys.stderr)
def main(args=None):
if args is None:
args = sys.argv[1:]
# Parse options.
MB = 1<<20
cachelimit = 20*MB
rearrange = 0.8
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--size", "-s",
default=20*MB, dest="cachelimit",
type=lambda s: int(float(s)*MB),
help="cache size in MB (default 20MB)")
add_interval_argument(parser)
parser.add_argument("--rearrange", "-r",
default=0.8, type=float,
help="rearrange factor")
add_tracefile_argument(parser)
simclass = CircularCacheSimulation
interval_step = 15
try:
opts, args = getopt.getopt(args, "s:i:r:")
except getopt.error as msg:
usage(msg)
return 2
for o, a in opts:
if o == '-s':
cachelimit = int(float(a)*MB)
elif o == '-i':
interval_step = int(a)
elif o == '-r':
rearrange = float(a)
else:
assert False, (o, a)
interval_step *= 60
if interval_step <= 0:
interval_step = 60
elif interval_step > 3600:
interval_step = 3600
if len(args) != 1:
usage("exactly one file argument required")
return 2
filename = args[0]
# Open file.
if filename.endswith(".gz"):
# Open gzipped file.
try:
import gzip
except ImportError:
print("can't read gzipped files (no module gzip)", file=sys.stderr)
return 1
try:
f = gzip.open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
elif filename == "-":
# Read from stdin.
f = sys.stdin
else:
# Open regular file.
try:
f = open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
options = parser.parse_args(args)
f = options.tracefile
interval_step = options.interval
# Create simulation object.
sim = simclass(cachelimit, rearrange)
interval_sim = simclass(cachelimit, rearrange)
sim = simclass(options.cachelimit, options.rearrange)
interval_sim = simclass(options.cachelimit, options.rearrange)
# Print output header.
sim.printheader()
......
......@@ -14,18 +14,7 @@ from __future__ import print_function
##############################################################################
"""Trace file statistics analyzer.
Usage: stats.py [-h] [-i interval] [-q] [-s] [-S] [-v] [-X] tracefile
-h: print histogram of object load frequencies
-i: summarizing interval in minutes (default 15; max 60)
-q: quiet; don't print summaries
-s: print histogram of object sizes
-S: don't print statistics
-v: verbose; print each record
-X: enable heuristic checking for misaligned records: oids > 2**32
will be rejected; this requires the tracefile to be seekable
"""
"""File format:
File format:
Each record is 26 bytes, plus a variable number of bytes to store an oid,
with the following layout. Numbers are big-endian integers.
......@@ -58,84 +47,80 @@ i.e. the low bit is always zero.
"""
import sys
import time
import getopt
import argparse
import struct
import gzip
# we assign ctime locally to facilitate test replacement!
from time import ctime
import six
def usage(msg):
print(msg, file=sys.stderr)
print(__doc__, file=sys.stderr)
def add_interval_argument(parser):
def _interval(a):
interval = int(60 * float(a))
if interval <= 0:
interval = 60
elif interval > 3600:
interval = 3600
return interval
parser.add_argument("--interval", "-i",
default=15*60, type=_interval,
help="summarizing interval in minutes (default 15; max 60)")
def add_tracefile_argument(parser):
class GzipFileType(argparse.FileType):
def __init__(self):
super(GzipFileType, self).__init__(mode='rb')
def __call__(self, s):
f = super(GzipFileType, self).__call__(s)
if s.endswith(".gz"):
f = gzip.GzipFile(filename=s, fileobj=f)
return f
parser.add_argument("tracefile", type=GzipFileType(),
help="The trace to read; may be gzipped")
def main(args=None):
if args is None:
args = sys.argv[1:]
# Parse options
verbose = False
quiet = False
dostats = True
print_size_histogram = False
print_histogram = False
interval = 15*60 # Every 15 minutes
heuristic = False
try:
opts, args = getopt.getopt(args, "hi:qsSvX")
except getopt.error as msg:
usage(msg)
return 2
for o, a in opts:
if o == '-h':
print_histogram = True
elif o == "-i":
interval = int(60 * float(a))
if interval <= 0:
interval = 60
elif interval > 3600:
interval = 3600
elif o == "-q":
quiet = True
verbose = False
elif o == "-s":
print_size_histogram = True
elif o == "-S":
dostats = False
elif o == "-v":
verbose = True
elif o == '-X':
heuristic = True
else:
assert False, (o, opts)
if len(args) != 1:
usage("exactly one file argument required")
return 2
filename = args[0]
# Open file
if filename.endswith(".gz"):
# Open gzipped file
try:
import gzip
except ImportError:
print("can't read gzipped files (no module gzip)", file=sys.stderr)
return 1
try:
f = gzip.open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
elif filename == '-':
# Read from stdin
f = sys.stdin
else:
# Open regular file
try:
f = open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
parser = argparse.ArgumentParser(description="Trace file statistics analyzer",
# Our -h, short for --load-histogram
# conflicts with default for help, so we handle
# manually.
add_help=False)
verbose_group = parser.add_mutually_exclusive_group()
verbose_group.add_argument('--verbose', '-v',
default=False, action='store_true',
help="Be verbose; print each record")
verbose_group.add_argument('--quiet', '-q',
default=False, action='store_true',
help="Reduce output; don't print summaries")
parser.add_argument("--sizes", '-s',
default=False, action="store_true", dest="print_size_histogram",
help="print histogram of object sizes")
parser.add_argument("--no-stats", '-S',
default=True, action="store_false", dest="dostats",
help="don't print statistics")
parser.add_argument("--load-histogram", "-h",
default=False, action="store_true", dest="print_histogram",
help="print histogram of object load frequencies")
parser.add_argument("--check", "-X",
default=False, action="store_true", dest="heuristic",
help=" enable heuristic checking for misaligned records: oids > 2**32"
" will be rejected; this requires the tracefile to be seekable")
add_interval_argument(parser)
add_tracefile_argument(parser)
if '--help' in args:
parser.print_help()
sys.exit(2)
options = parser.parse_args(args)
f = options.tracefile
rt0 = time.time()
bycode = {} # map code to count of occurrences
......@@ -169,7 +154,7 @@ def main(args=None):
ts, code, oidlen, start_tid, end_tid = unpack(FMT, r)
if ts == 0:
# Must be a misaligned record caused by a crash.
if not quiet:
if not options.quiet:
print("Skipping 8 bytes at offset", f.tell() - FMT_SIZE)
f.seek(f.tell() - FMT_SIZE + 8)
continue
......@@ -179,14 +164,14 @@ def main(args=None):
records += 1
if t0 is None:
t0 = ts
thisinterval = t0 // interval
thisinterval = t0 // options.interval
h0 = he = ts
te = ts
if ts // interval != thisinterval:
if not quiet:
if ts // options.interval != thisinterval:
if not options.quiet:
dumpbyinterval(byinterval, h0, he)
byinterval = {}
thisinterval = ts // interval
thisinterval = ts // options.interval
h0 = ts
he = ts
dlen, code = (code & 0x7fffff00) >> 8, code & 0xff
......@@ -208,7 +193,7 @@ def main(args=None):
elif code & 0x70 == 0x50: # All stores
bysizew[dlen] = d = bysizew.get(dlen) or {}
d[oid] = d.get(oid, 0) + 1
if verbose:
if options.verbose:
print("%s %02x %s %016x %016x %c%s" % (
ctime(ts)[4:-5],
code,
......@@ -221,12 +206,12 @@ def main(args=None):
oids[oid] = oids.get(oid, 0) + 1
total_loads += 1
elif code == 0x00: # restart
if not quiet:
if not options.quiet:
dumpbyinterval(byinterval, h0, he)
byinterval = {}
thisinterval = ts // interval
thisinterval = ts // options.interval
h0 = he = ts
if not quiet:
if not options.quiet:
print(ctime(ts)[4:-5], end=' ')
print('='*20, "Restart", '='*20)
except KeyboardInterrupt:
......@@ -235,7 +220,7 @@ def main(args=None):
end_pos = f.tell()
f.close()
rte = time.time()
if not quiet:
if not options.quiet:
dumpbyinterval(byinterval, h0, he)
# Error if nothing was read
......@@ -244,7 +229,7 @@ def main(args=None):
return 1
# Print statistics
if dostats:
if options.dostats:
print()
print("Read %s trace records (%s bytes) in %.1f seconds" % (
addcommas(records), addcommas(end_pos), rte-rt0))
......@@ -267,7 +252,7 @@ def main(args=None):
explain.get(code) or "*** unknown code ***"))
# Print histogram.
if print_histogram:
if options.print_histogram:
print()
print("Histogram of object load frequency")
total = len(oids)
......@@ -287,7 +272,7 @@ def main(args=None):
obj_percent, load_percent, cum))
# Print size histogram.
if print_size_histogram:
if options.print_size_histogram:
print()
print("Histograms of object sizes")
print()
......