Commit 787cb615 authored by Jim Fulton

Merge branch 'master' into ZEO5

Conflicts:
	.travis.yml
	CHANGES.rst
	setup.py
	src/ZEO/StorageServer.py
	src/ZEO/cache.py
parents f4eeb598 bc92bdd7
@@ -84,8 +84,6 @@ Dropped features:

 - Server support for clients older than ZEO 4.2.0

 4.2.0 (2016-06-15)
 ------------------
...
@@ -34,11 +34,12 @@ name or IP address) are logged.

 Analyzing a Cache Trace
 -----------------------

-The stats.py command-line tool is the first-line tool to analyze a cache
-trace. Its default output consists of two parts: a one-line summary of
-essential statistics for each segment of 15 minutes, interspersed with lines
-indicating client restarts, followed by a more detailed summary of overall
-statistics.
+The cache_stats.py command-line tool (``python -m
+ZEO.scripts.cache_stats``) is the first-line tool to analyze a cache
+trace. Its default output consists of two parts: a one-line summary of
+essential statistics for each segment of 15 minutes, interspersed with
+lines indicating client restarts, followed by a more detailed summary
+of overall statistics.

 The most important statistic is the "hit rate", a percentage indicating how
 many requests to load an object could be satisfied from the cache. Hit rates
@@ -48,7 +49,7 @@ server's performance) by increasing the ZEO cache size. This is normally
 configured using key ``cache_size`` in the ``zeoclient`` section of your
 configuration file. The default cache size is 20 MB, which is small.

-The stats.py tool shows its command line syntax when invoked without
+The cache_stats.py tool shows its command line syntax when invoked without
 arguments. The tracefile argument can be a gzipped file if it has a .gz
 extension. It will be read from stdin (assuming uncompressed data) if the
 tracefile argument is '-'.
@@ -57,7 +58,7 @@ Simulating Different Cache Sizes
 --------------------------------

 Based on a cache trace file, you can make a prediction of how well the cache
-might do with a different cache size. The simul.py tool runs a simulation of
+might do with a different cache size. The cache_simul.py tool runs a simulation of
 the ZEO client cache implementation based upon the events read from a trace
 file. A new simulation is started each time the trace file records a client
 restart event; if a trace file contains more than one restart event, a
@@ -66,7 +67,7 @@ statistics is added at the end.

 Example, assuming the trace file is in /tmp/cachetrace.log::

-    $ python simul.py -s 4 /tmp/cachetrace.log
+    $ python -m ZEO.scripts.cache_simul -s 4 /tmp/cachetrace.log
     CircularCacheSimulation, cache size 4,194,304 bytes
      START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
     Jul 22 22:22    39:09  3218856  1429329  24046  41517   44.4%   40776    99.8
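As a sanity check, the HITRATE column in the report above is just hits divided by loads; recomputing it from the numbers in the sample summary line:

```python
# Recompute the hit rate from the LOADS and HITS columns of the
# sample summary line above (3218856 loads, 1429329 hits).
loads, hits = 3218856, 1429329
hitrate = 100.0 * hits / loads
print("%.1f%%" % hitrate)  # 44.4%
```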
@@ -80,7 +81,7 @@ by object eviction and not yet reused to hold another object's state).

 Let's try this again with an 8 MB cache::

-    $ python simul.py -s 8 /tmp/cachetrace.log
+    $ python -m ZEO.scripts.cache_simul -s 8 /tmp/cachetrace.log
     CircularCacheSimulation, cache size 8,388,608 bytes
      START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
     Jul 22 22:22    39:09  3218856  2182722  31315  41517   67.8%   40016   100.0
@@ -89,7 +90,7 @@ That's a huge improvement in hit rate, which isn't surprising since these are
 very small cache sizes. The default cache size is 20 MB, which is still on
 the small side::

-    $ python simul.py /tmp/cachetrace.log
+    $ python -m ZEO.scripts.cache_simul /tmp/cachetrace.log
     CircularCacheSimulation, cache size 20,971,520 bytes
      START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
     Jul 22 22:22    39:09  3218856  2982589  37922  41517   92.7%   37761    99.9
@@ -97,7 +98,7 @@ the small side::

 Again a very nice improvement in hit rate, and there's not a lot of room left
 for improvement. Let's try 100 MB::

-    $ python simul.py -s 100 /tmp/cachetrace.log
+    $ python -m ZEO.scripts.cache_simul -s 100 /tmp/cachetrace.log
     CircularCacheSimulation, cache size 104,857,600 bytes
      START TIME  DURATION    LOADS     HITS INVALS WRITES HITRATE  EVICTS   INUSE
     Jul 22 22:22    39:09  3218856  3218741  39572  41517  100.0%   22778   100.0
@@ -115,7 +116,7 @@ never loaded again. If, for example, a third of the objects are loaded only
 once, it's quite possible for the theoretical maximum hit rate to be 67%, no
 matter how large the cache.

-The simul.py script also contains code to simulate different cache
+The cache_simul.py script also contains code to simulate different cache
 strategies. Since none of these are implemented, and only the default cache
 strategy's code has been updated to be aware of MVCC, these are not further
 documented here.
...
@@ -12,8 +12,6 @@
 #
 ##############################################################################
 """Monitor behavior of ZEO server and record statistics.
-
-$Id$
 """
 from __future__ import print_function
 from __future__ import print_function
...
@@ -43,7 +43,6 @@ import socket
 import logging

 import ZConfig.datatypes
-import ZEO
 from zdaemon.zdoptions import ZDOptions

 logger = logging.getLogger('ZEO.runzeo')
@@ -115,7 +114,7 @@ class ZEOOptions(ZDOptions, ZEOOptionsMixin):
     __doc__ = __doc__

     logsectionname = "eventlog"
-    schemadir = os.path.dirname(ZEO.__file__)
+    schemadir = os.path.dirname(__file__)

     def __init__(self):
         ZDOptions.__init__(self)
@@ -337,7 +336,7 @@ class ZEOServer:

 def create_server(storages, options):
-    from ZEO.StorageServer import StorageServer
+    from .StorageServer import StorageServer
     return StorageServer(
         options.address,
         storages,
...
@@ -12,14 +12,9 @@
 # FOR A PARTICULAR PURPOSE
 #
 ##############################################################################
-"""Cache simulation.
-
-Usage: simul.py [-s size] tracefile
-
-Options:
--s size: cache size in MB (default 20 MB)
--i: summarizing interval in minutes (default 15; max 60)
--r: rearrange factor
+"""
+Cache simulation.

 Note:
@@ -27,102 +22,52 @@ Note:

 - The simulation will be far off if the trace file
   was created starting with a non-empty cache
 """
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
-from __future__ import print_function
+from __future__ import print_function, absolute_import

 import bisect
-import getopt
 import struct
 import re
 import sys
 import ZEO.cache
+import argparse

-from ZODB.utils import z64, u64
+from ZODB.utils import z64
+from .cache_stats import add_interval_argument
+from .cache_stats import add_tracefile_argument

 # we assign ctime locally to facilitate test replacement!
 from time import ctime
 import six

-def usage(msg):
-    print(msg, file=sys.stderr)
-    print(__doc__, file=sys.stderr)

 def main(args=None):
     if args is None:
         args = sys.argv[1:]
     # Parse options.
     MB = 1<<20
-    cachelimit = 20*MB
-    rearrange = 0.8
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--size", "-s",
+                        default=20*MB, dest="cachelimit",
+                        type=lambda s: int(float(s)*MB),
+                        help="cache size in MB (default 20MB)")
+    add_interval_argument(parser)
+    parser.add_argument("--rearrange", "-r",
+                        default=0.8, type=float,
+                        help="rearrange factor")
+    add_tracefile_argument(parser)
     simclass = CircularCacheSimulation
-    interval_step = 15
-    try:
-        opts, args = getopt.getopt(args, "s:i:r:")
-    except getopt.error as msg:
-        usage(msg)
-        return 2
-    for o, a in opts:
-        if o == '-s':
-            cachelimit = int(float(a)*MB)
-        elif o == '-i':
-            interval_step = int(a)
-        elif o == '-r':
-            rearrange = float(a)
-        else:
-            assert False, (o, a)
-    interval_step *= 60
-    if interval_step <= 0:
-        interval_step = 60
-    elif interval_step > 3600:
-        interval_step = 3600
-    if len(args) != 1:
-        usage("exactly one file argument required")
-        return 2
-    filename = args[0]
+    options = parser.parse_args(args)

-    # Open file.
-    if filename.endswith(".gz"):
-        # Open gzipped file.
-        try:
-            import gzip
-        except ImportError:
-            print("can't read gzipped files (no module gzip)", file=sys.stderr)
-            return 1
-        try:
-            f = gzip.open(filename, "rb")
-        except IOError as msg:
-            print("can't open %s: %s" % (filename, msg), file=sys.stderr)
-            return 1
-    elif filename == "-":
-        # Read from stdin.
-        f = sys.stdin
-    else:
-        # Open regular file.
-        try:
-            f = open(filename, "rb")
-        except IOError as msg:
-            print("can't open %s: %s" % (filename, msg), file=sys.stderr)
-            return 1
+    f = options.tracefile
+    interval_step = options.interval

     # Create simulation object.
-    sim = simclass(cachelimit, rearrange)
-    interval_sim = simclass(cachelimit, rearrange)
+    sim = simclass(options.cachelimit, options.rearrange)
+    interval_sim = simclass(options.cachelimit, options.rearrange)

     # Print output header.
     sim.printheader()
...
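The `--size` option above converts a (possibly fractional) megabyte count into a byte count with `type=lambda s: int(float(s)*MB)`. A minimal standalone sketch of just that option, mirroring the argparse setup in the diff:

```python
import argparse

MB = 1 << 20

parser = argparse.ArgumentParser()
parser.add_argument("--size", "-s",
                    default=20 * MB, dest="cachelimit",
                    type=lambda s: int(float(s) * MB),
                    help="cache size in MB (default 20MB)")

print(parser.parse_args([]).cachelimit)             # 20971520 (the 20 MB default)
print(parser.parse_args(["-s", "4"]).cachelimit)    # 4194304
print(parser.parse_args(["-s", "0.5"]).cachelimit)  # 524288
```

The byte counts match the `CircularCacheSimulation, cache size ... bytes` lines in the sample runs above.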
@@ -14,18 +14,7 @@ from __future__ import print_function
 ##############################################################################
 """Trace file statistics analyzer.

-Usage: stats.py [-h] [-i interval] [-q] [-s] [-S] [-v] [-X] tracefile
--h: print histogram of object load frequencies
--i: summarizing interval in minutes (default 15; max 60)
--q: quiet; don't print summaries
--s: print histogram of object sizes
--S: don't print statistics
--v: verbose; print each record
--X: enable heuristic checking for misaligned records: oids > 2**32
-    will be rejected; this requires the tracefile to be seekable
-"""
-
-"""File format:
+File format:

 Each record is 26 bytes, plus a variable number of bytes to store an oid,
 with the following layout.  Numbers are big-endian integers.
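The docstring fixes the record header at 26 bytes of big-endian fields. One layout consistent with the fields unpacked later in this diff (timestamp, code word, oid length, and two 8-byte transaction ids) is sketched below; the exact format string is an assumption for illustration, not taken from the source:

```python
import struct

# Assumed layout: 4-byte timestamp, 4-byte code word, 2-byte oid length,
# then two 8-byte tids -- 26 bytes total, big-endian (">").
FMT = ">iiH8s8s"
FMT_SIZE = struct.calcsize(FMT)
print(FMT_SIZE)  # 26

record = struct.pack(FMT, 1153603320, 0x20, 8, b"\x00" * 8, b"\xff" * 8)
ts, code, oidlen, start_tid, end_tid = struct.unpack(FMT, record)
print(ts, hex(code), oidlen)  # 1153603320 0x20 8
```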
@@ -58,84 +47,80 @@ i.e. the low bit is always zero.
 """

 import sys
 import time
-import getopt
+import argparse
 import struct
+import gzip

 # we assign ctime locally to facilitate test replacement!
 from time import ctime
 import six

-def usage(msg):
-    print(msg, file=sys.stderr)
-    print(__doc__, file=sys.stderr)
+def add_interval_argument(parser):
+    def _interval(a):
+        interval = int(60 * float(a))
+        if interval <= 0:
+            interval = 60
+        elif interval > 3600:
+            interval = 3600
+        return interval
+    parser.add_argument("--interval", "-i",
+                        default=15*60, type=_interval,
+                        help="summarizing interval in minutes (default 15; max 60)")
+def add_tracefile_argument(parser):
+
+    class GzipFileType(argparse.FileType):
+
+        def __init__(self):
+            super(GzipFileType, self).__init__(mode='rb')
+
+        def __call__(self, s):
+            f = super(GzipFileType, self).__call__(s)
+            if s.endswith(".gz"):
+                f = gzip.GzipFile(filename=s, fileobj=f)
+            return f
+
+    parser.add_argument("tracefile", type=GzipFileType(),
+                        help="The trace to read; may be gzipped")
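`GzipFileType` lets the positional tracefile argument accept plain or gzipped files transparently. A self-contained demonstration of the same wrapping idea, exercised on a temporary file:

```python
import argparse
import gzip
import os
import tempfile


class GzipFileType(argparse.FileType):
    """argparse file type that transparently unwraps .gz files, as above."""

    def __init__(self):
        super(GzipFileType, self).__init__(mode='rb')

    def __call__(self, s):
        f = super(GzipFileType, self).__call__(s)
        if s.endswith(".gz"):
            f = gzip.GzipFile(filename=s, fileobj=f)
        return f


# Write a small gzipped "trace" and read it back through the type.
fd, path = tempfile.mkstemp(suffix=".gz")
os.close(fd)
with gzip.open(path, "wb") as f:
    f.write(b"trace bytes")
print(GzipFileType()(path).read())  # b'trace bytes'
os.remove(path)
```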
 def main(args=None):
     if args is None:
         args = sys.argv[1:]
     # Parse options
-    verbose = False
-    quiet = False
-    dostats = True
-    print_size_histogram = False
-    print_histogram = False
-    interval = 15*60 # Every 15 minutes
-    heuristic = False
-    try:
-        opts, args = getopt.getopt(args, "hi:qsSvX")
-    except getopt.error as msg:
-        usage(msg)
-        return 2
-    for o, a in opts:
-        if o == '-h':
-            print_histogram = True
-        elif o == "-i":
-            interval = int(60 * float(a))
-            if interval <= 0:
-                interval = 60
-            elif interval > 3600:
-                interval = 3600
-        elif o == "-q":
-            quiet = True
-            verbose = False
-        elif o == "-s":
-            print_size_histogram = True
-        elif o == "-S":
-            dostats = False
-        elif o == "-v":
-            verbose = True
-        elif o == '-X':
-            heuristic = True
-        else:
-            assert False, (o, opts)
-
-    if len(args) != 1:
-        usage("exactly one file argument required")
-        return 2
-    filename = args[0]
-
-    # Open file
-    if filename.endswith(".gz"):
-        # Open gzipped file
-        try:
-            import gzip
-        except ImportError:
-            print("can't read gzipped files (no module gzip)", file=sys.stderr)
-            return 1
-        try:
-            f = gzip.open(filename, "rb")
-        except IOError as msg:
-            print("can't open %s: %s" % (filename, msg), file=sys.stderr)
-            return 1
-    elif filename == '-':
-        # Read from stdin
-        f = sys.stdin
-    else:
-        # Open regular file
-        try:
-            f = open(filename, "rb")
-        except IOError as msg:
-            print("can't open %s: %s" % (filename, msg), file=sys.stderr)
-            return 1
+    parser = argparse.ArgumentParser(description="Trace file statistics analyzer",
+                                     # Our -h, short for --load-histogram
+                                     # conflicts with default for help, so we handle
+                                     # manually.
+                                     add_help=False)
+    verbose_group = parser.add_mutually_exclusive_group()
+    verbose_group.add_argument('--verbose', '-v',
+                               default=False, action='store_true',
+                               help="Be verbose; print each record")
+    verbose_group.add_argument('--quiet', '-q',
+                               default=False, action='store_true',
+                               help="Reduce output; don't print summaries")
+    parser.add_argument("--sizes", '-s',
+                        default=False, action="store_true", dest="print_size_histogram",
+                        help="print histogram of object sizes")
+    parser.add_argument("--no-stats", '-S',
+                        default=True, action="store_false", dest="dostats",
+                        help="don't print statistics")
+    parser.add_argument("--load-histogram", "-h",
+                        default=False, action="store_true", dest="print_histogram",
+                        help="print histogram of object load frequencies")
+    parser.add_argument("--check", "-X",
+                        default=False, action="store_true", dest="heuristic",
+                        help="enable heuristic checking for misaligned records: oids > 2**32"
+                        " will be rejected; this requires the tracefile to be seekable")
+    add_interval_argument(parser)
+    add_tracefile_argument(parser)
+
+    if '--help' in args:
+        parser.print_help()
+        sys.exit(2)
+
+    options = parser.parse_args(args)
+
+    f = options.tracefile
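The mutually exclusive group above is what makes `-v` and `-q` reject each other, replacing the old getopt behavior where `-q` silently overrode `-v`. A minimal sketch of the pattern:

```python
import argparse

parser = argparse.ArgumentParser(add_help=False)
group = parser.add_mutually_exclusive_group()
group.add_argument('--verbose', '-v', default=False, action='store_true')
group.add_argument('--quiet', '-q', default=False, action='store_true')

opts = parser.parse_args(['-q'])
print(opts.quiet, opts.verbose)  # True False

# Passing both flags makes parse_args() print a
# "not allowed with argument" error and exit.
```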
     rt0 = time.time()

     bycode = {}     # map code to count of occurrences
@@ -169,7 +154,7 @@ def main(args=None):
             ts, code, oidlen, start_tid, end_tid = unpack(FMT, r)
             if ts == 0:
                 # Must be a misaligned record caused by a crash.
-                if not quiet:
+                if not options.quiet:
                     print("Skipping 8 bytes at offset", f.tell() - FMT_SIZE)
                 f.seek(f.tell() - FMT_SIZE + 8)
                 continue
@@ -179,14 +164,14 @@ def main(args=None):
             records += 1
             if t0 is None:
                 t0 = ts
-                thisinterval = t0 // interval
+                thisinterval = t0 // options.interval
                 h0 = he = ts
             te = ts
-            if ts // interval != thisinterval:
-                if not quiet:
+            if ts // options.interval != thisinterval:
+                if not options.quiet:
                     dumpbyinterval(byinterval, h0, he)
                 byinterval = {}
-                thisinterval = ts // interval
+                thisinterval = ts // options.interval
                 h0 = ts
             he = ts
             dlen, code = (code & 0x7fffff00) >> 8, code & 0xff
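The last line of this hunk splits the 4-byte code word into a 23-bit data length and an 8-bit event code. With a hypothetical raw value (the constant below is made up for illustration):

```python
raw = 0x00012420  # hypothetical code word read from a trace record
dlen, code = (raw & 0x7fffff00) >> 8, raw & 0xff
print(dlen, hex(code))  # 292 0x20
```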
@@ -208,7 +193,7 @@ def main(args=None):
             elif code & 0x70 == 0x50: # All stores
                 bysizew[dlen] = d = bysizew.get(dlen) or {}
                 d[oid] = d.get(oid, 0) + 1
-            if verbose:
+            if options.verbose:
                 print("%s %02x %s %016x %016x %c%s" % (
                     ctime(ts)[4:-5],
                     code,
@@ -221,12 +206,12 @@ def main(args=None):
                 oids[oid] = oids.get(oid, 0) + 1
                 total_loads += 1
             elif code == 0x00:  # restart
-                if not quiet:
+                if not options.quiet:
                     dumpbyinterval(byinterval, h0, he)
                 byinterval = {}
-                thisinterval = ts // interval
+                thisinterval = ts // options.interval
                 h0 = he = ts
-                if not quiet:
+                if not options.quiet:
                     print(ctime(ts)[4:-5], end=' ')
                     print('='*20, "Restart", '='*20)
     except KeyboardInterrupt:
@@ -235,7 +220,7 @@ def main(args=None):
     end_pos = f.tell()
     f.close()
     rte = time.time()
-    if not quiet:
+    if not options.quiet:
         dumpbyinterval(byinterval, h0, he)

     # Error if nothing was read
@@ -244,7 +229,7 @@ def main(args=None):
         return 1

     # Print statistics
-    if dostats:
+    if options.dostats:
         print()
         print("Read %s trace records (%s bytes) in %.1f seconds" % (
             addcommas(records), addcommas(end_pos), rte-rt0))
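`addcommas` (defined elsewhere in this file) renders counts with thousands separators, as seen in the `4,194,304 bytes` lines earlier. Its behavior can be approximated in one line; this is a sketch, not the original implementation:

```python
def addcommas(n):
    # approximate the helper used above with Python's "," format spec
    return format(n, ",")

print(addcommas(4194304))   # 4,194,304
print(addcommas(20971520))  # 20,971,520
```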
@@ -267,7 +252,7 @@ def main(args=None):
             explain.get(code) or "*** unknown code ***"))

     # Print histogram.
-    if print_histogram:
+    if options.print_histogram:
         print()
         print("Histogram of object load frequency")
         total = len(oids)
@@ -287,7 +272,7 @@ def main(args=None):
                 obj_percent, load_percent, cum))

     # Print size histogram.
-    if print_size_histogram:
+    if options.print_size_histogram:
         print()
         print("Histograms of object sizes")
         print()
...