Commit 787cb615 authored by Jim Fulton

Merge branch 'master' into ZEO5

Conflicts:
	.travis.yml
	CHANGES.rst
	setup.py
	src/ZEO/StorageServer.py
	src/ZEO/cache.py
parents f4eeb598 bc92bdd7
......@@ -84,8 +84,6 @@ Dropped features:
- Server support for clients older than ZEO 4.2.0
4.2.0 (2016-06-15)
------------------
......
......@@ -34,11 +34,12 @@ name or IP address) are logged.
Analyzing a Cache Trace
-----------------------
The stats.py command-line tool is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with lines
indicating client restarts, followed by a more detailed summary of overall
statistics.
The cache_stats.py command-line tool (``python -m
ZEO.scripts.cache_stats``) is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with
lines indicating client restarts, followed by a more detailed summary
of overall statistics.
The most important statistic is the "hit rate", a percentage indicating how
many requests to load an object could be satisfied from the cache. Hit rates
......@@ -48,7 +49,7 @@ server's performance) by increasing the ZEO cache size. This is normally
configured using key ``cache_size`` in the ``zeoclient`` section of your
configuration file. The default cache size is 20 MB, which is small.
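For reference, a minimal sketch of such a configuration (the server address is hypothetical, and ZConfig schemas spell the key with a hyphen, ``cache-size``):

```
<zodb>
  <zeoclient>
    server localhost:8100
    cache-size 100MB
  </zeoclient>
</zodb>
```

The ``cache-size`` value is a ZConfig byte-size, so suffixes like ``MB`` are accepted; omitting the key leaves the 20 MB default in effect.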
The stats.py tool shows its command line syntax when invoked without
The cache_stats.py tool shows its command line syntax when invoked without
arguments. The tracefile argument can be a gzipped file if it has a .gz
extension. It will be read from stdin (assuming uncompressed data) if the
tracefile argument is '-'.
......@@ -57,7 +58,7 @@ Simulating Different Cache Sizes
--------------------------------
Based on a cache trace file, you can make a prediction of how well the cache
might do with a different cache size. The simul.py tool runs a simulation of
might do with a different cache size. The cache_simul.py tool runs a simulation of
the ZEO client cache implementation based upon the events read from a trace
file. A new simulation is started each time the trace file records a client
restart event; if a trace file contains more than one restart event, a
......@@ -66,7 +67,7 @@ statistics is added at the end.
Example, assuming the trace file is in /tmp/cachetrace.log::
$ python simul.py -s 4 /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul -s 4 /tmp/cachetrace.log
CircularCacheSimulation, cache size 4,194,304 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 1429329 24046 41517 44.4% 40776 99.8
......@@ -80,7 +81,7 @@ by object eviction and not yet reused to hold another object's state).
Let's try this again with an 8 MB cache::
$ python simul.py -s 8 /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul -s 8 /tmp/cachetrace.log
CircularCacheSimulation, cache size 8,388,608 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2182722 31315 41517 67.8% 40016 100.0
......@@ -89,7 +90,7 @@ That's a huge improvement in hit rate, which isn't surprising since these are
very small cache sizes. The default cache size is 20 MB, which is still on
the small side::
$ python simul.py /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul /tmp/cachetrace.log
CircularCacheSimulation, cache size 20,971,520 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2982589 37922 41517 92.7% 37761 99.9
......@@ -97,7 +98,7 @@ the small side::
Again a very nice improvement in hit rate, and there's not a lot of room left
for improvement. Let's try 100 MB::
$ python simul.py -s 100 /tmp/cachetrace.log
    $ python -m ZEO.scripts.cache_simul -s 100 /tmp/cachetrace.log
CircularCacheSimulation, cache size 104,857,600 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 3218741 39572 41517 100.0% 22778 100.0
......@@ -115,7 +116,7 @@ never loaded again. If, for example, a third of the objects are loaded only
once, it's quite possible for the theoretical maximum hit rate to be 67%, no
matter how large the cache.
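The arithmetic behind that ceiling is simple enough to sketch (the numbers below are hypothetical, chosen to match the one-third example above):

```python
def max_hit_rate(total_loads, one_shot_loads):
    """Upper bound on the cache hit rate, in percent.

    Loads of objects that are never loaded again can never be
    satisfied from the cache, no matter how large it is.  (This
    ignores the compulsory first miss on repeated objects, so the
    real ceiling is slightly lower still.)
    """
    return 100.0 * (total_loads - one_shot_loads) / total_loads

# If a third of all loads touch objects that are loaded exactly once:
print("%.1f%%" % max_hit_rate(3000000, 1000000))  # ~66.7%
```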
The simul.py script also contains code to simulate different cache
The cache_simul.py script also contains code to simulate different cache
strategies. Since none of these are implemented, and only the default cache
strategy's code has been updated to be aware of MVCC, these are not further
documented here.
......
......@@ -12,8 +12,6 @@
#
##############################################################################
"""Monitor behavior of ZEO server and record statistics.
$Id$
"""
from __future__ import print_function
......
......@@ -43,7 +43,6 @@ import socket
import logging
import ZConfig.datatypes
import ZEO
from zdaemon.zdoptions import ZDOptions
logger = logging.getLogger('ZEO.runzeo')
......@@ -115,7 +114,7 @@ class ZEOOptions(ZDOptions, ZEOOptionsMixin):
__doc__ = __doc__
logsectionname = "eventlog"
schemadir = os.path.dirname(ZEO.__file__)
schemadir = os.path.dirname(__file__)
def __init__(self):
ZDOptions.__init__(self)
......@@ -337,7 +336,7 @@ class ZEOServer:
def create_server(storages, options):
from ZEO.StorageServer import StorageServer
from .StorageServer import StorageServer
return StorageServer(
options.address,
storages,
......
......@@ -12,14 +12,9 @@
# FOR A PARTICULAR PURPOSE
#
##############################################################################
"""Cache simulation.
Usage: simul.py [-s size] tracefile
"""
Cache simulation.
Options:
-s size: cache size in MB (default 20 MB)
-i: summarizing interval in minutes (default 15; max 60)
-r: rearrange factor
Note:
......@@ -27,102 +22,52 @@ Note:
- The simulation will be far off if the trace file
was created starting with a non-empty cache
"""
from __future__ import print_function
from __future__ import print_function, absolute_import
import bisect
import getopt
import struct
import re
import sys
import ZEO.cache
import argparse
from ZODB.utils import z64
from ZODB.utils import z64, u64
from .cache_stats import add_interval_argument
from .cache_stats import add_tracefile_argument
# we assign ctime locally to facilitate test replacement!
from time import ctime
import six
def usage(msg):
print(msg, file=sys.stderr)
print(__doc__, file=sys.stderr)
def main(args=None):
if args is None:
args = sys.argv[1:]
# Parse options.
MB = 1<<20
cachelimit = 20*MB
rearrange = 0.8
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--size", "-s",
default=20*MB, dest="cachelimit",
type=lambda s: int(float(s)*MB),
help="cache size in MB (default 20MB)")
add_interval_argument(parser)
parser.add_argument("--rearrange", "-r",
default=0.8, type=float,
help="rearrange factor")
add_tracefile_argument(parser)
simclass = CircularCacheSimulation
interval_step = 15
try:
opts, args = getopt.getopt(args, "s:i:r:")
except getopt.error as msg:
usage(msg)
return 2
for o, a in opts:
if o == '-s':
cachelimit = int(float(a)*MB)
elif o == '-i':
interval_step = int(a)
elif o == '-r':
rearrange = float(a)
else:
assert False, (o, a)
interval_step *= 60
if interval_step <= 0:
interval_step = 60
elif interval_step > 3600:
interval_step = 3600
if len(args) != 1:
usage("exactly one file argument required")
return 2
filename = args[0]
# Open file.
if filename.endswith(".gz"):
# Open gzipped file.
try:
import gzip
except ImportError:
print("can't read gzipped files (no module gzip)", file=sys.stderr)
return 1
try:
f = gzip.open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
elif filename == "-":
# Read from stdin.
f = sys.stdin
else:
# Open regular file.
try:
f = open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
options = parser.parse_args(args)
f = options.tracefile
interval_step = options.interval
# Create simulation object.
sim = simclass(cachelimit, rearrange)
interval_sim = simclass(cachelimit, rearrange)
sim = simclass(options.cachelimit, options.rearrange)
interval_sim = simclass(options.cachelimit, options.rearrange)
# Print output header.
sim.printheader()
......
......@@ -14,18 +14,7 @@ from __future__ import print_function
##############################################################################
"""Trace file statistics analyzer.
Usage: stats.py [-h] [-i interval] [-q] [-s] [-S] [-v] [-X] tracefile
-h: print histogram of object load frequencies
-i: summarizing interval in minutes (default 15; max 60)
-q: quiet; don't print summaries
-s: print histogram of object sizes
-S: don't print statistics
-v: verbose; print each record
-X: enable heuristic checking for misaligned records: oids > 2**32
will be rejected; this requires the tracefile to be seekable
"""
"""File format:
File format:
Each record is 26 bytes, plus a variable number of bytes to store an oid,
with the following layout. Numbers are big-endian integers.
......@@ -58,84 +47,80 @@ i.e. the low bit is always zero.
"""
import sys
import time
import getopt
import argparse
import struct
import gzip
# we assign ctime locally to facilitate test replacement!
from time import ctime
import six
def usage(msg):
print(msg, file=sys.stderr)
print(__doc__, file=sys.stderr)
def add_interval_argument(parser):
def _interval(a):
interval = int(60 * float(a))
if interval <= 0:
interval = 60
elif interval > 3600:
interval = 3600
return interval
parser.add_argument("--interval", "-i",
default=15*60, type=_interval,
help="summarizing interval in minutes (default 15; max 60)")
def add_tracefile_argument(parser):
class GzipFileType(argparse.FileType):
def __init__(self):
super(GzipFileType, self).__init__(mode='rb')
def __call__(self, s):
f = super(GzipFileType, self).__call__(s)
if s.endswith(".gz"):
f = gzip.GzipFile(filename=s, fileobj=f)
return f
parser.add_argument("tracefile", type=GzipFileType(),
help="The trace to read; may be gzipped")
def main(args=None):
if args is None:
args = sys.argv[1:]
# Parse options
verbose = False
quiet = False
dostats = True
print_size_histogram = False
print_histogram = False
interval = 15*60 # Every 15 minutes
heuristic = False
try:
opts, args = getopt.getopt(args, "hi:qsSvX")
except getopt.error as msg:
usage(msg)
return 2
for o, a in opts:
if o == '-h':
print_histogram = True
elif o == "-i":
interval = int(60 * float(a))
if interval <= 0:
interval = 60
elif interval > 3600:
interval = 3600
elif o == "-q":
quiet = True
verbose = False
elif o == "-s":
print_size_histogram = True
elif o == "-S":
dostats = False
elif o == "-v":
verbose = True
elif o == '-X':
heuristic = True
else:
assert False, (o, opts)
if len(args) != 1:
usage("exactly one file argument required")
return 2
filename = args[0]
# Open file
if filename.endswith(".gz"):
# Open gzipped file
try:
import gzip
except ImportError:
print("can't read gzipped files (no module gzip)", file=sys.stderr)
return 1
try:
f = gzip.open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
elif filename == '-':
# Read from stdin
f = sys.stdin
else:
# Open regular file
try:
f = open(filename, "rb")
except IOError as msg:
print("can't open %s: %s" % (filename, msg), file=sys.stderr)
return 1
parser = argparse.ArgumentParser(description="Trace file statistics analyzer",
# Our -h, short for --load-histogram
# conflicts with default for help, so we handle
# manually.
add_help=False)
verbose_group = parser.add_mutually_exclusive_group()
verbose_group.add_argument('--verbose', '-v',
default=False, action='store_true',
help="Be verbose; print each record")
verbose_group.add_argument('--quiet', '-q',
default=False, action='store_true',
help="Reduce output; don't print summaries")
parser.add_argument("--sizes", '-s',
default=False, action="store_true", dest="print_size_histogram",
help="print histogram of object sizes")
parser.add_argument("--no-stats", '-S',
default=True, action="store_false", dest="dostats",
help="don't print statistics")
parser.add_argument("--load-histogram", "-h",
default=False, action="store_true", dest="print_histogram",
help="print histogram of object load frequencies")
parser.add_argument("--check", "-X",
default=False, action="store_true", dest="heuristic",
help=" enable heuristic checking for misaligned records: oids > 2**32"
" will be rejected; this requires the tracefile to be seekable")
add_interval_argument(parser)
add_tracefile_argument(parser)
if '--help' in args:
parser.print_help()
sys.exit(2)
options = parser.parse_args(args)
f = options.tracefile
rt0 = time.time()
bycode = {} # map code to count of occurrences
......@@ -169,7 +154,7 @@ def main(args=None):
ts, code, oidlen, start_tid, end_tid = unpack(FMT, r)
if ts == 0:
# Must be a misaligned record caused by a crash.
if not quiet:
if not options.quiet:
print("Skipping 8 bytes at offset", f.tell() - FMT_SIZE)
f.seek(f.tell() - FMT_SIZE + 8)
continue
......@@ -179,14 +164,14 @@ def main(args=None):
records += 1
if t0 is None:
t0 = ts
thisinterval = t0 // interval
thisinterval = t0 // options.interval
h0 = he = ts
te = ts
if ts // interval != thisinterval:
if not quiet:
if ts // options.interval != thisinterval:
if not options.quiet:
dumpbyinterval(byinterval, h0, he)
byinterval = {}
thisinterval = ts // interval
thisinterval = ts // options.interval
h0 = ts
he = ts
dlen, code = (code & 0x7fffff00) >> 8, code & 0xff
......@@ -208,7 +193,7 @@ def main(args=None):
elif code & 0x70 == 0x50: # All stores
bysizew[dlen] = d = bysizew.get(dlen) or {}
d[oid] = d.get(oid, 0) + 1
if verbose:
if options.verbose:
print("%s %02x %s %016x %016x %c%s" % (
ctime(ts)[4:-5],
code,
......@@ -221,12 +206,12 @@ def main(args=None):
oids[oid] = oids.get(oid, 0) + 1
total_loads += 1
elif code == 0x00: # restart
if not quiet:
if not options.quiet:
dumpbyinterval(byinterval, h0, he)
byinterval = {}
thisinterval = ts // interval
thisinterval = ts // options.interval
h0 = he = ts
if not quiet:
if not options.quiet:
print(ctime(ts)[4:-5], end=' ')
print('='*20, "Restart", '='*20)
except KeyboardInterrupt:
......@@ -235,7 +220,7 @@ def main(args=None):
end_pos = f.tell()
f.close()
rte = time.time()
if not quiet:
if not options.quiet:
dumpbyinterval(byinterval, h0, he)
# Error if nothing was read
......@@ -244,7 +229,7 @@ def main(args=None):
return 1
# Print statistics
if dostats:
if options.dostats:
print()
print("Read %s trace records (%s bytes) in %.1f seconds" % (
addcommas(records), addcommas(end_pos), rte-rt0))
......@@ -267,7 +252,7 @@ def main(args=None):
explain.get(code) or "*** unknown code ***"))
# Print histogram.
if print_histogram:
if options.print_histogram:
print()
print("Histogram of object load frequency")
total = len(oids)
......@@ -287,7 +272,7 @@ def main(args=None):
obj_percent, load_percent, cum))
# Print size histogram.
if print_size_histogram:
if options.print_size_histogram:
print()
print("Histograms of object sizes")
print()
......