ZEO caching of blob data
========================

ZEO supports 2 modes for providing clients access to blob data:

shared
    Blob data are shared via a network file system.  The client shares
    a common blob directory with the server.

non-shared
    Blob data are loaded from the storage server and cached locally.
    A maximum size for the blob data can be set and data are removed
    when the size is exceeded.
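
For illustration, here is a rough sketch of how a client might be set
up in each mode (assuming, as the rest of this test does, that ZEO.DB
forwards keyword arguments such as blob_dir and shared_blob_dir to the
client storage; the address below is hypothetical):

    import ZEO

    addr = ('localhost', 8100)  # hypothetical server address

    # Shared mode: the client opens blob files directly from a blob
    # directory it shares with the server (e.g. over a network file
    # system); nothing is copied into or evicted from a local cache.
    shared_db = ZEO.DB(addr, blob_dir='/path/to/server-blobs',
                       shared_blob_dir=True)

    # Non-shared mode (the mode exercised below): blob data are
    # downloaded from the server into a local cache directory that is
    # pruned back toward blob_cache_size.
    cached_db = ZEO.DB(addr, blob_dir='blobs', blob_cache_size=4000)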

In this test, we'll demonstrate that blob data are removed from a ZEO
blob cache when the amount of data stored exceeds a given limit.

Let's start by setting up some data:

    >>> addr, _ = start_server(blob_dir='server-blobs')

We'll also create a client.

    >>> import ZEO
    >>> db = ZEO.DB(addr, blob_dir='blobs',
    ...             blob_cache_size=4000, blob_cache_size_check=10)

Here, we passed a blob_cache_size parameter, which specifies a target
blob cache size.  This is not a hard limit, but rather a target; it
defaults to a very large value.  We also passed a blob_cache_size_check
option, which specifies the number of bytes, as a percent of the
target, that can be written or downloaded from the server before the
cache size is checked.  It defaults to 100.  We passed 10, so the
cache size is checked after each 10% of the target size is written or
downloaded.
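
With these values, a cache-size check is due after roughly 10% of the
target size has passed through the cache.  A sketch of the implied
arithmetic (not the actual implementation):

    blob_cache_size = 4000        # target cache size, in bytes
    blob_cache_size_check = 10    # percent of the target between checks

    # Bytes of blob data written or downloaded between cache-size checks:
    check_interval = blob_cache_size * blob_cache_size_check // 100
    # check_interval == 400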

We want to check for name collisions in the blob cache directory.
We'll try to provoke name collisions by reducing the number of cache
directory subdirectories.

    >>> import ZEO.ClientStorage
    >>> orig_blob_cache_layout_size = ZEO.ClientStorage.BlobCacheLayout.size
    >>> ZEO.ClientStorage.BlobCacheLayout.size = 11
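
To see why a smaller layout size makes collisions more likely, note
that the cache layout spreads blob files over a fixed number of
subdirectories.  A hypothetical sketch (assuming, for illustration,
that blobs land in buckets by OID modulo the layout size; the actual
BlobCacheLayout may differ):

    # With only 11 buckets, the 100 blobs written below crowd roughly
    # ten to a bucket, so many different blobs share each subdirectory,
    # which is exactly the situation we want to provoke.
    size = 11
    buckets = {}
    for oid in range(1, 101):
        buckets.setdefault(oid % size, []).append(oid)
    largest = max(len(oids) for oids in buckets.values())
    # largest is about 10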

Now, let's write some data:

    >>> import ZODB.blob, transaction, time
    >>> conn = db.open()
    >>> for i in range(1, 101):
    ...     conn.root()[i] = ZODB.blob.Blob()
    ...     conn.root()[i].open('w').write(chr(i)*100)
    >>> transaction.commit()

We've committed 10000 bytes of data, but our target size is 4000.  We
expect to have not much more than the target size in the blob cache
directory.  (The cache is pruned by a background thread, so we wait
for that thread to finish before measuring.)

    >>> import os
    >>> def cache_size(d):
    ...     size = 0
    ...     for base, dirs, files in os.walk(d):
    ...         for f in files:
    ...             if f.endswith('.blob'):
    ...                 size += os.stat(os.path.join(base, f)).st_size
    ...     return size
    
    >>> db.storage._check_blob_size_thread.join()

    >>> cache_size('blobs') < 5000
    True

If we read all of the blobs, data will be downloaded again, as
necessary, but the cache size will remain not much bigger than the
target:

    >>> for i in range(1, 101):
    ...     data = conn.root()[i].open().read()
    ...     if data != chr(i)*100:
    ...         print 'bad data', `chr(i)`, `data`

    >>> db.storage._check_blob_size_thread.join()

    >>> cache_size('blobs') < 5000
    True

    >>> for i in range(1, 101):
    ...     data = conn.root()[i].open().read()
    ...     if data != chr(i)*100:
    ...         print 'bad data', `chr(i)`, `data`

    >>> db.storage._check_blob_size_thread.join()
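
The cache should again be not much bigger than the target:

    >>> cache_size('blobs') < 5000
    True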

    >>> for i in range(1, 101):
    ...     data = conn.root()[i].open('c').read()
    ...     if data != chr(i)*100:
    ...         print 'bad data', `chr(i)`, `data`

    >>> db.storage._check_blob_size_thread.join()

    >>> cache_size('blobs') < 5000
    True

    >>> for i in range(1, 101):
    ...     data = open(conn.root()[i].committed(), 'rb').read()
    ...     if data != chr(i)*100:
    ...         print 'bad data', `chr(i)`, `data`

    >>> db.storage._check_blob_size_thread.join()

    >>> cache_size('blobs') < 5000
    True

Now let's see if we can stress things a bit.  We'll create many
clients and have them pound on the blobs all at once to see if we can
provoke problems:

    >>> import threading, random
    >>> def run():
    ...     db = ZEO.DB(addr, blob_dir='blobs', blob_cache_size=4000)
    ...     conn = db.open()
    ...     for i in range(300):
    ...         time.sleep(0)
    ...         i = random.randint(1, 100)
    ...         data = conn.root()[i].open().read()
    ...         if data != chr(i)*100:
    ...             print 'bad data', `chr(i)`, `data`
    ...         i = random.randint(1, 100)
    ...         data = conn.root()[i].open('c').read()
    ...         if data != chr(i)*100:
    ...             print 'bad data', `chr(i)`, `data`
    ...     db._storage._check_blob_size_thread.join()
    ...     db.close()

    >>> threads = [threading.Thread(target=run) for i in range(10)]
    >>> for thread in threads:
    ...     thread.setDaemon(True)
    >>> for thread in threads:
    ...     thread.start()
    >>> for thread in threads:
    ...     thread.join()

    >>> cache_size('blobs') < 5000
    True

.. cleanup

    >>> db.close()
    >>> ZEO.ClientStorage.BlobCacheLayout.size = orig_blob_cache_layout_size