ZEO caching of blob data
========================

ZEO supports 2 modes for providing clients access to blob data:

shared
    Blob data are shared via a network file system.  The client shares
    a common blob directory with the server.

non-shared
    Blob data are loaded from the storage server and cached locally.
    A maximum size for the blob data can be set and data are removed
    when the size is exceeded.

In this test, we'll demonstrate that blob data are removed from a ZEO
cache when the amount of data stored exceeds a given limit.

Let's start by setting up some data:

>>> addr, _ = start_server(blob_dir='server-blobs')

We'll also create a client:

>>> import ZEO
>>> db = ZEO.DB(addr, blob_dir='blobs',
...             blob_cache_size=4000, blob_cache_size_check=10)

Here, we passed a blob_cache_size parameter, which specifies a target
blob cache size.  This is not a hard limit, but rather a target.  It
defaults to a very large value.

We also passed a blob_cache_size_check option.  The
blob_cache_size_check option specifies the number of bytes, as a
percent of the target, that can be written or downloaded from the
server before the cache size is checked.  The blob_cache_size_check
option defaults to 100.  We passed 10, to check after writing 10% of
the target size.

We want to check for name collisions in the blob cache directory.
We'll try to provoke name collisions by reducing the number of cache
directory subdirectories:

>>> import ZEO.ClientStorage
>>> orig_blob_cache_layout_size = ZEO.ClientStorage.BlobCacheLayout.size
>>> ZEO.ClientStorage.BlobCacheLayout.size = 11

Now, let's write some data:

>>> import ZODB.blob, transaction, time
>>> conn = db.open()
>>> for i in range(1, 101):
...     conn.root()[i] = ZODB.blob.Blob()
...     conn.root()[i].open('w').write(chr(i)*100)
>>> transaction.commit()

We've committed 10000 bytes of data, but our target size is 4000.  We
expect to have not much more than the target size in the cache blob
directory:

>>> import os
>>> def cache_size(d):
...     size = 0
...     for base, dirs, files in os.walk(d):
...         for f in files:
...             if f.endswith('.blob'):
...                 size += os.stat(os.path.join(base, f)).st_size
...     return size

>>> db.storage._check_blob_size_thread.join()

>>> cache_size('blobs') < 5000
True

If we read all of the blobs, data will be downloaded again, as
necessary, but the cache size will remain not much bigger than the
target:

>>> for i in range(1, 101):
...     data = conn.root()[i].open().read()
...     if data != chr(i)*100:
...         print 'bad data', `chr(i)`, `data`

>>> db.storage._check_blob_size_thread.join()

>>> cache_size('blobs') < 5000
True

Reading the blobs again (plain reads, then 'c'-mode reads, then reads
of the committed files directly) keeps the cache bounded as well:

>>> for i in range(1, 101):
...     data = conn.root()[i].open().read()
...     if data != chr(i)*100:
...         print 'bad data', `chr(i)`, `data`

>>> db.storage._check_blob_size_thread.join()

>>> for i in range(1, 101):
...     data = conn.root()[i].open('c').read()
...     if data != chr(i)*100:
...         print 'bad data', `chr(i)`, `data`

>>> db.storage._check_blob_size_thread.join()

>>> cache_size('blobs') < 5000
True

>>> for i in range(1, 101):
...     data = open(conn.root()[i].committed(), 'rb').read()
...     if data != chr(i)*100:
...         print 'bad data', `chr(i)`, `data`

>>> db.storage._check_blob_size_thread.join()

>>> cache_size('blobs') < 5000
True
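Under the hood, the cache check reduces the directory by deleting the
least recently accessed blob files until the total is back under the
target.  Here's a minimal sketch of the idea (illustrative only:
prune_blob_cache is a made-up name, and the reliance on st_atime is an
assumption, not ZEO's exact algorithm)::

    import os

    def prune_blob_cache(blob_dir, target):
        # Collect cached blob files along with access time and size.
        entries = []
        total = 0
        for base, dirs, files in os.walk(blob_dir):
            for f in files:
                if f.endswith('.blob'):
                    path = os.path.join(base, f)
                    st = os.stat(path)
                    entries.append((st.st_atime, st.st_size, path))
                    total += st.st_size
        # Delete the least recently accessed files first, stopping as
        # soon as we are back under the target size.
        for atime, size, path in sorted(entries):
            if total <= target:
                break
            os.remove(path)
            total -= size

The real client also has to cope with blob files that are currently
open or locked by other threads and processes, which is exactly what
the stress test below pokes at.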
Now let's see if we can stress things a bit.  We'll create many
clients and get them to pound on the blobs all at once to see if we
can provoke problems:

>>> import threading, random
>>> def run():
...     db = ZEO.DB(addr, blob_dir='blobs', blob_cache_size=4000)
...     conn = db.open()
...     for i in range(300):
...         time.sleep(0)
...         i = random.randint(1, 100)
...         data = conn.root()[i].open().read()
...         if data != chr(i)*100:
...             print 'bad data', `chr(i)`, `data`
...         i = random.randint(1, 100)
...         data = conn.root()[i].open('c').read()
...         if data != chr(i)*100:
...             print 'bad data', `chr(i)`, `data`
...     db._storage._check_blob_size_thread.join()
...     db.close()

>>> threads = [threading.Thread(target=run) for i in range(10)]
>>> for thread in threads:
...     thread.setDaemon(True)
>>> for thread in threads:
...     thread.start()
>>> for thread in threads:
...     thread.join()

>>> cache_size('blobs') < 5000
True

.. cleanup

    >>> db.close()
    >>> ZEO.ClientStorage.BlobCacheLayout.size = orig_blob_cache_layout_size
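A final note on the collision setup above: a cache layout maps each
oid to one of BlobCacheLayout.size subdirectories, so shrinking the
bucket count to 11 forces many of our 100 blobs to share directories.
A rough sketch of such a bucketing scheme (cache_path and its
arguments are made up for illustration, not ZEO's actual layout
code)::

    import os

    CACHE_SUBDIRS = 11  # mirrors the reduced BlobCacheLayout.size above

    def cache_path(blob_dir, oid_as_int, tid_hex):
        # The oid picks one of CACHE_SUBDIRS bucket directories; the
        # quotient and tid keep file names unique within the bucket.
        base, bucket = divmod(oid_as_int, CACHE_SUBDIRS)
        return os.path.join(blob_dir, str(bucket),
                            '%s.%s.blob' % (base, tid_hex))

With only 11 buckets, oids 5 and 16 both land in directory ``5``, so
the cache code has to handle files for different oids living side by
side.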