===========================================
How to use NFS to make Blobs more efficient
===========================================

:Author: Christian Theune <ct@gocept.com>

Overview
========

When handling blobs, the main goal is to avoid write operations that transfer
the blob data again and use up IO resources.

When bringing a blob into the system, at least one O(N) operation has to
happen, e.g. when the blob is uploaded via a network server. The blob should
be extracted as a file on the final storage volume as early as possible,
avoiding further copies.

In a ZEO setup, all data is stored on a networked server and passed to it
using zrpc. This is a major problem for handling blobs, because it blocks all
transactions from committing while a single large blob is being stored. This
default mechanism works, but it is not recommended for high-volume
installations.

Shared filesystem
=================

The solution to the transfer problem is to set up the storage parameters so
that blobs are always handled on a single volume that is shared over the
network between ZEO servers and clients.

Step 1: Set up a writable shared filesystem for ZEO server and client
----------------------------------------------------------------------

On the ZEO server, create two directories on the volume that will be used by
this setup (assume the volume is accessible via $SERVER/):

- $SERVER/blobs

- $SERVER/tmp

Then export the $SERVER directory using a shared network filesystem like NFS.
Make sure it is writable by the ZEO clients.

Assume the exported directory is available on the client as $CLIENT.

Step 2: Application temporary directories
-----------------------------------------

Applications (i.e. Zope) will put uploaded data in a temporary directory first.
Adjust your TMPDIR, TMP or TEMP environment variable to point to the shared
filesystem::

    $ export TMPDIR=$CLIENT/tmp

Step 3: ZEO client caches
-------------------------

Edit the file `zope.conf` on the ZEO client and adjust the configuration of
the `zeoclient` storage with two new variables::

    blob-dir = $CLIENT/blobs
    blob-cache-writable = yes

Step 4: ZEO server
------------------

Edit the file `zeo.conf` on the ZEO server to configure the blob directory.
Assuming the published storage of the ZEO server is a file storage, the
configuration should look like this::

    <blobstorage 1>
      <filestorage>
        path $INSTANCE/var/Data.fs
      </filestorage>
      blob-dir $SERVER/blobs
    </blobstorage>

(Remember to manually replace $SERVER and $CLIENT with the exported directory
as accessible by either the ZEO server or the ZEO client.)

Conclusion
----------

At this point, after restarting your ZEO server and clients, the blob
directory will be shared and a minimum amount of IO will occur when working
with blobs.
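
The setup only avoids extra copies if the temporary directory and the blob
directory really live on the same writable volume, so that an uploaded file can
be moved into place with a rename. The following is a minimal sketch of such a
check, to be run on a ZEO client. It is not part of ZODB or ZEO: the function
name ``check_shared_volume`` is hypothetical, and comparing device numbers is
only an assumption about how to detect "same volume".

```python
import os
import tempfile


def check_shared_volume(tmp_dir, blob_dir):
    """Return a list of problems; an empty list means the check passed.

    Hypothetical helper, not part of ZODB/ZEO.
    """
    problems = []

    # Both directories must exist and be writable by the current user.
    for path in (tmp_dir, blob_dir):
        if not os.path.isdir(path):
            problems.append("%s is not a directory" % path)
        elif not os.access(path, os.W_OK | os.X_OK):
            problems.append("%s is not writable" % path)
    if problems:
        return problems

    # Assumption: the same device number means the same volume, so a
    # rename between the two directories does not copy the file data.
    if os.stat(tmp_dir).st_dev != os.stat(blob_dir).st_dev:
        problems.append(
            "%s and %s are on different filesystems" % (tmp_dir, blob_dir)
        )
        return problems

    # Confirm it by actually renaming an empty probe file across.
    fd, src = tempfile.mkstemp(dir=tmp_dir)
    os.close(fd)
    dst = os.path.join(blob_dir, os.path.basename(src))
    try:
        os.rename(src, dst)
    except OSError as exc:
        problems.append("rename from tmp to blobs failed: %s" % exc)
        os.remove(src)
    else:
        os.remove(dst)
    return problems
```

On a client it could be called as
``check_shared_volume(os.environ["TMPDIR"], blob_dir)``, where ``blob_dir`` is
the path that $CLIENT/blobs stands for in your installation. Any returned
message suggests that uploads would still be copied instead of moved.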