<!--$Id: throughput.so,v 10.24 2000/12/04 18:05:44 bostic Exp $-->
<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
<!--All rights reserved.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Transaction throughput</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
</head>
<body bgcolor=white>
        <a name="2"><!--meow--></a>    
<table><tr valign=top>
<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td>
<td width="1%"><a href="../../ref/transapp/reclimit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/intro.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p>
<h1 align=center>Transaction throughput</h1>
<p>Generally, the speed of a database system is measured by the transaction
throughput, expressed as the number of transactions per second.  The two
gating factors for Berkeley DB performance in a transactional system are usually
the underlying database files and the log file.  Both are factors because
they require disk I/O, which is slow relative to other system resources
like CPU.
<p>In the worst-case scenario:
<ul type=disc>
<li>Database access is truly random and the database is too large to fit into
the cache, resulting in a single I/O per requested key/data pair.
<li>Both the database and the log are on a single disk.
</ul>
<p>This means that for each transaction, Berkeley DB is potentially performing
several filesystem operations:
<ul type=disc>
<li>Disk seek to database file.
<li>Database file read.
<li>Disk seek to log file.
<li>Log file write.
<li>Flush log file information to disk.
<li>Disk seek to update log file metadata (e.g., inode).
<li>Log metadata write.
<li>Flush log file metadata to disk.
</ul>
<p>There are a number of ways to increase transactional throughput, all of
which attempt to decrease the number of filesystem operations per
transaction:
<ul type=disc>
<li>Tune the size of the database cache.  If the Berkeley DB key/data pairs used
during the transaction are found in the database cache, the seek and read
from the database are no longer necessary, resulting in two fewer
filesystem operations per transaction.  To determine if your cache size
is too small, see <a href="../../ref/am_conf/cachesize.html">Selecting a
cache size</a>.
<li>Put the database and the log files on different disks.  This allows reads
and writes to the log files and the database files to be performed
concurrently.
<li>Set the filesystem configuration so that file access and modification
times are not updated.  Berkeley DB does not use the file access and
modification times, but other programs may, so make this change with
care.
<li>Upgrade your hardware.  When considering the hardware on which to run your
application, however, it is important to consider the entire system.  The
controller and bus can have as much to do with the disk performance as
the disk itself.  It is also important to remember that raw disk transfer
rate is rarely the limiting factor, and that disk seek times are normally
the true performance issue for Berkeley DB.
<li>Turn on the <a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag.  This changes the Berkeley DB behavior
so that the log files are not flushed when transactions are committed.
While this change will greatly increase your transaction throughput, it
means that transactions will exhibit the ACI (atomicity, consistency and
isolation) properties, but not D (durability).  Database integrity will
be maintained but it is possible that some number of the most recently
committed transactions may be undone during recovery instead of being
redone.
</ul>
<p>If you are bottlenecked on logging, the following test will help you
confirm that the number of transactions per second your application
achieves is reasonable for the hardware on which you're running.  Your test
program should repeatedly perform the following operations:
<ul type=disc>
<li>Seek to the beginning of a file.
<li>Write to the file.
<li>Flush the file write to disk.
</ul>
<p>The number of times that you can perform these three operations per second
is a rough measure of the number of transactions per second of which the
hardware is capable.  This test simulates the operations applied to the
log file.  (To keep the experiment simple, we assume that the database
files are either on a separate disk or fit, with few exceptions, into
the database cache.)  We do not have to directly
simulate updating the log file directory information, as it will normally
be updated and flushed to disk as a result of flushing the log file write
to disk.
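<p>The seek, write and flush loop described above can be sketched in C for
POSIX systems roughly as follows.  This is a minimal illustration, not the
test program shipped with the Berkeley DB distribution: the function name
<b>write_test</b>, its parameters and its error convention are assumptions
made for this example, and any results quoted in this section were not
produced by it.

```c
/* Request POSIX/XSI declarations (fsync, gettimeofday) on strict compilers. */
#define _XOPEN_SOURCE 700

#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * write_test --
 *	Hypothetical helper: repeat "seek to start, write, flush" ops times,
 *	writing bytes bytes each time.  Returns operations per second, or
 *	-1.0 on invalid arguments or I/O error.
 */
double
write_test(const char *path, size_t bytes, int ops)
{
    struct timeval start, end;
    double elapsed;
    char *buf;
    int fd, i;

    if (bytes == 0 || ops <= 0)
        return (-1.0);
    if ((buf = malloc(bytes)) == NULL)
        return (-1.0);
    memset(buf, 'a', bytes);

    if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600)) == -1) {
        free(buf);
        return (-1.0);
    }

    gettimeofday(&start, NULL);
    for (i = 0; i < ops; ++i) {
        /* Seek to the beginning of the file, write, then flush to disk. */
        if (lseek(fd, (off_t)0, SEEK_SET) == -1 ||
            write(fd, buf, bytes) != (ssize_t)bytes ||
            fsync(fd) == -1) {
            close(fd);
            free(buf);
            return (-1.0);
        }
    }
    gettimeofday(&end, NULL);

    close(fd);
    free(buf);

    elapsed = (double)(end.tv_sec - start.tv_sec) +
        (double)(end.tv_usec - start.tv_usec) / 1e6;
    if (elapsed <= 0.0)         /* Guard against a zero-length interval. */
        elapsed = 1e-6;
    return ((double)ops / elapsed);
}
```

<p>Calling <b>write_test</b> with a scratch file on the disk that holds the
log, a write size close to your average log write, and enough operations to
smooth out noise gives a rough operations-per-second figure.  The result
depends on whether the operating system and the disk actually honor the
flush request, so treat it as an estimate.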
<p>Running this test program with 256-byte writes for 1000 operations
on reasonably standard commodity hardware (Pentium II CPU, SCSI disk)
returned the following results:
<p><blockquote><pre>% testfile -b256 -o1000
running: 1000 ops
Elapsed time: 16.641934 seconds
1000 ops:   60.09 ops per second</pre></blockquote>
<p>Note that the number of bytes being written to the log as part of each
transaction can dramatically affect the transaction throughput.  The
above test run wrote 256 bytes per operation, which is a reasonable log
write size.  Your log writes may be different.  To determine your average
log write size,
use the <a href="../../utility/db_stat.html">db_stat</a> utility to display your log statistics.
<p>As a quick sanity check, for this particular disk, the average seek time
is 9.4 msec, and the average latency is 4.17 msec.  Each data transfer to
the disk therefore takes at least 13.57 msec, for a maximum of about
74 transfers per second.  This is close enough to the above 60
operations per second (which wasn't done on a quiescent disk) that the
number is believable.
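<p>The sanity check above is simple arithmetic, and can be repeated for
another disk with a small helper such as the following.  The function name
<b>max_transfers_per_sec</b> is an assumption made for this example.  The
model charges one average seek plus one average rotational latency per
transfer and ignores the time spent moving the data itself.

```c
/*
 * max_transfers_per_sec --
 *	Hypothetical helper: upper bound on disk transfers per second when
 *	each transfer costs one average seek plus one average rotational
 *	latency, both given in milliseconds.  Returns 0.0 if the combined
 *	cost is not positive.
 */
double
max_transfers_per_sec(double seek_ms, double latency_ms)
{
    double per_transfer_ms;

    per_transfer_ms = seek_ms + latency_ms;
    if (per_transfer_ms <= 0.0)
        return (0.0);

    /* 1000 milliseconds per second divided by the cost of one transfer. */
    return (1000.0 / per_transfer_ms);
}
```

<p>With the figures for the disk discussed here, a seek time of 9.4 msec and
a latency of 4.17 msec, the helper returns a value just under 74.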
<p>An implementation of the above <a href="writetest.txt">example test
program</a> for IEEE/ANSI Std 1003.1 (POSIX) standard systems is included in the Berkeley DB
distribution.
<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/reclimit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/intro.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
</body>
</html>