<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Transaction throughput</title>
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
<link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
<link rel="up" href="transapp.html" title="Chapter 11. Berkeley DB Transactional Data Store Applications" />
<link rel="prev" href="transapp_tune.html" title="Transaction tuning" />
<link rel="next" href="transapp_faq.html" title="Transaction FAQ" />
</head>
<body>
<div xmlns="" class="navheader">
<div class="libver">
<p>Library Version 11.2.5.3</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center">Transaction throughput</th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="transapp_tune.html">Prev</a> </td>
<th width="60%" align="center">Chapter 11. Berkeley DB Transactional Data Store Applications</th>
<td width="20%" align="right"> <a accesskey="n" href="transapp_faq.html">Next</a></td>
</tr>
</table>
<hr />
</div>
<div class="sect1" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title" style="clear: both"><a id="transapp_throughput"></a>Transaction throughput</h2>
</div>
</div>
</div>
<p>Generally, the speed of a database system is measured by the
<span class="emphasis"><em>transaction throughput</em></span>, expressed as a number of
transactions per second. The two gating factors for Berkeley DB performance
in a transactional system are usually the underlying database files and
the log file. Both are factors because they require disk I/O, which is
slow relative to other system resources such as CPU.</p>
<p>In the worst-case scenario:</p>
<div class="itemizedlist">
<ul type="disc">
<li>Database access is truly random and the database is too large for any
significant percentage of it to fit into the cache, resulting in a
single I/O per requested key/data pair.</li>
<li>Both the database and the log are on a single disk.</li>
</ul>
</div>
<p>This means that for each transaction, Berkeley DB is potentially performing
several filesystem operations:</p>
<div class="itemizedlist">
<ul type="disc">
<li>Disk seek to database file</li>
<li>Database file read</li>
<li>Disk seek to log file</li>
<li>Log file write</li>
<li>Flush log file information to disk</li>
<li>Disk seek to update log file metadata (for example, inode information)</li>
<li>Log metadata write</li>
<li>Flush log file metadata to disk</li>
</ul>
</div>
<p>There are a number of ways to increase transactional throughput, all of
which attempt to decrease the number of filesystem operations per
transaction. First, the Berkeley DB software includes support for
<span class="emphasis"><em>group commit</em></span>. Group commit simply means that when the
information about one transaction is flushed to disk, the information
for any other waiting transactions will be flushed to disk at the same
time, potentially amortizing a single log write over a large number of
transactions. There are additional tuning parameters which may be
useful to application writers:</p>
<div class="itemizedlist">
<ul type="disc">
<li>Tune the size of the database cache. If the Berkeley DB key/data pairs used
during the transaction are found in the database cache, the seek and read
from the database are no longer necessary, resulting in two fewer
filesystem operations per transaction. To determine whether your cache
size is too small, see <a class="xref" href="general_am_conf.html#am_conf_cachesize" title="Selecting a cache size">Selecting a cache size</a>.</li>
<li>Put the database and the log files on different disks. This allows reads
and writes to the log files and the database files to be performed
concurrently.</li>
<li>Set the filesystem configuration so that file access and modification times
are not updated. Note that although the file access and modification times
are not used by Berkeley DB, this may affect other programs -- so be careful.</li>
<li>Upgrade your hardware. When considering the hardware on which to run your
application, however, it is important to consider the entire system. The
controller and bus can have as much to do with the disk performance as
the disk itself. It is also important to remember that throughput is
rarely the limiting factor, and that disk seek times are normally the true
performance issue for Berkeley DB.</li>
<li>
Turn on the <a href="../api_reference/C/envset_flags.html#envset_flags_DB_TXN_NOSYNC" class="olink">DB_TXN_NOSYNC</a> or <a href="../api_reference/C/envset_flags.html#set_flags_DB_TXN_WRITE_NOSYNC" class="olink">DB_TXN_WRITE_NOSYNC</a> flags (see the sketch following this list). This
changes the Berkeley DB behavior so that the log files are not written
and/or flushed when transactions are committed. Although this change
will greatly increase your transaction throughput, it means that
transactions will exhibit the ACI (atomicity, consistency, and
isolation) properties, but not D (durability). Database integrity will
be maintained, but it is possible that some number of the most recently
committed transactions may be undone during recovery instead of being
redone.
</li>
</ul>
</div>
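<p>To make the cache-size and <a href="../api_reference/C/envset_flags.html#envset_flags_DB_TXN_NOSYNC" class="olink">DB_TXN_NOSYNC</a> items concrete, the
following fragment is a minimal sketch of how an application might
configure its environment handle before opening it. The function name,
the 1GB cache size, and the single cache region are illustrative
choices, not requirements; error handling is abbreviated:</p>
<pre class="programlisting">#include &lt;db.h&gt;

/*
 * Sketch: configure an environment handle, before DB_ENV-&gt;open(),
 * for higher transaction throughput.  Returns 0 on success.
 * (The name configure_env and the sizes below are hypothetical.)
 */
int
configure_env(DB_ENV *dbenv)
{
	int ret;

	/* Give the database cache 1GB (1GB + 0 bytes), in one region. */
	if ((ret = dbenv-&gt;set_cachesize(dbenv, 1, 0, 1)) != 0)
		return (ret);

	/*
	 * Do not synchronously flush the log on transaction commit:
	 * ACI, but not D -- recently committed transactions may be
	 * undone during recovery.
	 */
	if ((ret = dbenv-&gt;set_flags(dbenv, DB_TXN_NOSYNC, 1)) != 0)
		return (ret);

	return (0);
}</pre>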
<p>If you are bottlenecked on logging, the following test will help you
confirm that the number of transactions per second your application
performs is reasonable for the hardware on which you are running. Your test
program should repeatedly perform the following operations:</p>
<div class="itemizedlist">
<ul type="disc">
<li>Seek to the beginning of a file</li>
<li>Write to the file</li>
<li>Flush the file write to disk</li>
</ul>
</div>
<p>The number of times that you can perform these three operations per
second is a rough measure of the minimum number of transactions per
second of which the hardware is capable. This test simulates the
operations applied to the log file. (As a simplifying assumption in this
experiment, we assume that the database files are either on a separate
disk, or that they fit, with few exceptions, into the database
cache.) We do not have to directly simulate updating the log file
directory information, because it will normally be updated and flushed
to disk as a result of flushing the log file write to disk.</p>
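<p>A minimal sketch of such a test program for POSIX systems follows; the
complete version included in the distribution adds argument parsing and
timing, while this fragment hard-codes a 256-byte write, 1000 operations,
and a hypothetical file name:</p>
<pre class="programlisting">#include &lt;sys/types.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;string.h&gt;
#include &lt;unistd.h&gt;

int
main(void)
{
	char buf[256];
	int fd, i;

	memset(buf, 0, sizeof(buf));
	if ((fd = open("testfile", O_RDWR | O_CREAT, 0666)) == -1)
		return (1);
	for (i = 0; i &lt; 1000; ++i) {
		/* Seek to the beginning of the file. */
		if (lseek(fd, (off_t)0, SEEK_SET) == -1)
			return (1);
		/* Write to the file. */
		if (write(fd, buf, sizeof(buf)) != sizeof(buf))
			return (1);
		/* Flush the file write to disk. */
		if (fsync(fd) != 0)
			return (1);
	}
	(void)close(fd);
	return (0);
}</pre>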
<p>Running this test program, writing 256 bytes per operation for 1000 operations
on reasonably standard commodity hardware (Pentium II CPU, SCSI disk),
returned the following results:</p>
<pre class="programlisting">% testfile -b256 -o1000
running: 1000 ops
Elapsed time: 16.641934 seconds
1000 ops:   60.09 ops per second</pre>
<p>Note that the number of bytes being written to the log as part of each
transaction can dramatically affect the transaction throughput. The
test run used 256 bytes, which is a reasonable log write size. Your log
writes may be different. To determine your average log write size, use
the <a href="../api_reference/C/db_stat.html" class="olink">db_stat</a> utility to display your log statistics.</p>
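<p>For example, a hypothetical invocation (the environment home directory
is illustrative), where -h names the database environment and -l
requests the log statistics:</p>
<pre class="programlisting">% db_stat -h /var/dbenv -l</pre>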
<p>As a quick sanity check, the average seek time is 9.4 msec for this
particular disk, and the average rotational latency is 4.17 msec. That results in
a minimum requirement for a data transfer to the disk of 13.57 msec, or
a maximum of 74 transfers per second (1000 / 13.57). This is close enough to the
previous 60 operations per second (which was not measured on a quiescent
disk) that the number is believable.</p>
<p>An implementation of the previous <a class="ulink" href="writetest.cs" target="_top">example test
program</a> for IEEE/ANSI Std 1003.1 (POSIX) standard systems is included in the Berkeley DB
distribution.</p>
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href="transapp_tune.html">Prev</a> </td>
<td width="20%" align="center">
<a accesskey="u" href="transapp.html">Up</a>
</td>
<td width="40%" align="right"> <a accesskey="n" href="transapp_faq.html">Next</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Transaction tuning </td>
<td width="20%" align="center">
<a accesskey="h" href="index.html">Home</a>
</td>
<td width="40%" align="right" valign="top"> Transaction FAQ</td>
</tr>
</table>
</div>
</body>
</html>