mirror of
https://github.com/berkeleydb/libdb.git
synced 2024-11-16 17:16:25 +00:00
699 lines
29 KiB
HTML
699 lines
29 KiB
HTML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||
<html xmlns="http://www.w3.org/1999/xhtml">
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
|
||
<title>Btree access method specific configuration</title>
|
||
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
|
||
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
|
||
<link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
|
||
<link rel="up" href="am_conf.html" title="Chapter 2. Access Method Configuration" />
|
||
<link rel="prev" href="general_am_conf.html" title="General access method configuration" />
|
||
<link rel="next" href="hash_conf.html" title="Hash access method specific configuration" />
|
||
</head>
|
||
<body>
|
||
<div xmlns="" class="navheader">
|
||
<div class="libver">
|
||
<p>Library Version 11.2.5.2</p>
|
||
</div>
|
||
<table width="100%" summary="Navigation header">
|
||
<tr>
|
||
<th colspan="3" align="center">Btree access method specific configuration</th>
|
||
</tr>
|
||
<tr>
|
||
<td width="20%" align="left"><a accesskey="p" href="general_am_conf.html">Prev</a> </td>
|
||
<th width="60%" align="center">Chapter 2.
|
||
Access Method Configuration
|
||
</th>
|
||
<td width="20%" align="right"> <a accesskey="n" href="hash_conf.html">Next</a></td>
|
||
</tr>
|
||
</table>
|
||
<hr />
|
||
</div>
|
||
<div class="sect1" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h2 class="title" style="clear: both"><a id="bt_conf"></a>Btree access method specific configuration</h2>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<div class="toc">
|
||
<dl>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="bt_conf.html#am_conf_bt_compare">Btree comparison</a>
|
||
</span>
|
||
</dt>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="bt_conf.html#am_conf_bt_prefix">Btree prefix comparison</a>
|
||
</span>
|
||
</dt>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="bt_conf.html#am_conf_bt_minkey">Minimum keys per page</a>
|
||
</span>
|
||
</dt>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="bt_conf.html#am_conf_bt_recnum">Retrieving Btree records by logical record number</a>
|
||
</span>
|
||
</dt>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="bt_conf.html#am_conf_bt_compress">Compression</a>
|
||
</span>
|
||
</dt>
|
||
</dl>
|
||
</div>
|
||
<p>
|
||
There are a series of configuration tasks which you can perform when
|
||
using the Btree access method. They are described in the following sections.
|
||
</p>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_conf_bt_compare"></a>Btree comparison</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>The Btree data structure is a sorted, balanced tree structure storing
|
||
associated key/data pairs. By default, the sort order is lexicographical,
|
||
with shorter keys collating before longer keys. The user can specify the
|
||
sort order for the Btree by using the <a href="../api_reference/C/dbset_bt_compare.html" class="olink">DB->set_bt_compare()</a> method.</p>
|
||
<p>Sort routines are passed pointers to keys as arguments. The keys are
|
||
represented as <a href="../api_reference/C/dbt.html" class="olink">DBT</a> structures. The routine must return an integer
|
||
less than, equal to, or greater than zero if the first argument is
|
||
considered to be respectively less than, equal to, or greater than the
|
||
second argument. The only fields that the routines may examine in the
|
||
<a href="../api_reference/C/dbt.html" class="olink">DBT</a> structures are <span class="bold"><strong>data</strong></span> and <span class="bold"><strong>size</strong></span> fields.</p>
|
||
<p>An example routine that might be used to sort integer keys in the database
|
||
is as follows:</p>
|
||
<a id="prog_am2"></a>
|
||
<pre class="programlisting">
|
||
int
|
||
compare_int(DB *dbp, const DBT *a, const DBT *b)
|
||
|
||
{
|
||
int ai, bi;
|
||
/*
|
||
* Returns:
|
||
* < 0 if a < b
|
||
* = 0 if a = b
|
||
* > 0 if a > b
|
||
*/
|
||
memcpy(&ai, a->data, sizeof(int));
|
||
memcpy(&bi, b->data, sizeof(int));
|
||
return (ai - bi);
|
||
}
|
||
</pre>
|
||
<p>Note that the data must first be copied into memory that is appropriately
|
||
aligned, as Berkeley DB does not guarantee any kind of alignment of the
|
||
underlying data, including for comparison routines. When writing
|
||
comparison routines, remember that databases created on machines of
|
||
different architectures may have different integer byte orders, for which
|
||
your code may need to compensate.</p>
|
||
<p>An example routine that might be used to sort keys based on the first
|
||
five bytes of the key (ignoring any subsequent bytes) is as follows:</p>
|
||
<a id="prog_am3"></a>
|
||
<pre class="programlisting">
|
||
int
|
||
compare_dbt(DB *dbp, const DBT *a, const DBT *b)
|
||
|
||
{
|
||
int len;
|
||
u_char *p1, *p2;
|
||
|
||
/*
|
||
* Returns:
|
||
* < 0 if a < b
|
||
* = 0 if a = b
|
||
* > 0 if a > b
|
||
*/
|
||
for (p1 = a->data, p2 = b->data, len = 5; len--; ++p1, ++p2)
|
||
if (*p1 != *p2)
|
||
return ((long)*p1 - (long)*p2);
|
||
return (0);
|
||
}
|
||
</pre>
|
||
<p>All comparison functions must cause the keys in the database to be
|
||
well-ordered. The most important implication of being well-ordered is
|
||
that the key relations must be transitive, that is, if key A is less
|
||
than key B, and key B is less than key C, then the comparison routine
|
||
must also return that key A is less than key C.</p>
|
||
<p>It is reasonable for a comparison function to not examine an entire key
|
||
in some applications, which implies partial keys may be specified to the
|
||
Berkeley DB interfaces. When partial keys are specified to Berkeley DB, interfaces
|
||
which retrieve data items based on a user-specified key (for example,
|
||
<a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> and <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> with the <a href="../api_reference/C/dbcget.html#dbcget_DB_SET" class="olink">DB_SET</a> flag), will
|
||
modify the user-specified key by returning the actual key stored in the
|
||
database.</p>
|
||
</div>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_conf_bt_prefix"></a>Btree prefix comparison</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>The Berkeley DB Btree implementation maximizes the number of keys that can be
|
||
stored on an internal page by storing only as many bytes of each key as
|
||
are necessary to distinguish it from adjacent keys. The prefix
|
||
comparison routine is what determines this minimum number of bytes (that
|
||
is, the length of the unique prefix), that must be stored. A prefix
|
||
comparison function for the Btree can be specified by calling
|
||
<a href="../api_reference/C/dbset_bt_prefix.html" class="olink">DB->set_bt_prefix()</a>.</p>
|
||
<p>The prefix comparison routine must be compatible with the overall
|
||
comparison function of the Btree, since what distinguishes any two keys
|
||
depends entirely on the function used to compare them. This means that
|
||
if a prefix comparison routine is specified by the application, a
|
||
compatible overall comparison routine must also have been specified.</p>
|
||
<p>Prefix comparison routines are passed pointers to keys as arguments.
|
||
The keys are represented as <a href="../api_reference/C/dbt.html" class="olink">DBT</a> structures. The only fields
|
||
the routines may examine in the <a href="../api_reference/C/dbt.html" class="olink">DBT</a> structures are <span class="bold"><strong>data</strong></span>
|
||
and <span class="bold"><strong>size</strong></span> fields.</p>
|
||
<p>The prefix comparison function must return the number of bytes necessary
|
||
to distinguish the two keys. If the keys are identical (equal and equal
|
||
in length), the length should be returned. If the keys are equal up to
|
||
the smaller of the two lengths, then the length of the smaller key plus
|
||
1 should be returned.</p>
|
||
<p>An example prefix comparison routine follows:</p>
|
||
<a id="prog_am4"></a>
|
||
<pre class="programlisting">
|
||
size_t
|
||
compare_prefix(DB *dbp, const DBT *a, const DBT *b)
|
||
|
||
{
|
||
size_t cnt, len;
|
||
u_int8_t *p1, *p2;
|
||
|
||
cnt = 1;
|
||
len = a->size > b->size ? b->size : a->size;
|
||
for (p1 =
|
||
a->data, p2 = b->data; len--; ++p1, ++p2, ++cnt)
|
||
if (*p1 != *p2)
|
||
return (cnt);
|
||
/*
|
||
* They match up to the smaller of the two sizes.
|
||
* Collate the longer after the shorter.
|
||
*/
|
||
if (a->size < b->size)
|
||
return (a->size + 1);
|
||
if (b->size < a->size)
|
||
return (b->size + 1);
|
||
return (b->size);
|
||
}
|
||
</pre>
|
||
<p>The usefulness of this functionality is data-dependent, but in some data
|
||
sets can produce significantly reduced tree sizes and faster search times.</p>
|
||
</div>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_conf_bt_minkey"></a>Minimum keys per page</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>The number of keys stored on each page affects the size of a Btree and
|
||
how it is maintained. Therefore, it also affects the retrieval and search
|
||
performance of the tree. For each Btree, Berkeley DB computes a maximum key
|
||
and data size. This size is a function of the page size and the fact that
|
||
at least two key/data pairs must fit on any Btree page. Whenever key or
|
||
data items exceed the calculated size, they are stored on overflow pages
|
||
instead of in the standard Btree leaf pages.</p>
|
||
<p>Applications may use the <a href="../api_reference/C/dbset_bt_minkey.html" class="olink">DB->set_bt_minkey()</a> method to change the minimum
|
||
number of keys that must fit on a Btree page from two to another value.
|
||
Altering this value in turn alters the on-page maximum size, and can be
|
||
used to force key and data items which would normally be stored in the
|
||
Btree leaf pages onto overflow pages.</p>
|
||
<p>Some data sets can benefit from this tuning. For example, consider an
|
||
application using large page sizes, with a data set almost entirely
|
||
consisting of small key and data items, but with a few large items. By
|
||
setting the minimum number of keys that must fit on a page, the
|
||
application can force the outsized items to be stored on overflow pages.
|
||
That in turn can potentially keep the tree more compact, that is, with
|
||
fewer internal levels to traverse during searches.</p>
|
||
<p>The following calculation is similar to the one performed by the Btree
|
||
implementation. (The <span class="bold"><strong>minimum_keys</strong></span> value is multiplied by 2
|
||
because each key/data pair requires 2 slots on a Btree page.)</p>
|
||
<pre class="programlisting">maximum_size = page_size / (minimum_keys * 2)</pre>
|
||
<p>Using this calculation, if the page size is 8KB and the default
|
||
<span class="bold"><strong>minimum_keys</strong></span> value of 2 is used, then any key or data items
|
||
larger than 2KB will be forced to an overflow page. If an application
|
||
were to specify a <span class="bold"><strong>minimum_key</strong></span> value of 100, then any key or data
|
||
items larger than roughly 40 bytes would be forced to overflow pages.</p>
|
||
<p>It is important to remember that accesses to overflow pages do not perform
|
||
as well as accesses to the standard Btree leaf pages, and so setting the
|
||
value incorrectly can result in overusing overflow pages and decreasing
|
||
the application's overall performance.</p>
|
||
</div>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_conf_bt_recnum"></a>Retrieving Btree records by logical record number</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>The Btree access method optionally supports retrieval by logical record
|
||
numbers. To configure a Btree to support record numbers, call the
|
||
<a href="../api_reference/C/dbset_flags.html" class="olink">DB->set_flags()</a> method with the <a href="../api_reference/C/dbset_flags.html#dbset_flags_DB_RECNUM" class="olink">DB_RECNUM</a> flag.</p>
|
||
<p>Configuring a Btree for record numbers should not be done lightly.
|
||
While often useful, it may significantly slow down the speed at which
|
||
items can be stored into the database, and can severely impact
|
||
application throughput. Generally it should be avoided in trees with
|
||
a need for high write concurrency.</p>
|
||
<p>To retrieve by record number, use the <a href="../api_reference/C/dbget.html#dbget_DB_SET_RECNO" class="olink">DB_SET_RECNO</a> flag to the
|
||
<a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> and <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> methods. The following is an example of
|
||
a routine that displays the data item for a Btree database created with
|
||
the <a href="../api_reference/C/dbset_flags.html#dbset_flags_DB_RECNUM" class="olink">DB_RECNUM</a> option.</p>
|
||
<a id="prog_am5"></a>
|
||
<pre class="programlisting">
|
||
int
|
||
rec_display(DB *dbp, db_recno_t recno)
|
||
|
||
{
|
||
DBT key, data;
|
||
int ret;
|
||
|
||
memset(&key, 0, sizeof(key));
|
||
key.data = &recno;
|
||
key.size = sizeof(recno);
|
||
memset(&data, 0, sizeof(data));
|
||
|
||
if ((ret = dbp->get(dbp, NULL, &key, &data, DB_SET_RECNO)) != 0)
|
||
return (ret);
|
||
printf("data for %lu: %.*s\n",
|
||
(u_long)recno, (int)data.size, (char *)data.data);
|
||
return (0);
|
||
}
|
||
</pre>
|
||
<p>To determine a key's record number, use the <a href="../api_reference/C/dbcget.html#dbcget_DB_GET_RECNO" class="olink">DB_GET_RECNO</a> flag
|
||
to the <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> method. The following is an example of a routine that
|
||
displays the record number associated with a specific key.</p>
|
||
<a id="prog_am6"></a>
|
||
<pre class="programlisting">
|
||
int
|
||
recno_display(DB *dbp, char *keyvalue)
|
||
|
||
{
|
||
DBC *dbcp;
|
||
DBT key, data;
|
||
db_recno_t recno;
|
||
int ret, t_ret;
|
||
|
||
/* Acquire a cursor for the database. */
|
||
if ((ret = dbp->cursor(dbp, NULL, &dbcp, 0)) != 0) {
|
||
dbp->err(dbp, ret, "DB->cursor");
|
||
goto err;
|
||
}
|
||
|
||
/* Position the cursor. */
|
||
memset(&key, 0, sizeof(key));
|
||
key.data = keyvalue;
|
||
key.size = strlen(keyvalue);
|
||
memset(&data, 0, sizeof(data));
|
||
if ((ret = dbcp->get(dbcp, &key, &data, DB_SET)) != 0) {
|
||
dbp->err(dbp, ret, "DBC->get(DB_SET): %s", keyvalue);
|
||
goto err;
|
||
}
|
||
|
||
/*
|
||
* Request the record number, and store it into appropriately
|
||
* sized and aligned local memory.
|
||
*/
|
||
memset(&data, 0, sizeof(data));
|
||
data.data = &recno;
|
||
data.ulen = sizeof(recno);
|
||
data.flags = DB_DBT_USERMEM;
|
||
if ((ret = dbcp->get(dbcp, &key, &data, DB_GET_RECNO)) != 0) {
|
||
dbp->err(dbp, ret, "DBC->get(DB_GET_RECNO)");
|
||
goto err;
|
||
}
|
||
|
||
printf("key for requested key was %lu\n", (u_long)recno);
|
||
|
||
err: /* Close the cursor. */
|
||
if ((t_ret = dbcp->close(dbcp)) != 0) {
|
||
if (ret == 0)
|
||
ret = t_ret;
|
||
dbp->err(dbp, ret, "DBC->close");
|
||
}
|
||
return (ret);
|
||
}
|
||
</pre>
|
||
</div>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_conf_bt_compress"></a>Compression</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>
|
||
The Btree access method supports the automatic compression of key/data
|
||
pairs upon their insertion into the database. The key/data pairs are
|
||
decompressed before they are returned to the application, making an
|
||
application's interaction with a compressed database identical to that
|
||
for a non-compressed database. To configure Berkeley DB for
|
||
compression, call the <a href="../api_reference/C/dbset_bt_compress.html" class="olink">DB->set_bt_compress()</a> method and specify custom
|
||
compression and decompression functions. If <a href="../api_reference/C/dbset_bt_compress.html" class="olink">DB->set_bt_compress()</a> is
|
||
called with NULL compression and decompression functions, Berkeley DB
|
||
will use its default compression functions.
|
||
</p>
|
||
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
|
||
<h3 class="title">Note</h3>
|
||
<p>
|
||
Compression only works with the Btree access method, and then only
|
||
so long as your database is not configured for unsorted duplicates.
|
||
</p>
|
||
</div>
|
||
<p>
|
||
The default compression function performs prefix compression on each
|
||
key added to the database. This means that, for a key
|
||
<span class="emphasis"><em>n</em></span> bytes in length, the first
|
||
<span class="emphasis"><em>i</em></span> bytes that match the first
|
||
<span class="emphasis"><em>i</em></span> bytes of the previous key exactly are omitted
|
||
and only the final <span class="emphasis"><em>n-i</em></span> bytes are stored in the
|
||
database. If the bytes of key being stored match the bytes of the
|
||
previous key exactly, then the same prefix compression algorithm is
|
||
applied to the data value being stored. To use Berkeley DB's default
|
||
compression behavior, both the default compression and decompression
|
||
functions must be used.
|
||
</p>
|
||
<p>
|
||
For example, to configure your database for default compression:
|
||
</p>
|
||
<a id="prog_am7"></a>
|
||
<pre class="programlisting">
|
||
DB *dbp = NULL;
|
||
DB_ENV *envp = NULL;
|
||
u_int32_t db_flags;
|
||
const char *file_name = "mydb.db";
|
||
int ret;
|
||
|
||
...
|
||
|
||
/* Skipping environment open to shorten this example */
|
||
/* Initialize the DB handle */
|
||
ret = db_create(&dbp, envp, 0);
|
||
if (ret != 0) {
|
||
fprintf(stderr, "%s\n", db_strerror(ret));
|
||
return (EXIT_FAILURE);
|
||
}
|
||
|
||
/* Turn on default data compression */
|
||
dbp->set_bt_compress(dbp, NULL, NULL);
|
||
|
||
/* Now open the database */
|
||
db_flags = DB_CREATE; /* Allow database creation */
|
||
|
||
ret = dbp->open(dbp, /* Pointer to the database */
|
||
NULL, /* Txn pointer */
|
||
file_name, /* File name */
|
||
NULL, /* Logical db name */
|
||
DB_BTREE, /* Database type (using btree) */
|
||
db_flags, /* Open flags */
|
||
0); /* File mode. Using defaults */
|
||
if (ret != 0) {
|
||
dbp->err(dbp, ret, "Database '%s' open failed",
|
||
file_name);
|
||
return (EXIT_FAILURE);
|
||
}</pre>
|
||
<div class="sect3" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h4 class="title"><a id="am_conf_bt_custom_compress"></a>Custom compression</h4>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>
|
||
An application wishing to perform it's own compression may supply a
|
||
compression and decompression function which will be called instead of
|
||
Berkeley DB's default functions. The compression function is
|
||
passed five <a href="../api_reference/C/dbt.html" class="olink">DBT</a> structures:
|
||
</p>
|
||
<div class="itemizedlist">
|
||
<ul type="disc">
|
||
<li>
|
||
<p>
|
||
The key and data immediately preceeding the key/data pair
|
||
that is being stored.
|
||
</p>
|
||
</li>
|
||
<li>
|
||
<p>
|
||
The key and data being stored in the tree.
|
||
</p>
|
||
</li>
|
||
<li>
|
||
<p>
|
||
The buffer where the compressed data should be written.
|
||
</p>
|
||
</li>
|
||
</ul>
|
||
</div>
|
||
<p>
|
||
The total size of the buffer used to store the compressed data is
|
||
identified in the <a href="../api_reference/C/dbt.html" class="olink">DBT</a>'s <code class="literal">ulen</code> field. If the
|
||
compressed data cannot fit in the buffer, the compression function
|
||
should store the amount of space needed in <a href="../api_reference/C/dbt.html" class="olink">DBT</a>'s
|
||
<code class="literal">size</code> field and then return
|
||
<code class="literal">DB_BUFFER_SMALL</code>. Berkeley DB will subsequently
|
||
re-call the compression function with the required amount of space
|
||
allocated in the compression data buffer.
|
||
</p>
|
||
<p>
|
||
Multiple compressed key/data pairs will likely be written to the
|
||
same buffer and the compression function should take steps to
|
||
ensure it does not overwrite data.
|
||
</p>
|
||
<p>
|
||
For example, the following code fragments illustrate the use of a custom
|
||
compression routine. This code is actually a much simplified
|
||
example of the default compression provided by Berkeley DB. It does
|
||
simple prefix compression on the key part of the data.
|
||
</p>
|
||
<a id="prog_am8"></a>
|
||
<pre class="programlisting">
|
||
int compress(DB *dbp, const DBT *prevKey, const DBT *prevData,
|
||
const DBT *key, const DBT *data, DBT *dest)
|
||
{
|
||
u_int8_t *dest_data_ptr;
|
||
const u_int8_t *key_data, *prevKey_data;
|
||
size_t len, prefix, suffix;
|
||
|
||
key_data = (const u_int8_t*)key->data;
|
||
prevKey_data = (const u_int8_t*)prevKey->data;
|
||
len = key->size > prevKey->size ? prevKey->size : key->size;
|
||
for (; len-- && *key_data == *prevKey_data; ++key_data,
|
||
++prevKey_data)
|
||
continue;
|
||
|
||
prefix = (size_t)(key_data - (u_int8_t*)key->data);
|
||
suffix = key->size - prefix;
|
||
|
||
/* Check that we have enough space in dest */
|
||
dest->size = (u_int32_t)(__db_compress_count_int(prefix) +
|
||
__db_compress_count_int(suffix) +
|
||
__db_compress_count_int(data->size) + suffix + data->size);
|
||
if (dest->size > dest->ulen)
|
||
return (DB_BUFFER_SMALL);
|
||
|
||
/* prefix length */
|
||
dest_data_ptr = (u_int8_t*)dest->data;
|
||
dest_data_ptr += __db_compress_int(dest_data_ptr, prefix);
|
||
|
||
/* suffix length */
|
||
dest_data_ptr += __db_compress_int(dest_data_ptr, suffix);
|
||
|
||
/* data length */
|
||
dest_data_ptr += __db_compress_int(dest_data_ptr, data->size);
|
||
|
||
/* suffix */
|
||
memcpy(dest_data_ptr, key_data, suffix);
|
||
dest_data_ptr += suffix;
|
||
|
||
/* data */
|
||
memcpy(dest_data_ptr, data->data, data->size);
|
||
|
||
return (0);
|
||
} </pre>
|
||
<p>
|
||
The corresponding decompression function is likewise passed five <a href="../api_reference/C/dbt.html" class="olink">DBT</a> structures:
|
||
</p>
|
||
<div class="itemizedlist">
|
||
<ul type="disc">
|
||
<li>
|
||
<p>
|
||
The key and data <a href="../api_reference/C/dbt.html" class="olink">DBT</a>s immediately preceding the
|
||
decompressed key and data.
|
||
</p>
|
||
</li>
|
||
<li>
|
||
<p>
|
||
The compressed data from the database.
|
||
</p>
|
||
</li>
|
||
<li>
|
||
<p>
|
||
One to store the decompressed key and another one for the
|
||
decompressed data.
|
||
</p>
|
||
</li>
|
||
</ul>
|
||
</div>
|
||
<p>
|
||
Because the compression of <code class="literal">record X</code> relies upon
|
||
<code class="literal">record X-1</code>, the decompression function can be
|
||
called repeatedly to linearally decompress a set of records stored
|
||
in the compressed buffer.
|
||
</p>
|
||
<p>
|
||
The total size of the buffer available to store the decompressed data is
|
||
identified in the destination <a href="../api_reference/C/dbt.html" class="olink">DBT</a>'s <code class="literal">ulen</code> field. If the
|
||
decompressed data cannot fit in the buffer, the decompression function
|
||
should store the amount of space needed in the destination <a href="../api_reference/C/dbt.html" class="olink">DBT</a>'s
|
||
<code class="literal">size</code> field and then return
|
||
<code class="literal">DB_BUFFER_SMALL</code>. Berkeley DB will subsequently
|
||
re-call the decompression function with the required amount of space
|
||
allocated in the decompression data buffer.
|
||
</p>
|
||
<p>
|
||
For example, the decompression routine that corresponds to the
|
||
example compression routine provided above is:
|
||
</p>
|
||
<a id="prog_am9"></a>
|
||
<pre class="programlisting">int decompress(DB *dbp, const DBT *prevKey, const DBT *prevData,
|
||
DBT *compressed, DBT *destKey, DBT *destData)
|
||
{
|
||
u_int8_t *comp_data, *dest_data;
|
||
u_int32_t prefix, suffix, size;
|
||
|
||
/* Unmarshal prefix, suffix and data length */
|
||
comp_data = (u_int8_t*)compressed->data;
|
||
size = __db_decompress_count_int(comp_data);
|
||
if (size > compressed->size)
|
||
return (EINVAL);
|
||
comp_data += __db_decompress_int32(comp_data, &prefix);
|
||
|
||
size += __db_decompress_count_int(comp_data);
|
||
if (size > compressed->size)
|
||
return (EINVAL);
|
||
comp_data += __db_decompress_int32(comp_data, &suffix);
|
||
|
||
size += __db_decompress_count_int(comp_data);
|
||
if (size > compressed->size)
|
||
return (EINVAL);
|
||
comp_data += __db_decompress_int32(comp_data, &destData->size);
|
||
|
||
/* Check destination lengths */
|
||
destKey->size = prefix + suffix;
|
||
if (destKey->size > destKey->ulen ||
|
||
destData->size > destData->ulen)
|
||
return (DB_BUFFER_SMALL);
|
||
|
||
/* Write the prefix */
|
||
if (prefix > prevKey->size)
|
||
return (EINVAL);
|
||
dest_data = (u_int8_t*)destKey->data;
|
||
memcpy(dest_data, prevKey->data, prefix);
|
||
dest_data += prefix;
|
||
|
||
/* Write the suffix */
|
||
size += suffix;
|
||
if (size > compressed->size)
|
||
return (EINVAL);
|
||
memcpy(dest_data, comp_data, suffix);
|
||
comp_data += suffix;
|
||
|
||
/* Write the data */
|
||
size += destData->size;
|
||
if (size > compressed->size)
|
||
return (EINVAL);
|
||
memcpy(destData->data, comp_data, destData->size);
|
||
comp_data += destData->size;
|
||
|
||
/* Return bytes read */
|
||
compressed->size =
|
||
(u_int32_t)(comp_data - (u_int8_t*)compressed->data);
|
||
return (0);
|
||
} </pre>
|
||
</div>
|
||
<div class="sect3" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h4 class="title"><a id="id3881949"></a>Programmer Notes</h4>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>
|
||
As you use compression with your databases, be aware of the
|
||
following:
|
||
</p>
|
||
<div class="itemizedlist">
|
||
<ul type="disc">
|
||
<li>
|
||
<p>
|
||
Compression works by placing key/data pairs from a single
|
||
database page into a single block of compressed data. This is true
|
||
whether you use DB's default compression, or you write
|
||
your own compression. Because all of key/data data is
|
||
placed in a single block of memory, you cannot decompress
|
||
data unless you have decompressed everything that came
|
||
before it in the block. That is, you cannot decompress item
|
||
<span class="emphasis"><em>n</em></span> in the data block, unless you also
|
||
decompress items <span class="emphasis"><em>0</em></span> through
|
||
<span class="emphasis"><em>n-1</em></span>.
|
||
</p>
|
||
</li>
|
||
<li>
|
||
<p>
|
||
If you increase the minimum number of key/data pairs placed
|
||
on a Btree leaf page (using <a href="../api_reference/C/dbset_bt_minkey.html" class="olink">DB->set_bt_minkey()</a>), you will
|
||
decrease your seek times on a compressed database. However,
|
||
this will also decrease the effectiveness of the
|
||
compression.
|
||
</p>
|
||
</li>
|
||
<li>
|
||
<p>
|
||
Compressed databases are fastest if bulk load is used to
|
||
add data to them. See
|
||
<a class="xref" href="am_misc_bulk.html" title="Retrieving and updating records in bulk">Retrieving and updating records in bulk</a>
|
||
for information on using bulk load.
|
||
</p>
|
||
</li>
|
||
</ul>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<div class="navfooter">
|
||
<hr />
|
||
<table width="100%" summary="Navigation footer">
|
||
<tr>
|
||
<td width="40%" align="left"><a accesskey="p" href="general_am_conf.html">Prev</a> </td>
|
||
<td width="20%" align="center">
|
||
<a accesskey="u" href="am_conf.html">Up</a>
|
||
</td>
|
||
<td width="40%" align="right"> <a accesskey="n" href="hash_conf.html">Next</a></td>
|
||
</tr>
|
||
<tr>
|
||
<td width="40%" align="left" valign="top">General access method configuration </td>
|
||
<td width="20%" align="center">
|
||
<a accesskey="h" href="index.html">Home</a>
|
||
</td>
|
||
<td width="40%" align="right" valign="top"> Hash access method specific configuration</td>
|
||
</tr>
|
||
</table>
|
||
</div>
|
||
</body>
|
||
</html>
|