mirror of
https://github.com/berkeleydb/libdb.git
synced 2024-11-16 17:16:25 +00:00
251 lines
17 KiB
HTML
251 lines
17 KiB
HTML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||
<html xmlns="http://www.w3.org/1999/xhtml">
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
|
||
<title>Retrieving and updating records in bulk</title>
|
||
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
|
||
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
|
||
<link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
|
||
<link rel="up" href="am_misc.html" title="Chapter 4. Access Method Wrapup" />
|
||
<link rel="prev" href="am_misc.html" title="Chapter 4. Access Method Wrapup" />
|
||
<link rel="next" href="am_misc_partial.html" title="Partial record storage and retrieval" />
|
||
</head>
|
||
<body>
|
||
<div xmlns="" class="navheader">
|
||
<div class="libver">
|
||
<p>Library Version 11.2.5.3</p>
|
||
</div>
|
||
<table width="100%" summary="Navigation header">
|
||
<tr>
|
||
<th colspan="3" align="center">Retrieving and updating records in bulk</th>
|
||
</tr>
|
||
<tr>
|
||
<td width="20%" align="left"><a accesskey="p" href="am_misc.html">Prev</a> </td>
|
||
<th width="60%" align="center">Chapter 4.
|
||
Access Method Wrapup
|
||
</th>
|
||
<td width="20%" align="right"> <a accesskey="n" href="am_misc_partial.html">Next</a></td>
|
||
</tr>
|
||
</table>
|
||
<hr />
|
||
</div>
|
||
<div class="sect1" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h2 class="title" style="clear: both"><a id="am_misc_bulk"></a>Retrieving and updating records in bulk</h2>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<div class="toc">
|
||
<dl>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="am_misc_bulk.html#am_misc_bulk_get">Bulk retrieval</a>
|
||
</span>
|
||
</dt>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="am_misc_bulk.html#am_misc_bulk_put">Bulk updates</a>
|
||
</span>
|
||
</dt>
|
||
<dt>
|
||
<span class="sect2">
|
||
<a href="am_misc_bulk.html#am_misc_bulk_del">Bulk deletes</a>
|
||
</span>
|
||
</dt>
|
||
</dl>
|
||
</div>
|
||
<p>When retrieving or modifying large numbers of records, the number
|
||
of method calls can often dominate performance. Berkeley DB offers bulk get,
|
||
put and delete interfaces which can significantly increase performance for some
|
||
applications.
|
||
</p>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_misc_bulk_get"></a>Bulk retrieval</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>
|
||
To retrieve records in bulk, an application buffer must be specified to
|
||
the <a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> or <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> methods. This is done in the C API by setting
|
||
the <span class="bold"><strong>data</strong></span> and <span class="bold"><strong>ulen</strong></span> fields of the <span class="bold"><strong>data</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> to reference an application
|
||
buffer, and the
|
||
<span class="bold"><strong>flags</strong></span> field of that structure to
|
||
<a href="../api_reference/C/dbt.html#dbt_DB_DBT_USERMEM" class="olink">DB_DBT_USERMEM</a>. In the Berkeley DB C++ and Java APIs, the actions
|
||
are similar, although there are API-specific methods to set the <a href="../api_reference/C/dbt.html" class="olink">DBT</a>
|
||
values. Then, the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> or <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> flags are
|
||
specified to the <a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> or <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> methods, which cause multiple
|
||
records to be returned in the specified buffer.
|
||
</p>
|
||
<p>
|
||
The difference between <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> and <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> is as
|
||
follows: <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> returns multiple data items for a single key.
|
||
For example, the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> flag would be used to retrieve all of
|
||
the duplicate data items for a single key in a single call. The
|
||
<a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> flag is used to retrieve multiple key/data pairs,
|
||
where each returned key may or may not have duplicate data items.
|
||
</p>
|
||
<p>
|
||
Once the <a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> or <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> method has returned, the application will
|
||
walk through the buffer handling the returned records. This is
|
||
implemented for the C and C++ APIs using four macros:
|
||
<a href="../api_reference/C/DB_MULTIPLE_INIT.html" class="olink">DB_MULTIPLE_INIT</a>, <a href="../api_reference/C/DB_MULTIPLE_NEXT.html" class="olink">DB_MULTIPLE_NEXT</a>, <a href="../api_reference/C/DB_MULTIPLE_KEY_NEXT.html" class="olink">DB_MULTIPLE_KEY_NEXT</a>, and
|
||
<a href="../api_reference/C/DB_MULTIPLE_RECNO_NEXT.html" class="olink">DB_MULTIPLE_RECNO_NEXT</a>. For the Java API, this is implemented as
|
||
three iterator classes:
|
||
<a class="ulink" href="../java/com/sleepycat/db/MultipleDataEntry.html" target="_top">MultipleDataEntry</a>,
|
||
<a class="ulink" href="../java/com/sleepycat/db/MultipleKeyDataEntry.html" target="_top">MultipleKeyDataEntry</a>,
|
||
and <a class="ulink" href="../java/com/sleepycat/db/MultipleRecnoDataEntry.html" target="_top">MultipleRecnoDataEntry</a>.
|
||
</p>
|
||
<p>
|
||
The <a href="../api_reference/C/DB_MULTIPLE_INIT.html" class="olink">DB_MULTIPLE_INIT</a> macro is always called first. It initializes a
|
||
local application variable and the <span class="bold"><strong>data</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> for stepping through the set of
|
||
returned records. Then, the application calls one of the remaining
|
||
three macros: <a href="../api_reference/C/DB_MULTIPLE_NEXT.html" class="olink">DB_MULTIPLE_NEXT</a>, <a href="../api_reference/C/DB_MULTIPLE_KEY_NEXT.html" class="olink">DB_MULTIPLE_KEY_NEXT</a>, and
|
||
<a href="../api_reference/C/DB_MULTIPLE_RECNO_NEXT.html" class="olink">DB_MULTIPLE_RECNO_NEXT</a>.
|
||
</p>
|
||
<p>
|
||
If the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> flag was specified to the <a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> or <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a>
|
||
method, the application will always call the <a href="../api_reference/C/DB_MULTIPLE_NEXT.html" class="olink">DB_MULTIPLE_NEXT</a> macro.
|
||
If the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> flag was specified to the <a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> or <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a>
|
||
method, and the underlying database is a Btree or Hash database, the
|
||
application will always call the <a href="../api_reference/C/DB_MULTIPLE_KEY_NEXT.html" class="olink">DB_MULTIPLE_KEY_NEXT</a> macro. If the
|
||
<a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> flag was specified to the <a href="../api_reference/C/dbget.html" class="olink">DB->get()</a> or <a href="../api_reference/C/dbcget.html" class="olink">DBC->get()</a> method,
|
||
and the underlying database is a Queue or Recno database, the
|
||
application will always call the <a href="../api_reference/C/DB_MULTIPLE_RECNO_NEXT.html" class="olink">DB_MULTIPLE_RECNO_NEXT</a> macro. The
|
||
<a href="../api_reference/C/DB_MULTIPLE_NEXT.html" class="olink">DB_MULTIPLE_NEXT</a>, <a href="../api_reference/C/DB_MULTIPLE_KEY_NEXT.html" class="olink">DB_MULTIPLE_KEY_NEXT</a>, and
|
||
<a href="../api_reference/C/DB_MULTIPLE_RECNO_NEXT.html" class="olink">DB_MULTIPLE_RECNO_NEXT</a> macros are called repeatedly, until the end of
|
||
the returned records is reached. The end of the returned records is
|
||
detected by the application's local pointer variable being set to NULL.
|
||
</p>
|
||
<p>
|
||
Note that if you want to use a cursor for bulk retrieval of records in
|
||
a Btree database, you should open the cursor using the
|
||
<code class="literal">DB_CURSOR_BULK</code> flag. This optimizes the cursor for
|
||
bulk retrieval.
|
||
</p>
|
||
<p>
|
||
The following is an example of a routine that displays the contents of
|
||
a Btree database using the bulk return interfaces.
|
||
</p>
|
||
<a id="prog_am22"></a>
|
||
<pre class="programlisting">int
|
||
rec_display(DB *dbp)
|
||
{
|
||
DBC *dbcp;
|
||
DBT key, data;
|
||
size_t retklen, retdlen;
|
||
void *retkey, *retdata;
|
||
int ret, t_ret;
|
||
void *p;
|
||
|
||
memset(&key, 0, sizeof(key));
|
||
memset(&data, 0, sizeof(data));
|
||
|
||
/* Review the database in 5MB chunks. */
|
||
#define BUFFER_LENGTH (5 * 1024 * 1024)
|
||
if ((data.data = malloc(BUFFER_LENGTH)) == NULL)
|
||
return (errno);
|
||
data.ulen = BUFFER_LENGTH;
|
||
data.flags = DB_DBT_USERMEM;
|
||
|
||
/* Acquire a cursor for the database. */
|
||
if ((ret = dbp->cursor(dbp, NULL, &dbcp, DB_CURSOR_BULK))
|
||
!= 0) {
|
||
dbp->err(dbp, ret, "DB->cursor");
|
||
free(data.data);
|
||
return (ret);
|
||
}
|
||
|
||
for (;;) {
|
||
/*
|
||
* Acquire the next set of key/data pairs. This code
|
||
* does not handle single key/data pairs that won't fit
|
||
* in a BUFFER_LENGTH size buffer, instead returning
|
||
* DB_BUFFER_SMALL to our caller.
|
||
*/
|
||
if ((ret = dbcp->get(dbcp,
|
||
&key, &data, DB_MULTIPLE_KEY | DB_NEXT)) != 0) {
|
||
if (ret != DB_NOTFOUND)
|
||
dbp->err(dbp, ret, "DBcursor->get");
|
||
break;
|
||
}
|
||
|
||
for (DB_MULTIPLE_INIT(p, &data);;) {
|
||
DB_MULTIPLE_KEY_NEXT(p,
|
||
&data, retkey, retklen, retdata, retdlen);
|
||
if (p == NULL)
|
||
break;
|
||
printf("key: %.*s, data: %.*s\n",
|
||
(int)retklen, (char *)retkey, (int)retdlen,
|
||
(char *)retdata);
|
||
}
|
||
}
|
||
|
||
if ((t_ret = dbcp->close(dbcp)) != 0) {
|
||
dbp->err(dbp, ret, "DBcursor->close");
|
||
if (ret == 0)
|
||
ret = t_ret;
|
||
}
|
||
|
||
free(data.data);
|
||
|
||
return (ret);
|
||
}</pre>
|
||
</div>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_misc_bulk_put"></a>Bulk updates</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>To put records in bulk with the btree or hash access methods, construct bulk buffers in the <span class="bold"><strong>key</strong></span> and <span class="bold"><strong>data</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_WRITE_INIT.html" class="olink">DB_MULTIPLE_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_WRITE_NEXT.html" class="olink">DB_MULTIPLE_WRITE_NEXT</a>. To put records in bulk with the recno or queue access methods, construct bulk buffers in the <span class="bold"><strong>data</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> as before, but construct the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_INIT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_NEXT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_NEXT</a> with a data size of zero;. In both cases, set the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> flag to <a href="../api_reference/C/dbput.html" class="olink">DB->put()</a>.</p>
|
||
<p>Alternatively, for btree and hash access methods, construct a single bulk buffer in the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_WRITE_INIT.html" class="olink">DB_MULTIPLE_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_KEY_WRITE_NEXT.html" class="olink">DB_MULTIPLE_KEY_WRITE_NEXT</a>. For recno and queue access methods, construct a bulk buffer in the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_INIT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_NEXT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_NEXT</a>. In both cases, set the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> flag to <a href="../api_reference/C/dbput.html" class="olink">DB->put()</a>.
|
||
|
||
</p>
|
||
<p>A successful bulk operation is logically equivalent to a loop through each key/data pair, performing a <a href="../api_reference/C/dbput.html" class="olink">DB->put()</a> for each one.</p>
|
||
<p>
|
||
</p>
|
||
</div>
|
||
<div class="sect2" lang="en" xml:lang="en">
|
||
<div class="titlepage">
|
||
<div>
|
||
<div>
|
||
<h3 class="title"><a id="am_misc_bulk_del"></a>Bulk deletes</h3>
|
||
</div>
|
||
</div>
|
||
</div>
|
||
<p>To delete all records with a specified set of keys with the btree or hash access methods, construct a bulk buffer in the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_WRITE_INIT.html" class="olink">DB_MULTIPLE_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_WRITE_NEXT.html" class="olink">DB_MULTIPLE_WRITE_NEXT</a>. To delete a set of records with the recno or queue access methods, construct the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_INIT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_NEXT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_NEXT</a> with a data size of zero. In both cases, set the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE" class="olink">DB_MULTIPLE</a> flag to <a href="../api_reference/C/dbdel.html" class="olink">DB->del()</a>. This is equivalent to calling <a href="../api_reference/C/dbdel.html" class="olink">DB->del()</a> for each key in the bulk buffer. In particular, if the database supports duplicates, all records with the matching key are deleted.</p>
|
||
<p>Alternatively, to delete a specific set of key/data pairs, which may be items within a set of duplicates, there are also two cases depending on whether the access method uses record numbers for keys. For btree and hash access methods, construct a single bulk buffer in the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_WRITE_INIT.html" class="olink">DB_MULTIPLE_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_KEY_WRITE_NEXT.html" class="olink">DB_MULTIPLE_KEY_WRITE_NEXT</a>. For recno and queue access methods, construct a bulk buffer in the <span class="bold"><strong>key</strong></span> <a href="../api_reference/C/dbt.html" class="olink">DBT</a> using <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_INIT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_INIT</a> and <a href="../api_reference/C/DB_MULTIPLE_RECNO_WRITE_NEXT.html" class="olink">DB_MULTIPLE_RECNO_WRITE_NEXT</a>. In both cases, set the <a href="../api_reference/C/dbcget.html#dbcget_DB_MULTIPLE_KEY" class="olink">DB_MULTIPLE_KEY</a> flag to <a href="../api_reference/C/dbdel.html" class="olink">DB->del()</a>.</p>
|
||
<p>A successful bulk operation is logically equivalent to a loop through each key/data pair, performing a <a href="../api_reference/C/dbdel.html" class="olink">DB->del()</a> for each one.</p>
|
||
</div>
|
||
</div>
|
||
<div class="navfooter">
|
||
<hr />
|
||
<table width="100%" summary="Navigation footer">
|
||
<tr>
|
||
<td width="40%" align="left"><a accesskey="p" href="am_misc.html">Prev</a> </td>
|
||
<td width="20%" align="center">
|
||
<a accesskey="u" href="am_misc.html">Up</a>
|
||
</td>
|
||
<td width="40%" align="right"> <a accesskey="n" href="am_misc_partial.html">Next</a></td>
|
||
</tr>
|
||
<tr>
|
||
<td width="40%" align="left" valign="top">Chapter 4.
|
||
Access Method Wrapup
|
||
</td>
|
||
<td width="20%" align="center">
|
||
<a accesskey="h" href="index.html">Home</a>
|
||
</td>
|
||
<td width="40%" align="right" valign="top"> Partial record storage and retrieval</td>
|
||
</tr>
|
||
</table>
|
||
</div>
|
||
</body>
|
||
</html>
|