libdb/docs/gsg_db_rep/C/introduction.html
2012-11-14 16:35:20 -05:00

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Chapter 1. Introduction</title>
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
<link rel="start" href="index.html" title="Getting Started with Replicated Berkeley DB Applications" />
<link rel="up" href="index.html" title="Getting Started with Replicated Berkeley DB Applications" />
<link rel="prev" href="moreinfo.html" title="For More Information" />
<link rel="next" href="repadvantage.html" title="Replication Benefits" />
</head>
<body>
<div xmlns="" class="navheader">
<div class="libver">
<p>Library Version 11.2.5.3</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center">Chapter 1. Introduction</th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="moreinfo.html">Prev</a> </td>
<th width="60%" align="center"> </th>
<td width="20%" align="right"> <a accesskey="n" href="repadvantage.html">Next</a></td>
</tr>
</table>
<hr />
</div>
<div class="chapter" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title"><a id="introduction"></a>Chapter 1. Introduction</h2>
</div>
</div>
</div>
<div class="toc">
<p>
<b>Table of Contents</b>
</p>
<dl>
<dt>
<span class="sect1">
<a href="introduction.html#overview">Overview</a>
</span>
</dt>
<dd>
<dl>
<dt>
<span class="sect2">
<a href="introduction.html#repenvirons">Replication Environments</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="introduction.html#repdbs">Replication Databases</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="introduction.html#commlayer">Communications Layer</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="introduction.html#masterselect">Selecting a Master</a>
</span>
</dt>
</dl>
</dd>
<dt>
<span class="sect1">
<a href="repadvantage.html">Replication Benefits</a>
</span>
</dt>
<dt>
<span class="sect1">
<a href="apioverview.html">The Replication APIs</a>
</span>
</dt>
<dd>
<dl>
<dt>
<span class="sect2">
<a href="apioverview.html#repframeworkoverview">Replication Manager Overview</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="apioverview.html#repapioverview">Replication Base API Overview</a>
</span>
</dt>
</dl>
</dd>
<dt>
<span class="sect1">
<a href="elections.html">Holding Elections</a>
</span>
</dt>
<dd>
<dl>
<dt>
<span class="sect2">
<a href="elections.html#influencingelections">Influencing Elections</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="elections.html#winningelections">Winning Elections</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="elections.html#switchingmasters">Switching Masters</a>
</span>
</dt>
</dl>
</dd>
<dt>
<span class="sect1">
<a href="permmessages.html">Permanent Message Handling</a>
</span>
</dt>
<dd>
<dl>
<dt>
<span class="sect2">
<a href="permmessages.html#permmessagenot">When Not to Manage
Permanent Messages</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="permmessages.html#permmanage">Managing Permanent Messages</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="permmessages.html#permimplement">Implementing Permanent
Message Handling</a>
</span>
</dt>
</dl>
</dd>
</dl>
</div>
<p>
This book provides a thorough introduction to replication as used
with Berkeley DB (DB). It begins with a general overview of
replication and the benefits it provides. It then describes the APIs
that you use to implement replication, and the architectural changes
that you need to make to your application code in order to use those
APIs. Finally, it discusses how backup and restore strategies differ
when you use replication, especially when it comes to log
file removal.
</p>
<p>
You should understand the concepts from the
<span>
<em class="citetitle">Berkeley DB Getting Started with Transaction Processing</em>
</span>
guide before reading this book.
</p>
<div class="sect1" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title" style="clear: both"><a id="overview"></a>Overview</h2>
</div>
</div>
</div>
<div class="toc">
<dl>
<dt>
<span class="sect2">
<a href="introduction.html#repenvirons">Replication Environments</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="introduction.html#repdbs">Replication Databases</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="introduction.html#commlayer">Communications Layer</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="introduction.html#masterselect">Selecting a Master</a>
</span>
</dt>
</dl>
</div>
<p>
The DB replication APIs allow you to distribute your database
write operations (performed on a read-write master) to one or
more read-only <span class="emphasis"><em>replicas</em></span>.
For this reason, DB's replication implementation is said to be a
<span class="emphasis"><em>single master, multiple replica</em></span> replication strategy.
</p>
<p>
Note that database write operations can occur only on the
master; any attempt to write to a replica results in an error
being
<span>returned by</span>
the DB method used to perform the write.
</p>
<p>
A single replication master and all of its replicas are referred
to as a <span class="emphasis"><em>replication group</em></span>. While all
members of the replication group can reside on the same
machine, usually each replication participant is placed on a
separate physical machine somewhere on the network.
</p>
<p>
Note that all replicated applications must first be
transactional applications. The data that the master transmits
to its replicas consists of the log records that are generated as
records are updated. When a transaction commits, the master transmits a
transaction record, which tells the replicas to commit the
records they previously received from the master. Because this
mechanism depends entirely on transactional logging, it is
recommended that you write and debug your DB application as
a stand-alone transactional application before introducing the
replication layer to your code.
</p>
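<p>
The commit behavior described above can be illustrated with a toy
model of a replica, written in plain C without the DB library. All
names here (<code class="literal">struct log_msg</code>,
<code class="literal">struct replica</code>,
<code class="literal">replica_receive()</code>) are hypothetical, and real
DB log records are opaque to the application. This is only a sketch
of the idea that updates are held until the matching commit arrives.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified message kinds; real DB log records are opaque. */
enum msg_kind { MSG_UPDATE, MSG_COMMIT };

struct log_msg {
    enum msg_kind kind;
    int txn_id;   /* transaction the message belongs to */
    int key;      /* slot to update (MSG_UPDATE only) */
    int value;    /* new value (MSG_UPDATE only) */
};

#define MAX_PENDING 16
#define NUM_SLOTS 8

struct replica {
    int data[NUM_SLOTS];                  /* committed state */
    struct log_msg pending[MAX_PENDING];  /* received but not yet committed */
    size_t npending;
};

/*
 * Process one message from the master.
 * Returns 0 on success, -1 if the update cannot be buffered.
 */
int replica_receive(struct replica *r, const struct log_msg *m)
{
    size_t i, kept = 0;

    assert(r != NULL && m != NULL);

    if (m->kind == MSG_UPDATE) {
        if (r->npending == MAX_PENDING || m->key < 0 || m->key >= NUM_SLOTS)
            return -1;
        /* Buffer the update; it is not visible until its commit arrives. */
        r->pending[r->npending++] = *m;
        return 0;
    }

    /* MSG_COMMIT: apply this transaction's updates in order, keep the rest. */
    for (i = 0; i < r->npending; i++) {
        if (r->pending[i].txn_id == m->txn_id)
            r->data[r->pending[i].key] = r->pending[i].value;
        else
            r->pending[kept++] = r->pending[i];
    }
    r->npending = kept;
    return 0;
}
```

<p>
A caller would zero-initialize a <code class="literal">struct replica</code>,
feed it update messages as they arrive, and observe that
<code class="literal">data</code> changes only after the commit message for
the corresponding transaction has been processed.
</p>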
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="repenvirons"></a>Replication Environments</h3>
</div>
</div>
</div>
<p>
The most important requirement for a replication
participant is that it must use a unique Berkeley DB database
environment independent of all other replication
participants. So while multiple replication participants
can reside on the same physical machine, no two such participants
can share the same environment home directory.
</p>
<p>
For this reason, replication technically occurs between
unique <span class="emphasis"><em>database environments</em></span>. In the strictest sense,
a replication group consists of a <span class="emphasis"><em>master
environment</em></span> and
one or more <span class="emphasis"><em>replica environments</em></span>. In production,
however, each such environment is usually located on its own
machine. Consequently,
this manual sometimes talks about <span class="emphasis"><em>replication sites</em></span>, meaning the
unique combination of environment home directory, host, and port that a specific
replication application is using.
</p>
<p>
There is no DB-specified limit to the number of
environments which can participate in a replication group.
The only limitation here is one of resources —
network bandwidth, for example.
</p>
<p>
(Note, however, that the Replication Manager does place a limit on the
number of environments you can use. See
<a class="xref" href="apioverview.html#repframeworkoverview" title="Replication Manager Overview">Replication Manager Overview</a>
for details.)
</p>
<p>
Also, DB's replication implementation requires all
participating environments to be assigned IDs. These IDs need
only be unique locally, that is, within the given environment.
Depending on the replication APIs that you choose to use, you
may or may not need to manage this particular detail.
</p>
<p>
For detailed information on database environments, see
the <em class="citetitle">Berkeley DB Getting Started with Transaction Processing</em>
guide. For more information on environment IDs, see
the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
</p>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="repdbs"></a>Replication Databases</h3>
</div>
</div>
</div>
<p>
DB's databases are managed and used in exactly the same way
as in a non-replicated application, with
two caveats. First, the databases maintained in a replicated environment
must reside either in the <code class="literal">ENV_HOME</code>
directory, or in the directory identified by the
<code class="methodname">DB_ENV-&gt;set_data_dir()</code>
method. Unlike in non-replicated applications, you cannot place your
databases in a subdirectory below these locations. Second, you should
not use full path names for your databases or
environments, because these are likely to break when they are replicated
to other machines.
</p>
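<p>
One way to guard against both caveats is to validate database names
before opening them. The helper below is a minimal sketch in plain C;
<code class="literal">is_portable_db_name()</code> is a hypothetical
function, not part of the DB API, and it simply rejects any name that
contains a directory component.
</p>

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the DB API).
 * Returns 1 if name is a bare file name with no directory component,
 * 0 otherwise. Rejecting every path separator covers both full path
 * names and subdirectories.
 */
int is_portable_db_name(const char *name)
{
    assert(name != NULL);

    if (name[0] == '\0')
        return 0;
    if (strchr(name, '/') != NULL || strchr(name, '\\') != NULL)
        return 0;
    return 1;
}
```

<p>
An application could call this check on each database name and refuse
to open any database for which it returns 0.
</p>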
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="commlayer"></a>Communications Layer</h3>
</div>
</div>
</div>
<p>
In order to transmit database writes to the
replicas, DB requires a communications layer.
DB is agnostic as to what this layer should
look like. The only requirement is that it
be capable of passing two opaque data objects and an
environment ID from the master to its replicas without
corruption.
</p>
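<p>
To make this requirement concrete, the following sketch shows one
possible way to frame two opaque objects and an environment ID into a
single byte buffer and recover them on the receiving side. It is plain
C and does not use the DB library. The
<code class="literal">struct opaque</code> type, the function names, and
the wire layout are all assumptions made for illustration, as is the
choice to treat the environment ID as a non-negative 32-bit value.
</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an opaque data object: bytes plus a length. */
struct opaque {
    const void *data;
    uint32_t size;
};

/* Store a 32-bit value in big-endian order so both ends agree. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Assumed layout: envid (4 bytes), control size (4), rec size (4),
 * control bytes, rec bytes. Returns the frame length, or 0 if buf
 * is too small.
 */
size_t frame_pack(unsigned char *buf, size_t buflen, uint32_t envid,
                  const struct opaque *control, const struct opaque *rec)
{
    uint64_t need = 12 + (uint64_t)control->size + rec->size;

    if (need > buflen)
        return 0;
    put_u32(buf, envid);
    put_u32(buf + 4, control->size);
    put_u32(buf + 8, rec->size);
    if (control->size > 0)
        memcpy(buf + 12, control->data, control->size);
    if (rec->size > 0)
        memcpy(buf + 12 + control->size, rec->data, rec->size);
    return (size_t)need;
}

/*
 * Recover the three items from a frame. The returned objects point
 * into buf. Returns 0 on success, -1 if the frame is malformed.
 */
int frame_unpack(const unsigned char *buf, size_t len, uint32_t *envid,
                 struct opaque *control, struct opaque *rec)
{
    uint32_t csize, rsize;

    if (len < 12)
        return -1;
    csize = get_u32(buf + 4);
    rsize = get_u32(buf + 8);
    if ((uint64_t)csize + rsize != (uint64_t)(len - 12))
        return -1;
    *envid = get_u32(buf);
    control->data = buf + 12;
    control->size = csize;
    rec->data = buf + 12 + csize;
    rec->size = rsize;
    return 0;
}
```

<p>
The sender would pass the packed frame to whatever transport it uses,
and the receiver would unpack it before handing the pieces to the
replication code. A length-prefixed layout is only one option; any
scheme works as long as the three items arrive intact.
</p>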
<p>
Because replicas are usually placed on different machines on
the network, the communications layer is usually some kind
of a network-aware implementation. Beyond that, its
implementation details are largely up to you. It could use
TCP/IP sockets, for example, or it could use
raw sockets if they perform better for your particular
application.
</p>
<p>
Note that you may not have to write your own communications
layer. DB provides a Replication Manager that
includes a fully functional TCP/IP-based communications layer.
See <a class="xref" href="apioverview.html" title="The Replication APIs">The Replication APIs</a>
for more information.
</p>
<p>
See the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>
for a description of how to
write your own custom replication communications layer.
</p>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="masterselect"></a>Selecting a Master</h3>
</div>
</div>
</div>
<p>
Every replication group is allowed one and only one
master environment. Usually masters are selected by
holding an <span class="emphasis"><em>election</em></span>, although it
is possible to turn elections off and manually select
masters (this is not recommended for most replicated
applications).
</p>
<p>
When elections are being used, they are performed by the
underlying Berkeley DB replication code so you have to
do very little to implement them.
</p>
<p>
When holding an election, replicas "vote" on which of them should
be the master. Among the replicas participating in the
election, the one with the most up-to-date set of log
records wins. Note that a tie is possible.
When this occurs, priorities are
used to select the master. For more information on holding and
managing elections, see
<a class="xref" href="elections.html" title="Holding Elections">Holding Elections</a>.
</p>
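<p>
The selection rule described above can be sketched as a small,
self-contained function. This is a simplified model only, not DB's
election code: <code class="literal">struct candidate</code>,
<code class="literal">elect_master()</code>, and the representation of
a log position as a file number plus an offset are assumptions made
for illustration, and the sketch further assumes that a larger
priority value is preferred.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of one election participant. */
struct candidate {
    int envid;                 /* environment ID */
    unsigned long log_file;    /* position of the latest log record */
    unsigned long log_offset;
    int priority;              /* assumed: larger value is preferred */
};

/* Negative, zero, or positive as a's log is behind, level with, or ahead of b's. */
static int log_cmp(const struct candidate *a, const struct candidate *b)
{
    if (a->log_file != b->log_file)
        return a->log_file < b->log_file ? -1 : 1;
    if (a->log_offset != b->log_offset)
        return a->log_offset < b->log_offset ? -1 : 1;
    return 0;
}

/*
 * Pick the candidate with the most up-to-date log, breaking ties
 * by priority. Returns the winner's index, or -1 if n is 0.
 */
int elect_master(const struct candidate *c, size_t n)
{
    size_t i, best = 0;

    assert(c != NULL || n == 0);

    if (n == 0)
        return -1;
    for (i = 1; i < n; i++) {
        int cmp = log_cmp(&c[i], &c[best]);
        if (cmp > 0 || (cmp == 0 && c[i].priority > c[best].priority))
            best = i;
    }
    return (int)best;
}
```

<p>
In this model, a candidate with a lower priority still wins if its log
is further ahead; priority matters only when log positions are equal.
</p>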
</div>
</div>
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href="moreinfo.html">Prev</a> </td>
<td width="20%" align="center"> </td>
<td width="40%" align="right"> <a accesskey="n" href="repadvantage.html">Next</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">For More Information </td>
<td width="20%" align="center">
<a accesskey="h" href="index.html">Home</a>
</td>
<td width="40%" align="right" valign="top"> Replication Benefits</td>
</tr>
</table>
</div>
</body>
</html>