<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
|||
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|||
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|||
|
<head>
|
|||
|
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
|
|||
|
<title>Replication Group Life Cycle</title>
|
|||
|
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
|
|||
|
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
|
|||
|
<link rel="start" href="index.html" title="Getting Started with Berkeley DB, Java Edition High Availability Applications" />
|
|||
|
<link rel="up" href="introduction.html" title="Chapter 1. Introduction" />
|
|||
|
<link rel="prev" href="datamanagement.html" title="Managing Data Guarantees" />
|
|||
|
<link rel="next" href="progoverview.html" title="Chapter 2. Replication API First Steps" />
|
|||
|
</head>
|
|||
|
<body>
|
|||
|
<div xmlns="" class="navheader">
|
|||
|
<div class="libver">
|
|||
|
<p>Library Version 12.2.7.5</p>
|
|||
|
</div>
|
|||
|
<table width="100%" summary="Navigation header">
|
|||
|
<tr>
|
|||
|
<th colspan="3" align="center">Replication Group Life Cycle</th>
|
|||
|
</tr>
|
|||
|
<tr>
|
|||
|
<td width="20%" align="left"><a accesskey="p" href="datamanagement.html">Prev</a> </td>
|
|||
|
<th width="60%" align="center">Chapter 1. Introduction</th>
|
|||
|
<td width="20%" align="right"> <a accesskey="n" href="progoverview.html">Next</a></td>
|
|||
|
</tr>
|
|||
|
</table>
|
|||
|
<hr />
|
|||
|
</div>
|
|||
|
<div class="sect1" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h2 class="title" style="clear: both"><a id="lifecycle"></a>Replication Group Life Cycle</h2>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<div class="toc">
|
|||
|
<dl>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#lifecycle-terms">Terminology</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#nodestates">Node States</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#lifecycle-new">New Replication Group Startup</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#lifecycle-established">Subsequent Startups</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#lifecycle-nodestartup">Replica Startup</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#lifecycle-masterfailover">Master Failover</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
<dt>
|
|||
|
<span class="sect2">
|
|||
|
<a href="lifecycle.html#twonode">Two Node Groups</a>
|
|||
|
</span>
|
|||
|
</dt>
|
|||
|
</dl>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
This section describes how your replication group behaves
|
|||
|
over the course of the application's lifetime. Startup is
|
|||
|
described, both for new nodes as well as for existing nodes
|
|||
|
that are restarting. This section also describes Master
|
|||
|
failover.
|
|||
|
</p>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="lifecycle-terms"></a>Terminology</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Before continuing, it is necessary to define some terms
|
|||
|
used in this document as they relate to
|
|||
|
node membership in a replication group.
|
|||
|
</p>
|
|||
|
<div class="itemizedlist">
|
|||
|
<ul type="disc">
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Add/Remove
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
When we say that a node has been persistently
|
|||
|
<span class="emphasis"><em>added</em></span> to a replication group,
|
|||
|
this means that it has become a persistent member of
|
|||
|
the group. Regardless of whether the node is running
|
|||
|
or otherwise reachable by the group, once it has been
|
|||
|
added to the group it remains a member of the group.
|
|||
|
If the added node is an electable node, the group size
|
|||
|
used during elections, or transaction commit
|
|||
|
acknowledgements, is increased by one. Note that
|
|||
|
secondary nodes are not persistent members of the
|
|||
|
replication group, so they are not considered to be
|
|||
|
persistently added or removed.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
A node that has been persistently added to a
|
|||
|
replication group remains a member of that group
|
|||
|
until it is explicitly <span class="emphasis"><em>removed</em></span>
|
|||
|
from the group. Once a node has been removed from
|
|||
|
the group, it is no longer a member of the group. If
|
|||
|
the node that was removed was an electable node, the
|
|||
|
group size used during elections, or transaction
|
|||
|
commit acknowledgements, is decreased by one.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Join/Leave
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
We say that a member has <span class="emphasis"><em>joined</em></span> the
|
|||
|
replication group when it starts up and begins
|
|||
|
operating in the group as an active node.
|
|||
|
Electable and secondary nodes join a replication
|
|||
|
group by successfully opening a
|
|||
|
<a class="ulink" href="../java/com/sleepycat/je/rep/ReplicatedEnvironment.html" target="_top">ReplicatedEnvironment</a> handle. Monitor nodes are
|
|||
|
not considered to join a replication group because
|
|||
|
they do not actively participate in replication or
|
|||
|
elections.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
A member, then, <span class="emphasis"><em>leaves</em></span> a
|
|||
|
replication group by shutting down, or losing the
|
|||
|
network contact that allows it to operate as an
|
|||
|
active member of the group. When operating
|
|||
|
normally, member nodes leave a replication group by
|
|||
|
closing their last <a class="ulink" href="../java/com/sleepycat/je/rep/ReplicatedEnvironment.html" target="_top">ReplicatedEnvironment</a> handle.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Joining or leaving a group does not change the
|
|||
|
electable group size, and so the number of nodes
|
|||
|
required to hold an election, as well as the
|
|||
|
number of nodes required to acknowledge
|
|||
|
transaction commits, does not change.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
</ul>
|
|||
|
</div>
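<p>
The difference between adding/removing and joining/leaving can be sketched in a few lines of Java. This is only an illustrative model, not JE code: the <code class="classname">GroupMembership</code> class and its methods are hypothetical names, and the sketch assumes that a simple majority of n electable nodes is n/2 + 1 using integer division.
</p>

```java
// Hypothetical model; none of these names are part of the JE API.
class GroupMembership {
    private int electableGroupSize; // persistent electable members
    private int activeElectable;    // electable members that have joined

    // Persistently adding an electable node grows the group size by one.
    void addElectable() { electableGroupSize++; }

    // Explicitly removing an electable node shrinks the group size by one.
    void removeElectable() { electableGroupSize--; }

    // Joining and leaving change only which members are active.
    void join() { activeElectable++; }
    void leave() { activeElectable--; }

    int electableGroupSize() { return electableGroupSize; }

    // Assumed definition of a simple majority of the electable group size.
    int simpleMajority() { return electableGroupSize / 2 + 1; }

    // An election needs a simple majority of electable nodes to be active.
    boolean canHoldElection() { return activeElectable >= simpleMajority(); }
}
```

<p>
In this model, a three-node group in which one node has left still has an electable group size of three, so two active nodes remain enough for a simple majority. Only an explicit remove changes that number.
</p>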
|
|||
|
</div>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="nodestates"></a>Node States</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Member nodes can be in the following states:
|
|||
|
</p>
|
|||
|
<div class="itemizedlist">
|
|||
|
<ul type="disc">
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Master
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
When in the Master state, a member node can service read and
|
|||
|
write requests. At any given time, there can be only one node in the
|
|||
|
Master state in the replication group.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Replica
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Member nodes in the Replica state can only service
|
|||
|
read requests. All of the electable nodes other
|
|||
|
than the Master, and all of the secondary nodes,
|
|||
|
should be in the Replica state.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Unknown
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
The member node is not aware of a Master and is actively
|
|||
|
trying to discover or elect a Master. A node in this
|
|||
|
state is constantly striving to transition to the
|
|||
|
more productive Master or Replica state.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
A node in the Unknown state can still process read
|
|||
|
transactions if the node can satisfy its transaction
|
|||
|
consistency requirements.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Detached
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
The member node has been shut down (that is, it has
|
|||
|
left the group, but it has not been removed from the
|
|||
|
group — see the previous section). It is still
|
|||
|
a member of the replication group, but is not active
|
|||
|
in elections or replicating data. Note that
|
|||
|
secondary nodes do not remain members when they are
|
|||
|
in the detached state; when they lose contact with
|
|||
|
the Master, they are no longer considered members of
|
|||
|
the group.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
</ul>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Note that from time to time this documentation uses the
|
|||
|
term <span class="emphasis"><em>active node</em></span>. An active node is a
|
|||
|
member node that is in the Master, Replica or Unknown
|
|||
|
state. More to the point, an active node is a node that is
|
|||
|
available to participate in elections — if it is an
|
|||
|
electable node — and in data replication. Monitor
|
|||
|
nodes are not considered active and do not report their
|
|||
|
state.
|
|||
|
</p>
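<p>
As a rough summary of the states described above, the following Java enum encodes what each state permits. It is a simplified sketch written for this section: the <code class="classname">NodeState</code> enum and its methods are hypothetical and are not the JE API, and the <code class="literal">consistencySatisfied</code> flag stands in for whatever consistency check your application's transactions require.
</p>

```java
// Hypothetical model of the four node states; not a JE class.
enum NodeState {
    MASTER, REPLICA, UNKNOWN, DETACHED;

    // Only a node in the Master state can service write requests.
    boolean canWrite() { return this == MASTER; }

    // Master and Replica service reads. A node in the Unknown state can
    // do so only if its transaction consistency requirements are met.
    boolean canRead(boolean consistencySatisfied) {
        switch (this) {
            case MASTER:
            case REPLICA:
                return true;
            case UNKNOWN:
                return consistencySatisfied;
            default:
                return false;
        }
    }

    // "Active node" as used in this section: Master, Replica, or Unknown.
    boolean isActive() { return this != DETACHED; }
}
```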
|
|||
|
</div>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="lifecycle-new"></a>New Replication Group Startup</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
The first time you start up a replication group using an
|
|||
|
electable node, the group
|
|||
|
exists (at least for a short time) as a group of size one. At
|
|||
|
this time, the single node belonging to the group becomes the
|
|||
|
Master. So long as there is only one electable node in the
|
|||
|
replication group, that one node behaves as if it is a
|
|||
|
non-replicated application. There are some differences in the
|
|||
|
format of the log file that the application maintains, but it
|
|||
|
otherwise behaves identically to a non-replicated transactional
|
|||
|
application.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Subsequently, upon startup a new node must be given the contact
|
|||
|
information for at least one currently active node in the
|
|||
|
replication group in order for it to be added to the
|
|||
|
group. The new node contacts this active node,
which then identifies the Master for the new node.
|
|||
|
</p>
|
|||
|
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
|
|||
|
<h3 class="title">Note</h3>
|
|||
|
<p>
|
|||
|
As is the case with elections, an electable node cannot
|
|||
|
be added to the replication group unless a simple
|
|||
|
majority of electable nodes are active at the time that
|
|||
|
it starts up. If too many nodes are down or otherwise
|
|||
|
unavailable, you cannot add a new electable node to the
|
|||
|
group.
|
|||
|
</p>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
The new node then contacts the Master, and provides all
|
|||
|
necessary identification information about itself to the
|
|||
|
Master. This includes host and port information, the
|
|||
|
node's unique name, and the replication group name. For
|
|||
|
electable nodes, the Master stores this identifying
|
|||
|
information about the node persistently, meaning the
|
|||
|
effective number of electable members of the replication
|
|||
|
group has just grown by one. For secondary nodes, the
|
|||
|
information about the node is only maintained while the
|
|||
|
secondary node is active; the number of electable
|
|||
|
members does not change.
|
|||
|
</p>
|
|||
|
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
|
|||
|
<h3 class="title">Note</h3>
|
|||
|
<p>
|
|||
|
The new electable node is now a permanent member
|
|||
|
of the replication group until you manually remove
|
|||
|
it. This is true even if you shut down the node for a long
|
|||
|
time. See <a class="xref" href="utilities.html#node-addremove" title="Adding and Removing Nodes from the Group">Adding and Removing Nodes from the Group</a> for details.
|
|||
|
</p>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Once the new node is an established member of the group, the
|
|||
|
Master provides the Replica with the logical logs needed to
|
|||
|
replicate the environment. The sequence of logical log
|
|||
|
records sent from the Master to the Replica constitutes the
|
|||
|
<span class="emphasis"><em>Replication Stream</em></span>. At this time, the
|
|||
|
node is said to have <span class="emphasis"><em>joined</em></span> the group.
|
|||
|
Once a replication stream is established, it is maintained until either the
|
|||
|
Replica or the Master goes down.
|
|||
|
</p>
|
|||
|
</div>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="lifecycle-established"></a>Subsequent Startups</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Each node stores information about
|
|||
|
other persistent replication group members in its replicated
|
|||
|
environment so that this information is available to it
|
|||
|
upon restart.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
When a node that is already an established
|
|||
|
member of a replication group is restarted, the node uses its
|
|||
|
knowledge of other members of the replication group to
|
|||
|
locate the Master. It does this by querying the
|
|||
|
members of the group to locate the current Master. If it
|
|||
|
finds a Master, the node joins the group and proceeds to operate in the
|
|||
|
group as a Replica.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
If a Master is not available and the restarting node is an
|
|||
|
electable node, the node initiates an election so as to
|
|||
|
establish a Master. If a simple majority of electable
|
|||
|
nodes are available for the election, a Master is
|
|||
|
elected. If the restarting node is elected Master, it then
|
|||
|
waits for Replicas to connect to it so that it can supply
|
|||
|
them a replication stream. If the restarting node is a
|
|||
|
secondary node, then it continues to try to find the
|
|||
|
Master, waiting for the electable nodes to elect a Master
|
|||
|
as needed.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Under ordinary circumstances, if a Master cannot be
|
|||
|
determined for some reason, the restarting node will fail to
|
|||
|
open. However, you can permit the node to instead open
|
|||
|
in the UNKNOWN state. While in this state, the node is
|
|||
|
persistently attempting to find a Master, but it is also
|
|||
|
available for read-only requests.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
To configure a node in this way, use the
|
|||
|
<a class="ulink" href="../java/com/sleepycat/je/rep/ReplicationConfig.html#setConfigParam(java.lang.String,java.lang.String)" target="_top">ReplicationConfig.setConfigParam()</a> method to set the
|
|||
|
<a class="ulink" href="../java/com/sleepycat/je/rep/ReplicationConfig.html#ENV_UNKNOWN_STATE_TIMEOUT" target="_top">ReplicationConfig.ENV_UNKNOWN_STATE_TIMEOUT</a> parameter.
|
|||
|
This parameter requires you to define a Master election
|
|||
|
timeout period. If this election timeout expires while
|
|||
|
the node is attempting to restart, then the node opens in
|
|||
|
the UNKNOWN state instead of failing its open operation
|
|||
|
entirely.
|
|||
|
</p>
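<p>
The possible outcomes of a restart described in this section can be summarized as a small decision function. This is a sketch only: <code class="classname">RestartSketch</code>, <code class="classname">OpenOutcome</code> and the three boolean inputs are hypothetical names used for illustration, and the function models the decision rather than calling JE.
</p>

```java
// Hypothetical names; this models the outcomes described above, not the JE API.
enum OpenOutcome { REPLICA, MASTER, UNKNOWN, OPEN_FAILED }

class RestartSketch {
    // masterDetermined: a Master was located or an election produced one.
    // thisNodeIsMaster: the restarting node itself won the election.
    // unknownStateTimeoutSet: the application configured the
    //     ENV_UNKNOWN_STATE_TIMEOUT parameter and that timeout expired.
    static OpenOutcome restart(boolean masterDetermined,
                               boolean thisNodeIsMaster,
                               boolean unknownStateTimeoutSet) {
        if (masterDetermined) {
            return thisNodeIsMaster ? OpenOutcome.MASTER : OpenOutcome.REPLICA;
        }
        // No Master could be determined: open in the UNKNOWN state only
        // if the application asked for that; otherwise the open fails.
        return unknownStateTimeoutSet ? OpenOutcome.UNKNOWN
                                      : OpenOutcome.OPEN_FAILED;
    }
}
```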
|
|||
|
</div>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="lifecycle-nodestartup"></a>Replica Startup</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Regardless of how it happens, when a node joins
|
|||
|
a replication group, it contacts the
|
|||
|
Master and then goes through the following three steps:
|
|||
|
</p>
|
|||
|
<div class="orderedlist">
|
|||
|
<ol type="1">
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Handshake
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
The Replica sends the Master its configuration
|
|||
|
information, along with the unique name
|
|||
|
associated with the Replica's environment. This
|
|||
|
name is a pseudo-randomly generated Universally
|
|||
|
Unique Identifier (UUID).
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
This handshake establishes the node as a valid
|
|||
|
member of the group. It is used both by new nodes
|
|||
|
joining the group for the first time, and by
|
|||
|
existing nodes that are simply restarting.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
In addition, during this handshake process, the
|
|||
|
Master and Replica nodes will compare their
|
|||
|
clocks. If the clocks are too far off from one
|
|||
|
another, the handshake will fail and the Replica
|
|||
|
node will fail to start up. See
|
|||
|
<a class="xref" href="timesync.html" title="Time Synchronization">Time Synchronization</a>
|
|||
|
for more information.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Replication Stream Sync-Up
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
The Replica sends the Master its current position
|
|||
|
in the replication stream sequence. The Master
|
|||
|
and Replica then negotiate a point in the
|
|||
|
replication stream that the Master can use as a
|
|||
|
starting point to resume the flow of logical
|
|||
|
records to the Replica.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Note that normally this sync-up process will be
|
|||
|
transparent to your application. However, in rare
|
|||
|
cases the sync-up may require that committed
|
|||
|
transactions be undone.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Also, if the Replica has been offline for a long
|
|||
|
time, it is possible that the Master can no
|
|||
|
longer supply the Replica with the required contiguous
|
|||
|
interval of the replication stream. (This can
|
|||
|
happen due to log cleaning on the Master.) In
|
|||
|
this case, the log files must be copied to the
|
|||
|
restarting node from some other up-to-date node
|
|||
|
in the replication group. See
|
|||
|
<a class="xref" href="logfile-restore.html" title="Restoring Log Files">Restoring Log Files</a>
|
|||
|
for details.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
Steady state replication stream flow
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Once the Replica has successfully started up and
|
|||
|
joined the group, the
|
|||
|
Master maintains a flow of log records to the
|
|||
|
Replica. Beyond that, the Master will request
|
|||
|
acknowledgements from electable Replicas whenever the
|
|||
|
Master needs to meet transaction commit
|
|||
|
durability requirements.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
</ol>
|
|||
|
</div>
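<p>
The three steps above can be sketched as a single function. This is a deliberately simplified model under stated assumptions: all names are hypothetical, the clock limit <code class="literal">maxSkewMs</code> is an illustrative parameter, stream positions are modeled as plain sequence numbers, and the rare case in which sync-up undoes committed transactions is not modeled at all.
</p>

```java
// Hypothetical sketch of the three Replica startup steps; not JE code.
class ReplicaStartupSketch {
    enum Outcome { HANDSHAKE_FAILED, NEEDS_LOG_RESTORE, STREAMING }

    static Outcome start(long masterClockMs, long replicaClockMs, long maxSkewMs,
                         long replicaPosition, long masterOldestRetained) {
        // Step 1, handshake: clocks that are too far apart fail the handshake.
        if (Math.abs(masterClockMs - replicaClockMs) > maxSkewMs) {
            return Outcome.HANDSHAKE_FAILED;
        }
        // Step 2, sync-up: the Replica next needs record replicaPosition + 1.
        // If log cleaning on the Master has already discarded that record,
        // the Master cannot supply a contiguous interval of the stream.
        if (masterOldestRetained > replicaPosition + 1) {
            return Outcome.NEEDS_LOG_RESTORE;
        }
        // Step 3, steady state: the Master streams log records to the Replica.
        return Outcome.STREAMING;
    }
}
```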
|
|||
|
</div>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="lifecycle-masterfailover"></a>Master Failover</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
A Master failing or shutting down causes all of the replication streams
|
|||
|
between the Master and its various Replicas to terminate.
|
|||
|
In reaction, the Replicas transition to the Unknown state
|
|||
|
and the electable nodes initiate an election.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
An election can be held if at least a simple majority of
|
|||
|
the replication group's electable nodes are active. The
|
|||
|
electable node
|
|||
|
that wins the election transitions to the Master state,
|
|||
|
and all other active nodes transition to the Replica
|
|||
|
state.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
Upon transitioning to the Replica state, nodes connect to
|
|||
|
the new Master and proceed through the handshake,
|
|||
|
sync-up, and replication replay process described in the
|
|||
|
previous section.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
If no Master can be elected (because a majority of electable nodes
|
|||
|
are not available to participate in the election), then
|
|||
|
the nodes remain in the Unknown state until such a time
|
|||
|
as a Master can be elected. In this state, the nodes
|
|||
|
might be able to service read-only requests, but the
|
|||
|
replication group is incapable of servicing write
|
|||
|
requests. Read requests can be serviced so long as the
|
|||
|
transaction's consistency requirements can be met (see
|
|||
|
<a class="xref" href="consistency.html" title="Managing Consistency">Managing Consistency</a>).
|
|||
|
</p>
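<p>
The failover rule described so far can be sketched as follows. This is an illustrative model only: the class, the method, and the way the election winner is passed in are hypothetical, and a simple majority of n electable nodes is assumed to be n/2 + 1 using integer division. Note that the failed Master still counts toward the electable group size, because it has left the group but has not been removed from it.
</p>

```java
// Hypothetical sketch of node states after a Master failure; not JE code.
class FailoverSketch {
    // activeElectable: names of the electable nodes that are still active.
    // electableGroupSize: persistent electable members, including the failed Master.
    // winner: the node assumed to win the election, if one can be held.
    static String[] statesAfterFailover(String[] activeElectable,
                                        int electableGroupSize,
                                        String winner) {
        String[] states = new String[activeElectable.length];
        boolean quorum = activeElectable.length >= electableGroupSize / 2 + 1;
        int i = 0;
        for (String node : activeElectable) {
            if (!quorum) {
                states[i] = "UNKNOWN";   // no election possible yet
            } else if (node.equals(winner)) {
                states[i] = "MASTER";    // election winner
            } else {
                states[i] = "REPLICA";   // every other active node
            }
            i++;
        }
        return states;
    }
}
```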
|
|||
|
<p>
|
|||
|
Note that the JE Replication application needs to make
|
|||
|
provisions for the following state transitions after
|
|||
|
failover:
|
|||
|
</p>
|
|||
|
<div class="itemizedlist">
|
|||
|
<ul type="disc">
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
A node that transitions from the Replica state to
|
|||
|
the Master state as a result of a failover needs
|
|||
|
to start accepting update requests. There are
|
|||
|
several ways to determine whether a node can
|
|||
|
handle update requests. See
|
|||
|
<a class="xref" href="replicawrites.html" title="Managing Write Requests at a Replica">Managing Write Requests at a Replica</a>
|
|||
|
for more information.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
<li>
|
|||
|
<p>
|
|||
|
If a node remains in the Replica state after a
|
|||
|
failover, the failover should be transparent to
|
|||
|
the application. However, an application may need
|
|||
|
to take corrective action in the rare situation
|
|||
|
where the sync-up process has to roll back
|
|||
|
committed transactions.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
See <a class="xref" href="txnrollback.html" title="Managing Transaction Rollbacks">Managing Transaction Rollbacks</a>
|
|||
|
for an example of how to handle a transaction commit rollback.
|
|||
|
</p>
|
|||
|
</li>
|
|||
|
</ul>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<div class="sect2" lang="en" xml:lang="en">
|
|||
|
<div class="titlepage">
|
|||
|
<div>
|
|||
|
<div>
|
|||
|
<h3 class="title"><a id="twonode"></a>Two Node Groups</h3>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<p>
|
|||
|
Replication groups comprised of just two electable nodes
|
|||
|
represent a unique corner case for JE replication. In
order to elect a Master, usually a simple majority of
|
|||
|
electable nodes must be available to participate in an
|
|||
|
election. However, for replication groups of size two, if
|
|||
|
even one electable node is unavailable for the election then
|
|||
|
by default it is impossible to hold an election.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
However, for some classes of application, it is desirable
|
|||
|
for the application to proceed with operations using just
|
|||
|
one electable node. That is, the application trades off the
|
|||
|
durability guarantees offered by using two electable nodes
|
|||
|
for the higher availability made possible by allowing the
|
|||
|
application to run with just one of the nodes.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
JE allows you to do this by designating one of the nodes
|
|||
|
in a two electable node group as a <span class="emphasis"><em>primary
|
|||
|
node</em></span>. When the non-primary node of the pair is
|
|||
|
not available, the number of nodes required for a simple
|
|||
|
majority is reduced from two to one by the primary
|
|||
|
node. Consequently, the primary node is able to elect itself
|
|||
|
as the Master. It can then commit transactions that require
|
|||
|
a simple majority to acknowledge commits. When the
|
|||
|
non-primary node becomes available again, the number of
|
|||
|
nodes required for a simple majority at the primary once
|
|||
|
again reverts to two.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
At any given time, at most one electable node may be designated
as the primary node. It is up to your application to make sure
that both nodes are never designated as the primary at the same
time. If this happened, and the two nodes could not
|
|||
|
communicate with one another (due to a network malfunction
|
|||
|
of some kind, for example), they could both then consider
|
|||
|
themselves to be Masters and start accepting write
|
|||
|
requests. This violates a fundamental requirement that at
|
|||
|
any given instant in time, there should be exactly one node
|
|||
|
that is permitted to perform writes on the replicated
|
|||
|
environment.
|
|||
|
</p>
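<p>
The primary node rule, and the hazard of designating both nodes as primary, can be illustrated with a short sketch. The <code class="classname">TwoNodeQuorum</code> class and its methods are hypothetical names used only for illustration; they model the arithmetic described above and are not part of the JE API.
</p>

```java
// Hypothetical model of a group with exactly two electable nodes; not JE code.
class TwoNodeQuorum {
    // Number of nodes this node requires for a simple majority.
    static int requiredForMajority(boolean designatedPrimary,
                                   boolean otherNodeAvailable) {
        if (designatedPrimary) {
            if (!otherNodeAvailable) {
                return 1; // the primary reduces the requirement from two to one
            }
        }
        return 2;         // normal case, and always for the non-primary node
    }

    // True if this node can elect itself Master from its own point of view.
    static boolean canBecomeMaster(boolean designatedPrimary,
                                   boolean otherNodeAvailable) {
        int available = otherNodeAvailable ? 2 : 1;
        return available >= requiredForMajority(designatedPrimary, otherNodeAvailable);
    }
}
```

<p>
If both nodes were designated as primary and then lost contact with each other, <code class="literal">canBecomeMaster(true, false)</code> would be true on each side, which is exactly the two-Master situation this section warns against.
</p>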
|
|||
|
<p>
|
|||
|
Note that the non-primary electable node always needs two
|
|||
|
electable nodes for a simple majority, so it can never
|
|||
|
become the Master in the absence of the primary node. If the
|
|||
|
primary node fails, you can make provisions to swap the
|
|||
|
primary and non-primary designations so that the surviving
|
|||
|
node is now the primary. This swap must be performed
|
|||
|
carefully so as to ensure that both nodes are not
|
|||
|
concurrently designated the primary. The most important
|
|||
|
thing is that the failed node comes up as the non-primary
|
|||
|
after it has been repaired.
|
|||
|
</p>
|
|||
|
<p>
|
|||
|
For more information on using two-node groups, see
|
|||
|
<a class="xref" href="two-node.html" title="Configuring Two-Node Groups">Configuring Two-Node Groups</a>.
|
|||
|
</p>
|
|||
|
</div>
|
|||
|
</div>
|
|||
|
<div class="navfooter">
|
|||
|
<hr />
|
|||
|
<table width="100%" summary="Navigation footer">
|
|||
|
<tr>
|
|||
|
<td width="40%" align="left"><a accesskey="p" href="datamanagement.html">Prev</a> </td>
|
|||
|
<td width="20%" align="center">
|
|||
|
<a accesskey="u" href="introduction.html">Up</a>
|
|||
|
</td>
|
|||
|
<td width="40%" align="right"> <a accesskey="n" href="progoverview.html">Next</a></td>
|
|||
|
</tr>
|
|||
|
<tr>
|
|||
|
<td width="40%" align="left" valign="top">Managing Data Guarantees </td>
|
|||
|
<td width="20%" align="center">
|
|||
|
<a accesskey="h" href="index.html">Home</a>
|
|||
|
</td>
|
|||
|
<td width="40%" align="right" valign="top"> Chapter 2. Replication API First Steps</td>
|
|||
|
</tr>
|
|||
|
</table>
|
|||
|
</div>
|
|||
|
</body>
|
|||
|
</html>
|