Normally, when a master failure is detected, the election should
finish quickly so that the application can continue to service
updates. In that case the participating sites are already up and can vote at once.
However, when a whole group is restarted after an
administrative shutdown, a slower-booting site may have
later logs than any other site. To cover that case, an application
should give the election more time so that all sites have a
chance to participate. Because a starting site cannot
determine which case the whole group is in, using a long timeout
gives all sites a reasonable chance to participate. If an application
wanting full participation sets the <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> method's
<span class="bold"><strong>nvotes</strong></span> argument to the number of sites
in the group and one site does not reboot, a master can never be elected
without manual intervention.
</p>
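<p>
The vote arithmetic behind this can be sketched in a few lines of C.
This is only an illustration: <code>majority_votes()</code> and
<code>can_elect()</code> are hypothetical helpers, not part of the
Berkeley DB API, and they assume a simple majority of
<code>nsites / 2 + 1</code>.</p>

```c
#include <assert.h>

/* Hypothetical helper: smallest number of votes that is a simple
 * majority of nsites. */
unsigned majority_votes(unsigned nsites)
{
    assert(nsites > 0);
    return nsites / 2 + 1;
}

/* Hypothetical helper: nonzero if an election that requires nvotes
 * can be won when only `present` sites are voting. */
int can_elect(unsigned nvotes, unsigned present)
{
    return present >= nvotes;
}
```

<p>
With three sites, requiring <code>nvotes = 3</code> means that two
rebooted sites can never elect a master, whereas requiring
<code>majority_votes(3)</code>, which is 2, lets them proceed.</p>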
<p>
In those cases, the desired action at a group level is to hold
a full election if all sites crashed and a majority election if
a subset of sites crashed or rebooted. Since an individual site cannot know
which number of votes to require, a mechanism is available to
accomplish this using timeouts. By setting a long timeout (perhaps
on the order of minutes) using the <span class="bold"><strong>DB_REP_FULL_ELECTION_TIMEOUT</strong></span>
flag to the <a href="../api_reference/C/repset_timeout.html" class="olink">DB_ENV->rep_set_timeout()</a> method, an application can
allow Berkeley DB to elect a master even without full participation.
Sites may also want to set a normal election timeout for
majority-based elections using the <span class="bold"><strong>DB_REP_ELECTION_TIMEOUT</strong></span> flag
to the <a href="../api_reference/C/repset_timeout.html" class="olink">DB_ENV->rep_set_timeout()</a> method.</p>
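<p>
A minimal sketch of setting both timeouts follows. The
<a href="../api_reference/C/repset_timeout.html" class="olink">DB_ENV->rep_set_timeout()</a> method takes its value as a 32-bit count of
microseconds, so the hypothetical helper <code>seconds_to_usec()</code>
guards against overflow (the limit is roughly 71 minutes). The 5 second
normal timeout is an illustrative assumption, and
<code>HAVE_BERKELEY_DB</code> is a hypothetical build flag that keeps
the conversion helper compilable on its own.</p>

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000u

/* Hypothetical helper: convert seconds to microseconds, refusing
 * values that would overflow 32 bits (about 71 minutes). */
uint32_t seconds_to_usec(uint32_t seconds)
{
    assert(seconds <= UINT32_MAX / USEC_PER_SEC);
    return seconds * USEC_PER_SEC;
}

#ifdef HAVE_BERKELEY_DB /* hypothetical flag: define when db.h is available */
#include <db.h>

/* Set a short timeout for majority-based elections and a long one for
 * full elections. The values are examples only. */
int configure_election_timeouts(DB_ENV *dbenv)
{
    int ret;

    /* Normal (majority-based) election: assumed 5 seconds. */
    ret = dbenv->rep_set_timeout(dbenv,
        DB_REP_ELECTION_TIMEOUT, seconds_to_usec(5));
    if (ret != 0)
        return ret;

    /* Full election: 10 minutes, as in the example in the text. */
    return dbenv->rep_set_timeout(dbenv,
        DB_REP_FULL_ELECTION_TIMEOUT, seconds_to_usec(10 * 60));
}
#endif
```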
<p>
Consider 3 sites, A, B, and C where A is the master. In the
case where all three sites crash and all reboot, all sites
will set a timeout for a full election, say 10 minutes, but only
require a majority for <span class="bold"><strong>nvotes</strong></span> to the <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> method.
Once all three sites are booted the election will complete
immediately if they reboot within 10 minutes of each other. Now consider
the case where all three sites crash and only two reboot. The two sites will
enter the election, but after the 10 minute timeout they will
elect a master with the majority of two sites. The full election
timeout therefore bounds how long the group waits for a missing site to
reboot and rejoin the group.</p>
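<p>
The three-site scenario can be modeled with a small decision function.
This is a simplified model of the behavior described above, not
Berkeley DB's election algorithm; the <code>election_state</code> type
and <code>election_outcome()</code> function are hypothetical names
introduced for illustration.</p>

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    ELECTION_WAITING,   /* full timeout not yet expired, sites missing */
    ELECTION_FULL,      /* every site participated */
    ELECTION_MAJORITY,  /* timeout expired, nvotes sites participated */
    ELECTION_FAILED     /* timeout expired, fewer than nvotes sites */
} election_state;

/* Simplified model: decide how an election stands given the group
 * size, the votes required, the sites present, and the time elapsed
 * against the full election timeout (both in microseconds). */
election_state election_outcome(unsigned nsites, unsigned nvotes,
    unsigned present, uint32_t elapsed_usec, uint32_t full_timeout_usec)
{
    assert(nvotes <= nsites && present <= nsites);

    if (present == nsites)
        return ELECTION_FULL;       /* completes immediately */
    if (elapsed_usec < full_timeout_usec)
        return ELECTION_WAITING;    /* keep waiting for slow sites */
    return present >= nvotes ? ELECTION_MAJORITY : ELECTION_FAILED;
}
```

<p>
With <code>nsites = 3</code>, <code>nvotes = 2</code> and a 10 minute
timeout, three rebooted sites yield a full election at once, while two
rebooted sites wait out the timeout and then elect with a majority.</p>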
<p>To add a database environment to the replication group with the intent
of it becoming the master, first add it as a client. Since it may be
out-of-date with respect to the current master, allow it to update
itself from the current master. Then, shut the current master down.
Presumably, the added client will win the subsequent election. If the
client does not win the election, it is likely that it was not given
sufficient time to update itself with respect to the current master.</p>
<p>If a client is unable to find a master or win an election, it means that
the network has been partitioned and there are not enough environments
participating in the election for one of the participants to win.
In this case, the application should repeatedly call <a href="../api_reference/C/repstart.html" class="olink">DB_ENV->rep_start()</a>
and <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a>, alternating between attempting to discover an
existing master, and holding an election to declare a new one. In
desperate circumstances, an application could simply declare itself the
master by calling <a href="../api_reference/C/repstart.html" class="olink">DB_ENV->rep_start()</a>, or by reducing the number of
participants required to win an election until the election is won.
Neither of these solutions is recommended: in the case of a network
partition, either of these choices can result in there being two masters
in one replication group, and the databases in the environment might
irretrievably diverge as they are modified in different ways by the
masters.</p>
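<p>
The alternation between discovery and election might be structured as
in the following sketch. Everything here is an assumption made for
illustration: the callbacks stand in for application code that would
wrap <a href="../api_reference/C/repstart.html" class="olink">DB_ENV->rep_start()</a> (as a client, followed by the application's own
check for a master) and <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a>, and the
<code>max_rounds</code> bound is added only so that the example
terminates. The <code>demo_*</code> functions are stand-ins, not real
replication code.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true on success. */
typedef bool (*step_fn)(void *ctx);

typedef enum { MASTER_FOUND, MASTER_ELECTED, NO_MASTER } search_result;

/* Alternate between looking for an existing master and holding an
 * election, for at most max_rounds rounds. */
search_result find_or_elect_master(step_fn try_discover,
    step_fn try_elect, void *ctx, int max_rounds)
{
    assert(try_discover != NULL && try_elect != NULL);

    for (int round = 0; round < max_rounds; round++) {
        if (try_discover(ctx))
            return MASTER_FOUND;
        if (try_elect(ctx))
            return MASTER_ELECTED;
        /* A real application would likely pause here before retrying. */
    }
    return NO_MASTER;
}

/* Demo stand-ins: discovery always fails, and the election succeeds
 * on the call numbered elect_succeeds_on. */
struct demo {
    int elect_calls;
    int elect_succeeds_on;
};

bool demo_discover(void *ctx)
{
    (void)ctx;
    return false;
}

bool demo_elect(void *ctx)
{
    struct demo *d = ctx;
    return ++d->elect_calls >= d->elect_succeeds_on;
}
```

<p>
Note that the loop never lowers the vote requirement or declares a
master unilaterally, in keeping with the warning above about duplicate
masters.</p>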
<p>Note that this presents a special problem for a replication group
consisting of only two environments. If a master site fails, the
remaining client can never comprise a majority of sites in the group.
If the client application can reach a remote network site, or some other
external tie-breaker, it may be able to determine whether it is safe
to declare itself master. Otherwise it must choose between providing
availability of a writable master (at the risk of duplicate masters),
or strict protection against duplicate masters (but no master when a
failure occurs). Replication Manager offers this choice via the
<a href="../api_reference/C/repconfig.html" class="olink">DB_ENV->rep_set_config()</a> method <a href="../api_reference/C/repconfig.html#config_DB_REPMGR_CONF_2SITE_STRICT" class="olink">DB_REPMGR_CONF_2SITE_STRICT</a> flag. Base API
applications can accomplish this by judicious setting of the
<span class="bold"><strong>nvotes</strong></span> and
<span class="bold"><strong>nsites</strong></span> parameters to the