Normal operation of JE HA requires that at least a simple majority of electable nodes be available to form a quorum for election of a new Master, or when committing a transaction with default durability requirements. The number of electable nodes (the Electable Group Size) is obtained from persistent internal metadata that is stored in the environment and replicated across all members. See Replication Group Life Cycle for details.
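The simple-majority rule above comes down to a small piece of arithmetic. The following is a minimal sketch for illustration only; the class and method names are hypothetical and are not part of the JE API or its internal implementation.

```java
// Illustrative arithmetic only; QuorumMath is a hypothetical helper, not a JE class.
final class QuorumMath {

    /** Smallest node count that is strictly more than half of the group. */
    static int simpleMajority(int electableGroupSize) {
        if (electableGroupSize < 1) {
            throw new IllegalArgumentException("group size must be at least 1");
        }
        return electableGroupSize / 2 + 1;
    }

    /** Number of electable nodes that can be down while a quorum can still form. */
    static int toleratedFailures(int electableGroupSize) {
        return electableGroupSize - simpleMajority(electableGroupSize);
    }

    public static void main(String[] args) {
        for (int size = 1; size <= 5; size++) {
            System.out.printf("group size %d: quorum %d, tolerated failures %d%n",
                    size, simpleMajority(size), toleratedFailures(size));
        }
    }
}
```

Under this rule a five-node group needs three nodes for a quorum and so tolerates two failures, while a two-node group tolerates none.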
Under exceptional circumstances, a simple majority of electable nodes may become unavailable for some period of time. With only a minority of electable nodes available, the overall availability of the group can be adversely affected. For example, the group may be unavailable for writes because a Master cannot be elected. Also, the Master may be unable to satisfy the durability requirements for a transaction commit. The group may also be unavailable for reads, because the absence of a Master might cause a Replica to be unable to meet consistency requirements.
To deal with this exceptional circumstance, especially if the situation is likely to persist for an unacceptably long period of time, JE HA provides an escape mechanism that modifies the way in which the number of electable nodes, and consequently the quorum requirements for elections and commit acknowledgments, is calculated. The mechanism overrides the normal computation of the Electable Group Size. You specify the override size using the mutable replication configuration parameter ELECTABLE_GROUP_SIZE_OVERRIDE.
You should use this parameter sparingly, if at all. Overriding your Electable Group Size can have the consequence of allowing your replication group's election participants to elect two Masters simultaneously. This is especially likely to occur if a majority of the nodes are unavailable due to a network partition event, and so all nodes are running but are simply not communicating with one another.
Be very cautious when using this configuration option.
When you set ELECTABLE_GROUP_SIZE_OVERRIDE to a non-zero value, the number that you provide is used in place of the internally stored Electable Group Size when quorum requirements are computed. The stored value is ignored (but not changed) while this option is non-zero. By setting ELECTABLE_GROUP_SIZE_OVERRIDE to the number of electable nodes known to be available, the remaining replication group participants can make forward progress, both in terms of electing a new Master (if this is required) and in terms of meeting durability and consistency requirements.
When this option is zero (0), then the node will behave normally, and the internal Electable Group Size is honored by the node. This is the default value and behavior.
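The override semantics, and the dual-Master risk that comes with them, can be sketched as follows. This is a simplified model under stated assumptions, not JE's implementation: the names are hypothetical, and quorum is assumed to be a simple majority of whichever group size the node is using.

```java
// Simplified model; OverrideSketch and its methods are hypothetical, not JE classes.
final class OverrideSketch {

    static int simpleMajority(int groupSize) {
        return groupSize / 2 + 1;
    }

    /** Group size a node uses for quorum: the override when non-zero, else the stored size. */
    static int effectiveGroupSize(int storedSize, int override) {
        return (override != 0) ? override : storedSize;
    }

    /** True if a node that can reach this many electable nodes (itself included) sees a quorum. */
    static boolean hasQuorum(int reachable, int storedSize, int override) {
        return reachable >= simpleMajority(effectiveGroupSize(storedSize, override));
    }

    public static void main(String[] args) {
        int stored = 5;

        // Two survivors, no override: quorum of 3 is required, so they are stuck.
        System.out.println(hasQuorum(2, stored, 0)); // false

        // Same two survivors with the override set to 2: quorum of 2 is met.
        System.out.println(hasQuorum(2, stored, 2)); // true

        // If the other three nodes were only partitioned away, not down, they
        // still meet the normal quorum, so both sides could elect a Master.
        System.out.println(hasQuorum(3, stored, 0)); // true
    }
}
```

The last two cases together are the hazard: the override is only safe when the nodes on the other side really are down.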
To override the internal Electable Group Size value:
1. Verify that a simple majority of electable nodes are in fact down and cannot elect their own independent Master.
2. Set ELECTABLE_GROUP_SIZE_OVERRIDE to the number of electable nodes known to be available. For best results, set this override on all available electable nodes.
It might be sufficient to set ELECTABLE_GROUP_SIZE_OVERRIDE on just one electable node in order to hold an election, because the proposer at that one node can conclude the election. However, if the election results in a Master that is not configured with this override, the Master might throw InsufficientAcksException when committing transactions. So, again, set the override on all available electable nodes.
Having set the override, the available electable members of the replication group can now meet quorum requirements.
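If you set the override through a properties file, a small check before restarting a node can confirm the value that file carries. The sketch below uses only java.util.Properties. The property name je.rep.electableGroupSizeOverride is our understanding of the string behind ELECTABLE_GROUP_SIZE_OVERRIDE and should be verified against the Javadoc for your JE release; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper; it only parses properties text and does not touch JE.
final class OverridePropertyCheck {

    // Assumed property name; confirm it against your JE release before relying on it.
    static final String OVERRIDE_KEY = "je.rep.electableGroupSizeOverride";

    /** Returns the override found in the given properties text, or 0 if it is absent. */
    static int configuredOverride(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Integer.parseInt(props.getProperty(OVERRIDE_KEY, "0").trim());
    }

    public static void main(String[] args) {
        // Sample file content for a node that should run with an override of 2.
        String sample = "# override while a majority of nodes is down\n"
                + OVERRIDE_KEY + "=2\n";
        System.out.println("configured override = " + configuredOverride(sample));
    }
}
```

An absent property is reported as 0, which matches the default behavior in which the stored Electable Group Size is honored.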
Having restored the group to a functioning state by use of ELECTABLE_GROUP_SIZE_OVERRIDE, it is desirable to return the group to its normal state as soon as possible. The normal operating state is one where the Electable Group Size is maintained by JE HA, and the override is no longer used.
To restore the group to its normal operational state, do one of the following:
Remove from the group any electable nodes that you know will be down for an extended period of time. Remove the nodes using the ReplicationGroupAdmin.removeMember() API.
Bring electable nodes back up as they become available again, so that they can join the functioning group. This must be done carefully, one node at a time, in order to avoid the small possibility that a majority of the downed nodes hold an election amongst themselves and elect a second Master.
Perform some combination of node removal and bringing up nodes which were previously down.
As soon as there is a sufficient number of electable nodes up and running that election quorum requirements can be met in the absence of the override, the override can be removed, and normal HA operations resumed.
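The condition for removing the override can be expressed as a simple check. This is a minimal sketch with hypothetical names. It assumes that quorum is a simple majority of the stored Electable Group Size, and that each node removed with ReplicationGroupAdmin.removeMember() reduces that stored size by one.

```java
// Hypothetical helper illustrating when the override is no longer needed.
final class OverrideRemovalCheck {

    static int simpleMajority(int groupSize) {
        return groupSize / 2 + 1;
    }

    /** True when the available nodes form a quorum of the stored group size without any override. */
    static boolean canRemoveOverride(int storedGroupSize, int availableNodes) {
        return availableNodes >= simpleMajority(storedGroupSize);
    }

    public static void main(String[] args) {
        // Five-node group, two nodes up: the override is still required.
        System.out.println(canRemoveOverride(5, 2)); // false

        // One downed node returns: three of five is a true majority.
        System.out.println(canRemoveOverride(5, 3)); // true

        // Alternatively, two downed nodes are removed from the group (assumed to
        // shrink the stored size to 3): the two survivors are now a majority.
        System.out.println(canRemoveOverride(3, 2)); // true
    }
}
```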
Consider a group consisting of 5 electable nodes: n1-n5. Suppose a simple majority of the nodes (n3-n5) have become unavailable.
If one of the nodes in n3-n5 was the Master, then nodes n1 and n2 will try to hold an election, and fail due to the lack of a quorum. We now carry out the steps described above:
1. Verify that n3-n5 are down.
2. Set ELECTABLE_GROUP_SIZE_OVERRIDE to 2. Do this at both n1 and n2. You can do this dynamically using JConsole, or by setting the property in the je.properties file and restarting the node.
3. n1 and n2 will choose a new Master, say, n1. n1 can now process write operations, and n2 can acknowledge transaction commits.
4. Suppose that n3 is now repaired. You can bring it back online and it will automatically locate the new Master and join the group. As is normal, it will catch up to n1 and n2 in the replication stream, and then begin acknowledging commits as requested by n1.
5. We now have three electable nodes that are operational. Because we have a true simple majority of electable nodes available, we can now reset ELECTABLE_GROUP_SIZE_OVERRIDE to 0 (do this on n1 and n2), which causes the replication group to resume normal operations. Note that n1 remains the Master.
If n2 was the Master at the time of the failure, then the situation is similar, except that an election is not held. In this case, n2 will continue to remain the Master throughout the entire process described above. However, n2 might not be able to meet quorum requirements for transaction commits until step 2 (above) is performed.
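The commit side of this scenario can be sketched in the same way as the election side. The model below is an illustration with hypothetical names. It assumes a simple-majority acknowledgment policy in which the Master counts itself toward the majority, so the number of Replica acknowledgments it needs is one less than the quorum.

```java
// Hypothetical model of commit acknowledgments; not JE's implementation.
final class CommitAckSketch {

    static int simpleMajority(int groupSize) {
        return groupSize / 2 + 1;
    }

    /** Replica acknowledgments needed, assuming the Master counts toward the majority. */
    static int requiredReplicaAcks(int effectiveGroupSize) {
        return simpleMajority(effectiveGroupSize) - 1;
    }

    /** True if the Master can gather enough acknowledgments from the available Replicas. */
    static boolean canCommit(int effectiveGroupSize, int availableReplicas) {
        return availableReplicas >= requiredReplicaAcks(effectiveGroupSize);
    }

    public static void main(String[] args) {
        // n2 is still Master and n1 is its only reachable Replica.
        // Without the override the group size is 5, so two acknowledgments are needed.
        System.out.println(canCommit(5, 1)); // false

        // With the override set to 2, one acknowledgment from n1 is enough.
        System.out.println(canCommit(2, 1)); // true
    }
}
```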