Configuring Two-Node Groups

A group needs at least a simple majority of active nodes to elect a Master. For a replication group of size two, this means that the failure of a single node leaves the group as a whole unavailable. In some cases, it may be desirable for the application to proceed anyway. If you are using a two-node group and want your application to continue even when one of the nodes is unavailable, you can trade away some of your durability guarantees, and potentially some performance, in exchange for a higher availability guarantee.

JE HA can explicitly relax the requirement for a simple majority of nodes. This is only possible when the replication group size is two. The application does this by designating one of the two electable nodes as a Primary node. The other node in the group is implicitly the Non-Primary node.

At any given instant in time, exactly one of the two nodes can be designated as the Primary. The application is responsible for ensuring that this is the case.

When the Non-Primary node is not available, the number of nodes required for a simple majority is reduced to one. As a consequence, the Primary is able to elect itself as the Master and then commit transactions that require a simple majority to commit. The Primary is said to be active when it is operating in this state. The transition from a designated Primary to an active Primary happens when the Primary needs to contact the Non-Primary node but fails to do so for one of the following reasons:

- The Non-Primary cannot be contacted during an election, even after the number of retries specified by ELECTIONS_PRIMARY_RETRIES.
- A transaction commit that requires an acknowledgment from the Non-Primary fails after the INSUFFICIENT_REPLICAS_TIMEOUT period, because the connection to the Non-Primary is down.
- A transaction commit that requires an acknowledgment from the Non-Primary fails after the REPLICA_ACK_TIMEOUT period, because the Non-Primary did not respond in time.

Both the INSUFFICIENT_REPLICAS_TIMEOUT and REPLICA_ACK_TIMEOUT error cases are driven by the durability policy that you are using for your transactions. See Managing Durability for more information.

The three properties described above, ELECTIONS_PRIMARY_RETRIES, INSUFFICIENT_REPLICAS_TIMEOUT, and REPLICA_ACK_TIMEOUT, determine the time taken for the Primary to become active in the absence of the Non-Primary. Choosing smaller values for the timeouts and election retries generally results in shorter service disruptions, because the Primary is activated more rapidly. The downside is that transient network glitches may cause unnecessary transitions to the active state, in which the Primary operates with reduced durability. It is up to the application to make these tradeoffs appropriately based on its operating environment.
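As a sketch of how these tradeoffs might be configured, the fragment below shortens the two timeouts and the election retry count so that a lone Primary activates quickly. The property constants are the ones discussed above; the specific values are illustrative assumptions, not recommendations.

```java
import com.sleepycat.je.rep.ReplicationConfig;
import com.sleepycat.je.rep.ReplicationMutableConfig;

ReplicationConfig repConfig = new ReplicationConfig();

// Fail commits sooner when the connection to the Non-Primary is down,
// so the Primary transitions to the active state more quickly.
repConfig.setConfigParam(
    ReplicationConfig.INSUFFICIENT_REPLICAS_TIMEOUT, "5 s");

// Fail commits sooner when the Non-Primary does not acknowledge in time.
repConfig.setConfigParam(
    ReplicationConfig.REPLICA_ACK_TIMEOUT, "5 s");

// Give up on contacting the Non-Primary during elections after two retries.
repConfig.setConfigParam(
    ReplicationMutableConfig.ELECTIONS_PRIMARY_RETRIES, "2");
```

Smaller values like these favor availability; larger values make the Primary more tolerant of transient network glitches, at the cost of a longer disruption when the Non-Primary really is down.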

When the Non-Primary becomes available again, the Primary becomes aware of it as part of the Master/Replica handshake (see Replica Startup). At that time, the number of nodes required for a simple majority reverts to two. That is, the Primary is no longer in the active state.

Your application must be very careful to not designate two nodes as Primaries. If both nodes are designated as Primaries, and the two nodes cannot communicate with one another for some reason, they could both consider themselves to be Masters and start accepting write transactions. This would violate a fundamental requirement of JE HA that at any given instant in time, there is only one node that is permitted to write to the replicated environment.

The Non-Primary always needs two nodes for a simple majority, and as a result can never become a Master in the absence of the Primary. If the Primary node fails, you can make provisions to swap the Primary and Non-Primary designations, so that the surviving node is now the Primary. The swap must be done carefully to ensure that both nodes are not concurrently designated Primaries. In particular, the failed node must come up as a Non-Primary after it has been repaired.
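One way such a swap might look, assuming the surviving node's open ReplicatedEnvironment handle is available in a hypothetical variable named survivor, is sketched below using the mutable DESIGNATED_PRIMARY property.

```java
import com.sleepycat.je.rep.ReplicatedEnvironment;
import com.sleepycat.je.rep.ReplicationMutableConfig;

// Step 1: promote the surviving node to Primary. It can now win an
// election on its own and continue accepting write transactions.
ReplicationMutableConfig mutableConfig = survivor.getRepMutableConfig();
mutableConfig.setDesignatedPrimary(true);
survivor.setRepMutableConfig(mutableConfig);

// Step 2: when the failed node is repaired, it must be brought back up
// with setDesignatedPrimary(false) in its configuration, so that the two
// nodes are never concurrently designated Primaries.
```

Note that step 2 is an operational requirement, not something JE HA can enforce for you while the nodes cannot communicate.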

You designate a node as Primary using the mutable config property DESIGNATED_PRIMARY. You set this property using ReplicationMutableConfig.setDesignatedPrimary(). This property is ignored for groups of size greater than two.
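A minimal sketch of designating a node as the Primary at startup follows; the group, node, and host names are illustrative assumptions. ReplicationConfig inherits setDesignatedPrimary() from ReplicationMutableConfig, so the property can be set on the initial configuration as well.

```java
import com.sleepycat.je.rep.ReplicationConfig;

// Configure this node as the Primary of a two-node group.
// "myGroup", "node1", and the host:port pair are hypothetical values.
ReplicationConfig repConfig =
    new ReplicationConfig("myGroup", "node1", "node1.example.com:5001");
repConfig.setDesignatedPrimary(true);
```

Because the property is mutable, it can also be changed on a node that is already running, via ReplicatedEnvironment.setRepMutableConfig().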

As stated above, only one node at a time can be designated as the Primary. This condition is checked during the Master/Replica startup handshake; if both nodes are designated as Primary, an EnvironmentFailureException is thrown. However, you should not rely on this handshake to guard against dual Primaries. If both nodes are designated Primary at some point after the handshake occurs, and your application then experiences a network partition such that the two nodes can no longer communicate, both nodes will become Masters. This is an error condition that will require you to discard data on at least one of the nodes if writes occurred on both nodes while the network partition was in progress.