Clean up section 11, remove 'Possible problems' section
parent
424a64aeb6
commit
1f3d191d0e
1 changed files with 10 additions and 35 deletions
|
@ -1256,25 +1256,24 @@ and short:
|
||||||
A typical approach, as described by Coulouris et al.,[4] is to use a
|
A typical approach, as described by Coulouris et al.,[4] is to use a
|
||||||
quorum-consensus approach. This allows the sub-partition with a
|
quorum-consensus approach. This allows the sub-partition with a
|
||||||
majority of the votes to remain available, while the remaining
|
majority of the votes to remain available, while the remaining
|
||||||
sub-partitions should fall down to an auto-fencing mode.
|
sub-partitions should fall down to an auto-fencing mode.\footnote{Any
|
||||||
|
server on the minority side refuses to operate
|
||||||
|
because it is, so to speak, ``on the wrong side of the fence.''}
|
||||||
\end{quotation}
|
\end{quotation}
|
||||||
|
|
||||||
This is the same basic technique that
|
This is the same basic technique that
|
||||||
both Riak Ensemble and ZooKeeper use. Machi's
|
both Riak Ensemble and ZooKeeper use. Machi's
|
||||||
extensive use of write-registers is a big advantage when implementing
|
extensive use of write-once registers is a big advantage when implementing
|
||||||
this technique. Also very useful is the Machi ``wedge'' mechanism,
|
this technique. Also very useful is the Machi ``wedge'' mechanism,
|
||||||
which can automatically implement the ``auto-fencing'' that the
|
which can automatically implement the ``auto-fencing'' that the
|
||||||
technique requires. All Machi servers that can communicate with only
|
technique requires. All Machi servers that can communicate with only
|
||||||
a minority of other servers will automatically ``wedge'' themselves,
|
a minority of other servers will automatically ``wedge'' themselves,
|
||||||
refuse to author new projections, and
|
refuse to author new projections, and
|
||||||
and refuse all file API requests until communication with the
|
refuse all file API requests until communication with the
|
||||||
majority\footnote{I.e, communication with the majority's collection of
|
majority can be re-established.
|
||||||
projection stores.} can be re-established.
|
|
||||||
|
|
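The auto-fencing rule described in this hunk can be sketched in a few lines. This is a minimal illustration only, not Machi's implementation (Machi itself is written in Erlang); the names `majority` and `should_wedge` and the server labels are hypothetical, and the sketch assumes a server counts itself among the servers it can reach.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1


def should_wedge(reachable: set[str], all_servers: set[str], me: str) -> bool:
    """Return True if this server can see only a minority and must fence itself.

    Hypothetical helper: `reachable` is the set of peers this server can
    currently talk to; the server always counts itself as visible.
    """
    visible = (reachable | {me}) & all_servers
    return len(visible) < majority(len(all_servers))


# A cluster of 2f+1 = 5 servers (f = 2), split 3/2 by a network partition.
servers = {"a", "b", "c", "d", "e"}
print(should_wedge({"b", "c"}, servers, "a"))  # False: a, b, c form a majority
print(should_wedge({"e"}, servers, "d"))       # True: d, e are the minority side
```

A wedged server in this sketch would then refuse file API requests and decline to author projections until `should_wedge` returns `False` again.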
||||||
\subsection{The quorum: witness servers vs. real servers}
|
\subsection{The quorum: witness servers vs. real servers}
|
||||||
|
|
||||||
TODO Proofread for clarity: this is still a young draft.
|
|
||||||
|
|
||||||
In any quorum-consensus system, at least $2f+1$ participants are
|
In any quorum-consensus system, at least $2f+1$ participants are
|
||||||
required to survive $f$ participant failures. Machi can borrow an
|
required to survive $f$ participant failures. Machi can borrow an
|
||||||
old technique of ``witness servers'' to permit operation despite
|
old technique of ``witness servers'' to permit operation despite
|
||||||
|
@ -1292,7 +1291,7 @@ real Machi server.
|
||||||
|
|
||||||
A mixed cluster of witness and real servers must still contain at
|
A mixed cluster of witness and real servers must still contain at
|
||||||
least a quorum of $f+1$ participants. However, as few as one of them
|
least a quorum of $f+1$ participants. However, as few as one of them
|
||||||
must be a real server,
|
may be a real server,
|
||||||
and the remaining $f$ are witness servers. In
|
and the remaining $f$ are witness servers. In
|
||||||
such a cluster, any majority quorum must have at least one real server
|
such a cluster, any majority quorum must have at least one real server
|
||||||
participant.
|
participant.
|
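The rule that a majority quorum must include at least one real server can be checked mechanically. The following is a hedged sketch under stated assumptions: `usable_quorum` and the server labels are hypothetical names for illustration, and a "quorum" is taken to be any set of at least $f+1$ members out of a cluster of $2f+1$.

```python
def usable_quorum(members: set[str], witnesses: set[str], cluster_size: int) -> bool:
    """Return True if `members` is a majority that holds at least one real server.

    Hypothetical helper: any member not listed in `witnesses` is treated as a
    real server that stores full file data.
    """
    is_majority = len(members) >= cluster_size // 2 + 1
    has_real_server = any(m not in witnesses for m in members)
    return is_majority and has_real_server


# Cluster of 2f+1 = 3 participants (f = 1): one real server, two witnesses.
witnesses = {"w1", "w2"}
print(usable_quorum({"a", "w1"}, witnesses, 3))   # True: majority with a real server
print(usable_quorum({"w1", "w2"}, witnesses, 3))  # False: majority of witnesses only
print(usable_quorum({"a"}, witnesses, 3))         # False: not a majority
```

The second case is the one the rule exists to exclude: two witnesses are a numerical majority, but neither holds file data, so the quorum cannot serve requests.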
||||||
|
@ -1303,10 +1302,8 @@ When in CP mode, any server that is on the minority side of a network
|
||||||
partition and thus cannot calculate a new projection that includes a
|
partition and thus cannot calculate a new projection that includes a
|
||||||
quorum of servers will
|
quorum of servers will
|
||||||
enter wedge state and remain wedged until the network partition
|
enter wedge state and remain wedged until the network partition
|
||||||
heals enough to communicate with a quorum of. This is a nice
|
heals enough to communicate with a quorum of FLUs. This is a nice
|
||||||
property: we automatically get ``fencing'' behavior.\footnote{Any
|
property: we automatically get ``fencing'' behavior.
|
||||||
server on the minority side is wedged and therefore refuses to serve
|
|
||||||
because it is, so to speak, ``on the wrong side of the fence.''}
|
|
||||||
|
|
||||||
\begin{figure}
|
\begin{figure}
|
||||||
\centering
|
\centering
|
||||||
|
@ -1387,28 +1384,6 @@ private projection store's epoch number from a quorum of servers
|
||||||
safely restart a chain. In the example above, we must endure the
|
safely restart a chain. In the example above, we must endure the
|
||||||
worst-case and wait until $S_a$ also returns to service.
|
worst-case and wait until $S_a$ also returns to service.
|
||||||
|
|
||||||
\section{Possible problems with Humming Consensus}
|
|
||||||
|
|
||||||
There are some unanswered questions about Machi's proposed chain
|
|
||||||
management technique. The problems that we guess are likely/possible
|
|
||||||
include:
|
|
||||||
|
|
||||||
\begin{itemize}
|
|
||||||
|
|
||||||
\item A counter-example is found which nullifies Humming Consensus's
|
|
||||||
safety properties.
|
|
||||||
|
|
||||||
\item Coping with rare flapping conditions.
|
|
||||||
It's hoped that the ``best projection'' ranking system
|
|
||||||
will be sufficient to prevent endless flapping of projections, but
|
|
||||||
it isn't yet clear that it will be.
|
|
||||||
|
|
||||||
\item CP Mode management via the method proposed in
|
|
||||||
Section~\ref{sec:split-brain-management} may not be sufficient in
|
|
||||||
all cases.
|
|
||||||
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
\section{File Repair/Synchronization}
|
\section{File Repair/Synchronization}
|
||||||
\label{sec:repair-entire-files}
|
\label{sec:repair-entire-files}
|
||||||
|
|
||||||
|
@ -1538,7 +1513,7 @@ projection of this type.
|
||||||
chain-of-chains.
|
chain-of-chains.
|
||||||
|
|
||||||
\item All write operations must flow successfully through the
|
\item All write operations must flow successfully through the
|
||||||
chain-of-chains in order, i.e., from Tail \#1
|
chain-of-chains in order, i.e., from ``head of heads''
|
||||||
to the ``tail of tails''. This rule also includes any
|
to the ``tail of tails''. This rule also includes any
|
||||||
repair operations.
|
repair operations.
|
||||||
|
|
||||||
|
|