Clean up section 11, remove 'Possible problems' section

Scott Lystig Fritchie 2015-06-17 10:16:25 +09:00
parent 424a64aeb6
commit 1f3d191d0e

@@ -1256,25 +1256,24 @@ and short:
 A typical approach, as described by Coulouris et al.,[4] is to use a
 quorum-consensus approach. This allows the sub-partition with a
 majority of the votes to remain available, while the remaining
-sub-partitions should fall down to an auto-fencing mode.
+sub-partitions should fall down to an auto-fencing mode.\footnote{Any
+server on the minority side refuses to operate
+because it is, so to speak, ``on the wrong side of the fence.''}
 \end{quotation}
 This is the same basic technique that
 both Riak Ensemble and ZooKeeper use. Machi's
-extensive use of write-registers are a big advantage when implementing
+extensive use of write-once registers are a big advantage when implementing
 this technique. Also very useful is the Machi ``wedge'' mechanism,
 which can automatically implement the ``auto-fencing'' that the
 technique requires. All Machi servers that can communicate with only
 a minority of other servers will automatically ``wedge'' themselves,
 refuse to author new projections, and
-and refuse all file API requests until communication with the
-majority\footnote{I.e, communication with the majority's collection of
-projection stores.} can be re-established.
+refuse all file API requests until communication with the
+majority can be re-established.
 \subsection{The quorum: witness servers vs. real servers}
-TODO Proofread for clarity: this is still a young draft.
 In any quorum-consensus system, at least $2f+1$ participants are
 required to survive $f$ participant failures. Machi can borrow an
 old technique of ``witness servers'' to permit operation despite
@@ -1292,7 +1291,7 @@ real Machi server.
 A mixed cluster of witness and real servers must still contain at
 least a quorum $f+1$ participants. However, as few as one of them
-must be a real server,
+may be a real server,
 and the remaining $f$ are witness servers. In
 such a cluster, any majority quorum must have at least one real server
 participant.
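The claim in the hunk above, that any majority quorum contains at least one real server, can be checked by brute force for small $f$. The following is only a sketch under an assumed cluster layout of $2f+1$ participants, of which $f$ are witness servers and $f+1$ are real servers. The function names are hypothetical and are not part of Machi.

```python
from itertools import combinations


def quorum_size(f):
    """Majority quorum for a cluster of 2f+1 participants."""
    return f + 1


def min_real_servers_in_any_quorum(f):
    """Smallest number of real servers found in any quorum of size f+1.

    Assumed layout (not taken from the source): f witness servers plus
    f+1 real servers, for 2f+1 participants in total.
    """
    servers = [("witness", i) for i in range(f)]
    servers += [("real", i) for i in range(f + 1)]
    return min(
        sum(1 for kind, _ in quorum if kind == "real")
        for quorum in combinations(servers, quorum_size(f))
    )
```

Because the assumed cluster has only $f$ witness servers, a quorum of $f+1$ participants cannot consist of witnesses alone, so the minimum is 1 for every $f$.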
@@ -1303,10 +1302,8 @@ When in CP mode, any server that is on the minority side of a network
 partition and thus cannot calculate a new projection that includes a
 quorum of servers will
 enter wedge state and remain wedged until the network partition
-heals enough to communicate with a quorum of. This is a nice
-property: we automatically get ``fencing'' behavior.\footnote{Any
-server on the minority side is wedged and therefore refuses to serve
-because it is, so to speak, ``on the wrong side of the fence.''}
+heals enough to communicate with a quorum of FLUs. This is a nice
+property: we automatically get ``fencing'' behavior.
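The wedge rule described above reduces to a majority test. The sketch below is illustrative only: the function names are hypothetical, and counting the local server toward the quorum is an assumption. In the text, the actual decision depends on whether a server can calculate a new projection that includes a quorum of servers, which this sketch does not model.

```python
def has_quorum(reachable_peers, cluster_size):
    """True if this server plus its reachable peers form a strict majority.

    Assumption: the local server counts itself toward the quorum.
    """
    return (reachable_peers + 1) > cluster_size // 2


def should_wedge(reachable_peers, cluster_size):
    """A server on the minority side of a partition wedges itself."""
    return not has_quorum(reachable_peers, cluster_size)
```

For a five-server cluster under these assumptions, a server that can reach only one peer wedges itself, while a server that can reach two peers stays available.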
 \begin{figure}
 \centering
@@ -1387,28 +1384,6 @@ private projection store's epoch number from a quorum of servers
 safely restart a chain. In the example above, we must endure the
 worst-case and wait until $S_a$ also returns to service.
-\section{Possible problems with Humming Consensus}
-There are some unanswered questions about Machi's proposed chain
-management technique. The problems that we guess are likely/possible
-include:
-\begin{itemize}
-\item A counter-example is found which nullifies Humming Consensus's
-safety properties.
-\item Coping with rare flapping conditions.
-It's hoped that the ``best projection'' ranking system
-will be sufficient to prevent endless flapping of projections, but
-it isn't yet clear that it will be.
-\item CP Mode management via the method proposed in
-Section~\ref{sec:split-brain-management} may not be sufficient in
-all cases.
-\end{itemize}
 \section{File Repair/Synchronization}
 \label{sec:repair-entire-files}
@@ -1538,7 +1513,7 @@ projection of this type.
 chain-of-chains.
 \item All write operations must flow successfully through the
-chain-of-chains in order, i.e., from Tail \#1
+chain-of-chains in order, i.e., from ``head of heads''
 to the ``tail of tails''. This rule also includes any
 repair operations.