README and FAQ updates for mid-March 2016
parent 6cddfcf988
commit fa71a918b8
3 changed files with 74 additions and 51 deletions
13
FAQ.md
|
@ -46,13 +46,13 @@
|
||||||
<a name="n1.1">
|
<a name="n1.1">
|
||||||
### 1.1. What is Machi?
|
### 1.1. What is Machi?
|
||||||
|
|
||||||
Very briefly, Machi is a very simple append-only file store.
|
Very briefly, Machi is a very simple append-only blob/file store.
|
||||||
|
|
||||||
Machi is
|
Machi is
|
||||||
"dumber" than many other file stores (i.e., lacking many features
|
"dumber" than many other file stores (i.e., lacking many features
|
||||||
found in other file stores) such as HadoopFS or a simple NFS or CIFS file
|
found in other file stores) such as HadoopFS or a simple NFS or CIFS file
|
||||||
server.
|
server.
|
||||||
However, Machi is a distributed file store, which makes it different
|
However, Machi is a distributed blob/file store, which makes it different
|
||||||
(and, in some ways, more complicated) than a simple NFS or CIFS file
|
(and, in some ways, more complicated) than a simple NFS or CIFS file
|
||||||
server.
|
server.
|
||||||
|
|
||||||
|
@ -142,7 +142,8 @@ consistency mode during and after network partitions are:
|
||||||
due to Machi's restrictions on file naming and file offset
|
due to Machi's restrictions on file naming and file offset
|
||||||
assignment. Both file names and file offsets are always chosen
|
assignment. Both file names and file offsets are always chosen
|
||||||
by Machi servers according to rules which guarantee safe
|
by Machi servers according to rules which guarantee safe
|
||||||
mergeability.
|
mergeability. Server-assigned names are a characteristic of a
|
||||||
|
"blob store".
|
||||||
|
|
||||||
<a name="n1.5">
|
<a name="n1.5">
|
||||||
### 1.5. What is Machi like when operating in "strongly consistent" mode?
|
### 1.5. What is Machi like when operating in "strongly consistent" mode?
|
||||||
|
@ -172,10 +173,10 @@ for more details.
|
||||||
### 1.6. What does Machi's API look like?
|
### 1.6. What does Machi's API look like?
|
||||||
|
|
||||||
The Machi API only contains a handful of API operations. The function
|
The Machi API only contains a handful of API operations. The function
|
||||||
arguments shown below use Erlang-style type annotations.
|
arguments shown below (in simplified form) use Erlang-style type annotations.
|
||||||
|
|
||||||
append_chunk(Prefix:binary(), Chunk:binary()).
|
append_chunk(Prefix:binary(), Chunk:binary(), CheckSum:binary()).
|
||||||
append_chunk_extra(Prefix:binary(), Chunk:binary(), ExtraSpace:non_neg_integer()).
|
append_chunk_extra(Prefix:binary(), Chunk:binary(), CheckSum:binary(), ExtraSpace:non_neg_integer()).
|
||||||
read_chunk(File:binary(), Offset:non_neg_integer(), Size:non_neg_integer()).
|
read_chunk(File:binary(), Offset:non_neg_integer(), Size:non_neg_integer()).
|
||||||
|
|
||||||
checksum_list(File:binary()).
|
checksum_list(File:binary()).
|
||||||
|
|
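To make the shape of this API concrete, here is a minimal in-memory sketch in Python. It is a toy model under stated assumptions, not Machi's implementation: the class name `ToyBlobStore`, the `prefix.N` naming rule, the single file per prefix, and the use of SHA-1 for `CheckSum` are all hypothetical choices made for illustration. It only shows the two properties described above: the server, not the client, chooses the file name and offset, and an append carries a client-supplied checksum that the server verifies.

```python
import hashlib


class ToyBlobStore:
    """In-memory toy model of an append-only blob store (not Machi's code)."""

    def __init__(self):
        self._by_prefix = {}  # prefix -> server-assigned file name
        self._files = {}      # file name -> bytearray of appended data
        self._sums = {}       # file name -> list of (offset, size, checksum)

    def append_chunk(self, prefix, chunk, checksum):
        # Assumption: the checksum is SHA-1 over the chunk bytes.
        if hashlib.sha1(chunk).digest() != checksum:
            raise ValueError("checksum does not match chunk")
        # The server picks the file name; the client only supplies a prefix.
        name = self._by_prefix.get(prefix)
        if name is None:
            # Hypothetical naming rule: prefix plus a sequence number.
            name = prefix + b"." + str(len(self._files) + 1).encode()
            self._by_prefix[prefix] = name
            self._files[name] = bytearray()
            self._sums[name] = []
        # The server also picks the offset: always the current end of file.
        offset = len(self._files[name])
        self._files[name] += chunk
        self._sums[name].append((offset, len(chunk), checksum))
        return name, offset

    def read_chunk(self, file, offset, size):
        return bytes(self._files[file][offset:offset + size])

    def checksum_list(self, file):
        return list(self._sums[file])


store = ToyBlobStore()
first = b"hello"
second = b"world!"
name, off1 = store.append_chunk(b"logs", first, hashlib.sha1(first).digest())
_, off2 = store.append_chunk(b"logs", second, hashlib.sha1(second).digest())
```

A client keeps the returned `(name, offset)` pair and the chunk size, and later passes them to `read_chunk`. Because names and offsets are never chosen by clients, two writers cannot collide on the same byte range in this model.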
108
README.md
|
@ -4,16 +4,16 @@
|
||||||
|
|
||||||
Outline
|
Outline
|
||||||
|
|
||||||
1. [Why another file store?](#sec1)
|
1. [Why another blob/file store?](#sec1)
|
||||||
2. [Where to learn more about Machi](#sec2)
|
2. [Where to learn more about Machi](#sec2)
|
||||||
3. [Development status summary](#sec3)
|
3. [Development status summary](#sec3)
|
||||||
4. [Contributing to Machi's development](#sec4)
|
4. [Contributing to Machi's development](#sec4)
|
||||||
|
|
||||||
<a name="sec1">
|
<a name="sec1">
|
||||||
## 1. Why another file store?
|
## 1. Why another blob/file store?
|
||||||
|
|
||||||
Our goal is a robust & reliable, distributed, highly available, large
|
Our goal is a robust & reliable, distributed, highly available, large
|
||||||
file store. Such stores already exist, both in the open source world
|
file and blob store. Such stores already exist, both in the open source world
|
||||||
and in the commercial world. Why reinvent the wheel? We believe
|
and in the commercial world. Why reinvent the wheel? We believe
|
||||||
there are three reasons, ordered by decreasing rarity.
|
there are three reasons, ordered by decreasing rarity.
|
||||||
|
|
||||||
|
@ -25,9 +25,8 @@ there are three reasons, ordered by decreasing rarity.
|
||||||
3. We want to manage file replicas in a way that's provably correct
|
3. We want to manage file replicas in a way that's provably correct
|
||||||
and also easy to test.
|
and also easy to test.
|
||||||
|
|
||||||
Of all the file stores in the open source & commercial worlds, only
|
Criteria #3 is difficult to find in the open source world but perhaps
|
||||||
criteria #3 is a viable option. Or so we hope. Or we just don't
|
not impossible.
|
||||||
care, and if data gets lost or corrupted, then ... so be it.
|
|
||||||
|
|
||||||
If we have app use cases where availability is more important than
|
If we have app use cases where availability is more important than
|
||||||
consistency, then systems that meet criteria #2 are also rare.
|
consistency, then systems that meet criteria #2 are also rare.
|
||||||
|
@ -39,12 +38,13 @@ file data and attempts best-effort file reads?
|
||||||
|
|
||||||
If we really do care about data loss and/or data corruption, then we
|
If we really do care about data loss and/or data corruption, then we
|
||||||
really want both #3 and #1. Unfortunately, systems that meet
|
really want both #3 and #1. Unfortunately, systems that meet
|
||||||
criteria #1 are _very rare_.
|
criteria #1 are _very rare_. (Nonexistent?)
|
||||||
Why? This is 2015. We have decades of research that shows
|
Why? This is 2015. We have decades of research that shows
|
||||||
that computer hardware can (and
|
that computer hardware can (and
|
||||||
indeed does) corrupt data at nearly every level of the modern
|
indeed does) corrupt data at nearly every level of the modern
|
||||||
client/server application stack. Systems with end-to-end data
|
client/server application stack. Systems with end-to-end data
|
||||||
corruption detection should be ubiquitous today. Alas, they are not.
|
corruption detection should be ubiquitous today. Alas, they are not.
|
||||||
|
|
||||||
Machi is an effort to change the deplorable state of the world, one
|
Machi is an effort to change the deplorable state of the world, one
|
||||||
Erlang function at a time.
|
Erlang function at a time.
|
||||||
|
|
||||||
|
@ -70,46 +70,62 @@ including the network partition simulator.
|
||||||
<a name="sec3">
|
<a name="sec3">
|
||||||
## 3. Development status summary
|
## 3. Development status summary
|
||||||
|
|
||||||
Mid-December 2015: work is underway.
|
Mid-March 2016: The Machi development team has been downsized in
|
||||||
|
recent months, and the pace of development has slowed. Here is a
|
||||||
|
summary of the status of Machi's major components.
|
||||||
|
|
||||||
* In progress:
|
* Humming Consensus and the chain manager
|
||||||
* Code refactoring: metadata management using
|
* No new safety bugs have been found by model-checking tests.
|
||||||
[ELevelDB](https://github.com/basho/eleveldb)
|
* A new document,
|
||||||
* File repair using file-centric, Merkle-style hash tree.
|
[Hands-on experiments with Machi and Humming Consensus](doc/humming-consensus-demo.md)
|
||||||
* Server-side socket handling is now performed by
|
is now available. It is a tutorial for setting up a 3 virtual
|
||||||
[ranch](https://github.com/ninenines/ranch)
|
machine Machi cluster and demonstrating the chain manager's
|
||||||
* QuickCheck tests for file repair correctness
|
reactions to server stops & starts, crashes & restarts, and pauses
|
||||||
* 2015-12-15: The EUnit test `machi_ap_repair_eqc` is
|
(simulated by `SIGSTOP` and `SIGCONT`).
|
||||||
currently failing occasionally because it (correctly) detects
|
* The chain manager can still make suboptimal-but-safe choices for
|
||||||
double-write errors. Double-write errors will be eliminated
|
chain transitions when a server hangs/pauses temporarily.
|
||||||
when the ELevelDB integration work is complete.
|
* Recent chain manager changes have made the instability window
|
||||||
* The `make stage` and `make release` commands can be used to
|
much shorter when the slow/paused server resumes execution.
|
||||||
create a primitive "package". Use `./rel/machi/bin/machi console`
|
* Scott believes that a modest change to the chain manager's
|
||||||
to start the Machi app in interactive mode. Substitute the word
|
calculation of a new projection can make flapping in this (and
|
||||||
`start` instead of console to start Machi in background/daemon
|
many other cases) less likely. Currently, the new local
|
||||||
mode. The `./rel/machi/bin/machi` command without any arguments
|
projection is calculated using only local state (i.e., the chain
|
||||||
will give a short usage summary.
|
manager's internal state + the fitness server's state).
|
||||||
* Chain Replication management using the Humming Consensus
|
However, if the "latest" projection read from the public
|
||||||
algorithm to manage chain state is stable.
|
projection stores were also input to the new projection
|
||||||
* ... with the caveat that it runs very well in a very harsh
|
calculation function, then many obviously bad projections can be
|
||||||
and unforgiving network partition simulator but has not run
|
avoided without needing rounds of Humming Consensus to
|
||||||
much yet in the real world.
|
demonstrate that a bad projection is bad.
|
||||||
* All Machi client/server protocols are based on
|
|
||||||
[Protocol Buffers](https://developers.google.com/protocol-buffers/docs/overview).
|
|
||||||
* The current specification for Machi's protocols can be found at
|
|
||||||
[https://github.com/basho/machi/blob/master/src/machi.proto](https://github.com/basho/machi/blob/master/src/machi.proto).
|
|
||||||
* The Machi PB protocol is not yet stable. Expect change!
|
|
||||||
* The Erlang language client implementation of the high-level
|
|
||||||
protocol flavor is brittle (e.g., little error handling yet).
|
|
||||||
|
|
||||||
If you would like to run the Humming Consensus code (with or without
|
* FLU/data server process
|
||||||
|
* All known correctness bugs have been fixed.
|
||||||
|
* Performance has not yet been measured. Performance measurement
|
||||||
|
and enhancements are scheduled to start in the middle of March 2016.
|
||||||
|
(This will include a much-needed update to the `basho_bench` driver.)
|
||||||
|
|
||||||
|
* Access protocols and client libraries
|
||||||
|
* The protocol used by both external clients and internally (instead
|
||||||
|
of using Erlang's native message passing mechanisms) is based on
|
||||||
|
Protocol Buffers.
|
||||||
|
* [Machi PB protocol specification: ./src/machi.proto](./src/machi.proto)
|
||||||
|
* At the moment, the PB specification contains two protocols.
|
||||||
|
Sometime in the near future, the spec will be split to separate
|
||||||
|
the external client API (the "high" protocol) from the internal
|
||||||
|
communication API (the "low" protocol).
|
||||||
|
|
||||||
|
* Recent conference talks about Machi
|
||||||
|
* Erlang Factory San Francisco 2016
|
||||||
|
[the slides and video recording](http://www.erlang-factory.com/sfbay2016/scott-lystig-fritchie)
|
||||||
|
will be available a few weeks after the conference ends on March
|
||||||
|
11, 2016.
|
||||||
|
* Ricon 2015
|
||||||
|
* [The slides](http://ricon.io/archive/2015/slides/Scott_Fritchie_Ricon_2015.pdf)
|
||||||
|
* and the [video recording](https://www.youtube.com/watch?v=yR5kHL1bu1Q&index=13&list=PL9Jh2HsAWHxIc7Tt2M6xez_TOP21GBH6M)
|
||||||
|
are now available.
|
||||||
|
* If you would like to run the Humming Consensus code (with or without
|
||||||
the network partition simulator) as described in the RICON 2015
|
the network partition simulator) as described in the RICON 2015
|
||||||
presentation, please see the
|
presentation, please see the
|
||||||
[Humming Consensus demo doc.](./doc/humming_consensus_demo.md).
|
[Humming Consensus demo doc](./doc/humming_consensus_demo.md).
|
||||||
|
|
||||||
If you'd like to work on a protocol such as Thrift, UBF,
|
|
||||||
msgpack over UDP, or some other protocol, let us know by
|
|
||||||
[opening an issue to discuss it](./issues/new).
|
|
||||||
|
|
||||||
<a name="sec4">
|
<a name="sec4">
|
||||||
## 4. Contributing to Machi's development
|
## 4. Contributing to Machi's development
|
||||||
|
@ -150,3 +166,9 @@ Machi has a dependency on the
|
||||||
supports UNIX/Linux OSes and 64-bit versions of Erlang/OTP only; we
|
supports UNIX/Linux OSes and 64-bit versions of Erlang/OTP only; we
|
||||||
apologize to Windows-based and 32-bit-based Erlang developers for this
|
apologize to Windows-based and 32-bit-based Erlang developers for this
|
||||||
restriction.
|
restriction.
|
||||||
|
|
||||||
|
### 4.3 New protocols and features
|
||||||
|
|
||||||
|
If you'd like to work on a protocol such as Thrift, UBF,
|
||||||
|
msgpack over UDP, or some other protocol, let us know by
|
||||||
|
[opening an issue to discuss it](./issues/new).
|
||||||
|
|
|
@ -220,7 +220,7 @@ To help interpret the output of the test, please skip ahead to the
|
||||||
|
|
||||||
If you don't have `git` and/or the Erlang 17 runtime system available
|
If you don't have `git` and/or the Erlang 17 runtime system available
|
||||||
on your OS X, FreeBSD, Linux, or Solaris machine, please take a look
|
on your OS X, FreeBSD, Linux, or Solaris machine, please take a look
|
||||||
at the [Prerequistes section](#prerequisites) first. When you have
|
at the [Prerequisites section](#prerequisites) first. When you have
|
||||||
installed the prerequisite software, please return back here.
|
installed the prerequisite software, please return back here.
|
||||||
|
|
||||||
## Clone and compile the code
|
## Clone and compile the code
|
||||||
|
|