README and FAQ updates for mid-March 2016
parent 6cddfcf988
commit fa71a918b8
3 changed files with 74 additions and 51 deletions
13
FAQ.md
|
@ -46,13 +46,13 @@
|
|||
<a name="n1.1">
|
||||
### 1.1. What is Machi?
|
||||
|
||||
Very briefly, Machi is a very simple append-only file store.
|
||||
Very briefly, Machi is a very simple append-only blob/file store.
|
||||
|
||||
Machi is
|
||||
"dumber" than many other file stores (i.e., lacking many features
|
||||
found in other file stores) such as HadoopFS or a simple NFS or CIFS file
|
||||
server.
|
||||
However, Machi is a distributed file store, which makes it different
|
||||
However, Machi is a distributed blob/file store, which makes it different
|
||||
(and, in some ways, more complicated) than a simple NFS or CIFS file
|
||||
server.
|
||||
|
||||
|
@ -142,7 +142,8 @@ consistency mode during and after network partitions are:
|
|||
due to Machi's restrictions on file naming and file offset
|
||||
assignment. Both file names and file offsets are always chosen
|
||||
by Machi servers according to rules which guarantee safe
|
||||
mergeability.
|
||||
mergeability. Server-assigned names are a characteristic of a
|
||||
"blob store".
|
||||
|
||||
<a name="n1.5">
|
||||
### 1.5. What is Machi like when operating in "strongly consistent" mode?
|
||||
|
@ -172,10 +173,10 @@ for more details.
|
|||
### 1.6. What does Machi's API look like?
|
||||
|
||||
The Machi API only contains a handful of API operations. The function
|
||||
arguments shown below use Erlang-style type annotations.
|
||||
arguments shown below (in simplified form) use Erlang-style type annotations.
|
||||
|
||||
append_chunk(Prefix:binary(), Chunk:binary()).
|
||||
append_chunk_extra(Prefix:binary(), Chunk:binary(), ExtraSpace:non_neg_integer()).
|
||||
append_chunk(Prefix:binary(), Chunk:binary(), CheckSum:binary()).
|
||||
append_chunk_extra(Prefix:binary(), Chunk:binary(), CheckSum:binary(), ExtraSpace:non_neg_integer()).
|
||||
read_chunk(File:binary(), Offset:non_neg_integer(), Size:non_neg_integer()).
|
||||
|
||||
checksum_list(File:binary()).
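The handful of operations above can be illustrated with a toy in-memory model. This is only a sketch in Python, not Machi's client or server code: the class name `ToyBlobStore`, the use of SHA-1 for checksums, the file-naming scheme, the treatment of `ExtraSpace` as reserved zero bytes, and the `(file, offset)` return value are all assumptions made for illustration. What it is meant to show is that the caller supplies only a prefix and a chunk, while the store chooses the file name and the offset.

```python
import hashlib
import uuid


class ToyBlobStore:
    """Toy in-memory model of an append-only blob store (illustration only)."""

    def __init__(self):
        self._files = {}      # file name -> bytearray of file contents
        self._checksums = {}  # file name -> list of (offset, size, checksum)
        self._current = {}    # prefix -> file currently accepting appends

    def append_chunk(self, prefix: bytes, chunk: bytes, checksum: bytes,
                     extra_space: int = 0):
        """Append a chunk; the store, not the caller, picks file and offset."""
        # Assumption: checksums are SHA-1 digests of the chunk.
        if hashlib.sha1(chunk).digest() != checksum:
            raise ValueError("checksum does not match chunk")
        name = self._current.get(prefix)
        if name is None:
            # Hypothetical naming rule: prefix plus a unique suffix.
            name = prefix + b"." + uuid.uuid4().hex.encode()
            self._current[prefix] = name
            self._files[name] = bytearray()
            self._checksums[name] = []
        data = self._files[name]
        offset = len(data)
        data.extend(chunk)
        data.extend(bytes(extra_space))  # reserve extra space after the chunk
        self._checksums[name].append((offset, len(chunk), checksum))
        return name, offset

    def read_chunk(self, file: bytes, offset: int, size: int) -> bytes:
        """Return `size` bytes of `file` starting at `offset`."""
        return bytes(self._files[file][offset:offset + size])

    def checksum_list(self, file: bytes):
        """Return the (offset, size, checksum) entries recorded for `file`."""
        return list(self._checksums[file])
```

A caller would append with something like `store.append_chunk(b"logs", data, hashlib.sha1(data).digest())` and keep the returned file name and offset in order to read the chunk back later.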
|
||||
|
|
110
README.md
|
@ -4,16 +4,16 @@
|
|||
|
||||
Outline
|
||||
|
||||
1. [Why another file store?](#sec1)
|
||||
1. [Why another blob/file store?](#sec1)
|
||||
2. [Where to learn more about Machi](#sec2)
|
||||
3. [Development status summary](#sec3)
|
||||
4. [Contributing to Machi's development](#sec4)
|
||||
|
||||
<a name="sec1">
|
||||
## 1. Why another file store?
|
||||
## 1. Why another blob/file store?
|
||||
|
||||
Our goal is a robust & reliable, distributed, highly available, large
|
||||
file store. Such stores already exist, both in the open source world
|
||||
file and blob store. Such stores already exist, both in the open source world
|
||||
and in the commercial world. Why reinvent the wheel? We believe
|
||||
there are three reasons, ordered by decreasing rarity.
|
||||
|
||||
|
@ -25,9 +25,8 @@ there are three reasons, ordered by decreasing rarity.
|
|||
3. We want to manage file replicas in a way that's provably correct
|
||||
and also easy to test.
|
||||
|
||||
Of all the file stores in the open source & commercial worlds, only
|
||||
criteria #3 is a viable option. Or so we hope. Or we just don't
|
||||
care, and if data gets lost or corrupted, then ... so be it.
|
||||
Criteria #3 is difficult to find in the open source world but perhaps
|
||||
not impossible.
|
||||
|
||||
If we have app use cases where availability is more important than
|
||||
consistency, then systems that meet criteria #2 are also rare.
|
||||
|
@ -39,12 +38,13 @@ file data and attempts best-effort file reads?
|
|||
|
||||
If we really do care about data loss and/or data corruption, then we
|
||||
really want both #3 and #1. Unfortunately, systems that meet
|
||||
criteria #1 are _very rare_.
|
||||
criteria #1 are _very rare_. (Nonexistent?)
|
||||
Why? This is 2015. We have decades of research that shows
|
||||
that computer hardware can (and
|
||||
indeed does) corrupt data at nearly every level of the modern
|
||||
client/server application stack. Systems with end-to-end data
|
||||
corruption detection should be ubiquitous today. Alas, they are not.
|
||||
|
||||
Machi is an effort to change the deplorable state of the world, one
|
||||
Erlang function at a time.
|
||||
|
||||
|
@ -70,46 +70,62 @@ including the network partition simulator.
|
|||
<a name="sec3">
|
||||
## 3. Development status summary
|
||||
|
||||
Mid-December 2015: work is underway.
|
||||
Mid-March 2016: The Machi development team has been downsized in
|
||||
recent months, and the pace of development has slowed. Here is a
|
||||
summary of the status of Machi's major components.
|
||||
|
||||
* In progress:
|
||||
* Code refactoring: metadata management using
|
||||
[ELevelDB](https://github.com/basho/eleveldb)
|
||||
* File repair using file-centric, Merkle-style hash tree.
|
||||
* Server-side socket handling is now performed by
|
||||
[ranch](https://github.com/ninenines/ranch)
|
||||
* QuickCheck tests for file repair correctness
|
||||
* 2015-12-15: The EUnit test `machi_ap_repair_eqc` is
|
||||
currently failing occasionally because it (correctly) detects
|
||||
double-write errors. Double-write errors will be eliminated
|
||||
when the ELevelDB integration work is complete.
|
||||
* The `make stage` and `make release` commands can be used to
|
||||
create a primitive "package". Use `./rel/machi/bin/machi console`
|
||||
to start the Machi app in interactive mode. Substitute the word
|
||||
`start` instead of console to start Machi in background/daemon
|
||||
mode. The `./rel/machi/bin/machi` command without any arguments
|
||||
will give a short usage summary.
|
||||
* Chain Replication management using the Humming Consensus
|
||||
algorithm to manage chain state is stable.
|
||||
* ... with the caveat that it runs very well in a very harsh
|
||||
and unforgiving network partition simulator but has not run
|
||||
much yet in the real world.
|
||||
* All Machi client/server protocols are based on
|
||||
[Protocol Buffers](https://developers.google.com/protocol-buffers/docs/overview).
|
||||
* The current specification for Machi's protocols can be found at
|
||||
[https://github.com/basho/machi/blob/master/src/machi.proto](https://github.com/basho/machi/blob/master/src/machi.proto).
|
||||
* The Machi PB protocol is not yet stable. Expect change!
|
||||
* The Erlang language client implementation of the high-level
|
||||
protocol flavor is brittle (e.g., little error handling yet).
|
||||
* Humming Consensus and the chain manager
|
||||
* No new safety bugs have been found by model-checking tests.
|
||||
* A new document,
|
||||
[Hands-on experiments with Machi and Humming Consensus](doc/humming-consensus-demo.md)
|
||||
is now available. It is a tutorial for setting up a 3 virtual
|
||||
machine Machi cluster and demonstrating the chain manager's
|
||||
reactions to server stops & starts, crashes & restarts, and pauses
|
||||
(simulated by `SIGSTOP` and `SIGCONT`).
|
||||
* The chain manager can still make suboptimal-but-safe choices for
|
||||
chain transitions when a server hangs/pauses temporarily.
|
||||
* Recent chain manager changes have made the instability window
|
||||
much shorter when the slow/paused server resumes execution.
|
||||
* Scott believes that a modest change to the chain manager's
|
||||
calculation of a new projection can make flapping in this (and
|
||||
many other cases) less likely. Currently, the new local
|
||||
projection is calculated using only local state (i.e., the chain
|
||||
manager's internal state + the fitness server's state).
|
||||
However, if the "latest" projection read from the public
|
||||
projection stores were also input to the new projection
|
||||
calculation function, then many obviously bad projections can be
|
||||
avoided without needing rounds of Humming Consensus to
|
||||
demonstrate that a bad projection is bad.
|
||||
|
||||
If you would like to run the Humming Consensus code (with or without
|
||||
the network partition simulator) as described in the RICON 2015
|
||||
presentation, please see the
|
||||
[Humming Consensus demo doc.](./doc/humming_consensus_demo.md).
|
||||
* FLU/data server process
|
||||
* All known correctness bugs have been fixed.
|
||||
* Performance has not yet been measured. Performance measurement
|
||||
and enhancements are scheduled to start in the middle of March 2016.
|
||||
(This will include a much-needed update to the `basho_bench` driver.)
|
||||
|
||||
If you'd like to work on a protocol such as Thrift, UBF,
|
||||
msgpack over UDP, or some other protocol, let us know by
|
||||
[opening an issue to discuss it](./issues/new).
|
||||
* Access protocols and client libraries
|
||||
* The protocol used by both external clients and internally (instead
|
||||
of using Erlang's native message passing mechanisms) is based on
|
||||
Protocol Buffers.
|
||||
* [Machi PB protocol specification: ./src/machi.proto](./src/machi.proto)
|
||||
* At the moment, the PB specification contains two protocols.
|
||||
Sometime in the near future, the spec will be split to separate
|
||||
the external client API (the "high" protocol) from the internal
|
||||
communication API (the "low" protocol).
|
||||
|
||||
* Recent conference talks about Machi
|
||||
* Erlang Factory San Francisco 2016
|
||||
[the slides and video recording](http://www.erlang-factory.com/sfbay2016/scott-lystig-fritchie)
|
||||
will be available a few weeks after the conference ends on March
|
||||
11, 2016.
|
||||
* Ricon 2015
|
||||
* [The slides](http://ricon.io/archive/2015/slides/Scott_Fritchie_Ricon_2015.pdf)
|
||||
* and the [video recording](https://www.youtube.com/watch?v=yR5kHL1bu1Q&index=13&list=PL9Jh2HsAWHxIc7Tt2M6xez_TOP21GBH6M)
|
||||
are now available.
|
||||
* If you would like to run the Humming Consensus code (with or without
|
||||
the network partition simulator) as described in the RICON 2015
|
||||
presentation, please see the
|
||||
[Humming Consensus demo doc](./doc/humming_consensus_demo.md).
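
The chain manager tutorial mentioned in this section simulates server pauses with `SIGSTOP` and `SIGCONT`. The following is a minimal sketch of that technique in Python, assuming a UNIX-like OS. The helper names (`pause_process`, `resume_process`, `simulate_pause`, `spawn_stand_in_server`) are hypothetical and not part of Machi, and the child process here is only a stand-in for a real Machi server.

```python
import os
import signal
import subprocess
import sys
import time


def pause_process(pid: int) -> None:
    """Suspend a process: it stays alive but is no longer scheduled."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid: int) -> None:
    """Let a previously suspended process continue running."""
    os.kill(pid, signal.SIGCONT)


def simulate_pause(pid: int, seconds: float) -> None:
    """Pause a process for roughly `seconds`, then always resume it."""
    pause_process(pid)
    try:
        time.sleep(seconds)
    finally:
        resume_process(pid)


def spawn_stand_in_server() -> subprocess.Popen:
    """Start a harmless child process standing in for a Machi server."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
```

In a real experiment, the process ID passed to `simulate_pause` would be that of the Erlang VM running a Machi server. From the point of view of the other servers, a paused process looks like a hung peer rather than a crashed one, which is the case the chain manager notes above describe.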
|
||||
|
||||
<a name="sec4">
|
||||
## 4. Contributing to Machi's development
|
||||
|
@ -150,3 +166,9 @@ Machi has a dependency on the
|
|||
supports UNIX/Linux OSes and 64-bit versions of Erlang/OTP only; we
|
||||
apologize to Windows-based and 32-bit-based Erlang developers for this
|
||||
restriction.
|
||||
|
||||
### 4.3 New protocols and features
|
||||
|
||||
If you'd like to work on a protocol such as Thrift, UBF,
|
||||
msgpack over UDP, or some other protocol, let us know by
|
||||
[opening an issue to discuss it](./issues/new).
|
||||
|
|
|
@ -220,7 +220,7 @@ To help interpret the output of the test, please skip ahead to the
|
|||
|
||||
If you don't have `git` and/or the Erlang 17 runtime system available
|
||||
on your OS X, FreeBSD, Linux, or Solaris machine, please take a look
|
||||
at the [Prerequistes section](#prerequisites) first. When you have
|
||||
at the [Prerequisites section](#prerequisites) first. When you have
|
||||
installed the prerequisite software, please return here.
|
||||
|
||||
## Clone and compile the code
|
||||
|
|