README and FAQ updates for mid-March 2016

Scott Lystig Fritchie 2016-03-09 12:12:34 -08:00
parent 6cddfcf988
commit fa71a918b8
3 changed files with 74 additions and 51 deletions

FAQ.md

@ -46,13 +46,13 @@
<a name="n1.1"> <a name="n1.1">
### 1.1. What is Machi? ### 1.1. What is Machi?
Very briefly, Machi is a very simple append-only file store. Very briefly, Machi is a very simple append-only blob/file store.
Machi is Machi is
"dumber" than many other file stores (i.e., lacking many features "dumber" than many other file stores (i.e., lacking many features
found in other file stores) such as HadoopFS or a simple NFS or CIFS file found in other file stores) such as HadoopFS or a simple NFS or CIFS file
server. server.
However, Machi is a distributed file store, which makes it different However, Machi is a distributed blob/file store, which makes it different
(and, in some ways, more complicated) than a simple NFS or CIFS file (and, in some ways, more complicated) than a simple NFS or CIFS file
server. server.
@ -142,7 +142,8 @@ consistency mode during and after network partitions are:
due to Machi's restrictions on file naming and file offset due to Machi's restrictions on file naming and file offset
assignment. Both file names and file offsets are always chosen assignment. Both file names and file offsets are always chosen
by Machi servers according to rules which guarantee safe by Machi servers according to rules which guarantee safe
mergeability. mergeability. Server-assigned names are a characteristic of a
"blob store".
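The idea of safe mergeability can be illustrated with a small sketch. It is not Machi's actual naming rule: the `assign_name` scheme (prefix, server id, and a per-server counter) and the `merge` helper are hypothetical, chosen only to show why server-chosen names let the two sides of a healed partition be combined by a plain union.

```python
def assign_name(prefix, server_id, seq):
    """Hypothetical server-side naming: prefix + server id + per-server counter."""
    return f"{prefix}^{server_id}^{seq}"


def merge(side_a, side_b):
    """Union the files written on each side of a healed partition."""
    merged = dict(side_a)
    for name, data in side_b.items():
        # With server-assigned names this branch should never fire;
        # it is here only to make the safety assumption explicit.
        if name in merged and merged[name] != data:
            raise ValueError(f"conflicting contents for {name}")
        merged[name] = data
    return merged


# Each side of the partition wrote under the same client prefix.
side_a = {assign_name("logs", "server_a", 1): b"written on side A"}
side_b = {assign_name("logs", "server_b", 1): b"written on side B"}
merged = merge(side_a, side_b)
print(sorted(merged))
```

Because the client only supplies a prefix and never the full name, two servers that cannot talk to each other still cannot pick the same name for different data under this toy scheme.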
<a name="n1.5"> <a name="n1.5">
### 1.5. What is Machi like when operating in "strongly consistent" mode? ### 1.5. What is Machi like when operating in "strongly consistent" mode?
@ -172,10 +173,10 @@ for more details.
### 1.6. What does Machi's API look like? ### 1.6. What does Machi's API look like?
The Machi API only contains a handful of API operations. The function The Machi API only contains a handful of API operations. The function
arguments shown below use Erlang-style type annotations. arguments shown below (in simplified form) use Erlang-style type annotations.
append_chunk(Prefix:binary(), Chunk:binary()). append_chunk(Prefix:binary(), Chunk:binary(), CheckSum:binary()).
append_chunk_extra(Prefix:binary(), Chunk:binary(), ExtraSpace:non_neg_integer()). append_chunk_extra(Prefix:binary(), Chunk:binary(), CheckSum:binary(), ExtraSpace:non_neg_integer()).
read_chunk(File:binary(), Offset:non_neg_integer(), Size:non_neg_integer()). read_chunk(File:binary(), Offset:non_neg_integer(), Size:non_neg_integer()).
checksum_list(File:binary()). checksum_list(File:binary()).
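To make the shape of these four operations concrete, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not Machi's implementation or its client library: the `ToyStore` class, the use of SHA-1 for `CheckSum`, the `(file, offset)` return value, and the one-file-per-prefix naming are all invented for illustration.

```python
import hashlib


class ToyStore:
    """In-memory stand-in for one append-only server (illustrative only)."""

    def __init__(self):
        self._files = {}      # file name -> bytearray of appended bytes
        self._checksums = {}  # file name -> list of (offset, size, checksum)

    def _file_for(self, prefix):
        # Assumption: the server, not the client, picks the file name.
        name = prefix + b".toy-0"
        self._files.setdefault(name, bytearray())
        self._checksums.setdefault(name, [])
        return name

    def append_chunk(self, prefix, chunk, checksum):
        return self.append_chunk_extra(prefix, chunk, checksum, 0)

    def append_chunk_extra(self, prefix, chunk, checksum, extra_space):
        # Assumption: the checksum is SHA-1 over the chunk bytes.
        if hashlib.sha1(chunk).digest() != checksum:
            raise ValueError("checksum does not match chunk")
        name = self._file_for(prefix)
        data = self._files[name]
        offset = len(data)                  # server-assigned offset
        data.extend(chunk)
        data.extend(b"\x00" * extra_space)  # reserve extra space after the chunk
        self._checksums[name].append((offset, len(chunk), checksum))
        return name, offset

    def read_chunk(self, file, offset, size):
        return bytes(self._files[file][offset:offset + size])

    def checksum_list(self, file):
        return list(self._checksums[file])


store = ToyStore()
chunk = b"hello"
name, offset = store.append_chunk(b"logs", chunk, hashlib.sha1(chunk).digest())
assert store.read_chunk(name, offset, len(chunk)) == chunk
```

The client supplies only a prefix and the data. The file name and offset come back from the store, which matches the server-assigned naming described in section 1.4.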

README.md

@ -4,16 +4,16 @@
Outline Outline
1. [Why another file store?](#sec1) 1. [Why another blob/file store?](#sec1)
2. [Where to learn more about Machi](#sec2) 2. [Where to learn more about Machi](#sec2)
3. [Development status summary](#sec3) 3. [Development status summary](#sec3)
4. [Contributing to Machi's development](#sec4) 4. [Contributing to Machi's development](#sec4)
<a name="sec1"> <a name="sec1">
## 1. Why another file store? ## 1. Why another blob/file store?
Our goal is a robust & reliable, distributed, highly available, large Our goal is a robust & reliable, distributed, highly available, large
file store. Such stores already exist, both in the open source world file and blob store. Such stores already exist, both in the open source world
and in the commercial world. Why reinvent the wheel? We believe and in the commercial world. Why reinvent the wheel? We believe
there are three reasons, ordered by decreasing rarity. there are three reasons, ordered by decreasing rarity.
@ -25,9 +25,8 @@ there are three reasons, ordered by decreasing rarity.
3. We want to manage file replicas in a way that's provably correct 3. We want to manage file replicas in a way that's provably correct
and also easy to test. and also easy to test.
Of all the file stores in the open source & commercial worlds, only Criteria #3 is difficult to find in the open source world but perhaps
criteria #3 is a viable option. Or so we hope. Or we just don't not impossible.
care, and if data gets lost or corrupted, then ... so be it.
If we have app use cases where availability is more important than If we have app use cases where availability is more important than
consistency, then systems that meet criteria #2 are also rare. consistency, then systems that meet criteria #2 are also rare.
@ -39,12 +38,13 @@ file data and attempts best-effort file reads?
If we really do care about data loss and/or data corruption, then we If we really do care about data loss and/or data corruption, then we
really want both #3 and #1. Unfortunately, systems that meet really want both #3 and #1. Unfortunately, systems that meet
criteria #1 are _very rare_. criteria #1 are _very rare_. (Nonexistent?)
Why? This is 2015. We have decades of research that shows Why? This is 2015. We have decades of research that shows
that computer hardware can (and that computer hardware can (and
indeed does) corrupt data at nearly every level of the modern indeed does) corrupt data at nearly every level of the modern
client/server application stack. Systems with end-to-end data client/server application stack. Systems with end-to-end data
corruption detection should be ubiquitous today. Alas, they are not. corruption detection should be ubiquitous today. Alas, they are not.
Machi is an effort to change the deplorable state of the world, one Machi is an effort to change the deplorable state of the world, one
Erlang function at a time. Erlang function at a time.
@ -70,46 +70,62 @@ including the network partition simulator.
<a name="sec3"> <a name="sec3">
## 3. Development status summary ## 3. Development status summary
Mid-December 2015: work is underway. Mid-March 2016: The Machi development team has been downsized in
recent months, and the pace of development has slowed. Here is a
summary of the status of Machi's major components.
* In progress: * Humming Consensus and the chain manager
* Code refactoring: metadata management using * No new safety bugs have been found by model-checking tests.
[ELevelDB](https://github.com/basho/eleveldb) * A new document,
* File repair using file-centric, Merkle-style hash tree. [Hands-on experiments with Machi and Humming Consensus](doc/humming-consensus-demo.md)
* Server-side socket handling is now performed by is now available. It is a tutorial for setting up a 3 virtual
[ranch](https://github.com/ninenines/ranch) machine Machi cluster and how to demonstrate the chain manager's
* QuickCheck tests for file repair correctness reactions to server stops & starts, crashes & restarts, and pauses
* 2015-12-15: The EUnit test `machi_ap_repair_eqc` is (simulated by `SIGSTOP` and `SIGCONT`).
currently failing occasionally because it (correctly) detects * The chain manager can still make suboptimal-but-safe choices for
double-write errors. Double-write errors will be eliminated chain transitions when a server hangs/pauses temporarily.
when the ELevelDB integration work is complete. * Recent chain manager changes have made the instability window
* The `make stage` and `make release` commands can be used to much shorter when the slow/paused server resumes execution.
create a primitive "package". Use `./rel/machi/bin/machi console` * Scott believes that a modest change to the chain manager's
to start the Machi app in interactive mode. Substitute the word calculation of a new projection can make flapping in this (and
`start` instead of console to start Machi in background/daemon many other cases) less likely. Currently, the new local
mode. The `./rel/machi/bin/machi` command without any arguments projection is calculated using only local state (i.e., the chain
will give a short usage summary. manager's internal state + the fitness server's state).
* Chain Replication management using the Humming Consensus However, if the "latest" projection read from the public
algorithm to manage chain state is stable. projection stores were also input to the new projection
* ... with the caveat that it runs very well in a very harsh calculation function, then many obviously bad projections can be
and unforgiving network partition simulator but has not run avoided without needing rounds of Humming Consensus to
much yet in the real world. demonstrate that a bad projection is bad.
* All Machi client/server protocols are based on
[Protocol Buffers](https://developers.google.com/protocol-buffers/docs/overview).
* The current specification for Machi's protocols can be found at
[https://github.com/basho/machi/blob/master/src/machi.proto](https://github.com/basho/machi/blob/master/src/machi.proto).
* The Machi PB protocol is not yet stable. Expect change!
* The Erlang language client implementation of the high-level
protocol flavor is brittle (e.g., little error handling yet).
If you would like to run the Humming Consensus code (with or without * FLU/data server process
the network partition simulator) as described in the RICON 2015 * All known correctness bugs have been fixed.
presentation, please see the * Performance has not yet been measured. Performance measurement
[Humming Consensus demo doc.](./doc/humming_consensus_demo.md). and enhancements are scheduled to start in the middle of March 2016.
(This will include a much-needed update to the `basho_bench` driver.)
If you'd like to work on a protocol such as Thrift, UBF, * Access protocols and client libraries
msgpack over UDP, or some other protocol, let us know by * The protocol used by both external clients and internally (instead
[opening an issue to discuss it](./issues/new). of using Erlang's native message passing mechanisms) is based on
Protocol Buffers.
* [Machi PB protocol specification: ./src/machi.proto](./src/machi.proto)
* At the moment, the PB specification contains two protocols.
Sometime in the near future, the spec will be split to separate
the external client API (the "high" protocol) from the internal
communication API (the "low" protocol).
* Recent conference talks about Machi
* Erlang Factory San Francisco 2016
[the slides and video recording](http://www.erlang-factory.com/sfbay2016/scott-lystig-fritchie)
will be available a few weeks after the conference ends on March
11, 2016.
* Ricon 2015
* [The slides](http://ricon.io/archive/2015/slides/Scott_Fritchie_Ricon_2015.pdf)
* and the [video recording](https://www.youtube.com/watch?v=yR5kHL1bu1Q&index=13&list=PL9Jh2HsAWHxIc7Tt2M6xez_TOP21GBH6M)
are now available.
* If you would like to run the Humming Consensus code (with or without
the network partition simulator) as described in the RICON 2015
presentation, please see the
[Humming Consensus demo doc](./doc/humming_consensus_demo.md).
<a name="sec4"> <a name="sec4">
## 4. Contributing to Machi's development ## 4. Contributing to Machi's development
@ -150,3 +166,9 @@ Machi has a dependency on the
supports UNIX/Linux OSes and 64-bit versions of Erlang/OTP only; we supports UNIX/Linux OSes and 64-bit versions of Erlang/OTP only; we
apologize to Windows-based and 32-bit-based Erlang developers for this apologize to Windows-based and 32-bit-based Erlang developers for this
restriction. restriction.
### 4.3 New protocols and features
If you'd like to work on a protocol such as Thrift, UBF,
msgpack over UDP, or some other protocol, let us know by
[opening an issue to discuss it](./issues/new).


@ -220,7 +220,7 @@ To help interpret the output of the test, please skip ahead to the
If you don't have `git` and/or the Erlang 17 runtime system available If you don't have `git` and/or the Erlang 17 runtime system available
on your OS X, FreeBSD, Linux, or Solaris machine, please take a look on your OS X, FreeBSD, Linux, or Solaris machine, please take a look
at the [Prerequistes section](#prerequisites) first. When you have at the [Prerequisites section](#prerequisites) first. When you have
installed the prerequisite software, please return back here. installed the prerequisite software, please return back here.
## Clone and compile the code ## Clone and compile the code