Doc update, including mid-December 2015 status #54

Merged
slfritchie merged 5 commits from slf/doc-201512-update into master 2015-12-16 02:15:09 +00:00
2 changed files with 72 additions and 84 deletions
Showing only changes of commit ec56164bd1

119
README.md

@ -2,6 +2,16 @@
 [Travis-CI](http://travis-ci.org/basho/machi) :: ![Travis-CI](https://secure.travis-ci.org/basho/machi.png)
+Outline
+1. [Why another file store?](#sec1)
+2. [Development status summary](#sec2)
+3. [Where to learn more about Machi](#sec3)
+4. [Contributing to Machi's development](#sec4)
+<a name="sec1">
+## 1. Why another file store?
 Our goal is a robust & reliable, distributed, highly available, large
 file store. Such stores already exist, both in the open source world
 and in the commercial world. Why reinvent the wheel? We believe
@ -38,40 +48,51 @@ corruption detection should be ubiquitous today. Alas, they are not.
 Machi is an effort to change the deplorable state of the world, one
 Erlang function at a time.
-## Status: mid-December 2015: work is underway
-TODO: status update here.
-* The chain manager is ready for both eventual consistency use ("available
-mode") and strong consistency use ("consistent mode"). Both modes use a new
-consensus technique, Humming Consensus.
-* Scott will be
-[speaking about Humming Consensus](http://ricon.io/agenda/#managing-chain-replication-metadata-with-humming-consensus)
-at the [Ricon 2015 conference] (http://ricon.io) in San Francisco,
-CA, USA on Thursday, November 5th, 2015.
-* If you would like to run the network partition simulator
-mentioned in that Ricon presentation, please see the
-[partition simulator convergence test doc.](./doc/machi_chain_manager1_converge_demo.md)
-* Implementation of the file repair process for strong consistency
-is still in progress.
+<a name="sec2">
+## 2. Development status summary
+Mid-December 2015: work is underway.
+* In progress:
+* Code refactoring: metadata management using
+[ELevelDB](https://github.com/basho/eleveldb)
+* File repair using file-centric, Merkle-style hash tree.
+* QuickCheck tests for file repair correctness
+* 2015-12-15: The EUnit test `machi_ap_repair_eqc` is
+currently failing occasionally because it (correctly) detects
+double-write errors. Double-write errors will be eliminated
+when the ELevelDB integration work is complete.
+* Chain Replication management using the Humming Consensus
+algorithm to manage chain state is stable.
+* ... with the caveat that it runs very well in a very harsh
+and unforgiving network partition simulator but has not run
+much yet in the real world.
+* All Machi client/server protocols are based on
+[Protocol Buffers](https://developers.google.com/protocol-buffers/docs/overview).
+* The current specification for Machi's protocols can be found at
+[https://github.com/basho/machi/blob/master/src/machi.proto](https://github.com/basho/machi/blob/master/src/machi.proto).
+* The Machi PB protocol is not yet stable. Expect change!
+* The Erlang language client implementation of the high-level
+protocol flavor is brittle (e.g., little error handling yet).
-* All Machi client/server protocols are based on
-[Protocol Buffers](https://developers.google.com/protocol-buffers/docs/overview).
-* The current specification for Machi's protocols can be found at
-[https://github.com/basho/machi/blob/master/src/machi.proto](https://github.com/basho/machi/blob/master/src/machi.proto).
-* The Machi PB protocol is not yet stable. Expect change!
-* The Erlang language client implementation of the high-level
-protocol flavor is brittle (e.g., little error handling yet).
+If you would like to run the network partition simulator
+mentioned in the Ricon 2015 presentation about Humming Consensus,
+please see the
+[partition simulator convergence test doc.](./doc/machi_chain_manager1_converge_demo.md)
 If you'd like to work on a protocol such as Thrift, UBF,
 msgpack over UDP, or some other protocol, let us know by
 [opening an issue to discuss it](./issues/new).
-## Where to learn more about Machi
+<a name="sec3">
+## 3. Where to learn more about Machi
 The two major design documents for Machi are now mostly stable.
 Please see the [doc](./doc) directory's [README](./doc) for details.
+We also have a
+[Frequently Asked Questions (FAQ) list](./FAQ.md).
 Scott recently (November 2015) gave a presentation at the
 [RICON 2015 conference](http://ricon.io) about one of the techniques
 used by Machi; "Managing Chain Replication Metadata with
@ -79,7 +100,10 @@ Humming Consensus" is available online now.
 * [slides (PDF format)](http://ricon.io/speakers/slides/Scott_Fritchie_Ricon_2015.pdf)
 * [video](https://www.youtube.com/watch?v=yR5kHL1bu1Q)
-## Contributing to Machi: source code, documentation, etc.
+<a name="sec4">
+## 4. Contributing to Machi's development
+### 4.1 License
 Basho Technologies, Inc. as committed to licensing all work for Machi
 under the
@ -95,26 +119,7 @@ We invite all contributors to review the
 [CONTRIBUTING.md](./CONTRIBUTING.md) document for guidelines for
 working with the Basho development team.
-## A brief survey of this directories in this repository
-* A list of Frequently Asked Questions, a.k.a.
-[the Machi FAQ](./FAQ.md).
-* The [doc](./doc/) directory: home for major documents about Machi:
-high level design documents as well as exploration of features still
-under design & review within Basho.
-* The `ebin` directory: used for compiled application code
-* The `include`, `src`, and `test` directories: contain the header
-files, source files, and test code for Machi, respectively.
-* The [prototype](./prototype/) directory: contains proof of concept
-code, scaffolding libraries, and other exploratory code. Curious
-readers should see the [prototype/README.md](./prototype/README.md)
-file for more explanation of the small sub-projects found here.
-## Development environment requirements
+### 4.2 Development environment requirements
 All development to date has been done with Erlang/OTP version 17 on OS
 X. The only known limitations for using R16 are minor type
@ -126,26 +131,8 @@ tool chain for C and C++ applications. Specifically, we assume `make`
 is available. The utility used to compile the Machi source code,
 `rebar`, is pre-compiled and included in the repo.
-There are no known OS limits at this time: any platform that supports
-Erlang/OTP should be sufficient for Machi. This may change over time
-(e.g., adding NIFs which can make full portability to Windows OTP
-environments difficult), but it hasn't happened yet.
+Machi has a dependency on the
+[ELevelDB](https://github.com/basho/eleveldb) library. ELevelDB only
+supports UNIX/Linux OSes and 64-bit versions of Erlang/OTP only; we
+apologize to Windows-based and 32-bit-based Erlang developers for this
+restriction.
-## Contributions
-Basho encourages contributions to Riak from the community. Heres how
-to get started.
-* Fork the appropriate sub-projects that are affected by your change.
-* Create a topic branch for your change and checkout that branch.
-git checkout -b some-topic-branch
-* Make your changes and run the test suite if one is provided. (see below)
-* Commit your changes and push them to your fork.
-* Open pull-requests for the appropriate projects.
-* Contributors will review your pull request, suggest changes, and merge it when its ready and/or offer feedback.
-* To report a bug or issue, please open a new issue against this repository.
--The Machi team at Basho,
-[Scott Lystig Fritchie](mailto:scott@basho.com), technical lead, and
-[Matt Brender](mailto:mbrender@basho.com), your developer advocate.
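Reviewer note on the "File repair using file-centric, Merkle-style hash tree" status item in the new README text: the general idea of comparing two replicas of a file through a hash tree can be sketched as below. This is only an illustrative sketch in Python, not Machi's Erlang implementation. The chunk size, the SHA-256 hash, and every function name here are assumptions made for the example, and the sketch assumes both replicas have the same number of chunks.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; tiny on purpose so the demo data stays small


def leaf_hashes(data, chunk_size=CHUNK_SIZE):
    """Hash each fixed-size chunk of a file's bytes (the tree's leaves)."""
    leaves = [hashlib.sha256(data[i:i + chunk_size]).digest()
              for i in range(0, len(data), chunk_size)]
    return leaves or [hashlib.sha256(b"").digest()]


def build_tree(leaves):
    """Return all levels of the tree: levels[0] is the leaves, levels[-1] is [root]."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = []
        for i in range(0, len(prev), 2):
            left = prev[i]
            # An unpaired last node is hashed together with itself.
            right = prev[i + 1] if i + 1 < len(prev) else left
            nxt.append(hashlib.sha256(left + right).digest())
        levels.append(nxt)
    return levels


def differing_chunks(tree_a, tree_b):
    """Walk down from the roots, following only subtrees whose hashes differ."""
    if len(tree_a) != len(tree_b) or len(tree_a[0]) != len(tree_b[0]):
        raise ValueError("this sketch only compares trees of the same shape")
    suspects = [0]  # node indexes at the current level, starting at the root
    for level in range(len(tree_a) - 1, 0, -1):
        next_suspects = []
        for idx in suspects:
            if tree_a[level][idx] != tree_b[level][idx]:
                for child in (2 * idx, 2 * idx + 1):
                    if child < len(tree_a[level - 1]):
                        next_suspects.append(child)
        suspects = next_suspects
    return [i for i in suspects if tree_a[0][i] != tree_b[0][i]]
```

A repair process built on this idea would only need to copy the chunks whose indexes are returned, rather than the whole file; when the root hashes match, nothing below the root is visited.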

View file

@ -6,20 +6,6 @@ Erlang documentation, please use this link:
 ## Documents in this directory
-### chain-self-management-sketch.org
-[chain-self-management-sketch.org](chain-self-management-sketch.org)
-is a mostly-deprecated draft of
-an introduction to the
-self-management algorithm proposed for Machi. Most material has been
-moved to the [high-level-chain-mgr.pdf](high-level-chain-mgr.pdf) document.
-### cluster-of-clusters (directory)
-This directory contains the sketch of the "cluster of clusters" design
-strawman for partitioning/distributing/sharding files across a large
-number of independent Machi clusters.
 ### high-level-machi.pdf
 [high-level-machi.pdf](high-level-machi.pdf)
@ -50,9 +36,9 @@ introduction to the Humming Consensus algorithm. Its abstract:
 > of file updates to all replica servers in a Machi cluster. Chain
 > Replication is a variation of primary/backup replication where the
 > order of updates between the primary server and each of the backup
-> servers is strictly ordered into a single ``chain''. Management of
-> Chain Replication's metadata, e.g., ``What is the current order of
-> servers in the chain?'', remains an open research problem. The
+> servers is strictly ordered into a single "chain". Management of
+> Chain Replication's metadata, e.g., "What is the current order of
+> servers in the chain?", remains an open research problem. The
 > current state of the art for Chain Replication metadata management
 > relies on an external oracle (e.g., ZooKeeper) or the Elastic
 > Replication algorithm.
@ -60,7 +46,7 @@ introduction to the Humming Consensus algorithm. Its abstract:
 > This document describes the Machi chain manager, the component
 > responsible for managing Chain Replication metadata state. The chain
 > manager uses a new technique, based on a variation of CORFU, called
-> ``humming consensus''.
+> "humming consensus".
 > Humming consensus does not require active participation by all or even
 > a majority of participants to make decisions. Machi's chain manager
 > bases its logic on humming consensus to make decisions about how to
@ -71,3 +57,18 @@ introduction to the Humming Consensus algorithm. Its abstract:
 > decision during that epoch. When a differing decision is discovered,
 > new time epochs are proposed in which a new consensus is reached and
 > disseminated to all available participants.
+### chain-self-management-sketch.org
+[chain-self-management-sketch.org](chain-self-management-sketch.org)
+is a mostly-deprecated draft of
+an introduction to the
+self-management algorithm proposed for Machi. Most material has been
+moved to the [high-level-chain-mgr.pdf](high-level-chain-mgr.pdf) document.
+### cluster-of-clusters (directory)
+This directory contains the sketch of the "cluster of clusters" design
+strawman for partitioning/distributing/sharding files across a large
+number of independent Machi clusters.
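Reviewer note on the abstract quoted in this diff: the property it describes, that updates flow from the primary through each backup in one strict order, can be illustrated with a toy model. The sketch below is in Python and is an assumption-laden illustration only. The `Server` class and both function names are hypothetical, and it models neither server failure, repair, nor the chain manager and Humming Consensus that decide the chain order.

```python
class Server:
    """A toy replica: a key/value store plus the ordered log of updates it applied."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.log = []


def chain_write(chain, key, value):
    """Apply an update at the head, then at each successor in chain order.

    The write is acknowledged only after the tail has applied it, so every
    server sees updates in the same single order.
    """
    for server in chain:  # head first, tail last
        server.store[key] = value
        server.log.append((key, value))
    return chain[-1].name  # the tail is the server that acknowledges


def chain_read(chain, key):
    """Read from the tail, which holds only fully replicated updates."""
    return chain[-1].store.get(key)
```

The question the abstract calls an open research problem is what the `chain` list should be at any moment, for example after a server fails or a network partition heals; this sketch simply takes that order as given.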