Doc update, including mid-December 2015 status #54

Merged
slfritchie merged 5 commits from slf/doc-201512-update into master 2015-12-16 02:15:09 +00:00
Showing only changes of commit e4a784d3dd

@@ -1,32 +1,46 @@
-# Machi
+# Machi: a robust & reliable, distributed, highly available, large file store

 [Travis-CI](http://travis-ci.org/basho/machi) :: ![Travis-CI](https://secure.travis-ci.org/basho/machi.png)

-Our goal is a robust & reliable, distributed, highly available(*),
-large file store based upon write-once registers, append-only files,
-Chain Replication, and client-server style architecture. All members
-of the cluster store all of the files. Distributed load
-balancing/sharding of files is __outside__ of the scope of this
-system. However, it is a high priority that this system be able to
-integrate easily into systems that do provide distributed load
-balancing, e.g., Riak Core. Although strong consistency is a major
-feature of Chain Replication, first use cases will focus mainly on
-eventual consistency features --- strong consistency design will be
-discussed in a separate design document (read more below).
-
-The ability for Machi to maintain strong consistency will make it
-attractive as a toolkit for building things like CORFU and Tango as
-well as better-known open source software such as Kafka's file
-replication. (See the bibliography of the [Machi high level design
-doc](./doc/high-level-machi.pdf) for further references.)
-
-(*) When operating in strong consistency mode (supporting
-sequential or linearizable semantics), the availability of the
-system is restricted to quorum majority availability. When in
-eventual consistency mode, service can be provided by any
-available server.
-
-## Status: mid-October 2015: work is underway
+Our goal is a robust & reliable, distributed, highly available, large
+file store. Such stores already exist, both in the open source world
+and in the commercial world. Why reinvent the wheel? We believe
+there are three reasons, ordered by decreasing rarity.
+
+1. We want end-to-end checksums for all file data, from the initial
+file writer to every file reader, anywhere, all the time.
+2. We need flexibility to trade consistency for availability:
+e.g. weak consistency in exchange for being available in cases
+of partial system failure.
+3. We want to manage file replicas in a way that's provably correct
+and also easy to test.
+
+Of all the file stores in the open source & commercial worlds, only
+criteria #3 is a viable option. Or so we hope. Or we just don't
+care, and if data gets lost or corrupted, then ... so be it.
+
+If we have app use cases where availability is more important than
+consistency, then systems that meet criteria #2 are also rare.
+Most file stores provide only strong consistency and therefore
+have unavoidable, unavailable behavior when parts of the system
+fail.
+What if we want a file store that is always available to write new
+file data and attempts best-effort file reads?
+
+If we really do care about data loss and/or data corruption, then we
+really want both #3 and #1. Unfortunately, systems that meet criteria
+#1 are *very*
+rare. Why? This is 2015. We have decades of research that shows
+that computer hardware can (and
+indeed does) corrupt data at nearly every level of the modern
+client/server application stack. Systems with end-to-end data
+corruption detection should be ubiquitous today. Alas, they are not.
+Machi is an effort to change the deplorable state of the world, one
+Erlang function at a time.
+
+## Status: mid-December 2015: work is underway
+
+TODO: status update here.

 * The chain manager is ready for both eventual consistency use ("available
 mode") and strong consistency use ("consistent mode"). Both modes use a new
@@ -53,9 +67,18 @@ If you'd like to work on a protocol such as Thrift, UBF,
 msgpack over UDP, or some other protocol, let us know by
 [opening an issue to discuss it](./issues/new).

+## Where to learn more about Machi
+
 The two major design documents for Machi are now mostly stable.
 Please see the [doc](./doc) directory's [README](./doc) for details.

+Scott recently (November 2015) gave a presentation at the
+[RICON 2015 conference](http://ricon.io) about one of the techniques
+used by Machi; "Managing Chain Replication Metadata with
+Humming Consensus" is available online now.
+* [slides (PDF format)](http://ricon.io/speakers/slides/Scott_Fritchie_Ricon_2015.pdf)
+* [video](https://www.youtube.com/watch?v=yR5kHL1bu1Q)
+
 ## Contributing to Machi: source code, documentation, etc.

 Basho Technologies, Inc. as committed to licensing all work for Machi