Machi: a distributed, decentralized blob/large file store using chain replication and "Humming Consensus".
Latest commit 715d1b5ee4 by Scott Lystig Fritchie:

If I have a single test file @ 400MB:

    dd if=/dev/random of=test/foo-data-1 bs=1m count=400

And if I also remove the "config" dir that stores all of the .csum
files.

Then when I run `machi_merkle_tree_test:test(5)`, I see this output:

    machi_csum_table:open(./config/csum_1.csum) -> 2375 usec
    machi_csum_table:open(./config/csum_1.csum) -> 789 usec
    build_tree leaves = 1261 by pid <0.39.0>
    build_tree(size = 1261) -> 602 usec
    Osize 457 Osize2 1261
    machi_csum_table:open(./config/csum_2.csum) -> 662 usec
    machi_csum_table:open(./config/csum_2.csum) -> 801 usec
    build_tree leaves = 1259 by pid <0.39.0>
    build_tree(size = 1259) -> 566 usec
    Osize 479 Osize2 1259
    machi_csum_table:open(./config/csum_3.csum) -> 685 usec
    machi_csum_table:open(./config/csum_3.csum) -> 811 usec
    build_tree leaves = 1217 by pid <0.39.0>
    build_tree(size = 1217) -> 557 usec
    Osize 472 Osize2 1217
    machi_csum_table:open(./config/csum_4.csum) -> 941 usec
    machi_csum_table:open(./config/csum_4.csum) -> 740 usec
    build_tree leaves = 1134 by pid <0.39.0>
    build_tree(size = 1134) -> 559 usec
    Osize 395 Osize2 1134
    machi_csum_table:open(./config/csum_5.csum) -> 2541 usec
    machi_csum_table:open(./config/csum_5.csum) -> 725 usec
    build_tree leaves = 1284 by pid <0.39.0>
    build_tree(size = 1284) -> 572 usec
    Osize 425 Osize2 1284
    ok

... and I see this in "results.txt":

    473	32517	48778	1322	27788
    424	29114	43283	1155	25188
    452	27120	46868	1346	26828
    462	28497	47198	1398	27322
    424	29915	44685	1326	25596
2015-10-29 17:25:36 +09:00
doc Add section '9. Other considerations for FLU/sequencer implementations' 2015-10-19 14:47:13 +09:00
ebin Single server client & server code (squashed) 2015-04-01 16:14:24 +09:00
include Merge branch 'ku/config-system' into tmp 2015-10-27 14:14:40 +09:00
priv Experimental: add 'make dialyzer' to priv/test-for-gh-pr.sh 2015-10-27 14:12:37 +09:00
prototype Update on the status of prototype/chain-manager code: now moved to TOP/src on 2015-04-18 01:42:47 +09:00
rel Fix cut-and-paste-o in machi-admin script 2015-10-16 17:50:34 +09:00
src If I have a single test file @ 400MB: 2015-10-29 17:25:36 +09:00
test If I have a single test file @ 400MB: 2015-10-29 17:25:36 +09:00
.gitignore Ignore vim cruft too 2015-10-27 16:33:07 -05:00
.travis.yml Add priv/test-for-gh-pr.sh, change TravisCI 'script' to use it 2015-10-16 18:23:24 +09:00
CONTRIBUTING.md The FAQ grows 2015-06-22 00:09:35 +09:00
dialyzer.ignore-warnings Silence remaining warnings 2015-10-22 15:37:09 +09:00
FAQ.md Docuemntation updates 2015-06-30 19:04:23 +09:00
INSTALLATION.md Add a bit more to INSTALLATION.md 2015-05-21 15:58:00 +09:00
LICENSE Add APL v2 LICENSE file 2015-03-02 17:12:39 +09:00
Makefile Replace some make targets with tools.mk 2015-10-22 10:53:09 +09:00
NOTICE Add NOTICE 2015-03-02 21:06:31 +09:00
README.md markdown fix 2015-10-21 10:44:52 +09:00
rebar Add test/machi_pb_test.erl, finish PB refactoring 2015-06-19 13:00:28 +09:00
rebar.config Add merklet as a dependency 2015-10-27 11:57:38 -05:00
rebar.config.script Hrm, fewer deadlocks, but sometimes unreliable shutdown 2015-07-16 17:59:02 +09:00
TODO-shortterm.org Update TODO-shortterm.org for completion of fitness work 2015-09-22 16:44:49 +09:00
tools.mk Reduce all dialyzer runtimes by approx 13 seconds (on my MacBook Pro) 2015-10-22 15:36:50 +09:00

Machi


Our goal is a robust & reliable, distributed, highly available(*), large file store based upon write-once registers, append-only files, Chain Replication, and client-server style architecture. All members of the cluster store all of the files. Distributed load balancing/sharding of files is outside of the scope of this system. However, it is a high priority that this system be able to integrate easily into systems that do provide distributed load balancing, e.g., Riak Core. Although strong consistency is a major feature of Chain Replication, first use cases will focus mainly on eventual consistency features --- strong consistency design will be discussed in a separate design document (read more below).
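To make the "write-once register" idea concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not Machi's implementation (Machi is written in Erlang): the names `WriteOnceFile` and `AlreadyWritten` are hypothetical, and the sketch assumes that each byte position in a file may be written exactly once and that unwritten positions cannot be read.

```python
class AlreadyWritten(Exception):
    """Raised when a write overlaps a byte range that was already written."""


class WriteOnceFile:
    """Hypothetical in-memory model of a file made of write-once bytes."""

    def __init__(self):
        # Maps byte position -> byte value; a position present here is sealed.
        self._bytes = {}

    def write(self, offset, data):
        span = range(offset, offset + len(data))
        # Refuse the whole write if any target position was written before.
        if any(pos in self._bytes for pos in span):
            raise AlreadyWritten(f"range [{offset}, {offset + len(data)}) overlaps written data")
        for pos, value in zip(span, data):
            self._bytes[pos] = value

    def read(self, offset, length):
        span = range(offset, offset + length)
        # Reading a hole is an error in this sketch.
        if any(pos not in self._bytes for pos in span):
            raise KeyError("range contains unwritten bytes")
        return bytes(self._bytes[pos] for pos in span)


if __name__ == "__main__":
    f = WriteOnceFile()
    f.write(0, b"abc")
    f.write(3, b"de")           # appending past the written region is fine
    print(f.read(1, 4))         # bytes 1..4 of the file
    try:
        f.write(2, b"zz")       # overlaps position 2, so it is rejected
    except AlreadyWritten as err:
        print("rejected:", err)
```

Because written data never changes, replicas that hold the same byte range can be compared by checksum alone, which is one reason write-once semantics pair well with replication.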

The ability for Machi to maintain strong consistency will make it attractive as a toolkit for building things like CORFU and Tango as well as better-known open source software such as Kafka's file replication. (See the bibliography of the Machi high level design doc for further references.)

(*) When operating in strong consistency mode (supporting
sequential or linearizable semantics), the availability of the
system is restricted to quorum majority availability.  When in
eventual consistency mode, service can be provided by any
available server.
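The availability rule in the footnote above can be written as a small predicate. This is a sketch under stated assumptions, not Machi code: the function name `can_serve` and the mode labels `"cp"` and `"ap"` are hypothetical, and "quorum majority" is taken to mean strictly more than half of the configured cluster members.

```python
def can_serve(mode, servers_up, cluster_size):
    """Return True if the cluster can provide service in the given mode.

    mode: "cp" (strong consistency) or "ap" (eventual consistency).
    servers_up: number of currently reachable servers.
    cluster_size: number of configured cluster members.
    """
    if cluster_size <= 0 or not 0 <= servers_up <= cluster_size:
        raise ValueError("invalid server counts")
    if mode == "cp":
        # Strong consistency: a strict majority of members must be available.
        return servers_up > cluster_size // 2
    if mode == "ap":
        # Eventual consistency: any single available server can serve.
        return servers_up >= 1
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for up in range(4):
        print(up, "of 3 up:", "cp =", can_serve("cp", up, 3), "ap =", can_serve("ap", up, 3))
```

With three servers, this model serves in CP mode while one server is down but not while two are down, whereas AP mode keeps serving as long as one server remains.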

Status: mid-October 2015: work is underway

  • The chain manager is ready for both eventual consistency use ("AP mode") and strong consistency use ("CP mode"). Both modes use a new consensus technique, Humming Consensus.

    • Scott will be speaking about Humming Consensus at the Ricon 2015 conference (http://ricon.io) in San Francisco, CA, USA on Thursday, November 5th, 2015.
    • Implementation of the file repair process for strong consistency is still in progress.
  • All Machi client/server protocols are based on Protocol Buffers.

    • The current specification for Machi's protocols can be found at https://github.com/basho/machi/blob/master/src/machi.proto.
    • The Machi PB protocol is not yet stable. Expect change!
    • The Erlang language client implementation of the high-level protocol flavor is brittle (e.g., little error handling yet).

If you'd like to work on a protocol such as Thrift, UBF, msgpack over UDP, or some other protocol, let us know by opening an issue to discuss it.

The two major design documents for Machi are now mostly stable. Please see the doc directory's README for details.

Contributing to Machi: source code, documentation, etc.

Basho Technologies, Inc. has committed to licensing all work for Machi under the Apache License, version 2. All authors of source code and documentation who agree with these licensing terms are welcome to contribute their ideas in any form: suggested design or features, documentation, and source code.

Machi is still a very young project within Basho, with a small team of developers; please bear with us as we grow out of "toddler" stage into a more mature open source software project. We invite all contributors to review the CONTRIBUTING.md document for guidelines for working with the Basho development team.

A brief survey of the directories in this repository

  • A list of Frequently Asked Questions, a.k.a. the Machi FAQ.

  • The doc directory: home for major documents about Machi: high level design documents as well as exploration of features still under design & review within Basho.

  • The ebin directory: used for compiled application code.

  • The include, src, and test directories: contain the header files, source files, and test code for Machi, respectively.

  • The prototype directory: contains proof of concept code, scaffolding libraries, and other exploratory code. Curious readers should see the prototype/README.md file for more explanation of the small sub-projects found here.

Development environment requirements

All development to date has been done with Erlang/OTP version 17 on OS X. The only known limitations for using R16 are minor type specification differences between R16 and 17, but we strongly suggest continuing development using version 17.

We also assume that you have the standard UNIX/Linux developer toolchain for C and C++ applications. Specifically, we assume make is available. The utility used to compile the Machi source code, rebar, is pre-compiled and included in the repo.

There are no known OS limits at this time: any platform that supports Erlang/OTP should be sufficient for Machi. This may change over time (e.g., adding NIFs which can make full portability to Windows OTP environments difficult), but it hasn't happened yet.

Contributions

Basho encourages contributions to Machi from the community. Here's how to get started.

  • Fork the appropriate sub-projects that are affected by your change.
  • Create a topic branch for your change and check out that branch: `git checkout -b some-topic-branch`
  • Make your changes and run the test suite if one is provided. (see below)
  • Commit your changes and push them to your fork.
  • Open pull-requests for the appropriate projects.
  • Contributors will review your pull request, suggest changes or offer feedback, and merge it when it's ready.
  • To report a bug or issue, please open a new issue against this repository.

-The Machi team at Basho, Scott Lystig Fritchie, technical lead, and Matt Brender, your developer advocate.