Frequently Asked Questions (FAQ)
## 1. Questions about Machi in general

### 1.1. What is Machi?

TODO: expand this topic.

Very briefly, Machi is a very simple append-only file store. It is "dumber" than many other file stores, such as HadoopFS or a simple NFS or CIFS file server, in that it lacks many of the features they provide. However, Machi is a distributed file store, which makes it different from (and, in some ways, more complicated than) a simple NFS or CIFS file server.

As a distributed system, Machi can be configured to operate in either eventually consistent mode or strongly consistent mode. (See the high level design document for definitions and details.)
For a much longer answer, please see the Machi high level design doc.
### 1.2. What does Machi's API look like?

The Machi API contains only a handful of operations. The function arguments shown below use Erlang-style type annotations.
append_chunk(Prefix:binary(), Chunk:binary()).
append_chunk_extra(Prefix:binary(), Chunk:binary(), ExtraSpace:non_neg_integer()).
read_chunk(File:binary(), Offset:non_neg_integer(), Size:non_neg_integer()).
checksum_list(File:binary()).
list_files().
Machi allows the client to choose the prefix of the file name to append data to, but the Machi server will always choose the final file name and byte offset for each append_chunk() operation. This restriction on file naming makes it easy to operate in "eventually consistent" mode: files may be written to any server during network partitions and can be easily merged together after the partition is healed.
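
The naming rule above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the Machi client or its wire protocol: the `ToyMachiServer` class, the `prefix.server.sequence` naming scheme, the `(file, offset, size)` return value, and the `merge_after_partition` helper are all hypothetical and exist only for illustration. Prefixes are plain Python strings here rather than binaries.

```python
import itertools


class ToyMachiServer:
    """Toy stand-in for a single server (hypothetical, not the real Machi)."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.files = {}                 # file name -> bytearray of file contents
        self._current = {}              # prefix -> file currently being appended to
        self._seq = itertools.count(1)  # keeps server-chosen names unique

    def append_chunk(self, prefix, chunk):
        """The client supplies only a prefix; the server picks name and offset."""
        name = self._current.get(prefix)
        if name is None:
            # Assumed naming scheme: prefix + server id + sequence number.
            name = f"{prefix}.{self.server_id}.{next(self._seq)}"
            self._current[prefix] = name
            self.files[name] = bytearray()
        offset = len(self.files[name])
        self.files[name] += chunk
        return name, offset, len(chunk)

    def read_chunk(self, file, offset, size):
        """Bytes already appended can be read back right away."""
        return bytes(self.files[file][offset:offset + size])

    def list_files(self):
        return sorted(self.files)


def merge_after_partition(a, b):
    """Server-chosen names never collide, so a merge is a plain union."""
    merged = dict(a.files)
    for name, data in b.files.items():
        assert name not in merged, "server-chosen names should be unique"
        merged[name] = data
    return merged


# Two servers accept appends for the same prefix during a partition.
s1, s2 = ToyMachiServer("s1"), ToyMachiServer("s2")
file1, off1, size1 = s1.append_chunk("logs", b"hello")
file2, off2, size2 = s2.append_chunk("logs", b"other side")
merged = merge_after_partition(s1, s2)
```

Because the client never picks the full file name, the two sides of a partition cannot create conflicting files in this model, which is why the merge step needs no conflict resolution.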
Internally, individual cluster members use a more complex protocol to manage file contents and to repair damaged or missing files. See Figure 3 in the Machi high level design doc for more details.
## 2. Questions about Machi relative to something else

### 2.1. How is Machi better than Hadoop?

This question is frequently asked by trolls. If this is a troll question, the answer is either, "Nothing is better than Hadoop," or else "Everything is better than Hadoop."
The real answer is that Machi is not a distributed data processing framework like Hadoop is. See Hadoop's entry in Wikipedia and focus on the description of Hadoop's MapReduce and YARN; Machi contains neither.
### 2.2. How does Machi differ from HadoopFS?

This is a much better question than the "How is Machi better than Hadoop?" question.
One way to look at Machi is to consider Machi as a distributed file store. HadoopFS is also a distributed file store. Let's compare and contrast.

| Machi | Hadoop |
|-------|--------|
| Not POSIX compliant | Not POSIX compliant |
| Immutable file store with append-only semantics (simplifying things a little bit). | Immutable file store with append-only semantics |
| File data may be read concurrently while file is being actively appended to. | File must be closed before a client can read it. |
| No concept (yet) of users, directories, or ACLs | Has concepts of users, directories, and ACLs. |
| Machi does not allow clients to name their own files or to specify data placement/offset within a file. | While not POSIX compliant, HDFS allows a fairly flexible API for managing file names and file writing position within a file (during a file's writable phase). |
| Does not have any file distribution/partitioning/sharding across Machi clusters: in a single Machi cluster, all files are replicated by all servers in the cluster. The "cluster of clusters" concept is used to distribute/partition/shard files across multiple Machi clusters. | File distribution/partitioning/sharding is performed automatically by the HDFS "name node". |
| Machi requires no central "name node" for single cluster use. Machi requires no central "name node" for "cluster of clusters" use. | Requires a single "namenode" server to maintain file system contents and file content mapping. (May be deployed with a "secondary namenode" to reduce unavailability when the primary namenode fails.) |
| Machi uses Chain Replication to manage all file replicas. | The HDFS name node uses an ad hoc mechanism for replicating file contents. The HDFS file system metadata (file names, file block(s) locations, ACLs, etc.) is stored by the name node in the local file system and is replicated to any secondary namenode using snapshots. |
| Machi replicates files *N* ways where *N* is the length of the Chain Replication chain. Typically, *N=2*, but this is configurable. | HDFS typically replicates file contents *N=3* ways, but this is configurable. |
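
For readers unfamiliar with Chain Replication, the following is a minimal sketch of the generic textbook scheme: writes are applied at the head and passed down the chain, and reads are answered by the tail, so a value is visible only once all *N* replicas hold it. It is not Machi's actual protocol, which the high level design doc describes and which may differ in its details. The `Replica` and `Chain` classes are hypothetical names used only for illustration.

```python
class Replica:
    """One chain member holding its own copy of every written chunk."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # (file, offset) -> bytes


class Chain:
    """Toy chain of length N: writes flow head -> tail, reads hit the tail."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]

    def write(self, file, offset, chunk):
        # Apply the write at each member in chain order. The write is
        # acknowledged only after the tail (last member) has stored it.
        for replica in self.replicas:
            replica.store[(file, offset)] = chunk
        return "ok"

    def read(self, file, offset):
        # The tail only holds fully replicated writes, so reading from it
        # never exposes data that some replica is still missing.
        return self.replicas[-1].store.get((file, offset))


chain = Chain(["a", "b"])        # N = 2, the typical length quoted above
chain.write("some_file", 0, b"chunk")
copies = sum(("some_file", 0) in r.store for r in chain.replicas)
```

In this model the replication factor is simply the number of members in the chain, which matches the table's statement that *N* is the chain length.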

### 2.3. How does Machi differ from Kafka?

Machi is rather close to Kafka in spirit, though its implementation is quite different.

| Machi | Kafka |
|-------|-------|
| Append-only, strongly consistent file store only | Append-only, strongly consistent log file store + additional services: for example, producer topics & sharding, consumer groups & failover, etc. |
| Not yet code complete nor "battle tested" in large production environments. | "Battle tested" in large production environments. |

In theory, it should be "quite straightforward" to remove these parts of Kafka's code base:
- local file system I/O for all topic/partition/log files
- leader/follower file replication, ISR ("In Sync Replica") state management, and related log file replication logic
... and replace those parts with Machi client API calls. Those parts of Kafka are what Machi has been designed to do from the very beginning.
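
As a rough illustration of that idea, a Kafka-style partition could keep only an index from message offsets to `(file, byte offset, size)` triples and delegate storage to `append_chunk`/`read_chunk`. This is a sketch under stated assumptions, not Kafka or Machi code: `StubStore` is a hypothetical in-memory stand-in for a Machi client, `LogPartition` is an invented name, and the assumption that `append_chunk` returns the server-chosen file name and offset is not confirmed by this FAQ.

```python
class StubStore:
    """Hypothetical in-memory stand-in for a Machi client."""

    def __init__(self):
        self._files = {}

    def append_chunk(self, prefix, chunk):
        name = prefix + ".0"  # placeholder; a real server chooses the name
        buf = self._files.setdefault(name, bytearray())
        offset = len(buf)
        buf += chunk
        return name, offset   # assumed return shape

    def read_chunk(self, file, offset, size):
        return bytes(self._files[file][offset:offset + size])


class LogPartition:
    """Maps message offsets 0, 1, 2, ... to (file, byte offset, size)."""

    def __init__(self, store, topic, partition):
        self.store = store
        self.prefix = f"{topic}-{partition}"
        self.index = []  # position in this list == message offset

    def produce(self, message):
        file, offset = self.store.append_chunk(self.prefix, message)
        self.index.append((file, offset, len(message)))
        return len(self.index) - 1

    def fetch(self, message_offset):
        file, offset, size = self.index[message_offset]
        return self.store.read_chunk(file, offset, size)


partition = LogPartition(StubStore(), "events", 0)
first = partition.produce(b"first message")
second = partition.produce(b"second")
```

The index lives only in memory here; a real integration would also have to store it durably, which this sketch does not attempt.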
See also: How does Machi differ from CORFU and Tango?
### 2.4. How does Machi differ from Bookkeeper?

Sorry, we haven't studied Bookkeeper very deeply or used Bookkeeper for any non-trivial project.

One notable limitation of the Bookkeeper API is that a ledger cannot be read by other clients until it has been closed. Any byte in a Machi file that has been written successfully may be read immediately by any other Machi client.
The name "Machi" does not have three consecutive pairs of repeating letters. The name "Bookkeeper" does.
### 2.5. How does Machi differ from CORFU and Tango?

Machi's design borrows very heavily from CORFU. We acknowledge a deep debt to the original Microsoft Research papers that describe CORFU's original design and implementation.
See also: the "Recommended reading & related work" and "References" sections of the Machi high level design doc for pointers to the MSR papers related to CORFU.
Machi does not implement Tango directly. (Not yet, at least.) However, there is a prototype implementation included in the Machi source tree. See the prototype/tango source code directory for details.