Tango prototype
This is a quick hack, just to see how quick & easy it might be to build Tango on top of corfurl. It turned out to be pretty quick and easy.
This prototype does not include any datatype-specific APIs, such as an
HTTP REST interface for manipulating a queue. The current API is
native Erlang only. However, because the Tango client communicates with
the underlying CORFU log via the corfurl interface, this
implementation is powerful enough to run concurrently on multiple
Erlang nodes.
This implementation does not follow the same structure as described in the Tango paper. I made some changes, based on some guesses/partial understanding of the paper. If I were to start over again, I'd try to use the exact same naming scheme & structure suggested by the paper.
Testing environment
Tested using Erlang/OTP R16B and Erlang/OTP 17, both on OS X.
It ought to "just work" on other versions of Erlang and on other OS platforms, but sorry, I haven't tested it.
Use "make" and "make test" to compile and run unit tests.
Note that the Makefile assumes that the rebar utility is available
somewhere in your path.
Data types implemented
- OID mapper
- Simple single-value register
- Map (i.e., multi-value register or basic key-value store)
- Queue
  - Used the Erlang/OTP queue.erl library for rough inspiration
  - Operations: is_empty, length, peek, to_list, member, in, out, reverse, filter.
  - Queue mutation operations are not idempotent with respect to multiple writes in the underlying CORFU log, e.g., due to CORFU log reconfiguration or partial write error/timeouts.
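
To illustrate the non-idempotence of queue mutations, here is a minimal sketch of rebuilding a queue by replaying log entries in order. It is written in Python for brevity and is not the prototype's Erlang code. The log layout, the `replay_queue` function, and the `("in", item)` / `("out",)` entry shapes are all hypothetical and are not taken from the prototype or from corfurl. The point is only that if a retried write lands in the log twice, replay applies the mutation twice.

```python
from collections import deque


def replay_queue(log):
    """Rebuild a queue by applying every log entry in order.

    Hypothetical entry shapes (assumptions for this sketch):
      ("in", item)  appends item at the rear
      ("out",)      removes the item at the front (no-op when empty)
    """
    q = deque()
    for entry in log:
        if entry[0] == "in":
            q.append(entry[1])
        elif entry[0] == "out" and q:
            q.popleft()
    return list(q)


# A clean log: each mutation was written exactly once.
clean_log = [("in", "a"), ("in", "b"), ("out",)]

# The same history, except the write of ("in", "b") timed out and was
# retried, so the entry appears twice in the underlying log.
retried_log = [("in", "a"), ("in", "b"), ("in", "b"), ("out",)]

print(replay_queue(clean_log))    # ['b']
print(replay_queue(retried_log))  # ['b', 'b']
```

A register or map write that sets a key to a value gives the same result when replayed twice, which is why this caveat is specific to the queue.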
Experimental idea: built-in OID checkpointing
I was toying with the idea of adding a Tango "history splicing" operation that could make the implementation of per-OID checkpoint & garbage collection (and CORFU-level trimming) operations much easier. I think that this might be a very good idea and that it deserves more research & work.
The implementation of the checkpointing & splicing as it is today is flawed. See the TODO list below for more details.
Information about the Tango paper
"Tango: Distributed Data Structures over a Shared Log"
Balakrishnan, Malkhi, Wobber, Wu, Prabhakaran, Wei, Davis, Rao, Zou, Zuck
Describes a framework for developing data structures that reside persistently within a CORFU log: the log is the database/data structure store.
http://www.snookles.com/scottmp/corfu/Tango.pdf
See also ../corfu/docs/corfurl.md for more information on CORFU research papers.
TODO list
__ The src/corfu* files in this sub-repo differ from the original prototype source files in the ../corfu sub-repo, sorry!
__ The current checkpoint implementation is fundamentally broken and needs a rewrite, or else. This issue is not mentioned at all in the Tango paper.
- option 1: fix checkpoint to be 100% correct
- option 2: checkpointing is for the weak and the memory-constrained, so don't bother. Instead, rip out the current checkpoint code, period.
- option 3: other
xx Checkpoint fix option #1: history splicing within the same OID?
xx Checkpoint fix option #2: checkpoint to a new OID, history writes to both OIDs during the CP, then a marker in the old OID to switch over to the new OID?
History splicing has a flaw that I believe means it just won't work. The switch to a new OID has problems with updates written to the old OID before the new checkpoint has finished.
I believe that a checkpoint where:
- all Tango writes, checkpoint and non-checkpoint alike, are noted with a checkpoint number.
- that checkpoint number is strictly increasing
- a new checkpoint has a new checkpoint number
- scans ignore blocks with checkpoint numbers larger than the current active checkpoint #, until the checkpoint is complete.
... ought to work correctly.
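
The scan rule in the last bullet can be sketched as a simple filter. This is a minimal illustration in Python, not the prototype's Erlang code. The `(checkpoint_number, payload)` entry shape, the `visible_entries` function, and the `active_cp` parameter are hypothetical names introduced here. The sketch covers only visibility. It does not decide which number concurrent non-checkpoint writes carry, or how a completed checkpoint supersedes older history.

```python
def visible_entries(log, active_cp):
    """Return the payloads that a scan should see.

    Each log entry is a hypothetical (checkpoint_number, payload) pair.
    Entries tagged with a checkpoint number larger than the active one
    belong to a checkpoint that is not complete yet, so they are skipped.
    """
    return [payload for cp, payload in log if cp <= active_cp]


log = [
    (1, "set x=1"),
    (1, "set y=2"),
    (2, "checkpoint: x=1, y=2"),  # written by the in-progress checkpoint 2
]

# While checkpoint 2 is still being written, scans use active_cp=1.
print(visible_entries(log, active_cp=1))
# ['set x=1', 'set y=2']

# Once checkpoint 2 is complete, it becomes the active checkpoint.
print(visible_entries(log, active_cp=2))
# ['set x=1', 'set y=2', 'checkpoint: x=1, y=2']
```

Because checkpoint numbers are strictly increasing, a single comparison against the active number is enough to hide a partially written checkpoint from readers.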