Update todo items.

This commit is contained in:
Gregory Burd 2012-04-21 17:24:37 -04:00
parent 638f2d56ee
commit 85b09530ee

TODO

@@ -1,21 +1,24 @@
* hanoi
* [cleanup] add @doc strings and -spec's
* [cleanup] check to make sure every error returns with a reason {error, Reason}
* [feature] statistics
* [feature] use lager for error messages
* [enhancement] add crc or something to the files
* [feature] add config parameters on open
* lsm_btree
* [2i] secondary index support
* atomic multi-commit/recovery
* add checkpoint/1 and sync/1 - flush pending writes to stable storage
(nursery:finish() and finish/flush any merges)
* [config] add config parameters on open
* {sync, boolean()} fdsync or not on write
* {cache, bytes(), name} share max(bytes) cache named 'name' via ets
* [enhancement] use ets/emmap to access/cache files
* [enhancement] adaptive nursery sizing
* [feature] support for time based expiry, merge should eliminate expired data
* [feature] add truncate/1 - quickly truncates a database to 0 items
* [feature] add sync/1 - flush pending writes to disk (aka checkpoint)
(nursery:finish() and finish/flush any merges)
* [feature] count/1 - return number of items currently in tree
* [feature] "group" commit - ability to make many k/v add/update/deletes atomic (for 2i)
* [enhancement] backpressure on fold operations
* [stats] statistics
* For each level {#merges, {merge-time-min, max, average}}
* [expiry] support for time based expiry, merge should eliminate expired data
* add @doc strings and -spec's
* check to make sure every error returns with a reason {error, Reason}
* lager; check for uses of lager:error/2
* add version 1, crc to the files
* add compression via snappy (https://github.com/fdmanana/snappy-erlang-nif)
* add encryption
* adaptive nursery sizing
* add truncate/1 - quickly truncates a database to 0 items
* count/1 - return number of items currently in tree
* backpressure on fold operations
- The "sync_fold" creates a snapshot (hard link to btree files), which
provides consistent behavior but may use a lot of disk space if there is
a lot of insertion going on.
@@ -23,31 +26,14 @@
serviced, then picks up from there again. So you could see intermittent
puts in a subsequent batch of results.
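The chunked fold described above might be sketched roughly as follows. This is a hypothetical shape, not the actual hanoi API: results are delivered ChunkSize at a time, and the fold blocks until the consumer acknowledges each chunk, which gives the backpressure mentioned in the TODO.

```erlang
-module(fold_bp).
-export([fold_chunked/4]).

%% Sketch only: deliver KVs to Consumer in chunks of ChunkSize,
%% blocking on an ack between chunks so a slow consumer throttles
%% the fold instead of flooding its mailbox. Message names are assumptions.
fold_chunked(KVs, ChunkSize, Consumer, Ref) ->
    deliver(KVs, ChunkSize, Consumer, Ref, []).

deliver([], _N, Consumer, Ref, Chunk) ->
    Consumer ! {fold_chunk, Ref, lists:reverse(Chunk)},
    Consumer ! {fold_done, Ref},
    ok;
deliver(KVs, N, Consumer, Ref, Chunk) when length(Chunk) =:= N ->
    Consumer ! {fold_chunk, Ref, lists:reverse(Chunk)},
    receive {fold_ack, Ref} -> ok end,   %% block here: backpressure point
    deliver(KVs, N, Consumer, Ref, []);
deliver([KV | Rest], N, Consumer, Ref, Chunk) ->
    deliver(Rest, N, Consumer, Ref, [KV | Chunk]).
```

Between acks the underlying tree can service pending puts, which is why a subsequent chunk may reflect writes made after the fold started.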
* riak_kv_hanoi_backend
* add support for time-based expiry
* finish support for 2i
* add stats collection
- For each level {#merges, {merge-time-min, max, average}}
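A minimal sketch of that per-level stats shape, keeping {#merges, {min, max, average}} per level. The orddict-based representation and the function name are assumptions:

```erlang
%% Sketch: record one merge of ElapsedMs milliseconds at Level,
%% maintaining {Count, {MinMs, MaxMs, AvgMs}} per level in an orddict.
record_merge(Level, ElapsedMs, Stats) ->
    orddict:update(Level,
        fun({N, {Min, Max, Avg}}) ->
            {N + 1, {min(Min, ElapsedMs),
                     max(Max, ElapsedMs),
                     (Avg * N + ElapsedMs) / (N + 1)}}   %% running mean
        end,
        {1, {ElapsedMs, ElapsedMs, ElapsedMs}},
        Stats).
```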
PHASE 2:
* hanoi
* lsm_btree
* Define a standard struct which is the metadata added at the end of the
file, e.g. [btree-nodes] [meta-data] [offset of meta-data]. This is written
in hanoi_writer:flush_nodes, and read in hanoi_reader:open2.
in lsm_btree_writer:flush_nodes, and read in lsm_btree_reader:open2.
* [feature] compression, encryption on disk
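The trailer layout described above ([btree-nodes] [meta-data] [offset of meta-data]) could be sketched like this. The term_to_binary encoding and the trailing 64-bit offset are assumptions; the file is assumed opened in raw binary mode, and the real code would live in hanoi_writer:flush_nodes / hanoi_reader:open2:

```erlang
%% Sketch: after the btree nodes are flushed, append the metadata
%% followed by an 8-byte offset pointing back at where it starts.
write_trailer(Fd, Meta) ->
    {ok, Pos} = file:position(Fd, cur),          %% meta-data begins here
    ok = file:write(Fd, term_to_binary(Meta)),
    ok = file:write(Fd, <<Pos:64/unsigned>>).    %% last 8 bytes -> meta offset

%% Sketch: read the last 8 bytes to find the metadata, then decode it.
read_trailer(Fd) ->
    {ok, Eof} = file:position(Fd, eof),
    {ok, <<Pos:64/unsigned>>} = file:pread(Fd, Eof - 8, 8),
    {ok, Bin} = file:pread(Fd, Pos, Eof - 8 - Pos),
    binary_to_term(Bin).
```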
PHASE 3:
* lsm_ixdb
* hanoi{btree, trie, ...} support for sub-databases and associations with
different index types
* [major change] add more CAPABILITIES such as
test-and-set(Fun, Key, Value) - to compare a vclock quickly, to speed up
the get/put path for every update
* [enhancement] change encoding/layout of data on disk using sub-databases
and secondary indexes
bucket/key{meta[], data} -> ??
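The proposed test-and-set capability might take this shape. The hanoi:get/2 and hanoi:put/3 call shapes and return values are assumptions for illustration:

```erlang
%% Sketch: apply Fun to the current value (e.g. a fast vclock comparison)
%% and only write when it returns true, instead of a full get/put round
%% trip for every update.
test_and_set(Tree, Key, Fun, NewValue) ->
    case hanoi:get(Tree, Key) of
        {ok, Old} ->
            case Fun(Old) of
                true  -> hanoi:put(Tree, Key, NewValue);
                false -> {error, conflict}
            end;
        not_found ->
            hanoi:put(Tree, Key, NewValue)
    end.
```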
REVIEW LITERATURE AND OTHER SIMILAR IMPLEMENTATIONS:
@@ -56,7 +42,7 @@ REVIEW LITERATURE AND OTHER SIMILAR IMPLEMENTATIONS:
* http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.44.2782&rep=rep1&type=pdf
1: make the "first level" have more than 2^5 entries (controlled by the constant TOP_LEVEL in hanoi.hrl); this means a new set of files is opened/closed/merged for every 32 inserts/updates/deletes. Setting this higher will just make the nursery correspondingly larger, which should be absolutely fine.
1: make the "first level" have more than 2^5 entries (controlled by the constant TOP_LEVEL in lsm_btree.hrl); this means a new set of files is opened/closed/merged for every 32 inserts/updates/deletes. Setting this higher will just make the nursery correspondingly larger, which should be absolutely fine.
2: Right now, the streaming btree writer emits a btree page based on the number of elements. This could be changed to be based on the size of the node (say, some block-size boundary) and then add padding at the end so that each node read becomes a clean block transfer. Right now, we're probably taking way too many reads.
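Size-based node emission with block padding could be sketched as below. The BLOCK_SIZE value and the zero-padding scheme are assumptions:

```erlang
-define(BLOCK_SIZE, 4096).

%% Sketch: pad a serialized node out to the next block boundary so that
%% reading one node maps onto whole block transfers.
pad_to_block(NodeBin) ->
    Pad = (?BLOCK_SIZE - (byte_size(NodeBin) rem ?BLOCK_SIZE)) rem ?BLOCK_SIZE,
    PadBits = Pad * 8,
    <<NodeBin/binary, 0:PadBits>>.
```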