* To Do list

** DONE remove the escript* stuff from machi_util.erl
** DONE Add functions to manipulate 1-chain projections
   - Add epoch ID = epoch number + checksum of projection!
     Done via compare() func.
** DONE Change all protocol ops to add epoch ID
** DONE Add projection store to each FLU.
*** DONE What should the API look like? (borrow from chain mgr PoC?)
    Yeah, I think that's pretty complete.  Steal it now, worry later.
*** DONE Choose protocol & TCP port.  Share with get/put?  Separate?
    Hrm, I like the idea of having a single TCP port to talk to any single
    FLU.

    To make the protocol "easy" to hack, how about using the same basic
    method as append/write, where there's a variable-size blob?  But we'll
    format that blob as a term_to_binary().  Then dispatch to a single
    func, and pattern match Erlang-style in that func.
*** DONE Do it.
** DONE Finish OTP'izing the Chain Manager with FLU & proj store processes
** DONE Eliminate the timeout exception for the client: just return {error, timeout}
** DONE Move prototype/chain-manager code to "top" of source tree
*** DONE Preserve current test code (leave as-is? tiny changes?)
*** DONE Make chain manager code flexible enough to run "real world" or "sim"
** DONE Add projection wedging logic to each FLU.
** DONE Implement real data repair, orchestrated by the chain manager
** DONE Change all protocol ops to enforce the epoch ID
   - Add no-wedging state to make testing easier?
** DONE Adapt the projection-aware, CR-implementing client from demo-day
** DONE Add major comment sections to the CR-impl client
** DONE Simple basho_bench driver, put some unscientific chalk on the benchtop
** TODO Create parallel PULSE test for basic API plus chain manager repair
** DONE Add client-side vs. server-side checksum type, expand client API?
** TODO Add gproc and get rid of registered name rendezvous
*** TODO Fixes the atom table leak
*** TODO Fixes the problem of having an active sequencer for the same prefix
         on two FLUs in the same VM
** TODO Fix all known bugs/cruft with Chain Manager (list below)
*** DONE Fix known bugs
*** DONE Clean up crufty TODO comments and other obvious cruft
*** TODO Re-add verification step of stable epochs, including inner projections!
*** TODO Attempt to remove cruft items in flapping_i?
** TODO Move the FLU server to gen_server behavior?

* DONE Chain manager CP mode, Plan B
** SKIP Maybe? Change ch_mgr to use middleworker
**** DONE Is it worthwhile?  Is the parallelism so important?  No, probably.
**** SKIP Move middleworker func to utility module?
** DONE Add new proc to psup group
*** DONE Name: machi_fitness
** DONE ch_mgr keeps its current proc struct: i.e. same 1 proc as today
** NO ch_mgr asks hosed mgr for hosed list @ start of react_to_env
** DONE For all hosed, do *async*: try to read latest proj.
*** NO If OK, inform hosed mgr: status change will be used by next HC iter.
*** NO If fail, no change, because that server is already known to be hosed
*** DONE For all non-hosed, continue as the chain manager code does today
*** DONE Any new errors are added to UpNodes/DownNodes tracking as used today
*** DONE At end of react loop, if UpNodes list differs, inform hosed mgr.

* TODO fitness_mon, the fitness monitor
** DONE Map key & val sketch

Logical sketch:

  Map key: ObservingServerName::atom()
  Map val: { ObservingServerLastModTime::now(),
             UnfitList::list(ServerName::atom()),
             Props::proplist() }

Implementation sketch:

1. Use a CRDT map.
2. If the map key can't be an atom, then converting atom->string or
   atom->binary is fine.
3. For the map value, is it possible to use a CRDT LWW type?
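Below is a minimal, hedged sketch of how the map key/value above might be
represented with riak_dt, assuming a riak_dt_map whose fields are
riak_dt_lwwreg registers keyed by the observing server's name (converted
atom->binary, per note 2).  The module name fitness_map_sketch and its
exported functions are hypothetical, and the exact riak_dt op formats should
be checked against the riak_dt version actually pulled in as a dependency.

#+BEGIN_SRC erlang
%% Hypothetical sketch only: one riak_dt_lwwreg field per observing server,
%% holding {LastModTime, UnfitList, Props} as a single last-write-wins value.
-module(fitness_map_sketch).
-export([new/0, observer_update/4, merge/2, value/1]).

new() ->
    riak_dt_map:new().

%% Record one observer's view.  The map key is the observer's name converted
%% atom -> binary (note 2 above); the value is the whole 3-tuple from the
%% logical sketch, written as a last-write-wins register (note 3 above).
observer_update(ObserverName, UnfitList, Props, Map0) ->
    Field = {atom_to_binary(ObserverName, utf8), riak_dt_lwwreg},
    {MSec, Sec, USec} = Now = os:timestamp(),
    TS  = (MSec * 1000000 + Sec) * 1000000 + USec,   %% LWW timestamp
    Val = {Now, UnfitList, Props},
    Op  = {update, [{update, Field, {assign, Val, TS}}]},
    {ok, Map1} = riak_dt_map:update(Op, ObserverName, Map0),
    Map1.

%% CRDT merge of two observers' maps: commutative, so gossip order is safe.
merge(MapA, MapB) ->
    riak_dt_map:merge(MapA, MapB).

%% Read back a plain [{KeyBin, {LastModTime, UnfitList, Props}}] view.
value(Map) ->
    [{Name, V} || {{Name, riak_dt_lwwreg}, V} <- riak_dt_map:value(Map)].
#+END_SRC

If a single LWW register per value turns out to be a bad idea (note 3), the
same shape could instead use one map field per tuple element; the sketch just
takes the simplest reading of the logical sketch above.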
** DONE Investigate riak_dt data structure definition, manipulation, etc.
** DONE Add dependency on riak_dt
** WORKING Update is an entire dict from Observer O
   (A sketch of this merge/tick flow appears at the end of this section.)
*** TODO Merge my pending map + update map + my last mod time + my unfit list
*** TODO If merged /= pending:
**** TODO Schedule async tick (more)
     The tick message contains the list of servers whose state differs as of
     this instant in time ... we want to avoid triggering decisions about
     fitness/unfitness for other servers where we might have received less
     than a full time period's worth of waiting.
**** TODO Spam merged map to All_list -- [Me, O]
**** TODO Set pending <- merged
*** TODO When we receive an async tick
**** TODO Set active map <- pending map for all servers in the tick's list
**** TODO Send ch_mgr a react_to_env tick trigger
*** TODO react_to_env tick trigger actions
**** TODO Filter active map to remove stale entries (i.e. no update in 1 hour)
**** TODO If time since last map spam is too long, spam our *pending* map
**** TODO Proceed with normal react processing, using the *active* map for AllHosed!
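For the WORKING item above ("Update is an entire dict from Observer O"), here
is a hedged sketch of the merge-and-spam decision, reusing the riak_dt map
from the earlier sketch.  Everything here is illustrative: the record fields,
spam_map_to/2, and schedule_async_tick/1 are hypothetical names, not the real
machi_fitness plumbing.

#+BEGIN_SRC erlang
%% Hypothetical sketch of the merge step; not the real machi_fitness server.
-module(fitness_update_sketch).
-export([handle_observer_update/4]).

-record(st, {me            :: atom(),      %% my server name
             all_list = [] :: [atom()],    %% All_list
             pending       :: term(),      %% pending riak_dt_map
             active        :: term()}).    %% active riak_dt_map

%% Observer O sent us its entire map (UpdateMap).  Merge it into our pending
%% map along with our own latest observation (MyObservationOp is a riak_dt_map
%% update op carrying my last mod time + my unfit list).  If the merge changed
%% anything, spam the merged map to All_list -- [Me, O], schedule an async
%% tick naming the servers whose entries differ, and set pending <- merged.
handle_observer_update(Observer, UpdateMap, MyObservationOp,
                       #st{me=Me, all_list=All, pending=Pending0}=S) ->
    Merged0 = riak_dt_map:merge(Pending0, UpdateMap),
    {ok, Merged} = riak_dt_map:update(MyObservationOp, Me, Merged0),
    OldV = riak_dt_map:value(Pending0),
    NewV = riak_dt_map:value(Merged),
    case NewV =:= OldV of
        true ->
            S;                               %% merged == pending: nothing to do
        false ->
            Changed = [K || {K, V} <- NewV,
                            proplists:get_value(K, OldV) /= V],
            [spam_map_to(Srv, Merged) || Srv <- All -- [Me, Observer]],
            schedule_async_tick(Changed),
            S#st{pending=Merged}
    end.

%% Placeholder plumbing: the real code would use whatever messaging and timer
%% machinery the fitness monitor process already has.
spam_map_to(_Server, _MergedMap) ->
    ok.

schedule_async_tick(ChangedServers) ->
    erlang:send_after(1000, self(), {fitness_tick, ChangedServers}).
#+END_SRC

The async tick handler (not shown) would then copy pending -> active for only
the servers named in the tick, poke ch_mgr with a react_to_env trigger, and
let the stale-entry filtering and AllHosed calculation proceed from the
*active* map, as the items above describe.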