Commit graph

855 commits

Author SHA1 Message Date
Scott Lystig Fritchie
6b8a3cf2a4 WIP: epoch ID added to checksum protocol command 2015-04-02 20:49:45 +09:00
Scott Lystig Fritchie
9479baac46 WIP: epoch ID added to read protocol command 2015-04-02 20:31:10 +09:00
Scott Lystig Fritchie
44bb5e1dae WIP: epoch ID added to append protocol command 2015-04-02 18:08:42 +09:00
Scott Lystig Fritchie
030d2ecd10 Update TODO-shortterm.org + minor stuff 2015-04-02 17:42:26 +09:00
Scott Lystig Fritchie
5580098d49 Refactor to use record for FLU state, add dbg mode 2015-04-02 17:16:15 +09:00
Scott Lystig Fritchie
e5dba66eb0 Status update for the master branch 2015-04-02 16:41:12 +09:00
Scott Lystig Fritchie
4c3bd81689 Add machi_projection.erl and basic new() test 2015-04-02 16:24:32 +09:00
Scott Lystig Fritchie
58fa35a674 Remove escript-related proof-of-concept stuff from machi_util.erl
I'd first thought that having that code there would be a kind of
useful reminder: please move me somewhere else.  However, there's
quite a bit there that's "cluster of clusters" stuff and not
appropriate for the current short-term work.
2015-04-02 14:36:22 +09:00
Scott Lystig Fritchie
f8263c15cc Move FLU client 'verify checksums + local path' code from prototype/demo-day-hack 2015-04-02 12:38:12 +09:00
Scott Lystig Fritchie
76fcd4d931 Move FLU client 'verify checksums' code from prototype/demo-day-hack 2015-04-01 18:35:10 +09:00
Scott Lystig Fritchie
5c20ee6337 Fix client API for file list & checksum list 2015-04-01 18:02:16 +09:00
Scott Lystig Fritchie
d243ffca23 Single server client & server code (squashed) 2015-04-01 16:14:24 +09:00
Scott Lystig Fritchie
78f2ff4bbf Number section headings, clarify flapping behavior, add prototype notes
Fix #+END_QUOTE typo
2015-03-14 12:06:50 +09:00
Scott Lystig Fritchie
c2f8b3a478 Add a bit of interpretation advice for the chain manager simulator 2015-03-04 13:01:38 +09:00
Scott Lystig Fritchie
7c0092b0e4 Fix typo in chain-self-management-sketch.org 2015-03-04 12:26:43 +09:00
Scott Lystig Fritchie
e3307587d1 Update prototype/README.md 2015-03-03 20:15:00 +09:00
Scott Lystig Fritchie
e0066660ef Merge branch 'slf/manager-cleanup1' 2015-03-03 20:10:26 +09:00
Scott Lystig Fritchie
54266c4196 More docs 2 2015-03-03 20:09:32 +09:00
Scott Lystig Fritchie
a69db1da64 More docs, minor code cleanup 2015-03-03 18:45:52 +09:00
Scott Lystig Fritchie
fdddac99ab Separate the PULSE and non-PULSE test code 2015-03-03 18:31:54 +09:00
Scott Lystig Fritchie
7c0e174a3d Round 1 of doc updates 2015-03-03 17:59:04 +09:00
Scott Lystig Fritchie
26f08e62ec Remove obsolete & duplicate documentation, etc 2015-03-03 17:10:30 +09:00
Scott Lystig Fritchie
8487d5759d Initial cleanup 2015-03-03 16:49:32 +09:00
Scott Lystig Fritchie
a4c3b16357 make clean tweak 2015-03-03 16:43:56 +09:00
Scott Lystig Fritchie
2de061900c Update re-porting status in top README 2015-03-03 16:39:04 +09:00
Scott Lystig Fritchie
e1fcbd8bb0 Merge branch 'slf/tango-cleanup1' 2015-03-03 16:31:13 +09:00
Scott Lystig Fritchie
f973473d47 Remove test/pulse_util dir 2015-03-03 16:30:29 +09:00
Scott Lystig Fritchie
3cd5088b39 Fix up READMEs 2015-03-03 16:28:50 +09:00
Scott Lystig Fritchie
ff7c02d2dd Fix up 'make clean', TODO list 2015-03-03 16:22:05 +09:00
Scott Lystig Fritchie
9eda779f6e Clean up test code and corfurl-specific docs 2015-03-03 16:01:41 +09:00
Scott Lystig Fritchie
1ea0c302ec Now working on Tango prototype re-porting 2015-03-03 15:16:47 +09:00
Scott Lystig Fritchie
54f95481b5 Merge branch 'slf/corfurl-cleanup1' 2015-03-03 15:10:23 +09:00
Scott Lystig Fritchie
769ac0bd03 Reformat C2 example in prototype/corfurl/docs/corfurl/notes/README.md 2015-03-03 15:07:32 +09:00
Scott Lystig Fritchie
8ddb62d88f Aw, heck, add the PNG versions of the MSC diagrams 2015-03-03 15:03:08 +09:00
Scott Lystig Fritchie
37044a9ef4 Update top-level README.md 2015-03-03 14:58:46 +09:00
Scott Lystig Fritchie
c148ed8d66 Fix up PULSE code & documentation 2015-03-03 14:56:26 +09:00
Scott Lystig Fritchie
fbd2b6c31d Fix up README & using-pulse docs, other fixups 2015-03-03 14:09:39 +09:00
Scott Lystig Fritchie
12d2411dfc Targets all, compile, clean, and test seem to work 2015-03-03 11:57:08 +09:00
Scott Lystig Fritchie
2371c40815 Add NOTICE 2015-03-02 21:06:31 +09:00
Scott Lystig Fritchie
c5f9419048 Remove cruft from README.md regarding old repo 2015-03-02 21:04:18 +09:00
Scott Lystig Fritchie
9fbf13f91e Add sad & sorry first draft of README.md 2015-03-02 21:02:15 +09:00
Scott Lystig Fritchie
8e004cf93d Merge branch 'merge/demo-day-hack' 2015-03-02 20:58:32 +09:00
Scott Lystig Fritchie
29868678a4 Add file0_test.escript (and big squash)
Small cleanups

Small cleanups

Refactoring argnames & order for more consistency

Add server-side-calculated MD5 checksum + logging

file:consult() style checksum management, too slow! 513K csums = 105 seconds, ouch

Much faster checksum recording

Add checksum_list. Alas, line-by-line I/O is slow, neh?

Much faster checksum listing

Add file0_verify_checksums.escript and supporting code

Adjust escript +A and -smp flags

Add file0_compare_filelists.escript

First draft of file0_repair_server.escript

First draft of file0_repair_server.escript, part 2

WIP of file0_repair_server.escript, part 3

WIP of file0_repair_server.escript, part 4

Basic repair works, it seems, hooray!

When checksum file ordering is different, try a cheap(?) 'cmp' on sorted results instead

Add README.md

Initial import of szone_chash.erl

Add file0_cc_make_projection.escript and supporting code

Add file0_cc_map_prefix.escript and supporting code

Change think-o: hash output is a chain, silly boy

Add file0_cc_1file_write_redundant.escript and support

Add file0_cc_read_client.escript and supporting code

Add examples/servers.map & file0_start_servers.escript

WIP: working on file0_cc_migrate_files.escript

File migration finished, works, yay!

Add basic 'what am I' docs to each script

Add file0_server_daemon.escript

Minor fixes

Fix broken unit test

Add basho_bench run() commands for append & read ops with projection

Add to examples dir

WIP: erasure coding hack, part 1

Fix broken unit test

WIP: erasure coding hack, part 2

WIP: erasure coding hack, part 3, EC data write is finished!

WIP: erasure coding hack, part 4, EC data read still in progress

WIP: erasure coding hack, part 5, EC data read still in progress

WIP: erasure coding hack, part 5b, EC data read still in progress

WIP: erasure coding hack, EC data read finished!

README update, part 1

README update, part 2

Oops, put back the printed output for file-write-client and 1file-write-redundant-client

README update, part 3

Fix 'user' output bug in list-client

Ugly hacks to get output/no-output from write clients

Clean up minor output bugs

Clean up minor output bugs, part 2

README update, part 4

Clean up minor output bugs, part 3

Clean up minor output bugs, part 5

Clean up minor output bugs, part 6

README update, part 6

README update, part 7

README update, part 7

README update, part 8

Final edits/fixes for demo day

Fix another oops in the README/demo day script
2015-03-02 20:57:17 +09:00
Scott Lystig Fritchie
ed762b71b3 Cleanup for unit tests 2015-03-02 20:55:58 +09:00
Scott Lystig Fritchie
cb08983697 Initial import 2015-03-02 20:55:58 +09:00
Scott Lystig Fritchie
fc74861d99 Merge branch 'merge/chain-manager' 2015-03-02 20:27:13 +09:00
Scott Lystig Fritchie
240b7abe2a Rename prototype/poc-machi -> prototype/chain-manager 2015-03-02 20:25:40 +09:00
Scott Lystig Fritchie
f772fb334a Added 4th FLU, yes, the partially connected/flapping counting & dampening works!
So, it definitely works, in that it stops a low(er) ranking flapping
process from continuing to make new proposals, so then the cycle of
flapping stops.  Whenever an up/down state changes and a new/different
proposal is made, then things immediately resume, yay.
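
A minimal sketch of the counting-and-dampening idea described above, not the chain manager's actual code. It assumes a proposal can be summarized as a hashable value and that a process stops proposing once it has repeated the same proposal a fixed number of times; the class name `FlapDamper`, the `limit` parameter, and the tuple layout are all hypothetical. Ranking between processes is not modeled here.

```python
class FlapDamper:
    """Counts consecutive repeats of the same proposal (hypothetical helper)."""

    def __init__(self, limit=5):
        self.limit = limit      # assumed threshold, not taken from the source
        self.last = None        # most recent proposal seen
        self.flaps = 0          # consecutive repeats of `last`

    def should_propose(self, proposal):
        """Return True if the process may go ahead and make this proposal."""
        if proposal == self.last:
            # Same proposal again: count it as one more flap.
            self.flaps += 1
        else:
            # A new/different proposal (e.g. an up/down change): resume at once.
            self.last = proposal
            self.flaps = 0
        return self.flaps < self.limit


damper = FlapDamper(limit=3)
same = (("a",), ("b", "d", "c"), ())        # (upi, repair, down)
changed = (("a", "d"), ("c",), ("b",))
decisions = [damper.should_propose(same) for _ in range(5)]
resumed = damper.should_propose(changed)
```

With `limit=3`, the repeated proposal is allowed three times and then suppressed, and the first different proposal is allowed again immediately.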

However, there's still a problem of the chain state at this time,
I believe.  Here's a session that's damped by the flap counter:

    SET always_last_partitions ON ... we should see convergence to correct chains.
    21:23:03.170 d uses: [{epoch,457},{author,a},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.270 c uses: [{epoch,457},{author,a},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.471 a uses: [{epoch,459},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{repair_airquote_done,{we_agree,457}},{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.611 b uses: [{epoch,460},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.635 d uses: [{epoch,461},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.672 c uses: [{epoch,461},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.873 a uses: [{epoch,462},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:04.155 d uses: [{epoch,463},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.198 c uses: [{epoch,463},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.270 b uses: [{epoch,464},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.276 a uses: [{epoch,465},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.652 d uses: [{epoch,466},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.660 c uses: [{epoch,466},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.679 a uses: [{epoch,467},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.914 b uses: [{epoch,468},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:05.058 d uses: [{epoch,469},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.062 c uses: [{epoch,469},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.081 a uses: [{epoch,470},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.579 b uses: [{epoch,471},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.581 d uses: [{epoch,472},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.590 c uses: [{epoch,472},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.885 a uses: [{epoch,473},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:06.102 d uses: [{epoch,474},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.159 c uses: [{epoch,474},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.250 b uses: [{epoch,475},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.288 a uses: [{epoch,476},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.612 d uses: [{epoch,477},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.620 c uses: [{epoch,477},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.691 a uses: [{epoch,478},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.893 b uses: [{epoch,479},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:07.015 d uses: [{epoch,480},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.022 c uses: [{epoch,480},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.094 a uses: [{epoch,481},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.516 d uses: [{epoch,482},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.550 b uses: [{epoch,483},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    {FLAP: c flaps 4}!
    {FLAP: c flaps 5}!
    21:23:07.898 a uses: [{epoch,484},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:08.010 d uses: [{epoch,485},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,8}}}]}]
    21:23:08.013 c uses: [{epoch,485},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,8}}}]}]
    21:23:08.221 b uses: [{epoch,486},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,8}}}]}]
    {FLAP: a flaps 5}!
    {FLAP: a flaps 6}!

    SET always_last_partitions OFF ... let loose the dogs of war!
    21:23:17.349 b uses: [{epoch,495},{author,b},{upi,[b]},{repair,[c,d,a]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[islands_not_supported]},{hooray,{v2,{2014,11,8},{21,23,17}}}]}]

So, the state of the chains at 21:23:11.221, three seconds after
the flapping detector finished, is:

    epoch=484, UPI=[a,d], repair=[c],     nodes_up=[a,c,d]
    epoch=485, UPI=[a],   repair=[b,d,c], nodes_up=[a,b,c,d]
    epoch=486, UPI=[b],   repair=[c,d],   nodes_up=[b,c,d]

The UPIs are overlapping, derp.  That won't work, thanks to the magic
of epoch version # enforcement.  However, the clients also need to
concern themselves with the repairing members.  As soon as a client
in epoch=486 sends an op to FLU c or FLU d, those nodes will
wedge themselves because they're in a different epoch.  Everyone
will get stuck, and then life sucks.
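
The situation above can be sketched in a few lines. This is only an illustration under stated assumptions, not the prototype's code: `Projection`, `Flu`, `overlapping_upis`, and the `"error_wedged"` reply are hypothetical names, and the rule that a wedged FLU keeps refusing ops until it is given a new projection is assumed rather than taken from this log. The epochs and member lists are the three chain states listed above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Projection:
    epoch: int
    upi: tuple
    repair: tuple


def overlapping_upis(projections):
    """Return (epoch, epoch) pairs whose UPI lists share a member."""
    pairs = []
    for p, q in combinations(projections, 2):
        if set(p.upi) & set(q.upi):
            pairs.append((p.epoch, q.epoch))
    return pairs


class Flu:
    """A FLU that enforces the epoch number on every client op (sketch)."""

    def __init__(self, name, epoch):
        self.name = name
        self.epoch = epoch
        self.wedged = False

    def handle_op(self, client_epoch):
        if self.wedged:
            return "error_wedged"
        if client_epoch != self.epoch:
            # Epoch mismatch: wedge and refuse further ops (assumed behavior).
            self.wedged = True
            return "error_wedged"
        return "ok"


# The three chain states from the log above.
states = [
    Projection(484, ("a", "d"), ("c",)),
    Projection(485, ("a",), ("b", "d", "c")),
    Projection(486, ("b",), ("c", "d")),
]
overlaps = overlapping_upis(states)

# FLU c is using epoch 485; a client holding epoch 486 sends it an op.
flu_c = Flu("c", 485)
first_reply = flu_c.handle_op(486)
second_reply = flu_c.handle_op(485)
```

Here epochs 484 and 485 both list FLU a in their UPI, and the single op from the epoch=486 client is enough to wedge FLU c, after which even a client with the matching epoch is refused.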

Future work TBD!
2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
69d08c4328 Add partially-disconnected/asymmetric network partition support, a first attempt 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
a9229df8e2 WIP: flap detection, broken right now 2015-03-02 20:20:20 +09:00