So, the problem is that the chain manager isn't finishing repair
because UPI=[a], a is a witness, and a can't do the file-listing and
other repair work that repairer FLUs need to do.
The best (?) way forward is to add some advance smarts to the
chain manager so that it doesn't propose a UPI that is 100% witnesses.
How can even computer?
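In rough terms, the guard we'd want is something like this sketch (the
function name and passing the witness list in explicitly are assumptions;
the real chain manager may learn that information differently):

    %% Reject a candidate UPI made entirely of witness servers: witnesses
    %% can't do the file listing/repair work, so such a UPI can never
    %% finish repair.
    all_witnesses_p(UPI, Witnesses) ->
        UPI /= [] andalso
            lists:all(fun(Srv) -> lists:member(Srv, Witnesses) end, UPI).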
So, there's a flavor of the flapping infinite loop problem that
can happen without flapping being detected (by the existing
flapping detector, that is). That detector relies on a series of
accepted projections to converge to a single projection repeated
X times. However, it's possible to have a race with a simulated
repair "finishing" that causes a problem so that no more
projections are ever accepted. Oops.
See also: new comments in do_react_to_env().
PULSE managed to create a situation where machi_proxy_flu_client1
would appear to fail a remote attempt to write_projection. The
client would retry, but the 1st attempt really did get through to
the server. So, if we hit this case, we try to read the projection,
and if it's exactly equal to what we tried to write, we consider the
op a success.
Ditto for write_chunk.
Fix up the eunit test to accommodate the change of semantics.
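A minimal sketch of the retry-then-verify idea, using hypothetical wrapper
names (try_write_projection/3, try_read_projection/3); the real code lives
in machi_proxy_flu_client1 and its callers:

    write_projection_verified(Sock, ProjType, Proj) ->
        case try_write_projection(Sock, ProjType, Proj) of
            ok ->
                ok;
            {error, written} = Err ->
                %% Our earlier attempt may have reached the server after all.
                %% Read the projection back: an exact match means our write
                %% really did succeed, so treat the op as a success.
                Epoch = Proj#projection_v1.epoch_number,
                case try_read_projection(Sock, ProjType, Epoch) of
                    {ok, Proj2} when Proj2 =:= Proj -> ok;
                    _                               -> Err
                end;
            Else ->
                Else
        end.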
{sigh} This is a correction to a think-o error in the
"WIP: bugfix for rare flapping infinite loop (better fix I hope)"
bugfix that I thought I had finished in the slf/chain-manager/cp-mode
branch.
Silly me, the test for myself as the author of the not_sane transition was
wrong: we don't do that kind of insanity; other nodes might, though. ^_^
%% So, I'd tried this kind of "if everyone is doing it, then we
%% 'agree' and we can do something different" strategy before,
%% and it didn't work then. Silly me. Distributed systems
%% lesson #823: do not forget the past. In a situation created
%% by PULSE, of all=[a,b,c,d,e], b & d & e were scheduled
%% completely unfairly. So a & c were the only authors ever to
%% successfully write a suggested projection to a public store.
%% Oops.
%%
%% So, we're going to keep track in #ch_mgr state for the number
%% of times that this insane judgement has happened.
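A rough sketch of that bookkeeping (the field name and record shape here are
illustrative, not the actual #ch_mgr{} definition):

    -record(ch_mgr, {name,
                     proj,
                     %% ...existing fields elided...
                     not_sane_count = 0 :: non_neg_integer()}).

    %% Bump the counter each time we judge another author's transition insane.
    note_not_sane(#ch_mgr{not_sane_count=N} = S) ->
        S#ch_mgr{not_sane_count = N + 1}.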
I'll run a set of PULSE tests (Cmd_e of the 'regression' style)
to try to confirm a fix for this pernicious little thing.
Final (?) part of the fix: add myself to SeenFlappers in
react_to_env_A30().
Also, add more misc details to the 'react' breadcrumb trail. Also,
save get(react) results into dbg2 whenever we write a private projection;
that's very valuable for debugging.
Also: clean up the PULSE code, and add regression commands as an option,
controlled by some new environment variables. These regression
sequences were responsible for several fruitful debugging sessions,
so we keep them for posterity and for their ability (with new seeds
and PULSE) to find new interleavings.
The prior commit wasn't sufficient: the range of transitions is wider than
assumed by that commit. So, we take one of two options, with a TODO task
of researching the other option.
Fix for today: We are going to game the system. We know that
C100 is going to be checking authorship relative to P_current's
UPI's tail. Therefore, we're just going to set it here (see the
sketch after the partition list below).
Why??? Because we have been using this projection safely for
the entire flapping period! ... The only other way I see is to
allow C100 to carve out an exception if the repair finished
PLUS the author_server check fails PLUS we came from here, but
that feels a bit fragile to me: if some refactoring happens in
projection_transition_is_sane() or elsewhere that causes the
author_server check to be something other than the final thing
checked, then such a refactoring would likely cause an even
harder bug to find & fix.
Conditions tested: 5 FLUs plus alternating partitions of:
[
[{a,b}], [], [{a,b}], [], [{a,b}], [], [{a,b}], [], [{a,b}], [],
[{b,a},{d,e}],
[{a,b}], [], [{a,b}], [], [{a,b}], [], [{a,b}], [], [{a,b}], []
].
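The "game the system" part, as a sketch (the wrapper function is
illustrative; author_server and upi are the #projection_v1{} fields
discussed below):

    %% Before handing P_latest to C100's sanity check, set its author to
    %% the tail of P_current's UPI, which is exactly what the authorship
    %% check will demand.
    game_author_for_c100(P_current, P_latest) ->
        Tail = lists:last(P_current#projection_v1.upi),
        P_latest#projection_v1{author_server = Tail}.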
%% We have a small problem for state transition sanity checking in the
%% case where we are flapping *and* a repair has finished. One of the
%% sanity checks in simple_chain_state_transition_is_sane() is that
%% the author of P2 in this case must be the tail of P1's UPI: i.e.,
%% it's the tail's responsibility to perform repair, therefore the tail
%% must damn well be the author of any transition that says a repair
%% finished successfully.
%%
%% The problem is that author_server of the inner projection does not
%% reflect the actual author! See the comment with the text
%% "The inner projection will have a fake author" in
%% react_to_env_A30().
%%
%% So, there's a special return value that tells us to try to check for
%% the correct authorship here.
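The caller's side of that special return value looks roughly like this (the
atom, the wrapper name, and the exact arity are illustrative, not the real
ones in the code):

    check_sane_with_author_fixup(Author1, UPI1, Repair1, Author2, UPI2,
                                 RealAuthor2) ->
        case simple_chain_state_transition_is_sane(Author1, UPI1, Repair1,
                                                   Author2, UPI2) of
            true ->
                true;
            {expected_author2, ExpectedAuthor} ->
                %% The inner projection carries a fake author, so re-check
                %% authorship here against who the author really should be.
                ExpectedAuthor == RealAuthor2;
            false ->
                false
        end.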
Ha, famous last words, amirite?
%% The chain sequence/order checks at the bottom of this function aren't
%% as easy-to-read as they ought to be. However, I'm moderately confident
%% that they aren't buggy. TODO: refactor them for clarity.
So, now machi_chain_manager1:projection_transition_is_sane() is using
newer, far less buggy code to make sanity decisions.
TODO: Add support for Retrospective mode. TODO: is it really needed?
Examples of how the old code sucks and the new code sucks less.
138> eqc:quickcheck(eqc:testing_time(10, machi_chain_manager1_test:prop_compare_legacy_with_v2_chain_transition_check(whole))).
xxxxxxxxxxxx..x.xxxxxx..x.x....x..xx........................................................Failed! After 69 tests.
[a,b,c]
{c,[a,b,c],[c,b],b,[b,a],[b,a,c]}
Old_res ([335,192,166,160,153,139]): true
New_res: false (why line [1936])
Shrinking xxxxxxxxxxxx.xxxxxxx.xxx.xxxxxxxxxxxxxxxxx(3 times)
[a,b,c]
%% {Author1,UPI1, Repair1,Author2,UPI2, Repair2} %%
{c, [a,b,c],[], a, [b,a],[]}
Old_res ([338,185,160,153,147]): true
New_res: false (why line [1936])
false
Old code is wrong: we've swapped order of a & b, which is bad.
139> eqc:quickcheck(eqc:testing_time(10, machi_chain_manager1_test:prop_compare_legacy_with_v2_chain_transition_check(whole))).
xxxxxxxxxx..x...xx..........xxx..x..............x......x............................................(x10)...(x1)........Failed! After 120 tests.
[b,c,a]
{c,[c,a],[c],a,[a,b],[b,a]}
Old_res ([335,192,185,160,153,123]): true
New_res: false (why line [1936])
Shrinking xx.xxxxxx.x.xxxxxxxx.xxxxxxxxxxx(4 times)
[b,a,c]
%% {Author1,UPI1,Repair1,Author2,UPI2, Repair2} %%
{a, [c], [], c, [c,b],[]}
Old_res ([338,185,160,153,147]): true
New_res: false (why line [1936])
false
Old code is wrong: b wasn't repairing in the previous state.
150> eqc:quickcheck(eqc:testing_time(10, machi_chain_manager1_test:prop_compare_legacy_with_v2_chain_transition_check(whole))).
xxxxxxxxxxx....x...xxxxx..xx.....x.......xxx..x.......xxx...................x................x......(x10).....(x1)........xFailed! After 130 tests.
[c,a,b]
{b,[c],[b,a,c],c,[c,a,b],[b]}
Old_res ([335,214,185,160,153,147]): true
New_res: false (why line [1936])
Shrinking xxxx.x.xxx.xxxxxxx.xxxxxxxxx(4 times)
[c,b,a]
%% {Author1,UPI1,Repair1,Author2,UPI2, Repair2} %%
{c, [c], [a,b], c, [c,b,a],[]}
Old_res ([335,328,185,160,153,111]): true
New_res: false (why line [1981,1679])
false
Old code is wrong: a & b were repairing but UPI2 has a & b in the wrong order.
%% Demo/exploratory hackery to check relative speeds of dealing with
%% checksum data in different ways.
%%
%% Summary:
%%
%% * Use compact binary encoding, with 1 byte header for entry length.
%% * Because the hex-style code is *far* slower just for enc & dec ops.
%% * For 1M entries of enc+dec: 0.215 sec vs. 15.5 sec.
%% * File sorter when sorting binaries as-is is only 30-40% slower
%% than an in-memory split (of a huge binary emulated by a file:read_file()
%% "big slurp") and sort of the same as-is sortable binaries.
%% * File sorter slows by a factor of about 2.5 if an {order, fun compare/2}
%% function must be used, because the checksum entry lengths differ.
%% * File sorter + {order, fun compare/2} is still *far* faster than external
%% sort by OS X's sort(1) of sortable ASCII hex-style:
%% 4.5 sec vs. 21 sec.
%% * File sorter {order, fun compare/2} is faster than in-memory sort
%% of order-friendly 3-tuple-style: 4.5 sec vs. 15 sec.
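For reference, the compact encoding is roughly this (a sketch; the bytes
inside each entry are not specified here):

    %% Encode one checksum entry as <<Len:8, Entry/binary>> and decode a
    %% blob of such entries back into a list.
    enc_csum_entry(Entry) when is_binary(Entry), byte_size(Entry) =< 255 ->
        <<(byte_size(Entry)):8, Entry/binary>>.

    dec_csum_entries(<<>>) ->
        [];
    dec_csum_entries(<<Len:8, Entry:Len/binary, Rest/binary>>) ->
        [Entry | dec_csum_entries(Rest)].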
So, the PB style encoding of the Mpb_LL_WriteProjectionReq message
is about 35-36 times slower than using Erlang's term_to_binary()
and binary_to_term(). {sigh}
So, there's some cheating going on, because some of the parts of
the #projection_v1{} and #p_srvr{} records aren't fully specified.
Those parts are being specified as "opaque" in the field names, e.g.
optional bytes opaque_flap = 10;
optional bytes opaque_inner = 11;
required bytes opaque_dbg = 12;
required bytes opaque_dbg2 = 13;
The serialization that's being used is Erlang term sexprs. That isn't
portable. So if/when we really need to deal with a non-Erlang
language, we'll have to straighten this out further.
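The cheat, roughly (a sketch; the real field-by-field conversion lives in
the PB translation code):

    %% Pack the not-yet-specified projection parts into the PB message's
    %% opaque bytes fields using Erlang's external term format (not portable).
    pack_opaque(Dbg, Dbg2) ->
        {term_to_binary(Dbg), term_to_binary(Dbg2)}.

    unpack_opaque(OpaqueDbg, OpaqueDbg2) ->
        {binary_to_term(OpaqueDbg), binary_to_term(OpaqueDbg2)}.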
This goes to show that mixing implementation and protocol and API
and lots of other stuff ... is cool for the quick hack to do one thing
but really sucks when trying to do more than one thing.
* Proof-of-concept only: add HTTP/1.0'ish 'PUT' interface to be the
rough equivalent of machi_cr_client:append_chunk/3
* Proof-of-concept only: add HTTP/1.0'ish 'GET' interface to be the
rough equivalent of machi_cr_client:read_chunk/4
Example use: `append_chunk`
% curl http://127.0.0.1:4444/foo -0 -T /etc/hosts -v
* Hostname was NOT found in DNS cache
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 4444 (#0)
> PUT /foo HTTP/1.0
> User-Agent: curl/7.37.1
> Host: 127.0.0.1:4444
> Accept: */*
> Content-Length: 338
>
* We are completely uploaded and fine
* HTTP 1.0, assume close after body
< HTTP/1.0 201 Created
< Location: foo.50EI18AX.21
< X-Offset: 3052
< X-Size: 338
<
* Closing connection 0
Example use: `read_chunk`
curl 'http://127.0.0.1:4444/foo.50EI18AX.21?offset=3052&size=338' -0 -v
* Hostname was NOT found in DNS cache
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 4444 (#0)
> GET /foo.50EI18AX.21?offset=3052&size=338 HTTP/1.0
> User-Agent: curl/7.37.1
> Host: 127.0.0.1:4444
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Length: 338
<
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
127.0.0.1 test.localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
# Xxxxxxx Yyyyy
192.168.99.222 zzzzz
127.0.0.1 aaaaaaaa.bb.ccccccccc.com
* Closing connection 0
=INFO REPORT==== 11-May-2015::19:50:09 ===
Chain tail a of [a] starting repair of [c]
=INFO REPORT==== 11-May-2015::19:50:12 ===
Chain tail a of [a]: repair finished in 2.438 seconds: todo_yo
* Set max length of a chain at -define(MAX_CHAIN_LENGTH, 64).
* Perturb tick sleep time of each manager
* If a chain manager L has zero members in its chain, and its local
public projection store has a projection (authored by some remote author R)
that contains L, then adopt R's projection and start humming consensus.
* Handle "cross-talk" across projection stores, when chain membership
is changed administratively, e.g. chain was [a,b,c] then changed to merely
[a], but that change only happens on a. Servers b & c continue to use
stale projections and scribble their projection suggestions to a, causing
it to flap.
What's really cool about the flapping handling is that it *works*. I
wasn't thinking about this scenario when designing the flapping logic, but
it's really nifty that this extra scenario causes a to flap and then a's
inner projection remains stable, yay!
* Add complaints when "cross-talk" is observed.
* Fix flapping sleep time throttle.
* Fix a bug in machi_projection_store.erl's bookkeeping of the
max epoch number when flapping.