Commit graph

125 commits

Author SHA1 Message Date
Scott Lystig Fritchie
666a05b50b Comment out debug diag output 2015-03-05 16:53:30 +09:00
Scott Lystig Fritchie
8cdf69c8b9 Remove expensive debugging diag when not needed 2015-03-05 16:53:30 +09:00
Scott Lystig Fritchie
f129801f29 Good news: all_hosed list agreement in all simulated cases & is correct (mostly!)
Occasionally, all_hosed will include a node from the previous
simulation round that isn't included in the current round.
I believe that this is expected, since machi_chain_manager1_test
is not trying to "reset" state to no-partitions before switching
to the next partition scenario.

Please note that the #ch_mgr.flap_limit constant is a *hack*
and needs a better implementation!
2015-03-05 16:53:30 +09:00
Scott Lystig Fritchie
2fe2a23c79 Undo proposed improvement from commit cebb205b52, it was harmful. 2015-03-05 16:53:27 +09:00
Scott Lystig Fritchie
6ed7293b02 WIP: debugging, with some good insight now (read on...)
I'm getting closer, I think.  The data that's being written to the
'flapping_i' proplist is more accurate, thanks to previous commits.
Fixing inaccuracies there can only be helpful.

We still have this basic problem:

% cat /tmp/foo16 | awk '$1 == "SET" { for (key in a) { print a[key]; a[key] = ""; } print; } { if ($3 == "uses:") { a[$2] = $0 }}' | sed -e 's/uses:.*all_hosed/uses: [...] all_hosed/' -e 's/all_flap_counts.*//'
[...]
SET partitions = [{c,a}].
01:31:55.934 d uses: [...] all_hosed,[a,c]},{
01:31:53.674 a uses: [...] all_hosed,[]},{
01:31:55.961 b uses: [...] all_hosed,[a,c]},{
01:31:52.799 c uses: [...] all_hosed,[a]},{
SET partitions = []

The 'all_hosed' values should have converged, but they haven't.  It's
very interesting that a's all_hosed list is completely empty, oops.
But it's also becoming clear that the public epochs "surrounding"
each of a's public proposals do indeed have decent all_hosed info.

For example, a's report above @ "01:31:53.674" corresponds to:

01:31:53.674 a uses: [{epoch,514},{author,a},{upi,[a]},{repair,[b,d]},{down,[c]},{d,[{flapping_i,[{flap_count,0},{all_hosed,[]},{all_flap_counts,[]},{all_flap_counts_settled,false},{bad,[c]},{da_downu,[c]},{da_hosedtu,[a,c]},{da_downreports,[{513,b,[]},{512,a,[c]},{511,d,[]}]}]},{ps,[{c,a}]},{nodes_up,[a,b,d]}]},{d2,[[]]}]

... and to:

     {514,a,
      {projection,514,
          <<173,82,84,188,162,136,57,149,29,171,29,97,139,30,241,15,128,76,236,
            126>>,
          [a,b,c,d],
          [c],
          {1425,486713,673377},
          a,
          [a],
          [b,d],
          [{flapping_i,
               [{flap_count,0},
                {all_hosed,[]},
                {all_flap_counts,[]},
                {all_flap_counts_settled,false},
                {bad,[c]},
                {da_downu,[c]},
                {da_hosedtu,[a,c]},
                {da_downreports,[{513,b,[]},{512,a,[c]},{511,d,[]}]}]},
           {ps,[{c,a}]},
           {nodes_up,[a,b,d]}],
          []}},

So FLU a knows that FLU c is down, but it just isn't flapping yet.  And it
hasn't incorporated any flapping_i from other FLUs.  If I look at the epochs
mentioned in the 'da_downreports', I see 513, 512, and 511.

513 is authored by b, flapping_i says: {flap_count,68}, {all_hosed,[a,c]},
512 is authored by a, flapping_i says: {flap_count,0}, {all_hosed,[]},
511 is authored by d, flapping_i says: {flap_count,12}, {all_hosed,[a,c]}

Looks promising, yeah?
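
For reference, here's a minimal Erlang sketch (hypothetical module &
function names, not code from this repo) of pulling the all_hosed list
out of a dbg proplist shaped like the one shown above:

    -module(flap_peek).                          %% hypothetical name
    -export([all_hosed_from_dbg/1]).

    %% Dbg is assumed to look like the printed dbg list above, e.g.
    %% [{flapping_i, FlapProps}, {ps, _}, {nodes_up, _}].
    all_hosed_from_dbg(Dbg) when is_list(Dbg) ->
        case proplists:get_value(flapping_i, Dbg) of
            undefined -> [];
            FlapProps -> proplists:get_value(all_hosed, FlapProps, [])
        end.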
2015-03-05 01:33:12 +09:00
Scott Lystig Fritchie
b8063c9575 Refactoring, plus some intermediate bug hunt results, see below
See the TODO comment for details.  The scheme that I've been using
at test time is:

    make test |& tee /tmp/foo7

  and elsewhere:

    cat /tmp/foo7 | awk '$1 == "SET" { for (key in a) { print a[key]; a[key] = ""; } print; } { if ($3 == "uses:") { a[$2] = $0 }}' | sed -e 's/uses:.*all_hosed/uses: [...] all_hosed/' -e 's/all_flap_counts.*//'

Which prints our sub-optimal results like:

SET partitions = [{c,a}].
21:09:06.603 d uses: [...] all_hosed,[a,c]},{
21:09:04.203 a uses: [...] all_hosed,[]},{
21:09:06.640 b uses: [...] all_hosed,[a,c]},{
21:09:03.409 c uses: [...] all_hosed,[a]},{

FLU a should definitely be seeing enough information to avoid
all_hosed=[].  Gadz.  To be resumed tomorrow.
2015-03-04 21:11:06 +09:00
Scott Lystig Fritchie
b558de2244 Refactor: long-overdue for some proplists code sharing 2015-03-04 19:58:01 +09:00
Scott Lystig Fritchie
4e01f5bac0 Fix checksumming bug: dbg2 is not checksummed 2015-03-04 19:26:07 +09:00
Scott Lystig Fritchie
cf56e2d388 Fix ?REACT() tags from edit long ago {sigh} 2015-03-04 18:20:33 +09:00
Scott Lystig Fritchie
cebb205b52 Improvement: if I am in all_hosed, goto A50 directly, BUT! ...
... but it also appears to have a negative impact on the gossip-hack
that I've added to try to detect when everyone has stabilized on a
value for all_hosed.  {sigh}

So, it isn't clear if this patch is worthwhile!
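
For reference, the short-circuit being tried here is roughly the
following (a sketch with hypothetical names; the real react_to_env_*
plumbing is different):

    %% Sketch only: if my own name appears in all_hosed, jump straight
    %% to state A50 instead of continuing the normal reaction chain.
    next_state(MyName, AllHosed) ->
        case lists:member(MyName, AllHosed) of
            true  -> goto_a50;
            false -> continue_normally
        end.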
2015-03-04 18:19:13 +09:00
Scott Lystig Fritchie
6e4e2ac6bf Trivial cleanup: comments & unused vars 2015-03-04 16:48:48 +09:00
Scott Lystig Fritchie
fbdf606128 TODO future research: does this flapping constant make a significant difference? 2015-03-04 16:48:17 +09:00
Scott Lystig Fritchie
938f3c16f2 Add all_flap_counts_settled, minor cleanups 2015-03-04 14:37:26 +09:00
Scott Lystig Fritchie
24fc2a773c Move calculate_flaps() call to A30 2015-03-04 14:37:24 +09:00
Scott Lystig Fritchie
77f16523f4 Swap A20 and A30's code 2015-03-04 14:37:23 +09:00
Scott Lystig Fritchie
07e477598a OK, asymmetric network partition handling is stable. Time for next stage.
Sketch:

1. Exchange the roles of A30 and A20

2. Query each participants' public & private stores, use them for A20 and A30

3. Determine if there's flapping *before* calculating a new latest.

3b. In flapping_i props, keep non-flapping UPI+Repairing and Down list.
      When calculating unique UPI+Repairing history and the unique Down lists history, use the non-flapping version if it exists.

grrr/alt idea/bogus??.... Change the calculation of the unique UPIs and unique down lists. If the UPI & Down list looks the same as last time, and last time we were flapping, then we're still flapping.

4a. Calculate a proposal using the standard Down info.
4b. If there is flapping, then use Down <- All_Hosed instead of calculated down.

5. If 4b's proposal is used ('cause flapping exists), preserve the non-flapping histories from 4a in each new proposal
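
A minimal Erlang sketch of steps 4a/4b above (hypothetical function
name, assuming the flapping test and the two candidate down-lists are
already computed elsewhere):

    %% Sketch only: pick which down-list feeds the proposal calculation.
    choose_down(Flapping_p, CalculatedDown, AllHosed) ->
        case Flapping_p of
            true  -> AllHosed;         %% 4b: flapping, so Down <- All_Hosed
            false -> CalculatedDown    %% 4a: normal case, standard Down info
        end.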
2015-03-04 14:37:22 +09:00
Scott Lystig Fritchie
ecd56a194e Yay, back to stability, I believe 2015-03-04 14:37:19 +09:00
Scott Lystig Fritchie
187ed6ddce Yay, back to stability, I believe 2015-03-04 14:37:18 +09:00
Scott Lystig Fritchie
b03f88de43 Yay, back to stability, I believe 2015-03-04 14:37:16 +09:00
Scott Lystig Fritchie
2ae8a42d07 WIP: broken, alas, still wip 2015-03-04 14:37:14 +09:00
Scott Lystig Fritchie
50dd83919a WIP: getting closer to transitive flap_count counting, but not there yet 2015-03-04 14:36:56 +09:00
Scott Lystig Fritchie
35c9f41cf4 WIP: back to PULSE model for experimenting 2015-03-04 14:36:54 +09:00
Scott Lystig Fritchie
4f28191552 Clean up some of the convergence_demo_test() code, add lots of documentation 2015-03-04 14:36:51 +09:00
Scott Lystig Fritchie
1488587dd9 WIP: by golly, it seems to work 100% well with asymmetric network partitions?! 2015-03-04 14:36:48 +09:00
Scott Lystig Fritchie
0b4e106635 WIP: getting closer. Add i_flapping info to all proposed projs @ B10.
The placement of this flapping stuff is suboptimal.  I'd prefer it to be
at A20, where all of the real projection calculation is done.  But this
is still a prototype ... I'm still trying to figure out where & how best
to react.  Moving the calculation from B-something to A-something can
come later.
2015-03-04 14:36:45 +09:00
Scott Lystig Fritchie
27f17e88ad WIP: try to store flapping/asymmetric network partition information, then act upon it 2015-03-04 14:36:39 +09:00
Scott Lystig Fritchie
f4a2a453bd WIP: experiments with flap detection and mitigation 2015-03-04 14:36:37 +09:00
Scott Lystig Fritchie
240b7abe2a Rename prototype/poc-machi -> prototype/chain-manager 2015-03-02 20:25:40 +09:00
Scott Lystig Fritchie
f772fb334a Added 4th FLU, yes, the partially connected/flapping counting & dampening works!
So, it definitely works, in that it stops a low(er)-ranking flapping
process from continuing to make new proposals, and then the cycle of
flapping stops.  Whenever an up/down state changes and a new/different
proposal is made, things immediately resume, yay.

However, I believe there's still a problem with the chain state at this
time.  Here's a session that's damped by the flap counter:

    SET always_last_partitions ON ... we should see convergence to correct chains.
    21:23:03.170 d uses: [{epoch,457},{author,a},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.270 c uses: [{epoch,457},{author,a},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.471 a uses: [{epoch,459},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{repair_airquote_done,{we_agree,457}},{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.611 b uses: [{epoch,460},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.635 d uses: [{epoch,461},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.672 c uses: [{epoch,461},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:03.873 a uses: [{epoch,462},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,3}}}]}]
    21:23:04.155 d uses: [{epoch,463},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.198 c uses: [{epoch,463},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.270 b uses: [{epoch,464},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.276 a uses: [{epoch,465},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.652 d uses: [{epoch,466},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.660 c uses: [{epoch,466},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.679 a uses: [{epoch,467},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:04.914 b uses: [{epoch,468},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,4}}}]}]
    21:23:05.058 d uses: [{epoch,469},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.062 c uses: [{epoch,469},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.081 a uses: [{epoch,470},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.579 b uses: [{epoch,471},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.581 d uses: [{epoch,472},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.590 c uses: [{epoch,472},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:05.885 a uses: [{epoch,473},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,5}}}]}]
    21:23:06.102 d uses: [{epoch,474},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.159 c uses: [{epoch,474},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.250 b uses: [{epoch,475},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.288 a uses: [{epoch,476},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.612 d uses: [{epoch,477},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.620 c uses: [{epoch,477},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.691 a uses: [{epoch,478},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:06.893 b uses: [{epoch,479},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,6}}}]}]
    21:23:07.015 d uses: [{epoch,480},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.022 c uses: [{epoch,480},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.094 a uses: [{epoch,481},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.516 d uses: [{epoch,482},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:07.550 b uses: [{epoch,483},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    {FLAP: c flaps 4}!
    {FLAP: c flaps 5}!
    21:23:07.898 a uses: [{epoch,484},{author,a},{upi,[a,d]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,7}}}]}]
    21:23:08.010 d uses: [{epoch,485},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,8}}}]}]
    21:23:08.013 c uses: [{epoch,485},{author,d},{upi,[a]},{repair,[b,d,c]},{down,[]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,8}}}]}]
    21:23:08.221 b uses: [{epoch,486},{author,b},{upi,[b]},{repair,[c,d]},{down,[a]},{d,[{author_proc,react},{ps,[{a,b}]},{nodes_up,[b,c,d]}]},{d2,[{network_islands,[na_reset_by_always]},{hooray,{v2,{2014,11,8},{21,23,8}}}]}]
    {FLAP: a flaps 5}!
    {FLAP: a flaps 6}!

    SET always_last_partitions OFF ... let loose the dogs of war!
    21:23:17.349 b uses: [{epoch,495},{author,b},{upi,[b]},{repair,[c,d,a]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c,d]}]},{d2,[{network_islands,[islands_not_supported]},{hooray,{v2,{2014,11,8},{21,23,17}}}]}]

So, the state of the chains at 21:23:11.221, three seconds after
the flapping detector finished, is:

    epoch=484, UPI=[a,d], repair=[c],     nodes_up=[a,c,d]
    epoch=485, UPI=[a],   repair=[b,d,c], nodes_up=[a,b,c,d]
    epoch=486, UPI=[b],   repair=[c,d],   nodes_up=[b,c,d]

The UPIs are overlapping, derp; that won't work, thanks to the magic
of epoch version # enforcement.  However, the clients also need to
concern themselves with the repairing members.  As soon as a client
in epoch=486 sends an op to FLU c or FLU d, those nodes will
wedge themselves because they're in a different epoch.  Everyone
will get stuck, and then life sucks.
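
A toy illustration of the epoch enforcement that causes the wedging
described above (hypothetical shapes and names, not Machi's actual
server code):

    -module(epoch_toy).                        %% hypothetical
    -export([handle_op/2]).
    -record(state, {epoch, wedged=false}).

    %% A request tagged with a different epoch is refused, and the
    %% server wedges itself until it learns the newer projection.
    handle_op({op, ClientEpoch, _Op}, #state{epoch=MyEpoch}=S)
      when ClientEpoch =/= MyEpoch ->
        {{error, bad_epoch}, S#state{wedged=true}};
    handle_op({op, _Epoch, Op}, S) ->
        {{ok, Op}, S}.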

Future work TBD!
2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
69d08c4328 Add partially-disconnected/asymmetric network partition support, a first attempt 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
a9229df8e2 WIP: flap detection, broken right now 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
9b828f87e5 WIP: flap detection, broken right now 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
7433f6cf99 WIP: deprecate proj_proposed 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
6fb3f55fee WIP: minor refactoring 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
89f81e45b1 Add machi_partition_simulator:always_these_partitions(Parts), try a hard scenario
So, this is an interesting case where an asymmetric network partition
can cause the current algorithm to cycle for several seconds; then one
participant X becomes less active (I'm not sure why), the other two
participants slowly come to an agreement, and then X seems to wake up
and return everyone to the cycle/flapping loop.
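
For context, the hard scenario is presumably set up with a call along
these lines (the shape of Parts is a guess inferred from the
{ps,[{b,a}]} entries in the session below):

    %% Guessing at the Parts format: a list of one-way {From, To} partitions.
    machi_partition_simulator:always_these_partitions([{b,a}]).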

    SET always_last_partitions ON ... we should see convergence to correct chains.
    16:35:03.986 c uses: [{epoch,321},{author,b},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:04.118 b uses: [{epoch,323},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{repair_airquote_done,{we_agree,321}},{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:04.492 c uses: [{epoch,324},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:04.520 b uses: [{epoch,325},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:04.583 a uses: [{epoch,326},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:04.894 c uses: [{epoch,327},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:04.922 b uses: [{epoch,328},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:05.291 a uses: [{epoch,329},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:05.296 c uses: [{epoch,330},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:05.324 b uses: [{epoch,331},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:05.830 c uses: [{epoch,332},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:06.023 a uses: [{epoch,333},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:06.128 b uses: [{epoch,334},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:06.342 c uses: [{epoch,335},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:06.530 b uses: [{epoch,336},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:06.734 a uses: [{epoch,337},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:06.746 c uses: [{epoch,338},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:06.932 b uses: [{epoch,339},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:07.267 c uses: [{epoch,340},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:07.334 b uses: [{epoch,341},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:07.460 a uses: [{epoch,342},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:07.669 c uses: [{epoch,343},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:07.736 b uses: [{epoch,344},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:08.165 a uses: [{epoch,345},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:08.194 c uses: [{epoch,346},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:08.541 b uses: [{epoch,347},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:08.702 c uses: [{epoch,348},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:08.894 a uses: [{epoch,349},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:08.944 b uses: [{epoch,350},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:09.212 c uses: [{epoch,351},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:09.346 b uses: [{epoch,352},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:09.598 a uses: [{epoch,353},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:09.614 c uses: [{epoch,354},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:09.748 b uses: [{epoch,355},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:10.135 c uses: [{epoch,356},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:10.150 b uses: [{epoch,357},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
    16:35:10.329 a uses: [{epoch,358},{author,a},{upi,[a]},{repair,[c]},{down,[b]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,c]}]},{d2,[]}]
    16:35:10.537 c uses: [{epoch,359},{author,c},{upi,[b]},{repair,[a,c]},{down,[]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[a,b,c]}]},{d2,[]}]
    16:35:10.552 b uses: [{epoch,360},{author,b},{upi,[b,c]},{repair,[]},{down,[a]},{d,[{author_proc,react},{ps,[{b,a}]},{nodes_up,[b,c]}]},{d2,[]}]
2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
6893d8af52 Re-introduce the 1-way partition generation scheme of olde, default=oneway_partitions
This is a return to the old, possibly asymmetric/unidirectional network
partition simulation scheme.  PULSE testing so far for the
symmetric/bidirectional partitioning scheme (via the "islands" approach)
appears to be very stable, yay.

So, let's go back to the harder environment and see what happens!
2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
098524ea2d Change verbosity of react_to_env_C120() 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
61d3aafccb Fix non-TEST compilation errors 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
e0d4dce8af Fix PULSE model to work around after-the-fact/retrospective sanity check limitation 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
2471e61cc7 Fix PULSE model problem, yay! 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
72adece716 WIP: EUnit and PULSE test fixing 2015-03-02 20:20:20 +09:00
Scott Lystig Fritchie
5d86b23851 PULSE: try to avoid false positives, add verbosity, tighten shrinking 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
6ded565f26 Add extra history to exception tuple in projection_transition_is_sane() 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
502afc7c19 Fix error in projection_transition_is_sane() 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
96f5b329c9 Tweaks for PULSE 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
39bee01936 Initial PULSE test for chain manager is done and ready for punishment 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
ed900c2a5f Fix up pid vs. atom name usage in chain manager 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
16a45660cc WIP: Initial PULSE test for chain manager 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
ac2af6d1ae Minor changes for initial PULSE testing 2015-03-02 20:20:19 +09:00
Scott Lystig Fritchie
e72f239905 Remove spammy message but keep the TODO string in place 2015-03-02 20:20:19 +09:00