Also, add more misc details to the 'react' breadcrumb trail. Also,
save get(react) results into dbg2 whenever we write a private projection,
very valuable for debugging.
Also: cleanup PULSE code, add regression commands as option and
controls with some new environment variables. These regression
sequences were responsbile for several fruitful debugging sessions,
so we keep them for posterity and for their ability (with new seeds
and PULSE) to find new interleavings.
So, the PULSE test is failing, which is good. However, I believe
that the failures are all due to the model now being *too strict*.
The model is now catching failures which are now benign, I think.
{bummer_NOT_DISJOINT,{[a,b,b,c,d],
[{a,not_in_this_epoch},
{b,not_in_this_epoch},
{c,"[{epoch,1546},{author,c},{upi,[c]},{repair,[b]},{down,[a,d]},{d,[{ps,[{a,c},{c,a},{a,d},{b,d},{c,d}]},{nodes_up,[b,c]}]},{d2,[]}]"},
{d,"[{epoch,1546},{author,d},{upi,[d]},{repair,[a,b]},{down,[c]},{d,[{ps,[{c,b},{d,c}]},{nodes_up,[a,b,d]}]},{d2,[]}]"}]}}},
In this and all other examples, the UPIs are disjoint but the
repairs are not disjoint. I believe the model ought to be
ignoring the repair list.
{bummer_NOT_DISJOINT,{[a,a,b],
[{a,"[{epoch,1174},{author,a},{upi,[a]},{repair,[]},{down,[b]},{d,[{ps,[{a,b},{b,a}]},{nodes_up,[a]}]},{d2,[]}]"},
{b,"[{epoch,1174},{author,b},{upi,[b]},{repair,[a]},{down,[]},{d,[{ps,[]},{nodes_up,[a,b]}]},{d2,[]}]"}]}}},
or
{bummer_NOT_DISJOINT,{[c,c,e],
[{a,not_in_this_epoch},
{b,not_in_this_epoch},
{c,"[{epoch,1388},{author,c},{upi,[c]},{repair,[]},{down,[a,b,d,e]},{d,[{ps,[{a,b},{a,c},{c,a},{a,d},{d,a},{e,a},{c,b},{b,e},{e,b},{c,d},{e,c},{e,d}]},{nodes_up,[c]}]},{d2,[]}]"},
{d,not_in_this_epoch},
{e,"[{epoch,1388},{author,e},{upi,[e]},{repair,[c]},{down,[a,b,d]},{d,[{ps,[{a,b},{b,a},{a,c},{c,a},{a,d},{d,a},{a,e},{e,a},{b,c},{c,b},{b,d},{b,e},{e,b},{c,d},{d,c},{d,e},{e,d}]},{nodes_up,[c,e]}]},{d2,[]}]"}]}}},
Ha, famous last words, amirite?
%% The chain sequence/order checks at the bottom of this function aren't
%% as easy-to-read as they ought to be. However, I'm moderately confident
%% that it isn't buggy. TODO: refactor them for clarity.
So, now machi_chain_manager1:projection_transition_is_sane() is using
newer, far less buggy code to make sanity decisions.
TODO: Add support for Retrospective mode. TODO is it really needed?
Examples of how the old code sucks and the new code sucks less.
138> eqc:quickcheck(eqc:testing_time(10, machi_chain_manager1_test:prop_compare_legacy_with_v2_chain_transition_check(whole))).
xxxxxxxxxxxx..x.xxxxxx..x.x....x..xx........................................................Failed! After 69 tests.
[a,b,c]
{c,[a,b,c],[c,b],b,[b,a],[b,a,c]}
Old_res ([335,192,166,160,153,139]): true
New_res: false (why line [1936])
Shrinking xxxxxxxxxxxx.xxxxxxx.xxx.xxxxxxxxxxxxxxxxx(3 times)
[a,b,c]
%% {Author1,UPI1, Repair1,Author2,UPI2, Repair2} %%
{c, [a,b,c],[], a, [b,a],[]}
Old_res ([338,185,160,153,147]): true
New_res: false (why line [1936])
false
Old code is wrong: we've swapped order of a & b, which is bad.
139> eqc:quickcheck(eqc:testing_time(10, machi_chain_manager1_test:prop_compare_legacy_with_v2_chain_transition_check(whole))).
xxxxxxxxxx..x...xx..........xxx..x..............x......x............................................(x10)...(x1)........Failed! After 120 tests.
[b,c,a]
{c,[c,a],[c],a,[a,b],[b,a]}
Old_res ([335,192,185,160,153,123]): true
New_res: false (why line [1936])
Shrinking xx.xxxxxx.x.xxxxxxxx.xxxxxxxxxxx(4 times)
[b,a,c]
%% {Author1,UPI1,Repair1,Author2,UPI2, Repair2} %%
{a, [c], [], c, [c,b],[]}
Old_res ([338,185,160,153,147]): true
New_res: false (why line [1936])
false
Old code is wrong: b wasn't repairing in the previous state.
150> eqc:quickcheck(eqc:testing_time(10, machi_chain_manager1_test:prop_compare_legacy_with_v2_chain_transition_check(whole))).
xxxxxxxxxxx....x...xxxxx..xx.....x.......xxx..x.......xxx...................x................x......(x10).....(x1)........xFailed! After 130 tests.
[c,a,b]
{b,[c],[b,a,c],c,[c,a,b],[b]}
Old_res ([335,214,185,160,153,147]): true
New_res: false (why line [1936])
Shrinking xxxx.x.xxx.xxxxxxx.xxxxxxxxx(4 times)
[c,b,a]
%% {Author1,UPI1,Repair1,Author2,UPI2, Repair2} %%
{c, [c], [a,b], c, [c,b,a],[]}
Old_res ([335,328,185,160,153,111]): true
New_res: false (why line [1981,1679])
false
Old code is wrong: a & b were repairing but UPI2 has a & b in the wrong order.
So, the PB style encoding of the Mpb_LL_WriteProjectionReq message
is about 35-36 times slower than using Erlang's term_to_binary()
and binary_to_term(). {sigh}
So, there's some cheating going on, because some of the parts of
the #projection_v1{} and #p_srvr{} records aren't fully specified.
Those parts are being specified as "opaque" in the field names, e.g.
optional bytes opaque_flap = 10;
optional bytes opaque_inner = 11;
required bytes opaque_dbg = 12;
required bytes opaque_dbg2 = 13;
The serialization that's being used is erlang term sexprs. That isn't
portable. So if/when we really need to deal with a non-Erlang
language, we'll have to straighten this out further.