Run via:
env PULSE_NOSHRINK=yes PULSE_SKIP_NEW=yes PULSE_TIME=900 make pulse
So, this one hangs here:
tick-<0.991.0>,dump_state(){prop,machi_chain_manager1_pulse,358,<0.891.0>}
At machi_chain_manager1_pulse.erl line 358, that's after the return
of run_commands(). The next verbose message should come from line
362, after the return of pulse:run(), but that message never appears.
My laptop CPU is really busy (fans running, case is hot), but both
console & disterl aren't available now, so no idea why, alas.
Ah, when I run with a console available and then run Redbug, there is
zero activity calling both machi_chain_manager1_pulse:'_' and
machi_chain_manager1:'_'
This may be related to a bad/ugly shutdown? In both hang cases,
I see at least one SASL error message such as the one below ...
BUT! There should be erlang:display() messages from the shutdown_hard()
function, which does some exit(Pid, kill) calls, but there is no output
from them! So, the killing is coming from some kind of PULSE-initiated
process shutdown/cleanup/??
=SUPERVISOR REPORT==== 16-Jul-2015::20:24:31 ===
Supervisor: {local,machi_sup}
Context: shutdown_error
Reason: killed
Offender: [{pid,<0.200.0>},
{name,machi_flu_sup},
{mfargs,{machi_flu_sup,start_link,[]}},
{restart_type,permanent},
{shutdown,5000},
{child_type,supervisor}]
Also, add more misc details to the 'react' breadcrumb trail. Also,
save get(react) results into dbg2 whenever we write a private projection,
very valuable for debugging.
Also: cleanup PULSE code, add regression commands as option and
controls with some new environment variables. These regression
sequences were responsbile for several fruitful debugging sessions,
so we keep them for posterity and for their ability (with new seeds
and PULSE) to find new interleavings.
So, the PULSE test is failing, which is good. However, I believe
that the failures are all due to the model now being *too strict*.
The model is now catching failures which are now benign, I think.
{bummer_NOT_DISJOINT,{[a,b,b,c,d],
[{a,not_in_this_epoch},
{b,not_in_this_epoch},
{c,"[{epoch,1546},{author,c},{upi,[c]},{repair,[b]},{down,[a,d]},{d,[{ps,[{a,c},{c,a},{a,d},{b,d},{c,d}]},{nodes_up,[b,c]}]},{d2,[]}]"},
{d,"[{epoch,1546},{author,d},{upi,[d]},{repair,[a,b]},{down,[c]},{d,[{ps,[{c,b},{d,c}]},{nodes_up,[a,b,d]}]},{d2,[]}]"}]}}},
In this and all other examples, the UPIs are disjoint but the
repairs are not disjoint. I believe the model ought to be
ignoring the repair list.
{bummer_NOT_DISJOINT,{[a,a,b],
[{a,"[{epoch,1174},{author,a},{upi,[a]},{repair,[]},{down,[b]},{d,[{ps,[{a,b},{b,a}]},{nodes_up,[a]}]},{d2,[]}]"},
{b,"[{epoch,1174},{author,b},{upi,[b]},{repair,[a]},{down,[]},{d,[{ps,[]},{nodes_up,[a,b]}]},{d2,[]}]"}]}}},
or
{bummer_NOT_DISJOINT,{[c,c,e],
[{a,not_in_this_epoch},
{b,not_in_this_epoch},
{c,"[{epoch,1388},{author,c},{upi,[c]},{repair,[]},{down,[a,b,d,e]},{d,[{ps,[{a,b},{a,c},{c,a},{a,d},{d,a},{e,a},{c,b},{b,e},{e,b},{c,d},{e,c},{e,d}]},{nodes_up,[c]}]},{d2,[]}]"},
{d,not_in_this_epoch},
{e,"[{epoch,1388},{author,e},{upi,[e]},{repair,[c]},{down,[a,b,d]},{d,[{ps,[{a,b},{b,a},{a,c},{c,a},{a,d},{d,a},{a,e},{e,a},{b,c},{c,b},{b,d},{b,e},{e,b},{c,d},{d,c},{d,e},{e,d}]},{nodes_up,[c,e]}]},{d2,[]}]"}]}}},