* When repairing multiple chunks at once, if any chunk's repair
  fails, the whole read request and repair work fail
* Rename read_repair3 and read_repair4 to do_repair_chunks and
  do_repair_chunk in machi_file_proxy
* This pull request changes the return semantics of read_chunk(),
  which now returns every chunk included in the requested range
* The first and last chunks may be trimmed to fit the requested range
* In machi_file_proxy, unwritten_bytes is removed and replaced by
  machi_csum_table
This is simply a change to the read_chunk() protocol: the response of
read_chunk() becomes a list of written byte ranges, each paired with
its checksum. All related code, including repair, is changed
accordingly. This is enough to pass all tests, but partial chunks are
not actually supported yet.
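For illustration, a rough sketch of the new shape from a shell (the
read/3 call form, tuple layout, and SHA checksum here are assumptions
for the example, not the exact terms used by machi_file_proxy):

    %% Read bytes [0, 1024) via a machi_file_proxy pid: only written
    %% ranges come back, first and last possibly trimmed, each with
    %% its checksum.
    {ok, Chunks} = machi_file_proxy:read(Proxy, 0, 1024),
    true = lists:all(fun({_Offset, Bytes, Csum}) ->
                             crypto:hash(sha, Bytes) =:= Csum
                     end, Chunks).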
For debugging from the shell, some functions in machi_cinfo are exported:
- public_projection/1
- private_projection/1
- fitness/1
- chain_manager/1
- flu1/1
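For example, from an Erlang shell (assuming the argument is the
registered name of a locally running FLU, e.g. 'a'):

    machi_cinfo:public_projection(a).
    machi_cinfo:private_projection(a).
    machi_cinfo:fitness(a).
    machi_cinfo:chain_manager(a).
    machi_cinfo:flu1(a).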
The FLU psup starts the chain manager in active mode by default
(as it should for normal run-time operation). By adding the
{active_mode, false} tuple to the options list, we can
tell the chain manager that it should be explicitly manipulated
during tests.
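A sketch of the test-side usage, assuming the start_flu_package/4
entry point of machi_flu_psup and its argument order:

    %% Start a FLU package whose chain manager stays passive so that
    %% a test can drive it by hand.
    {ok, _Sup} = machi_flu_psup:start_flu_package(
                   a, 32900, "./data.a", [{active_mode, false}]).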
I am treating the original write-once branch as a prototype
which I am now throwing away. I had too much work interleaved
in there, so I felt like the best thing to do would be to cut
a new clean branch and pull the files over and start over
against a recent-ish master.
We will have to refactor the other things in FLU in a more
piecemeal fashion.
Last night we hit a rare case of failed convergence.
f was out of sync with the rest of the world.
f: upi=[b,g,f] repairing=[a,c]
The "rest of the world" used a larger chain at:
*: upi=[c,b,g,a], repairing=[f]
And f refused to join the larger chain because of the way that
IsRelevantToMe_p was being calculated before this commit.
Hrrrm, though, I'm not convinced that this particular problem
is fixed 100% by this patch. What if the chain lengths were
the same but also UPI incompatible? e.g. if I remove 'a' from
the "real world (in the partition simulator)" example above:
f: upi=[b,g,f] repairing=[c]
*: upi=[c,b,g], repairing=[f]
Hrmmmmm, I may need to reintroduce the my-recent-adopted-projection-
flapping-like-counter thingie to try to break this kind of
incompatible deadlock.
See comments added in this commit at A40.
So far, I've been doing CP mode testing with a handful of (very useful)
network partition combinations using:
machi_chain_manager1_converge_demo:t(3, [{private_write_verbose,true}, {consistency_mode, cp_mode}, {witnesses, [a]}]).
Next steps:
* Expand number & types of partitions
* Expand to chain lengths of 5 and beyond
So, I'm 50% sure this is a good idea for CP mode: if there's
a later public projection than P_current, then who knows what
we might have missed. So, call make_zerf() to find out the
absolute latest. Problem: flapping state appears to be lost,
booo.
If we use verbose output from:
machi_chain_manager1_converge_demo:t(3, [{private_write_verbose,true}, {consistency_mode, cp_mode}, {witnesses, [a]}]).
And use:
tail -f typescript_file | egrep --line-buffered 'SET|attempted|CONFIRM'
... then we can clearly see a chain safety violation when moving from
epoch 81 -> 83: the UPI changes from [a,b] to [a,c], and since a is a
witness, the only full FLU switches from b to c with no repair in
between, so writes acknowledged under epoch 81 can be lost. I need to
add more smarts to the safety checking, both at the individual
transition sanity check and at the converge_demo overall rolling
sanity check.
Key to output: CONFIRM by epoch {num} {csum} at {UPI} {Repairing}
SET # of FLUs = 3 members [a,b,c]).
CONFIRM by epoch 1 <<96,161,96,...>> at [a,b] [c]
CONFIRM by epoch 5 <<134,243,175,...>> at [b,c] []
CONFIRM by epoch 7 <<207,93,225,...>> at [b,c] []
CONFIRM by epoch 47 <<60,142,248,...>> at [b,c] []
SET partitions = [{c,b},{c,a}] (1 of 2) at {22,3,34}
CONFIRM by epoch 81 <<223,58,184,...>> at [a,b] []
SET partitions = [{b,c},{b,a}] (2 of 2) at {22,3,38}
CONFIRM by epoch 83 <<33,208,224,...>> at [a,c] []
SET partitions = []
CONFIRM by epoch 85 <<173,179,149,...>> at [a,c] [b]
So, the problem is that the chain manager isn't finishing repair
because UPI=[a], and a is a witness, and a can't do the list-files-etc.
repair work that repairer FLUs need to do.
The best (?) way forward is to add some advance smarts to the
chain manager so that it doesn't propose a UPI of 100% witnesses?
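A hypothetical guard (not in the codebase) showing the kind of check
those "advance smarts" might boil down to:

    %% Refuse to propose a UPI made only of witnesses: witnesses
    %% cannot do the file listing & repair work that repair needs.
    all_witnesses_p(UPI, Witnesses) ->
        UPI =/= [] andalso
            lists:all(fun(F) -> lists:member(F, Witnesses) end, UPI).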
PULSE managed to create a situation where machi_proxy_flu_client1
would appear to fail a remote attempt to write_projection. The
client would retry, but the 1st attempt really did get through to
the server. So, if we hit this case, we try to read the projection,
and if it's exactly equal to what we tried to write, we consider the
op a success.
Ditto for write_chunk.
Fix up eunit test to accommodate the change of semantics.
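A minimal sketch of the readback rule (the proxy function names and
the #projection_v1{} epoch field are assumptions here, not verified
against the module):

    write_projection_checked(Proxy, ProjType, Proj) ->
        case machi_proxy_flu_client1:write_projection(Proxy, ProjType, Proj) of
            ok ->
                ok;
            {error, written} ->
                Epoch = Proj#projection_v1.epoch_number,
                case machi_proxy_flu_client1:read_projection(Proxy, ProjType, Epoch) of
                    {ok, Proj} -> ok;  %% exactly what we wrote: a success
                    _Other     -> {error, written}
                end;
            Else ->
                Else
        end.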
Due to changes by the slf/chain-manager/cp-mode branch, there are
no longer extraneous epoch changes by "larger" authors that
re-suggest the same UPI+Repairing just because their author rank
is very slightly higher than the current epoch's. Thus the
partial_stop_restart2() test only needs to deal with one epoch
change instead of the original two.
Run via:
env PULSE_NOSHRINK=yes PULSE_SKIP_NEW=yes PULSE_TIME=900 make pulse
So, this one hangs here:
tick-<0.991.0>,dump_state(){prop,machi_chain_manager1_pulse,358,<0.891.0>}
At machi_chain_manager1_pulse.erl line 358, that's after the return
of run_commands(). The next verbose message should come from line
362, after the return of pulse:run(), but that message never appears.
My laptop CPU is really busy (fans running, case is hot), but neither
the console nor disterl is available right now, so no idea why, alas.
Ah, when I run with a console available and then run Redbug, there is
zero activity calling either machi_chain_manager1_pulse:'_' or
machi_chain_manager1:'_'.
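The trace was roughly the following (assuming redbug:start/1 accepts
a list of trace pattern strings):

    redbug:start(["machi_chain_manager1_pulse:'_'",
                  "machi_chain_manager1:'_'"]).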
This may be related to a bad/ugly shutdown? In both hang cases,
I see at least one SASL error message such as the one below ...
BUT! There should be erlang:display() messages from the shutdown_hard()
function, which does some exit(Pid, kill) calls, but there is no output
from them! So, the killing is coming from some kind of PULSE-initiated
process shutdown/cleanup/??
=SUPERVISOR REPORT==== 16-Jul-2015::20:24:31 ===
     Supervisor: {local,machi_sup}
     Context:    shutdown_error
     Reason:     killed
     Offender:   [{pid,<0.200.0>},
                  {name,machi_flu_sup},
                  {mfargs,{machi_flu_sup,start_link,[]}},
                  {restart_type,permanent},
                  {shutdown,5000},
                  {child_type,supervisor}]