Commit graph

1024 commits

Author SHA1 Message Date
Scott Lystig Fritchie
4dd3ccf10c Merge branch 'merge/corfurl-master' 2015-03-02 18:12:46 +09:00
Scott Lystig Fritchie
2bf28122c1 Fix typos in docs/corfurl.md 2015-03-02 18:10:46 +09:00
Scott Lystig Fritchie
22f46c329d Add annoying & verbose TODO reminder for FILL implementation fixing! 2015-03-02 18:10:46 +09:00
Scott Lystig Fritchie
1c5e8d3726 Change env var BITCASK_PULSE -> USE_PULSE 2015-03-02 18:10:46 +09:00
Scott Lystig Fritchie
edd5b62563 del prototype/corfurl/README.old.md 2015-03-02 18:10:46 +09:00
Scott Lystig Fritchie
305cf34a2d Move old README.md -> README.old.md, create new README.md 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
c9764bf5f6 Add new docs/corfurl/notes/README.md stuff
and also:

Add CORFU papers section
Merge corfurl.md and CONCEPTS.md
Add one more CORFU-related paper
Delete prototype/corfurl/docs/CONCEPTS.md
2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
8b105672b1 Bugfix for read-repair (thanks PULSE), model change to handle aborted writes 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
b7b9255f5f Partial fix for bug in last commit, but not good enough 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
6858041c7d See comments added by this commit for append_page() bug found, racing with epoch change 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
40c28b79bb PULSE test now uses corfurl_client (retry logic) for all ops 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
7ac1e7f178 Add retry loop for read_page/2, fill_page/2, trim_page/2 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
1f0e43d33f Fix dumb think-o in corfurl_client:append_page() retry counter 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
04f2105df0 Var renaming in corfurl_client:append_page() 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
8df5326b0c Try to restart the sequencer only if it looks like nobody else has 2015-03-02 18:08:29 +09:00
Scott Lystig Fritchie
0b031bcf0a Change polling constants to deal with PULSE's evil 2015-03-02 18:08:28 +09:00
Scott Lystig Fritchie
fb1216649c Finish very basic PULSE testing of stopping & restarting the sequencer 2015-03-02 18:08:28 +09:00
Scott Lystig Fritchie
63d1c93fc9 Fix silly-dumb errors in seal epoch comparisons 2015-03-02 18:08:28 +09:00
Scott Lystig Fritchie
96b561cde9 Fix broken EUnit tests 2015-03-02 18:08:28 +09:00
Scott Lystig Fritchie
d93572c391 Refactoring to implement stop_sequencer command 2015-03-02 18:08:24 +09:00
Scott Lystig Fritchie
d5091358ff Put the sequencer pid inside the projection 2015-03-02 18:06:52 +09:00
Scott Lystig Fritchie
a64a09338d Fix broken EUnit tests (been in PULSE land too long) 2015-03-02 18:06:48 +09:00
Scott Lystig Fritchie
20a2a51649 Partial fix (#2 of 2) for model problem in honest write-vs-trim race 2015-03-02 18:05:03 +09:00
Scott Lystig Fritchie
638a45e8cb Partial fix for model problem in honest write-vs-trim race 2015-03-02 18:05:03 +09:00
Scott Lystig Fritchie
eabebac6f2 Fix PULSE model difficulty of how to handle races between write & trim.
This trim race is (as far as I can tell) fine -- I see no correctness
problem with CORFU, on the client side or the server side.  However,
this race with a trim causes a model problem that I believe can be
solved this way:

1. We must keep track of the fact that the page write is happening:
someone can notice the write via read-repair or even a regular read by
the tail.  We do this in basically the way that all other writes
are handled in the ValuesR relation.

2. Add new code to client-side writer: if there's a trim race, *and*
if we're using PULSE, then return a special error code that says that
the write was ok *and* that we raced with trim.

2b. If we aren't using PULSE, just return {ok, LPN}.

3. For the transition check property, treat the new return code as if
it is a w_tt.  Actually, we use a special marker atom, w_special_trimmed
for that purpose, but it is later treated the same way that w_tt is by the
filter_transition_trimfill_suffixes() filter.
2015-03-02 18:05:02 +09:00
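The numbered steps in commit eabebac6f2 above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the module name, the `UsingPulse` boolean, the `{special_trimmed, LPN}` tuple, and the `w_ok` atom are all assumptions made for the example. Only `{ok, LPN}`, `w_special_trimmed`, and `w_tt` come from the commit message itself.

```erlang
-module(trim_race_sketch).
-export([write_result/2, transition/1, normalize/1]).

%% Steps 2 and 2b: choose the client-side return value for a write
%% that raced with a trim. UsingPulse is a hypothetical boolean flag;
%% {special_trimmed, LPN} is an invented stand-in for the special code.
write_result(LPN, true) ->
    {special_trimmed, LPN};
write_result(LPN, false) ->
    {ok, LPN}.

%% Step 3: the model tags the special return code with a marker atom.
%% w_ok is a hypothetical name for an ordinary successful write.
transition({special_trimmed, _LPN}) ->
    w_special_trimmed;
transition({ok, _LPN}) ->
    w_ok.

%% Later filtering treats the marker exactly like w_tt.
normalize(w_special_trimmed) ->
    w_tt;
normalize(Other) ->
    Other.
```

Under these assumptions, `trim_race_sketch:normalize(trim_race_sketch:transition(trim_race_sketch:write_result(1, true)))` evaluates to `w_tt`, which mirrors the commit's statement that the marker is eventually handled the same way as `w_tt`.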
Scott Lystig Fritchie
13e15e0ecf Add MSC charts to help explain BAD-looking trim race 2015-03-02 18:05:02 +09:00
Scott Lystig Fritchie
d077148b47 Attempt to fix unimplemented corner case, thanks PULSE! 2015-03-02 18:05:02 +09:00
Scott Lystig Fritchie
b7e3f91931 Add ?EVENT_LOG() to add extra trace info to corfurl and corfurl_flu 2015-03-02 18:05:02 +09:00
Scott Lystig Fritchie
479efce0b1 Make PULSE model aware of read-repair for 'error_trimmed' races
The read operation isn't a read-only operation: it can trigger
read-repair in the case where a hole is discovered.  The PULSE
model needs to be aware of this kind of thing.

Imagine that we have a 3-way race, between an append to LPN 1,
a read of LPN 1, and a trim of LPN 1.  There is a single chain
of length 3.  The FLUs in the chain are annotated below with
"F1", "F2", and "F3".  Note also the indentation levels, with
F1's indentation smaller than F2's, which is smaller than F3's.

 2,{call,<0.8748.3>,{append,<<0>>,will_be,1}}},
 4,{call,<0.8746.3>,{read,1}}},
 6,{call,<0.8747.3>,{trim,1,will_fail,error_unwritten}}},

 6, Read has contacted tail of chain, it is unwritten.  Time for repair.
 6,{read_repair,1,[<0.8741.3>,<0.8742.3>,<0.8743.3>]}},

 6,  F1:{flu,write,<0.8741.3>,1,ok}},
 7,  F1:{flu,trim,<0.8741.3>,1,ok}},  % by repair

 9,{read_repair,1,fill,<0.8742.3>}},

 9,          F2:{flu,trim,<0.8742.3>,1,error_unwritten}},

 9,{read_repair,1,<0.8741.3>,trimmed}},

10,{result,<0.8747.3>,error_unwritten}},
   Trim operation from time=6 stops here

10,          F2:{flu,write,<0.8742.3>,1,ok}},
11,          F2:{flu,fill,<0.8742.3>,1,error_overwritten}},

12,                  F3:{flu,write,<0.8743.3>,1,ok}},

12,{read_repair,1,fill,<0.8742.3>,overwritten,try_trim}},

13,{result,<0.8748.3>,{ok,1}}}, % append/write to LPN 1

13,          F2:{flu,trim,<0.8742.3>,1,ok}},

14,{read_repair,1,fill,<0.8743.3>}},
15,                  F3:{flu,fill,<0.8743.3>,1,error_overwritten}},

16,{read_repair,1,fill,<0.8743.3>,overwritten,try_to_trim}},
17,                  F3:{flu,trim,<0.8743.3>,1,ok}},

18,{result,<0.8746.3>,error_trimmed}}]
2015-03-02 18:05:02 +09:00
Scott Lystig Fritchie
a7dd78d8f1 Switch to Lamport clocks for PULSE verifying 2015-03-02 18:04:59 +09:00
Scott Lystig Fritchie
5420e9ca1f Bugfix for read repair: if trimmed, try fill first then trim 2015-03-02 18:03:10 +09:00
Scott Lystig Fritchie
88d44722be Fix PULSE model bug of adding multiple same values to orddict 2015-03-02 18:03:10 +09:00
Scott Lystig Fritchie
8ec5f04903 Bug: PULSE found a way to reach a 'left_off_here' corner case, sweet 2015-03-02 18:03:10 +09:00
Scott Lystig Fritchie
e40394a3a7 Bugfix: yet another race in read_repair, sweet 2015-03-02 18:03:10 +09:00
Scott Lystig Fritchie
370c57b78a Bug: corfurl:read_repair_chain() should use trim when it encounters error_trimmed 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
fd32bcb308 Fix PULSE model to accommodate API change from previous commit.
Now 1+ trim & fill transitions are collapsed to a single 'w_t+' atom.
The atom name is a bit odd; think about regexps and it hopefully
makes sense.
2015-03-02 18:03:09 +09:00
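The collapsing described in commit fd32bcb308 above behaves like the regexp `(trim|fill)+` being replaced by a single token. A minimal sketch of that idea follows, assuming transitions are held in a plain list of atoms. The module name and the atoms `w_ft` (fill) and `w_0` are hypothetical; `w_tt` and `'w_t+'` are the names that appear in the commit messages, and the actual model code may be organized quite differently.

```erlang
-module(collapse_sketch).
-export([collapse/1]).

%% Collapse every run of one or more trim/fill transitions into a
%% single 'w_t+' atom, leaving all other transitions untouched.
%% w_ft is a hypothetical atom for a fill transition.
collapse([]) ->
    [];
collapse([T | Rest]) when T =:= w_ft; T =:= w_tt ->
    ['w_t+' | collapse(drop_trimfill(Rest))];
collapse([T | Rest]) ->
    [T | collapse(Rest)].

%% Skip the remainder of the current trim/fill run.
drop_trimfill([T | Rest]) when T =:= w_ft; T =:= w_tt ->
    drop_trimfill(Rest);
drop_trimfill(Rest) ->
    Rest.
```

For example, `collapse_sketch:collapse([w_0, w_tt, w_ft, w_tt])` returns `[w_0,'w_t+']`, so any number of successful trims and fills for the same LPN looks like one transition to the checker.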
Scott Lystig Fritchie
431827f65e Allow racing trim/fill and read-repair to simply "win".
This exposes a bug in the PULSE model, now that we can have multiple
successful fill/trim for the same LPN.
2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
5edee3a2cf Don't bother adding 2 when picking an LPN for fill & trim 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
d2562588f2 Move the lists:reverse() in make_chains() to preserve input's order in the output 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
e791876212 Fix silly model error when calculating values 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
f5c4474669 Derp, turn off TRIP_no_append_duplicates 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
b3ed9ef51c Add fill checking to PULSE model, minimal API coverage is complete 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
7a46709c13 Change transition type names to make better invalid transition detection 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
8a56771182 Add better condition for perhaps_trip_fill_page() 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
db6fa3d895 Fix two bugs found by PULSE in corfurl_flu.erl, yay! 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
86d4583aef Add fill support to the PULSE model 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
7dba8beae9 Refactor PULSE test for easier checking, prior to adding fill & trim. 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
78019b402f Refactor the PULSE model testing error 'trip' code 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
c80921de25 Add scan_forward() command, no result checking yet 2015-03-02 18:03:09 +09:00
Scott Lystig Fritchie
fb6b1cdc3c Fix read_page() model problem: no more false positives! 2015-03-02 18:03:09 +09:00