* maybe_gc/2 is triggered at machi_file_proxy, when chunk is deleted
and the file is larger than `max_file_size`
* A file is deleted if all chunks except 1024 bytes header are trimmed
* If a file is going to be deleted, file_proxy notifies metadata_mgr
to remember the filename persistently, whose filename is
`known_files_<FluName>`
* Such trimmed filenames are stored in a machi_plist file per flu
* machi_file_proxy could not be started if the filename is in the
manager's list. Consequently, any write, read and trim operations
cannot happen against deleted file.
* After the file was trimmed, any read request to the file returns
`{error, trimmed}`
* Disclaimer: no tests written yet and machi_plist does not support
any recovery from partial writes.
* Add some thoughts as comments for repairing trims.
* State diagram of every byte is as follows:
```
state\action| write/append | read_chunk | trim_chunk
------------+----------------+------------------+---------------
unwritten | -> written | fail (+repair) | -> trimmed
written | noop or repair | return content | -> trimmed
trimmed | fail | fail | noop
```
This is simply a change of read_chunk() protocol, where a response of
read_chunk() becomes list of written bytes along with checksum. All
related code including repair is changed as such. This is to pass all
tests and not actually supporting partial chunks.
For debuging from shell, some functions in machi_cinfo are exported:
- public_projection/1
- private_projection/1
- fitness/1
- chain_manager/1
- flu1/1
Also, add more misc details to the 'react' breadcrumb trail. Also,
save get(react) results into dbg2 whenever we write a private projection,
very valuable for debugging.
Also: cleanup PULSE code, add regression commands as option and
controls with some new environment variables. These regression
sequences were responsbile for several fruitful debugging sessions,
so we keep them for posterity and for their ability (with new seeds
and PULSE) to find new interleavings.
%% Demo/exploratory hackery to check relative speeds of dealing with
%% checksum data in different ways.
%%
%% Summary:
%%
%% * Use compact binary encoding, with 1 byte header for entry length.
%% * Because the hex-style code is *far* slower just for enc & dec ops.
%% * For 1M entries of enc+dec: 0.215 sec vs. 15.5 sec.
%% * File sorter when sorting binaries as-is is only 30-40% slower
%% than an in-memory split (of huge binary emulated by file:read_file()
%% "big slurp") and sort of the same as-is sortable binaries.
%% * File sorter slows by a factor of about 2.5 if {order, fun compare/2}
%% function must be used, i.e. because the checksum entry lengths differ.
%% * File sorter + {order, fun compare/2} is still *far* faster than external
%% sort by OS X's sort(1) of sortable ASCII hex-style:
%% 4.5 sec vs. 21 sec.
%% * File sorter {order, fun compare/2} is faster than in-memory sort
%% of order-friendly 3-tuple-style: 4.5 sec vs. 15 sec.