Representing transformation events and timelines #200
A synchronized set of Mentat instances collaborate to build a linear timeline of states. They do so by rebasing and/or merging their changes into the remote primary timeline.
Each instance is able to perform a set of operations — assertion, retraction, excision, and schema addition/alteration — that introduce coordination points in this timeline.
Before we discuss those, it's worth mentioning that there are other operations that only affect whether clients can proceed:
Most of a client's operations are point-in-time:
The presence of one of these operations in a branch of history is important when considering merges and rebases. We must make sure that retractions and assertions are sequenced, and ensure that schema changes occur before data changes that depend on them.
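As a rough illustration of the sequencing rule, the sketch below walks a timeline and reports the first transaction that uses an attribute before any earlier transaction has defined it. The Tx shape, the attribute names, and the rule that a definition only becomes usable in later transactions are all assumptions made for this example, not Mentat's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """One transaction in a timeline (hypothetical, simplified shape)."""
    tx_id: int
    defines: frozenset = frozenset()  # attributes this tx adds to the schema
    uses: frozenset = frozenset()     # attributes this tx asserts or retracts


def first_ordering_violation(timeline, known=frozenset()):
    """Return the first tx that uses an undefined attribute, or None."""
    defined = set(known)
    for tx in timeline:
        # Assumption: schema added by a tx is only visible to later txs.
        if not tx.uses <= defined:
            return tx
        defined |= tx.defines
    return None


schema_tx = Tx(1, defines=frozenset({":page/url"}))
data_tx = Tx(2, uses=frozenset({":page/url"}))

in_order = first_ordering_violation([schema_tx, data_tx])
reordered = first_ordering_violation([data_tx, schema_tx])
```

A merge or rebase that reorders transactions could run a check like this over the candidate timeline and refuse any ordering for which it returns a transaction.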
One operation — excision (#21) — affects an existing region of timeline. An excision marker is placed in the local timeline and implies the removal of data from the device's local log.
Given that we want all clients to have the same materialized state after reaching the same point in the transaction log, it's crucial that excision processing is reproducible.
For a single client system, this is relatively simple: there is only one timeline, and all writes are fast-forward.
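For the single-client case, reproducibility can be pictured as a pure replay of the log. The following is a minimal sketch under assumed shapes: Datom and Excise are hypothetical records, an excision marker removes everything known so far about one entity, and component entities are ignored for brevity.

```python
from typing import NamedTuple


class Datom(NamedTuple):
    e: int
    a: str
    v: object
    tx: int
    added: bool  # True for an assertion, False for a retraction


class Excise(NamedTuple):
    e: int   # entity to excise
    tx: int  # transaction in which the marker was written


def materialize(log):
    """Replay a linear log into a set of (e, a, v) facts."""
    state = set()
    for entry in log:
        if isinstance(entry, Excise):
            # Drop every fact about the excised entity seen so far.
            state = {fact for fact in state if fact[0] != entry.e}
        elif entry.added:
            state.add((entry.e, entry.a, entry.v))
        else:
            state.discard((entry.e, entry.a, entry.v))
    return state


log = [
    Datom(123, ":page/url", "http://example.com/", 1, True),
    Datom(123, ":page/visit", 200, 2, True),
    Datom(456, ":page/url", "http://example.org/", 3, True),
    Excise(123, 4),
]
state = materialize(log)
```

Because the replay depends only on the order of entries in the log, any client holding the same linear log arrives at the same state.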
For a multi-client system, where a timeline might diverge and merge back together:
we must define which of the three timeline segments — shared history, new-left, new-right — are impacted by an excision on left or right. This definition must produce the same outcome when applied by either client.
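The three segments can be made concrete with a small sketch. It assumes each timeline is a list of transaction identifiers and that shared history is the longest common prefix; the identifiers are invented for illustration.

```python
def split_timelines(left, right):
    """Split two timelines into (shared, new_left, new_right)."""
    n = 0
    # Shared history is the longest common prefix of the two timelines.
    while n < len(left) and n < len(right) and left[n] == right[n]:
        n += 1
    return left[:n], left[n:], right[n:]


left = ["t0", "t1", "a1"]
right = ["t0", "t1", "b1", "b2"]

shared, new_left, new_right = split_timelines(left, right)
```

Swapping the arguments yields the same shared segment with the two new segments exchanged, which is the kind of symmetry an excision rule needs if both clients are to reach the same outcome.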
Let's look at an example.
Imagine that two clients, A and B, both know about a URL in a browser's history, http://example.com/. This has ID 123. At point T1 it's linked to three visits: 200, 201, 202. Visits are component entities.
Now A and B diverge. A adds a visit: 300. B excises all visits for 123.
This is a classic syncing problem: what happens after a merge?
Usually one of the following occurs:
- A reaches the server first, it adds the visit.
  - B detects a conflict and undoes its local deletion. This surprises the user.
  - B deletes all four visits. Sometimes it'll do this even if A's visit was later than B's excision, perhaps even months later, depending on when A and B sync. This surprises the user.
- B reaches the server first, it drops the first three visits. Depending on format, A will then reupload all four visits (which surprises the user), or just the new one (300).
Various approaches are used to try to make some of these operations durable, such as recording tombstones. That's really tricky.
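To make the spread of results concrete, the sketch below writes out the set of visits that survives under each of the merge behaviors listed above. It is only a worked enumeration of the example; the function name and outcome labels are invented here.

```python
def merge_outcomes(shared_visits, a_new_visits):
    """Surviving visits for page 123 under each merge behavior."""
    everything = set(shared_visits) | set(a_new_visits)
    return {
        "A first, B undoes its deletion": everything,
        "A first, B deletes everything": set(),
        "B first, A reuploads everything": everything,
        "B first, A uploads only its new visit": set(a_new_visits),
    }


outcomes = merge_outcomes({200, 201, 202}, {300})
distinct_results = {frozenset(v) for v in outcomes.values()}
```

The same two local histories lead to three different final states, depending only on arrival order and on the conflict policy applied.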
One of the reasons it's tricky is that it's not clear what the deletion means, because the excision operation (in Firefox Sync, at least) isn't pinned to a point on a timeline — it floats at an instant in time or an order of interaction with the server, and that is very woolly indeed.
The most obvious meaning for excision is that it applies along the current parent 'route' back to the origin. B's excision doesn't apply to A's new data. If B merges first, its excision is already recorded when A comes to merge or rebase. If A merges first, B knows how to rewrite its excision to apply only to earlier data.
(Indeed, B might well automatically record the excision only for the merged data, as {:db/excise 123, :db.excise/beforeT <last parent>}, and just directly drop non-merged excised datoms as if they had never been written. This is reminiscent of Mercurial's phases.)
This alone isn't enough. If A transacted like this:
the actual datoms recorded would be:
B excised everything about 123, and here we wouldn't have reasserted the URL, and so the URL would be excised, leaving 300 as a visit to an unknown page. A process of reintroduction (or storing redundant data) might be necessary, or perhaps that behavior is actually desirable; it depends on how the data is modeled and how specific the retraction is.