3 Transacting
Richard Newman edited this page 2017-01-25 15:09:48 -08:00

The highest of high-level overviews

Transacting, or committing a transaction, follows this conceptual sequence:

  1. EDN parsing.
  2. Entity parsing: turning EDN into an abstract representation of a transaction as a series of entities to be applied to the database.
  3. Entity expansion and rewriting: replacing keyword idents with numeric entids; expanding entity syntactic sugar (like map notation or embedded vector notation) into simpler entity forms.
  4. Type checking: ensuring that all attribute/value pairs are well-typed.
  5. SQL transaction opened.
  6. Resolving lookup-refs: ensuring all attribute/value lookup-refs map to a unique entid.
  7. Resolving temp IDs: processing upserts; allocating new entids as necessary.
  8. SQL evaluation: executing bulk INSERT and DELETE SQL statements to update the transactions and datoms tables in the SQL store.
  9. Transaction report generation: extracting the transacted datoms from the SQL store; finalizing the temp ID map.
  10. Schema evolution: interpreting any :db.install/* and :db.alter/* entities encountered.
  11. Transaction finalization: updating the idents, schema, and parts materialized views in the SQL store.
  12. SQL transaction committed.

Parsing

A transaction usually arrives as a string. That string is parsed to EDN, and from there parsed to a vector of Entity elements. Each Entity element represents a transaction operation, one of:

  • :db/add
  • :db/retract
  • :db/retractEntity
  • :db/retractAttribute

Parsing does not depend on the schema, so in principle a transaction could be parsed ahead of time. In practice almost all transactions include variable data, so it is usually not sensible to preprocess or cache them.
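
As a rough illustration of the entity-parsing step, the sketch below (Python, chosen only for brevity) turns already-parsed EDN into Entity values. It assumes keywords arrive as strings such as ":db/add" and that the four operations take the element counts shown. Entity, OPS_ARITY, and parse_entities are hypothetical names for this sketch, not the transactor's actual API.

```python
from dataclasses import dataclass
from typing import Any

# Assumed element counts for each operation form (an illustration only).
OPS_ARITY = {
    ":db/add": 4,               # [op e a v]
    ":db/retract": 4,           # [op e a v]
    ":db/retractEntity": 2,     # [op e]
    ":db/retractAttribute": 3,  # [op e a]
}


@dataclass(frozen=True)
class Entity:
    op: str
    e: Any
    a: Any = None
    v: Any = None


def parse_entities(forms):
    """Turn a sequence of vector forms into Entity values, rejecting bad forms."""
    entities = []
    for form in forms:
        if not form or form[0] not in OPS_ARITY:
            raise ValueError(f"unknown operation in {form!r}")
        if len(form) != OPS_ARITY[form[0]]:
            raise ValueError(f"wrong number of elements in {form!r}")
        entities.append(Entity(*form))
    return entities
```

Nothing here consults the schema, which matches the point above: this step only checks the shape of each form.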

Expansion, rewriting, and type checking

This is the point in the process at which the contents of the database — in particular, its schema and ident mappings — are first used.

The entities of the parsed transaction are walked to map keyword idents to numeric entids using the ident mappings. Syntactic sugar, such as the map notation and the embedded vector notation, is expanded into multiple entities. Attribute/value pairs are checked to ensure that they are well-typed.
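
A minimal Python sketch of this step, under stated assumptions: keywords are strings starting with ":", the map notation is a dict with a ":db/id" key, and only four-element add/retract forms are handled (the embedded vector notation is left out). IDENTS, SCHEMA, to_entid, and expand are hypothetical names and values for illustration.

```python
# Hypothetical ident map and schema; real ones come from the database.
IDENTS = {":user/name": 100, ":user/age": 101}
SCHEMA = {100: str, 101: int}  # attribute entid -> expected Python type


def to_entid(x, idents):
    """Map a keyword ident (a string starting with ':') to its entid."""
    if isinstance(x, str) and x.startswith(":"):
        return idents[x]  # KeyError if the ident is unknown
    return x  # already numeric, or a temp ID string


def expand(forms, idents, schema):
    """Expand map notation, replace idents with entids, and type-check values."""
    out = []
    for form in forms:
        if isinstance(form, dict):
            # Map notation: one :db/add per attribute other than :db/id.
            e = form[":db/id"]
            simple = [[":db/add", e, a, v] for a, v in form.items() if a != ":db/id"]
        else:
            simple = [form]
        for op, e, a, v in simple:
            a_id = to_entid(a, idents)
            if not isinstance(v, schema[a_id]):
                raise TypeError(f"{v!r} is not a valid value for {a}")
            out.append([op, to_entid(e, idents), a_id, v])
    return out
```

For example, expand([{":db/id": "t", ":user/name": "Ada"}], IDENTS, SCHEMA) yields a single [":db/add", "t", 100, "Ada"] entity.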

SQL transaction opened

This is the point in the process at which the contents of the SQL store are first used. Therefore, we open a SQL write transaction to isolate our multi-step process from concurrent readers. (We assume that writes are serialized. See the Wiki notes on modeling the DB and connection in Rust.)
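
The shape of that bracketing can be sketched with Python's standard sqlite3 module. This is an illustration, not the transactor's code: with_write_transaction is a hypothetical helper, and it assumes the connection was opened with isolation_level=None so that the module does not start transactions implicitly. BEGIN IMMEDIATE starts a write transaction right away instead of waiting for the first write statement.

```python
import sqlite3


def with_write_transaction(conn, body):
    """Run body(conn) inside one SQL write transaction.

    Assumes conn was opened with isolation_level=None, so BEGIN, COMMIT,
    and ROLLBACK are issued explicitly here and nowhere else.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        result = body(conn)
    except Exception:
        # Any failure in the multi-step process undoes every step.
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return result
```

Everything from lookup-ref resolution to updating the materialized views would run inside body, so a failure at any step leaves the store unchanged.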

Resolution

Next, the set of attribute/value pairs requiring resolution is collected. There are two ways that an attribute/value pair can require resolution: it can be used in a lookup-ref, like

[[:db/add [:db/ident :user/unique-attribute] ATTRIBUTE VALUE]]

or it can be used in an upsert, like

[[:db/add "tempid" :user/unique-attribute EXISTING-VALUE]
 [:db/add "tempid" :user/other-attribute NEW-VALUE]]

Every lookup-ref must resolve to a unique entid already in the database; if not, the transaction fails immediately. Upserted temp IDs, however, may or may not resolve. If any resolve to multiple entids, the transaction fails immediately. Those that do not resolve will have new entids in the appropriate partition allocated for them.

The upsert resolution process is a multi-step algorithm that iteratively refines sets. See the Wiki notes on resolving upserts.
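
The failure rules above can be sketched in Python. This is a deliberately simplified single pass, not the iterative algorithm the Wiki notes describe: it ignores temp IDs in value position, treats the database as a set of (e, a, v) tuples, and takes the set of upserting attributes and the next free entid as arguments. All names are hypothetical.

```python
def resolve_lookup_ref(datoms, a, v):
    """A lookup-ref must match exactly one existing entid, else fail."""
    matches = {e for (e, a2, v2) in datoms if (a2, v2) == (a, v)}
    if len(matches) != 1:
        raise LookupError(f"lookup-ref [{a} {v!r}] matched {len(matches)} entids")
    return next(iter(matches))


def resolve_tempids(entities, datoms, unique_attrs, next_entid):
    """Map each temp ID (a string in entity position) to an entid."""
    candidates = {}  # temp ID -> set of existing entids it could upsert to
    for _op, e, a, v in entities:
        if isinstance(e, str):
            found = candidates.setdefault(e, set())
            if a in unique_attrs:
                found |= {e2 for (e2, a2, v2) in datoms if (a2, v2) == (a, v)}
    tempids = {}
    for tempid, found in candidates.items():
        if len(found) > 1:
            raise ValueError(f"temp ID {tempid!r} resolves to multiple entids")
        if found:
            tempids[tempid] = next(iter(found))  # upsert to the existing entid
        else:
            tempids[tempid] = next_entid  # allocate a fresh entid
            next_entid += 1
    return tempids
```

With one existing datom (1, 100, "ada@example.com") and attribute 100 unique, a temp ID that asserts that pair resolves to entid 1, while a temp ID with no unique match receives a newly allocated entid.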

SQL evaluation

At this point, the transaction entities are in a streamlined form, like

[[:db/add numeric-entid numeric-attr-entid well-typed-value]]

Using the properties of the relevant attribute, we generate bulk INSERT and DELETE SQL statements to update the transactions and datoms tables in the SQL store. The goal is to have SQLite do the work to look up and replace :db.cardinality/one datoms, to produce a minimal number of DELETE statements, to efficiently INSERT fulltext values and the corresponding datoms, etc.

The translation to SQL is not complicated, but there are several fiddly cases. Eventually we'll write Wiki notes on translating entities to bulk SQL.
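
To give a flavor of one such case, here is a Python sketch of asserting a :db.cardinality/one datom while letting SQLite find and replace the old value. The two-table layout in SCHEMA_SQL is a simplified assumption, not the real store schema, and the function handles one datom at a time for clarity rather than issuing bulk statements.

```python
import sqlite3

# Simplified, hypothetical table layout for illustration.
SCHEMA_SQL = """
CREATE TABLE datoms (e INTEGER, a INTEGER, v, tx INTEGER, UNIQUE (e, a, v));
CREATE TABLE transactions (e INTEGER, a INTEGER, v, tx INTEGER, added INTEGER);
"""


def add_cardinality_one(conn, e, a, v, tx):
    """Assert (e, a, v), retracting any different existing value for (e, a)."""
    # Log a retraction for the value being replaced, if there is one.
    conn.execute(
        "INSERT INTO transactions (e, a, v, tx, added) "
        "SELECT e, a, v, ?, 0 FROM datoms WHERE e = ? AND a = ? AND v <> ?",
        (tx, e, a, v),
    )
    conn.execute("DELETE FROM datoms WHERE e = ? AND a = ? AND v <> ?", (e, a, v))
    # Log the assertion only if the datom is not already present.
    conn.execute(
        "INSERT INTO transactions (e, a, v, tx, added) "
        "SELECT ?, ?, ?, ?, 1 WHERE NOT EXISTS "
        "(SELECT 1 FROM datoms WHERE e = ? AND a = ? AND v = ?)",
        (e, a, v, tx, e, a, v),
    )
    # The UNIQUE constraint turns a repeated assertion into a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO datoms (e, a, v, tx) VALUES (?, ?, ?, ?)",
        (e, a, v, tx),
    )
```

Note that the caller never reads the old value: whether anything was retracted or newly asserted is only visible afterwards in the transactions table.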

Transaction report generation

At this point, the datoms and transactions tables are updated but the transactor itself does not know the details of what has happened! For example, a :db.cardinality/many datom may have already existed and not been transacted; or a new :db/ident may have been transacted. The transactor therefore queries the transactions table to find out what the actual SQL changes have been, for presentation to the transact consumer in the transaction report.
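
In sketch form (Python, with a hypothetical transactions table of columns e, a, v, tx, added, and hypothetical report keys), the report is built by reading back what SQL actually recorded for this transaction:

```python
import sqlite3


def transaction_report(conn: sqlite3.Connection, tx, tempids):
    """Collect the datoms actually recorded for tx, plus the temp ID map."""
    rows = conn.execute(
        "SELECT e, a, v, added FROM transactions WHERE tx = ? ORDER BY rowid",
        (tx,),
    ).fetchall()
    return {"tx": tx, "tx_data": rows, "tempids": dict(tempids)}
```

A datom that already existed produces no row for this tx, so it is absent from the report even though the transaction asked for it.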

Schema evolution and transaction finalization

Using the report generated in the previous step, the transactor interprets any :db.install/* and :db.alter/* entities, and any other changes to the schema definitions, that it encounters.

The materialized views in the SQL store are then updated: idents (if new :db/ident datoms were transacted), schema (if :db.install/* datoms were transacted), and parts (always, since a new :db/tx entid is allocated for each transaction).
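
The idents update can be pictured with a small Python sketch that treats the view as a dict from ident to entid and the report's datoms as (e, a, v, added) tuples. DB_IDENT and update_idents are hypothetical, and the real view lives in the SQL store rather than in memory.

```python
DB_IDENT = 1  # hypothetical entid of the :db/ident attribute


def update_idents(idents, tx_data):
    """Return a new ident -> entid map reflecting the transacted datoms."""
    updated = dict(idents)
    for e, a, v, added in tx_data:
        if a != DB_IDENT:
            continue  # only :db/ident datoms affect this view
        if added:
            updated[v] = e
        elif updated.get(v) == e:
            # Drop a retracted ident unless it was re-pointed in this transaction.
            del updated[v]
    return updated
```

If no :db/ident datoms appear in the report, the map comes back unchanged, which mirrors the "if new :db/ident datoms were transacted" condition.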

SQL transaction committed

At this point the data stores and the materialized metadata views have been updated, so the encompassing SQL transaction is committed. See the Wiki notes on modeling the DB and connection in Rust.