Welcome to the Project Mentat Wiki!
Context
There are two blog posts that discuss some of the motivations for the project and some of the technical context you might need in order to work on or with Project Mentat.
How it works
At its core, Mentat maintains a set of assertions of the form entity-attribute-value (EAV). The assertions conform to a schema in which each attribute constrains its associated value or set of values. We call these assertions datoms.
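As a rough illustration of a datom and of a schema constraining values, here is a minimal Python sketch. The `Datom` type, the `conforms` helper, and the `:person/...` attributes are hypothetical names chosen for the example; they are not part of Mentat.

```python
from typing import NamedTuple, Union

Value = Union[bool, int, float, str]

class Datom(NamedTuple):
    """One entity-attribute-value assertion."""
    e: int      # entid of the entity
    a: str      # attribute, written here as a keyword ident
    v: Value    # the asserted value

# Hypothetical schema: each attribute names the Python type its values must have.
SCHEMA = {":person/name": str, ":person/age": int}

def conforms(datom, schema=SCHEMA):
    """Return True if the datom's value has the type its attribute requires."""
    expected = schema.get(datom.a)
    if expected is None:
        return False  # unknown attribute
    if expected is int and isinstance(datom.v, bool):
        return False  # bool is a subclass of int in Python; keep them distinct
    return isinstance(datom.v, expected)

ok = conforms(Datom(65536, ":person/name", "Alice"))
bad = conforms(Datom(65536, ":person/age", "forty"))
```

Here `ok` is true and `bad` is false, because `:person/age` is declared to hold integers.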
Entities and entids
A Mentat entity is represented by a positive integer. (This agrees with Datomic.) We call such a positive integer an entid.
Partitions
Datomic partitions the entid space in order to separate core knowledge base entities required for the healthy function of the system from user-defined entities. Datomic also partitions in order to ensure that certain index walks of related entities are efficient. Mentat follows suit, partitioning entids into the following partitions:
- :db.part/db, for core knowledge base entities;
- :db.part/user, for user-defined entities;
- :db.part/tx, for transaction entities.
You almost certainly want to add new entities in the :db.part/user partition.
The entid sequence in a given partition is monotonically increasing, although not necessarily contiguous. That is, it is possible for a specific entid to have never been present in the system, even though its predecessor and successor are present.
Representation of assertions
Mentat assertions are represented as rows in the datoms SQLite table, and each row is tagged with a numeric representation of the :db/valueType of its value.
The tag is used to limit queries, and therefore is placed carefully in the relevant indices to allow searching numeric longs and doubles quickly. The tag is also used to convert SQLite values to the correct Mentat value type on query egress.
The value type tag mapping is currently:
:db/valueType | value type tag | SQLite storage class | examples |
---|---|---|---|
:db.type/ref | 0 | INTEGER | 1234 |
:db.type/boolean | 1 | INTEGER | 0 (false), 1 (true) |
:db.type/long | 5 | INTEGER | -4321 |
:db.type/double | 5 | REAL | -0.369 |
:db.type/string | 10 | TEXT | arbitrary textual data |
:db.type/keyword | 13 | TEXT | :namespaced/keyword |
Observe that some Mentat value types share a value type tag: they are differentiated using SQLite's storage class.
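The following sketch shows one way the tag and the storage class could be combined to recover a value type on egress. The tag numbers come from the table above; the lookup dictionary and the `mentat_type` function are illustrative assumptions, not Mentat's actual code. SQLite's built-in `typeof()` reports the storage class.

```python
import sqlite3

# (value type tag, SQLite storage class) -> Mentat value type, per the table above.
TYPE_BY_TAG_AND_CLASS = {
    (0, "integer"): ":db.type/ref",
    (1, "integer"): ":db.type/boolean",
    (5, "integer"): ":db.type/long",
    (5, "real"): ":db.type/double",
    (10, "text"): ":db.type/string",
    (13, "text"): ":db.type/keyword",
}

def mentat_type(conn, value, value_type_tag):
    """Resolve a value type from a tag plus the storage class SQLite reports."""
    (storage_class,) = conn.execute("SELECT typeof(?)", (value,)).fetchone()
    return TYPE_BY_TAG_AND_CLASS[(value_type_tag, storage_class)]

conn = sqlite3.connect(":memory:")
long_type = mentat_type(conn, -4321, 5)     # tag 5 stored as INTEGER
double_type = mentat_type(conn, -0.369, 5)  # tag 5 stored as REAL
```

Both calls pass tag 5, and only the storage class tells the long apart from the double.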
Representation as SQL tables
The authoritative table in the SQL store is the transactions table; from its contents, all other tables can be derived. Each assertion in a transaction (see Transacting) is represented as a row in the transactions table, which has columns roughly
e | a | v | value_type_tag | added | tx |
---|---|---|---|---|---|
The added column is a boolean flag that is non-0 if the datom was added and 0 if the datom was retracted from the datom store. We index on tx so that we can quickly extract the datoms added or retracted as part of a particular transaction.
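A minimal sketch of such a table, using Python's `sqlite3` module, might look like the following. The column types, the index name, and the sample entids (attribute 100, entity 65536) are assumptions made for illustration; only the column names come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        e INTEGER NOT NULL,
        a INTEGER NOT NULL,
        v BLOB NOT NULL,             -- BLOB affinity: values are stored as given
        value_type_tag INTEGER NOT NULL,
        added INTEGER NOT NULL,      -- non-0 for an addition, 0 for a retraction
        tx INTEGER NOT NULL
    );
    CREATE INDEX idx_transactions_tx ON transactions (tx);
""")

# The first transaction asserts a name; the second retracts it and asserts a new one.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (65536, 100, "Alice", 10, 1, 268435456),
        (65536, 100, "Alice", 10, 0, 268435457),
        (65536, 100, "Alicia", 10, 1, 268435457),
    ],
)

# Everything added or retracted by one transaction, found via the tx index.
changes = conn.execute(
    "SELECT e, a, v, added FROM transactions WHERE tx = ? ORDER BY added",
    (268435457,),
).fetchall()
```

`changes` then holds the retraction of "Alice" followed by the addition of "Alicia".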
The most important table in the SQL store is the datoms table. Queries extract data from the datoms table (see Querying). It is the materialized view of the transactions table, taking into account all transacted additions and retractions. Each assertion that currently holds is represented as a row in the datoms table, which has columns roughly
e | a | v | value_type_tag | tx | FLAGS |
---|---|---|---|---|---|
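To make "materialized view" concrete, the sketch below folds transaction rows into the set of datoms that currently hold. It is a simplified model in plain Python, not the transactor's implementation: the `materialize` function is hypothetical, the FLAGS columns are omitted, and the sample entids are made up.

```python
def materialize(transaction_rows):
    """Fold (e, a, v, value_type_tag, added, tx) rows into current datoms."""
    datoms = {}
    # Replay in transaction order. Within one transaction, apply retractions
    # (added == 0) before additions so that a replaced value ends up present.
    ordered = sorted(transaction_rows, key=lambda row: (row[5], row[4]))
    for e, a, v, tag, added, tx in ordered:
        key = (e, a, v, tag)
        if added:
            datoms[key] = tx       # remember the transaction that asserted it
        else:
            datoms.pop(key, None)  # a retraction removes the datom
    return [(e, a, v, tag, tx) for (e, a, v, tag), tx in datoms.items()]

rows = [
    (65536, 100, "Alice", 10, 1, 268435456),
    (65536, 100, "Alice", 10, 0, 268435457),
    (65536, 100, "Alicia", 10, 1, 268435457),
]
current = materialize(rows)
```

After the replay only the "Alicia" assertion remains, tagged with the transaction that added it.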
We (really, SQLite) maintain several indexes and partial indexes to make particular types of queries efficient (at the cost of increasing transaction time and increasing database fragmentation). In particular, we maintain the same set of indexes that Datomic does:
Index | Contains |
---|---|
EAVT | all datoms |
AEVT | all datoms |
AVET | datoms with attributes that have :db/index or :db/unique |
VAET | datoms with attributes that are :db/valueType :db.type/ref |
The EAVT and AEVT indexes make it efficient to enumerate entities and attributes, respectively. The AVET index makes it efficient to map attribute-value pairs to matching entities. The VAET index makes reverse lookups efficient, that is, finding the entities that refer to a given entity.
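One way to express these four indexes in SQLite is sketched below, with AVET and VAET as partial indexes. The flag columns `index_avet` and `index_vaet`, the index names, and the column orders are assumptions for illustration; the actual schema may differ. Attribute 101 is a hypothetical ref attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE datoms (
        e INTEGER NOT NULL, a INTEGER NOT NULL, v BLOB NOT NULL,
        value_type_tag INTEGER NOT NULL, tx INTEGER NOT NULL,
        index_avet INTEGER NOT NULL DEFAULT 0,  -- assumed flag column
        index_vaet INTEGER NOT NULL DEFAULT 0   -- assumed flag column
    );
    -- Full indexes cover all datoms.
    CREATE UNIQUE INDEX idx_datoms_eavt ON datoms (e, a, value_type_tag, v);
    CREATE UNIQUE INDEX idx_datoms_aevt ON datoms (a, e, value_type_tag, v);
    -- Partial indexes cover only the rows whose flag is set.
    CREATE INDEX idx_datoms_avet ON datoms (a, value_type_tag, v, e)
        WHERE index_avet IS NOT 0;
    CREATE INDEX idx_datoms_vaet ON datoms (v, a, e)
        WHERE index_vaet IS NOT 0;
""")

# Entity 65537 refers to entity 65536 through ref attribute 101.
conn.execute("INSERT INTO datoms VALUES (65537, 101, 65536, 0, 268435456, 0, 1)")

# Reverse lookup: which entities point at 65536 via attribute 101?
referrers = conn.execute(
    "SELECT e FROM datoms WHERE index_vaet IS NOT 0 AND v = ? AND a = ?",
    (65536, 101),
).fetchall()

index_names = sorted(
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'datoms'"
    )
)
```

Repeating the `index_vaet IS NOT 0` condition in the query is what lets SQLite consider the partial index for it.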
Representation of metadata as SQL tables
The transactor maintains three metadata tables: idents, schema, and parts. These are materialized views capturing the current state of (the schema part of) the transactions table.
The idents table maintains the set of ident mappings from keyword ident (like :db/ident) to numeric entid (like 1). It looks like
ident | entid |
---|---|
:db/ident | 1 |
:db.part/db | 2 |
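The mapping can be read in either direction. A small sketch, assuming a two-column table shaped like the one above (the column types and the helper functions `entid_for` and `ident_for` are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idents (ident TEXT NOT NULL PRIMARY KEY, entid INTEGER UNIQUE NOT NULL)"
)
conn.executemany(
    "INSERT INTO idents VALUES (?, ?)",
    [(":db/ident", 1), (":db.part/db", 2)],
)

def entid_for(conn, ident):
    """Map a keyword ident to its numeric entid, or None if unknown."""
    row = conn.execute("SELECT entid FROM idents WHERE ident = ?", (ident,)).fetchone()
    return row[0] if row else None

def ident_for(conn, entid):
    """Map a numeric entid back to its keyword ident, or None if unknown."""
    row = conn.execute("SELECT ident FROM idents WHERE entid = ?", (entid,)).fetchone()
    return row[0] if row else None

db_part = entid_for(conn, ":db.part/db")
first_ident = ident_for(conn, 1)
```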
The schema table maintains the flags and types of the Mentat schema. It looks like
ident | attribute | v | value_type_tag |
---|---|---|---|
:db/txInstant | :db/cardinality | 31 | 0 |
:db/txInstant | :db/index | 1 | 1 |
:db/txInstant | :db/valueType | 25 | 0 |
Observe that the value type is represented with a value type tag of 0 (since it's a reference to the entid with ident :db.type/long; in future, :db.type/instant), but the index flag is represented with a value type tag of 1 (since it's a boolean). This may all change in the future as we make the representation more compact, or make certain operations more efficient.
The parts table maintains the partition ranges and especially the next ID to be allocated in each partition. It looks like
part | start | idx |
---|---|---|
:db.part/db | 0 | 38 |
:db.part/user | 65536 | 65536 |
:db.part/tx | 268435456 | 268435457 |
Each transaction that allocates temporary IDs will increment the idx of the appropriate partition. Every transaction allocates a single transaction ID, so the :db.part/tx index should tick up regularly.
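The allocation described above can be modelled in a few lines. This is a sketch under the assumption that idx holds the next ID to hand out and that temporary IDs resolve into :db.part/user; the `allocate`, `transact`, and `partition_of` functions are hypothetical, and the starting values are copied from the parts table above.

```python
parts = {
    ":db.part/db": {"start": 0, "idx": 38},
    ":db.part/user": {"start": 65536, "idx": 65536},
    ":db.part/tx": {"start": 268435456, "idx": 268435457},
}

def allocate(parts, part, n=1):
    """Hand out n consecutive entids from a partition and advance its idx."""
    first = parts[part]["idx"]
    parts[part]["idx"] = first + n
    return list(range(first, first + n))

def transact(parts, tempid_count):
    """Allocate one transaction ID plus one entid per temporary ID."""
    tx_id = allocate(parts, ":db.part/tx")[0]
    entids = allocate(parts, ":db.part/user", tempid_count)
    return tx_id, entids

def partition_of(parts, entid):
    """Find the partition whose range contains entid (the largest start <= entid)."""
    candidates = [(p["start"], name) for name, p in parts.items() if p["start"] <= entid]
    return max(candidates)[1]

tx_id, entids = transact(parts, 2)
```

After this call the :db.part/tx idx has advanced by one and the :db.part/user idx by two, which matches the "tick up" behaviour described above.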