Updated Home (markdown)

Nick Alexander 2017-01-25 16:10:53 -08:00
parent e756cd1ea7
commit 6b489e9d10


@ -1 +1,78 @@
Welcome to the Mentat Wiki!
At its core, Mentat maintains a set of assertions of the form entity-attribute-value (EAV). The assertions conform to a schema in which each attribute constrains its associated value or set of values.
## Assertions
Mentat assertions are represented as rows in the `datoms` SQLite table, and each such row is tagged with a numeric representation of `:db/valueType`.
The tag is used to limit queries, and therefore is placed carefully in the relevant indices to allow searching numeric longs and doubles quickly. The tag is also used to convert SQLite values to the correct Mentat value type on query egress.
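As a rough illustration of conversion on query egress, the sketch below decodes a raw SQLite value using its tag. Only the two tag values mentioned elsewhere on this page are used (0 for a reference, 1 for a boolean); the `Ref` type and the `from_sqlite` function are hypothetical names, not Mentat's API.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    """Hypothetical wrapper marking an integer as an entid reference."""
    entid: int


TAG_REF = 0      # reference to an entid
TAG_BOOLEAN = 1  # boolean


def from_sqlite(value, value_type_tag):
    """Convert a raw SQLite value to a typed value based on its tag."""
    if value_type_tag == TAG_REF:
        return Ref(int(value))
    if value_type_tag == TAG_BOOLEAN:
        return bool(value)
    # Other tags exist but are not described on this page.
    raise ValueError(f"unknown value type tag: {value_type_tag}")
```

SQLite stores both a reference to entid `1` and the boolean `true` as the integer `1`, so without the tag the two could not be told apart.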
## Entities and entids
A Mentat entity is represented by a *positive* integer. (This agrees with Datomic.) We call such a positive integer an *entid*.
## Partitions
Datomic partitions the entid space in order to separate core knowledge base entities required for the healthy function of the system from user-defined entities. Datomic also partitions in order to ensure that certain index walks of related entities are efficient. Mentat follows suit, partitioning entids into the following partitions:
* `:db.part/db`, for core knowledge base entities;
* `:db.part/user`, for user-defined entities;
* `:db.part/tx`, for transaction entities.
You almost certainly want to add new entities in the `:db.part/user` partition.
The entid sequence in a given partition is monotonically increasing, although not necessarily contiguous. That is, it is possible for a specific entid to have never been present in the system, even though its predecessor and successor are present.
## Representation as SQL tables
The authoritative table in the SQL store is the `transactions` table; from its contents, all other tables can be derived. Each assertion in a transaction (see [[Transacting|Transacting]]) is represented as a row in the `transactions` table, which has columns roughly
`e`, `a`, `v`, `value_type_tag`, `added`, `tx`.
The `added` column is a boolean flag that is non-0 if the datom was added and 0 if the datom was retracted from the datom store. We index on `tx` so that we can quickly extract the datoms added or retracted as part of a particular transaction.
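A minimal sketch of such a table, assuming the column names listed above. The column types, the index name, the sample entids, and the string tag value `7` are illustrative assumptions rather than Mentat's actual definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names follow the list above; the types are assumptions.
conn.execute("""
    CREATE TABLE transactions (
        e INTEGER NOT NULL,
        a INTEGER NOT NULL,
        v BLOB NOT NULL,
        value_type_tag INTEGER NOT NULL,
        added INTEGER NOT NULL,
        tx INTEGER NOT NULL
    )
""")
# Index on tx so the datoms of one transaction can be found quickly.
conn.execute("CREATE INDEX idx_transactions_tx ON transactions (tx)")

conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 10, "Alice", 7, 1, 1000),   # tx 1000 asserts a name
        (100, 10, "Alice", 7, 0, 1001),   # tx 1001 retracts it ...
        (100, 10, "Alicia", 7, 1, 1001),  # ... and asserts a new one
    ],
)


def datoms_in_tx(conn, tx):
    """Return (e, a, v, added) rows for transaction `tx`, retractions first."""
    cur = conn.execute(
        "SELECT e, a, v, added FROM transactions WHERE tx = ? ORDER BY added",
        (tx,),
    )
    return cur.fetchall()
```

Calling `datoms_in_tx(conn, 1001)` returns the retraction of the old value followed by the assertion of the new one.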
The most important table in the SQL store is the `datoms` table. Queries extract data from the `datoms` table (see [[Querying|Querying]]). It is a materialized view of the `transactions` table, taking into account all transacted additions and retractions. Each assertion currently in effect is represented as a row in the `datoms` table, which has columns roughly
`e`, `a`, `v`, `value_type_tag`, `tx`, FLAGS.
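To make the relationship between the two tables concrete, here is a small sketch that derives the current datoms by replaying transaction rows in order. The function name and the sample rows are hypothetical, and the flag columns are omitted for brevity.

```python
def materialize_datoms(transactions):
    """Replay (e, a, v, value_type_tag, added, tx) rows in transaction order.

    Returns the surviving datoms as sorted (e, a, v, value_type_tag, tx) tuples.
    """
    datoms = {}
    for e, a, v, tag, added, tx in transactions:
        key = (e, a, v, tag)
        if added:
            datoms[key] = tx       # assertion: record the asserting tx
        else:
            datoms.pop(key, None)  # retraction: drop the datom if present
    return sorted((e, a, v, tag, tx) for (e, a, v, tag), tx in datoms.items())


log = [
    (100, 10, "Alice", 7, 1, 1000),
    (100, 10, "Alice", 7, 0, 1001),
    (100, 10, "Alicia", 7, 1, 1001),
    (101, 10, "Bob", 7, 1, 1001),
]
current = materialize_datoms(log)
```

Here `current` holds only the two assertions still in effect; the retracted value for entity `100` has disappeared.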
We (really, SQLite) maintain several indexes and partial indexes to make particular types of queries efficient (at the cost of increasing transaction time and increasing database fragmentation). In particular, we maintain the same set of [indexes that Datomic does](http://docs.datomic.com/indexes.html):
| Index | Contains |
| --- | --- |
| EAVT | all datoms |
| AEVT | all datoms |
| AVET | datoms with attributes that have `:db/index` or `:db/unique` |
| VAET | datoms with attributes that are `:db/valueType :db.type/ref` |
The EAVT and AEVT indexes make it efficient to enumerate the datoms of a given entity and of a given attribute, respectively. The AVET index efficiently maps attribute-value pairs to matching entities. The VAET index supports efficient reverse lookups, from a referenced entity to the entities that refer to it.
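One way such indexes could be declared in SQLite is sketched below, using partial indexes for AVET and VAET. The flag column names (`index_avet`, `index_vaet`), the index names, the exact column order within each index, and the string tag value `7` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE datoms (
        e INTEGER NOT NULL,
        a INTEGER NOT NULL,
        v BLOB NOT NULL,
        value_type_tag INTEGER NOT NULL,
        tx INTEGER NOT NULL,
        index_avet INTEGER NOT NULL DEFAULT 0,
        index_vaet INTEGER NOT NULL DEFAULT 0
    );
    CREATE INDEX idx_datoms_eavt ON datoms (e, a, value_type_tag, v);
    CREATE INDEX idx_datoms_aevt ON datoms (a, e, value_type_tag, v);
    -- Partial indexes: only rows whose flag is set are indexed.
    CREATE INDEX idx_datoms_avet ON datoms (a, value_type_tag, v, e)
        WHERE index_avet = 1;
    CREATE INDEX idx_datoms_vaet ON datoms (v, a, e)
        WHERE index_vaet = 1;
""")

conn.executemany(
    "INSERT INTO datoms VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (100, 10, "Alice", 7, 1000, 1, 0),  # attribute 10: indexed string
        (101, 10, "Bob", 7, 1000, 1, 0),
        (101, 11, 100, 0, 1001, 0, 1),      # attribute 11: ref from 101 to 100
    ],
)


def entities_with(conn, a, v):
    """AVET-style lookup: entities having attribute `a` with value `v`."""
    sql = "SELECT e FROM datoms WHERE a = ? AND v = ? AND index_avet = 1"
    return [row[0] for row in conn.execute(sql, (a, v))]


def referrers_of(conn, target):
    """VAET-style reverse lookup: (e, a) pairs whose value refers to `target`."""
    sql = ("SELECT e, a FROM datoms "
           "WHERE v = ? AND value_type_tag = 0 AND index_vaet = 1")
    return conn.execute(sql, (target,)).fetchall()
```

Each query repeats the flag condition in its `WHERE` clause because SQLite only considers a partial index when the query's conditions imply the index's own condition.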
### Representation of metadata as SQL tables
The transactor maintains three metadata tables: `idents`, `schema`, and `parts`. These are materialized views capturing the current state of (the schema part of) the `transactions` table.
The `idents` table maintains the set of _ident mappings_ from keyword ident (like `:db/ident`) to numeric entid (like `1`). It looks like
| ident | entid |
| --- | --- |
| :db/ident | 1 |
| :db.part/db | 2 |
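A minimal sketch of resolving idents in both directions against a table of this shape. The column types, the primary key, and the helper names `entid_for` and `ident_for` are assumptions; the two rows are the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idents (ident TEXT NOT NULL PRIMARY KEY, entid INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO idents VALUES (?, ?)",
    [(":db/ident", 1), (":db.part/db", 2)],
)


def entid_for(conn, ident):
    """Map a keyword ident to its numeric entid, or None if unknown."""
    row = conn.execute(
        "SELECT entid FROM idents WHERE ident = ?", (ident,)
    ).fetchone()
    return None if row is None else row[0]


def ident_for(conn, entid):
    """Map a numeric entid back to its keyword ident, or None if unknown."""
    row = conn.execute(
        "SELECT ident FROM idents WHERE entid = ?", (entid,)
    ).fetchone()
    return None if row is None else row[0]
```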
The `schema` table maintains the flags and types of the Mentat schema. It looks like
| ident | attribute | v | value_type_tag |
| --- | --- | --- | --- |
Observe that the value type is represented with a value type tag of 0, since it is a reference to the entid with ident `:db.type/long` (in future, `:db.type/instant`), while the index flag is represented with a value type tag of 1, since it is a boolean. All of this may change in future as we make the representation more compact or make certain operations more efficient.
The `parts` table maintains the partition ranges and especially the next ID to be allocated in each partition. It looks like
| part | start | idx |
| --- | --- | --- |
Each transaction that allocates temporary IDs will increment the `idx` of the appropriate partition. Every transaction allocates a single transaction ID, so the `:db.part/tx` index should tick up regularly.
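The allocation step can be sketched as follows, assuming `idx` stores the next entid to hand out in each partition. The `start` and `idx` values and the `allocate_entids` helper are hypothetical; they are not Mentat's actual partition boundaries or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts ("
    "part TEXT NOT NULL PRIMARY KEY, start INTEGER NOT NULL, idx INTEGER NOT NULL)"
)
# Hypothetical partition ranges, chosen only for illustration.
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [
        (":db.part/db", 0, 40),
        (":db.part/user", 0x10000, 0x10000),
        (":db.part/tx", 0x10000000, 0x10000000),
    ],
)


def allocate_entids(conn, part, n=1):
    """Reserve `n` consecutive entids in `part` and return them as a list."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT idx FROM parts WHERE part = ?", (part,)
        ).fetchone()
        if row is None:
            raise KeyError(part)
        next_id = row[0]
        conn.execute(
            "UPDATE parts SET idx = idx + ? WHERE part = ?", (n, part)
        )
    return list(range(next_id, next_id + n))
```

A transaction that creates two user entities would call `allocate_entids(conn, ":db.part/user", 2)` for its temporary IDs and `allocate_entids(conn, ":db.part/tx")` for its own transaction ID.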