Commit graph

305 commits

Author SHA1 Message Date
Richard Newman d7018aedaa Remove dependency on test code from places importer. 2016-08-30 18:22:10 -07:00
Nick Alexander 3775c7c773 Start importing places. This is just about profiling for now. 2016-08-30 18:22:10 -07:00
Nick Alexander 71446a3af5 Completely rewrite main transaction logic to be faster.
This is almost complete; it passes the test suite save for retracting
fulltext datoms correctly.

There's a lot to say about this approach, but I don't have time to give
too many details.  The broad outline is as follows.  We collect datoms
to add and retract in a tx_lookup table.  Depending on flags ("search
value" sv and "search value type tag" svalue_type_tag) we "complete" the
tx_lookup table by joining matching datoms.  This allows us to find
datoms that are present (and should not be added as part of the
transaction, or should be retracted as part of the transaction, or
should be replaced as part of the transaction).  We complete the
tx_lookup (in place!) in two separate INSERTs to avoid a quadratic
two-table walk (explain the queries to observe that both INSERTs walk
the lookup table once and then use the datoms indexes to complete the
matching values).

We could simplify the code by using multiple lookup tables, both for the
two cases of search parameters (eav vs. ea) and for the incomplete and
completed rows.  Right now we differentiate the former with NULL checks,
and the latter by incrementing the added0 column.  It performs well
enough, so I haven't tried to understand the performance of separating
these things.

After the tx_lookup table is completed, we build the transaction from
it; and update the datoms materialized view table as well.  Observe the
careful handling of the "search value" sv parameters to handle replacing
:db.cardinality/one datoms.

Finally, we read the processed transaction back to provide it to the
API.  This is strictly to match the Datomic API; we might allow skipping
this, since many consumers will not want to stream this over the wire.

Rough timings show the transactor processing a single >50k datom
transaction in about 3.5s, of which less than 0.5s is spent in the
expensive joins.  Further, repeating the processing of the same
transaction takes only about 3.5s again!  That's the worst case for the
joins, since every single inserted datom will already be present in the
database, making the most expensive join match every row.
2016-08-30 18:22:10 -07:00
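The tx_lookup approach described in commit 71446a3af5 can be sketched roughly as follows. This is a much-simplified illustration in Python with the standard sqlite3 module, not the Datomish implementation: the table layouts, the `transact` helper, the `cardinality_one` parameter, and the trick of adding 2 to added0 to mark completed rows are all assumptions made for the example. Transaction IDs, value type tags, and fulltext handling are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema: no tx column, no value type tags.
    CREATE TABLE datoms (e INTEGER, a INTEGER, v);
    CREATE UNIQUE INDEX idx_datoms_eav ON datoms (e, a, v);
    CREATE TABLE tx_lookup (e0 INTEGER, a0 INTEGER, v0, added0 INTEGER, sv, v);
""")

def transact(ops, cardinality_one=frozenset()):
    """Apply ops, an iterable of (added, e, a, v); added is 1 (assert) or 0 (retract).

    Returns the datoms actually changed as (e, a, v, added) tuples.
    """
    with conn:
        conn.execute("DELETE FROM tx_lookup")
        # sv is the "search value": NULL means "match on (e, a) only", which
        # is how a cardinality-one assertion finds the value it replaces.
        conn.executemany(
            "INSERT INTO tx_lookup (e0, a0, v0, added0, sv) VALUES (?, ?, ?, ?, ?)",
            [(e, a, v, added, None if added and a in cardinality_one else v)
             for added, e, a, v in ops])
        # Complete the lookup table in place with two INSERTs, one per search
        # shape. Completed rows are marked by added0 + 2 and carry the
        # existing value in column v.
        conn.execute("""
            INSERT INTO tx_lookup (e0, a0, v0, added0, sv, v)
            SELECT t.e0, t.a0, t.v0, t.added0 + 2, t.sv, d.v
            FROM tx_lookup t JOIN datoms d
              ON d.e = t.e0 AND d.a = t.a0 AND d.v = t.sv
            WHERE t.added0 < 2 AND t.sv IS NOT NULL""")
        conn.execute("""
            INSERT INTO tx_lookup (e0, a0, v0, added0, sv, v)
            SELECT t.e0, t.a0, t.v0, t.added0 + 2, t.sv, d.v
            FROM tx_lookup t JOIN datoms d
              ON d.e = t.e0 AND d.a = t.a0
            WHERE t.added0 < 2 AND t.sv IS NULL""")
        # Retract matched retractions, plus old values being replaced.
        retracted = conn.execute("""
            SELECT DISTINCT e0, a0, v FROM tx_lookup
            WHERE added0 = 2 OR (added0 = 3 AND v IS NOT v0)
            ORDER BY e0, a0, v""").fetchall()
        # Assert only what is not already present.
        asserted = conn.execute("""
            SELECT DISTINCT e0, a0, v0 FROM tx_lookup t
            WHERE added0 = 1 AND NOT EXISTS (
                SELECT 1 FROM tx_lookup c
                WHERE c.added0 = 3 AND c.e0 = t.e0 AND c.a0 = t.a0
                  AND c.v0 IS t.v0 AND c.v IS t.v0)
            ORDER BY e0, a0, v0""").fetchall()
        conn.executemany(
            "DELETE FROM datoms WHERE e = ? AND a = ? AND v = ?", retracted)
        conn.executemany(
            "INSERT INTO datoms (e, a, v) VALUES (?, ?, ?)", asserted)
    return ([(e, a, v, 0) for e, a, v in retracted]
            + [(e, a, v, 1) for e, a, v in asserted])

# Attribute 10 is treated as cardinality-one, attribute 20 as cardinality-many.
tx1 = transact([(1, 1, 10, "alice"), (1, 1, 20, "red"), (1, 1, 20, "blue")], {10})
tx2 = transact([(1, 1, 10, "bob"), (1, 1, 20, "red"), (0, 1, 20, "blue")], {10})
tx3 = transact([(1, 1, 20, "red")], {10})
final = conn.execute("SELECT e, a, v FROM datoms ORDER BY e, a, v").fetchall()
```

In this sketch the second transaction replaces the cardinality-one value, skips the datom that is already present, and retracts the matched datom; the third transaction changes nothing because its only datom already exists.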
Nick Alexander b59c71cb00 Extract datomish.db.debug from test code, in order to use it during development. 2016-08-30 18:22:10 -07:00
Nick Alexander a1cc372d43 Fix :db.unique/value, which should be per (a, v) pair, not per v-value. 2016-08-30 18:22:10 -07:00
Nick Alexander 1b1cc6f18e Add Datomic, for testing. 2016-08-30 18:22:09 -07:00
Nick Alexander b29e5caec0 Implement parts: Make the DB allocate and persist entity IDs.
This implementation is inefficient because each allocated temporary ID
touches the database, but it's enough to allow re-opening DBs.
2016-08-30 18:22:09 -07:00
Richard Newman 22fc2ce201 Define <-tagged-SQLite and tagged-SQLite-to-JS to do tag-aware value transforms. 2016-08-30 18:22:09 -07:00
Nick Alexander ff9a75ae09 Tag values with value type tags in SQLite. 2016-08-30 18:22:09 -07:00
Nick Alexander 4d34c820b8 Add d/q; make query minimally schema aware. 2016-08-30 18:22:09 -07:00
Nick Alexander 0f7c1cad79 Extract IEncodeSQLite protocol and type-aware (but not schema-aware) <-SQLite factory. 2016-08-30 18:22:09 -07:00
Richard Newman 655a6a1461 Raise a specific error when trying to transact invalid sequences.
This specifically checks for things like :db/add foo bar (nil),
which will otherwise fail elsewhere after being exploded.
2016-08-30 18:22:09 -07:00
Nick Alexander e7e84e0a90 Add d/{ident,entid} for mapping between keyword idents and integer entids. 2016-08-30 18:22:09 -07:00
Richard Newman dc87d7d557 Correctly handle SQL errors in <?all-rows. Fixes #40. 2016-08-30 18:22:09 -07:00
Richard Newman 14a6a9bebb Configure our JDBC interface to not downcase column names in result sets. 2016-08-30 18:22:09 -07:00
Richard Newman 9f97cc8f2d Clarify d/<connect. 2016-08-30 18:22:09 -07:00
Richard Newman cf96bf1888 Always use our sql-quoting-style. 2016-08-30 18:22:09 -07:00
Richard Newman cff49b6df5 Initialize the sqlite connection with WAL and foreign keys.
This somewhat improves performance, which is nice.
2016-08-30 18:22:09 -07:00
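For readers unfamiliar with the settings named in commit cff49b6df5, the following is a minimal sketch of enabling WAL journaling and foreign-key enforcement on a SQLite connection. It uses Python's standard sqlite3 module rather than the project's own connection code; the `open_connection` helper and the temporary file path are hypothetical.

```python
import os
import sqlite3
import tempfile

def open_connection(path):
    """Open a SQLite connection with WAL journaling and foreign keys enabled."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns the journal mode now in effect as a row.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Foreign-key enforcement is off by default and is set per connection.
    conn.execute("PRAGMA foreign_keys=ON")
    return conn, mode

# WAL applies to on-disk databases, so the example uses a temporary file.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
conn, journal_mode = open_connection(db_path)
foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
```

Both pragmas are issued right after connecting, before any transaction is open, because `PRAGMA foreign_keys` has no effect inside a transaction.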
Richard Newman 3ab0428ee0 Add some Leiningen plugins to make testing prettier. 2016-08-30 18:11:25 -07:00
Richard Newman f4344fa28a Bump JVM stack size to 4MB and eliminate tiered compilation.
This allows CLJS compilation to complete for complex go-pair forms.

See <https://github.com/emezeske/lein-cljsbuild/issues/303> for more details.
2016-08-19 12:11:44 -07:00
Richard Newman 1a54cec1ba Use correct view for fulltext+non-fulltext. 2016-08-05 16:29:21 -07:00
Richard Newman 3f882ebdaa Bump honeysql to 0.8.0, which includes mainline ClojureScript support. 2016-08-05 13:06:42 -07:00
Richard Newman b1e6ffc91c Example query hacking. 2016-08-05 13:04:09 -07:00
Richard Newman fc845a9950 Implement basic fulltext binding. r=nalexander 2016-08-05 13:04:06 -07:00
Richard Newman 57d8796d07 Split out datomish.query.cc so we can implement clause resolution in more than one namespace. r=nalexander 2016-08-05 13:04:01 -07:00
Richard Newman fc65f268fa Process Function instances in clause handling. r=nalexander 2016-08-05 13:03:52 -07:00
Richard Newman 4c2230929e Turn Source into a protocol. Allow source->from to switch on attribute. r=nalexander 2016-08-05 13:03:48 -07:00
Nick Alexander c948902c52 Follow-up: Extract datomish.transact.explode.
This required pushing the ID literal out of transact.  I elected to put
them in DB, since literal allocation will end up in IDB eventually.
2016-08-04 17:54:50 -07:00
Nick Alexander d8c976c3ad Follow-up: split the monolith!
This was a little more tricky than might be expected because the
initialization process uses the transactor to bootstrap the database.
Since Clojure doesn't accept mutually recursive modules, this
necessitated a third module, namely "db-factory", which uses both "db"
and "transact".  While I was here, I started an "api" module, to paper
over the potentially complicated internal module structure for external
consumers.  In time, this "api" module may also grow CLJS-specific JS
transformations.
2016-08-04 17:54:32 -07:00
Nick Alexander 1853d57cba Follow-up: extract datomish.transact.bootstrap. 2016-08-04 16:37:14 -07:00
Nick Alexander 73b155cfdc Follow-up: don't accept schema when creating DB.
This schema was already ignored.
2016-08-04 16:34:48 -07:00
Nick Alexander 52af06ce28 Follow-up: lift in-transaction! to DB; make transactions exclusive. 2016-08-04 16:27:36 -07:00
Nick Alexander f408cfd314 Follow-up: lift helpers to IDB. 2016-08-04 16:20:48 -07:00
Nick Alexander 65d71a2598 Review comment: use :quoting :ansi parameter to honeysql/format. 2016-08-04 15:42:25 -07:00
Nick Alexander 9c2d81b9c4 Review comments: update SQLite schema definitions. 2016-08-04 15:42:25 -07:00
Nick Alexander 6a01885dff Review comment: simplify equality. 2016-08-04 14:51:17 -07:00
Nick Alexander c344de665e Review comments: lift (schema db) in a few places. 2016-08-04 14:50:05 -07:00
Nick Alexander 5ece60e3e9 Review comment: be more strict in lookup-ref?. 2016-08-04 14:45:41 -07:00
Nick Alexander 44db8116bf Handle nested maps, sequences in maps, and reverse references. 2016-08-04 14:26:20 -07:00
Nick Alexander d9a8cb0d6a Don't accept user-provided tx values.
This agrees with Datomic.  DataScript allows tx values, possibly to
allow reconstructing DBs from Datom streams, but appears to handle
user-provided tx values in the transactor inconsistently.
2016-08-04 14:26:20 -07:00
Nick Alexander 38545f6efc Add :tx and :txInstant to TxReport; accept :db/tx in transactor; allow setting :db/txInstant.
The implementation of :db/tx is special and may need to change over
time.  We add it as a special ident, with value the current transaction
entity ID, specified per-transaction.  This works well right now but
introduces some (internal) ordering requirements that may need to be
loosened.
2016-08-04 14:26:20 -07:00
Nick Alexander 417ae1ed92 Allow false; fail if too many components in vector given. 2016-08-04 14:26:20 -07:00
Nick Alexander f25838a1eb Support :db/fulltext true.
Internally, we use SQLite's FTS4 to maintain a fulltext_values table of
unique "text" values.  Fulltext indexed datoms have value v that is the
rowid into fulltext_values.  We manually maintain the map between rowid
and value in the transactor.

For convenience, we expose two views interpolating the real text values
into the datoms structure.
2016-08-04 14:26:20 -07:00
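The fulltext layout described in commit f25838a1eb might look roughly like the sketch below. It is an illustration only: the column names, the `index_fulltext` flag, the single `fulltext_datoms` view (the commit mentions two views), and the helper functions are assumptions, and it requires a SQLite build that includes FTS4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema.
    CREATE TABLE datoms (e INTEGER, a INTEGER, v, index_fulltext INTEGER DEFAULT 0);
    CREATE VIRTUAL TABLE fulltext_values USING fts4 (text);
    -- View that interpolates the real text back into the datom shape.
    CREATE VIEW fulltext_datoms AS
        SELECT d.e AS e, d.a AS a, f.text AS v
        FROM datoms d JOIN fulltext_values f ON d.v = f.rowid
        WHERE d.index_fulltext IS NOT 0;
""")

def intern_text(value):
    """Return the rowid for value, inserting it only if it is not yet stored.

    FTS4 does not enforce uniqueness, so the mapping is maintained by hand.
    """
    query = "SELECT rowid FROM fulltext_values WHERE text = ?"
    row = conn.execute(query, (value,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO fulltext_values (text) VALUES (?)", (value,))
        row = conn.execute(query, (value,)).fetchone()
    return row[0]

def add_fulltext_datom(e, a, value):
    """Store a datom whose v is the rowid of its text in fulltext_values."""
    with conn:
        conn.execute(
            "INSERT INTO datoms (e, a, v, index_fulltext) VALUES (?, ?, ?, 1)",
            (e, a, intern_text(value)))

def search(term):
    """Return the entities whose fulltext value matches term."""
    rows = conn.execute("""
        SELECT e FROM datoms
        WHERE index_fulltext IS NOT 0
          AND v IN (SELECT rowid FROM fulltext_values
                    WHERE fulltext_values MATCH ?)
        ORDER BY e""", (term,)).fetchall()
    return [row[0] for row in rows]

add_fulltext_datom(1, 100, "the quick brown fox")
add_fulltext_datom(2, 100, "the quick brown fox")
add_fulltext_datom(3, 100, "lazy dog")

value_count = conn.execute("SELECT count(*) FROM fulltext_values").fetchone()[0]
matches = search("quick")
view_rows = conn.execute("SELECT e, v FROM fulltext_datoms ORDER BY e").fetchall()
```

Two entities share one stored text value here, which is the point of keeping fulltext_values unique and referring to it by rowid.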
Nick Alexander 13f33a4915 Unify test pattern around "(... -after tx0)". 2016-08-04 14:26:20 -07:00
Nick Alexander 296c9cb436 Bootstrap DB schema; persist and restore schema from materialized views. 2016-08-04 14:26:20 -07:00
Nick Alexander 5d271454ac Don't replace keyword idents if the underlying field is itself :db.type/keyword.
This is necessary to assert new :db/ident datoms.
2016-08-04 14:26:20 -07:00
Nick Alexander a1e3158eac Add some structure and value checking. 2016-08-04 14:26:20 -07:00
Nick Alexander 6a8739bd2f Accept new schema fragments via :db.part/db :db.install/attribute. 2016-08-04 14:26:20 -07:00
Nick Alexander 9497d69b44 Respect :db/unique constraints; test upserts.
This version includes SQLite-level unique indexes; these should never be
needed.  I've included them as a fail-safe while testing; they'll help
us catch errors in the transaction layer above.
2016-08-04 14:26:20 -07:00
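A fail-safe unique index of the kind mentioned in commit 9497d69b44 could be sketched as below, using the per-(a, v) semantics that the later fix a1cc372d43 describes for :db.unique/value. The schema, the `unique_value` flag column, the index name, and the `add_datom` helper are assumptions for illustration, not the project's actual definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified datoms table.
    CREATE TABLE datoms (
        e INTEGER NOT NULL,
        a INTEGER NOT NULL,
        v BLOB NOT NULL,
        unique_value TINYINT NOT NULL DEFAULT 0
    );
    -- Partial index: uniqueness is enforced per (a, v) pair, and only for
    -- datoms whose attribute is flagged as unique.
    CREATE UNIQUE INDEX idx_datoms_unique_value
        ON datoms (a, v) WHERE unique_value IS NOT 0;
""")

def add_datom(e, a, v, unique_value=0):
    """Insert one datom; return False if the fail-safe index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO datoms (e, a, v, unique_value) VALUES (?, ?, ?, ?)",
                (e, a, v, unique_value))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    add_datom(1, 10, "x", unique_value=1),  # first (a=10, v="x")
    add_datom(2, 11, "x", unique_value=1),  # same v, different a: allowed
    add_datom(3, 10, "x", unique_value=1),  # same (a, v) pair: rejected
    add_datom(4, 12, "y"),                  # attribute not flagged unique
    add_datom(5, 12, "y"),                  # so a repeated (a, v) is allowed
]
```

If the transaction layer is correct, the rejected case should never reach SQLite; the index only turns a transactor bug into a visible error.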
Nick Alexander 80742242e2 Fix cljs 2016-08-04 14:26:20 -07:00