Commit graph

542 commits

Author SHA1 Message Date
Nick Alexander
c165972684 Post: Reject at parse-time reversed attributes in direct notation with bad values.
This is an optimization that trades rejecting inputs earlier at the
cost of expressive error messages.  It should be possible to recover
the error messages, however.

This will reject input like `[:db/{add,retract} v :attribute/_reversed NOT-AN-ENTITY]`.
2017-06-08 10:30:31 -07:00
Nick Alexander
59a710f80f Review comments: another test, add unreversed(). 2017-06-08 10:30:31 -07:00
Nick Alexander
eb220528bf Post: Indent. 2017-06-08 10:30:31 -07:00
Nick Alexander
d88823e7c4 Handle :attribute/_reverse in transactor. Fixes #187
There are two broad approaches:

1) Handle reverse attribute notation dynamically, in the style that
   Datomic does.  This is the most flexible, but it's not a good fit
   given that we produce strongly typed output from the parser.
   Strongly typed input to the transactor has had many benefits, so I
   don't want to roll it back for a relatively unimportant feature
   like reverse notation -- especially not since Mentat does not
   require :db.install/_attribute to modify schema attributes.

2) Handle reverse attribute in the parser itself, so that we can
   produce strongly typed parser output while restricting the input.
   I implemented this first and discovered that it's very difficult to
   give sensible error messages in common cases.

In any case, the bulk of the code is the same between the two
approaches, and I wrote the tests for the dynamic version (with error
output), so that's what I'm rolling with.

This patch preserves the existing indentation, to highlight the
differences.  The next patch will indent.
2017-06-08 10:30:31 -07:00
Nick Alexander
0be78cf956 Pre: Extract entity_*_into_term_* helpers. 2017-06-08 10:30:31 -07:00
Nick Alexander
4b0881a957 Pre: Push bookkeeping into an InProcess struct. 2017-06-08 10:30:31 -07:00
Nick Alexander
05129cefbb Pre: Use ValueType rather than Attribute to convert edn::Value to TypedValue.
This is expedient now, but might require work in the future to achieve
better error messages.
2017-06-08 10:30:31 -07:00
Nick Alexander
2650fe163d Pre: Intern lookup_ref by reference. 2017-06-08 10:30:31 -07:00
Nick Alexander
a4fc04ea86 Pre: Crib map_{left,right} for Either. 2017-06-08 10:30:31 -07:00
Richard Newman
634b7a816b Dedupe SQL arguments. (#471) r=nalexander 2017-06-07 11:55:42 -07:00
Richard Newman
2c52346999 Review comment: generalize from Uuid SQL arguments to byte arrays. 2017-06-07 11:55:05 -07:00
Richard Newman
ed04083ceb Dedupe SQL arguments.
This isn't perfect -- we still need to clone in a couple of cases -- but it avoids us
passing duplicate strings down into SQLite whenever the same value is mentioned more
than once in a query.
2017-06-07 11:46:18 -07:00
Richard Newman
7fc0848cb0 Fix typo in README. 2017-06-06 19:01:27 +00:00
Richard Newman
a88375fc15 Update README for master switchover. 2017-06-06 11:10:49 -07:00
Nick Alexander
8a8fcedd1c Parse without copying streams. Fixes #436. #444. r=rnewman
We were accidentally quadratic, copying the tails of owned Vec
instances around.  This brings us down to the expected linear runtime.
2017-05-18 10:20:06 -07:00
Nick Alexander
409a2ea78f Post: Use choice instead of or. 2017-05-18 10:17:13 -07:00
Nick Alexander
d1ac752de6 Parse without copying; parse keyword maps using macros.
This is a big commit, but it breaks into two conceptual pieces.  The
first is to "parse without copying".  We replace a stream of an owned
collection of edn::ValueAndSpan and instead have a stream of a
borrowed collection of &edn::ValueAndSpan references.  (Generally,
this is represented as an iterator over a slice, but it can be over
other things too.)  Cloning such iterators is constant time, which
improves on cloning an owned collection of edn::ValueAndSpan, which is
linear time in the length of the collection and additional time
depending on the complexity of the EDN values.

The second conceptual piece is to parse keyword maps using a special
parser and a macro to build the parser implementations.  Before, we
created a new edn::ValueAndSpan::Map to represent a keyword map in
vector form; since we're working with &edn::ValueAndSpan references
now, we can't create an &edn::ValueAndSpan reference with an
appropriate lifetime.  Therefore we generalize the concept of
iteration slightly and turn keyword maps in map form into linear
iterators by flattening the value maps.  This is a potentially
obscuring transformation, so we have to take care to protect against
some failure cases.  (See the comments and the tests in the code.)

After these changes, parsing using `combine` is linear time (and
reasonably fast).
2017-05-18 10:17:13 -07:00
Nick Alexander
4fa57942d3 Pre: Move macros out of lib.rs.
It seems very subtle to use macros in tests: I needed to separate the
modules in order to control load order to get everything to work.
2017-05-18 10:17:13 -07:00
Richard Newman
953f9f7734 Remove server instructions from README. 2017-05-15 13:05:24 +00:00
Richard Newman
c95ec13ffe Begin moving web server to a separate crate. (#448) r=bgrins
This doesn't yet introduce a working Cargo.toml for 'mentatweb', but it
does allow RLS to build correctly without errors, and it reduces the
core library's dependency space, which is more important in the short
term.
2017-05-10 02:25:59 -07:00
Richard Newman
3d4615fb8c Allow opening a DB. (#462) r=fluffyemily 2017-05-09 09:42:35 -07:00
Richard Newman
1dc8a3eaa0 Add a test for exported symbols. 2017-05-03 15:57:09 -07:00
Richard Newman
059b9d1182 Expose mentat_core::{TypedValue,ValueType} and conn::{Conn,Metadata}. (#443) 2017-05-03 15:49:29 -07:00
Richard Newman
523d5ea5f1 Bump dependency versions. r=bgrins. (#441) 2017-05-03 12:53:16 -07:00
Richard Newman
daca8def57 UUIDs and instants. Fixes #44, #45, #426, #427. (#438) r=nalexander
* Pre: unused import in translate.rs.

* Part 2: take a dependency on rusqlite for query arguments.

* Part 1: flatten V2 schema into V1. Add UUID and URI.

Bump expected ident and bootstrap datom count in tests.

* Part 5: parse edn::Value::Uuid.

* Part 3: extend ValueType and TypedValue to include Uuid.

* Part 4: add Uuid to query arguments.

* Part 6: extend db to support Uuid.

* Part 8: add a tx-parser test for #f NaN and #uuid.

* Part 7: parse and algebrize UUIDs in queries.

* Part 1: parse #inst in EDN and throughout query engine.

* Part 3: handle instants in db.

* Part 2: instants never matches integers in queries.

* Part 4: use DateTime for tx_instants.

* Add a test for adding and querying UUIDs and instants.

* Review comments.
2017-04-28 20:11:55 -07:00
Emily Toop
bd389d2f0d Parse and Algebrize not & not-join. (#302) (Closes #303, #389, #422 ) r=rnewman
* Part 1 - Parse `not` and `not-join`

* Part 2 - Validate `not` and `not-join` pre-algebrization

* Address review comments rnewman.
* Remove `WhereNotClause` and populate `NotJoin` with `WhereClause`.
* Fix validation for `not` and `not-join`, removing tests that were invalid.
* Address rustification comments.

* Rebase against `rust` branch.

* Part 3 - Add required types for NotJoin.
* Implement `PartialEq` for
`ConjoiningClauses` so `ComputedTable` can be included inside `ColumnConstraint::NotExists`

* Part 4 - Implement `apply_not_join`

* Part 5 - Call `apply_not_join` from inside `apply_clause`

* Part 6 - Translate `not-join` into `NOT EXISTS` SQL

* Address review comments.

* Rename `projected` to `unified` to better describe the fact that we are not projecting any variables.
* Check for presence of each unified var in either `column_bindings` or `input_bindings` and bail if not there.
* Copy over `input_bindings` for each var in `unified`.
* Only copy over the first `column_binding` for each variable in `unified` rather than the whole list.
* Update tests.

* Address review comments.

* Make output from Debug for NotExists more useful

* Clear up misunderstanding. Any single failing clause in the not will cause the entire not to be considered empty

* Address review comments.

* Remove Limit requirement from cc_to_exists.
* Use Entry.or_insert instead of matching on the entry to add to column_bindings.
* Move addition of value_bindings to before apply_clauses on template.
* Tidy up tests with some variable reuse.
* Addressed nits,

* Address review comments.

* Move addition of column_bindings to above apply_clause.
* Update tests.

* Add test to ensure that unbound vars fail

* Improve test for unbound variable to check for correct variable and error

* address nits
2017-04-28 10:44:11 +01:00
Richard Newman
e64ee5864e Force newline at end of file in VSCode config. 2017-04-26 10:09:50 -07:00
Richard Newman
19fc7cddf1 [query] Widen known_types correctly in complex or. (#424) r=nalexander
* Part 1: define ValueTypeSet.

We're going to use this instead of `HashSet<ValueType>` so that we can clearly express
the empty set and the set of all types, and also to encapsulate a switch to `EnumSet`."

* Part 2: use ValueTypeSet.

* Part 3: fix type expansion.

* Part 4: add a test for type extraction from nested `or`.

* Review comments.

* Review comments: simplify ValueTypeSet.
2017-04-24 14:15:26 -07:00
Richard Newman
bc63744aba Add :limit to queries (#420) r=nalexander
* Pre: put query parts in alphabetical order.
* Pre: rename 'input' to 'query' in translate tests.
* Part 1: parse :limit.
* Part 2: validate and escape variable parameters in SQL.
* Part 3: algebrize and translate limits.
2017-04-19 16:16:19 -07:00
Brian Grinstead
cd860ae68d Add an initial benchmark for the tx-parser crate. (#406) (#413) r=nalexander 2017-04-19 13:54:24 -07:00
Richard Newman
bffefe7e6b Review comments for #418. 2017-04-18 13:50:58 -07:00
Richard Newman
aa14a71019 Parse :in, pass inputs through to querying. (#418) r=nalexander
This commit downgrades error_chain to 0.8.1 in order to fix trait bounds
on errors.
2017-04-18 13:20:00 -07:00
Nick Alexander
ff0147e89c Review comments: downgrade to error-chain 0.8.1 for Send + Sync bound; use combine::primitive::Error. 2017-04-18 13:19:50 -07:00
Richard Newman
60c082b61e Part 4: pass inputs through algebrizing and execution. (#418)
This also adds a test that an `UnboundVariables` error is raised if a
variable mentioned in the `:in` clause isn't bound.
2017-04-18 13:19:50 -07:00
Richard Newman
dfc846e483 Part 3: define keep_intersected_keys.
We'll use this to drop unneeded values from input maps, if lazy callers
reuse a general-purpose map for multiple queries.
2017-04-18 13:19:50 -07:00
Richard Newman
651308f721 Part 2: define a type to encapsulate query inputs.
This is for two reasons.

Firstly, we need to track the types of inputs, their values, and also
the input variables; adding a struct gives us a little more clarity.

Secondly, when we come to implement prepared statements, we'll be
algebrizing queries without having the values available. We'll be able
to do a better job of algebrizing, and also do more validating, if we
allow callers to specify the types of variables in advance, even if the
values aren't known.
2017-04-18 13:19:50 -07:00
Richard Newman
a9a82ea1a7 Part 1: parse :in.
We also at this point switch from using `Vec<Variable>` to
`BTreeSet<Variable>`. This allows us to guarantee no duplicates later;
we'll reject duplicates at parse time.
2017-04-18 13:19:50 -07:00
Richard Newman
01af45ab3f Pre: define Display for ValueType. 2017-04-18 13:19:50 -07:00
Richard Newman
8ddbc834ae Pre: take Variables instead of Strings in public API, for now. 2017-04-18 13:19:50 -07:00
Richard Newman
5718ce0155 Pre: add two checks to translate tests to fix unused var warning. 2017-04-18 13:19:50 -07:00
Richard Newman
5cd53aff44 Pre: unused imports. 2017-04-18 13:19:50 -07:00
Brian Grinstead
99b7e89116 Make struct Conn public. (#419) r=rnewman 2017-04-18 11:04:44 -07:00
Richard Newman
35d73d5541 Implement :order. (#415) (#416) r=nalexander
This adds an `:order` keyword to `:find`.

If present, the results of the query will be an ordered set, rather than
an unordered set; rows will appear in an ordered defined by each
`:order` entry.

Each can be one of three things:

- A var, `?x`, meaning "order by ?x ascending".
- A pair, `(asc ?x)`, meaning "order by ?x ascending".
- A pair, `(desc ?x)`, meaning "order by ?x descending".

Values will be ordered in this sequence for asc, and in reverse for desc:

1. Entity IDs, in ascending numerical order.
2. Booleans, false then true.
3. Timestamps, in ascending numerical order.
4. Longs and doubles, intermixed, in ascending numerical order.
5. Strings, in ascending lexicographic order.
6. Keywords, in ascending lexicographic order, considering the entire
   ns/name pair as a single string separated by '/'.

Subcommits:

Pre: make bound_value public.
Pre: generalize ErrorKind::UnboundVariable for use in order.
Part 1: parse (direction, var) pairs.
Part 2: parse :order clause into FindQuery.
Part 3: include order variables in algebrized query.

We add order variables to :with, so we can reuse its type tag projection
logic, and so that we can phrase ordering in terms of variables rather
than datoms columns.

Part 4: produce SQL for order clauses.
2017-04-17 11:30:31 -07:00
Richard Newman
64acc6a7ee Support :with (#311) (#414) r=nalexander
* Pre: refactor projector code.
* Part 1: maintain 'with' variables in AlgebrizedQuery.
* Part 2: include necessary 'with' variables in SQL projection list.

The test produces projection elements for `:with`, even though there are
no aggregates in the query. This test will need to be adjusted when we
optimize this away!
2017-04-17 09:23:55 -07:00
Richard Newman
d8f761993d Implement complex or joins. (#410) r=nalexander 2017-04-12 19:23:40 -07:00
Richard Newman
758ab8b476 Part 5: add more tests for complex or. 2017-04-12 19:21:56 -07:00
Richard Newman
bca8b7e322 Part 4: correct projection of type tags in the outermost projector. 2017-04-12 19:21:56 -07:00
Richard Newman
d8075aa07d Part 3: finish expansion and translation of complex or.
This commit turns complex `or` -- `or`s in which not all variables are
unified, or in which not all arms are the same shape -- into a
computed table.

We do this by building a template CC that shares some state with the
destination CC, applying each arm of the `or` to a copy of the template
as if it were a standalone query, then building a projection list and
creating a `ComputedTable::Union`. This is pushed into the destination
CC's `computed_tables` list.

Finally, the variables projected from the UNION are bound in the
destination CC, so that unification occurs, and projection of the
outermost query can use bindings established by the `or-join`.

This commit includes projection of type codes from heterogeneous `UNION`
arms: we compute a list of variables for which a definite type is
unknown in at least one arm, and force all arms to project either a type
tag column or a fixed type. It's important that each branch of a UNION
project the same columns in the same order, hence the projection of
fixed values.

The translator is similarly extended to project the type tag column name
or the known value_type_tag to support this.

Review comment: clarify union type extraction.
2017-04-12 19:21:45 -07:00
Richard Newman
08d2c613a4 Part 2: expand the definition of a table to include computed tables.
This commit:

- Defines a new kind of column, distinct from the eavt columns in
  `DatomsColumn`, to model the rows projected from subqueries. These
  always name one of two things: a variable, or a variable's type tag.
  Naturally the two cases are thus `Variable` and `VariableTypeTag`.
  These are cheap to clone, given that `Variable` is an `Rc<String>`.
- Defines `Column` as a wrapper around `DatomsColumn` and
  `VariableColumn`. Everywhere we used to use `DatomsColumn` we now
  allow `Column`: particularly in constraints and projections.
- Broadens the definition of a table list in the intermediate
  "query-sql" representation to include a SQL UNION. A UNION is
  represented as a list of queries and an alias.
- Implements translation from a `ComputedTable` to the query-sql
  representation. In this commit we only project vars, not type tags.

Review comment: discuss bind_column_to_var for ValueTypeTag.
Review comment: implement From<Vec<T>> for ConsumableVec<T>.
2017-04-12 19:21:33 -07:00
Richard Newman
7948788936 Part 1: define ComputedTable.
Complex `or`s are translated to SQL as a subquery -- in particular, a
subquery that's a UNION. Conceptually, that subquery is a computed
table: `all_datoms` and `datoms` yield rows of e/a/v/tx, and each
computed table yields rows of variable bindings.

The table itself is a type, `ComputedTable`. Its `Union` case contains
everything a subquery needs: a `ConjoiningClauses` and a projection
list, which together allow us to build a SQL subquery, and a list of
variables that need type code extraction. (This is discussed further in
a later commit.)

Naturally we also need a way to refer to columns in a computed table.
We model this by a new enum case in `DatomsTable`, `Computed`, which
maintains an integer value that uniquely identifies a computed table.
2017-04-12 11:13:58 -07:00