Commit graph

97 commits

Author SHA1 Message Date
Greg Burd
85f0497cf5 WIP 2020-01-24 11:45:54 -05:00
Greg Burd
b838259a8b rustfmt 2020-01-23 13:16:19 -05:00
Greg Burd
ef1c196516 Update pretty_print dependency and fix issues. 2020-01-23 11:16:14 -05:00
Greg Burd
9421a5c3bb Fixes some mistakes when updating the grammar. 2020-01-23 11:16:14 -05:00
Greg Burd
60c65033b2 Specify dependency versions without patch component unless necessary. 2020-01-23 11:15:49 -05:00
Greg Burd
b2f92b8461 Update to 2018 edition of Rust (1.42). Fix and format code. Update dependencies. Fix tests. 2020-01-16 10:58:21 -05:00
sc13-bioinf
fba9568d44 Allow plus symbol "+" in symbol names. (#821) r=nalexander 2018-09-05 09:28:32 -07:00
Nick Alexander
e362ca6213 Part 1: Allow to clone useful query structures. 2018-07-05 16:45:42 -07:00
Nick Alexander
1c0602fa00 Part 1: Add {From,To}Millis.
I think this is just oversight.  Generally, we should anticipate what
our consumers need to do to interact with Mentat, and producing milli-
and micro-second timestamps is part of that need.
2018-07-05 16:45:42 -07:00
Nick Alexander
1cb1847aa6 Part 5: Make existing TermBuilder actually build Entity instances.
There are a few tricky details to call out here.  The first is the
`TransactableValueMarker` trait.  This is strictly a marker (like
`Sized`, for example) to give some control over what types can be used
as value types in `Entity` instances.  This expression is needed due
to the network of `Into` and `From` relations between the parts of
valid `Entity` instances.  This allows to drop the `IntoThing`
work-around trait and use the established patterns.  (Observe that
`KnownEntid` makes this a little harder, due to the cross-crate
consistency restrictions.)

The second is that we can get rid `{add,retract}_kw`, since the
network of relations expresses the coercions directly.

The third is that this commit doesn't change the name `TermBuilder`,
even though it is now building `Entity` instances.  This is because
there's _already_ an `EntityBuilder` which fixes the `EntityPlace`.
It's not clear whether the existing entity building interface should
be removed or whether both should be renamed.  That can be follow-up.
2018-07-05 16:33:51 -07:00
Nick Alexander
76507623ac Part 4: Prepare EDN Entity type for interning tempids during parsing.
This is all part of moving the entity builder away from building term
instances and toward building entity instances.  One of the nice
things that the existing term interface does is allow consumers to use
lightweight reference counted tempid handles; I don't want to lose
that, so we'll build it into the entity data structures directly.
2018-07-05 11:17:20 -07:00
Nick Alexander
106d6fae11 Part 3: Implement Deref and DerefMut for InternSet.
This pattern is generally how newtype wrappers (like `struct
Foo(Bar)`) are implemented in Rust.
2018-07-05 11:16:55 -07:00
Nick Alexander
02a163a10f Part 2: Use ValueRc in InternSet.
We haven't observed performance issues using `Arc` instead of `Rc`,
and we want to be able to include things that are interned (including,
soon, `TempId` instances) in errors coming out of the
transactor.  (And `Rc` isn't `Sync`, so it can't be included in errors
directly.)
2018-07-05 11:16:53 -07:00
Nick Alexander
87f850a44e Part 1: Move intern_set into edn crate.
It's not great to keep lifting functionality higher and higher up the
crate hierarchy, but we really do want to intern while we parse.
Eventually, I expect that we will split the `edn` crate into `types`
and `parsing`, and the `types` crate can depend on a more efficient
interning dependency.
2018-07-05 11:16:48 -07:00
Thom
72a9b302f9
Rename or delete things so that there is only one type named Entid (#768)
* Delete the (apparently unused) EntId

* Rename edn's Entid to EntidOrIdent to avoid confusion with the Entid that's actually an i64

* Fix travis beta bustage (This is actually unrelated to entids, but is a trivial fix nonetheless)
2018-06-26 16:34:18 -07:00
Nick Alexander
cfed968514 Review comments. 2018-06-04 15:21:27 -07:00
Nick Alexander
47441f56dc Part 5: Push FindQuery into query-algebrizer; structure errors.
This is a big deck-chair re-arrangement.  This puts FindQuery into
query-algebrizer and puts the validation from ParsedFindQuery ->
FindQuery their as well.

Some tests were re-homed for this.

In addition, the little-used maplit crate dependency was replaced with
inline expressions.
2018-06-04 15:04:39 -07:00
Nick Alexander
09f1d633b5 Part 4: Parse queries with rust-peg.
There's an unfortunate conflation here between implementing the query
parser in `rust-peg` and moving some validation that now happens at
parse time to happen later.  The result is that we introduce
`ParsedFindQuery` as a less-processed `FindQuery`, and that we only
use string errors (which is all `rust-peg` supports) instead of the
structured errors in query-parser's errors module.  The next commit
will address this, on the road to removing the `query-parser` module
entirely.
2018-06-04 15:04:39 -07:00
Nick Alexander
a8073056f2 Part 3: Move query into edn.
It's unfortunate to squash two crates together like this, but it's the
best option.
2018-06-04 15:04:37 -07:00
Nick Alexander
a4a8892309 Part 3a: Move file to preserve blame. 2018-06-04 14:56:56 -07:00
Nick Alexander
ad9a1394a3 Part 1: Push ValueRc and friends into edn crate.
This is a pre-requisite for moving the existing `combine`-based parser
to use `rust-peg` -- part of the push to use `rust-peg` for all parsing
started in https://github.com/mozilla/mentat/pull/681.  We need the
types for the parsed structure "very early", and the `edn` crate is
the earliest such crate.

This is an unfortunate destruction of boundaries between parts of the
system, but it's the best way we have to achieve this right now.
2018-06-04 14:52:51 -07:00
Nick Alexander
729fe59578 [edn] Pre: Rename keyword to namespaced_keyword.
The `Keyword` type evolved to become more general: we now use the one
type for both :regular and :name/spaced keywords.  This changes
reflects the new generality.
2018-06-04 14:52:51 -07:00
Richard Newman
01db9232b4
Include namespace-separating solidus in NamespaceableName; improve type handling around ground (#713) r=nalexander
* Include the namespace-separating solidus in NamespaceableName.
* Use type annotations when deciding how to process ambiguous ground input.
* Include simple patterns in the type extraction phase of pattern application. (#705)
* Review comment.
* Add a test.
2018-05-29 16:45:53 +02:00
Richard Newman
b2e98f44f6
Generalize Entity by value type. (#701) (#691) r=rnewman
* Part 3: Parameterize Entity by value type.

This isn't quite right, because after parsing, we shouldn't care
about` `edn::ValueAndSpan`, we should care only about edn::Value.
However, I think we can drop `ValueAndSpan` entirely if we just use
`rust-peg` (and its simpler error messages) rather than a mix of
`rust-peg` and `combine`.

In any case, this paves the way to transacting `Entity<TypedValue>`,
which is a nice step towards building general entities.

* Part 1: Add AttributePlace.

* Part 2: Name other places EntityPlace and ValuePlace.

Now we're consistent and closer to self-documenting.  Both matter more
as we expose `Entity` as the thing to build for programmatic usage.

* Part 4: Allow Ident and TempId in ValuePlace.

The parser will never produce these, since determining whether an
integer/keyword or string is an ident or a tempid, respectively, in
the value place requires the schema.

But a builder that produces `Entity` instances directly will want to
produce these.
2018-05-15 00:43:07 -07:00
Richard Newman
3cba87c74b Allow pull aliases to be non-namespaced. (#694) r=nalexander 2018-05-14 10:45:48 -07:00
Nick Alexander
4fde4fe0a6 Bustage fixes: compile on stable; avoid unused variable warning. 2018-05-11 10:22:57 -07:00
Richard Newman
3dc68bcd38 Combine NamespacedKeyword and Keyword. (#689) r=nalexander
* Make properties on NamespacedKeyword/NamespacedSymbol private

* Use only a single String for NamespacedKeyword/NamespacedSymbol

* Review comments.

* Remove unsafe code in namespaced_name.

Benchmarking shows approximately zero change.

* Allow the types of ns and name to differ when constructing a NamespacedName.

* Make symbol namespaces optional.

* Normalize names of keyword/symbol constructors.

This will make the subsequent refactor much less painful.

* Use expect not unwrap.

* Merge Keyword and NamespacedKeyword.
2018-05-11 09:52:17 -07:00
Nick Alexander
c8f74fa41b [edn] Round-trip instants. (#686) (#687) r=rnewman
First, the parser had a small grouping bug where-by it wouldn't parse
Z as timezone correctly.  Second, we weren't printing instants in the format
that we parse.
2018-05-11 02:11:04 -07:00
Nick Alexander
cbffe5e545 Use rust-peg for tx parsing.
There are few reasons to do this:

- it's difficult to add symbol interning to combine-based parsers like
  tx-parser -- literally every type changes to reflect the interner,
  and that means every convenience macro we've built needs to chagne.
  It's trivial to add interning to rust-peg-based parsers.

- combine has rolled forward to 3.2, and I spent a similar amount of
  time investigating how to upgrade tx-parser (to take advantage of
  the new parser! macros in combine that I think are necessary for
  adapting to changing types) as I did just converting to rust-peg.

- it's easy to improve the error messages in rust-peg, where-as I have
  tried twice to improve the nested error messages in combine and am
  stumped.

- it's roughly 4x faster to parse strings directly as opposed to
  edn::ValueAndSpan, and it'll be even better when we intern directly.
2018-05-10 10:24:05 -07:00
Nick Alexander
36eca0bfb0 (chore) Pre: Use the same features of uuid throughout the project. 2018-04-19 09:58:41 -07:00
Richard Newman
df58de52f4
Correctly parse and unescape quotes etc. inside EDN strings. (#434) (#589) 2018-03-15 07:13:27 -07:00
Richard Newman
1817ce7c0b Performance and cleanup. r=emily
* Use fixed-size arrays for bootstrap datoms, not vecs.
* Wide-ranging cleanup.

    This commit:
    - Deletes some dead code.
    - Marks some functions only used by tests as cfg(test).
    - Adds pub(crate) to a bunch of functions.
    - Cleans up a few other nits.
2018-03-06 09:03:00 -08:00
Richard Newman
9b23cf3945
Speed up EDN parser (fixes #445) (#581) r=nalexander
Fixes from @kevinmehall.

* Prefer character sets over backtracking in the EDN parser.
* Avoid duplicate effort when parsing floats in the EDN parser.
* Clean up duplicate position tracking code.

This turns out to have little performance impact, but makes the grammar
much cleaner.

* Fix EDN work to pass tests with correct numeric precedence.
2018-03-05 20:33:51 -08:00
Grisha Kruglov
84f29676e8
"Unchanged server" uploader flow (#543) r=rnewman
* Remove unused struct from tx_processor

* Derive serialize & deserialize for TypedValue

* First pass of uploader flow + feedback
2018-02-09 09:55:19 -08:00
Richard Newman
2614f498be Ergonomics improvements, including a kw macro. (#537) r=emily
* Add TypedValue::instant(micros).
* Add From<f64> for TypedValue.
* Add lookup_values_for_attribute to Conn.
* Add q_explain to Queryable.
* Expose an iterator over FindSpec's columns.
* Export edn from mentat crate. Export QueryExecutionResult.
* Implement Display for Variable and Element.
* Introduce a `kw` macro.

    This allows you to write:

    ```rust
    kw!(:foo/bar)
    ```

    instead of

    ```rust
    NamespacedKeyword::new("foo", "bar")
    ```

    … and it's more efficient, too.

Add `mentat::open`, eliminate use of `mentat_db` in some places.
2018-02-01 09:27:23 -08:00
Thom
9740cafdbd Automatically remove trailing whitespace from text files. (#527) r=rnewman
This was done using the following shell script:

```
find . -type f -not -path "*target*" \
       '(' -name '*.rs' -o -name '*.md' -o -name '*.toml' ')' -print0 | \
    xargs -0 sed -i '' -E 's/[[:space:]]*$//'
```

Which is admittedly imperfect, but manages to hit everything that was a problem in this repo.
2018-01-19 21:21:04 -06:00
Grisha Kruglov
c61bc79b99 Sync metadata schema and SyncMetadataClient. (#502) r=rnewman 2017-12-13 14:19:05 -06:00
Richard Newman
c600152d78
Update some dependencies. (#492) r=etoop
* Update some dependencies.

* Update rusqlite to 0.12.

* Update error-chain to a forked version that implements Sync.

* Fix some compiler warnings.

* Remove unused imports in tests.

* Parse errors no longer naturally print with the expected symbol.
2017-11-21 16:24:08 +00:00
Richard Newman
aa5f569df5 There are one million microseconds in a second, not one hundred thousand. (#480) r=fluffyemily 2017-06-16 08:00:14 -07:00
Nick Alexander
59a710f80f Review comments: another test, add unreversed(). 2017-06-08 10:30:31 -07:00
Nick Alexander
d1ac752de6 Parse without copying; parse keyword maps using macros.
This is a big commit, but it breaks into two conceptual pieces.  The
first is to "parse without copying".  We replace a stream of an owned
collection of edn::ValueAndSpan and instead have a stream of a
borrowed collection of &edn::ValueAndSpan references.  (Generally,
this is represented as an iterator over a slice, but it can be over
other things too.)  Cloning such iterators is constant time, which
improves on cloning an owned collection of edn::ValueAndSpan, which is
linear time in the length of the collection and additional time
depending on the complexity of the EDN values.

The second conceptual piece is to parse keyword maps using a special
parser and a macro to build the parser implementations.  Before, we
created a new edn::ValueAndSpan::Map to represent a keyword map in
vector form; since we're working with &edn::ValueAndSpan references
now, we can't create an &edn::ValueAndSpan reference with an
appropriate lifetime.  Therefore we generalize the concept of
iteration slightly and turn keyword maps in map form into linear
iterators by flattening the value maps.  This is a potentially
obscuring transformation, so we have to take care to protect against
some failure cases.  (See the comments and the tests in the code.)

After these changes, parsing using `combine` is linear time (and
reasonably fast).
2017-05-18 10:17:13 -07:00
Richard Newman
523d5ea5f1 Bump dependency versions. r=bgrins. (#441) 2017-05-03 12:53:16 -07:00
Richard Newman
daca8def57 UUIDs and instants. Fixes #44, #45, #426, #427. (#438) r=nalexander
* Pre: unused import in translate.rs.

* Part 2: take a dependency on rusqlite for query arguments.

* Part 1: flatten V2 schema into V1. Add UUID and URI.

Bump expected ident and bootstrap datom count in tests.

* Part 5: parse edn::Value::Uuid.

* Part 3: extend ValueType and TypedValue to include Uuid.

* Part 4: add Uuid to query arguments.

* Part 6: extend db to support Uuid.

* Part 8: add a tx-parser test for #f NaN and #uuid.

* Part 7: parse and algebrize UUIDs in queries.

* Part 1: parse #inst in EDN and throughout query engine.

* Part 3: handle instants in db.

* Part 2: instants never matches integers in queries.

* Part 4: use DateTime for tx_instants.

* Add a test for adding and querying UUIDs and instants.

* Review comments.
2017-04-28 20:11:55 -07:00
Nick Alexander
5369f03464 Improve parsing of nested edn::ValueAndSpan streams. r=rnewman (#393)
* Pre: Expose more in edn.

* Pre: Make it easier to work with ValueAndSpan.

with_spans() is a temporary hack, needed only because I don't care to
parse the bootstrap assertions from text right now.

* Part 1a: Add `value_and_span` for parsing nested `edn::ValueAndSpan` instances.

I wasn't able to abstract over `edn::Value` and `edn::ValueAndSpan`;
there are multiple obstacles.  I chose to roll with
`edn::ValueAndSpan` since it exposes the additional span information
that we will want to form good error messages in the future.

* Part 1b: Add keyword_map() parsing an `edn::Value::Vector` into an `edn::Value::map`.

* Part 1c: Add `Log`/`.log(...)` for logging parser progress.

This is a terrible hack, but it sure helps to debug complicated nested
parsers.  I don't even know what a principled approach would look
like; since our parser combinators are so frequently expressed in
code, it's hard to imagine a data-driven interpreter that can help
debug things.

* Part 2: Use `value_and_span` apparatus in tx-parser/.

I break an abstraction boundary by returning a value column
`edn::ValueAndSpan` rather than just an `edn::Value`.  That is, the
transaction processor shouldn't care where the `edn::Value` it is
processing arose -- even we care to track that information we should
bake it into the `Entity` type.  We do this because we need to
dynamically parse the value column to support nested maps, and parsing
requires a full `edn::ValueAndSpan`.  Alternately, we could cheat and
fake the spans when parsing nested maps, but that's potentially
expensive.

* Part 3: Use `value_and_span` apparatus in query-parser/.

* Part 4: Use `value_and_span` apparatus in root crate.

* Review comment: Make Span and SpanPosition Copy.

* Review comment: nits.

* Review comment: Make `or` be `or_exactly`.

I baked the eof checking directly into the parser, rather than using
the skip and eof parsers.  I also took the time to restore some tests
that were mistakenly commented out.

* Review comment: Extract and use def_matches_* macros.

* Review comment: .map() as late as possible.
2017-04-06 10:06:28 -07:00
Nick Alexander
4b874deae1 Lookup refs, nested vector values, map notation. Fixes #180, fixes #183, fixes #284. (#382) r=rnewman
* Pre: Fix error in parser macros.

* Pre: Make test unwrapping more verbose.

* Pre: Make lookup refs be (lookup-ref a v) in the entity position.

This has the advantage of being explicit in all situations and
unambiguous at parse-time.  This choice agrees with the Clojure
implementation but not with Datomic.  Datomic treats [a v] as a lookup
ref, is ambiguous at parse-time, and is disambiguated in ways I do not
understand at transaction time.  We mooted making lookup refs [[a v]]
and outlawing nested value vectors in transactions, but after
implementing that approach I decided it was better to handle lookup
refs at parse time and therefore outlawing nested value vectors is not
necessary.

* Handle lookup refs in the entity and value columns. Fixes #183.

* Pre 0a: Use a stack instead of into_iter.

* Pre 0b: Dedent.

* Pre 0c: Handle `e` after `v`.

This allows to use the original `e` while handling `v`.

* Explode value lists for :db.cardinality/many attributes. Fixes #284.

* Parse and accept map notation. Fixes #180.

* Pre: Modernize add() and retract() into one add_or_retract().

* Pre: Add is_collection and is_atom to edn::Value.

* Pre: Differentiate atoms from lookup-refs in value position.

Initially, I expected to accept arbitrary edn::Value instances in the
value position, and to differentiate in the transactor.  However, the
implementation quickly became a two-stage parser, since we always
wanted to parse the resulting value position into some other known
thing using the tx-parser.  To save calls into the parser and to allow
the parser to move forward with a smaller API surface, I push as much
of this parsing as possible into the initial parse.

* Pre: Modernize entities().

* Pre: Quote edn::Value::Text in Display.

* Review comment: Add and use edn::Value::into_atom.

* Review comment: Use skip(eof()) throughout.

* Review comment: VecDeque instead of Vec.

* Review comment: Part 0: Rename TempId to TempIdHandle.

* Review comment: Part 1: Differentiate internal and external tempids.

This breaks an abstraction boundary by pushing the Internal/External
split up to the Entity level in tx/ and tx-parser/.  This just makes
it easier to explode Entity map notation instances into Entity
instances, taking an existing External tempid :db/id or generating a
new Internal tempid as appropriate.  To do this without breaking the
abstraction boundary would require adding flexibility to the
transaction processor: we'd need to be able to turn Entity instances
into some internal enum and handle the two cases independently.  It
wouldn't be too hard, but this reduces the combinatorial type
explosion.
2017-03-27 16:30:04 -07:00
Richard Newman
85f3b79f75 Support a limited set of '.'-prefixed non-keyword symbols. (#352) r=nalexander
This commit allows `.` and `...` to parse correctly as `PlainSymbol`.

Tests in edn, query-translator, and the top level have been added.
2017-03-06 15:01:19 -08:00
Nick Alexander
dcd9bcb1ce Extract partial storage abstraction; use error-chain throughout. Fixes #328. r=rnewman (#341)
* Pre: Drop unneeded tx0 from search results.

* Pre: Don't require a schema in some of the DB code.

The idea is to separate the transaction applying code, which is
schema-aware, from the concrete storage code, which is just concerned
with getting bits onto disk.

* Pre: Only reference Schema, not DB, in debug module.

This is part of a larger separation of the volatile PartitionMap,
which is modified every transaction, from the stable Schema, which is
infrequently modified.

* Pre: Fix indentation.

* Extract part of DB to new SchemaTypeChecking trait.

* Extract part of DB to new PartitionMapping trait.

* Pre: Don't expect :db.part/tx partition to advance when tx fails.

This fails right now, because we allocate tx IDs even when we shouldn't.

* Sketch a db interface without DB.

* Add ValueParseError; use error-chain in tx-parser.

This can be simplified when
https://github.com/Marwes/combine/issues/86 makes it to a published
release, but this unblocks us for now.  This converts the `combine`
error type `ParseError<&'a [edn::Value]>` to a type with owned
`Vec<edn::Value>` collections, re-using `edn::Value::Vector` for
making them `Display`.

* Pre: Accept Borrow<Schema> instead of just &Schema in debug module.

This makes it easy to use Rc<Schema> or Arc<Schema> without inserting
&* sigils throughout the code.

* Use error-chain in query-parser.

There are a few things to point out here:

- the fine grained error types have been flattened into one crate-wide
  error type; it's pretty easy to regain the granularity as needed.

- edn::ParseError is automatically lifted to
  mentat_query_parser::errors::Error;

- we use mentat_parser_utils::ValueParser to maintain parsing error
  information from `combine`.

* Patch up top-level.

* Review comment: Only `borrow()` once.
2017-02-24 15:33:48 -08:00
Victor Porof
7fc2a22d68 Implement a basic edn matcher, r=ncalexan (#271) (#338)
Signed-off-by: Victor Porof <victor.porof@gmail.com>
2017-02-23 09:11:34 +01:00
Victor Porof
bf707acbc3 Lint for the clippy gods in the edn crate (#340)
Signed-off-by: Victor Porof <victor.porof@gmail.com>
2017-02-22 18:11:05 +01:00
Victor Porof
0d3b8e4b29 Avoid code duplication for common Value trait implementations, r=ncalexan
Signed-off-by: Victor Porof <victor.porof@gmail.com>
2017-02-22 08:36:45 +01:00