Commit graph

27 commits

Author SHA1 Message Date
Nick Alexander
76507623ac Part 4: Prepare EDN Entity type for interning tempids during parsing.
This is all part of moving the entity builder away from building term
instances and toward building entity instances.  One of the nice
things that the existing term interface does is allow consumers to use
lightweight reference counted tempid handles; I don't want to lose
that, so we'll build it into the entity data structures directly.
2018-07-05 11:17:20 -07:00
Thom
72a9b302f9
Rename or delete things so that there is only one type named Entid (#768)
* Delete the (apparently unused) EntId

* Rename edn's Entid to EntidOrIdent to avoid confusion with the Entid that's actually an i64

* Fix travis beta bustage (This is actually unrelated to entids, but is a trivial fix nonetheless)
2018-06-26 16:34:18 -07:00
Nick Alexander
cfed968514 Review comments. 2018-06-04 15:21:27 -07:00
Nick Alexander
09f1d633b5 Part 4: Parse queries with rust-peg.
There's an unfortunate conflation here between implementing the query
parser in `rust-peg` and moving some validation that now happens at
parse time to happen later.  The result is that we introduce
`ParsedFindQuery` as a less-processed `FindQuery`, and that we only
use string errors (which is all `rust-peg` supports) instead of the
structured errors in query-parser's errors module.  The next commit
will address this, on the road to removing the `query-parser` module
entirely.
2018-06-04 15:04:39 -07:00
Nick Alexander
729fe59578 [edn] Pre: Rename keyword to namespaced_keyword.
The `Keyword` type evolved to become more general: we now use the one
type for both :regular and :name/spaced keywords.  This changes
reflects the new generality.
2018-06-04 14:52:51 -07:00
Richard Newman
b2e98f44f6
Generalize Entity by value type. (#701) (#691) r=rnewman
* Part 3: Parameterize Entity by value type.

This isn't quite right, because after parsing, we shouldn't care
about` `edn::ValueAndSpan`, we should care only about edn::Value.
However, I think we can drop `ValueAndSpan` entirely if we just use
`rust-peg` (and its simpler error messages) rather than a mix of
`rust-peg` and `combine`.

In any case, this paves the way to transacting `Entity<TypedValue>`,
which is a nice step towards building general entities.

* Part 1: Add AttributePlace.

* Part 2: Name other places EntityPlace and ValuePlace.

Now we're consistent and closer to self-documenting.  Both matter more
as we expose `Entity` as the thing to build for programmatic usage.

* Part 4: Allow Ident and TempId in ValuePlace.

The parser will never produce these, since determining whether an
integer/keyword or string is an ident or a tempid, respectively, in
the value place requires the schema.

But a builder that produces `Entity` instances directly will want to
produce these.
2018-05-15 00:43:07 -07:00
Richard Newman
3dc68bcd38 Combine NamespacedKeyword and Keyword. (#689) r=nalexander
* Make properties on NamespacedKeyword/NamespacedSymbol private

* Use only a single String for NamespacedKeyword/NamespacedSymbol

* Review comments.

* Remove unsafe code in namespaced_name.

Benchmarking shows approximately zero change.

* Allow the types of ns and name to differ when constructing a NamespacedName.

* Make symbol namespaces optional.

* Normalize names of keyword/symbol constructors.

This will make the subsequent refactor much less painful.

* Use expect not unwrap.

* Merge Keyword and NamespacedKeyword.
2018-05-11 09:52:17 -07:00
Nick Alexander
c8f74fa41b [edn] Round-trip instants. (#686) (#687) r=rnewman
First, the parser had a small grouping bug where-by it wouldn't parse
Z as timezone correctly.  Second, we weren't printing instants in the format
that we parse.
2018-05-11 02:11:04 -07:00
Nick Alexander
cbffe5e545 Use rust-peg for tx parsing.
There are few reasons to do this:

- it's difficult to add symbol interning to combine-based parsers like
  tx-parser -- literally every type changes to reflect the interner,
  and that means every convenience macro we've built needs to chagne.
  It's trivial to add interning to rust-peg-based parsers.

- combine has rolled forward to 3.2, and I spent a similar amount of
  time investigating how to upgrade tx-parser (to take advantage of
  the new parser! macros in combine that I think are necessary for
  adapting to changing types) as I did just converting to rust-peg.

- it's easy to improve the error messages in rust-peg, where-as I have
  tried twice to improve the nested error messages in combine and am
  stumped.

- it's roughly 4x faster to parse strings directly as opposed to
  edn::ValueAndSpan, and it'll be even better when we intern directly.
2018-05-10 10:24:05 -07:00
Richard Newman
df58de52f4
Correctly parse and unescape quotes etc. inside EDN strings. (#434) (#589) 2018-03-15 07:13:27 -07:00
Richard Newman
9b23cf3945
Speed up EDN parser (fixes #445) (#581) r=nalexander
Fixes from @kevinmehall.

* Prefer character sets over backtracking in the EDN parser.
* Avoid duplicate effort when parsing floats in the EDN parser.
* Clean up duplicate position tracking code.

This turns out to have little performance impact, but makes the grammar
much cleaner.

* Fix EDN work to pass tests with correct numeric precedence.
2018-03-05 20:33:51 -08:00
Richard Newman
c600152d78
Update some dependencies. (#492) r=etoop
* Update some dependencies.

* Update rusqlite to 0.12.

* Update error-chain to a forked version that implements Sync.

* Fix some compiler warnings.

* Remove unused imports in tests.

* Parse errors no longer naturally print with the expected symbol.
2017-11-21 16:24:08 +00:00
Richard Newman
daca8def57 UUIDs and instants. Fixes #44, #45, #426, #427. (#438) r=nalexander
* Pre: unused import in translate.rs.

* Part 2: take a dependency on rusqlite for query arguments.

* Part 1: flatten V2 schema into V1. Add UUID and URI.

Bump expected ident and bootstrap datom count in tests.

* Part 5: parse edn::Value::Uuid.

* Part 3: extend ValueType and TypedValue to include Uuid.

* Part 4: add Uuid to query arguments.

* Part 6: extend db to support Uuid.

* Part 8: add a tx-parser test for #f NaN and #uuid.

* Part 7: parse and algebrize UUIDs in queries.

* Part 1: parse #inst in EDN and throughout query engine.

* Part 3: handle instants in db.

* Part 2: instants never matches integers in queries.

* Part 4: use DateTime for tx_instants.

* Add a test for adding and querying UUIDs and instants.

* Review comments.
2017-04-28 20:11:55 -07:00
Nick Alexander
5369f03464 Improve parsing of nested edn::ValueAndSpan streams. r=rnewman (#393)
* Pre: Expose more in edn.

* Pre: Make it easier to work with ValueAndSpan.

with_spans() is a temporary hack, needed only because I don't care to
parse the bootstrap assertions from text right now.

* Part 1a: Add `value_and_span` for parsing nested `edn::ValueAndSpan` instances.

I wasn't able to abstract over `edn::Value` and `edn::ValueAndSpan`;
there are multiple obstacles.  I chose to roll with
`edn::ValueAndSpan` since it exposes the additional span information
that we will want to form good error messages in the future.

* Part 1b: Add keyword_map() parsing an `edn::Value::Vector` into an `edn::Value::map`.

* Part 1c: Add `Log`/`.log(...)` for logging parser progress.

This is a terrible hack, but it sure helps to debug complicated nested
parsers.  I don't even know what a principled approach would look
like; since our parser combinators are so frequently expressed in
code, it's hard to imagine a data-driven interpreter that can help
debug things.

* Part 2: Use `value_and_span` apparatus in tx-parser/.

I break an abstraction boundary by returning a value column
`edn::ValueAndSpan` rather than just an `edn::Value`.  That is, the
transaction processor shouldn't care where the `edn::Value` it is
processing arose -- even we care to track that information we should
bake it into the `Entity` type.  We do this because we need to
dynamically parse the value column to support nested maps, and parsing
requires a full `edn::ValueAndSpan`.  Alternately, we could cheat and
fake the spans when parsing nested maps, but that's potentially
expensive.

* Part 3: Use `value_and_span` apparatus in query-parser/.

* Part 4: Use `value_and_span` apparatus in root crate.

* Review comment: Make Span and SpanPosition Copy.

* Review comment: nits.

* Review comment: Make `or` be `or_exactly`.

I baked the eof checking directly into the parser, rather than using
the skip and eof parsers.  I also took the time to restore some tests
that were mistakenly commented out.

* Review comment: Extract and use def_matches_* macros.

* Review comment: .map() as late as possible.
2017-04-06 10:06:28 -07:00
Richard Newman
85f3b79f75 Support a limited set of '.'-prefixed non-keyword symbols. (#352) r=nalexander
This commit allows `.` and `...` to parse correctly as `PlainSymbol`.

Tests in edn, query-translator, and the top level have been added.
2017-03-06 15:01:19 -08:00
Victor Porof
896d7f8f88 Add a span component to edn::Value, r=ncalexan
Signed-off-by: Victor Porof <victor.porof@gmail.com>
2017-02-17 18:31:26 +01:00
Jordan Santell
4f5c94891a Add octal, hexadecimal, and arbitrary base integers to the EDN parser. Fixes #277. r=rnewman (#286) 2017-02-10 16:03:35 -08:00
Victor Porof
42580539b8 Properly handle whitespace for Infinity and NaN, r=rnewman (#246)
Signed-off-by: Victor Porof <vporof@mozilla.com>
2017-02-09 18:13:44 +01:00
Victor Porof
a627f532f0 Relax whitespace rules for edn vectors, lists, sets and maps
Signed-off-by: Victor Porof <vporof@mozilla.com>
2017-02-04 08:45:31 +01:00
Victor Porof
419db388da Relax whitespace rules for Infinity and NaN
Signed-off-by: Victor Porof <vporof@mozilla.com>
2017-02-04 08:45:02 +01:00
Jordan Santell
0b20d7691b Parse and display EDN values for NaN, +Infinity and -Infinity. Fixes #232 (#238) r=victorporof 2017-02-03 10:14:23 -08:00
Victor Porof
9ee0ac8e00 Unify and generalize keywords and symbols parsing
Signed-off-by: Victor Porof <vporof@mozilla.com>
2017-02-03 09:06:42 +01:00
Victor Porof
72da5722ae Update rustpeg to latest version and follow new syntax and formatting rules
Signed-off-by: Victor Porof <vporof@mozilla.com>
2017-02-03 09:06:42 +01:00
Nick Alexander
ab041291fb edn: Bound values by optional whitespace; treat comma as whitespace. 2017-01-18 08:34:27 -08:00
Nick Alexander
247035cc9b edn: Allow comments.
EDN supports only one type of comment: initiated by ; and lasting
until the end of the current line or the end of the input stream.
2017-01-18 08:34:27 -08:00
Richard Newman
a152e60040 Read EDN keywords and symbols as rich types. Fixes #154. r=nalexander 2017-01-12 09:09:48 -08:00
Joe Walker
c4735119c4 Implement a basic EDN parser. (#149) r=rnewman,bgrins,nalexander
The parser mostly works and has a decent test suite. It parses all the
queries issued by the Tofino UAS, with some caveats. Known flaws:

* No support for tagged elements, comments, discarded elements or "'".
* Incomplete support for escaped characters in strings and the range of
  characters that are allowed in keywords and symbols.
* Possible whitespace handling problems.
2017-01-11 13:03:04 -08:00