mentat/query/src/lib.rs

// Copyright 2016 Mozilla
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use
// this file except in compliance with the License. You may obtain a copy of the
// License at http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.
//! This module defines some core types that support find expressions: sources,
//! variables, expressions, etc.
//! These are produced as 'fuel' by the query parser, consumed by the query
//! translator and executor.
//!
//! Many of these types are defined as simple structs that are little more than
//! a richer type alias: a variable, for example, is really just a fancy kind
//! of string.
//!
//! At some point in the future, we might consider reducing copying and memory
//! usage by recasting all of these string-holding structs and enums in terms
//! of string references, with those references being slices of some parsed
//! input query string, and valid for the lifetime of that string.
//!
//! For now, for the sake of simplicity, all of these strings are heap-allocated.
//!
//! Furthermore, we might cut out some of the chaff here: each time a 'tagged'
//! type is used within an enum, we have an opportunity to simplify and use the
//! inner type directly in conjunction with matching on the enum. Before diving
//! deeply into this it's worth recognizing that this loss of 'sovereignty' is
//! a tradeoff against well-typed function signatures and other such boundaries.
extern crate edn;
extern crate mentat_core;
use std::collections::{
BTreeSet,
HashSet,
};
use std::fmt;
use std::rc::Rc;
use edn::{
BigInt,
DateTime,
OrderedFloat,
Uuid,
Utc,
};
pub use edn::{
Keyword,
PlainSymbol,
};
use mentat_core::{
FromRc,
TypedValue,
ValueRc,
ValueType,
};
pub type SrcVarName = String; // Do not include the required syntactic '$'.
#[derive(Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Variable(pub Rc<PlainSymbol>);
impl Variable {
pub fn as_str(&self) -> &str {
self.0.as_ref().0.as_str()
}
pub fn to_string(&self) -> String {
self.0.as_ref().0.clone()
}
pub fn name(&self) -> PlainSymbol {
self.0.as_ref().clone()
}
/// Return a new `Variable`, assuming that the provided string is a valid name.
pub fn from_valid_name(name: &str) -> Variable {
let s = PlainSymbol::plain(name);
assert!(s.is_var_symbol());
Variable(Rc::new(s))
}
}
pub trait FromValue<T> {
fn from_value(v: &edn::ValueAndSpan) -> Option<T>;
}
/// If the provided EDN value is a PlainSymbol beginning with '?', return
/// it wrapped in a Variable. If not, return None.
/// TODO: intern strings. #398.
impl FromValue<Variable> for Variable {
fn from_value(v: &edn::ValueAndSpan) -> Option<Variable> {
if let edn::SpannedValue::PlainSymbol(ref s) = v.inner {
Variable::from_symbol(s)
} else {
None
}
}
}
impl Variable {
pub fn from_rc(sym: Rc<PlainSymbol>) -> Option<Variable> {
if sym.is_var_symbol() {
Some(Variable(sym))
} else {
None
}
}
/// TODO: intern strings. #398.
pub fn from_symbol(sym: &PlainSymbol) -> Option<Variable> {
if sym.is_var_symbol() {
Some(Variable(Rc::new(sym.clone())))
} else {
None
}
}
}
impl fmt::Debug for Variable {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "var({})", self.0)
}
}
impl std::fmt::Display for Variable {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(f, "{}", self.0)
}
}
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct QueryFunction(pub PlainSymbol);
impl FromValue<QueryFunction> for QueryFunction {
fn from_value(v: &edn::ValueAndSpan) -> Option<QueryFunction> {
if let edn::SpannedValue::PlainSymbol(ref s) = v.inner {
QueryFunction::from_symbol(s)
} else {
None
}
}
}
impl QueryFunction {
pub fn from_symbol(sym: &PlainSymbol) -> Option<QueryFunction> {
// TODO: validate the acceptable set of function names.
Some(QueryFunction(sym.clone()))
}
}
impl std::fmt::Display for QueryFunction {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(f, "{}", self.0)
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Direction {
Ascending,
Descending,
}
/// An abstract declaration of ordering: direction and variable.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Order(pub Direction, pub Variable); // Future: Element instead of Variable?
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum SrcVar {
DefaultSrc,
NamedSrc(SrcVarName),
}
impl FromValue<SrcVar> for SrcVar {
fn from_value(v: &edn::ValueAndSpan) -> Option<SrcVar> {
if let edn::SpannedValue::PlainSymbol(ref s) = v.inner {
SrcVar::from_symbol(s)
} else {
None
}
}
}
impl SrcVar {
pub fn from_symbol(sym: &PlainSymbol) -> Option<SrcVar> {
if sym.is_src_symbol() {
if sym.0 == "$" {
Some(SrcVar::DefaultSrc)
} else {
Some(SrcVar::NamedSrc(sym.name().to_string()))
}
} else {
None
}
}
}
/// These are the non-integer scalar values representable in EDN.
/// Integers are carried separately, as `EntidOrInteger`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum NonIntegerConstant {
Boolean(bool),
BigInteger(BigInt),
Float(OrderedFloat<f64>),
Text(ValueRc<String>),
Instant(DateTime<Utc>),
Uuid(Uuid),
}
impl NonIntegerConstant {
pub fn into_typed_value(self) -> TypedValue {
match self {
NonIntegerConstant::BigInteger(_) => unimplemented!(), // TODO: #280.
NonIntegerConstant::Boolean(v) => TypedValue::Boolean(v),
NonIntegerConstant::Float(v) => TypedValue::Double(v),
NonIntegerConstant::Text(v) => v.into(),
NonIntegerConstant::Instant(v) => TypedValue::Instant(v),
NonIntegerConstant::Uuid(v) => TypedValue::Uuid(v),
}
}
}
impl<'a> From<&'a str> for NonIntegerConstant {
fn from(val: &'a str) -> NonIntegerConstant {
NonIntegerConstant::Text(ValueRc::new(val.to_string()))
}
}
impl From<String> for NonIntegerConstant {
fn from(val: String) -> NonIntegerConstant {
NonIntegerConstant::Text(ValueRc::new(val))
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum FnArg {
Variable(Variable),
SrcVar(SrcVar),
EntidOrInteger(i64),
IdentOrKeyword(Keyword),
Constant(NonIntegerConstant),
// The collection values representable in EDN. There's no advantage to destructuring up front,
// since consumers will need to handle arbitrarily nested EDN themselves anyway.
Vector(Vec<FnArg>),
}
impl FromValue<FnArg> for FnArg {
fn from_value(v: &edn::ValueAndSpan) -> Option<FnArg> {
use edn::SpannedValue::*;
match v.inner {
Integer(x) =>
Some(FnArg::EntidOrInteger(x)),
PlainSymbol(ref x) if x.is_src_symbol() =>
SrcVar::from_symbol(x).map(FnArg::SrcVar),
PlainSymbol(ref x) if x.is_var_symbol() =>
Variable::from_symbol(x).map(FnArg::Variable),
PlainSymbol(_) => None,
Keyword(ref x) =>
Some(FnArg::IdentOrKeyword(x.clone())),
Instant(x) =>
Some(FnArg::Constant(NonIntegerConstant::Instant(x))),
Uuid(x) =>
Some(FnArg::Constant(NonIntegerConstant::Uuid(x))),
Boolean(x) =>
Some(FnArg::Constant(NonIntegerConstant::Boolean(x))),
Float(x) =>
Some(FnArg::Constant(NonIntegerConstant::Float(x))),
BigInteger(ref x) =>
Some(FnArg::Constant(NonIntegerConstant::BigInteger(x.clone()))),
Text(ref x) =>
// TODO: intern strings. #398.
Some(FnArg::Constant(x.clone().into())),
Nil |
NamespacedSymbol(_) |
Vector(_) |
List(_) |
Set(_) |
Map(_) => None,
}
}
}
// For display in column headings in the repl.
impl std::fmt::Display for FnArg {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match self {
&FnArg::Variable(ref var) => write!(f, "{}", var),
&FnArg::SrcVar(ref var) => {
if var == &SrcVar::DefaultSrc {
write!(f, "$")
} else {
write!(f, "{:?}", var)
}
},
&FnArg::EntidOrInteger(entid) => write!(f, "{}", entid),
&FnArg::IdentOrKeyword(ref kw) => write!(f, "{}", kw),
&FnArg::Constant(ref constant) => write!(f, "{:?}", constant),
&FnArg::Vector(ref vec) => write!(f, "{:?}", vec),
}
}
}
impl FnArg {
pub fn as_variable(&self) -> Option<&Variable> {
match self {
&FnArg::Variable(ref v) => Some(v),
_ => None,
}
}
}
/// e, a, tx can't be values -- no strings, no floats -- and so
/// they can only be variables, entity IDs, ident keywords, or
/// placeholders.
/// This encoding allows us to represent integers that aren't
/// entity IDs. That'll get filtered out in the context of the
/// database.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum PatternNonValuePlace {
Placeholder,
Variable(Variable),
Entid(i64), // Never negative: `from_value` rejects integers below zero. See #190.
Ident(ValueRc<Keyword>),
}
impl From<Rc<Keyword>> for PatternNonValuePlace {
fn from(value: Rc<Keyword>) -> Self {
PatternNonValuePlace::Ident(ValueRc::from_rc(value))
}
}
impl From<Keyword> for PatternNonValuePlace {
fn from(value: Keyword) -> Self {
PatternNonValuePlace::Ident(ValueRc::new(value))
}
}
impl PatternNonValuePlace {
// I think we'll want move variants, so let's leave these here for now.
#[allow(dead_code)]
fn into_pattern_value_place(self) -> PatternValuePlace {
match self {
PatternNonValuePlace::Placeholder => PatternValuePlace::Placeholder,
PatternNonValuePlace::Variable(x) => PatternValuePlace::Variable(x),
PatternNonValuePlace::Entid(x) => PatternValuePlace::EntidOrInteger(x),
PatternNonValuePlace::Ident(x) => PatternValuePlace::IdentOrKeyword(x),
}
}
fn to_pattern_value_place(&self) -> PatternValuePlace {
match *self {
PatternNonValuePlace::Placeholder => PatternValuePlace::Placeholder,
PatternNonValuePlace::Variable(ref x) => PatternValuePlace::Variable(x.clone()),
PatternNonValuePlace::Entid(x) => PatternValuePlace::EntidOrInteger(x),
PatternNonValuePlace::Ident(ref x) => PatternValuePlace::IdentOrKeyword(x.clone()),
}
}
}
impl FromValue<PatternNonValuePlace> for PatternNonValuePlace {
fn from_value(v: &edn::ValueAndSpan) -> Option<PatternNonValuePlace> {
match v.inner {
edn::SpannedValue::Integer(x) => if x >= 0 {
Some(PatternNonValuePlace::Entid(x))
} else {
None
},
edn::SpannedValue::PlainSymbol(ref x) => if x.0.as_str() == "_" {
Some(PatternNonValuePlace::Placeholder)
} else {
Variable::from_symbol(x).map(PatternNonValuePlace::Variable)
},
edn::SpannedValue::Keyword(ref x) =>
Some(x.clone().into()),
_ => None,
}
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum IdentOrEntid {
Ident(Keyword),
Entid(i64),
}
/// The `v` part of a pattern can be much broader: it can represent
/// integers that aren't entity IDs (particularly negative integers),
/// strings, and all the rest. We group those under `Constant`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum PatternValuePlace {
Placeholder,
Variable(Variable),
EntidOrInteger(i64),
IdentOrKeyword(ValueRc<Keyword>),
Constant(NonIntegerConstant),
}
impl From<Rc<Keyword>> for PatternValuePlace {
fn from(value: Rc<Keyword>) -> Self {
PatternValuePlace::IdentOrKeyword(ValueRc::from_rc(value))
}
}
impl From<Keyword> for PatternValuePlace {
fn from(value: Keyword) -> Self {
PatternValuePlace::IdentOrKeyword(ValueRc::new(value))
}
}
impl FromValue<PatternValuePlace> for PatternValuePlace {
fn from_value(v: &edn::ValueAndSpan) -> Option<PatternValuePlace> {
match v.inner {
edn::SpannedValue::Integer(x) =>
Some(PatternValuePlace::EntidOrInteger(x)),
edn::SpannedValue::PlainSymbol(ref x) if x.0.as_str() == "_" =>
Some(PatternValuePlace::Placeholder),
edn::SpannedValue::PlainSymbol(ref x) =>
Variable::from_symbol(x).map(PatternValuePlace::Variable),
edn::SpannedValue::Keyword(ref x) if x.is_namespaced() =>
Some(x.clone().into()),
edn::SpannedValue::Boolean(x) =>
Some(PatternValuePlace::Constant(NonIntegerConstant::Boolean(x))),
edn::SpannedValue::Float(x) =>
Some(PatternValuePlace::Constant(NonIntegerConstant::Float(x))),
edn::SpannedValue::BigInteger(ref x) =>
Some(PatternValuePlace::Constant(NonIntegerConstant::BigInteger(x.clone()))),
edn::SpannedValue::Instant(x) =>
Some(PatternValuePlace::Constant(NonIntegerConstant::Instant(x))),
edn::SpannedValue::Text(ref x) =>
// TODO: intern strings. #398.
Some(PatternValuePlace::Constant(x.clone().into())),
edn::SpannedValue::Uuid(ref u) =>
Some(PatternValuePlace::Constant(NonIntegerConstant::Uuid(u.clone()))),
// These don't appear in queries.
edn::SpannedValue::Nil => None,
edn::SpannedValue::NamespacedSymbol(_) => None,
edn::SpannedValue::Keyword(_) => None, // … yet.
edn::SpannedValue::Map(_) => None,
edn::SpannedValue::List(_) => None,
edn::SpannedValue::Set(_) => None,
edn::SpannedValue::Vector(_) => None,
}
}
}
impl PatternValuePlace {
// I think we'll want move variants, so let's leave these here for now.
#[allow(dead_code)]
fn into_pattern_non_value_place(self) -> Option<PatternNonValuePlace> {
match self {
PatternValuePlace::Placeholder => Some(PatternNonValuePlace::Placeholder),
PatternValuePlace::Variable(x) => Some(PatternNonValuePlace::Variable(x)),
PatternValuePlace::EntidOrInteger(x) => if x >= 0 {
Some(PatternNonValuePlace::Entid(x))
} else {
None
},
PatternValuePlace::IdentOrKeyword(x) => Some(PatternNonValuePlace::Ident(x)),
PatternValuePlace::Constant(_) => None,
}
}
fn to_pattern_non_value_place(&self) -> Option<PatternNonValuePlace> {
match *self {
PatternValuePlace::Placeholder => Some(PatternNonValuePlace::Placeholder),
PatternValuePlace::Variable(ref x) => Some(PatternNonValuePlace::Variable(x.clone())),
PatternValuePlace::EntidOrInteger(x) => if x >= 0 {
Some(PatternNonValuePlace::Entid(x))
} else {
None
},
PatternValuePlace::IdentOrKeyword(ref x) => Some(PatternNonValuePlace::Ident(x.clone())),
PatternValuePlace::Constant(_) => None,
}
}
}
// Not yet used.
// pub enum PullDefaultValue {
// EntidOrInteger(i64),
// IdentOrKeyword(Rc<Keyword>),
// Constant(NonIntegerConstant),
// }
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum PullConcreteAttribute {
Ident(Rc<Keyword>),
Entid(i64),
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct NamedPullAttribute {
pub attribute: PullConcreteAttribute,
pub alias: Option<Rc<Keyword>>,
}
impl From<PullConcreteAttribute> for NamedPullAttribute {
fn from(a: PullConcreteAttribute) -> Self {
NamedPullAttribute {
attribute: a,
alias: None,
}
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum PullAttributeSpec {
Wildcard,
Attribute(NamedPullAttribute),
// PullMapSpec(Vec<…>),
// LimitedAttribute(NamedPullAttribute, u64), // Limit nil => Attribute instead.
// DefaultedAttribute(NamedPullAttribute, PullDefaultValue),
}
impl std::fmt::Display for PullConcreteAttribute {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match self {
&PullConcreteAttribute::Ident(ref k) => {
write!(f, "{}", k)
},
&PullConcreteAttribute::Entid(i) => {
write!(f, "{}", i)
},
}
}
}
impl std::fmt::Display for NamedPullAttribute {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
if let &Some(ref alias) = &self.alias {
write!(f, "{} :as {}", self.attribute, alias)
} else {
write!(f, "{}", self.attribute)
}
}
}
impl std::fmt::Display for PullAttributeSpec {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match self {
&PullAttributeSpec::Wildcard => {
write!(f, "*")
},
&PullAttributeSpec::Attribute(ref attr) => {
write!(f, "{}", attr)
},
}
}
}
#[derive(Debug, Eq, PartialEq)]
pub struct Pull {
pub var: Variable,
pub patterns: Vec<PullAttributeSpec>,
}
#[derive(Debug, Eq, PartialEq)]
pub struct Aggregate {
pub func: QueryFunction,
pub args: Vec<FnArg>,
}
#[derive(Debug, Eq, PartialEq)]
pub enum Element {
Variable(Variable),
Aggregate(Aggregate),
/// In a query with a `max` or `min` aggregate, a corresponding variable
/// (indicated in the query with `(the ?var)`) is guaranteed to come from
/// the row that provided the max or min value. Queries with more than one
/// `max` or `min` cannot yield predictable behavior, and will err during
/// algebrizing.
Corresponding(Variable),
Pull(Pull),
}
impl Element {
/// Returns true if the element must yield only one value.
pub fn is_unit(&self) -> bool {
match self {
&Element::Variable(_) => false,
&Element::Pull(_) => false,
&Element::Aggregate(_) => true,
&Element::Corresponding(_) => true,
}
}
}
impl From<Variable> for Element {
fn from(x: Variable) -> Element {
Element::Variable(x)
}
}
impl std::fmt::Display for Element {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
match self {
&Element::Variable(ref var) => {
write!(f, "{}", var)
},
&Element::Pull(Pull { ref var, ref patterns }) => {
write!(f, "(pull {} [ ", var)?;
for p in patterns.iter() {
write!(f, "{} ", p)?;
}
write!(f, "])")
},
&Element::Aggregate(ref agg) => {
match agg.args.len() {
0 => write!(f, "({})", agg.func),
1 => write!(f, "({} {})", agg.func, agg.args[0]),
_ => write!(f, "({} {:?})", agg.func, agg.args),
}
},
&Element::Corresponding(ref var) => {
write!(f, "(the {})", var)
},
}
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Limit {
None,
Fixed(u64),
Variable(Variable),
}
/// A definition of the first part of a find query: the
/// `[:find ?foo ?bar…]` bit.
///
/// There are four different kinds of find specs, allowing you to query for
/// a single value, a collection of values from different entities, a single
/// tuple, or a collection of tuples (a relation).
///
/// Examples:
///
/// ```rust
/// # extern crate mentat_query;
/// # use mentat_query::{Element, FindSpec, Variable};
///
/// # fn main() {
///
/// let elements = vec![
/// Element::Variable(Variable::from_valid_name("?foo")),
/// Element::Variable(Variable::from_valid_name("?bar")),
/// ];
/// let rel = FindSpec::FindRel(elements);
///
/// if let FindSpec::FindRel(elements) = rel {
/// assert_eq!(2, elements.len());
/// }
///
/// # }
/// ```
///
#[derive(Debug, Eq, PartialEq)]
pub enum FindSpec {
/// Returns an array of arrays, represented as a single array with length a multiple of width.
FindRel(Vec<Element>),
/// Returns an array of scalars, usually homogeneous.
/// This is equivalent to mapping over the results of a `FindRel`,
/// returning the first value of each.
FindColl(Element),
/// Returns a single tuple: a heterogeneous array of scalars. Equivalent to
/// taking the first result from a `FindRel`.
FindTuple(Vec<Element>),
/// Returns a single scalar value. Equivalent to taking the first result
/// from a `FindColl`.
FindScalar(Element),
}
impl FindSpec {
/// Returns true if the provided `FindSpec` returns at most one result.
pub fn is_unit_limited(&self) -> bool {
use FindSpec::*;
match self {
&FindScalar(..) => true,
&FindTuple(..) => true,
&FindRel(..) => false,
&FindColl(..) => false,
}
}
pub fn expected_column_count(&self) -> usize {
use FindSpec::*;
match self {
&FindScalar(..) => 1,
&FindColl(..) => 1,
&FindTuple(ref elems) | &FindRel(ref elems) => elems.len(),
}
}
/// Returns true if the provided `FindSpec` cares about distinct results.
///
/// I use the words "cares about" because find is generally defined in terms of producing distinct
/// results at the Datalog level.
///
/// Two of the find specs (scalar and tuple) produce only a single result. Those don't need to be
/// run with `SELECT DISTINCT`, because we're only consuming a single result. Those queries will be
/// run with `LIMIT 1`.
///
/// Additionally, some projections cannot produce duplicate results: `[:find (max ?x) …]`, for
/// example.
///
/// This function gives us the hook to add that logic when we're ready.
///
/// Beyond this, `DISTINCT` is not always needed. For example, in some kinds of accumulation or
/// sampling projections we might not need to do it at the SQL level because we're consuming into
/// a dupe-eliminating data structure like a Set, or we know that a particular query cannot produce
/// duplicate results.
pub fn requires_distinct(&self) -> bool {
!self.is_unit_limited()
}
pub fn columns<'s>(&'s self) -> Box<Iterator<Item=&Element> + 's> {
use FindSpec::*;
match self {
&FindScalar(ref e) => Box::new(std::iter::once(e)),
&FindColl(ref e) => Box::new(std::iter::once(e)),
&FindTuple(ref v) => Box::new(v.iter()),
&FindRel(ref v) => Box::new(v.iter()),
}
}
}
// Datomic accepts variable or placeholder. DataScript accepts recursive bindings. Mentat sticks
// to the non-recursive form Datomic accepts, which is much simpler to process.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub enum VariableOrPlaceholder {
Placeholder,
Variable(Variable),
}
impl VariableOrPlaceholder {
pub fn into_var(self) -> Option<Variable> {
match self {
VariableOrPlaceholder::Placeholder => None,
VariableOrPlaceholder::Variable(var) => Some(var),
}
}
pub fn var(&self) -> Option<&Variable> {
match self {
&VariableOrPlaceholder::Placeholder => None,
&VariableOrPlaceholder::Variable(ref var) => Some(var),
}
}
}
#[derive(Clone,Debug,Eq,PartialEq)]
pub enum Binding {
BindScalar(Variable),
BindColl(Variable),
BindRel(Vec<VariableOrPlaceholder>),
BindTuple(Vec<VariableOrPlaceholder>),
}
impl Binding {
/// Return each variable or `None`, in order.
pub fn variables(&self) -> Vec<Option<Variable>> {
match self {
&Binding::BindScalar(ref var) | &Binding::BindColl(ref var) => vec![Some(var.clone())],
&Binding::BindRel(ref vars) | &Binding::BindTuple(ref vars) => vars.iter().map(|x| x.var().cloned()).collect(),
}
}
/// Return `true` if no variables are bound, i.e., all binding entries are placeholders.
pub fn is_empty(&self) -> bool {
match self {
&Binding::BindScalar(_) | &Binding::BindColl(_) => false,
&Binding::BindRel(ref vars) | &Binding::BindTuple(ref vars) => vars.iter().all(|x| x.var().is_none()),
}
}
/// Return `true` if at least one variable is bound and no variable is bound twice, i.e.,
/// each binding entry is either a placeholder or a unique variable, and not every entry
/// is a placeholder.
///
/// ```
/// extern crate mentat_query;
///
/// let v = mentat_query::Variable::from_valid_name("?foo");
/// let vv = mentat_query::VariableOrPlaceholder::Variable(v);
/// let p = mentat_query::VariableOrPlaceholder::Placeholder;
///
/// let e = mentat_query::Binding::BindTuple(vec![p.clone()]);
/// let b = mentat_query::Binding::BindTuple(vec![p.clone(), vv.clone()]);
/// let d = mentat_query::Binding::BindTuple(vec![vv.clone(), p, vv]);
/// assert!(b.is_valid()); // One var, one placeholder: OK.
/// assert!(!e.is_valid()); // Empty: not OK.
/// assert!(!d.is_valid()); // Duplicate var: not OK.
/// ```
pub fn is_valid(&self) -> bool {
match self {
&Binding::BindScalar(_) | &Binding::BindColl(_) => true,
&Binding::BindRel(ref vars) | &Binding::BindTuple(ref vars) => {
let mut acc = HashSet::<Variable>::new();
for var in vars {
if let &VariableOrPlaceholder::Variable(ref var) = var {
if !acc.insert(var.clone()) {
// It's invalid if there was an equal var already present in the set --
// i.e., we have a duplicate var.
return false;
}
}
}
// We're not valid if every place is a placeholder!
!acc.is_empty()
}
}
}
}
// Note that the "implicit blank" rule applies.
// A pattern with a reversed attribute — :foo/_bar — is reversed
// at the point of parsing. These `Pattern` instances only represent
// one direction.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Pattern {
pub source: Option<SrcVar>,
pub entity: PatternNonValuePlace,
pub attribute: PatternNonValuePlace,
pub value: PatternValuePlace,
pub tx: PatternNonValuePlace,
}
impl Pattern {
pub fn simple(e: PatternNonValuePlace,
a: PatternNonValuePlace,
v: PatternValuePlace) -> Option<Pattern> {
Pattern::new(None, e, a, v, PatternNonValuePlace::Placeholder)
}
pub fn new(src: Option<SrcVar>,
e: PatternNonValuePlace,
a: PatternNonValuePlace,
v: PatternValuePlace,
tx: PatternNonValuePlace) -> Option<Pattern> {
let aa = a.clone(); // Too tired of fighting borrow scope for now.
if let PatternNonValuePlace::Ident(ref k) = aa {
if k.is_backward() {
// e and v have different types; we must convert them.
// Not every parseable value is suitable for the entity field!
// As such, this is a fallible constructor.
let e_v = e.to_pattern_value_place();
if let Some(v_e) = v.to_pattern_non_value_place() {
return Some(Pattern {
source: src,
entity: v_e,
attribute: k.to_reversed().into(),
value: e_v,
tx: tx,
});
} else {
return None;
}
}
}
Some(Pattern {
source: src,
entity: e,
attribute: a,
value: v,
tx: tx,
})
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Predicate {
pub operator: PlainSymbol,
pub args: Vec<FnArg>,
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct WhereFn {
pub operator: PlainSymbol,
pub args: Vec<FnArg>,
pub binding: Binding,
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum UnifyVars {
/// `Implicit` means the variables in an `or` or `not` are derived from the enclosed pattern.
/// DataScript regards these vars as 'free': these variables don't need to be bound by the
/// enclosing environment.
///
/// Datomic's documentation implies that all implicit variables are required:
///
/// > Datomic will attempt to push the or clause down until all necessary variables are bound,
/// > and will throw an exception if that is not possible.
///
/// but that would render top-level `or` expressions (as used in Datomic's own examples!)
/// impossible, so we assume that this is an error in the documentation.
///
/// All contained 'arms' in an `or` with implicit variables must bind the same vars.
Implicit,
/// `Explicit` means the variables in an `or-join` or `not-join` are explicitly listed,
/// specified with `required-vars` syntax.
///
/// DataScript parses these as free, but allows (incorrectly) the use of more complicated
/// `rule-vars` syntax.
///
/// Only the named variables will be unified with the enclosing query.
///
/// Every 'arm' in an `or-join` must mention the entire set of explicit vars.
Explicit(BTreeSet<Variable>),
}
impl WhereClause {
pub fn is_pattern(&self) -> bool {
match self {
&WhereClause::Pattern(_) => true,
_ => false,
}
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum OrWhereClause {
Clause(WhereClause),
And(Vec<WhereClause>),
}
impl OrWhereClause {
pub fn is_pattern_or_patterns(&self) -> bool {
match self {
&OrWhereClause::Clause(WhereClause::Pattern(_)) => true,
&OrWhereClause::And(ref clauses) => clauses.iter().all(|clause| clause.is_pattern()),
_ => false,
}
}
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct OrJoin {
pub unify_vars: UnifyVars,
pub clauses: Vec<OrWhereClause>,
/// Caches the result of `collect_mentioned_variables`.
mentioned_vars: Option<BTreeSet<Variable>>,
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct NotJoin {
pub unify_vars: UnifyVars,
pub clauses: Vec<WhereClause>,
}
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct TypeAnnotation {
pub value_type: ValueType,
pub variable: Variable,
}
#[allow(dead_code)]
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum WhereClause {
NotJoin(NotJoin),
OrJoin(OrJoin),
Pred(Predicate),
WhereFn(WhereFn),
RuleExpr,
Pattern(Pattern),
TypeAnnotation(TypeAnnotation),
}
#[allow(dead_code)]
#[derive(Debug, Eq, PartialEq)]
pub struct FindQuery {
pub find_spec: FindSpec,
pub default_source: SrcVar,
pub with: BTreeSet<Variable>,
pub in_vars: BTreeSet<Variable>,
pub in_sources: BTreeSet<SrcVar>,
pub limit: Limit,
pub where_clauses: Vec<WhereClause>,
pub order: Option<Vec<Order>>,
// TODO: in_rules;
}
impl FindQuery {
pub fn simple(spec: FindSpec, where_clauses: Vec<WhereClause>) -> FindQuery {
FindQuery {
find_spec: spec,
default_source: SrcVar::DefaultSrc,
with: BTreeSet::default(),
in_vars: BTreeSet::default(),
in_sources: BTreeSet::default(),
limit: Limit::None,
where_clauses: where_clauses,
order: None,
}
}
}
impl OrJoin {
pub fn new(unify_vars: UnifyVars, clauses: Vec<OrWhereClause>) -> OrJoin {
OrJoin {
unify_vars: unify_vars,
clauses: clauses,
mentioned_vars: None,
}
}
/// Return true if either the `OrJoin` is `UnifyVars::Implicit`, or if
/// every variable mentioned inside the join is also mentioned in the `UnifyVars` list.
pub fn is_fully_unified(&self) -> bool {
match &self.unify_vars {
&UnifyVars::Implicit => true,
&UnifyVars::Explicit(ref vars) => {
// We know that the join list must be a subset of the vars in the pattern, or
// it would have failed validation. That allows us to simply compare counts here.
// TODO: in debug mode, do a full intersection, and verify that our count check
// returns the same results.
// Use the cached list if we have one.
if let Some(ref mentioned) = self.mentioned_vars {
vars.len() == mentioned.len()
} else {
vars.len() == self.collect_mentioned_variables().len()
}
}
}
}
}
pub trait ContainsVariables {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>);
fn collect_mentioned_variables(&self) -> BTreeSet<Variable> {
let mut out = BTreeSet::new();
self.accumulate_mentioned_variables(&mut out);
out
}
}
impl ContainsVariables for WhereClause {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
use WhereClause::*;
match self {
&OrJoin(ref o) => o.accumulate_mentioned_variables(acc),
&Pred(ref p) => p.accumulate_mentioned_variables(acc),
&Pattern(ref p) => p.accumulate_mentioned_variables(acc),
&NotJoin(ref n) => n.accumulate_mentioned_variables(acc),
&WhereFn(ref f) => f.accumulate_mentioned_variables(acc),
&TypeAnnotation(ref a) => a.accumulate_mentioned_variables(acc),
&RuleExpr => (),
}
}
}
impl ContainsVariables for OrWhereClause {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
use OrWhereClause::*;
match self {
&And(ref clauses) => for clause in clauses { clause.accumulate_mentioned_variables(acc) },
&Clause(ref clause) => clause.accumulate_mentioned_variables(acc),
}
}
}
impl ContainsVariables for OrJoin {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
for clause in &self.clauses {
clause.accumulate_mentioned_variables(acc);
}
}
}
impl OrJoin {
pub fn dismember(self) -> (Vec<OrWhereClause>, UnifyVars, BTreeSet<Variable>) {
let vars = match self.mentioned_vars {
Some(m) => m,
None => self.collect_mentioned_variables(),
};
(self.clauses, self.unify_vars, vars)
}
pub fn mentioned_variables<'a>(&'a mut self) -> &'a BTreeSet<Variable> {
if self.mentioned_vars.is_none() {
let m = self.collect_mentioned_variables();
self.mentioned_vars = Some(m);
}
if let Some(ref mentioned) = self.mentioned_vars {
mentioned
} else {
unreachable!()
}
}
}
impl ContainsVariables for NotJoin {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
for clause in &self.clauses {
clause.accumulate_mentioned_variables(acc);
}
}
}
impl ContainsVariables for Predicate {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
for arg in &self.args {
if let &FnArg::Variable(ref v) = arg {
acc_ref(acc, v)
}
}
}
}
impl ContainsVariables for TypeAnnotation {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
acc_ref(acc, &self.variable);
}
}
impl ContainsVariables for Binding {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
match self {
&Binding::BindScalar(ref v) | &Binding::BindColl(ref v) => {
acc_ref(acc, v)
},
&Binding::BindRel(ref vs) | &Binding::BindTuple(ref vs) => {
for v in vs {
if let &VariableOrPlaceholder::Variable(ref v) = v {
acc_ref(acc, v);
}
}
},
}
}
}
impl ContainsVariables for WhereFn {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
for arg in &self.args {
if let &FnArg::Variable(ref v) = arg {
acc_ref(acc, v)
}
}
self.binding.accumulate_mentioned_variables(acc);
}
}
fn acc_ref<T: Clone + Ord>(acc: &mut BTreeSet<T>, v: &T) {
// Roll on, reference entries!
if !acc.contains(v) {
acc.insert(v.clone());
}
}
impl ContainsVariables for Pattern {
fn accumulate_mentioned_variables(&self, acc: &mut BTreeSet<Variable>) {
if let PatternNonValuePlace::Variable(ref v) = self.entity {
acc_ref(acc, v)
}
if let PatternNonValuePlace::Variable(ref v) = self.attribute {
acc_ref(acc, v)
}
if let PatternValuePlace::Variable(ref v) = self.value {
acc_ref(acc, v)
}
if let PatternNonValuePlace::Variable(ref v) = self.tx {
acc_ref(acc, v)
}
}
}