mentat/db/src/errors.rs

// Copyright 2016 Mozilla
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use
// this file except in compliance with the License. You may obtain a copy of the
// License at http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.
#![allow(dead_code)]

use failure::{
    Backtrace,
    Context,
    Error,
    Fail,
};

use std::collections::{
    BTreeMap,
    BTreeSet,
};

use rusqlite;

use edn::entities::{
    TempId,
};
use mentat_core::{
    KnownEntid,
};
use types::{
    Entid,
    TypedValue,
    ValueType,
};

#[macro_export]
macro_rules! bail {
    ($e:expr) => (
        return Err($e.into());
    )
}
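
/*
Usage sketch for `bail!` (an illustration, not part of the original file). The macro
returns early with `Err($e.into())`, so any value convertible into the function's error
type can be passed directly. The sketch uses only the standard library: `SketchError`,
`check_positive`, and the `&'static str` conversion are hypothetical stand-ins for
`failure::Error` and the error enums defined in this module.

```rust
use std::fmt;

// Same shape as the macro above: convert the argument with `into()` and return early.
macro_rules! bail {
    ($e:expr) => (
        return Err($e.into());
    )
}

// Hypothetical error type standing in for `failure::Error`.
#[derive(Debug, PartialEq)]
pub struct SketchError(pub String);

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// The conversion that `bail!` relies on through `into()`.
impl From<&'static str> for SketchError {
    fn from(message: &'static str) -> SketchError {
        SketchError(message.to_string())
    }
}

// Hypothetical caller: reject non-positive input with an early return.
pub fn check_positive(n: i64) -> Result<i64, SketchError> {
    if n <= 0 {
        bail!("expected a positive number");
    }
    Ok(n)
}
```

Invoking the macro as a statement (`bail!(...);`) keeps the trailing semicolon in the
expansion harmless.
*/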

pub type Result<T> = ::std::result::Result<T, Error>;

// TODO Error/ErrorKind pair
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum CardinalityConflict {
    /// A cardinality one attribute has multiple assertions `[e a v1], [e a v2], ...`.
    CardinalityOneAddConflict {
        e: Entid,
        a: Entid,
        vs: BTreeSet<TypedValue>,
    },

    /// A datom has been both asserted and retracted, like `[:db/add e a v]` and `[:db/retract e a v]`.
    AddRetractConflict {
        e: Entid,
        a: Entid,
        vs: BTreeSet<TypedValue>,
    },
}
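
/*
Sketch of how a `CardinalityOneAddConflict` could be detected (an assumption for
illustration; this is not the transactor's actual code). Assertions are grouped by
`(e, a)` and every group with more than one distinct value is reported. The sketch
treats every attribute as cardinality one, and `Entid`, `Value`, and
`cardinality_one_conflicts` are hypothetical stand-ins for the real types.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Stand-ins for `Entid` and `TypedValue`; the real types live in other modules.
type Entid = i64;
type Value = String;

/// Groups asserted `[e a v]` datoms by `(e, a)` and keeps the groups that hold more
/// than one distinct value. Repeating the same datom is not reported, because the
/// values are collected into a set.
pub fn cardinality_one_conflicts(
    datoms: &[(Entid, Entid, Value)],
) -> BTreeMap<(Entid, Entid), BTreeSet<Value>> {
    let mut by_ea: BTreeMap<(Entid, Entid), BTreeSet<Value>> = BTreeMap::new();
    for (e, a, v) in datoms {
        by_ea.entry((*e, *a)).or_insert_with(BTreeSet::new).insert(v.clone());
    }
    by_ea.into_iter().filter(|(_, vs)| vs.len() > 1).collect()
}
```

Each surviving entry carries the same `e`, `a`, and `vs` fields that the
`CardinalityOneAddConflict` variant stores.
*/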

// TODO Error/ErrorKind pair
#[derive(Clone, Debug, Eq, PartialEq, Fail)]
pub enum SchemaConstraintViolation {
    /// A transaction tried to assert datoms where one tempid upserts to two (or more) distinct
    /// entids.
    ConflictingUpserts {
        /// A map from tempid to the entids it would upsert to.
        ///
        /// In the future, we might even be able to attribute the upserts to particular (reduced)
        /// datoms, i.e., to particular `[e a v]` triples that caused the constraint violation.
        /// Attributing constraint violations to input data is more difficult due to the multiple
        /// rewriting passes the input undergoes.
        conflicting_upserts: BTreeMap<TempId, BTreeSet<KnownEntid>>,
    },

    /// A transaction tried to assert a datom or datoms with the wrong value `v` type(s).
    TypeDisagreements {
        /// The key (`[e a v]`) has an invalid value `v`: it is not of the expected value type.
        conflicting_datoms: BTreeMap<(Entid, Entid, TypedValue), ValueType>
    },

    /// A transaction tried to assert datoms that don't observe the schema's cardinality constraints.
    CardinalityConflicts {
        conflicts: Vec<CardinalityConflict>,
    },
}

impl ::std::fmt::Display for SchemaConstraintViolation {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        use self::SchemaConstraintViolation::*;
        match self {
            &ConflictingUpserts { ref conflicting_upserts } => {
                writeln!(f, "conflicting upserts:")?;
                for (tempid, entids) in conflicting_upserts {
                    writeln!(f, " tempid {:?} upserts to {:?}", tempid, entids)?;
                }
                Ok(())
            },
            &TypeDisagreements { ref conflicting_datoms } => {
                writeln!(f, "type disagreements:")?;
                for (ref datom, expected_type) in conflicting_datoms {
                    writeln!(f, " expected value of type {} but got datom [{} {} {:?}]", expected_type, datom.0, datom.1, datom.2)?;
                }
                Ok(())
            },
            &CardinalityConflicts { ref conflicts } => {
                writeln!(f, "cardinality conflicts:")?;
                for ref conflict in conflicts {
                    writeln!(f, " {:?}", conflict)?;
                }
                Ok(())
            },
        }
    }
}
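
/*
Sketch of the report layout produced by the `Display` impl above (an illustration
only). `ConflictingUpsertsSketch` and `render_example` are hypothetical; tempids are
modelled as strings and entids as integers so that the example needs nothing beyond
the standard library.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::fmt;

// Hypothetical, simplified stand-in for the `ConflictingUpserts` variant.
pub struct ConflictingUpsertsSketch {
    pub conflicting_upserts: BTreeMap<String, BTreeSet<i64>>,
}

impl fmt::Display for ConflictingUpsertsSketch {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // A header line, then one line per tempid, mirroring the impl above.
        writeln!(f, "conflicting upserts:")?;
        for (tempid, entids) in &self.conflicting_upserts {
            writeln!(f, " tempid {:?} upserts to {:?}", tempid, entids)?;
        }
        Ok(())
    }
}

// Builds a one-entry report and renders it to a `String`.
pub fn render_example() -> String {
    let mut entids = BTreeSet::new();
    entids.insert(65536);
    entids.insert(65537);
    let mut conflicting_upserts = BTreeMap::new();
    conflicting_upserts.insert("t1".to_string(), entids);
    ConflictingUpsertsSketch { conflicting_upserts }.to_string()
}
```

Because every line is written with `writeln!`, the rendered text ends with a newline
and spans several lines when it is embedded in a longer message.
*/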

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum InputError {
    /// Map notation included a bad `:db/id` value.
    BadDbId,

    /// A value place cannot be interpreted as an entity place (for example, in nested map
    /// notation).
    BadEntityPlace,
}

impl ::std::fmt::Display for InputError {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        use self::InputError::*;
        match self {
            &BadDbId => {
                writeln!(f, ":db/id in map notation must either not be present or be an entid, an ident, or a tempid")
            },
            &BadEntityPlace => {
                writeln!(f, "cannot convert value place into entity place")
            },
        }
    }
}

#[derive(Debug, Fail)]
pub enum DbError {
    /// We're just not done yet. Message that the feature is recognized but not yet
    /// implemented.
    #[fail(display = "not yet implemented: {}", _0)]
    NotYetImplemented(String),

    /// We've been given a value that isn't the correct Mentat type.
    #[fail(display = "value '{}' is not the expected Mentat value type {:?}", _0, _1)]
    BadValuePair(String, ValueType),

    /// We've got corrupt data in the SQL store: a value and value_type_tag don't line up.
    /// TODO _1.data_type()
    #[fail(display = "bad SQL (value_type_tag, value) pair: ({:?}, {:?})", _0, _1)]
    BadSQLValuePair(rusqlite::types::Value, i32),

    // /// The SQLite store user_version isn't recognized. This could be an old version of Mentat
    // /// trying to open a newer version SQLite store; or it could be a corrupt file; or ...
    // #[fail(display = "bad SQL store user_version: {}", _0)]
    // BadSQLiteStoreVersion(i32),

    /// A bootstrap definition couldn't be parsed or installed. This is a programmer error, not
    /// a runtime error.
    #[fail(display = "bad bootstrap definition: {}", _0)]
    BadBootstrapDefinition(String),

    /// A schema assertion couldn't be parsed.
    #[fail(display = "bad schema assertion: {}", _0)]
    BadSchemaAssertion(String),

    /// An ident->entid mapping failed.
    #[fail(display = "no entid found for ident: {}", _0)]
    UnrecognizedIdent(String),

    /// An entid->ident mapping failed.
    /// We also use this error if you try to transact an entid that we didn't allocate,
    /// in part because we blow the stack in error_chain if we define a new enum!
    #[fail(display = "unrecognized or no ident found for entid: {}", _0)]
    UnrecognizedEntid(Entid),

    #[fail(display = "unknown attribute for entid: {}", _0)]
    UnknownAttribute(Entid),

    #[fail(display = "cannot reverse-cache non-unique attribute: {}", _0)]
    CannotCacheNonUniqueAttributeInReverse(Entid),

    #[fail(display = "schema alteration failed: {}", _0)]
    SchemaAlterationFailed(String),

    /// A transaction tried to violate a constraint of the schema of the Mentat store.
    #[fail(display = "schema constraint violation: {}", _0)]
    SchemaConstraintViolation(SchemaConstraintViolation),

    /// The transaction was malformed in some way (that was not recognized at parse time; for
    /// example, in a way that is schema-dependent).
    #[fail(display = "transaction input error: {}", _0)]
    InputError(InputError),

    #[fail(display = "Cannot transact a fulltext assertion with a typed value that is not :db/valueType :db.type/string")]
    WrongTypeValueForFtsAssertion,
}

#[derive(Debug)]
pub struct DbSqlError {
    inner: Context<DbSqlErrorKind>,
}

impl Fail for DbSqlError {
    fn cause(&self) -> Option<&Fail> {
        self.inner.cause()
    }

    fn backtrace(&self) -> Option<&Backtrace> {
        self.inner.backtrace()
    }
}

impl ::std::fmt::Display for DbSqlError {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        ::std::fmt::Display::fmt(&self.inner, f)
    }
}

impl DbSqlError {
    pub fn kind(&self) -> DbSqlErrorKind {
        *self.inner.get_context()
    }
}

impl From<DbSqlErrorKind> for DbSqlError {
    fn from(kind: DbSqlErrorKind) -> DbSqlError {
        DbSqlError { inner: Context::new(kind) }
    }
}

impl From<Context<DbSqlErrorKind>> for DbSqlError {
    fn from(inner: Context<DbSqlErrorKind>) -> DbSqlError {
        DbSqlError { inner }
    }
}
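
/*
Sketch of the Error/ErrorKind pairing used by `DbSqlError` above, written against
`std::error::Error` instead of `failure` so that it stands alone (an assumption made
for illustration; the real type wraps `failure::Context`). `SketchSqlError`,
`SketchSqlErrorKind`, and `with_cause` are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical kind enum standing in for `DbSqlErrorKind`: small and `Copy`.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum SketchSqlErrorKind {
    CouldNotSearch,
    FailedToCreateTempTables,
}

impl fmt::Display for SketchSqlErrorKind {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            SketchSqlErrorKind::CouldNotSearch => write!(f, "Could not search!"),
            SketchSqlErrorKind::FailedToCreateTempTables => {
                write!(f, "Failed to create temporary tables")
            },
        }
    }
}

// Wrapper pairing a kind with an optional underlying cause.
#[derive(Debug)]
pub struct SketchSqlError {
    kind: SketchSqlErrorKind,
    cause: Option<Box<dyn Error + 'static>>,
}

impl SketchSqlError {
    // Callers match on the kind; the cause stays available for diagnostics.
    pub fn kind(&self) -> SketchSqlErrorKind {
        self.kind
    }

    pub fn with_cause<E: Error + 'static>(kind: SketchSqlErrorKind, cause: E) -> SketchSqlError {
        SketchSqlError { kind, cause: Some(Box::new(cause)) }
    }
}

impl fmt::Display for SketchSqlError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Delegate to the kind, as `DbSqlError` delegates to its inner context.
        fmt::Display::fmt(&self.kind, f)
    }
}

impl Error for SketchSqlError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.cause.as_deref()
    }
}

impl From<SketchSqlErrorKind> for SketchSqlError {
    fn from(kind: SketchSqlErrorKind) -> SketchSqlError {
        SketchSqlError { kind, cause: None }
    }
}
```

Keeping the kind `Copy` is what lets `kind()` return it by value.
*/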

#[derive(Copy, Clone, Eq, PartialEq, Debug, Fail)]
pub enum DbSqlErrorKind {
    #[fail(display = "Could not set_user_version")]
    CouldNotSetVersionPragma,

    #[fail(display = "Could not get_user_version")]
    CouldNotGetVersionPragma,

    #[fail(display = "Could not search!")]
    CouldNotSearch,

    #[fail(display = "Could not insert transaction: failed to add datoms not already present")]
    TxInsertFailedToAddMissingDatoms,

    #[fail(display = "Could not insert transaction: failed to retract datoms already present")]
    TxInsertFailedToRetractDatoms,

    #[fail(display = "Could not update datoms: failed to retract datoms already present")]
    DatomsUpdateFailedToRetract,

    #[fail(display = "Could not update datoms: failed to add datoms not already present")]
    DatomsUpdateFailedToAdd,

    #[fail(display = "Failed to create temporary tables")]
    FailedToCreateTempTables,

    #[fail(display = "Could not insert non-fts one statements into temporary search table!")]
    NonFtsInsertionIntoTempSearchTableFailed,

    #[fail(display = "Could not insert fts values into fts table!")]
    FtsInsertionFailed,

    #[fail(display = "Could not insert FTS statements into temporary search table!")]
    FtsInsertionIntoTempSearchTableFailed,

    #[fail(display = "Could not drop FTS search ids!")]
    FtsFailedToDropSearchIds,

    #[fail(display = "Could not update partition map")]
    FailedToUpdatePartitionMap,
}