[query] Allow later constraints to affect earlier patterns' table choices #151

Open
opened 2020-08-06 16:55:47 +00:00 by gburd · 0 comments
gburd commented 2020-08-06 16:55:47 +00:00 (Migrated from github.com)

We naïvely alias tables as patterns are processed: as soon as we see

[?x _ ?y]

we decide to query all_datoms, because the attribute is unknown, and we alias that to something like all_datoms00.

This is expedient: we can process a collection of patterns mutably, one by one, and at the end be almost ready to produce SQL.

However, if the next pattern is:

[(< ?y 10)]

we know that for this CC (the set of conjoining clauses) to produce results, ?y must be numeric. With that knowledge we can query datoms instead of all_datoms, which will yield a more efficient query.
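A minimal sketch of that decision, not Mentat's actual code. The `ValueType` and `Table` enums and the `table_for_value` function are hypothetical names introduced for illustration, and the sketch assumes that string values are the only reason a pattern with an unknown attribute needs all_datoms:

```rust
use std::collections::HashSet;

/// Hypothetical, simplified set of value types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum ValueType {
    Ref,
    Boolean,
    Long,
    Double,
    String,
}

/// The two tables discussed in this issue.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Table {
    Datoms,
    AllDatoms,
}

/// Choose a table for a pattern whose attribute is unknown.
///
/// `known` is the set of types the value variable may still have;
/// `None` means nothing has been learned about it.
/// Assumption: only string values force the use of `all_datoms`.
fn table_for_value(known: Option<&HashSet<ValueType>>) -> Table {
    match known {
        // The value cannot be a string, so the narrower table suffices.
        Some(types) if !types.contains(&ValueType::String) => Table::Datoms,
        // Unknown, or possibly a string: stay with the wider table.
        _ => Table::AllDatoms,
    }
}
```

With no type knowledge the function returns `Table::AllDatoms`, which is the choice made today when [?x _ ?y] is seen first. Once [(< ?y 10)] restricts ?y to `Long` or `Double`, it returns `Table::Datoms`.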

As noted in https://github.com/mozilla/mentat/pull/374#issuecomment-288278950, in order to do this we need to do one of three things:

  • Delay table choice until we're ready to finish the CC. This can require full descent: the top-level CC might know that ?y is numeric because both arms of a nested or-join are numeric. That gets a little complicated if we wish to do this right.
  • Do a multi-pass algebrizing step, where we collect some type knowledge before accumulating constraints and aliases. I'm not sure how feasible this is yet.
  • Re-alias as we go.
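
As a rough illustration of the second option, the sketch below runs two passes over a flat list of clauses: the first collects variables that a predicate forces to be numeric, and the second chooses tables and aliases. The `Clause` and `Table` types and the `assign_tables` function are hypothetical and heavily simplified. The sketch ignores nested or-joins, which is where the real difficulty lies, and it is not how Mentat's algebrizer is structured.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Table {
    Datoms,
    AllDatoms,
}

impl Table {
    fn name(self) -> &'static str {
        match self {
            Table::Datoms => "datoms",
            Table::AllDatoms => "all_datoms",
        }
    }
}

/// Hypothetical, simplified clause forms.
enum Clause {
    /// `[?e _ ?v]`: unknown attribute, value bound to `value_var`.
    Pattern { value_var: &'static str },
    /// `[(< ?v 10)]`: a numeric comparison on `var`.
    NumericPredicate { var: &'static str },
}

/// Returns one (table, alias) pair per pattern, in source order.
fn assign_tables(clauses: &[Clause]) -> Vec<(Table, String)> {
    // Pass 1: collect type knowledge before choosing any table.
    let mut numeric: HashSet<&'static str> = HashSet::new();
    for clause in clauses {
        if let Clause::NumericPredicate { var } = clause {
            numeric.insert(*var);
        }
    }

    // Pass 2: choose tables and generate aliases such as `datoms00`.
    let mut counters: HashMap<&'static str, usize> = HashMap::new();
    let mut aliases = Vec::new();
    for clause in clauses {
        if let Clause::Pattern { value_var } = clause {
            let table = if numeric.contains(value_var) {
                Table::Datoms
            } else {
                Table::AllDatoms
            };
            let n = counters.entry(table.name()).or_insert(0);
            aliases.push((table, format!("{}{:02}", table.name(), *n)));
            *n += 1;
        }
    }
    aliases
}
```

For the two clauses in this issue, [?x _ ?y] followed by [(< ?y 10)], the pattern is assigned `datoms00` even though the predicate appears after it.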
Reference: greg/mentat#151