[db] Optimize bulk insertions using fixed-size statement caches and base expansion #118
Labels
No labels
A-build
A-cli
A-core
A-design
A-edn
A-ffi
A-query
A-sdk
A-sdk-android
A-sdk-ios
A-sync
A-transact
A-views
A-vocab
P-Android
P-desktop
P-iOS
bug
correctness
dependencies
dev-ergonomics
discussion
documentation
duplicate
enhancement
enquiry
good first bug
good first issue
help wanted
hygiene
in progress
invalid
question
ready
size
speed
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: greg/mentat#118
Loading…
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This ticket tracks making bulk insertions (like at https://github.com/ncalexan/mentat/blob/db-tests/db/src/db.rs#L631) more efficient. This is follow-up to https://github.com/mozilla/mentat/pull/214#discussion_r99420334.
I don't know the best way to do this, so there's some experimentation needed. I can think of the following considerations:
:db/tx :db/txInstant MILLISECONDS
datom)So we want to cache a certain number of insertion statements for some number of assertions, and then chunk our input into those insertion statements. If we had unlimited bindings per statement and unlimited statement handles, we might do a base-2 expansion and lazily cache insertions of 1, 2, 4, 8, 16, assertions. However, given our considerations, it might be better to cache insertions for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, (999 / bindings-per-insertion) assertions, effectively dividing the parameter space into "little" and "huge" regimes.
If we wanted to be really fancy, we could determine our caching strategy dynamically at initialization time, or we could generate different strategies at compile time for different hardware profiles (like most FFI implementations do).
This seems to me similar to "Montgomery ladders" for fast multiplication and "base phi expansions" for doing arithmetic in algebraic groups that have non-trivial computable morphisms, with some additional restrictions thrown in.