Investigate methods of reducing the on-disk size of Mentat databases #47
Labels
No labels
A-build
A-cli
A-core
A-design
A-edn
A-ffi
A-query
A-sdk
A-sdk-android
A-sdk-ios
A-sync
A-transact
A-views
A-vocab
P-Android
P-desktop
P-iOS
bug
correctness
dependencies
dev-ergonomics
discussion
documentation
duplicate
enhancement
enquiry
good first bug
good first issue
help wanted
hygiene
in progress
invalid
question
ready
size
speed
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: greg/mentat#47
Loading…
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This document highlights many of the issues and concerns https://docs.google.com/document/d/14ywV4PlBAdOsJrxd7QhMcxbo7pw8cdc1kLAb7I4QhFY/edit?ts=5b7b7f87.
Some work to measure the size of the database when storing history is in https://github.com/mozilla/application-services/pull/191. For my places.sqlite (100k places, 150k visits), it gets around 200MB larger.
It's worth noting that 'disk usage' was one of the primary concerns reported by user research for Fenix (although it's not clear if this size increase (relative to places) is the kind of thing that would make a dent relative to stuff like caches and the like -- A very informal poll of some friends of mine found that Fennec typically uses around 500MB of space (app + data), another 200MB isn't a trivial increase, but doesn't substantially change where we are in terms of app size).
Some bugs which may help (suggested by @rnewman):
:db.type/ref
to a{ :db/ident :my/keyword }
, so this doesn't seem like a high priority.I think something like sqlite's zipvfs extension would likely help (as the databases compress well), but have not tried it. Implementing it ourselves is likely beyond the scope of this effort (I took a look at the effort required and it wasn't exactly trivial). Additionally, whatever we do would need to somehow integrate with sqlcipher (I also took a look at bolting compression into sqlcipher before the encryption, but the fact that this makes the block output a variable size seemed to make this problematic).
Other notes:
compress
/uncompress
options of FTS4 did not help, since the strings in each column are relatively small. Additionally, the performance overhead here was substantial even for a very fast compressor (LZ4).Additional concerns exist around the fact that this problem may be exacerbated by materialized views (perhaps #33 will help or prevent this?)
Original by @thomcc