offering collaboration on all-rust storage engine, curious about representative benchmarks of yours #187

Open
opened 2020-08-06 16:56:22 +00:00 by gburd · 0 comments
gburd commented 2020-08-06 16:56:22 +00:00 (Migrated from github.com)

Hey! I have an all-rust KV store I'm building, sled (https://github.com/spacejam/sled), which has three main goals:

  • minimize non-rust compile-time dependencies for stateful systems
  • serve as a flexible platform for developing generalizable stateful testing techniques for the rust ecosystem
  • beat traditional B+ trees for write throughput and LSM trees for read latency

I'm about to release 0.15 as alpha, encouraging users with recomputable datasets to help exercise the system and hopefully violate as many of my bad assumptions as possible. On the table for 0.16 is an initial implementation of serializable transactions based on a simplified approach similar to cicada (http://15721.courses.cs.cmu.edu/spring2018/papers/06-mvcc2/lim-sigmod2017.pdf). We are far from the point where atomic fetch add is our bottleneck, but the contention-aware safety checks are definitely on the table.
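
For context on the "atomic fetch add" remark: the concern, as I understand it, is allocating transaction timestamps from a single shared counter, which every thread has to touch. Below is a minimal sketch of that centralized scheme. The `TimestampOracle` type and `allocate_concurrently` helper are hypothetical names made up for illustration; this is not sled's API and not the cicada design itself.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical centralized timestamp allocator (illustrative only).
struct TimestampOracle {
    next: AtomicU64,
}

impl TimestampOracle {
    fn new() -> Self {
        TimestampOracle { next: AtomicU64::new(1) }
    }

    /// Returns a unique, monotonically increasing timestamp.
    /// Every caller contends on the same counter, which is the
    /// scalability limit the remark refers to.
    fn next_timestamp(&self) -> u64 {
        self.next.fetch_add(1, Ordering::SeqCst)
    }
}

/// Spawns `threads` workers that each draw `per_thread` timestamps
/// from one shared oracle, and returns all of them sorted.
fn allocate_concurrently(threads: usize, per_thread: usize) -> Vec<u64> {
    let oracle = Arc::new(TimestampOracle::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let oracle = Arc::clone(&oracle);
            thread::spawn(move || {
                (0..per_thread)
                    .map(|_| oracle.next_timestamp())
                    .collect::<Vec<u64>>()
            })
        })
        .collect();
    let mut all: Vec<u64> = handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker panicked"))
        .collect();
    all.sort_unstable();
    all
}
```

Calling `allocate_concurrently(4, 1000)` hands out 4000 distinct timestamps with no gaps or duplicates. The scheme is simple and correct, and it only becomes a throughput problem once many cores hammer the counter at once.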

I wanted to reach out early to say that I'm quite willing to support the underlying storage semantics that mentat requires, and that I would LOVE to build a representative benchmark of those semantics (and those of other projects) so that I can work towards a high-quality experience for a broad range of the stateful rust ecosystem.

My main question for you folks is: do you have a specific benchmark or set of benchmarks that you prioritize and strive to minimize regressions with? How are you directing tuning efforts? I would love to have some insight into the specific performance goals of your project! Also, if SQLite is less than optimal for some of your required use cases, I would be quite eager to hear about those!

SQLite is a wonderful choice due to its reliability, and my goal is NOT to have something that I can tell people is more reliable. Being an SRE operating distributed databases has made me quite cautious about new storage technologies, and I am not trying to downplay the risks of using a new one. But I am trying to bring modern performance and reliability techniques to the rust and stateful systems engineering ecosystems.

Keep up the good work :)

Reference: greg/mentat#187