Avoid monomorphizing large functions multiple times #279

New issue

Open

opened 2020-08-06 16:57:47 +00:00 by gburd · 0 comments

gburd commented

2020-08-06 16:57:47 +00:00

(Migrated from github.com)

Related to #772.

Most of this is taken from https://gist.github.com/thomcc/09f48b222e4fdf69c43479ee65daf2ab (which is taken from a list of top functions of mentat_ffi when compiled as a cdylib)

If you look at that listing there are a number of duplicated functions, which are almost all going to be a result of the function being monomorphized multiple times. We should avoid these for anything large.

IntoIterator seems like a frequent offender here. In other code ,stuff like Into and AsRef are often responsible.

transact_entities is duplicated at least twice and accounts for over 40kb according to cargo-bloat, , and probably could just take a Vec<Entity<V>> (this is still generic, but in practice TransactableValue is only ever ValueAndSpan AFAICT, so it will only be monomorphized a single time).
transact_simple_terms is likely the cause of the previous's bloat, and if it were only called with a single type it would likely not be getting inlined into transact_entities. Note also that we could fairly easily avoid the generic argument, since it's just used as something to pass to Generation::from.
project_elements should probably just take a Vec,

There are probably more that just get inlined too and contribute to their caller getting bigger. There's a balance between this and performance though, so it's not 100% cut and dry, but for huge functions the overhead of e.g. collecting an iterator into a vector seems likely to be negligible.

Related to #772. Most of this is taken from https://gist.github.com/thomcc/09f48b222e4fdf69c43479ee65daf2ab (which is taken from a list of top functions of mentat_ffi when compiled as a cdylib) If you look at that listing there are a number of duplicated functions, which are almost all going to be a result of the function being monomorphized multiple times. We should avoid these for anything large. IntoIterator seems like a frequent offender here. In other code ,stuff like `Into` and `AsRef` are often responsible. - [`transact_entities`](https://github.com/mozilla/mentat/blob/5fe4f12d32379415e36eb80877cbeec6fd443ffa/db/src/tx.rs#L619) is duplicated at least twice and accounts for over 40kb according to cargo-bloat, , and probably could just take a `Vec<Entity<V>>` (this is still generic, but in practice TransactableValue is only ever ValueAndSpan AFAICT, so it will only be monomorphized a single time). - [`transact_simple_terms`](https://github.com/mozilla/mentat/blob/5fe4f12d32379415e36eb80877cbeec6fd443ffa/db/src/tx.rs#L633) is likely the cause of the previous's bloat, and if it were only called with a single type it would likely not be getting inlined into transact_entities. Note also that we could fairly easily avoid the generic argument, since it's just used as something to pass to `Generation::from`. - [`project_elements`](https://github.com/mozilla/mentat/blob/60a57ea493a683199529da13425365beb3becf7c/query-projector/src/project.rs#L179) should probably just take a Vec, There are probably more that just get inlined too and contribute to their caller getting bigger. There's a balance between this and performance though, so it's not 100% cut and dry, but for huge functions the overhead of e.g. collecting an iterator into a vector seems likely to be negligible.