RUST-2330 Serialize Facet types to BSON binary#656
RUST-2330 Serialize Facet types to BSON binary#656abr-egn wants to merge 18 commits intomongodb:mainfrom
Facet types to BSON binary#656Conversation
| # make sure we're running in the repo root | ||
| cd $(dirname $0)/.. | ||
|
|
||
| RUSTFLAGS="--cfg mongodb_internal_bench" cargo +nightly bench tests::bench --all-features No newline at end of file |
There was a problem hiding this comment.
cargo bench requires using nightly; I put the benchmarks behind a custom cfg flag rather than a feature so they don't get picked up even when using --all-features.
| } | ||
|
|
||
| #[derive(Debug)] | ||
| struct Serializer { |
There was a problem hiding this comment.
Because the Facet serialization machinery feeds a stream of events rather than using visitors, the structure here is quite different than the Serde serialization; I find this setup easier to follow but maybe that's just because I wrote it.
| values: | ||
| - id: "min" | ||
| display_name: "1.81 (minimum supported version)" | ||
| display_name: "1.85 (minimum supported version)" |
There was a problem hiding this comment.
This along with the edition change allow let-chains which are really handy and this version's well past our declared threshold.
This did come with a bit more churn than I expected (formatted order of imports changed), I can pull that out if you think that'd be better.
| } | ||
| } | ||
|
|
||
| impl TryFrom<&RawArray> for Bson { |
There was a problem hiding this comment.
These (and similar added elsewhere) made testing more convenient and they seem fine to export.
| /// Note that accessing elements is an O(N) operation, as it requires iterating through the document | ||
| /// from the beginning to find the requested key. | ||
| #[derive(Clone, PartialEq)] | ||
| #[cfg_attr(feature = "facet-unstable", derive(facet::Facet), facet(opaque))] |
There was a problem hiding this comment.
All of the module types that derive Facet are also tagged with opaque; this means that you can't use facet machinery to inspect the internals and unless a [de]serializer special cases them (like bson::facet::to_vec does) they can only be [de]serialized via a proxy type (like bson::facet::ExtJson). This seems pretty ideal to me since it means we're not exposing internal field structure. There's maybe a case for making those structs that have all pub fields non-opaque, but that seems more like a "consider it if anyone asks for it" kind of thing.
| /// A BSON value backed by owned raw BSON bytes. | ||
| #[derive(Debug, Clone, PartialEq)] | ||
| #[cfg_attr(feature = "facet-unstable", derive(facet::Facet), facet(opaque))] | ||
| #[repr(C)] |
There was a problem hiding this comment.
Facet requires enums to have a specified repr since the default unspecified one makes it basically impossible to say anything about in-memory layout.
There was a problem hiding this comment.
These show facet to be 3-4x slower than serde, which is a bummer. I'm not sure how much of that is general facet overhead and how much could be reclaimed from optimization of the facet serializer. The structure of the latter should mean it's pretty comparable to the serde serializer, e.g. no intermediate buffers, short-circuiting for raw bson bytes, etc.
RUST-2330
This adds
bson::facet::to_vec, which serializes arbitrary types that implementFacetto BSON bytes, properly handling embedded Rust BSON types.