Serde for trait objects - Part 1: Overview
Consider
Here we have a trait object, Box<dyn Message>.
The aim of this series of blog posts is to make this code compile.
Part of the solution will be to add a macro on top of the trait definition
Remark: All topics covered here are well-known. We follow typetag.
Remark: If you are in a situation where you want to serialize a trait object, please take a step back. Check if you can replace your trait object with an enum. In my experience, the enum approach is much easier to work with.1
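To make the enum suggestion concrete, here is a minimal, std-only sketch. The names AnyMessage, Greeting, Farewell and describe are hypothetical and not taken from this series. With serde, one would typically derive Serialize and Deserialize on such an enum instead of writing any of the machinery discussed in the following posts.

```rust
// Hypothetical example types; they are not part of the original series.
#[derive(Debug, Clone, PartialEq)]
enum AnyMessage {
    Greeting { name: String },
    Farewell { name: String },
}

impl AnyMessage {
    // One exhaustive `match` replaces dynamic dispatch through `dyn Trait`.
    // The compiler knows every variant, so nothing has to be looked up at runtime.
    fn describe(&self) -> String {
        match self {
            AnyMessage::Greeting { name } => format!("Hello, {}!", name),
            AnyMessage::Farewell { name } => format!("Goodbye, {}!", name),
        }
    }
}
```

The price of this approach is that all variants must be known in one place, which is exactly the restriction that trait objects lift.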
In this series of blog posts I’m explaining how to use serde with trait objects:
- Part 1: Overview
- Part 2: Serialization
- Part 3: Deserialization
- Part 4: Registry
- Part 5: Lifetimes
- Part 6: Sync/Send
- Part 7: Macro Part A: Trait
- Part 8: Macro Part B: Implementation
How not to do it
First, we try a very naive approach - we start with the following snippet:
How do we implement the todo?
First idea: Recursion
We start with a stupid idea, a recursive call. Later, we will write recursive code by accident.
As expected, this yields a runtime overflow:
Second idea: Trait bound
We use serde::Serialize as a trait bound:
trait Trait: serde::Serialize {}

#[derive(serde::Serialize)]
struct S {}

impl Trait for S {}

fn main() {
    let s = S {};
    // This is the line the compiler rejects.
    let t: &dyn Trait = &s;
}
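This code does not compile, because our trait is no longer object-safe: serde::Serialize::serialize is generic over the serializer type, and a trait with a generic method cannot be turned into a trait object.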
Third idea: Erased serde
The solution is to use erased_serde::Serialize. This is an object safe version of the Serialize trait. Instead of using a generic serializer argument, it uses a trait object:
Note that erased_serde::Serialize is sealed and cannot be implemented directly. There is a blanket implementation of it for types implementing serde::Serialize, so we can use it.
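The erasure idea can be illustrated without any dependencies. The following is a std-only sketch under stated assumptions: MiniSerializer, MiniSerialize, ErasedSerialize, StringSink, Point and erased_to_string are hypothetical stand-ins and are not the real definitions from serde or erased_serde. It only shows the shape of the trick: the generic serializer parameter becomes a trait object, and a blanket implementation bridges from the generic trait to the erased one.

```rust
// Hypothetical, std-only stand-ins; NOT the real serde / erased_serde items.

/// A tiny "serializer": anything that can receive string fragments.
trait MiniSerializer {
    fn write_str(&mut self, s: &str);
}

/// Generic, serde-style trait. The generic method means that
/// `dyn MiniSerialize` is not allowed.
trait MiniSerialize {
    fn serialize<S: MiniSerializer + ?Sized>(&self, serializer: &mut S);
}

/// Erased variant: the serializer is passed as a trait object, so the
/// method is not generic and `dyn ErasedSerialize` is allowed.
trait ErasedSerialize {
    fn erased_serialize(&self, serializer: &mut dyn MiniSerializer);
}

/// Blanket implementation: every `MiniSerialize` type is also `ErasedSerialize`.
impl<T: MiniSerialize> ErasedSerialize for T {
    fn erased_serialize(&self, serializer: &mut dyn MiniSerializer) {
        // Here `S` is instantiated with `dyn MiniSerializer`.
        MiniSerialize::serialize(self, serializer)
    }
}

/// A serializer that collects everything into a `String`.
struct StringSink(String);

impl MiniSerializer for StringSink {
    fn write_str(&mut self, s: &str) {
        self.0.push_str(s);
    }
}

struct Point {
    x: i32,
    y: i32,
}

impl MiniSerialize for Point {
    fn serialize<S: MiniSerializer + ?Sized>(&self, serializer: &mut S) {
        serializer.write_str(&format!("{{\"x\":{},\"y\":{}}}", self.x, self.y));
    }
}

/// Accepts a trait object, which the generic trait could not offer.
fn erased_to_string(value: &dyn ErasedSerialize) -> String {
    let mut sink = StringSink(String::new());
    value.erased_serialize(&mut sink);
    sink.0
}
```

Point only implements the generic trait, yet it can be passed as &dyn ErasedSerialize thanks to the blanket implementation.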
Using the trait bound, this is our current code.
trait Trait: erased_serde::Serialize {}

#[derive(serde::Serialize)]
struct S {}

impl Trait for S {}

impl serde::Serialize for dyn Trait {
    // Note: the type parameter `S` shadows the struct `S` inside this function.
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        todo!("We have to implement this")
    }
}

fn main() {
    let s = S {};
    let t: &dyn Trait = &s;
    let ser = serde_json::to_string(t).unwrap();
}
We still need to implement the body of serialize (the todo!() call), the serialization code.
But this is easy: erased_serde contains the right function, thankfully even called serialize:
trait Trait: erased_serde::Serialize {}

#[derive(serde::Serialize)]
struct S {}

impl Trait for S {}

impl serde::Serialize for dyn Trait {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        // Forward to the erased implementation of the concrete type.
        erased_serde::serialize(self, serializer)
    }
}

fn main() {
    let s = S {};
    let t: &dyn Trait = &s;
    let ser = serde_json::to_string(t).unwrap();
}
This compiles and works as expected.
Note: if we change the implementation by adding a borrow, from erased_serde::serialize(self, serializer) to erased_serde::serialize(&self, serializer), we end up with a stack overflow at runtime.
Method resolution is sometimes complicated!
Discussion
Finally, we can serialize our trait object. But, unfortunately, this is not good enough. Let’s consider the following situation, having two different structs both implementing a given trait:
This outputs:
Both trait objects yield the same serialized JSON string! How should we deserialize this correctly?
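The ambiguity can be reproduced without serde. The sketch below is std-only and writes the JSON by hand. The trait Animal, the structs Cat and Dog, and the functions to_json and serialize_all are hypothetical stand-ins, not the snippet from this post.

```rust
// Hypothetical stand-ins: hand-written JSON instead of serde.
trait Animal {
    fn to_json(&self) -> String;
}

struct Cat {
    name: String,
}

struct Dog {
    name: String,
}

impl Animal for Cat {
    fn to_json(&self) -> String {
        // Only the fields are written; the type name `Cat` is lost.
        format!("{{\"name\":\"{}\"}}", self.name)
    }
}

impl Animal for Dog {
    fn to_json(&self) -> String {
        // Same shape as `Cat`, so the output is indistinguishable.
        format!("{{\"name\":\"{}\"}}", self.name)
    }
}

/// Serializes every trait object in the slice.
fn serialize_all(animals: &[Box<dyn Animal>]) -> Vec<String> {
    animals.iter().map(|a| a.to_json()).collect()
}
```

Given only one of the resulting strings, a deserializer has no way to decide whether to build a Cat or a Dog.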
Remark: This is the same issue that serde has with the serialization of enums, if one opts into the untagged representation. See serde enum representation for a detailed discussion.
As always in this space (“serialization of trait objects”), learn what enums are doing, and avoid what they are avoiding!2
Digression: Dotnet
Let’s think outside of our rusty box and check what dotnet is doing.3 The most important part of the snippet is its output, shown below, so please feel free to skip the C# code.
This prints:
We see: Both structs are enhanced by some type information, consisting of the type name and the crate (called “assembly” in the dotnet world) it was defined in. How does this help with deserialization? Dotnet has “reflection”, which means we can query the runtime during deserialization. Hence we can give our type information to the runtime, and it will look up the type for us and give us some constructor, which in turn allows us to deserialize the data into the given type. Finally, we cast the deserialized instance to our interface, and we are done.
So, here are our tasks.
- Task 1: enhance the serialized data with some type information.
- Task 2: build some kind of runtime type database which allows querying types given our (de-)serialized type information.
- Task 3: deserialize the serialized type into our trait object.
Since Task 3 depends on the API of Task 2, we will proceed in the following order:
- Task 1: Mostly straightforward, since we already can serialize trait objects
- Task 3: An exercise in the visitor pattern
- Task 2: Some trade-offs will be discussed
Note that, once we have completed those three tasks, we will still need to write derive macros and so on, so we have a reasonable amount of work in front of us.
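To see how the three tasks fit together, here is a deliberately simplified, std-only sketch. Everything in it is an assumption made for illustration: the toy wire format "TypeName|payload", the trait methods type_name and payload, the structs Hello and Bye, and the functions to_wire, from_wire and registry. It is not the design developed in the following posts, which works with serde.

```rust
use std::collections::HashMap;

// Hypothetical trait; the real series uses serde instead of these methods.
trait Message {
    fn type_name(&self) -> &'static str;
    fn payload(&self) -> String;
}

struct Hello {
    text: String,
}

struct Bye {
    text: String,
}

impl Message for Hello {
    fn type_name(&self) -> &'static str {
        "Hello"
    }
    fn payload(&self) -> String {
        self.text.clone()
    }
}

impl Message for Bye {
    fn type_name(&self) -> &'static str {
        "Bye"
    }
    fn payload(&self) -> String {
        self.text.clone()
    }
}

/// Task 1: enhance the serialized data with type information.
fn to_wire(msg: &dyn Message) -> String {
    format!("{}|{}", msg.type_name(), msg.payload())
}

/// A constructor turns a payload into a boxed trait object.
type Constructor = fn(&str) -> Box<dyn Message>;

fn make_hello(payload: &str) -> Box<dyn Message> {
    Box::new(Hello { text: payload.to_string() })
}

fn make_bye(payload: &str) -> Box<dyn Message> {
    Box::new(Bye { text: payload.to_string() })
}

/// Task 2: a runtime "type database" from type name to constructor.
fn registry() -> HashMap<&'static str, Constructor> {
    let mut map: HashMap<&'static str, Constructor> = HashMap::new();
    map.insert("Hello", make_hello);
    map.insert("Bye", make_bye);
    map
}

/// Task 3: read the tag, look up the constructor, build the trait object.
/// Returns `None` for malformed input or an unknown type name.
fn from_wire(
    data: &str,
    registry: &HashMap<&'static str, Constructor>,
) -> Option<Box<dyn Message>> {
    let (name, payload) = data.split_once('|')?;
    let constructor = registry.get(name)?;
    Some(constructor(payload))
}
```

In this sketch the registry is filled by hand. Collecting the entries automatically is where the trade-offs of Task 2 come in.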
This is it for today, the next post will be about serialization and complete task 1.
Footnotes
1. (Rant) This series is going to be my reference answer to the question: “I’m struggling with trait objects. How do I solve problem XYZ?” So that I can say: “I suggest using an enum instead of a trait object. This is often much easier. For example, if you want to use serde for your trait object, you need to work through all of the following.”
2. (Joke) Note that enums avoid trait objects, hence you should also avoid trait objects ;-)
3. I’m aware that this snippet is using Newtonsoft.Json instead of System.Text.Json, but this is a topic for another day …