Over the weekend, a friend of mine who is currently learning Rust
asked me a question. He was using [`serde`](https://serde.rs/) for
serialization, and he wanted a single function that took a file path
as argument, opened the file, deserialized it as a particular type,
and returned that value, where the return type could be inferred from
context. (For this example, I'm going to ignore error-handling and
just use `.unwrap()` to bail out on failures.) I hadn't used `serde`
much myself, so I briefly assumed the function in question would just
look like this:

```.rust
fn load_from_file<T>(path: String) -> T
where
    T: serde::Deserialize
{
    let mut file = File::open(path).unwrap();
    serde_json::from_reader(&mut file).unwrap()
}
```

But it wasn't quite so easy and, as with most of the trickier parts of
Rust, the problem comes down to lifetimes. The
`serde::Deserialize` trait takes a single parameter: a lifetime which
corresponds to the lifetime _of the input source_. In the above
function, the input source is the file from which we're reading, but it
might also be a `str` (using the `from_str` function) or a slice of
bytes (using the `from_slice` function). In any case, the deserializer
needs to know how long its input source lives, so we can make sure
that the input source doesn't get suddenly closed or freed while it's
in the middle of parsing.
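
To make that concrete without pulling in `serde`, here is a minimal sketch of a parser whose output borrows from its input. The `KeyValue` type and the `parse_pair` function are hypothetical names invented for this illustration, not part of any library; the only point is that the `'src` lifetime ties the result to the input it was parsed from.

```rust
/// Hypothetical parsed value that keeps slices of the original input
/// instead of copying them.
struct KeyValue<'src> {
    key: &'src str,
    value: &'src str,
}

/// Splits `key = value` text. The returned slices point into `input`,
/// so the result cannot outlive it.
fn parse_pair<'src>(input: &'src str) -> Option<KeyValue<'src>> {
    let (key, value) = input.split_once('=')?;
    Some(KeyValue {
        key: key.trim(),
        value: value.trim(),
    })
}
```

Because `KeyValue<'src>` holds slices of `input`, the borrow checker won't let the input go away while a `KeyValue` is still in use. That is the same kind of guarantee the lifetime parameter on `Deserialize` exists to express.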

Okay, that means we need a lifetime parameter to give to
`serde::Deserialize`. Let's try the obvious thing and add a lifetime
parameter to the signature of the function:

```.rust
fn load_from_file<'de, T>(path: String) -> T
where
    T: serde::Deserialize<'de>
{
    let mut file = File::open(path).unwrap();
    serde_json::from_reader(&mut file).unwrap()
}
```

This looks nice at a glance, but it doesn't compile! You can plug this
code snippet into `rustc` and it will (with its characteristic
helpfulness) suggest exactly the correct code to write. But I'm less
interested here in _what_ to write, and more in _why_ we're writing
it: why does this not work, and why does the suggested fix work?

# What Do Generic Parameters Mean?

Let's step back to very basic Rust: when you have a generic parameter
to (say) a function, what you're saying is that you want that
particular implementation detail to be supplied by the _use_ of the
function. Here's an incredibly trivial example: I can write a
polymorphic identity function, a function that simply returns the
value given it, by giving it a type parameter like this:

```.rust
fn identity<T>(x: T) -> T { x }
```

When we _use_ the `identity` function with a value of a particular
type (say, by calling `identity(22u32)`), we're also implicitly providing
a concrete choice for the type `T`, which Rust can infer from the type
of the argument. Rust also allows us to pass this type parameter
explicitly, with the slightly unwieldy syntax
`identity::<u32>(22)`. Either way, the use site of `identity` provides
the information necessary to choose a concrete type `T`.

The same goes for lifetime parameters: when we write a function like

```.rust
fn ref_identity<'a, T>(x: &'a T) -> &'a T { x }
```

what we're saying is that the reference we're taking as argument lives
_some_ amount of time, but that amount of time is going to be known at
the call site, and it'll get that information inferred from individual
uses of `ref_identity`. When we call `ref_identity(&foo)`, we're
saying that the lifetime parameter `'a` is going to be however long
`foo` lives in that context.
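
As a small sketch of those call sites side by side, the block below repeats the two functions and gathers a few uses in one place. The `call_sites` helper is a hypothetical name added purely for illustration.

```rust
fn identity<T>(x: T) -> T { x }

fn ref_identity<'a, T>(x: &'a T) -> &'a T { x }

// Hypothetical helper that collects a few call sites in one place.
fn call_sites() -> (u32, u32, usize) {
    // `T` is inferred as `u32` from the argument.
    let inferred = identity(22u32);
    // `T` is supplied explicitly by the caller.
    let explicit = identity::<u32>(22);

    // `'a` is chosen here, at the call site: it is a lifetime during
    // which the local `foo` is alive and borrowed.
    let foo = String::from("hello");
    let borrowed = ref_identity(&foo);

    (inferred, explicit, borrowed.len())
}
```

In every case, it's the code calling the function that decides what `T` and `'a` are; the function body has to work with whatever it's handed.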

# Back to `load_from_file`

Okay, so with that, let's look at our first pass at writing
`load_from_file`:

```.rust
fn load_from_file<'de, T>(path: String) -> T
where
    T: serde::Deserialize<'de>
{
    let mut file = File::open(path).unwrap();
    serde_json::from_reader(&mut file).unwrap()
}
```

What's the problem here? Well, our type doesn't actually reflect what
the body of our function does. Remember that the lifetime parameter we
give to `Deserialize` corresponds to the lifetime of the input source
to the deserializer, and we already know the lifetime of our source:
it's the lifetime of our `file` variable. That lifetime isn't
something we need to get from the caller. For that matter, it's not a
piece of information that the caller would even have access to!
Expressing that as a parameter in this instance is blatantly
incorrect.

But then we've got a problem, because we need _something_ to give
`Deserialize` as the lifetime parameter, and Rust doesn't have a way
of expressing, "The lifetime of _this variable here in the scope I am
currently defining_." We're caught: we need some lifetime parameter
there, but it can't be an argument, and we can't name the specific
individual lifetime that we know it should be.

So instead, one way of writing this is by saying, "This works for
_any_ lifetime we might care to give it." The difference here is
subtle but important: we aren't expressing the notion of "any lifetime
you, the caller, want to give me", we are expressing the notion of
"any lifetime _at all_."

It turns out that Rust has a syntax for this, which uses the `for`
keyword:

```.rust
fn load_from_file<T>(path: String) -> T
where
    T: for<'de> serde::Deserialize<'de>
{
    let mut file = File::open(path).unwrap();
    serde_json::from_reader(&mut file).unwrap()
}
```

With this change, the function compiles. And notice that the type
parameter list now includes only one thing: the `T` type that we're
deserializing to, which is exactly what we want!

The key here is the `for<'a> T` syntax: the `for` acts as a binder for
one or more fresh lifetime parameters, which are then used in the
following type. Here, we're introducing a new `'de` lifetime and then
supplying it to `Deserialize`. Importantly, this lifetime is now
quantified over _all possible_ lifetimes, not merely a lifetime that
the calling context might supply. And of course, 'all possible
lifetimes' includes the lifetime of the `file` variable inside the
function!
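
The same distinction can be sketched with nothing but the standard library. Everything in the block below is an assumption made for illustration: `Parse` is a hypothetical stand-in for `serde::Deserialize`, and `parse_from` and `parse_local` are made-up names. It is not how `serde` is implemented, only a small model of the two kinds of bound.

```rust
/// Hypothetical stand-in for `serde::Deserialize<'de>`: a value that can
/// be built from input borrowed for the lifetime `'src`.
trait Parse<'src>: Sized {
    fn parse(input: &'src str) -> Self;
}

// An owned result borrows nothing, so this impl covers every `'src`.
impl<'src> Parse<'src> for u32 {
    fn parse(input: &'src str) -> Self {
        input.trim().parse().unwrap()
    }
}

// A borrowed result is tied to one particular `'src`.
impl<'src> Parse<'src> for &'src str {
    fn parse(input: &'src str) -> Self {
        input.trim()
    }
}

// The caller owns the input, so a caller-chosen lifetime is the right shape.
fn parse_from<'src, T: Parse<'src>>(input: &'src str) -> T {
    T::parse(input)
}

// The input is a local variable, so the bound has to hold for *every*
// lifetime, including the one `text` has inside this function.
fn parse_local<T>(raw: u32) -> T
where
    T: for<'src> Parse<'src>,
{
    let text = format!(" {} ", raw); // lives only inside this function
    T::parse(text.as_str())
}
```

In this sketch, `parse_local::<u32>(42)` is accepted, while `parse_local::<&str>(42)` is rejected by the compiler: `&'a str` only implements `Parse<'a>` for its own lifetime, not for all of them, which is exactly what we want when the input is about to be dropped.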

The `for<'a> T` syntax is a feature called
[Higher-Ranked Trait Bounds](https://doc.rust-lang.org/nomicon/hrtb.html#higher-rank-trait-bounds-hrtbs),
and this feature was specifically necessary to support
[unboxed closures](https://github.com/rust-lang/rfcs/blob/master/text/0387-higher-ranked-trait-bounds.md),
but it ends up being useful in other situations, like this one!
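
For a sense of the closure case, here is a minimal sketch; `apply_to_local` is a hypothetical function written for this example. The closure has to accept a reference to a value that only exists inside the function, so its bound is quantified over every lifetime rather than one the caller picks.

```rust
// `f` must work for a borrow of *any* lifetime, because the string it is
// handed is created (and dropped) inside this function.
fn apply_to_local<F>(f: F) -> usize
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    let local = String::from("  padded  ");
    f(local.as_str()).len()
}
```

Calling `apply_to_local(|s| s.trim())` hands the closure a borrow of `local`. The more familiar spelling `F: Fn(&str) -> &str` is shorthand for the same `for<'a>` bound, which is why the explicit syntax rarely shows up when working with closures.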