Rust: Lifetimes

9 March 2023

Lifetimes seem to trip up every programmer new to Rust: just when you start to feel like you’re getting the hang of things, the compiler complains that a variable “doesn’t live long enough” and you find yourself adding 'a or 'static in random places without really understanding what you’re doing.

I have a theory about why lifetimes are hard to wrap your head around, and it’s not that they’re particularly complicated. The trouble is that the Rust compiler does a pretty good job of figuring out lifetimes for simple cases. That’s a good thing of course -- otherwise we’d have to litter our code with lifetime annotations -- but it means that the first time you encounter them will be one of the complicated cases where the compiler needs the programmer’s help. You’re thrown in at the deep end, and that’s why it feels frustrating.

Let’s try to avoid that here and start at the beginning. Why do we need to worry about lifetimes? In Rust: The Stack and the Heap I talked about the lifetime of stack- and heap-allocated objects. That was fairly straightforward. Things get more interesting when we look at references: if x is a reference, then at any place in your code that says *x the Rust compiler needs to make sure that the value x refers to is still valid. That’s not the case in other languages; in Go or Java this would be the garbage collector’s responsibility, in C or C++ it’s the programmer’s responsibility.

If a reference exists only inside a single function this is usually easy to figure out:

struct Point {
    x: i32,
    y: i32,
}

fn use_point() {
    let p = Point { x: 3, y: 0 };
    let x = &p.x;
    println!("x = {}", *x);   // ok
}

Since x is a reference to a field inside p, we should never be able to dereference x after p has been deallocated. In use_point, p is deallocated at the end of the function, so dereferencing x anywhere in the function is fine. Here’s an example that will result in a compiler error:

fn use_point_bad() {
    let x: &i32;
    {
        let p = Point { x: 3, y: 0 };
        x = &p.x;
    }
    println!("x = {}", *x);   // error!
}

The example is a bit contrived, but I think you get the idea: I defined the inner scope (marked by {…}) so that p would be deallocated just before we try to access one of its fields with *x. The compiler will reject the code with this error:

`p.x` does not live long enough

The problem is that x refers to p.x and the lifetime of x is longer than the lifetime of p.x. That’s what we need to avoid: the lifetime of a reference can be the same as the lifetime of whatever it refers to, or it can be shorter, but it must never be longer.
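One way to make the contrived example compile, assuming we’re free to move the declaration of p, is to hoist p into the outer scope so that it lives at least as long as x. This is only a sketch; the name use_point_fixed and the return value are my additions for illustration:

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical fixed variant of use_point_bad: p is declared in the
// outer scope, so it outlives every use of the reference x.
fn use_point_fixed() -> i32 {
    let p = Point { x: 3, y: 0 };
    let x: &i32;
    {
        x = &p.x; // the borrow starts inside the inner scope...
    }
    // ...but p is still alive here, so dereferencing x is fine.
    println!("x = {}, y = {}", *x, p.y);
    *x
}
```

The reference now has a shorter lifetime than the value it points to, which is exactly the situation the rule allows.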

So what about that 'a syntax? So far we haven’t had to write any code to tell the compiler about lifetimes. That’s because within functions, the Rust compiler will infer lifetimes automatically. However, it won’t do that across function calls.

Lifetimes and functions

Let’s take a look at a function that returns a reference:

fn get_x<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}

What’s going on here? This is using the <…> syntax of generic functions, but instead of a type parameter we have 'a. The ' indicates this is a lifetime parameter. In the parameter and return types, &'a means “this is a reference with lifetime 'a”. The compiler will use this information when checking the lifetimes within the function (pretty simple here) and when checking the lifetimes in each function that calls get_x. For example, let’s say we have this call somewhere in our code:

let x = get_x(&p);

Just like the compiler can infer that p must be a Point and x will be a reference to i32, it can infer that &p and x must both have lifetime 'a, and it can figure out what 'a will be for this particular call. (This is a form of generics after all, so 'a will be different for different calls.)

The compiler actually has a bit more wiggle room here: a reference with a longer lifetime can be used wherever one with a shorter lifetime is expected, so it only needs to find a lifetime 'a that’s shorter than (or equal to) the lifetime of p and longer than (or equal to) the lifetime of x. This is a form of subtyping rather than a conversion: unlike turning an i32 into an i16, which needs an explicit cast and can lose information, shortening a lifetime happens implicitly and changes nothing at runtime.
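To make the wiggle room concrete, here’s a small sketch. The helper pick_first and the static ANSWER are hypothetical names I made up for this example. Both parameters share the single lifetime 'a, yet the call accepts one reference that is valid for the whole program and one that is only valid inside the calling function:

```rust
// Hypothetical helper: both arguments must agree on one lifetime 'a.
fn pick_first<'a>(first: &'a i32, _second: &'a i32) -> &'a i32 {
    first
}

static ANSWER: i32 = 42;

fn demo_shortening() -> i32 {
    let local = 7;
    // &ANSWER has type &'static i32, while &local is only valid inside
    // this function. The compiler picks an 'a that fits `local` and
    // treats the longer-lived &ANSWER as a reference with that
    // shorter lifetime.
    let r = pick_first(&ANSWER, &local);
    *r
}
```

The reverse direction doesn’t work: a function declared to return &'static i32 can’t hand back a reference to local.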

Here’s a slightly more complex case:

fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}

The return value can refer to either p or q, so the lifetime annotations ensure that the return value’s lifetime is no longer than the lifetime of either parameter. Functions can have more than one lifetime parameter:

fn get_both_x<'a, 'b>(p: &'a Point, q: &'b Point) -> (&'a i32, &'b i32) {
    (&p.x, &q.x)
}

Using 'a, 'b, … is the usual convention for lifetime parameters.
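Why bother with two parameters? Here’s a sketch of a caller that relies on them being separate. The function demo_two_lifetimes is hypothetical; it keeps the first reference around after q has gone out of scope:

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

fn get_both_x<'a, 'b>(p: &'a Point, q: &'b Point) -> (&'a i32, &'b i32) {
    (&p.x, &q.x)
}

fn demo_two_lifetimes() -> i32 {
    let p = Point { x: 1, y: 2 };
    let px: &i32;
    {
        let q = Point { x: 10, y: 20 };
        let (a, b) = get_both_x(&p, &q);
        px = a; // fine: a is tied to p only
        assert_eq!(*b, 10); // b is tied to q and must stay in this scope
    } // q is deallocated here
    *px // still valid, because p is still alive
}
```

If get_both_x used a single 'a for both parameters, this caller would be rejected, because 'a could then be no longer than the lifetime of q.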

Lifetime annotations on methods work the same way; you just need to remember that &self or &mut self is also a reference. Here’s an example:

impl Point {
    fn get_x_and_y<'a, 'b>(&'a self, msg: &'b str) -> (&'a i32, &'a i32) {
        println!("Message: {}", msg);
        (&self.x, &self.y)
    }
}

Lifetime elision

Looking at that last example you may be asking, do we really need the 'b parameter? It doesn’t do much in terms of restricting lifetimes for code that calls get_x_and_y. The Rust developers had the same idea, so they added a set of lifetime elision rules that let you leave out the annotations in simple cases (“elision” means leaving something out). Here they are, straight from the Rust Reference:

1. Each elided lifetime in the parameters becomes a distinct lifetime parameter.
2. If there is exactly one lifetime used in the parameters (elided or not), that lifetime is assigned to all elided output lifetimes.
3. In method signatures, if the receiver has type &Self or &mut Self, then the lifetime of that reference to Self is assigned to all elided output lifetime parameters.

Rules 1 and 2 mean that we can write the get_x function without any lifetime annotations:

fn get_x(p: &Point) -> &i32 {
    &p.x
}

We can also write the get_x_and_y method without annotations thanks to rules 1 and 3:

impl Point {
    fn get_x_and_y(&self, msg: &str) -> (&i32, &i32) {
        println!("Message: {}", msg);
        (&self.x, &self.y)
    }
}

The largest_x and get_both_x functions still need annotations. As a little exercise, remove the annotations from one of the example functions, then go through the lifetime elision rules yourself. Do they add all the annotations back? If not, that function needs explicit lifetime annotations.
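Here’s how that exercise plays out for two of the functions, as a sketch. The name get_x_expanded is my own, used only so that both versions can sit side by side:

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

// As written, with elided lifetimes:
fn get_x(p: &Point) -> &i32 {
    &p.x
}

// What the rules produce. Rule 1 gives p the lifetime 'a; rule 2 sees
// exactly one input lifetime and assigns it to the output.
fn get_x_expanded<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}

// For largest_x, rule 1 gives p and q two distinct lifetimes. Rule 2
// does not apply (two input lifetimes) and rule 3 does not apply
// (no self), so the output lifetime stays unknown and the compiler
// rejects this signature:
//
// fn largest_x(p: &Point, q: &Point) -> &i32 { ... }
//
// The annotated version is therefore required:
fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}
```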

Lifetimes and structs

If a struct includes a reference, the struct definition always has to have a lifetime annotation; there’s no lifetime elision rule for structs. As an example, let’s look at a struct that wraps a reference to a std::fs::File and provides a convenient method to get the file size:

use std::fs::File;
use std::io;

struct FileWrapper<'a> {
    file: &'a File,
}

impl<'a> FileWrapper<'a> {
    fn size(&self) -> io::Result<u64> {
        Ok(self.file.metadata()?.len())
    }
}

fn use_file_wrapper() -> io::Result<()> {
    let f = File::open("foo.txt")?;
    let w = FileWrapper { file: &f };
    println!("file size: {}", w.size()?);
    Ok(())
}

The lifetime annotations tell us that in use_file_wrapper, the lifetime of w needs to be shorter than or equal to that of f. That’s clearly the case here, so we won’t get any errors about lifetimes.
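The struct’s lifetime parameter can also show up in method return types. The following sketch uses a hypothetical PointRef wrapper instead of a file so that it runs without touching the file system. Its x method returns a reference with the struct’s lifetime 'a rather than the lifetime of &self, so the result may outlive the wrapper as long as it doesn’t outlive the Point:

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical wrapper around a borrowed Point.
struct PointRef<'a> {
    point: &'a Point,
}

impl<'a> PointRef<'a> {
    // The result is tied to 'a (the borrowed Point), not to &self.
    fn x(&self) -> &'a i32 {
        let point: &'a Point = self.point; // copy the inner reference out
        &point.x
    }
}

fn demo_struct_lifetime() -> i32 {
    let p = Point { x: 5, y: 6 };
    let x_ref: &i32;
    {
        let w = PointRef { point: &p };
        x_ref = w.x(); // tied to p, not to w
    } // w goes out of scope here, p does not
    *x_ref
}
```

Had x been declared as fn x(&self) -> &i32, elision rule 3 would tie the result to &self instead, and x_ref could not be used after w goes out of scope.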

The static lifetime

const ORIGIN: Point = Point { x: 0, y: 0 };

What about constants? As far as the rest of the program is concerned, their lifetime is infinite -- when you have a reference to a constant you never need to worry about whether it’s still valid. Rust provides a special lifetime called 'static for this case:

fn get_origin() -> &'static Point {
    &ORIGIN
}

You see 'static most often with string literals: every "bla bla" in your code technically has type &'static str.
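As a small sketch of what that means in practice (default_name, longer and demo_static are hypothetical names), a function can return a string literal as &'static str, and that reference can then be passed anywhere a shorter-lived &str is expected:

```rust
// A string literal is valid for the whole run of the program,
// so returning it as &'static str is fine.
fn default_name() -> &'static str {
    "bla bla"
}

// Hypothetical helper: both arguments must agree on one lifetime 'a.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn demo_static() -> usize {
    let owned = String::from("hi");
    // The &'static str is accepted alongside a borrow of `owned`,
    // which is only valid inside this function.
    let s = longer(default_name(), &owned);
    s.len()
}
```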