9 March 2023
Lifetimes seem to trip up every programmer new to Rust: just when you start to feel like you’re
getting the hang of things, the compiler complains that a variable “doesn’t live long enough” and
you find yourself adding 'a or 'static in random places without really understanding what you’re
doing.
I have a theory why lifetimes are hard to wrap your head around, and it’s not that they’re particularly complicated. The trouble is that the Rust compiler does a pretty good job of figuring out lifetimes for simple cases. That’s a good thing of course -- otherwise we’d have to litter our code with lifetime annotations -- but it means that the first time you encounter them will be one of the complicated cases where the compiler needs the programmer’s help. You’re thrown in at the deep end, and that’s why it feels frustrating.
Let’s try to avoid that here and start at the beginning. Why do we need to worry about lifetimes? In
Rust: The Stack and the Heap I talked about the lifetime of stack- and heap-allocated objects. That
was fairly straightforward. Things get more interesting when we look at references: if x is a
reference, then at any place in your code that says *x the Rust compiler needs to make sure that the
value x refers to is still valid. That’s not the case in other languages; in Go or Java this would be
the garbage collector’s responsibility, in C or C++ it’s the programmer’s responsibility.
If a reference exists only inside a single function this is usually easy to figure out:
struct Point {
    x: i32,
    y: i32,
}

fn use_point() {
    let p = Point { x: 3, y: 0 };
    let x = &p.x;
    println!("x = {}", *x); // ok
}
Since x is a reference to a field inside p, we should never be able to dereference x after p has
been deallocated. In use_point, p is deallocated at the end of the function, so dereferencing x
anywhere in the function is fine. Here’s an example that will result in a compiler error:
fn use_point_bad() {
    let x: &i32;
    {
        let p = Point { x: 3, y: 0 };
        x = &p.x;
    }
    println!("x = {}", *x); // error!
}
The example is a bit contrived, but I think you get the idea: I defined the inner scope (marked by
{…}) so that p would be deallocated just before we try to access one of its fields with *x. The
compiler will reject the code with this error:
`p.x` does not live long enough
The problem is that x refers to p.x and the lifetime of x is longer than the lifetime of p.x. That’s
what we need to avoid: the lifetime of a reference can be the same as the lifetime of whatever it
refers to, or it can be shorter, but it must never be longer.
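To make that rule concrete, here is a minimal sketch of one way to repair the failing example. The function name use_point_fixed is hypothetical and not part of the examples above; the only change is that p is declared in the outer scope, so it lives at least as long as x. The function returns the value so the result is easy to check.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical repaired variant of use_point_bad (name chosen for this sketch).
fn use_point_fixed() -> i32 {
    // p now lives until the end of the function, so it outlives x.
    let p = Point { x: 3, y: 0 };
    let x: &i32;
    {
        // Borrowing inside the inner scope is fine: p is not dropped here.
        x = &p.x;
    }
    println!("x = {}, y = {}", *x, p.y);
    *x
}
```

Calling use_point_fixed() from main should compile without lifetime errors, because the reference never outlives the value it points to.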
So what about that 'a syntax? So far we haven’t had to write any code to tell the compiler about
lifetimes. That’s because within functions, the Rust compiler will infer lifetimes automatically.
However, it won’t do that across function calls.
Let’s take a look at a function that returns a reference:
fn get_x<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}
What’s going on here? This is using the <…> syntax of generic functions, but instead of a type
parameter we have 'a. The ' indicates this is a lifetime parameter. In the parameter and return
types, &'a means “this is a reference with lifetime a”. The compiler will use this information when
checking the lifetimes within the function (pretty simple here) and when checking the lifetimes in
each function that calls get_x. For example, let’s say we have this call somewhere in our code:
let x = get_x(&p);
Just like the compiler can infer that p must be a Point and x will be a reference to i32, it can
infer that &p and x must both have lifetime 'a, and it can figure out what 'a will be for this
particular call. (This is a form of generics after all, so 'a will be different for different
calls.)
The compiler actually has a bit more wiggle room here: a longer lifetime can be coerced into a
shorter one, so it only needs to find a lifetime 'a that’s shorter than (or equal to) the lifetime
of p and longer than (or equal to) the lifetime of x. You can think of that as similar to passing a
value of a more specific type where a more general one is expected: a reference that is valid for a
long time can always stand in for one that only needs to be valid for a short time, but not the
other way around.
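Here is a small sketch of the "different 'a for different calls" point. The caller sum_of_two_calls is a hypothetical name introduced only for this illustration; get_x is the function from above. The two calls are checked independently, so one result can stay valid for the whole function while the other is confined to an inner scope.

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

fn get_x<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}

// Hypothetical caller, used only for illustration.
fn sum_of_two_calls() -> i32 {
    let outer = Point { x: 1, y: 0 };
    // First call: 'a may extend to the end of this function.
    let a = get_x(&outer);
    let total;
    {
        let inner = Point { x: 2, y: 0 };
        // Second call: a different, shorter 'a that must end before inner is dropped.
        let b = get_x(&inner);
        total = *a + *b;
    }
    // a is still valid here because outer is; b went away together with inner.
    total + *a
}
```

Trying to use b after the inner block would bring back the "does not live long enough" error, while a remains usable.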
Here’s a slightly more complex case:
fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}
The return value can refer to either p or q, so the lifetime annotations ensure that the return
value’s lifetime will be the same as or shorter than the lifetime of both parameters. Functions can
have more than one lifetime parameter:
fn get_both_x<'a, 'b>(p: &'a Point, q: &'b Point) -> (&'a i32, &'b i32) {
    (&p.x, &q.x)
}
Using 'a, 'b, … is the usual convention for lifetime parameters.
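The difference between one shared lifetime and two separate ones shows up at the call site. The following sketch uses a hypothetical caller, compare_annotations, to illustrate it: with get_both_x the first result depends only on p, so it can outlive q, whereas the result of largest_x is limited by both arguments.

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}

fn get_both_x<'a, 'b>(p: &'a Point, q: &'b Point) -> (&'a i32, &'b i32) {
    (&p.x, &q.x)
}

// Hypothetical caller, used only for illustration.
fn compare_annotations() -> (i32, i32) {
    let p = Point { x: 3, y: 0 };
    let px: &i32;
    let largest: i32;
    {
        let q = Point { x: 5, y: 0 };
        let (a, _b) = get_both_x(&p, &q);
        // OK: 'a is tied to p only, and p outlives this block.
        px = a;
        // Copy the value out while q is still alive. Keeping the reference
        // itself past this block would be rejected, because the single 'a
        // of largest_x is bounded by q as well.
        largest = *largest_x(&p, &q);
    }
    (*px, largest)
}
```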
Lifetime annotations on methods work the same way; you just need to remember that &self or
&mut self is also a reference. Here’s an example:
impl Point {
    fn get_x_and_y<'a, 'b>(&'a self, msg: &'b str) -> (&'a i32, &'a i32) {
        println!("Message: {}", msg);
        (&self.x, &self.y)
    }
}
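Looking at that last example you may be asking, do we really need the 'b parameter? It doesn’t do
much in terms of restricting lifetimes for code that calls get_x_and_y. The Rust developers had the
same idea, so they added a set of lifetime elision rules that let you leave out the annotations in
simple cases (“elision” means leaving something out). Here they are, straight from the Rust
Reference:
1. Each elided lifetime in the parameters becomes a distinct lifetime parameter.
2. If there is exactly one lifetime used in the parameters (elided or not), that lifetime is
assigned to all elided output lifetimes.
3. In method signatures, if the receiver has type &Self or &mut Self, then the lifetime of that
reference to Self is assigned to all elided output lifetime parameters.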
Rules 1 and 2 mean that we can write the get_x function without any lifetime annotations:
fn get_x(p: &Point) -> &i32 {
    &p.x
}
We can also write the get_x_and_y method without annotations thanks to rules 1 and 3:
impl Point {
    fn get_x_and_y(&self, msg: &str) -> (&i32, &i32) {
        println!("Message: {}", msg);
        (&self.x, &self.y)
    }
}
The largest_x and get_both_x functions still need annotations. As a little exercise, remove the
annotations from one of the example functions, then go through the lifetime elision rules yourself.
Do they add all the annotations back? If not, that function needs explicit lifetime annotations.
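As a sketch of how that exercise plays out, the block below walks through get_x (where elision succeeds) and largest_x (where it does not). The names get_x_expanded and elision_demo are hypothetical, added only for this illustration, and the comments paraphrase the elision rules rather than quote them.

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

// Elided form, as written in the text.
fn get_x(p: &Point) -> &i32 {
    &p.x
}

// What elision expands it to: every elided input reference gets its own
// lifetime parameter, and since there is exactly one input lifetime, it is
// also used for the elided output lifetime.
fn get_x_expanded<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}

// Now try the same with largest_x:
//     fn largest_x(p: &Point, q: &Point) -> &i32
// Each input gets its own lifetime:
//     fn largest_x<'a, 'b>(p: &'a Point, q: &'b Point) -> &i32
// There are two input lifetimes and no &self receiver, so no rule assigns
// the output lifetime. The compiler reports error E0106 ("missing lifetime
// specifier"), and we have to annotate by hand:
fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}

// Hypothetical helper that exercises all three functions.
fn elision_demo() -> (i32, i32, i32) {
    let p = Point { x: 3, y: 0 };
    let q = Point { x: 5, y: 0 };
    (*get_x(&p), *get_x_expanded(&p), *largest_x(&p, &q))
}
```

The elided and expanded versions of get_x have the same signature as far as the compiler is concerned, so callers cannot tell them apart.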
If a struct includes a reference, the struct definition always has to have a lifetime annotation;
there’s no lifetime elision rule for structs. As an example, let’s look at a struct that wraps a
reference to a std::fs::File and provides a convenient method to get the file size:
use std::fs::File;
use std::io;

struct FileWrapper<'a> {
    file: &'a File,
}

impl<'a> FileWrapper<'a> {
    // Returns the size of the wrapped file in bytes.
    fn size(&self) -> io::Result<u64> {
        Ok(self.file.metadata()?.len())
    }
}

fn use_file_wrapper() -> io::Result<()> {
    let f = File::open("foo.txt")?;
    let w = FileWrapper { file: &f };
    println!("file size: {}", w.size()?);
    Ok(())
}
The lifetime annotations tell us that in use_file_wrapper, the lifetime of w needs to be shorter
than or equal to that of f. That’s clearly the case here, so we won’t get any errors about
lifetimes.
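The struct’s lifetime parameter can also appear in method signatures. The sketch below uses a hypothetical PointView struct (not from the text above) so that it runs without touching the file system. Its method x returns a reference annotated with 'a instead of relying on elision, which would tie the result to &self. As a consequence, the returned reference may outlive the wrapper, as long as the wrapped Point is still alive.

```rust
#[allow(dead_code)] // y is not read in this sketch
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical wrapper around a borrowed Point, for illustration only.
struct PointView<'a> {
    point: &'a Point,
}

impl<'a> PointView<'a> {
    // The result borrows from the wrapped Point (lifetime 'a),
    // not from the PointView itself.
    fn x(&self) -> &'a i32 {
        &self.point.x
    }
}

// Hypothetical caller, used only for illustration.
fn reference_outlives_view() -> i32 {
    let p = Point { x: 7, y: 0 };
    let x: &i32;
    {
        let view = PointView { point: &p };
        x = view.x();
    } // view is dropped here, but p is still alive
    *x
}
```

If x were declared as fn x(&self) -> &i32 instead, elision would tie the result to &self, and using it after the inner block would be rejected.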
const ORIGIN: Point = Point { x: 0, y: 0 };
What about constants? As far as the rest of the program is concerned, their lifetime is infinite --
when you have a reference to a constant you never need to worry about whether it’s still valid. Rust
provides a special lifetime called 'static for this case:
fn get_origin() -> &'static Point {
    &ORIGIN
}
You see 'static most often with string literals: every "bla bla" in your code technically has type
&'static str.
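Here is a short sketch of why that matters in practice. The functions label and label_outlives_point are hypothetical names made up for this example. Because label declares a 'static return type, its result is not tied to the lifetime of the argument; with an elided output lifetime, it would be.

```rust
struct Point {
    x: i32,
    y: i32,
}

const ORIGIN: Point = Point { x: 0, y: 0 };

// Hypothetical helper: returns a string literal, so the result can be 'static
// and is independent of how long p lives.
fn label(p: &Point) -> &'static str {
    if p.x == ORIGIN.x && p.y == ORIGIN.y {
        "origin"
    } else {
        "elsewhere"
    }
}

// Hypothetical caller, used only for illustration.
fn label_outlives_point() -> &'static str {
    let s;
    {
        let p = Point { x: 1, y: 2 };
        s = label(&p);
    } // p is dropped here, but s points at a string literal, not at p
    s
}
```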