9 March 2023
Lifetimes seem to trip up every programmer new to Rust: just when you
start to feel like you’re getting the hang of things, the compiler
complains that a variable “doesn’t live long enough” and you find
yourself adding 'a or 'static in random places without really
understanding what you’re doing.
I have a theory about why lifetimes are hard to wrap your head around, and it’s not that they’re particularly complicated. The trouble is that the Rust compiler does a pretty good job of figuring out lifetimes for simple cases. That’s a good thing of course – otherwise we’d have to litter our code with lifetime annotations – but it means that the first time you encounter them will be in one of the complicated cases where the compiler needs the programmer’s help. You’re thrown in at the deep end, and that’s why it feels frustrating.
Let’s try to avoid that here and start at the beginning. Why do we
need to worry about lifetimes? In Rust: The Stack and the
Heap I talked about the lifetime of stack- and heap-allocated
objects. That was fairly straightforward. Things get more interesting
when we look at references: if x is a reference, then at any place in
your code that says *x the Rust compiler needs to make sure that the
value x refers to is still valid.
That’s not the case in other languages; in Go or Java this would be the
garbage collector’s responsibility, in C or C++ it’s the programmer’s
responsibility.
If a reference exists only inside a single function this is usually easy to figure out:
struct Point {
    x: i32,
    y: i32,
}
fn use_point() {
    let p = Point { x: 3, y: 0 };
    let x = &p.x;
    println!("x = {}", *x); // ok
}
Since x is a reference to a field inside p, we should never be able to
dereference x after p has been deallocated. In use_point, p is
deallocated at the end of the function, so dereferencing x anywhere in
the function is fine. Here’s an example that will result in a compiler
error:
fn use_point_bad() {
    let x: &i32;
    {
        let p = Point { x: 3, y: 0 };
        x = &p.x;
    }
    println!("x = {}", *x); // error!
}
The example is a bit contrived, but I think you get the idea: I
defined the inner scope (marked by {…}) so that p would be deallocated
just before we try to access one of its fields with *x. The compiler
rejects the code with this error:
`p.x` does not live long enough
The problem is that x refers to p.x and the lifetime of x is longer
than the lifetime of p.x. That’s what we need to avoid: the lifetime of
a reference can be the same as the lifetime of whatever it refers to, or
it can be shorter, but it must never be longer.
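For contrast, here is a minimal sketch of one way to make the failing example compile. The name use_point_fixed is mine, not from the examples above, and it returns the value instead of printing it so the result is easy to check. The only real change is that p now lives in the outer scope, so it outlives the reference.

```rust
#[allow(dead_code)] // `y` is never read in this sketch
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical variant of `use_point_bad`: `p` is declared in the outer
// scope, so it is still alive when the reference is dereferenced.
fn use_point_fixed() -> i32 {
    let p = Point { x: 3, y: 0 };
    let x: &i32;
    {
        // The inner scope ends right after this assignment,
        // but `p` is not declared in it and so is not dropped.
        x = &p.x;
    }
    *x // ok: the reference does not outlive `p`
}
```

The reference still lives as long as it did before; what changed is that the value it points to now lives at least as long.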
So what about that 'a syntax? So far we haven’t had to write any code
to tell the compiler about lifetimes. That’s because within a function,
the Rust compiler infers lifetimes automatically. However, it won’t do
that across function calls: there it relies on what the function
signature says.
Let’s take a look at a function that returns a reference:
fn get_x<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}
What’s going on here? This is using the <…> syntax of generic
functions, but instead of a type parameter we have 'a. The ' indicates
this is a lifetime parameter. In the parameter and return types, &'a
means “this is a reference with lifetime 'a”. The compiler will use this
information when checking the lifetimes within the function (pretty
simple here) and when checking the lifetimes in each function that calls
get_x. For example, let’s say we have this call somewhere in our code:
let x = get_x(&p);
Just like the compiler can infer that p must be a Point and x will be a
reference to i32, it can infer that &p and x must both have lifetime 'a,
and it can figure out what 'a will be for this particular call. (This
is a form of generics after all, so 'a will be different for different
calls.)
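To make the “different for different calls” part concrete, here is a small sketch. The function sum_of_two_calls and the variable names are made up for illustration; get_x is the function from above.

```rust
#[allow(dead_code)] // `y` is never read in this sketch
struct Point {
    x: i32,
    y: i32,
}

fn get_x<'a>(p: &'a Point) -> &'a i32 {
    &p.x
}

// Hypothetical caller: the two calls to `get_x` get two different 'a.
fn sum_of_two_calls() -> i32 {
    let outer = Point { x: 1, y: 0 };
    // For this call, 'a has to cover every later use of `a`.
    let a = get_x(&outer);
    let total;
    {
        let inner = Point { x: 2, y: 0 };
        // For this call, 'a only has to fit inside this block,
        // because `inner` is dropped at the end of it.
        let b = get_x(&inner);
        total = *a + *b;
    }
    total
}
```

Nothing in get_x itself changes between the two calls; only the lifetime the compiler picks for 'a does.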
The compiler actually has a bit more wiggle room here: a longer
lifetime can be coerced into a shorter one, so it only needs to find a
lifetime 'a that’s shorter than (or equal to) the lifetime of p and
longer than (or equal to) the lifetime of x. As a loose analogy, picture
an i32 being narrowed to an i16 but never widened to an i64; just keep
in mind that Rust only converts integers with an explicit cast, while
lifetimes are shortened automatically.
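Here is a rough sketch of that shortening written out by hand. It uses the bound 'long: 'short (read: 'long lasts at least as long as 'short), which this post doesn’t otherwise cover, and the names shorten and read_through_shorter_lifetime are invented for the example.

```rust
// Hypothetical helper: accepts a reference valid for 'long and hands it
// back as a reference valid only for the shorter 'short. The body is just
// `r`, because the compiler performs the shortening implicitly.
fn shorten<'long: 'short, 'short>(r: &'long i32) -> &'short i32 {
    r
}

fn read_through_shorter_lifetime() -> i32 {
    let value = 10;
    let long_ref: &i32 = &value;
    let result;
    {
        // `short_ref` is only needed inside this block.
        let short_ref: &i32 = shorten(long_ref);
        result = *short_ref;
    }
    // The original, longer-lived reference is still usable afterwards.
    result + *long_ref
}
```

Going the other way, from 'short to 'long, would not compile, which is the whole point of the rule.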
Here’s a slightly more complex case:
fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}
The return value can refer to either p or q, so the lifetime
annotations ensure that the return value’s lifetime is no longer than
the lifetime of either parameter. Functions can have more than one
lifetime parameter:
fn get_both_x<'a, 'b>(p: &'a Point, q: &'b Point) -> (&'a i32, &'b i32) {
    (&p.x, &q.x)
}
Using 'a, 'b, … is the usual convention for lifetime parameters.
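The difference between one shared lifetime and two separate ones shows up in the calling code. The following is a sketch only; first_x_outlives_inner_scope is a made-up caller, while largest_x and get_both_x are the functions from above.

```rust
#[allow(dead_code)] // `y` is never read in this sketch
struct Point {
    x: i32,
    y: i32,
}

fn largest_x<'a>(p: &'a Point, q: &'a Point) -> &'a i32 {
    if p.x > q.x { &p.x } else { &q.x }
}

fn get_both_x<'a, 'b>(p: &'a Point, q: &'b Point) -> (&'a i32, &'b i32) {
    (&p.x, &q.x)
}

// Hypothetical caller with one long-lived and one short-lived Point.
fn first_x_outlives_inner_scope() -> i32 {
    let p = Point { x: 5, y: 0 };
    let px: &i32;
    {
        let q = Point { x: 7, y: 0 };
        // One shared 'a: the result may point into `q`, so it can only
        // be used while both `p` and `q` are alive.
        let big = largest_x(&p, &q);
        assert_eq!(*big, 7);
        // Two lifetimes: the first element is tied to `p` alone.
        let (a, _b) = get_both_x(&p, &q);
        px = a;
    }
    // `q` is gone, but `px` borrows only from `p`, so this is fine.
    // Writing `px = big;` in the block above would be rejected.
    *px
}
```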
Lifetime annotations on methods work the same way; you just need to
remember that &self or &mut self is also a reference. Here’s an example:
impl Point {
    fn get_x_and_y<'a, 'b>(&'a self, msg: &'b str) -> (&'a i32, &'a i32) {
        println!("Message: {}", msg);
        (&self.x, &self.y)
    }
}
Looking at that last example you may be asking, do we really need the
'b parameter? It doesn’t do much in terms of restricting lifetimes for
code that calls get_x_and_y. The Rust developers had the same idea, so
they added a set of lifetime elision rules that let you leave out the
annotations in simple cases (“elision” means leaving something out).
Here they are, straight from the Rust Reference:
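1. Each elided lifetime in the parameters becomes a distinct lifetime
parameter.
2. If there is exactly one lifetime used in the parameters (elided or
not), that lifetime is assigned to all elided output lifetimes.
3. In method signatures: if the receiver has type &Self or &mut Self,
then the lifetime of that reference to Self is assigned to all elided
output lifetime parameters.
Rules 1 and 2 mean that we can write the get_x function without any
lifetime annotations: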
fn get_x(p: &Point) -> &i32 {
    &p.x
}
We can also write the get_x_and_y method without annotations thanks to
rules 1 and 3:
impl Point {
    fn get_x_and_y(&self, msg: &str) -> (&i32, &i32) {
        println!("Message: {}", msg);
        (&self.x, &self.y)
    }
}
The largest_x and get_both_x functions still need annotations. As a
little exercise, remove the annotations from one of the example
functions, then go through the lifetime elision rules yourself. Do they
add all the annotations back? If not, that function needs explicit
lifetime annotations.
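As a sketch of what the elision rules buy you in practice, the code below puts an elided signature next to its expanded form and then leans on rule 3 in a caller. The names get_y, get_y_explicit, x_with_label and elision_demo are invented for this illustration.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Elided form...
fn get_y(p: &Point) -> &i32 {
    &p.y
}

// ...and what I'd expect rules 1 and 2 to expand it to.
fn get_y_explicit<'a>(p: &'a Point) -> &'a i32 {
    &p.y
}

impl Point {
    // Rule 1 gives `&self` and `label` separate lifetimes;
    // rule 3 ties the returned reference to `&self` only.
    fn x_with_label(&self, label: &str) -> &i32 {
        println!("{}: {}", label, self.x);
        &self.x
    }
}

// Hypothetical caller showing the effect of rule 3.
fn elision_demo() -> i32 {
    let p = Point { x: 4, y: 9 };
    let x;
    {
        let label = String::from("temporary label");
        x = p.x_with_label(&label);
    } // `label` is dropped here; `x` borrows only from `p`
    *x + *get_y(&p) + *get_y_explicit(&p)
}
```

If the return value were tied to label as well, the use of x after the inner block would not compile.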
If a struct includes a reference, the struct definition always has to
have a lifetime annotation; there’s no lifetime elision rule for
structs. As an example, let’s look at a struct that wraps a reference to
a std::fs::File and provides a convenient method to get the file size:
use std::fs::File;
use std::io;

struct FileWrapper<'a> {
    file: &'a File,
}
impl<'a> FileWrapper<'a> {
    fn size(&self) -> io::Result<u64> {
        Ok(self.file.metadata()?.len())
    }
}
fn use_file_wrapper() -> io::Result<()> {
    let f = File::open("foo.txt")?;
    let w = FileWrapper { file: &f };
    println!("file size: {}", w.size()?);
    Ok(())
}
The lifetime annotations tell us that in use_file_wrapper, the lifetime
of w needs to be shorter than or equal to that of f. That’s clearly the
case here, so we won’t get any errors about lifetimes.
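The same pattern works without touching the file system, which makes it easier to experiment with. This is a sketch under my own naming: PointView, its methods and use_point_view are hypothetical and only mirror the shape of FileWrapper.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical wrapper, analogous to FileWrapper but around a Point.
struct PointView<'a> {
    point: &'a Point,
}

impl<'a> PointView<'a> {
    fn sum(&self) -> i32 {
        self.point.x + self.point.y
    }

    // Returning &'a i32 (rather than an elided lifetime) ties the result
    // to the wrapped Point, not to the wrapper itself.
    fn x(&self) -> &'a i32 {
        &self.point.x
    }
}

fn use_point_view() -> i32 {
    let p = Point { x: 3, y: 4 };
    let x;
    let total;
    {
        // `view` must not outlive `p`; here it lives only in this block.
        let view = PointView { point: &p };
        total = view.sum();
        x = view.x();
    } // `view` is gone, but `x` borrows from `p`, which is still alive
    total + *x
}
```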
What about constants? As far as the rest of the program is concerned,
their lifetime is infinite – when you have a reference to a constant you
never need to worry about whether it’s still valid. Rust provides a
special lifetime called 'static for this case:
const ORIGIN: Point = Point { x: 0, y: 0 };
fn get_origin() -> &'static Point {
    &ORIGIN
}
You see 'static most often with string literals: every "bla bla" in
your code technically has type &'static str.
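A short sketch to round this off: a 'static reference can be passed anywhere a shorter-lived reference is expected, because a longer lifetime can always be shortened. The functions greeting, longest and static_demo are made up for the example.

```rust
#[allow(dead_code)] // `y` is never read in this sketch
struct Point {
    x: i32,
    y: i32,
}

const ORIGIN: Point = Point { x: 0, y: 0 };

fn get_origin() -> &'static Point {
    &ORIGIN
}

// A string literal already has type &'static str.
fn greeting() -> &'static str {
    "bla bla"
}

// Hypothetical helper: both arguments must share one lifetime 'a.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn static_demo() -> usize {
    let local = String::from("hi");
    // The 'static literal is shortened to match the borrow of `local`.
    let s = longest(greeting(), &local);
    // The reference from get_origin is valid for as long as we like.
    s.len() + get_origin().x as usize
}
```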