Learning Rust with CAP (Part 2)

Recap

In the first blog post, I gave you an overview of the Rust programming language and showed some examples from my basic implementation of the SAP Cloud Application Programming Model (CAP) in Rust.

Now let’s examine what makes Rust so different from other programming languages. None of the features shown so far are new; they are borrowed from other languages, and that’s just fine. One doesn’t always have to reinvent the wheel.

But there’s one thing that makes Rust stand out: the borrow checker.
I told you before that Rust has no garbage collector, and yet you as a programmer don’t have to manage memory manually. How does this work? Is it some kind of magic? The answer is simple once you understand the following concepts.
(This post follows the official Rust book.)

Ownership

The Rust compiler enforces these ownership rules at compile time:

  • Each value in Rust has a variable that’s called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.
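The third rule can be observed directly. The following is a minimal sketch, not part of the CAP implementation: `Resource`, `DROPPED` and `dropped_after_inner_scope` are names made up for this illustration. It uses the standard `Drop` trait, whose `drop` method Rust calls automatically when the owner goes out of scope.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter: how many `Resource` values have been dropped so far.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type that reports when it is dropped.
struct Resource(&'static str);

impl Drop for Resource {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many resources were dropped by the time the inner scope ended.
fn dropped_after_inner_scope() -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    let _outer = Resource("outer");
    {
        let _inner = Resource("inner");
    } // `_inner` goes out of scope here, so its value is dropped
    DROPPED.load(Ordering::SeqCst) - before
    // `_outer` is dropped only when the function returns
}

fn main() {
    println!("dropped inside the function: {}", dropped_after_inner_scope());
}
```

Nothing in the code frees memory explicitly; the compiler inserts the cleanup at the end of each owner’s scope.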

The rules themselves are intuitive, but how does the compiler apply them to real data?
First, we need to introduce the stack and the heap.

Stack and Heap

In many programming languages (including Rust), memory is divided into several regions, two of which matter here: the stack and the heap. The stack follows the “last in, first out” principle: you push data onto the stack, and the last item you pushed is the first one to be popped off. All data on the stack must have a fixed size that is known at compile time, so things like a vector of unknown size must be stored elsewhere. This is where the heap comes into the picture. It’s a less organised space where the memory allocator needs to find a free region that is big enough for your data; this is called allocating. The stack then only holds a pointer to that place on the heap. Since the stack doesn’t need an allocator, including all of its bookkeeping shenanigans, storing data there is faster. Reading data on the stack is usually faster too, as the processor doesn’t need to follow a pointer and jump around in memory.
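To make the split a bit more tangible, here is a small sketch (the helper `handle_and_content_sizes` is a made-up name for this post). It compares the size of the `String` value itself, which is the fixed-size part holding the pointer to the heap, with the length of the text that lives on the heap.

```rust
use std::mem::size_of_val;

// Returns (size of the fixed-size `String` handle, length of the text in bytes).
fn handle_and_content_sizes(s: &String) -> (usize, usize) {
    (size_of_val(s), s.len())
}

fn main() {
    let mut s = String::from("hello");
    let (handle_before, len_before) = handle_and_content_sizes(&s);

    s.push_str(" world"); // the heap contents grow

    let (handle_after, len_after) = handle_and_content_sizes(&s);

    // The handle keeps its size; only the heap contents change.
    println!("handle: {} -> {} bytes", handle_before, handle_after);
    println!("text:   {} -> {} bytes", len_before, len_after);
}
```

The exact handle size depends on the platform, which is why the sketch only compares the values before and after growing the string.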

Now let’s look at the ownership principles in action. First, consider data on the stack:

let s1 = 5; // i32 has a fixed size, so the value is stored on the stack
let s2 = s1; // this performs a copy because i32 implements the `Copy` trait
println!("s1: {}, s2: {}", s1, s2); // works: s1 and s2 are both still valid

Now let’s consider data on the heap:

let h1 = String::from("hello"); // the text can grow, so its contents are stored on the heap
let h2 = h1; // this is a move, not a copy: h2 now owns the heap data and h1 is no longer valid
println!("h1: {}, h2: {}", h1, h2) // not allowed: h1 was moved

This code would lead to the very helpful compiler error:

error[E0382]: borrow of moved value: `h1`
 --> src/main.rs:4:30
  |
2 |   let h1 = String::from("hello");
  |       -- move occurs because `h1` has type `std::string::String`, which does not implement the `Copy` trait
3 |   let h2 = h1;
  |            -- value moved here
4 |   println!("h1: {}, h2: {}", h1, h2)
  |                              ^^ value borrowed here after move

With multiple owners, the compiler wouldn’t know whether the memory should be freed when one of the variables goes out of scope. It would also open the door to uncontrolled mutations and nasty bugs such as “use-after-free”. A detailed look into memory-safety bugs and how Rust can eliminate them can be found in this paper.
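If you really want two independent values, you can ask for a deep copy explicitly with `clone()`. A minimal sketch (the function name `clone_and_modify` is made up for this post):

```rust
// Clones a string, changes the clone, and returns both values.
fn clone_and_modify() -> (String, String) {
    let h1 = String::from("hello");
    let mut h2 = h1.clone(); // copies the heap data; h1 stays valid
    h2.push_str(" world"); // only the clone changes
    (h1, h2)
}

fn main() {
    let (h1, h2) = clone_and_modify();
    println!("h1: {}, h2: {}", h1, h2); // allowed: each variable owns its own data
}
```

Each string now has exactly one owner, so the ownership rules are satisfied, at the price of a second heap allocation.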

The ownership principle also holds for function calls: if you pass a value to a function, ownership moves to the parameter inside the function. If the function returns a value, ownership moves again, this time to the caller.
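Here is a short sketch of both directions. The functions `consume` and `append_exclamation` are invented for this illustration.

```rust
// Takes ownership of `text`; the string is dropped when the function ends.
fn consume(text: String) -> usize {
    text.len()
}

// Takes ownership, changes the string, and moves it back to the caller.
fn append_exclamation(mut text: String) -> String {
    text.push('!');
    text
}

fn main() {
    let greeting = String::from("hello");
    let greeting = append_exclamation(greeting); // moved in, then moved back out
    println!("{}", greeting);

    let len = consume(greeting); // ownership moves into `consume`
    // println!("{}", greeting); // would not compile: `greeting` was moved
    println!("length: {}", len);
}
```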

Without additional mechanisms, this would be a very cumbersome way to program, as it’s far too restrictive. For example, a function couldn’t just read some data without taking ownership of it. So let’s see how we can overcome these limitations.

References and Borrowing

The trick is to make the intention of data consumers explicit in the code: do they want to take ownership, do they just want to read, or do they also want to mutate the data (in a controlled way)? We already saw how ownership can be moved, so let’s see how to tell the compiler that we neither want ownership nor intend to mutate the data. This is called ‘immutable borrowing’, and it looks like this:

let s = String::from("hello");
let len = calculate_length(&s); // we provide an immutable reference, notice the `&` symbol
println!("The length of '{}' is {}.", s, len);

Here, `s` is still the owner of the string; the function `calculate_length` only borrows it for the duration of the call.
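The snippet above doesn’t show `calculate_length` itself. A complete version could look like this; the function body is an assumption that follows the example in the Rust book:

```rust
// Borrows the string immutably: the function can read it but neither owns nor changes it.
fn calculate_length(s: &String) -> usize {
    s.len()
} // `s` goes out of scope here, but nothing is dropped because it is only a reference

fn main() {
    let s = String::from("hello");
    let len = calculate_length(&s);
    println!("The length of '{}' is {}.", s, len); // `s` is still usable
}
```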

It’s also possible to borrow something with the intention to mutate it. Take a look at this:

let mut s = String::from("hello"); // you're allowed to change s
change(&mut s); // we provide a mutable reference
println!("s is: {}", s); // s is now changed
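Again, the function itself is missing from the snippet. A possible `change`, modelled on the Rust book (the appended text is just an example), looks like this:

```rust
// Borrows the string mutably: the function may change it but still doesn't own it.
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

fn main() {
    let mut s = String::from("hello");
    change(&mut s);
    println!("s is: {}", s);
}
```

Note that both sides have to agree: the variable is declared with `mut`, the call site passes `&mut s`, and the parameter type is `&mut String`.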

If we didn’t establish some more rules, this would lead to nasty bugs again, so here they are:

  • At any given time, you can have either one mutable reference or any number of immutable references.
  • References must always be valid.

The first rule prevents something like this:

let mut s = String::from("hello");
let r1 = &s;
let r2 = &s;
let r3 = &mut s;
println!("{}, {}, and {}", r1, r2, r3);

We get the compiler error:

error[E0502]: cannot borrow `s` as mutable because it is also borrowed as immutable
 --> src/main.rs:5:11
  |
3 |   let r1 = &s;
  |            -- immutable borrow occurs here
4 |   let r2 = &s;
5 |   let r3 = &mut s;
  |            ^^^^^^ mutable borrow occurs here
6 |   println!("{}, {}, and {}", r1, r2, r3);
  |                              -- immutable borrow later used here

This makes sure that consumers of immutable references don’t have to deal with data changes.
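The rule is about overlapping use, not about the order of declarations. In current Rust editions, a borrow ends after the last use of the reference, so the following variation compiles. This is a sketch with an invented function name, `read_then_write`:

```rust
// Reads through two immutable references first, then mutates through a mutable one.
fn read_then_write() -> String {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s;
    println!("{} and {}", r1, r2); // last use of r1 and r2

    let r3 = &mut s; // allowed: the immutable borrows are no longer in use
    r3.push_str(", world");

    s
}

fn main() {
    println!("{}", read_then_write());
}
```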

The second rule makes sure that you cannot have dangling pointers. Consider this example:

fn dangle() -> &String {
  let s = String::from("hello");
  &s
}

The Rust compiler rejects this with a “missing lifetime specifier” error, and its help text explains:

this function's return type contains a borrowed value, but there is no value for it to be borrowed from

This makes sense. You cannot return a reference to something that is about to go out of scope.
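The usual fix, as suggested in the Rust book, is to return the `String` itself instead of a reference, so that ownership moves to the caller. A minimal sketch:

```rust
// Returns the string by value: ownership moves out, and nothing is dropped too early.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    let s = no_dangle(); // the caller is now the owner
    println!("{}", s);
}
```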

Summary

There’s a lot more to learn (especially the concept of lifetimes), but this gives you a first glimpse of the borrow checker.
All those rules may be confusing at first, but over time and with the help of the Rust compiler you’ll get the hang of it. In return, you get memory management for free, eliminate whole classes of errors and end up with better-structured programs.

Outlook

In the next blog post, we’ll come back to our CAP server and look at other interesting features. Stay tuned!
