Ownership in Rust


Lately I've been messing around with Rust, a systems programming language that is slowly gaining popularity among people who need a fast, compiled language with modern improvements and conveniences around memory management. At least, that is my understanding; this is my first foray into a language more complex than JavaScript.

Today I'll discuss the concept of ownership in Rust as a way to help myself retain the information and hopefully understand it a bit better. Essentially, there are two main ways that memory is handled in programming languages today.

The first way exists in languages like C or C++, which require the person writing the code to manually allocate memory and then free it once it is no longer needed, so that it can be used elsewhere down the line. If you never free it (a memory leak), a long-running program can eventually use up the available memory and crash. On modern operating systems this isn't too big of a deal, because the OS reclaims all of a program's memory when the program exits or is terminated, so your computer comes to no harm. But if your badly written code were running at the kernel level, for example, the OS wouldn't be there to save you and you could take the whole system down with you.

The second way is more familiar to me, and it revolves around 'Garbage Collection' (GC). First introduced in 1959 with the Lisp programming language, it is widely used today. In languages like JavaScript, Python, or Ruby, memory is automatically freed up and recycled when it is no longer in use, so the programmer doesn't have to worry about a lot of the pitfalls present in lower-level languages. Kind of like driving an automatic car vs. a stick-shift. While GC is really convenient, it generally isn't as efficient at handling memory as a knowledgeable programmer doing it by hand. JavaScript is used all over the web today, but it isn't a good choice for writing a performance-heavy video game, for example, because it just isn't fast enough, among other reasons. Even with frameworks like Electron, which let us use JS outside of the browser to create desktop programs (see the popular chat app Slack), the result generally isn't as fast as a natively written desktop program. I run Slack in my web browser because it hogs way too much of my computer's memory as an Electron app.

Rust introduces a third method of handling memory, which its creators call 'Ownership'. Essentially, there are rules regarding safe memory usage that the compiler checks during compilation. Because the checks happen before the program ever runs, your application isn't slowed down at run time the way it can be in garbage-collected languages. It's kind of a best-of-both-worlds scenario: Rust protects the programmer from memory mistakes, as garbage-collected languages do, but it doesn't do so while the program is running, so there isn't a loss in speed. Kind of like how many supercars today use automatic transmissions because the technology has caught up to the point that the car can shift more quickly and make smarter decisions than a person driving a manual transmission.

First, let's go through a brief explanation of how memory is stored and allocated. A running program keeps its data in RAM (Random Access Memory), in one of two regions: the stack or the heap.

The stack is like a deck of cards. Think of things like variables and function calls as individual cards in the deck. When a new variable is created, it is added to the top of the deck. When it is done being used, it is removed from the top of the deck as well. This ordering is called LIFO, or Last-In-First-Out. The stack always operates like this and the order is always preserved. This makes it easy for the computer to tell what is going on and doesn't require much bookkeeping. Therefore, it's fast and efficient to allocate and free memory on the stack. The caveat is that in order for something to exist on the stack, its memory requirements have to be known at compile time. The stack can only handle data that has a fixed size.
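To make the "size known at compile time" rule a bit more concrete, here is a small sketch that asks the compiler how many bytes a few fixed-size types take up, using `std::mem::size_of` from the standard library. The helper name `stack_sizes` is something I made up for this post.

```rust
use std::mem::size_of;

/// Sizes (in bytes) of a few types whose size is fixed at compile time.
/// `stack_sizes` is a hypothetical helper name used only for this example.
fn stack_sizes() -> (usize, usize, usize) {
    (
        size_of::<i32>(),      // a 32-bit integer
        size_of::<[u8; 16]>(), // an array of sixteen bytes
        size_of::<f64>(),      // a 64-bit floating point number
    )
}

fn main() {
    let (int_size, array_size, float_size) = stack_sizes();
    println!("i32: {} bytes", int_size);
    println!("[u8; 16]: {} bytes", array_size);
    println!("f64: {} bytes", float_size);
}
```

Because these sizes never change, values of these types can simply be placed on top of the stack and taken off again.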

Data that might change in size cannot go on the stack. That's where the heap comes in. As the name might suggest, it is not ordered like a stack. When something is added to the heap, you ask for a certain amount of memory and the computer finds a free spot in the heap that is big enough, then hands back the address of that spot. This provides more flexibility than the stack, but it is more costly because the computer has to do more work to decide where to store the data. In addition, when you want to read or change data on the heap, the computer first has to follow that address to find it, so these operations tend to be slower than they would be on the stack.
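As a rough sketch of that flexibility, the example below grows a `String` (Rust's growable string type, which comes up again further down) while the program is running. The helper name `grow_on_heap` is invented for this post. It also checks that the small fixed-size handle kept on the stack stays the same size, no matter how long the text on the heap gets.

```rust
use std::mem::size_of_val;

/// Grows a `String` and reports (length before, length after,
/// whether the stack-side handle kept the same size).
/// `grow_on_heap` is a hypothetical helper name for this example.
fn grow_on_heap() -> (usize, usize, bool) {
    let mut text = String::from("Hi");
    let len_before = text.len();
    let handle_before = size_of_val(&text);

    // The text itself lives on the heap, so it can grow at run time.
    text.push_str(", there!");

    let len_after = text.len();
    let handle_after = size_of_val(&text);
    (len_before, len_after, handle_before == handle_after)
}

fn main() {
    let (before, after, same_handle) = grow_on_heap();
    println!("length went from {} to {} bytes", before, after);
    println!("handle size unchanged: {}", same_handle);
}
```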

Keeping track of what is being stored where is something that must be done by hand in C or C++, and doing it poorly will cause problems. Rust takes care of that for us with the rules around ownership.

Here is Rust's ownership concept in action:

{
    let message = "Hi!";
    // message can be accessed here
}
// message is no longer available down here.
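If you want to watch a value get cleaned up at the closing brace, one option is Rust's `Drop` trait, which lets a type run some code at the moment its value goes out of scope. The `Tracer` type, the log, and the `scope_events` function below are all invented for this sketch; they just record what happens and in what order.

```rust
use std::cell::RefCell;

/// Hypothetical type that writes to a shared log when it is cleaned up.
struct Tracer<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Tracer<'_> {
    // Rust calls this automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

/// Returns the recorded events in the order they happened.
fn scope_events() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _message = Tracer { name: "message", log: &log };
        log.borrow_mut().push(String::from("inside the block"));
    } // `_message` is dropped here, at the closing brace
    log.borrow_mut().push(String::from("after the block"));
    log.into_inner()
}

fn main() {
    for event in scope_events() {
        println!("{}", event);
    }
}
```

The "dropped message" entry lands between the other two, which lines up with the idea that cleanup happens right at the end of the block and not at some later time chosen by a garbage collector.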

The variable message holds the string literal "Hi!". A literal's size is known at compile time: as far as I understand, its text is baked into the compiled program, and message itself is just a small, fixed-size reference to that text that lives on the stack. Variables are block scoped in Rust, so message is valid within its block and is cleaned up once the program is done with that block. Because it is a string literal, the text itself cannot be changed in place. Handing a string from one variable to another is something we do in JavaScript without ever thinking about memory:

{
    let message = "Hi!";
    let message2 = message;
    console.log(message2); // "Hi!"
}

To see what Rust does differently, we need a more complex string type, ```String```. ```String``` allows for text whose size can change while the program is running. As discussed earlier, since the size is not known at compile time, the contents of a ```String``` are held on the heap.

{
    let message = String::from("Hi!");
    let message2 = message;
    println!("{}", message2); // prints Hi!
}

Okay, so same result, but what is happening behind the scenes? Both variables end up pointing at the same memory address on the heap. In a language like C or C++, two raw pointers to the same allocation would be a problem when it came time to free the memory: if both tried to free it, the same memory would be freed twice (a "double free"), which can corrupt memory or crash the program.

(Diagram of s1 and s2 pointing at the same data on the heap, taken from the Rust Book.)

Rust solves this problem like this: once message2 (s2 in the picture) is pointing at the same address on the heap as message (s1), Rust treats message as no longer valid. The value is said to have moved into message2. Ownership dictates that the heap data now belongs to message2 for as long as that variable is in scope. That way, when the memory is freed, there is no confusion: message simply can't be used anymore, and message2 alone is responsible for freeing the data stored on the heap.
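Here is a small sketch of what that looks like in practice. The function names `move_message` and `clone_message` are made up for this post. The first shows the move described above; the second shows `clone`, which, as I understand it, makes a separate copy of the heap data so that both variables stay usable.

```rust
/// Moves a `String` from one variable to another and returns the new owner.
fn move_message() -> String {
    let message = String::from("Hi!");
    let message2 = message; // ownership moves to `message2`

    // Uncommenting the next line fails to compile with error E0382
    // ("borrow of moved value"), because `message` no longer owns the text:
    // println!("{}", message);

    message2
}

/// Makes an independent copy so that both variables stay usable.
fn clone_message() -> (String, String) {
    let message = String::from("Hi!");
    let message2 = message.clone(); // a second, separate heap allocation
    (message, message2)
}

fn main() {
    println!("{}", move_message());
    let (first, second) = clone_message();
    println!("{} {}", first, second);
}
```

The nice part is that the mistake in the commented-out line is caught by the compiler, before the program ever runs.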

All of this to show one small example of how ownership works in Rust. As I continue learning more, I'm going to try to keep adding new posts. Until next time.