csperkins.org

Advanced Systems Programming H (2021-2022)

Lecture 5: Resource Ownership and Memory Management

This lecture begins to discuss the various options available for resource ownership and memory management in modern programming languages. It discusses how running programs are arranged in memory, and outlines the need for automatic memory management. It then discusses two approaches to memory management: reference counting and region-based memory management, considering their various costs and benefits. The region-based approach used in Rust is shown to have several important benefits, in terms of efficient and deterministic memory management, but at the cost of making certain classes of program more complex to express.

Part 1: Memory

The first part of this lecture reviews how a process is stored in memory, and what different types of memory exist within a running process, as a prelude to the discussion of automatic memory management. It also discusses how memory-unsafe behaviour can lead to security vulnerabilities through buffer overflow attacks.

Slides for part 1

 

00:00:00.200 In this week’s lecture, I want to

00:00:02.200 talk about resource ownership and memory management.

 

00:00:06.000 In this first part, I’ll talk about

00:00:07.866 how a process is stored in memory,

00:00:10.100 and what memory has to be managed.

 

00:00:12.400 Then, in the following parts,

00:00:14.300 I’ll start our discussion of different

00:00:16.166 approaches to memory management,

00:00:18.166 starting with reference counting and region-based

00:00:20.766 memory management, and how these interact with

00:00:23.366 resource management.

 

00:00:26.600 Let’s start by talking about memory.

 

00:00:30.600 To understand memory management, you must first

00:00:33.400 understand what memory needs to be managed.

00:00:35.800 In all modern operating systems, every process

00:00:38.300 is given a virtual address space in

00:00:40.700 which it executes. Each process thinks it’s

00:00:43.066 the only one running on the machine,

00:00:45.466 and has access to the full range

00:00:47.033 of memory addresses.

 

00:00:49.000 The underlying hardware translates these virtual addresses

00:00:52.000 into physical addresses that represent particular locations

00:00:55.000 in real memory, and makes sure that

00:00:57.966 the different processes are isolated from each other.

 

00:01:01.300 The virtual address space that a process

00:01:03.566 sees is divided into several different parts.

 

00:01:06.933 At the bottom of memory, starting with

00:01:09.000 the lowest numbered address sits the program

00:01:11.033 text itself: the machine code that represents

00:01:14.466 the running program. That’s immediately followed by

00:01:17.800 the static data and space for any

00:01:19.900 global variables.

 

00:01:22.000 Following this in memory is the heap:

00:01:24.100 memory allocated using malloc(), or similar mechanisms.

 

00:01:27.900 Following that is the memory used to

00:01:30.200 hold shared libraries and memory mapped files.

 

00:01:33.000 Then the stack space, growing down from

00:01:35.100 the top of user accessible memory.

 

00:01:37.533 Then, finally, the operating system kernel itself

00:01:40.566 occupies the top part of the address space.

 

00:01:43.433 How much memory is allocated to each

00:01:45.600 of these, and what virtual addresses each

00:01:47.900 sits between, depend on whether you have

00:01:50.900 a 32-bit or a 64-bit machine,

00:01:53.433 and on what operating system you’re running.

 

00:01:55.900 But all the systems follow the same basic approach.

 

00:02:01.566 The program text, data, and global variables

00:02:04.966 occupy the lowest part of the address space.

 

00:02:08.100 There are four parts to this.

 

00:02:10.966 The very lowest page of memory,

00:02:12.800 usually the first 4096 bytes starting at

00:02:16.133 address zero, is reserved. In languages like

00:02:20.166 C, a null pointer is represented by

00:02:22.400 address zero, so the hardware virtual memory

00:02:25.000 controller is programmed to prohibit access to

00:02:27.766 that address, and to adjacent addresses,

00:02:30.366 as a way of trapping null-pointer dereferences.

 

00:02:33.800 Trying to access this memory will trap

00:02:36.133 into the operating system, which will kill

00:02:38.100 the process with a segmentation violation error.

 

00:02:42.000 Following this reserved page, sits the compiled

00:02:44.700 program text. That is, the machine code

00:02:47.966 representing the program. This occupies the lowest

00:02:51.266 part of the address space, starting just above address zero.

 

00:02:55.800 Next comes the data segment.

 

00:02:58.233 This comprises string literals and static global variables.

00:03:02.266 Data and global variables where the value

00:03:04.233 is known at compile time.

 

00:03:06.500 These are stored in the compiled binary,

00:03:08.866 along with the executable machine code,

00:03:11.266 and loaded into memory following the program text.

 

00:03:15.000 Then, space is allocated for the BSS segment.

00:03:18.966 This provides reserved space for uninitialised

00:03:21.933 global variables defined in the program.

 

00:03:24.600 The name BSS stands for “block started

00:03:27.400 by symbol”, and is a historical relic.

 

00:03:30.833 The program text and data segments are

00:03:33.333 fixed. They comprise the compiled program code

00:03:36.033 and data known at compile time.

 

00:03:38.666 Similarly, the size of the BSS segment

00:03:41.133 is known at compile time.

 

00:03:43.500 In older operating systems,

00:03:45.533 the program and data always start at a fixed

00:03:48.000 location at the start of the memory.

 

00:03:50.500 In modern systems,

00:03:52.033 they’re still loaded near the start of the memory,

00:03:54.766 but the actual starting address is randomised

00:03:56.833 each time the program runs.

 

00:03:59.000 Why is it randomised?

 

00:04:01.066 It’s a security measure.

 

00:04:02.800 It makes it harder for code executed as part of

00:04:05.366 a buffer overflow attack to call into

00:04:07.800 other parts of the program, since it

00:04:09.966 can’t know where they’ll be located in memory.
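
To make this layout concrete, here is a small Rust example (not taken from the lecture slides) that prints the address of data in each of these areas. The exact values are platform-specific, and change from run to run because of this randomisation.

```rust
// A minimal sketch: print addresses of program text, static data, heap,
// and stack, to see roughly where each sits in the virtual address space.
static GREETING: &str = "hello";          // static data segment

fn main() {
    let local = 42;                       // stack
    let boxed = Box::new(42);             // heap
    let code: fn() = main;                // program text

    println!("code:   {:p}", code);
    println!("static: {:p}", GREETING.as_ptr());
    println!("heap:   {:p}", &*boxed);
    println!("stack:  {:p}", &local);
}
```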

 

00:04:14.633 As a program executes, it needs space

00:04:17.333 to hold the function parameters, return addresses,

00:04:19.800 and local variables.

00:04:21.833 These are stored on the stack.

 

00:04:24.900 The stack occupies the top of user

00:04:27.000 accessible memory, starting at a random address

00:04:30.000 for security, and grows downwards.

 

00:04:33.566 Each time a function is called,

00:04:35.366 starting with main(), the parameters to that

00:04:38.133 function, the return address, and a pointer

00:04:40.766 to the previous stack frame are pushed

00:04:42.566 onto the stack. These occupy the next

00:04:45.466 addresses below the previous top of the stack.

 

00:04:48.866 And when the function starts to execute,

00:04:51.200 space for its local variables is similarly

00:04:53.433 allocated on the stack.

 

00:04:56.033 If that function calls other functions,

00:04:58.166 the stack grows, as the new stack frames are created.

 

00:05:02.000 And when a function returns, the stack

00:05:04.500 shrinks and the memory it used on

00:05:06.233 the stack is automatically reclaimed.

 

00:05:08.866 The compiler generates the code to manage

00:05:11.333 the stack, as part of the code

00:05:13.133 it generates for each function.

 

00:05:15.333 It knows how the return address, parameters, and stack

00:05:18.500 pointer are represented, and how to grow

00:05:21.000 and shrink the stack.

 

00:05:23.000 And the operating system generates the stack

00:05:25.033 frame for main() when the program starts,

00:05:27.900 with a return address pointing to the process cleanup code.

 

00:05:32.000 To the programmer, the stack is managed automatically.

 

00:05:35.600 Ownership of the stack memory follows function invocation.
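
As a small illustration (again, not slide code), the Rust program below prints the address of a local variable in each of a chain of nested calls. On most common platforms the addresses decrease from main() to inner(), showing the stack growing downwards as frames are pushed.

```rust
// Each nested call allocates its frame below its caller's frame.
fn inner() {
    let z = 3;
    println!("inner: {:p}", &z);
}

fn outer() {
    let y = 2;
    println!("outer: {:p}", &y);
    inner();
}

fn main() {
    let x = 1;
    println!("main:  {:p}", &x);
    outer();
}
```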

 

00:05:41.500 This slide shows an example of a simple C program.

 

00:05:45.100 It comprises a main() function that inspects

00:05:47.566 the values given as command line arguments,

00:05:49.966 and either prints a greeting or a usage message.

 

00:05:53.833 When this program starts to execute,

00:05:56.366 and main() is called, the stack contents

00:05:59.000 are as shown in the green region

00:06:01.033 on the right of the slide.

 

00:06:03.466 The stack contains the arguments to main(),

00:06:06.300 argc and argv, the address to return

00:06:10.066 to once main() completes, and the local variables for main.

 

00:06:14.833 When execution reaches the printf() line,

00:06:17.966 shown in red, and the function is

00:06:20.233 called, a new stack frame is created.

 

00:06:24.266 This new stack frame holds the arguments

00:06:26.533 for printf(), the address to return to

00:06:29.466 once the printf() call finishes, and a

00:06:31.733 pointer to the previous stack frame.

 

00:06:34.633 The address of the previous stack frame

00:06:36.533 is stored for ease of debugging,

00:06:38.266 so stack traces can be printed and

00:06:40.500 so debuggers can trace program execution.

 

00:06:44.200 As printf() starts to execute, space is

00:06:46.733 allocated on the stack for any local variables it uses.

 

00:06:50.566 And, if printf() itself calls any functions,

00:06:53.266 the stack will continue to grow as needed.

 

00:06:57.333 This suggests the classic buffer overflow attack.

 

00:07:01.333 If the language is not type safe,

00:07:03.566 and doesn’t check array bounds,

00:07:05.600 then the call to printf() can be arranged

00:07:07.700 so that it writes past the bounds of a local

00:07:09.600 variable stored on the stack.

 

00:07:12.000 When this happens, it overwrites whatever is

00:07:14.633 stored in the next higher addresses on the stack.

 

00:07:18.366 What would that be?

 

00:07:20.166 Well, maybe some other local variables,

00:07:23.200 but also the pointer to the previous stack frame,

00:07:26.000 then the return address of the function.

 

00:07:29.000 The goal is to overwrite the return address,

00:07:31.700 and some of the following memory.

00:07:33.933 The attacker tries to fill the memory

00:07:36.066 following the return address with the code

00:07:37.866 they want to execute. And to overwrite

00:07:40.500 the return address so that it points to that code.

 

00:07:44.266 When the function then returns, rather than

00:07:47.233 return to the correct place, it returns

00:07:49.733 instead to the return address written during

00:07:51.766 the overflow, and executes the attacker’s code.

 

00:07:55.533 It’s a technique known as stack smashing.

 

00:07:59.033 The link to the Phrack article on

00:08:01.233 the slide is the classic explanation of this technique.

 

00:08:05.766 Modern systems have workarounds, of course.

 

00:08:09.900 They randomise the location of the stack

00:08:11.833 each time a program runs, to make

00:08:13.833 it harder to know what value to

00:08:15.300 overwrite the return address with.

 

00:08:18.000 And they arrange for the virtual memory hardware to mark

00:08:20.566 the stack memory as non-executable,

00:08:22.800 so the system will refuse to execute the code

00:08:25.466 if the return address is successfully overwritten.

 

00:08:29.000 These techniques help.

00:08:30.933 The classic stack smashing attack I’ve described

00:08:34.000 won’t work on a modern system.

 

00:08:36.933 But they don’t entirely solve the problem.

00:08:39.666 Newer attacks, such as return oriented programming,

00:08:43.100 can defeat the protections in some cases.

 

00:08:46.000 The real solution, of course, is to

00:08:48.533 use a language that checks array bounds,

00:08:51.333 and prevents buffer overflows in the first place.
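
As a small illustration of that point (a sketch, not code from the slides): in Rust, an out-of-range index is caught by a bounds check at run time and causes a panic, rather than silently overwriting whatever sits next to the buffer on the stack.

```rust
fn main() {
    let buffer = [0u8; 8];
    // An index that is only known at run time, and is out of range.
    let index = std::env::args().count() + 10;
    // This panics with "index out of bounds" instead of corrupting
    // adjacent stack memory such as a saved return address.
    println!("{}", buffer[index]);
}
```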

 

00:08:56.000 Back to memory management.

 

00:08:58.400 In addition to the program code and the stack,

00:09:01.333 the heap holds explicitly allocated memory.

 

00:09:06.733 This is memory allocated by calls to

00:09:08.666 functions such as malloc() and calloc() in C;

00:09:11.966 boxed memory in Rust;

00:09:13.800 and objects allocated using new() in Java.

 

00:09:17.500 The heap starts at a low address in memory,

00:09:20.500 following the BSS segment,

00:09:22.266 and grows upwards towards the stack.

 

00:09:25.100 As you might expect by now,

00:09:26.733 the exact starting address for heap allocations

00:09:28.933 is randomised for security.

 

00:09:31.733 In general, successive heap allocations will occupy

00:09:34.733 consecutive blocks in memory,

00:09:36.633 although multi-threaded systems will partition the heap

00:09:39.800 so that each thread has its own

00:09:41.433 space in which to make allocations,

00:09:43.300 without needing to coordinate with other threads.

 

00:09:46.433 And, depending on the processor, small allocations

00:09:49.900 may be rounded up in size to

00:09:51.533 align on a 32 bit or a 64 bit boundary.

 

00:09:56.500 While the stack is managed automatically as

00:09:58.800 functions are called, the challenge of memory

00:10:01.233 management is primarily about how to manage

00:10:03.333 and reclaim the heap memory.

 

00:10:06.000 This can be done manually, by calling the free() function.

 

00:10:10.000 It can be done automatically via reference

00:10:12.533 counting or garbage collection, as we’ll discuss

00:10:15.200 in the next lecture.

 

00:10:17.000 Or it can be done automatically,

00:10:18.966 based on regions and lifetime analysis,

00:10:21.233 as I’ll discuss in a later part of this lecture.

 

00:10:26.000 In addition to allocating heap memory,

00:10:28.733 most operating systems allow memory mapped files,

00:10:31.900 allowing data on disk to be directly

00:10:34.100 mapped into the address space.

 

00:10:37.000 On Unix-like systems, such mappings are created

00:10:39.866 using the mmap() system call.

 

00:10:43.000 The mmap() call returns a pointer to

00:10:45.133 a block of memory that acts as

00:10:46.633 a proxy for the file. Reading from

00:10:49.800 that memory will page the relevant

00:10:51.533 part of the file into memory, and return its contents.

 

00:10:54.933 And writes to that block of memory will eventually be

00:10:57.400 paged back out to disk by the operating system.

 

00:11:01.266 The file is paged in and out

00:11:03.066 of memory as needed, with only the

00:11:05.333 parts of the file being accessed being

00:11:07.200 loaded. It’s an effective way of providing

00:11:10.066 random access to parts of a file.

 

00:11:13.433 Such memory mapped files generally occupy the

00:11:16.200 space between the heap and the stack.

 

00:11:19.000 Shared libraries are usually implemented using memory

00:11:21.666 mapped files, and are mapped into this memory space too.
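
The sketch below shows roughly how mmap() is used from Rust on a Unix-like system. It assumes the libc crate as a dependency, uses /etc/hostname purely as an example of a small readable file, and keeps error handling to a minimum.

```rust
use std::fs::File;
use std::os::unix::io::AsRawFd;

fn main() -> std::io::Result<()> {
    let file = File::open("/etc/hostname")?;       // any readable, non-empty file
    let len = file.metadata()?.len() as usize;

    // Ask the kernel to map the file into the address space, read-only.
    let addr = unsafe {
        libc::mmap(
            std::ptr::null_mut(),     // let the kernel choose the address
            len,
            libc::PROT_READ,          // read-only mapping
            libc::MAP_PRIVATE,        // private, copy-on-write mapping
            file.as_raw_fd(),
            0,                        // offset within the file
        )
    };
    assert_ne!(addr, libc::MAP_FAILED, "mmap failed");

    // The returned pointer acts as a proxy for the file: reading through it
    // pages the relevant parts of the file into memory on demand.
    let bytes = unsafe { std::slice::from_raw_parts(addr as *const u8, len) };
    println!("{}", String::from_utf8_lossy(bytes));

    unsafe { libc::munmap(addr, len) };
    Ok(())
}
```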

 

00:11:27.333 Finally, the operating system kernel resides at

00:11:30.033 the top of the address space.

 

00:11:32.566 Kernel memory isn’t directly accessible to user

00:11:34.900 programs, and attempts to access that memory

00:11:37.566 will result in a segmentation violation.

 

00:11:41.000 The hardware provides a special machine code

00:11:43.100 instruction, known as syscall on 64-bit Intel

00:11:46.733 processors, that switches the system to kernel

00:11:49.700 mode and executes a particular function in

00:11:52.100 kernel memory, after performing some permission checks.

 

00:11:56.000 The kernel itself can read and write

00:11:58.566 to the entire address space. The hardware

00:12:01.166 protects the operating system kernel from the

00:12:03.200 user processes, but the operating system has

00:12:05.600 to be able to control those processes.

 

00:12:09.600 That concludes our discussion of how

00:12:11.533 programs are stored in memory.

 

00:12:13.700 In the next part, I’ll start to

00:12:15.333 talk about memory management, and the use

00:12:17.666 of reference counting as a memory management technique.

Part 2: Automatic Memory Management: Reference Counting

The second part of the lecture introduces the concept of automatic memory management and its use in systems programs. Then it discusses one of the simplest automatic memory management schemes: reference counting. It outlines how reference counting works, and describes its limitations, costs, and benefits.

Slides for part 2

 

00:00:00.400 In this part, I’ll talk about the goals

00:00:02.633 of automatic memory management.

00:00:04.566 And I’ll discuss one of the simplest automatic memory

00:00:06.866 management schemes, reference counting.

 

00:00:11.000 Automatic memory management

00:00:12.900 is frequently distrusted by systems programmers.

00:00:15.633 There’s a general belief that it has high overheads.

00:00:19.166 That it’s slow, CPU hungry, and wastes memory.

00:00:22.900 And that the use of automatic memory management

00:00:25.466 introduces unpredictability into the timing

00:00:28.200 of time sensitive operations.

 

00:00:31.000 And to some extent this belief is true.

 

00:00:34.066 There are many different types of automatic

00:00:36.100 memory management. And some of them do

00:00:38.600 have high CPU costs. Some of them

00:00:41.000 do waste memory. And some do introduce

00:00:43.866 unpredictability into timing.

 

00:00:46.733 Equally, though, there are very real problems

00:00:49.266 due to manual memory management.

 

00:00:52.000 Managing memory manually has unpredictable overheads.

 

00:00:56.066 As a programmer you know when you

00:00:58.266 call malloc() and free(), but you have

00:01:00.466 no idea how long those calls will

00:01:01.933 take to execute. And they can be

00:01:04.166 quite processor intensive, and can take unpredictable

00:01:06.666 amounts of time, in some cases.

 

00:01:10.000 And there are the numerous, and well

00:01:11.966 known, problems due to memory leaks,

00:01:13.900 memory corruption, buffer overflows, use-after-free bugs,

00:01:16.933 and iterator invalidation, all of which are

00:01:19.833 due to manual memory management.

 

00:01:23.000 Systems programmers have focussed on the problems

00:01:25.333 of moving to automatic memory management,

00:01:27.466 discounting the problems of the status quo.

 

00:01:31.166 Furthermore, we’re starting to see automatic memory

00:01:33.966 management techniques that solve some of the

00:01:35.800 problems with older approaches. New garbage collection

00:01:39.566 algorithms have lower overheads and are more

00:01:41.466 predictable. And systems have gotten faster,

00:01:44.633 making the overheads of existing algorithms more acceptable.

 

00:01:48.300 And different approaches, such as region-based memory

00:01:51.133 management, are starting to see widespread use,

00:01:54.766 and offer more predictability and stronger compile-time

00:01:58.266 behaviour guarantees.

 

00:02:00.700 For many systems programs, it may be

00:02:02.900 time to reconsider the cost-benefit tradeoff for

00:02:05.266 automatic memory management, to see if the

00:02:07.866 balance has shifted.

 

00:02:11.500 Systems programs traditionally used a mix of

00:02:13.900 manual and automatic memory management.

 

00:02:17.600 The memory used for the stack is managed automatically.

 

00:02:21.666 In the sample code on the slide,

00:02:23.600 for example, the memory for the local

00:02:25.833 variable, di, in the saveDataForKey() function is

00:02:29.266 automatically allocated when the function is called,

00:02:32.900 and automatically freed when the function returns.

 

00:02:37.000 This is so common that we don’t even think about it.

 

00:02:40.166 Stack-based memory management, in this form,

00:02:43.466 works extremely well for languages like C

00:02:46.366 and C++ that support complex value types.

 

00:02:49.400 That is, for languages that can put

00:02:51.666 structs on the stack and pass them by value.

 

00:02:55.800 It works less well for languages like

00:02:57.600 Java, where objects are heap allocated.

 

00:03:00.666 In Java, the variable, di, would be

00:03:03.166 a reference to a heap allocated object,

00:03:05.400 which would need to be garbage collected.

 

00:03:07.833 Only the reference to that object would

00:03:10.066 be stored on the stack. While,

00:03:12.600 in C and C++, the entire object

00:03:14.966 can be efficiently managed on the stack.

 

00:03:17.566 The stack in C and C++ is

00:03:20.000 an example of successful automatic memory management.

 

00:03:24.000 The heap, on the other hand,

00:03:25.900 is generally managed manually. Allocation is by

00:03:28.866 a call to malloc(). Deallocation, if the

00:03:31.266 programmer remembers to do so, is by

00:03:33.833 an explicit call to free().

 

00:03:37.900 It would be good to have an

00:03:39.266 effective and automatic way of managing the

00:03:41.033 heap. Something that’s as simple, efficient,

00:03:44.300 and effective as managing the stack.

 

00:03:47.266 The goal of automatic memory management is

00:03:49.366 to manage the heap. To find memory

00:03:51.966 that’s no longer in use, and make

00:03:53.900 that space available for reuse. To look

00:03:56.033 for objects that are no longer referenced

00:03:58.166 by the running program, and to free their memory.

 

00:04:01.000 And to do so efficiently and automatically.

 

00:04:05.000 And to do so safely.

 

00:04:06.966 It’s better to waste memory, and to keep an

00:04:09.033 object alive when it shouldn’t be,

00:04:11.200 than to deallocate the memory used by

00:04:13.766 an object that’s potentially still in use.

 

00:04:16.500 There are three approaches to automatically

00:04:18.800 managing the heap.

 

00:04:21.000 The first is reference counting, that I’ll

00:04:22.933 discuss in a minute.

 

00:04:25.000 The second is region-based lifetime tracking,

00:04:27.700 that I’ll discuss in the next part of this lecture.

 

00:04:31.000 And the third is to use garbage

 

00:04:32.933 collection, that I’ll discuss in lecture 6.

 

00:04:37.500 The first of the automatic memory management

00:04:40.100 techniques is reference counting.

 

00:04:43.500 Reference counting is simple and easy to understand.

 

00:04:47.166 When allocating memory for an object,

00:04:49.433 the allocation contains additional space for a

00:04:51.500 reference count that’s stored along with the object.

 

00:04:54.900 That is, every object has a hidden

00:04:57.566 extra field, large enough to hold an

00:04:59.866 integer value, that’s stored alongside the object

00:05:03.100 and managed by the runtime system,

00:05:04.966 invisibly to the programmer.

 

00:05:07.333 This extra field contains a reference count.

00:05:10.400 It counts how many other objects have

00:05:12.833 a pointer, a reference, to this object.

 

00:05:15.333 When a heap allocated object is created,

00:05:18.033 the reference count is set to one,

00:05:20.133 and a reference to the object is returned.

 

00:05:22.866 When a new reference, a new pointer,

00:05:25.366 to the object is made, then the

00:05:28.066 reference count is increased by one.

 

00:05:30.233 When a reference is removed or changed,

00:05:32.766 so that it no longer points to

00:05:34.866 the object, then the reference count is decreased by one.

 

00:05:38.300 If the reference count reaches zero,

00:05:40.866 then there are no references left that

00:05:43.466 point to the object, and it may

00:05:44.966 be reclaimed, and its memory deallocated.

 

00:05:48.000 If an object that contains references to

00:05:50.300 other objects is reclaimed, this removes a

00:05:52.966 reference to those objects. Doing so potentially

00:05:56.233 causes their reference counts to go to

00:05:58.433 zero, triggering further reclamation.

 

00:06:01.066 The reference counts are maintained automatically.

00:06:04.100 In a compiled language with reference counting,

00:06:07.133 such as the Objective C runtime used

00:06:09.766 in iPhones, the compiler generates the

00:06:12.500 code to manage references and reclaim objects

00:06:14.966 when pointers are manipulated. In an interpreted

00:06:18.533 language with reference counting, such as Python

00:06:21.400 or Ruby, the interpreter updates the references

00:06:24.533 and reclaims the objects.
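
Rust makes this scheme available as an opt-in library type, Rc<T>, which stores the count alongside the heap allocation. A minimal sketch (not from the slides) showing the count being incremented and decremented:

```rust
use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("shared data"));
    println!("count = {}", Rc::strong_count(&a));   // 1

    let b = Rc::clone(&a);                          // new reference: count becomes 2
    println!("count = {}", Rc::strong_count(&a));   // 2

    drop(b);                                        // reference removed: count back to 1
    println!("count = {}", Rc::strong_count(&a));   // 1
}   // a goes out of scope, the count reaches zero, and the String is freed
```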

 

00:06:28.500 The key benefit of reference counting is

00:06:31.066 that it’s predictable and understandable.

 

00:06:34.000 It’s easy to explain how reference counting works.

 

00:06:37.533 It’s easy to understand when memory is

00:06:39.866 reclaimed, and what actions in a program

00:06:42.333 might trigger memory to be reclaimed.

 

00:06:45.000 It’s easy to understand what is the

00:06:46.766 overhead and what is the cost.

00:06:49.233 It’s one additional integer per object;

00:06:51.166 an increment of a reference count when

00:06:53.466 taking a pointer to an object;

00:06:55.200 and a decrement of a reference count,

00:06:57.033 an if statement, and a potential call

00:06:58.900 to free() when removing a reference.

 

00:07:01.933 The behaviour is intuitive to programmers –

00:07:04.233 and that counts for a lot.

 

00:07:06.466 Reference counting is also incremental.

 

00:07:09.266 Memory is reclaimed in small bursts, and the costs

00:07:12.133 that occur are an occasional small overhead

00:07:14.666 on pointer operations. There are few long

00:07:17.233 bursts of memory management activity, and it’s

00:07:19.800 clear what might trigger such bursts.

 

00:07:24.000 Reference counting also has some costs.

 

00:07:27.600 Cyclic data structures can be produced that

00:07:30.100 contain mutual references. That is, a set

00:07:33.100 of objects can reference each other in

00:07:35.233 a loop, such that the objects are

00:07:37.666 all reachable from some other object,

00:07:39.166 so the reference count never reaches zero,

00:07:41.733 but not from the rest of the program.

 

00:07:44.566 This can lead to a disconnected set

00:07:46.466 of objects, unreachable from the rest of

00:07:48.400 the code, that aren’t reclaimed because they

00:07:50.833 have non-zero reference counts.

 

00:07:53.000 That is, it can leak memory.

 

00:07:56.400 The programmer needs to notice, and explicitly

00:07:59.066 set one of the references to null,

00:08:00.933 to break the cycle before removing the

00:08:03.066 last other pointer to the objects.

 

00:08:05.000 Only then will the objects be reclaimed.
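
A minimal sketch of the cycle problem, using Rust's Rc type purely for illustration (the Node type below is made up for this example). The two nodes keep each other alive, so their destructors never run and the memory is leaked; weak references, via Rc::downgrade, are the usual way to break such cycles.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        println!("node dropped");   // never printed for the cycle below
    }
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));     // completes the cycle

    // Each count is now 2: one handle in main, one inside the other node.
    println!("a: {}, b: {}", Rc::strong_count(&a), Rc::strong_count(&b));
}   // a and b go out of scope, but each count only falls to 1: the nodes leak
```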

 

00:08:08.266 Reference counting stores a reference count alongside

00:08:11.100 each object, as an additional hidden field.

00:08:14.100 If the objects can be accessed concurrently,

00:08:16.633 it may also need a mutex for each object.

 

00:08:19.666 This uses additional memory. And can be

00:08:22.566 quite a high overhead if the objects are small.

 

00:08:25.733 And the processor cost of updating the

00:08:27.633 references can be moderately high, if the

00:08:30.133 program manipulates pointers a lot, or if

00:08:32.000 the objects don’t live long.

 

00:08:35.233 Despite these limitations,

00:08:36.900 reference counting is widely used.

 

00:08:39.800 It’s common in scripting languages, such as

00:08:42.500 Python or Ruby, where the overhead of

00:08:44.866 using an interpreted language far outweighs the

00:08:47.133 overhead of using reference counting.

 

00:08:50.000 It’s also used for some aspects of systems programming.

 

00:08:54.166 Applications for the iPhone and Mac are

00:08:56.700 typically written in the Objective C language.

 

00:08:59.700 This is an object oriented extension to

00:09:02.100 C that uses reference counting to manage

00:09:03.966 the memory allocated to objects. And it’s

00:09:06.966 actually highly optimised and reasonably high performance.

 

00:09:09.966 Reference counting has always been fast enough

00:09:12.333 for user interface code and much application

00:09:14.766 logic. It’s not used for low-level,

00:09:17.933 in-kernel, code, where performance is critical, though.

 

00:09:22.766 That concludes our discussion of reference counting.

 

00:09:25.766 It’s a simple, and reasonably effective automatic

00:09:28.400 memory management scheme, that’s quite widely used.

 

00:09:31.600 In the next part, I’ll move on to discuss

00:09:34.300 an alternative known as region-based memory management.

Part 3: Region-based Memory Management

The third part of the lecture introduces the concept of region-based memory management. It describes the concepts and rationale behind this approach, and how it builds on the way the stack is managed. And it describes the benefits and limitations of region-based approaches to memory management, such as that employed by Rust. The concept of ownership tracking is introduced.

Slides for part 3

 

00:00:00.066 In this part, I want to introduce

00:00:02.033 region-based memory management, and discuss how it’s

00:00:04.833 used to manage memory in Rust.

 

00:00:08.000 Region-based memory management is a new way

00:00:10.366 of managing memory that grew out of

00:00:12.266 frustrations with the alternatives.

 

00:00:15.000 Reference counting, as we saw in the

00:00:16.933 last part, is simple and easy to

00:00:19.466 understand, but has relatively high overhead.

 

00:00:23.133 It takes space to store the reference

00:00:25.100 counts, and time to update them.

 

00:00:27.600 And while this overhead is okay for

00:00:29.733 most applications, it’s too much for performance-critical

00:00:32.566 systems code.

 

00:00:35.000 Garbage collection, which I’ll talk about in

00:00:37.933 detail in the next lecture, has unpredictable

00:00:40.200 timing and high memory overheads.

 

00:00:42.433 And manual memory management is too error prone.

 

00:00:46.066 Region-based memory management aims for the middle

00:00:48.366 ground between these.

 

00:00:50.333 It aims to be safe and offer

00:00:52.666 predictable timing. And it has no run-time

00:00:55.133 cost, compared to manual memory management.

 

00:00:57.800 But, to achieve this, it forces some

00:01:00.733 changes in the way code is written.

 

00:01:05.000 So what is region-based memory management?

00:01:07.800 Well, to understand that, let’s first recap

00:01:11.366 the way memory is managed on the stack.

 

00:01:14.566 The slide shows a simple C program.

 

00:01:17.900 This comprises two functions: main() and conic_area(),

00:01:21.766 where the conic_area() function implements a mathematical

00:01:25.000 formula to calculate the surface area

00:01:27.133 of a right circular cone.

 

00:01:30.000 There’s one global static variable,

00:01:32.066 holding the value of Pi.

 

00:01:34.500 The main() function takes no parameters,

00:01:37.066 and holds local variables width, height, and area.

 

00:01:40.700 And the conic_area() function takes two parameters,

00:01:43.433 w and h, and holds the local

00:01:45.833 variables r and a.

 

00:01:49.000 The diagram on the right shows the

00:01:50.900 memory regions owned by the program,

00:01:52.933 while the conic_area() function is executing.

 

00:01:56.433 We see there’s a region holding the

00:01:58.133 global variables, and sub-regions for the stack

00:02:01.066 frame for main() and for the conic_area()

00:02:03.700 functions. These nest, one within the other,

00:02:06.433 with the global region lasting for the

00:02:09.966 entire duration of the program, and the

00:02:12.666 other regions nesting within, and lasting for

00:02:15.200 part of the time.

 

00:02:17.000 The lifetime of the local variables and

00:02:19.066 function parameters matches that of the stack

00:02:21.733 frames of the functions holding them.

 

00:02:24.600 Memory is allocated and deallocated automatically,

00:02:27.866 efficiently, and seamlessly.

 

00:02:32.000 Essentially, there’s a hierarchy of regions corresponding

00:02:34.500 to the call stack of the program.

 

00:02:37.166 The lifetime of the global variables is

00:02:39.566 that of the entire program, and the

00:02:42.066 lifetime of the local variables matches the

00:02:44.366 execution time of the functions they live within.

 

00:02:47.400 And, within each function, we see there

00:02:49.933 may be lexically scoped variables that have

00:02:51.933 a lifetime less than that of the enclosing function.

 

00:02:55.533 In the code on the slide,

00:02:57.366 for example, we see that the variable

00:02:59.666 i, the index variable in the for

00:03:01.700 loop, lives only for the duration of that for loop.

 

00:03:06.000 Each of these variables lives within a

00:03:08.566 region, scoped by the program code,

00:03:10.766 and is automatically allocated when entering that

00:03:13.166 region, and deallocated when the program leaves the region.

 

00:03:16.466 This works seamlessly.

 

00:03:19.100 So much so that we take it for granted when writing code.

 

00:03:24.433 There’s one limitation of this stack-based approach

00:03:27.166 to memory management.

00:03:29.100 It requires data to be allocated on the stack.

 

00:03:33.000 In the example on this slide,

00:03:35.333 the variable tmp lives on the stack

00:03:37.466 frame for the hostname_matches() function. Space for

00:03:41.166 the variable is created when the function

00:03:43.033 starts to execute, and freed when the function concludes.

 

00:03:47.533 But that’s only the memory that holds tmp.

 

00:03:50.633 That is, the memory for the local

00:03:52.666 variable holding the pointer is managed automatically.

 

00:03:56.666 The heap allocated memory, the memory allocated

00:03:59.700 by the call to malloc(), is not freed automatically.

 

00:04:04.000 When tmp goes out of scope,

00:04:06.233 the space holding the pointer is freed,

00:04:08.433 but the value it points to is not.

 

00:04:11.766 The program doesn’t call free() to deallocate

00:04:14.066 that memory, and it doesn’t return the

00:04:16.333 pointer so some other function can do so.

 

00:04:19.966 It leaks memory. The pointer is gone,

00:04:23.533 so there’s no remaining reference to the

00:04:25.466 allocated memory, so no way to free that memory.

 

00:04:30.566 Stack-based memory management is clearly effective,

00:04:33.533 but has limited applicability.

 

00:04:35.933 Can we extend it to manage the heap?

 

00:04:39.033 Yes. We can arrange for the compiler

00:04:41.866 to track the lifetime of data,

00:04:44.000 and in the same way that it

00:04:45.733 inserts code to manage the stack,

00:04:47.466 it can insert code to manage heap

00:04:48.933 memory based on object lifetimes.

 

00:04:51.833 If we arrange the code so that

00:04:53.700 there’s a single clear owner of every

00:04:55.500 data value, then we can arrange it

00:04:57.833 so that if a pointer goes out

00:04:59.100 of scope, then the value it points

00:05:01.133 to is also freed.

 

00:05:03.666 This requires us to track ownership of

00:05:05.533 all data objects, but as we saw

00:05:07.833 in the last lecture, Rust does this.

 

00:05:11.333 The idea then is to define the

00:05:13.533 Box-of-T type, which is a value that’s

00:05:16.366 stored on the stack that holds a

00:05:18.033 reference to data of type T that’s

00:05:21.433 allocated on the heap.

 

00:05:23.800 Calling Box::new() allocates and populates heap memory,

00:05:28.400 and returns a pointer to that memory

00:05:30.600 that’s stored in a local variable on

00:05:32.066 the stack. The resulting local variable is

00:05:35.666 a normal variable, with lifetime matching that

00:05:38.166 of the stack frame in which it resides.

 

00:05:41.366 And the heap allocated object has lifetime

00:05:43.633 matching that of the Box too.

 

00:05:46.566 When the local variable holding the Box

00:05:48.466 goes out of scope, its destructor is

00:05:50.800 automatically called by the compiler. The destructor

00:05:54.433 frees the heap allocated memory. This is

00:05:57.766 the basis for region based memory management in Rust.

 

00:06:01.400 It’s highly efficient. And it safely manages

00:06:05.033 heap memory. But it loses generality,

00:06:08.266 since it ties the lifetime of heap

00:06:10.333 allocation to that of stack frames.
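
A minimal sketch of this behaviour (not code from the slides): the destructor below runs, and the heap memory is freed, exactly when the owning variable goes out of scope.

```rust
struct Thing(u32);

impl Drop for Thing {
    fn drop(&mut self) {
        println!("freeing Thing({})", self.0);
    }
}

fn main() {
    {
        let boxed = Box::new(Thing(1));   // heap allocation, owned by `boxed`
        println!("using Thing({})", boxed.0);
    }                                     // `boxed` leaves scope here: drop() runs,
                                          // then the heap memory is freed
    println!("after the inner scope");
}
```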

 

00:06:13.833 If you’ve programmed C++, this approach will

00:06:16.766 be familiar as the RAII – Resource

00:06:19.866 Acquisition Is Initialization – design pattern.

 

00:06:23.566 Python also does something similar in its “with” clauses.

 

00:06:30.133 Rust takes the pattern further though.

 

00:06:33.766 To be effective, region-based memory management has

00:06:36.666 to not just tie object lifetimes to

00:06:39.033 stack frames, but also to track how

00:06:42.033 objects are passed between stack frames.

 

00:06:45.133 And do so in a way that

00:06:46.733 maintains the single owner invariant for every object.

 

00:06:49.700 At every point, the compiler must know

00:06:52.666 what region of code owns each piece

00:06:54.800 of data. And as data is passed

00:06:57.266 between functions, as ownership of the data

00:07:00.266 is passed between functions, the compiler must

00:07:02.666 track that data ownership.

 

00:07:05.300 And, when the data item finally goes

00:07:07.866 out of scope, after its ownership

00:07:10.500 has been passed between various functions,

00:07:12.433 only then should it deallocate the heap memory.

 

00:07:17.166 As we saw in the last lecture,

00:07:19.066 Rust tracks ownership of data items as

00:07:21.366 they’re passed between functions.

 

00:07:24.066 The code on the slide is the

00:07:25.966 Rust equivalent of the C program we

00:07:27.933 saw earlier, to calculate the conic area.

 

00:07:31.300 If we look at the lifetimes of

00:07:32.900 the variables in the function area_of_cone(),

00:07:35.566 we see that the variable, r,

00:07:37.466 has lifetime matching that of the function’s

00:07:39.566 stack frame. It’s allocated when the function

00:07:42.666 starts to execute, and goes out of

00:07:45.000 scope and is freed when the function finishes.

 

00:07:48.533 The variable, a, though is created within

00:07:51.400 the area_of_cone() but then returned from that

00:07:54.566 function. Ownership of the value is passed

00:07:57.666 from that function to its caller,

00:07:59.800 the main() function. And from there,

00:08:02.200 it’s passed as an argument to the

00:08:04.633 println!() macro, that consumes the value.

 

00:08:07.500 The key point is that the lifetime

00:08:09.466 of the value initially stored in the

00:08:11.166 local variable, a, in the area_of_cone() function,

00:08:14.566 outlives that function.

 

00:08:17.133 And the compiler tracks this lifetime.

 

00:08:19.733 It knows that the ownership has changed.

 

00:08:22.466 That the value has been passed out

00:08:24.533 of the function into main(), and is

00:08:26.333 later consumed by the println!() call.

 

00:08:30.100 We see that ownership of the return

00:08:32.233 value is moved to the calling function.

 

00:08:35.333 The value is moved into the stack

00:08:37.066 frame of the calling function, and assigned

00:08:39.466 to a local variable. And the original

00:08:42.066 value, in the called function’s stack frame,

00:08:44.666 is destroyed when that function returns.

 

00:08:48.000 If the value being moved is a

00:08:49.500 Box, what’s moved is the Box itself.

 

00:08:52.433 That is, the smart pointer to the

00:08:54.333 heap allocated memory is moved into the

00:08:56.400 stack frame of the caller. The value

00:08:59.200 on the heap that it references is unchanged.

 

00:09:02.766 Similarly, if a reference is returned,

00:09:05.100 it’s the reference that’s moved. The referenced

00:09:08.100 value is unchanged.

 

00:09:11.500 Any variable not returned by a function

00:09:13.966 goes out of scope and is destroyed

00:09:15.800 when the function returns.

 

00:09:18.466 What does this mean? The destructor for

00:09:21.466 the type, the drop() method, is called.

 

00:09:24.566 Then the memory holding the object is deallocated.

 

00:09:28.100 If the value going out of scope

00:09:29.966 is a Box, the destructor for the

00:09:32.133 box calls the destructor for the heap

00:09:34.666 allocated object, then frees the heap memory.

 

00:09:37.300 Since every object goes out of scope

00:09:39.233 eventually, when it reaches the end of

00:09:40.933 its lifetime, this ensures that all heap memory is freed.

 

00:09:47.266 Since the Rust compiler tracks the lifetime

00:09:49.933 of every object, it can prevent common

00:09:52.000 lifetime-related errors.

 

00:09:54.466 For example, the Rust program shown on

00:09:56.900 the top of this slide is a

00:09:58.700 function that tries to return a reference

00:10:00.433 to a local variable. This isn’t possible,

00:10:03.533 since the local variable ceases to exist

00:10:05.166 once the function returns.

 

00:10:08.000 The compiler notices this, and as we

00:10:10.900 see, the code doesn’t compile. You can’t

00:10:14.333 return a reference to, that is,

00:10:16.800 borrow, a value that doesn’t exist any more.
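
The slide code isn't reproduced in this transcript, but a function of the kind being described might look like the sketch below. It deliberately fails to compile, because the returned reference would outlive the local variable it points to.

```rust
fn dangling() -> &'static i32 {
    let x = 42;
    &x   // rejected by the compiler: x ceases to exist when the function returns
}
```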

 

00:10:19.733 The equivalent C program, shown on the

00:10:22.766 bottom of the slide, will compile just

00:10:24.733 fine, but crash at runtime when the

00:10:26.833 returned pointer is used.

 

00:10:29.800 Good C compilers warn about this,

00:10:32.000 in simple cases like this example,

00:10:34.433 but will compile more complex variants of

00:10:36.866 the code without warnings.

 

00:10:40.500 Similarly, because the Rust compiler tracks object

00:10:43.400 lifetimes, it can prevent you from accessing

00:10:45.800 memory once it’s been freed.

 

00:10:48.766 The Rust code shown at the top

00:10:50.800 of the slide shows an example.

 

00:10:53.066 It imports and explicitly calls the drop()

00:10:55.300 function, to force the local variable,

00:10:57.900 x, to be deallocated.

 

00:11:01.000 Calling drop() manually is never explicitly needed

00:11:03.666 in Rust, but is possible, and has

00:11:06.666 the same effect as calling free() to

00:11:08.433 deallocate memory in a C program.

 

00:11:12.000 The Rust code on the slide,

00:11:13.866 that deallocates memory then tries to print

00:11:16.266 out the contents of the just-deallocated memory,

00:11:19.133 fails to compile. The compiler knows that

00:11:22.633 the drop() call consumes the object and

00:11:25.033 that its lifetime ends. It knows that

00:11:27.266 the value cannot then be accessed.
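
Again, the slide isn't reproduced here, but a sketch of the kind of code described might be as follows. It deliberately fails to compile. (drop() is also available from the prelude, so the explicit import mentioned above is optional.)

```rust
fn main() {
    let x = Box::new(42);
    drop(x);              // x is consumed here, and its heap memory freed
    println!("{}", x);    // rejected: x has been moved into drop() and no longer exists
}
```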

 

00:11:30.200 The equivalent C program, shown at the

00:11:32.733 bottom, will compile without warnings, but has

00:11:36.066 undefined behaviour at runtime since it accesses

00:11:39.166 previously freed memory.

 

00:11:43.366 In addition to returning ownership of data,

00:11:46.533 functions can take ownership of a value.

 

00:11:49.366 When a value is passed to a

00:11:51.400 function, that function takes ownership of that

00:11:53.966 value. We see this in the code

00:11:56.233 on the slide, where the consume() function

00:11:58.633 takes ownership of the local variable, a,

00:12:00.800 from the main() function, in the

00:12:02.633 form of its parameter, x.

 

00:12:05.233 And, as usual, once the consume() function

00:12:07.700 ends, the values of any data it

00:12:09.633 owns that are not returned, are destroyed.

 

00:12:13.000 We see this when we try to

00:12:14.500 compile the code. The main() function can’t

00:12:17.500 access the local variable, a, to print

00:12:19.766 its length, since it doesn’t own the

00:12:21.500 value anymore. It gave ownership to the

00:12:25.100 consume() function, and that function destroyed the

00:12:27.233 value and didn’t pass it back.

 

00:12:29.733 Accordingly, it’s no longer accessible in main()

00:12:32.500 and the code won’t compile.
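
A sketch of the kind of code described (the slide's exact types aren't shown here, so a String is used for illustration). It deliberately fails to compile.

```rust
fn consume(x: String) {
    println!("consumed: {}", x);
}   // x goes out of scope here, and the String's memory is freed

fn main() {
    let a = String::from("hello");
    consume(a);                 // ownership of a moves into consume()
    println!("{}", a.len());    // rejected: a was moved, and main() no longer owns it
}
```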

 

00:12:36.566 That concludes this introduction to region-based

00:12:39.066 memory management.

 

00:12:40.900 The key insight is that if every

00:12:43.200 value has a unique owner, and the

00:12:45.666 compiler tracks ownership and the lifetime of

00:12:48.233 those values, then every value can be

00:12:50.700 automatically freed at the end of its lifetime.

 

00:12:54.066 This is the key difference between Rust

00:12:56.400 and other programming languages. The Rust compiler

00:12:59.533 knows when a value has been consumed,

00:13:01.966 when it’s reached the end of its

00:13:04.100 lifetime, and will enforce this at compile time.

 

00:13:07.000 Other languages either don’t track lifetimes,

00:13:09.700 like C, and potentially allow use after

00:13:12.533 free bugs, dangling references, and so on.

 

00:13:15.900 Or they use some form of garbage

00:13:18.166 collection, to track liveness of data at

00:13:20.200 run time. They’re either unsafe, or have

00:13:23.533 run time overhead.

 

00:13:26.000 Rust is both safe, and avoids the

00:13:28.433 run-time overheads.

 

00:13:30.000 But to achieve this, Rust restricts the

00:13:32.300 set of programs that can be expressed,

00:13:34.600 as we’ll see in the next part of this lecture.

Part 4: Resource Management

The final part of the lecture discusses how region-based memory management can be extended to include resource management. It further expands on the concept of ownership tracking, introduced in the previous part of the lecture, and shows how Rust supports borrowing of data to make region-based memory management feasible. It outlines the limitations of region-based memory management and how it limits the classes of program that can be written, and discusses the trade-off this imposes. And it reviews how region-based management can be used to enforce deterministic resource clean-up.

Slides for part 4

 

00:00:00.366 In this final part of the lecture,

00:00:02.600 I’ll talk about how Rust extends ownership

00:00:05.166 with the idea of borrowing, and how

00:00:07.333 this can apply to management of resources

00:00:09.233 other than memory.

 

00:00:12.000 We saw in the previous part that

00:00:14.000 functions can pass ownership of data between

00:00:16.433 them. We saw how functions take ownership

00:00:19.533 of data they’re passed, and how they

00:00:22.033 can return ownership of data to their caller.

 

00:00:24.600 We discussed how this relates to memory

00:00:26.533 management, with memory being freed when an

00:00:28.666 object goes out of scope at the

00:00:30.133 end of a function. And, we showed

00:00:32.166 how, with the aid of the single

00:00:33.633 ownership rules, Rust can leverage this to

00:00:36.433 provide an automatic region-based memory management scheme.

 

00:00:40.833 This can work, but repeatedly passing and

00:00:43.700 returning ownership of data to and from

00:00:46.233 functions is inconvenient. To make this easier,

00:00:49.966 Rust augments the ownership rules with the

00:00:52.166 idea of borrowed data.

 

00:00:55.666 Functions in Rust can take references to

00:00:58.133 data as parameters. For example, in the

00:01:01.700 slide we see the function borrow() that

00:01:04.100 takes a mutable reference to a vector,

00:01:06.700 x, as its parameter, and modifies the

00:01:09.766 vector, by pushing a new element onto the end.

 

00:01:13.700 In the main() function, we create a

00:01:15.866 vector, add some data to it,

00:01:18.033 call the borrow() function, and print the

00:01:19.900 length of the vector.

 

00:01:22.433 Unlike the case we saw in the

00:01:24.166 last part, where the function consumed the

00:01:26.366 vector, in this case the vector is

00:01:28.633 still accessible after the call to borrow().

 

00:01:31.966 Why is this?

 

00:01:34.000 Well, because borrow() is passed data by reference.

 

00:01:38.166 The borrow() function is passed, takes ownership

00:01:41.766 of, a reference to the vector.

 

00:01:44.066 It doesn’t take ownership of the vector.

 

00:01:46.933 When the borrow() function returns, the lifetime

00:01:50.200 of the parameter x ends, and it

00:01:52.466 is reclaimed in the usual way.

 

00:01:55.300 The function has taken ownership of that

00:01:57.033 parameter, and hasn’t returned ownership to the

00:01:59.200 caller, so the data is freed.

 

00:02:02.233 But, what it took ownership of was

00:02:04.266 a reference to the vector, not the

00:02:06.233 vector itself. So what is freed is

00:02:09.066 the reference, not the vector. Ownership of

00:02:12.666 the vector remains with the main() function at all times.

 

00:02:16.433 This is known as borrowing a reference.
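
A sketch of the kind of code described on the slide (reconstructed for illustration, not copied from it):

```rust
fn borrow(x: &mut Vec<i32>) {
    x.push(4);   // modify the vector through the borrowed reference
}

fn main() {
    let mut v = vec![1, 2, 3];
    borrow(&mut v);                  // lend the vector; ownership stays in main()
    println!("len = {}", v.len());   // prints 4: the vector is still accessible
}
```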

 

00:02:20.533 It’s not shown on the slide,

00:02:22.233 but a function that takes a reference

00:02:23.800 as a parameter can return that reference

00:02:25.800 to the caller. This is safe,

00:02:28.566 because the reference has to be accessible

00:02:30.500 to the caller – otherwise, how could

00:02:32.200 it pass it to the function –

00:02:33.700 so the returned reference will also be accessible.

 

00:02:38.700 In this example, the borrow() function changes

00:02:41.466 the contents of the vector. It pushes

00:02:44.433 a new element onto the end of the vector.

 

00:02:47.466 And in this case, it’s safe for it to do so.

 

00:02:51.000 But this is not always safe.

 

00:02:54.000 For example, if main() was iterating over

00:02:56.533 the contents of the vector, and passed

00:02:59.033 a mutable reference to that vector to

00:03:01.100 a function that modified it, while that

00:03:03.866 iteration was in progress, this might lead

00:03:06.466 to elements being skipped or duplicated,

00:03:09.000 or to a result being calculated

00:03:11.066 with inconsistent data.

 

00:03:13.866 This is a problem known as iterator invalidation.

 

00:03:18.800 To avoid iterator invalidation, and other problems,

00:03:22.166 references in Rust come in two different

00:03:24.333 types, and have restrictions on how they can be used.

 

00:03:29.066 Rust has two different types of pointer;

00:03:31.333 two different types of reference.

 

00:03:34.100 The first is written &T, and is

00:03:37.366 a shared reference to an immutable object

00:03:39.833 of type T.

 

00:03:42.133 The second is written &mut T,

00:03:45.633 and is a unique reference to a

00:03:47.833 mutable object of type T.

 

00:03:50.866 The Rust compiler and runtime system work

00:03:53.366 together to control how references can be

00:03:55.166 used, and to track ownership of references

00:03:58.000 and the referenced values.

 

00:04:01.000 There are three fundamental rules.

 

00:04:04.566 An object of type T can be

00:04:07.000 referenced by one or more references of

00:04:09.800 type &T. Or it can be referenced

00:04:13.333 by exactly one reference of type &mut T.

 

00:04:16.533 But it’s not possible to have

00:04:18.333 both mutable and immutable references to the

00:04:20.766 same object.

 

00:04:23.000 If an object is defined to be

00:04:24.966 immutable, it’s not possible to take a

00:04:27.466 mutable reference to it.

 

00:04:30.000 And if an object is defined to

00:04:31.800 be mutable, then taking an immutable reference

00:04:34.666 to that object makes it immutable for

00:04:36.700 the duration of the immutable reference.
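
A small sketch of these rules in practice (not from the slides): any number of shared references is fine, and a single mutable reference is fine, but the two cannot overlap.

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    let r1 = &v;   // any number of shared, immutable references...
    let r2 = &v;
    println!("{} {}", r1.len(), r2.len());

    let m = &mut v;   // ...or exactly one mutable reference, once the
    m.push(4);        // shared references are no longer in use
    println!("{}", m.len());

    // Taking &mut v while r1 or r2 were still live above would be
    // rejected by the compiler.
}
```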

 

00:04:41.066 These restrictions complicate how pointers work in

00:04:43.800 Rust. And they limit the set of

00:04:46.333 programs it’s possible to write.

 

00:04:49.000 But they allow functions to safely borrow

00:04:51.500 objects, without needing to give away ownership.

 

00:04:55.933 In a Rust program, to be able

00:04:58.300 to change an object, you must either

00:05:00.600 own the object, and it must not be

00:05:02.333 marked as immutable.

 

00:05:04.000 Or you must own the only &mut

00:05:06.433 reference to it.

 

00:05:09.000 These rules prevent iterator invalidation.

 

00:05:12.966 Iterators in Rust are designed to take

00:05:15.066 an immutable reference to the object over

00:05:17.266 which they iterate. This guarantees that the

00:05:20.266 object can’t change, and that no mutable

00:05:22.733 references to the object can exist.

 

00:05:25.966 If the object being iterated over cannot

00:05:27.933 change, the iterator cannot be invalidated.

 

00:05:31.666 The compiler checks and enforces these rules.
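
A minimal sketch of this (not slide code); it deliberately fails to compile, because the loop's iterator holds a shared borrow of the vector for the duration of the loop.

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    for x in &v {       // the iterator borrows v immutably
        v.push(*x);     // rejected: v cannot be borrowed mutably while iterating
    }
}
```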

 

00:05:37.000 By tracking ownership, and controlling how pointers

00:05:40.633 are used, Rust turns various issues that

00:05:43.466 are run-time bugs in other languages into

00:05:46.233 compile-time errors.

 

00:05:49.000 Rust prevents use-after-free bugs by making it

00:05:52.433 impossible for returned references to data to

00:05:55.266 outlive the data being referenced.

 

00:05:58.000 Rust prevents iterator invalidation, by making it

00:06:01.100 impossible to change the object being iterated

00:06:03.500 over whilst the iterator exists.

 

00:06:06.500 And Rust prevents race conditions between multiple

00:06:09.000 threads, because the rules about pointers make

00:06:11.366 it impossible for two threads to each

00:06:13.533 have a mutable reference to the same object.

 

00:06:16.900 All these behaviours are checked at compile time.

 

00:06:21.466 And Rust has efficient run-time behaviour.

 

00:06:25.000 The Rust compiler generates exactly the same

00:06:27.533 code to allocate and free memory as

00:06:29.766 would a correctly written C program using

00:06:31.866 malloc() and free(). The difference is that

00:06:35.166 the compiler ensures the malloc() and free()

00:06:37.733 calls are in the correct places.

 

00:06:40.000 As a result the timing and memory

00:06:43.100 usage of Rust code is as predictable

00:06:45.833 as a correctly written C program.

 

00:06:48.866 Rust is deterministic in when memory is

00:06:51.100 both allocated and freed.

 

00:06:55.766 The region-based approach to memory management used

00:06:58.233 by Rust has some limitations though.

 

00:07:02.000 Primarily, Rust ensures correctness by limiting the

00:07:05.266 types of program it’s possible to write.

 

00:07:08.666 The rules about ownership and borrowing

00:07:10.800 make it impossible to express certain

00:07:12.833 data structures in safe Rust.

 

00:07:15.600 For example, it’s not possible to write

00:07:18.366 code that uses data structures that contain

00:07:20.500 reference cycles.

 

00:07:23.000 The canonical example of this, is that

00:07:25.300 it’s not possible to write a doubly

00:07:26.933 linked list in safe Rust. If you

00:07:30.100 look at the example on the slide,

00:07:32.166 where you have a list containing elements

00:07:33.833 a, b, and c, you’ll see that

00:07:36.366 it’s not possible to add an element,

00:07:37.933 d, to the end.

 

00:07:40.266 You can take an immutable reference to

00:07:42.133 element c, and add it to element

00:07:44.333 d. This works, since Rust allows two

00:07:47.533 immutable references to c to exist –

00:07:50.266 one from element b and one from element d.

 

00:07:54.000 What you can’t do, though, is take

00:07:56.500 a mutable reference to element c,

00:07:58.200 in order to modify it to add

00:07:59.900 a reference to element d. This is

00:08:03.000 because there’s already an immutable reference to

00:08:05.266 element c, held by element b,

00:08:07.966 and you can’t have both mutable and

00:08:09.733 immutable references to the same object.

 

00:08:13.133 The restrictions on references that prevent race

00:08:15.833 conditions, iterator invalidation, and so on,

00:08:19.000 also prevent cyclic data structures.

 

00:08:22.000 And it’s not just doubly linked lists.

 

00:08:25.000 They prevent any data structure that contains

00:08:27.266 a loop of pointers.

 

00:08:29.533 It’s a fundamental limitation of the way

00:08:31.466 references work in Rust. It trades expressive

00:08:34.933 power for safety.

 

00:08:38.000 The designers of Rust recognised this,

00:08:40.100 and added an escape hatch.

 

00:08:43.000 Rust also has a third type of

00:08:45.133 reference, known as the raw pointer.

 

00:08:48.700 Raw pointers work just like pointers in

00:08:50.900 C. They allow you to circumvent the

00:08:53.533 restrictions that Rust imposes on mutable and

00:08:56.066 immutable references, and make it possible to

00:08:59.366 write cyclic data structures in exactly the

00:09:01.666 same way you would do so in C.

 

00:09:04.333 Of course, by allowing you to circumvent

00:09:06.666 these restrictions, you lose the safety guarantees

00:09:09.400 of Rust. A Rust program that uses

00:09:12.733 raw pointers is as likely to suffer

00:09:14.766 from memory safety issues, such as use

00:09:17.200 after free bugs, iterator invalidation, and race

00:09:20.000 conditions, as is a C program.

 

00:09:24.000 To make this clear, Rust requires use

00:09:27.033 of raw pointers to be explicitly labelled

00:09:29.200 as unsafe in the code. It warns

00:09:32.366 the programmer, and makes it easy to find such uses.
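
A minimal sketch of a raw pointer in use (not taken from the lecture): creating the raw pointer is allowed in safe code, but dereferencing it must sit inside an unsafe block, which makes such code easy to locate and audit.

    fn main() {
        let mut value = 42u32;

        // Creating a raw pointer is allowed in safe code...
        let p: *mut u32 = &mut value;

        // ...but dereferencing it is not: the compiler no longer enforces the
        // borrowing rules here, so correctness is the programmer's responsibility.
        unsafe {
            *p += 1;
            println!("via raw pointer: {}", *p);
        }

        println!("value is now {}", value);
    }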

 

00:09:36.633 There’s a second limitation of safe Rust,

00:09:39.366 which is that it can’t express shared

00:09:41.266 ownership of mutable data.

 

00:09:44.000 Shared ownership of immutable data is straightforward.

 

00:09:48.200 It’s okay for many different references to

00:09:50.600 point to an object that cannot change.

 

00:09:53.466 Shared ownership of mutable data is problematic,

00:09:56.466 though, because it potentially opens up race

00:09:59.066 conditions. This is why Rust only allows

00:10:02.300 a single mutable reference to a value

00:10:04.200 to exist at once.

 

00:10:06.866 But occasionally, if rarely, you need shared mutable state.

 

00:10:12.066 If you do, Rust has a RefCell-of-T

00:10:15.133 type. This wraps some value of type

00:10:18.400 T, and dynamically enforces the borrowing rules.

 

00:10:22.100 That is, it allows callers to borrow,

00:10:24.800 or mutably borrow, the wrapped value,

00:10:27.600 and enforces at run time that there

00:10:29.833 can only be immutable borrows, or a

00:10:32.366 single mutable borrow, but not both.

 

00:10:35.666 The RefCell type provides essentially the same

00:10:38.700 guarantees as the regular Rust borrowing rules,

00:10:41.366 but enforced at run time rather than

00:10:43.633 at compile time. This means that attempts

00:10:46.533 to take, for example, several mutable references

00:10:49.366 to the wrapped object, will cause a

00:10:51.366 panic when the code executes, rather than

00:10:53.666 failing to compile.

 

00:10:56.000 The RefCell is safe,

00:10:57.633 in that it never causes undefined behaviour,

00:11:00.433 but it can cause a run-time failure.
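
A minimal sketch of RefCell in use (not from the slides); in practice it is often wrapped in an Rc to give shared ownership of the mutable value, but a bare RefCell is enough to show the run-time checks:

    use std::cell::RefCell;

    fn main() {
        let cell = RefCell::new(5);

        {
            let a = cell.borrow();          // first immutable borrow
            let b = cell.borrow();          // a second immutable borrow is fine
            println!("{} {}", *a, *b);
        }                                   // both borrows end here

        *cell.borrow_mut() += 1;            // a single mutable borrow is fine

        let _shared = cell.borrow();
        // cell.borrow_mut();               // would panic at run time: already borrowed

        println!("{}", cell.borrow());
    }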

 

00:11:05.266 Possibly the biggest issue with region-based memory

00:11:08.200 management, though, is that it forces the

00:11:10.866 programmer to consider object ownership explicitly,

00:11:13.966 and early in the design.

 

00:11:17.000 To what extent this is a problem

00:11:19.233 depends, I think, on your background.

 

00:11:22.500 If you’re experienced at programming in C,

00:11:24.866 this tends to be a non-issue.

 

00:11:27.500 Writing correct C programs also requires careful

00:11:30.633 consideration of data ownership, to know what

00:11:33.366 functions malloc() data and what functions free()

00:11:35.966 that data. Rust essentially codifies and enforces

00:11:39.433 the rules that well-written C programs tend

00:11:42.100 to adopt anyway.

 

00:11:44.700 If your experience is instead with a

00:11:46.500 language like Python or Java, where the

00:11:49.366 use of a garbage collector or reference

00:11:51.266 counting hides a lot of the complexity

00:11:53.166 of memory management, then needing to start

00:11:55.833 thinking about ownership is more of a

00:11:57.666 burden. In this case, programming effectively in

00:12:01.400 Rust may require a shift in your thinking.

 

00:12:06.733 That concludes our introduction to region-based

00:12:09.100 memory management.

 

00:12:10.666 This is one of the more unusual features of Rust.

 

00:12:14.566 It provides an efficient

00:12:16.166 and predictable way of managing memory,

00:12:18.266 and offers strong correctness guarantees that can

00:12:20.566 prevent many common bugs.

 

00:12:23.300 But it does so by constraining the

00:12:25.333 types of program that can be written,

00:12:27.000 and by making the programmer explicitly consider

00:12:29.300 data ownership.

 

00:12:31.233 It’s a trade-off that works well for

00:12:33.233 some types of problem, but less well for others.

 

00:12:39.300 The ownership rules of Rust can also

00:12:41.766 be useful for resource management.

 

00:12:46.000 As we’ve seen, Rust tracks data ownership,

00:12:49.166 and deterministically frees heap-allocated memory when

00:12:52.300 the owner of that memory goes out of scope.

 

00:12:55.533 Particular types can use this ownership tracking

00:12:58.166 to implement custom destructors that provide deterministic

00:13:01.366 clean-up of resources they own.

 

00:13:04.433 If a type implements the Drop trait,

00:13:06.766 then the runtime will call the drop()

00:13:08.800 method on instances of that type at

00:13:10.966 the end of their lifetime, when they

00:13:12.700 go out of scope. This allows those

00:13:15.900 instances to clean up after themselves.

 

00:13:19.000 For example, the File type implements the

00:13:21.866 Drop trait to close the underlying file

00:13:24.366 when file objects go out of scope.

 

00:13:28.000 Python has special syntax for this,

00:13:30.200 as shown on the slide. In Rust,

00:13:33.466 much like in C++, the cleanup happens

00:13:36.700 naturally when the object goes out of

00:13:38.300 scope, without the explicit syntax.
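
A minimal sketch of the Drop trait; the Logger type and its message are invented for illustration, but the standard library's File implements Drop in the same spirit:

    // The Logger type and its message are made up for illustration.
    struct Logger {
        name: String,
    }

    impl Drop for Logger {
        // drop() is called automatically when a Logger's lifetime ends.
        fn drop(&mut self) {
            println!("closing logger '{}'", self.name);
        }
    }

    fn main() {
        let _log = Logger { name: String::from("example") };
        println!("doing some work");
    }   // _log goes out of scope here, so drop() runs and the message is printed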

 

00:13:43.000 Finally, as we saw in the discussion

00:13:45.666 of struct-based state machines in Lecture 4,

00:13:48.866 ownership tracking allows state transitions to be enforced.

 

00:13:53.500 A state machine can be defined using

00:13:55.666 structs to represent the state, and to

00:13:58.333 hold any needed state variables. And state

00:14:01.433 transitions can be implemented by defining functions

00:14:04.033 on those structs that consume self,

00:14:06.166 and return a new state.

 

00:14:08.966 The slide shows an example, where methods

00:14:11.966 login() and disconnect() are implemented on an

00:14:15.066 UnauthenticatedConnection struct.

 

00:14:18.366 Note that both of these take self

00:14:20.866 as their first parameter, not &self.

 

00:14:24.066 That is, they take ownership of the

00:14:26.400 struct on which they’re implemented, rather than

00:14:28.600 borrow that struct.

 

00:14:31.300 This means they consume self. They destroy

00:14:34.333 the object on which they’re called.

 

00:14:37.300 That is, they force the object representing

00:14:39.600 the state to be destroyed. And they

00:14:42.266 return ownership of a new object,

00:14:44.333 representing the new state of the system.

 

00:14:47.566 This forces the cleanup of any data

00:14:49.600 held by that state that’s not explicitly

00:14:51.700 copied to the new state. It enforces

00:14:54.500 a clean state transition.
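
A minimal sketch of this pattern, with hypothetical UnauthenticatedConnection and AuthenticatedConnection types standing in for the slide's example:

    struct UnauthenticatedConnection { server: String }
    struct AuthenticatedConnection   { server: String, user: String }

    impl UnauthenticatedConnection {
        // Taking `self` rather than `&self` consumes the old state...
        fn login(self, user: &str) -> AuthenticatedConnection {
            // ...and returns ownership of a value representing the new state.
            AuthenticatedConnection { server: self.server, user: user.to_string() }
        }

        fn disconnect(self) {
            // self is dropped here, cleaning up the unauthenticated state.
        }
    }

    fn main() {
        let conn = UnauthenticatedConnection { server: String::from("example.org") };
        let session = conn.login("alice");
        // conn.disconnect();   // error[E0382]: use of moved value: `conn`

        println!("{} is logged in to {}", session.user, session.server);

        let other = UnauthenticatedConnection { server: String::from("example.net") };
        other.disconnect();      // consumes `other`; its state is dropped here
    }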

 

00:14:57.600 The linked blog post discusses these

00:14:59.666 issues further, and I encourage you to read it.

 

00:15:04.766 That concludes the first part of our

00:15:06.466 discussion of memory and resource management.

 

00:15:09.866 We’ve looked at how processes are stored in memory.

 

00:15:12.466 How reference counting works,

00:15:14.333 and what its strengths and weaknesses are.

 

00:15:17.033 And how region-based memory management is

00:15:19.133 used in Rust as an alternative.

 

00:15:22.333 In the next lecture,

00:15:23.566 we’ll move on to discuss garbage collection.

Discussion

Lecture 5 focussed on resource ownership and memory management. It started by discussing how a process is laid out in memory, and what memory needs to be managed. Then, it moved on to discuss the problem of, and need for, automatic memory management in general terms, highlighting that systems programmers tend to distrust automatic memory management, yet suffer from the problems due to manual memory management.

The lecture then discussed reference counting. This is widely used in both scripting languages, such as Python or Ruby, and in the Objective-C runtime used on Apple's iPhone and Mac operating systems. It noted that reference counting is relatively easy to understand, incremental, and predictable, but suffers from relatively high overhead with small or short-lived objects, and cannot collect cyclic data structures.

The lecture then focussed on the region-based memory management scheme used by Rust. It noted that the language tracks ownership of data as it moves between functions, and deterministically cleans up heap-allocated memory when the lifetime of an object ends. Region-based memory management achieves good, predictable performance, and provides a number of nice guarantees around the absence of race conditions, use-after-free bugs, iterator invalidation, and so on. But it comes at the cost of making the type system considerably more complex, and of limiting the types of programs that can be expressed.

Discussion will focus, primarily, on how the region based memory management scheme works, and on its advantages and disadvantages. The goal is to understand whether you think the benefits outweigh the complexity introduced.