csperkins.org

Advanced Systems Programming H (2021-2022)

Lecture 2: Systems Programming

Lecture 2 is an introduction to systems programming. It discusses what is meant by systems programming, what is the state of the art, and how changes in the environment are driving changes in systems and the way we write systems programs. It concludes with a discussion of how systems programming might evolve, to set the scene for the remainder of the course.

Part 1: Systems Programming

The 1st part of this lecture reviews what systems programming is. It talks about the tendency of systems programs to be constrained by memory performance and by the need to conform to external data representations. It highlights how I/O performance is frequently a bottleneck, and it discusses the need for efficient management of shared state. Background reading is highlighted that considers these issues in more depth.

Slides for part 1

 

00:00:00.333 In this lecture, I’d like to introduce

00:00:02.766 the concept of systems programming,

00:00:04.566 and explain what makes it a uniquely challenging problem.

 

00:00:09.500 I’ll start by explaining what systems programming is,

00:00:12.533 and what makes it different to applications programming.

 

00:00:15.966 I’ll briefly discuss the state of the art

00:00:18.000 in systems programming languages and operating systems.

 

00:00:21.266 Then, I’ll move on to highlight some challenges

00:00:23.600 and limitations that affect these state of the art systems.

 

00:00:27.033 Finally, I’ll conclude by discussing some possible

00:00:29.633 next steps in systems programming,

00:00:31.600 to set the scene for the remainder of the course.

 

00:00:35.666 So, what are systems programs?

 

00:00:38.400 They’re the programs that form the infrastructure.

00:00:41.200 They comprise the operating systems,

00:00:43.533 device drivers, network protocol stacks, and services

00:00:47.533 on which the rest of the computing ecosystem is built.

 

00:00:51.633 Systems programs tend to be characterised by

00:00:54.233 strict performance constraints.

00:00:56.100 They must effectively manage memory,

00:00:58.533 and achieve high performance,

00:01:00.133 while matching externally-specified constraints

00:01:02.500 on the data representation.

 

00:01:04.466 Authors of device drivers and

00:01:06.533 network protocol implementations

00:01:08.566 don’t have the luxury of

00:01:09.900 choosing their own data representation:

00:01:12.066 they must match externally specified formats exactly.

 

00:01:15.400 And they must do so while achieving high performance.

 

00:01:18.800 This rules out surprisingly many programming languages

00:01:22.100 and environments, since they don’t provide such control.

 

00:01:25.800 It’s difficult to write high-performance systems

00:01:28.333 programs in Java or Python, for example,

00:01:30.866 since they don’t give you control over

00:01:32.966 how data is laid out in memory.

 

00:01:35.566 Languages like C, C++, and Rust do give such control.

 

00:01:42.400 Systems programs need to manage I/O efficiently.

 

00:01:46.100 In part, this is a continuation of the

00:01:48.366 previous point about data layout.

00:01:50.666 If the internal and external representations

00:01:53.366 of data differ,

00:01:54.333 then the format conversion hurts performance.

 

00:01:57.533 But it’s also that I/O performance of systems

00:02:00.000 programs is fundamental.

 

00:02:01.633 If a single application is slow,

00:02:04.100 then it can be replaced with minimal impact

00:02:05.866 on the rest of the system.

00:02:08.300 If the underlying operating system is slow,

00:02:10.666 the entire system is affected.

 

00:02:14.133 Systems programs also need to effectively

00:02:16.566 manage shared state.

 

00:02:18.933 Modern hardware is increasingly multicore,

00:02:21.533 and modern applications are concurrent.

 

00:02:24.400 Operating system resources, device drivers,

00:02:27.466 network protocols, and other libraries

00:02:30.033 are always being accessed by multiple processes,

00:02:32.733 and in multiple contexts, at once.

 

00:02:35.366 Race conditions, deadlocks,

00:02:37.766 and poor performance at this level

00:02:39.733 affect everything running on the system.

00:02:42.200 It’s critical to making effective use

00:02:44.633 of the available resources.

 

00:02:50.633 Memory management for systems programs must be predictable.

 

00:02:54.366 There must be predictable bounds on memory usage.

00:02:57.733 The memory needed by the system should scale

00:03:00.466 with the tasks it’s performing,

00:03:01.866 and overheads should be small compared to the data.

 

00:03:06.200 It should be clear how much memory is used,

00:03:08.800 and when that memory is used,

00:03:10.900 so that the system as a whole can be understood.

 

00:03:14.233 This is especially important

00:03:16.033 as real-time applications become more common,

00:03:18.366 since unpredictability in memory management

00:03:20.966 can disrupt application timing performance.

 

00:03:24.066 Consistent and bounded memory usage

00:03:26.533 is beneficial overall though,

00:03:28.400 especially as relatively memory constrained

00:03:31.433 devices, such as IoT devices, wearables,

00:03:34.200 and smartphones, proliferate.

 

00:03:38.200 Data representation and locality matters for performance.

 

00:03:43.400 The diagram shows the effects of object sizes

00:03:46.366 and their spacing, known as the stride distance,

00:03:49.400 on read performance for a typical modern system.

 

00:03:52.566 We see that as the size of the objects being read increases,

00:03:56.366 the read throughput drops.

00:03:58.533 Similarly, as the stride increases,

00:04:01.100 that is, as objects are spaced further apart in memory,

00:04:04.100 throughput drops.

 

00:04:05.600 This drop in throughput can be dramatic:

00:04:08.400 the worst case shows several orders of magnitude performance

00:04:11.466 loss compared to the peak.

 

00:04:13.666 The cause is the CPU cache.

 

00:04:16.800 Modern processors can access small objects,

00:04:19.433 located close to each other in memory,

00:04:21.133 extremely quickly.

 

00:04:23.066 Fetching larger objects into the cache is slower.

 

00:04:26.400 And fetching widely, or irregularly,

00:04:28.700 spaced objects from memory is slower still.

 

00:04:32.266 Effective systems programs arrange data in memory

00:04:35.500 such that related objects are co-located,

00:04:37.933 to achieve high performance.

00:04:39.966 This requires a programming language

00:04:42.033 that gives precise control over data layout.
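
As a small illustration of what this control means in practice (the struct and function here are invented for this transcript, not taken from the slides), C lets related values be packed contiguously in a single array of structs, so a scan over them walks memory sequentially:

    #include <stddef.h>

    /* Illustrative sketch only: an array of structs keeps each sample's
     * fields adjacent, and the samples themselves contiguous in memory,
     * so iterating over them touches memory sequentially rather than
     * chasing pointers to scattered heap objects. */
    struct sample {
        float x, y, z;
    };

    float sum_x(const struct sample *samples, size_t n) {
        float total = 0.0f;
        for (size_t i = 0; i < n; i++) {
            total += samples[i].x;   /* sequential, cache-friendly access */
        }
        return total;
    }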

 

00:04:45.500 Similarly, device drivers and network

00:04:48.466 protocol implementations must conform

00:04:50.500 to fixed external layouts.

 

00:04:52.733 Whether matching the structure

00:04:54.266 of a hardware control register,

00:04:55.966 or the format of a standard network protocol data unit,

 

00:04:59.200 such systems must have exact

00:05:01.133 bit-level control over the data representation.

 

00:05:05.266 The key feature of systems programming languages

00:05:08.233 is that they provide such control over data layout

00:05:10.966 and memory management.

 

00:05:12.366 This introduces complexity.

00:05:14.700 Pointers in C are undoubtedly harder to

00:05:17.533 learn than objects in Java, for example.

 

00:05:20.466 But it’s needed if performance is to be maintained.

 

00:05:24.966 The second constraint on systems programs

00:05:27.933 tends to be I/O performance.

 

00:05:30.266 The chart shows how the transmission rate

00:05:32.600 of Ethernet has increased over time,

00:05:34.966 from the initial 1Mbps data rate,

00:05:37.700 to 40 Gbps parts,

00:05:40.133 and with 100 Gbps Ethernet becoming available.

 

00:05:44.266 It’s clear that the transmission rate of Ethernet,

00:05:46.666 in bits per second,

00:05:48.100 is still growing at something close to an exponential rate.

 

00:05:52.300 While this has been happening,

00:05:53.900 the maximum Ethernet packet size, the MTU,

00:05:57.133 has remained constant at 1500 bytes.

 

00:06:00.433 The result is that the number of packets sent per second

00:06:03.100 has increased exponentially.

 

00:06:05.766 Wireless link speeds follow a similar curve.

 

00:06:09.533 This is becoming problematic,

00:06:11.400 because single core CPU performance

00:06:13.633 peaked in the mid-2000s.

 

00:06:16.000 For over a decade now,

00:06:17.700 the number of network packets that must be processed

00:06:20.333 per second has been increasing,

00:06:22.033 while performance of the processors that must

00:06:24.400 handle those packets has improved little.

00:06:27.000 The result is that packets must be processed

00:06:29.666 more efficiently than ever before.

 

00:06:31.666 The system has less time to spend

00:06:33.800 processing each packet,

00:06:35.266 since it must process more packets

00:06:37.166 with approximately the same amount of resources.
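
To put rough numbers on this: a saturated 100 Gbps link carrying 1500-byte packets delivers on the order of 100×10⁹ ÷ (1500×8) ≈ 8.3 million packets per second, leaving only around 120 nanoseconds of processing time per packet on a single core.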

 

00:06:40.700 To make matters worse,

00:06:42.233 while this has been happening,

00:06:43.566 we’ve also seen the move to encrypt virtually

00:06:45.833 all of the data passing over the network,

00:06:48.200 to improve security,

00:06:50.133 and the increasing use of streaming and interactive video.

 

00:06:53.633 Both of these increase the amount of work

00:06:55.600 that must be performed for each packet.

 

00:06:58.800 The combination of more work per packet,

00:07:02.000 more packets per second, and processors

00:07:04.366 that are not getting much faster,

00:07:06.333 makes writing network device drivers,

00:07:08.566 protocol stack implementations,

00:07:10.500 and networked applications, increasingly difficult.

 

00:07:14.733 Similar trends exist for solid state disk performance.

 

00:07:18.833 The I/O stack, network and disk,

00:07:21.200 is fast becoming a critical bottleneck

00:07:23.233 in overall system performance.

 

00:07:25.800 Systems software to support I/O has to be highly efficient.

 

00:07:31.866 The final constraint on systems programs

00:07:34.100 is management of shared state.

 

00:07:36.566 Systems programs are responsible for coordinating

00:07:39.133 use of shared resources

00:07:40.766 between the operating system

00:07:42.500 and the applications running on the system.

 

00:07:44.833 They manage the internal state

00:07:46.733 of the operating system kernel,

00:07:48.600 the file systems,

00:07:49.900 network code, device drivers, etc.

 

00:07:53.033 All this is accessed concurrently by multiple threads,

00:07:56.033 as different applications access operating system resources.

 

00:08:01.066 Applications programs can be written in a way that

00:08:03.500 ignores concurrency,

00:08:05.033 and uses only a single thread of execution,

00:08:07.500 if the application programmer so chooses.

 

00:08:10.633 The systems programs that support those applications

00:08:13.600 don’t have that option.

 

00:08:15.100 They must support multiple applications running at once,

00:08:18.133 on multicore hardware.

 

00:08:20.466 Concurrency and parallelism

00:08:22.566 are pervasive concerns for systems programs,

00:08:24.900 and cannot be avoided.

 

00:08:28.600 What is common to all these concerns

00:08:30.900 is that performance of the systems programs

00:08:33.166 comprising the infrastructure

00:08:34.766 fundamentally affects the overall system performance.

 

00:08:38.800 Inefficiencies in systems programs

00:08:41.266 will reduce performance,

00:08:42.733 and increase power consumption of the entire system,

 

00:08:45.733 irrespective of the application it’s running.

 

00:08:48.833 This means that systems components are often the bottleneck,

00:08:51.866 simply because they’re the basis

00:08:53.433 on which higher-layer components depend.

 

00:08:57.433 These trends push systems programming languages

00:09:00.466 to offer low-level control.

 

00:09:02.633 To achieve the necessary performance and efficiency,

00:09:05.466 it’s essential that the program be able to control

00:09:07.900 the layout of data in memory,

00:09:09.733 to manage when, and in what order, data is accessed,

00:09:13.366 and to have precise control of when,

00:09:15.566 and how, data is shared.

 

00:09:18.066 This is why languages, such as C and C++,

00:09:21.133 that give such low-level control,

00:09:23.200 remain popular for systems programming.

 

00:09:26.500 A C programmer can, if they choose,

00:09:29.200 control precisely the layout of data in memory,

00:09:32.166 and they have the flexibility to choose exactly when

00:09:34.766 and how data is shared between threads.

 

00:09:37.566 They don’t have to exercise that control,

00:09:39.900 but it’s available if necessary.

 

00:09:43.133 Providing this control is a hallmark

00:09:45.266 of systems programming languages.

 

00:09:47.866 It pushes the language, and its runtime,

00:09:50.233 to be more transparent and predictable

00:09:52.366 in how it compiles to machine instructions,

00:09:55.100 and about the costs of various operations,

00:09:58.166 and to favour simpler,

00:09:59.666 more predictable, runtime environments.

 

00:10:03.000 Languages such as Java, Python, and Go,

00:10:05.766 that do not provide such transparency and control,

00:10:08.500 are not well suited to such low-level systems programming.

 

00:10:12.800 The paper listed on the slide,

00:10:14.966 “Programming language challenges in systems codes:

00:10:17.800 why systems programmers still use C,

00:10:20.033 and what to do about it”

00:10:21.700 by Jonathan Shapiro, explores these ideas,

00:10:25.133 and discusses what is systems programming, in more detail.

 

00:10:29.033 I strongly encourage you to read it.

 

00:10:33.000 That concludes this introduction

00:10:34.966 to what makes systems programming challenging.

 

00:10:37.466 In the next part, we’ll briefly discuss

00:10:39.533 the state of the art in systems programming languages

00:10:42.033 and operating systems.

Part 2: The State of the Art

The 2nd part of the lecture considers the state of the art in systems programming. It reviews the history of C and Unix, and briefly highlights their strengths and weaknesses.

Slides for part 2

 

00:00:00.066 What is the state of the art in operating systems

00:00:03.233 and systems programming?

 

00:00:05.733 Most modern computing systems

00:00:08.000 run some variant of Unix as their operating system.

00:00:11.266 That operating system, and many of its

00:00:14.166 supporting utilities and services,

00:00:16.100 are written in the C programming language.

 

00:00:18.866 If we think about the computers we carry with us every day,

00:00:21.933 for example,

00:00:23.233 Android phones run a variant of Linux,

00:00:26.166 and the iPhone operating system is also a Unix variant:

00:00:29.866 it’s derived from a combination of the Mach microkernel,

00:00:33.000 from Carnegie Mellon University,

00:00:35.200 and 4.3BSD Unix,

00:00:37.633 along with a large number of user space utilities

00:00:40.500 adapted from FreeBSD.

 

00:00:42.700 Both feature a Unix-like kernel, runtime,

00:00:46.066 and core system written in C,

00:00:48.400 augmented by user interface toolkits

00:00:50.566 and frameworks written in a higher-level language.

 

00:00:54.966 In the data centre, to support the applications we use,

00:00:58.133 are tens of thousands of Linux servers.

 

00:01:01.133 And on the desktop,

00:01:02.500 macOS uses the same core software stack as the iPhone.

 

00:01:06.266 About the only popular system not

00:01:08.266 based on Unix is Microsoft Windows.

 

00:01:11.000 The Windows kernel is, instead,

00:01:12.900 heavily influenced by VMS,

00:01:15.766 another system that dates to the same era as Unix.

00:01:19.066 It’s also written in C.

 

00:01:22.900 The core of Unix is not a modern design.

 

00:01:25.700 It was originally written in assembly

00:01:27.700 for the PDP-7 minicomputer in 1969,

00:01:31.733 and ported to the PDP-11 in the early 1970s.

 

00:01:36.366 The PDP-11, as shown in the photo,

00:01:39.433 was an impressive machine for its time,

00:01:42.000 capable of supporting large multi-user

00:01:44.600 and multi-tasking applications with up to 124K of memory.

 

00:01:50.566 This more powerful machine allowed the operating system

00:01:53.733 to be rewritten into a new,

00:01:55.800 higher-level, programming language,

00:01:57.800 developed in parallel with Unix, that eventually became C.

 

00:02:02.866 Unix was the first operating system to be written

00:02:05.666 in a high level language, with very little assembly code,

 

00:02:09.266 and this made it highly portable.

00:02:11.766 Indeed, there was a thriving open source community

00:02:14.733 developing Unix in the 1980s and early 1990s,

00:02:18.300 centred around the University of California Berkeley

00:02:21.500 and BSD Unix.

 

00:02:23.933 BSD Unix eventually became a

00:02:26.133 core part of Apple’s operating systems,

00:02:28.900 while Linux development was kick-started by a lawsuit

00:02:31.966 between AT&T, the original developers of Unix,

00:02:35.333 and the University of California,

00:02:37.566 over the status of BSD Unix.

 

00:02:41.466 The design of C and Unix

00:02:43.766 has proven surprisingly resilient and flexible.

 

00:02:46.900 The system calls, APIs, and user space tools

00:02:50.633 have been extended and augmented over the years,

00:02:53.533 but the core APIs, tools, and programming language

00:02:56.933 are recognisably the same.

 

00:02:59.700 Unix clearly did a lot of things right

00:03:02.566 – otherwise it could never have survived for this long,

00:03:04.966 and been adapted to so many different

00:03:07.100 platforms and environments

00:03:08.900 – but is it still the right model?

 

00:03:13.466 Much of the success of Unix came because it was portable,

00:03:17.200 and because the source code was available to universities,

00:03:20.400 allowing a strong developer and user community to form.

 

00:03:24.033 The development of the BSD Sockets API,

00:03:27.266 and one of the earliest TCP/IP stacks,

00:03:29.800 also helped Unix grow along with the Internet.

 

00:03:33.466 But I think there’s more than just community.

 

00:03:36.333 Unix has a small and, for its time,

00:03:39.133 relatively clean and consistent programming API.

00:03:42.800 It’s easy to understand and extend,

00:03:45.300 robust and reasonably high-performance,

00:03:47.966 and it’s transparent and offers low-level control

00:03:51.033 and access when needed.

 

00:03:53.033 The kernel and utilities are, on the whole, well designed.

 

00:03:59.066 Further, the C programming language was portable,

00:04:02.400 easy to use, and offered features that were quite high-level

00:04:05.966 – for its time.

 

00:04:07.933 To programmers coming from writing systems programs

00:04:10.933 in assembly language, the abstraction of

00:04:13.566 pointers in C greatly simplified

00:04:15.766 writing portable device drivers and other low-level code.

 

00:04:19.366 The concept of pointers in C provides a single

00:04:23.533 mechanism to solve many problems.

 

00:04:25.833 The same approach lets you build

00:04:27.900 complex data structures,

00:04:29.500 manipulate data in memory,

00:04:31.433 and access hardware control registers

00:04:34.366 – no other programming language offers such a simple

00:04:37.200 and successful abstraction for such uses.

 

00:04:40.900 Further, the C type system is strong enough

00:04:44.366 to make writing systems programs significantly

00:04:46.633 easier than writing them in assembly,

00:04:48.933 yet weak enough to allow aliasing,

00:04:51.066 casting, and sharing of data

00:04:53.300 in ways that were necessary for performance.

 

00:04:56.500 For example, a zero copy network protocol stack

00:04:59.700 can be cleanly implemented in C,

00:05:01.866 passing pointers to different offsets inside a packet,

00:05:05.200 cast to different types,

00:05:07.133 to the next layer in the stack as the packet is processed.
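
As a rough sketch of the idea (the header layouts below are heavily simplified and hypothetical, and a real stack also has to deal with byte order and alignment), the same packet buffer is viewed through pointers of different types at different offsets, without copying the data:

    #include <stdint.h>
    #include <stddef.h>

    /* Simplified, hypothetical header layouts, for illustration only. */
    struct eth_hdr { uint8_t dst[6]; uint8_t src[6]; uint16_t ethertype; };
    struct ip_hdr  { uint8_t ver_ihl; uint8_t tos; uint16_t total_len; /* ... */ };

    void process_packet(uint8_t *buf, size_t len) {
        /* Interpret the start of the buffer as an Ethernet header, and the
         * bytes immediately following it as an IP header: nothing is copied,
         * each layer just receives a differently-typed pointer into the
         * same packet. */
        struct eth_hdr *eth = (struct eth_hdr *) buf;
        struct ip_hdr  *ip  = (struct ip_hdr  *)(buf + sizeof(struct eth_hdr));
        (void) eth; (void) ip; (void) len;   /* ...hand off to the next layer */
    }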

 

00:05:10.866 Such aliasing and flexibility with types

00:05:13.566 is natural to an assembly language programmer,

00:05:16.166 and can be cleanly expressed in C,

00:05:18.633 but is a very poor fit for most other

00:05:20.900 programming languages.

 

00:05:24.833 The code fragment on the slide gives an example,

00:05:27.600 to show how C can be used to cleanly write

00:05:30.100 portable device drivers.

 

00:05:32.333 There are two parts to this.

 

00:05:34.866 The first is the definition of a data type,

00:05:37.466 struct ctrl_reg,

00:05:39.566 and the second is the definition of the enable_irq()

00:05:42.566 function.

 

00:05:44.333 The struct definition

00:05:46.000 describes the format of a control register

00:05:48.300 for some hardware device.

 

00:05:50.300 It uses the bitfield syntax of C

00:05:53.033 to specify the exact layout of a 16 bit control word.

 

00:05:57.966 The enable_irq() function reads from a control register,

00:06:01.566 r, located at a fixed memory address,

00:06:04.866 0x80000024.

00:06:09.200 It checks if the busy bit,

00:06:11.433 the 5th bit in the control register,

00:06:13.400 is clear, and if so,

00:06:15.466 enables interrupts by setting the irq_enable bit

00:06:18.700 and writing to the control register.

 

00:06:21.633 It’s a simple example,

00:06:23.300 showing the sort of data structure and operation

00:06:25.900 that is common in device drivers,

00:06:27.900 and showing how cleanly, portably,

00:06:30.400 and readably device drivers can be written in C.

 

00:06:34.466 This sort of clean, low-level, access

00:06:37.000 is one of the key strengths of C.
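
The fragment itself isn’t reproduced in this transcript, but a sketch along the lines described above might look as follows (the field names, widths, and bit positions here are illustrative, not the actual slide code):

    #include <stdint.h>

    /* A 16-bit hardware control word, described using C bitfields.
     * Illustrative layout only. */
    struct ctrl_reg {
        uint16_t irq_enable : 1;   /* interrupt enable bit          */
        uint16_t reserved   : 3;
        uint16_t busy       : 1;   /* device busy flag              */
        uint16_t unused     : 11;  /* remainder of the control word */
    };

    void enable_irq(void) {
        /* The control register lives at a fixed physical address. */
        volatile struct ctrl_reg *r = (volatile struct ctrl_reg *) 0x80000024;

        if (!r->busy) {            /* only touch the device when it's idle */
            struct ctrl_reg v = *r;
            v.irq_enable = 1;      /* set the interrupt enable bit...      */
            *r = v;                /* ...and write back the control word   */
        }
    }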

 

00:06:41.200 That explains the benefits,

00:06:43.366 but what’s wrong with Unix and C?

 

00:06:47.333 The Unix networking and filesystem APIs

00:06:50.200 can be performance bottlenecks.

00:06:52.433 They emphasise saving buffer space

00:06:54.666 at the expense of additional system calls

00:06:56.900 in a way that made sense when memory capacity

00:06:59.200 was limited and system calls

00:07:01.000 were not so expensive,

00:07:02.366 but that doesn’t reflect modern hardware.

 

00:07:05.400 The Unix security model

00:07:07.300 is designed to arbitrate access by multiple

00:07:09.733 users to a single expensive machine,

00:07:12.433 not to control access to

00:07:14.033 personal data owned by a single user.

 

00:07:17.066 It can effectively prevent other users

00:07:19.600 on my machine from seeing my personal data,

00:07:22.266 but can’t easily control access for

00:07:24.133 applications I choose to run.

 

00:07:26.500 And Unix has no portable APIs for modern concerns,

00:07:29.900 such as mobility, power management, and so on.

 

00:07:34.900 The C programming language

00:07:36.766 offers a powerful and flexible abstraction

00:07:39.033 for memory access, in the form of pointers,

00:07:42.033 but this is perhaps too powerful.

00:07:44.700 It’s too easy to enable buffer overflows,

00:07:47.166 use-after-free bugs, or off-by-one errors,

00:07:49.933 or to invoke undefined behaviour,

00:07:51.933 and cause a security vulnerability.

 

00:07:54.800 The weak type system offers flexibility,

00:07:58.433 but makes it difficult to reason about

00:08:00.466 the correctness of the system.

 

00:08:02.366 Many bugs happen because the type system is flexible

00:08:05.400 enough to allow a particular design,

00:08:07.666 but not powerful enough to

00:08:08.966 check that design for correctness.

 

00:08:11.933 And C and Unix have limited support for concurrency.

00:08:16.200 Modern versions of C did, eventually,

00:08:18.933 get a well-defined model for when concurrent

00:08:21.000 memory accesses are visible across threads,

00:08:23.533 along with a portable multi-threading library,

00:08:26.533 but these are not well supported.

 

00:08:29.500 And, perhaps more importantly,

00:08:31.766 it’s not clear that multi-threading is the right

00:08:33.900 concurrency abstraction going forward.

 

00:08:37.133 It’s frequently forgotten today,

00:08:40.566 but multi-threading was an extremely controversial

00:08:43.500 addition to C and Unix.

00:08:45.633 The suggested benefits,

00:08:47.266 providing lower-overhead concurrency, were real.

 

00:08:50.600 Unfortunately, so too were the concerns around race

00:08:53.633 conditions and correctness.

 

00:08:58.133 Overall, Unix has proven to be a success.

00:09:01.566 It has some limitations,

00:09:03.533 especially around high-performance I/O,

00:09:05.666 power management, mobility, etc.,

00:09:08.233 but these can mostly be addressed with extensions,

00:09:11.200 retaining the older APIs for backwards compatibility

00:09:14.766 – and we’ll talk more about how this is

00:09:16.600 done later in the course.

 

00:09:19.100 More concerning are the limitations of the security model

00:09:22.033 and package management systems.

 

00:09:24.333 The increasing use of containers, sandboxing,

00:09:27.200 and virtualisation to work around limitations and

00:09:30.333 improve security is a sign that the underlying

00:09:33.000 security and shared library mechanisms

00:09:35.400 are not meeting the current needs.

 

00:09:39.133 The C programming language

00:09:40.933 is increasingly becoming a liability.

00:09:43.166 It offers a flexible approach to memory management,

00:09:46.400 and the pointer abstraction is powerful,

00:09:48.966 but it’s proven too difficult to use correctly.

 

00:09:52.433 It’s too easy to introduce security vulnerabilities,

00:09:56.100 or to trip over undefined behaviour.

 

00:09:58.933 In practice, no-one seems able to

00:10:01.633 consistently write correct and secure C programs.

 

00:10:05.033 The answer here cannot be “better programmers”

00:10:07.866 – they don’t exist.

00:10:09.500 Rather, we need less error prone languages.

 

00:10:13.800 The paper, “Some were meant for C:

00:10:16.100 The endurance of an unmanageable language”,

00:10:18.266 by Stephen Kell,

00:10:19.800 discusses some of the issues with

00:10:21.633 the C language in more detail

00:10:23.333 – both its strengths and weaknesses.

 

00:10:26.500 Please read it, and think about these issues,

00:10:28.866 before the discussion session.

 

00:10:32.633 That concludes our brief review of the state of the art

00:10:35.600 in systems programming, and some of its limitations.

 

00:10:38.800 In the next part, we’ll talk more about some of the

00:10:41.666 changes in the environment that are pushing us to reconsider

00:10:44.633 how we write systems programs.

Part 3: Challenges and Limitations

The 3rd part of the lecture discusses changes in the environment that are affecting systems programs. It talks about the end of Moore's law and the breakdown of Dennard scaling, and the implications of these hardware trends on microprocessor architecture and concurrency. It discusses the increasing need for secure systems due to the presence of always-on connectivity and the Internet, and outlines some features of C and Unix that are hard to secure. And it discusses the impact of mobile, battery-powered devices, and of climate change, on the need for energy efficiency in systems programs.

Slides for part 3

 

00:00:00.133 What changes in the environment

00:00:02.333 are affecting systems programs?

 

00:00:05.100 I see four trends.

 

00:00:07.033 The first is the end of Moore’s law and Dennard Scaling.

 

00:00:10.933 This is forcing

00:00:12.866 changes in the way processors are designed,

00:00:14.966 forcing more concurrency and more specialisation.

00:00:18.700 Second is that increase in concurrency,

00:00:21.433 and its impact on the way we design and build software.

 

00:00:25.600 Third is the increasing urgency

00:00:27.900 to build more secure systems,

00:00:29.700 as the ubiquity of the Internet,

00:00:31.933 and ever-connected devices,

00:00:33.666 exposes more and more vulnerabilities.

 

00:00:36.666 Fourth, and finally, is the impact of

00:00:40.033 connectivity, and the changes to the way

00:00:42.333 we write applications caused by pervasive connectivity,

00:00:45.833 mobility, and cloud computing.

 

00:00:50.700 Moore’s law is the observation, made by Gordon Moore,

00:00:54.733 the co-founder of Intel, in 1965,

00:00:56.966 that the number of transistors

00:00:59.800 that can be integrated onto a single

00:01:02.266 device doubles roughly every two years,

00:01:04.400 due to improvements in manufacturing.

 

00:01:06.833 When plotted with time on the x-axis,

00:01:09.800 and number of transistors on a log

00:01:13.733 scale on the y-axis, such exponential growth

00:01:15.900 should appear as a straight line.

 

00:01:17.866 This is true for the original prediction,

00:01:20.500 shown in the graph on the left,

00:01:22.433 but isn’t very convincing. It was based

00:01:24.833 on four data points, capturing 1962-1965.

 

00:01:31.000 If we add modern data, though,

00:01:33.166 the pattern still holds up.

00:01:35.266 The orange points on this plot show

00:01:37.433 the number of transistors for processors released

00:01:39.766 between 1970 and 2018.

 

00:01:42.900 They follow the

00:01:44.766 same straight-line relation, at least approximately.

 

00:01:47.500 The industry has, through heroic effort,

00:01:50.066 ensured that Moore’s law continues.

 

00:01:54.000 But, we’re rapidly approaching physical limits.

 

00:01:57.733 The feature size of the transistors that

00:02:00.433 form modern microprocessors is 10nm, with some

00:02:03.900 manufacturers introducing devices with 5nm features.

 

00:02:08.300 A 5nm feature is roughly 20 atoms across.

 

00:02:13.366 Four more generations of Moore’s law

00:02:16.266 would result in features only one or two atoms in size.

00:02:20.133 There is a physical limit to how small things can go.

 

00:02:24.433 Transistors will stop shrinking soon.

 

00:02:27.166 Moore’s law will come to an end.

 

00:02:31.466 An issue that’s had greater impact than

 

00:02:33.800 Moore’s law, although it’s perhaps less well

00:02:36.966 known, is the breakdown of Dennard scaling.

 

00:02:39.900 As we see in the equation on the right,

00:02:42.433 the power consumed by a transistor

00:02:44.733 is proportional to the capacitance of

00:02:48.133 the switching gate; the frequency at which

00:02:51.200 the transistor switches, that is, the clock

00:02:53.200 rate; and the square of the voltage applied.

 

00:02:55.466 Plus a constant term, known as the leakage current.
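
Written out, the relation just described is roughly P ≈ C × f × V² + P_leakage, where C is the gate capacitance, f the switching (clock) frequency, V the supply voltage, and P_leakage the static term due to the leakage current.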

 

00:02:59.033 The voltage and capacitance,

00:03:02.300 in turn, directly relate to the size of the transistor.

 

00:03:06.533 The Dennard scaling relation is the observation

00:03:09.900 that, as transistors shrink in size,

00:03:11.733 due to Moore’s law, their power consumption will decrease,

00:03:15.300 since this reduces the voltage and capacitance.

 

00:03:19.400 This lets you build more power efficient computers.

 

00:03:23.000 Alternatively, as the capacitance and voltage decrease,

00:03:27.266 you can increase the clock frequency to match.

 

00:03:30.700 This lets you build faster computers

00:03:33.000 that use roughly the same amount of power.

 

00:03:36.433 From the home computers of the mid-1980s,

00:03:39.166 through to around 2005,

00:03:41.233 this was the story of the computer industry.

 

00:03:44.766 Moore’s law reduced the size of transistors and,

00:03:47.800 due to Dennard scaling, computers got faster.

 

00:03:51.466 We started with a clock rate of around 1MHz,

00:03:54.800 in the era of the Commodore 64 and BBC Micro,

00:03:58.033 and by the end we had PCs with Intel

00:04:01.200 Pentium IV processors running at 3.8GHz.

 

00:04:05.800 Unfortunately, as we see in the green

00:04:08.400 points in this figure,

00:04:09.933 the clock rate stopped increasing at that point.

 

00:04:13.866 The reason is the leakage current,

00:04:16.133 that started to grow exponentially when the

00:04:18.333 feature size dropped below around 65nm.

 

00:04:22.433 Moore’s law still gave power savings,

00:04:25.166 as the size of transistors decreased,

00:04:27.533 but those savings had to be taken,

00:04:29.900 by keeping the clock rate constant, to balance

00:04:31.833 the increased power consumption due to the

00:04:34.133 greater leakage current.

 

00:04:37.033 It simply became difficult to make processors

00:04:39.400 faster without them melting, due to the

00:04:42.133 heat from the excessive power consumption.

 

00:04:45.100 For this reason, while Moore’s law has

00:04:47.433 continued to give us more transistors since

00:04:50.733 the mid-2000s, clock rates have remained stalled.

 

00:04:55.266 Given that Moore’s law is continuing,

00:04:57.766 how do processor designers use those additional transistors?

 

00:05:02.166 Mostly, as some combination of large caches,

00:05:05.500 and support for multicore operation.

 

00:05:08.733 Starting shortly after the breakdown of Dennard scaling,

00:05:11.933 when it became clear that

00:05:13.233 processor clock rates couldn’t increase anymore,

00:05:15.500 because we’d reached the limit in power consumption,

00:05:18.366 we see the number of processor cores start to increase.

 

00:05:22.400 This is shown on the figure as the black data points.

 

00:05:26.533 What’s interesting is that the increasing

00:05:28.233 number of cores plots as a straight line in this figure.

 

00:05:31.700 Given the log scale on the y-axis,

00:05:34.033 this implies the number of

00:05:35.233 cores is increasing exponentially.

 

00:05:38.800 A consequence of this, is that concurrent

00:05:41.266 programming is becoming more important.

 

00:05:44.033 Multiple threads of execution are needed to extract good

00:05:47.233 performance from modern processors.

 

00:05:50.566 Another consequence is that concurrency bugs are

00:05:53.000 becoming more visible.

 

00:05:55.233 Race conditions, deadlocks,

00:05:57.033 and other problems with interactions between multiple

 

00:05:59.300 threads tend to manifest themselves more often

00:06:02.066 as the amount of parallelism increases.

 

00:06:05.500 Not only are we increasingly seeing the

00:06:09.600 need for concurrency, we’re also seeing how

00:06:12.200 difficult it is to write concurrent software.

 

00:06:17.800 In addition to the changes in computer hardware,

00:06:20.600 leading to increased concurrency,

00:06:22.766 we’re also seeing far more devices

00:06:24.466 with always-on Internet connectivity.

 

00:06:28.366 Unfortunately, with this rise in connectivity

00:06:31.400 has come increasing numbers of security vulnerabilities.

 

00:06:35.133 The graph shows the number of security

00:06:36.933 vulnerabilities reported per year, since the year 2000,

00:06:40.800 colour-coded by severity.

 

00:06:43.766 What is clear is that the combination of Unix and C,

00:06:47.300 and the applications that run on such systems, has not

00:06:50.333 proven easy to secure.

 

00:06:53.066 As we’ll see later, most of these

00:06:55.466 vulnerabilities are due to weak type systems

00:06:58.200 and a lack of memory safety in modern systems.

 

00:07:01.366 That is, many of these

00:07:03.266 vulnerabilities are avoidable, in principle, by changing

00:07:06.666 the way we implement software systems.

 

00:07:11.500 The final trend I want to highlight

00:07:13.933 is that of increasing mobility and connectivity.

 

00:07:17.300 Devices are mobile, always on, and always

00:07:20.066 connected to the network, and to remote computing resources.

 

00:07:24.466 But they’re also deeply constrained. They’re constrained

00:07:28.733 by the limitations of battery power,

00:07:30.333 and the vagaries of the networks to which they connect.

 

00:07:33.766 It’s not clear that we have the APIs, tools, and programming

00:07:37.066 models to make effective use of such

00:07:39.500 devices, or to adapt to the variation

00:07:42.700 in the environment.

 

00:07:45.466 This concludes the review of the challenges

00:07:47.900 and limitations facing systems programming.

 

00:07:51.133 I’m sure this is not complete,

00:07:53.100 and you can think of many more issues.

 

00:07:55.733 What is clear, though, is that we’ve

00:07:58.133 seen a radical shift in computing hardware,

00:08:00.133 and the way in which computing devices

00:08:02.066 are used, over the last fifteen years.

 

00:08:05.400 This shift has not been accompanied by a similar shift

00:08:08.133 in the way we design and implement

00:08:09.733 operating systems and systems software.

 

00:08:13.400 In the final part of this lecture,

00:08:15.433 we’ll discuss some possible ways operating systems

00:08:17.933 and programming languages could change,

00:08:20.033 to start to address these challenges.

Part 4: Next Steps in Systems Programming

The final part of the lecture discusses how systems programming might evolve. It outlines how modern type systems, memory safe languages, and techniques from functional programming, can help improve systems programs. In particular, it focusses on how they can improve security, support for concurrency, and system correctness. This sets the scene for the discussion in the remainder of the course.

Slides for part 4

 

00:00:00.300 In the previous parts of this lecture,

00:00:02.800 I discussed the state of the art

00:00:04.400 in operating systems and systems programming languages,

00:00:07.400 and some of the challenges and changes

00:00:09.466 that such systems face.

 

00:00:11.766 In the following, I’ll consider how advances

00:00:14.133 in programming languages, and the use of

00:00:16.100 strong type systems and functional programming techniques,

00:00:19.066 can help to improve the expressivity,

00:00:21.366 correctness, and security of systems programs,

00:00:24.666 to begin to address some of these challenges.

 

00:00:29.400 Advances in the theory of programming language

00:00:31.766 design have long provided tools that could

00:00:34.100 improve the way we write systems programs.

 

00:00:37.066 More recently, we’re finally starting to see

00:00:40.066 new programming languages being developed that incorporate

00:00:42.800 these ideas into practical languages and systems.

 

00:00:46.266 In particular, the ideas behind functional programming,

00:00:49.766 and modern type systems, can help to

00:00:52.200 improve memory management and memory safety,

00:00:54.700 while maintaining control over allocation

00:00:57.066 and data representation.

 

00:00:59.333 They can help to improve security by

00:01:01.966 eliminating certain classes of vulnerability.

00:01:04.600 For example, by preventing buffer overflows,

00:01:07.166 or helping to track untrusted data.

 

00:01:10.133 They can help to improve support for

00:01:12.266 concurrency by eliminating data races, by encouraging

00:01:15.533 immutability or tracking data ownership.

 

00:01:18.566 And, in general, the use of strong

00:01:20.666 type systems and functional approaches can help

00:01:23.966 eliminate entire classes of bug, by changing

00:01:26.866 the way we design programs using an

00:01:29.000 approach known as type-driven design.

 

00:01:32.833 To achieve these benefits, we’ll need to

00:01:35.166 apply functional programming techniques and modern type

00:01:37.866 systems to systems programming.

 

00:01:40.700 What do I mean by modern type systems?

 

00:01:43.300 Well, a modern type system is one

00:01:46.366 that can provide useful guarantees about program behaviour.

 

00:01:49.900 We’ve long had type systems that can

00:01:51.800 describe basic properties of data. For example,

00:01:54.733 languages that let us specify that a

00:01:56.566 particular value is an array that can store integers,

00:01:59.533 rather than floating point numbers, are unsurprising.

 

00:02:03.000 A more interesting type system is one

00:02:05.633 that can enforce, at compile time,

00:02:07.866 that there are no out-of-bounds accesses to that array.

 

00:02:11.033 Or that there are no use-after-free bugs in a program.

 

00:02:14.266 Or no memory leaks.

00:02:15.566 No race conditions between threads.

 

00:02:17.566 No iterator invalidation. And so on.

 

00:02:20.466 Modern type systems let us reason about,

00:02:23.433 and prove, interesting properties of real systems programs.

 

00:02:27.633 They can help ensure concurrent code is correct,

00:02:30.633 and they can help avoid particular classes of bugs

00:02:33.200 and security vulnerabilities.

 

00:02:36.166 Modern type systems also help

00:02:38.133 us to model the problem space, and allow us

00:02:40.600 to detect inconsistencies and problems in our solutions.

 

00:02:44.233 They support the concept of no cost abstraction.

 

00:02:47.466 That is, compile-time checks of system

00:02:49.733 properties that have no run-time cost.

 

00:02:52.466 This lets us describe, and check,

00:02:54.700 our design, and put constraints on possible

00:02:57.466 program misbehaviours, before running the code.

 

00:03:01.300 Modern type systems help us to describe

00:03:03.300 the problem space, and its constraints,

00:03:05.466 in a way that allows us to check

00:03:07.233 useful properties of the system as

00:03:09.000 we elaborate on our design and implementation.

 

00:03:12.100 Essentially they allow us to use the

00:03:14.300 compiler as a debugger for our designs.

 

00:03:18.533 As an example of this, think about

00:03:20.700 how we write networking code.

 

00:03:23.366 In the Berkeley Sockets API, used to

00:03:25.666 write network servers in C, one calls

00:03:28.366 the accept() function on a socket to

 

00:03:30.433 accept an incoming TCP connection.

 

00:03:33.200 The accept() function takes as its parameter an existing

00:03:36.566 socket that’s listening for connections, and returns

00:03:39.333 a new socket that represents the newly accepted connection.

 

00:03:43.333 A common bug in networking code is

00:03:45.866 to mix up these two sockets.

 

00:03:48.100 For example, to attempt to send data

00:03:50.133 using the socket that’s listening for new connections,

00:03:52.866 rather than using a socket that

00:03:54.433 represents a previously accepted connection.

 

00:03:57.366 This compiles, but fails at runtime.

 

00:04:00.800 Such code compiles because sockets are identified

00:04:03.400 by simple integers in C.

 

00:04:06.200 The read(), write(), and accept() calls

00:04:08.733 take file descriptors,

00:04:10.233 with type int, to identify the sockets.

 

00:04:13.466 As a result, the compiler can’t tell the two types

00:04:16.966 of socket, connected and listening, apart.

 

00:04:20.866 A better API would use the type system to distinguish

00:04:24.100 the different types of socket.

 

00:04:25.933 There should be separate Connected Socket

00:04:28.133 and Listening Socket types.

 

00:04:30.233 The accept() function should take as its first parameter a

00:04:33.133 Listening Socket, and should return a Connected Socket.

 

00:04:36.833 And the read() and write() functions should accept

00:04:39.500 only Connected Sockets.

 

00:04:41.833 In systems where the networking library works this way,

00:04:45.166 a program that tries to write() to a listening socket

00:04:47.900 will fail to compile, since the compiler can distinguish

00:04:51.500 the two types of socket, and detect the mismatch.

 

00:04:55.266 Connected sockets and listening sockets

00:04:57.833 can still be represented by integers,

00:05:00.000 so the generated code doesn’t change,

00:05:02.600 and can be just as efficient.

 

00:05:04.833 But, by making them distinct types of integer,

00:05:07.600 the compiler could help us find bugs.
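
A minimal sketch of what such an API could look like in C (the type and function names here are invented for illustration; this is not an existing library):

    #include <stddef.h>
    #include <sys/types.h>

    /* Wrap the file descriptor in two distinct types. The generated code
     * still just passes an int around, but the compiler can now tell a
     * listening socket and a connected socket apart. */
    typedef struct { int fd; } listening_socket_t;
    typedef struct { int fd; } connected_socket_t;

    connected_socket_t accept_connection(listening_socket_t listener);
    ssize_t send_data(connected_socket_t conn, const void *buf, size_t len);

    /* Calling send_data() with a listening_socket_t is now a compile-time
     * type error, rather than a bug that only shows up at runtime. */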

 

00:05:10.700 This is a trivial example, but the principle is important.

 

00:05:14.900 The more information that can be represented in the types,

00:05:18.300 the more the compiler can help check

00:05:20.100 that those types are used consistently.

 

00:05:22.700 This means we find bugs at compile

00:05:24.466 time, rather than having to debug the code at runtime.

 

00:05:29.933 The other advance that can help systems

00:05:32.133 programming is the application of some ideas

00:05:34.333 from the functional programming community.

 

00:05:37.633 Functional programming is a programming style that

00:05:40.166 emphasises the avoidance of side effects and

00:05:42.466 shared state, and that promotes structuring code

00:05:45.166 as referentially transparent functions.

 

00:05:48.466 Imperative languages,

00:05:50.000 and especially object-oriented languages,

00:05:52.500 tend to actively encourage the opposite approach,

00:05:55.600 leading to programs full of shared state,

00:05:58.500 side effects, and impure functions.

 

00:06:01.666 The entire point of objects, for example,

00:06:04.400 is to hold shared state.

 

00:06:06.700 And methods in object oriented languages

00:06:09.133 are generally called in order to invoke the side effect

00:06:11.500 of modifying the object’s state.

 

00:06:14.500 I argue that this style of programming

00:06:16.666 is problematic for systems programs, and it’s

00:06:19.200 much better to avoid shared state and

00:06:21.300 program in a functional style where possible.

 

00:06:24.666 Why?

 

00:06:26.400 Because, side effects and shared state make

00:06:28.466 code difficult to reason about, and harder to debug.

 

00:06:32.300 They increase coupling between seemingly

00:06:34.500 unrelated sections of code, and make it

00:06:37.200 difficult to understand the behaviour of parts

00:06:39.300 of a program in isolation.

 

00:06:41.766 And, importantly, the presence of side effects

00:06:44.933 and shared state makes concurrent code much

00:06:47.600 harder to write.

 

00:06:49.566 Concurrent accesses to shared state have to

00:06:51.933 be protected by locks, to avoid race conditions.

 

00:06:55.800 Experience has shown that it’s difficult

00:06:57.566 to get the locking right, in any non-trivial program.

 

00:07:01.433 By systematically writing code in a functional style,

00:07:04.500 where possible, and avoiding shared state,

00:07:07.266 the number of locks that are needed can be greatly reduced.

 

00:07:10.933 It’s easier to get the locking right,

00:07:13.400 and avoid race conditions,

00:07:15.300 if there are fewer things to lock.

 

00:07:18.433 The essential benefit of writing programs in

00:07:20.900 a pure functional style is that it

00:07:22.833 constrains what’s possible.

 

00:07:25.000 Code written using referentially transparent functions,

00:07:28.133 without side effects, is easier to test

00:07:30.700 and debug - because there’s no hidden state.

 

00:07:34.266 Such code is also thread safe,

00:07:36.566 since there are no side effects,

00:07:38.133 and no mutable state to share between threads.
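
A tiny illustration of the distinction (invented for this transcript, not from the slides):

    /* The first function depends only on its arguments: the same inputs
     * always give the same result, and it is safe to call from any thread.
     * The second depends on, and modifies, hidden shared state, so its
     * result depends on history and concurrent calls can race. */
    int add(int a, int b) {
        return a + b;                 /* referentially transparent    */
    }

    static int counter;               /* shared mutable state         */

    int next_value(void) {
        return ++counter;             /* side effect: not thread safe */
    }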

 

00:07:41.766 By constraining the way programs are written,

00:07:44.400 certain classes of bugs and certain problems

00:07:46.800 can be avoided.

 

00:07:48.400 Avoiding these problems is

00:07:50.100 desirable, if we want to write concurrent,

00:07:51.933 secure, and correct systems programs.

 

00:07:55.466 Accordingly, when possible,

00:07:57.400 I encourage you to adopt

00:07:58.566 a functional style when writing code.

 

00:08:01.633 Note, though, that I’m not saying

00:08:03.700 “write your programs in Haskell”.

 

00:08:06.100 Haskell is a great language,

00:08:07.900 if your goal is to develop the concept

00:08:09.833 of pure functional programming.

 

00:08:11.833 It takes this one idea,

00:08:13.733 and pushes it as far as is possible.

 

00:08:16.933 That’s wonderful, if you’re a programming language

00:08:19.033 researcher. And many of the ideas that

00:08:21.333 came out of Haskell are widely applicable

00:08:23.566 to programming in general.

 

00:08:25.600 But, it’s clear that not everything

00:08:27.400 can be naturally expressed in the functional style.

 

00:08:30.566 Some programs are just easier to write in an imperative,

00:08:33.533 or object oriented, way.

 

00:08:36.000 That’s fine.

 

00:08:37.533 Use functional programming ideas where they make

00:08:39.566 sense, to prevent certain classes of bugs.

 

00:08:43.166 If another approach works better, do that.

 

00:08:46.466 But, always think about how to reduce

00:08:48.933 shared state, and reduce side effects.

 

00:08:54.133 We can use functional programming techniques

00:08:56.600 and modern type systems to help improve memory

00:08:58.766 management, and make it safer and less error prone.

 

00:09:02.933 Systems programmers are used to manual memory management,

00:09:06.200 as implemented in C.

 

00:09:08.133 Memory is allocated by calling the malloc() function,

00:09:11.300 that returns a pointer to the newly allocated memory.

 

00:09:15.000 When that memory is no longer used,

00:09:16.800 the programmer calls the free() function with that pointer,

00:09:19.233 to release the memory.

 

00:09:21.266 The system performs few checks on what

00:09:23.133 is done with the memory between those calls.

 

00:09:26.233 For example, the value returned by malloc()

00:09:28.900 can be cast to a pointer to

00:09:30.866 any type of object, whether or not the size of that object

00:09:34.300 matches the size of the allocated memory,

00:09:36.866 and the program can interpret the same region of

00:09:39.866 memory as different types of object simultaneously,

00:09:42.866 whether or not that makes sense.

 

00:09:45.833 Similarly, the behaviour of a program that

00:09:48.066 accesses memory outside the region returned by

00:09:50.166 malloc() is undefined, but the system makes

00:09:52.966 no attempt to prevent such accesses.

 

00:09:56.633 Finally, arrays and strings are represented as

00:09:59.166 pointers to their first element, and don’t

00:10:01.633 store the length of the array,

00:10:03.533 and array indexing is represented by pointer arithmetic.

 

00:10:07.433 The code fragment on the slide illustrates this

 

00:10:10.266 – compile and run this code,

00:10:11.933 to see if it behaves as you expect.
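
The slide’s code isn’t reproduced here, but a fragment in the same spirit might look like this (illustrative only, and deliberately containing the kinds of unchecked operation described above):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int *a = malloc(4 * sizeof(int));   /* space for four ints             */
        a[0] = 42;
        a[4] = 7;           /* out of bounds: undefined behaviour, not trapped */

        double *d = (double *) a;           /* same memory viewed as a double  */
        printf("%f\n", d[0]);    /* reinterprets the bytes, sensible or not    */

        free(a);
        printf("%d\n", a[0]);               /* use after free: also undefined  */
        return 0;
    }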

 

00:10:14.766 An important consequence of this

00:10:16.466 definition of arrays is that the runtime

00:10:18.966 cannot check whether array accesses are within bounds,

00:10:22.033 since the bounds aren’t recorded.

 

00:10:25.533 There were good reasons for this behaviour

00:10:28.000 when C was designed.

 

00:10:30.033 Machines were slow, and memory was limited.

 

00:10:33.200 Storing the size of the allocation

00:10:35.766 along with the allocated object takes memory,

00:10:38.500 and memory was in short supply

00:10:40.133 on the machines where C was developed.

 

00:10:42.800 Similarly, checking array bounds takes

00:10:44.866 time and requires additional code.

 

00:10:48.400 These concerns mattered at the time,

00:10:50.466 but are not necessarily valid today.

 

00:10:53.400 On machines with kilobytes of memory and

00:10:55.433 MHz clocks, the memory and runtime overheads

00:10:58.700 mattered. On modern machines, that have millions

00:11:02.500 of times more memory, and processors that

00:11:05.300 are tens-of-thousands of times faster, the overheads

00:11:08.233 are less significant.

 

00:11:10.366 Compiler technology and program

00:11:12.400 optimisations have also improved,

00:11:14.733 to the extent that compilers can often now determine when

00:11:17.466 bounds checks are unnecessary, and remove them

00:11:20.166 from the generated code.

 

00:11:22.700 We can certainly afford to check array

00:11:24.800 bounds and other memory accesses on modern systems

00:11:28.000 – but we need to move away from C

00:11:29.900 as a systems programming language to do so.

 

00:11:35.466 What is now clear is that

00:11:37.400 manual memory management is a source of bugs.

 

00:11:40.666 It leads to use-after-free bugs, where memory

00:11:43.333 is accessed after it has been deallocated.

 

00:11:46.366 It leads to memory leaks, where memory

00:11:48.700 is never deallocated, even after it has

00:11:51.000 ceased to be referenced by the program.

 

00:11:53.466 It leads to buffer overflows,

00:11:55.800 where the program tries to access memory outside the

00:11:58.100 region it has allocated.

 

00:12:00.066 And it leads to iterator invalidation,

00:12:02.533 where the contents of a list can

00:12:04.166 change while a program is accessing it,

00:12:06.600 making a previously valid pointer invalid,

00:12:09.100 and leading to out of bounds memory access.

 

00:12:13.033 All of these problems can be eliminated

00:12:15.266 when using programming languages with modern type systems.

 

00:12:19.300 Managed languages such as Java,

00:12:21.833 running on virtual machines, have been able to trap

00:12:24.700 such events at run time for many years.

 

00:12:27.733 They take a performance hit to

00:12:29.600 do so, compared to systems languages that

00:12:31.833 compile to native code, but gain safety.

 

00:12:35.933 Recently, though, we’ve seen new programming languages

00:12:39.066 start to emerge that can track and

00:12:40.866 detect these problems at compile time.

 

00:12:43.433 The Rust programming language, for example,

00:12:46.200 will detect iterator invalidation and use-after-free bugs,

00:12:49.566 and throw a compile time error if they occur in a program.

 

00:12:53.333 The Idris programming language

00:12:55.100 can do the same with buffer overflows.

 

00:12:58.266 As programmers, we fix the same types

00:13:00.300 of problem, again and again, in heroic

00:13:03.166 debugging sessions. Instead, we should consider whether

00:13:06.933 there’s a better solution, that can prevent

00:13:09.500 such classes of bug entirely.

 

00:13:14.233 The concern about memory management becomes critical

00:13:17.033 when we look at causes of security vulnerabilities.

 

00:13:21.066 The figure on the slide is from Microsoft.

 

00:13:24.166 It tracks the root cause of bugs

00:13:25.733 found in software developed by Microsoft,

00:13:28.300 Windows, Office, Teams, etc., for the period

00:13:31.100 from 2006 to 2018.

 

00:13:35.233 What it shows is that around 70%

00:13:37.466 of all reported security vulnerabilities in Microsoft

00:13:40.266 software, for that time period, relate to memory safety.

 

00:13:45.933 70% of reported security vulnerabilities are buffer

00:13:49.833 overflows, use-after-free bugs, memory corruption,

00:13:53.366 iterator invalidation, and so on.

 

00:13:56.633 By switching to a memory safe language,

00:13:59.233 Microsoft could eliminate two-thirds of the security

00:14:02.100 vulnerabilities in their systems.

 

00:14:04.733 Now, obviously this is difficult to do.

 

00:14:08.066 Rewriting Windows and Office would be an enormous job,

00:14:11.866 and would almost certainly introduce

00:14:13.366 more bugs than it would fix.

 

00:14:15.933 But surely there’s a lesson that can be learnt for new code?

 

00:14:20.333 Once we’ve started to address the low-hanging

00:14:23.066 fruit around memory safety, we should start

00:14:25.500 to think about the root causes of

00:14:27.300 other security vulnerabilities.

 

00:14:30.100 What is common is that vulnerabilities are

00:14:32.500 the result of mismatched assumptions.

 

00:14:35.300 One part of the code assumes some property

00:14:37.733 is true of data it processes.

 

00:14:40.100 Other code is supposed to validate that the property

00:14:43.166 holds, but a check is missed somewhere.

 

00:14:45.333 Such problems are easy to miss,

00:14:47.633 because the assumptions and constraints tend to

00:14:49.866 be visible only in the comments and

00:14:51.766 documentation around the code, and so can’t

00:14:54.000 be automatically checked.

 

00:14:57.166 One approach to reducing security vulnerabilities is

00:15:00.333 to encode as many of these assumptions,

00:15:02.966 and as much of this knowledge as

00:15:04.566 possible, into the types.

 

00:15:07.500 Once this knowledge is available in a machine readable form,

00:15:10.933 a compiler can check the code for

00:15:13.200 consistency, and highlight problems before they become

00:15:16.066 security vulnerabilities.
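
A minimal sketch of this idea, using hypothetical names rather than anything from the lecture: validation happens once, in the only place a `ValidatedEmail` can be constructed, so any function that accepts the type can rely on the property having been checked.

```rust
// The assumption "this string has been validated as an email address" is
// encoded in the type: a ValidatedEmail can only come from parse().
pub struct ValidatedEmail(String);

impl ValidatedEmail {
    pub fn parse(s: &str) -> Result<ValidatedEmail, String> {
        if s.contains('@') {                       // deliberately simple check
            Ok(ValidatedEmail(s.to_string()))
        } else {
            Err(format!("not an email address: {}", s))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// This function cannot accidentally be called with an unvalidated string:
// the compiler enforces that validation happened somewhere.
pub fn send_welcome_message(to: &ValidatedEmail) {
    println!("sending welcome message to {}", to.as_str());
}
```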

 

00:15:19.066 Doing so is more up-front work.

 

00:15:21.800 It’s more effort to write code in a way that records,

00:15:24.766 and can automatically check,

00:15:26.333 your assumptions using the type system.

 

00:15:29.366 But, over time, that effort pays off

00:15:31.633 through having fewer surprises when the code is used.

 

00:15:35.466 You trade off design-time

00:15:37.033 and compile-time effort for less time

00:15:39.833 spent debugging the system, and correcting security

00:15:42.600 vulnerabilities, after it’s been deployed.

 

00:15:48.200 We see similar classes of behaviour around concurrency.

 

00:15:52.400 The common abstraction is that of threads,

00:15:54.733 locks, and shared mutable state.

 

00:15:57.666 This is pthreads if you’re a C programmer,

00:15:59.866 or synchronised methods and objects

00:16:01.700 if you’re a Java programmer.

 

00:16:04.133 Before accessing a shared resource,

00:16:06.466 you must acquire the appropriate lock. Then, once you’ve

00:16:09.600 finished with the resource, release the lock.

 

00:16:12.733 And, if you access multiple resources at once,

00:16:15.333 you must remember to acquire and release

00:16:17.266 those locks in the right order to avoid deadlock.
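
In Rust, for example, this discipline looks roughly like the sketch below (illustrative code, not from the lecture): the shared data lives inside a `Mutex`, acquiring the lock returns a guard, and the lock is released automatically when the guard goes out of scope, so it cannot be forgotten.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared mutable counter, protected by a lock.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..4).map(|_| {
        let counter = Arc::clone(&counter);
        thread::spawn(move || {
            for _ in 0..1000 {
                // Acquiring the lock returns a guard; the lock is released
                // when the guard goes out of scope at the end of each loop
                // iteration.
                let mut n = counter.lock().unwrap();
                *n += 1;
            }
        })
    }).collect();

    for h in handles {
        h.join().unwrap();
    }
    println!("total = {}", counter.lock().unwrap());
}
```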

 

00:16:21.100 The problem is, as anyone who’s tried

00:16:23.200 this at any scale knows, that it’s

00:16:25.100 easy to get the locking wrong.

 

00:16:27.266 To hold too many, or too few,

00:16:29.600 locks. Or to acquire or release the

00:16:31.966 locks at the wrong time, or in the wrong order.

 

00:16:35.766 More critically, it’s clear that locks don’t compose.

 

00:16:40.300 If you start with two libraries that use locking correctly,

00:16:43.633 and combine them, then the result might

00:16:46.000 not have the correct amount of locking.

 

00:16:48.566 The locking requirements to protect

00:16:50.633 the combined data might differ

00:16:52.200 from those of the individual components.

 

00:16:55.066 The usual example here is a banking program.

 

00:16:58.500 Such a program probably has a bank account object,

00:17:01.400 recording the balance of an account,

00:17:03.566 and allowing money to be deposited or withdrawn.

 

00:17:07.100 Such an object needs locking,

00:17:09.100 to avoid corruption of the balance if deposits

00:17:11.433 and withdrawals are made simultaneously.

 

00:17:14.666 What’s less clear is that if you transfer money

00:17:17.266 from one bank account to another,

00:17:19.066 it’s not sufficient for each account

00:17:20.866 to be correctly locked.

 

00:17:22.833 If the money is to be transferred atomically, so it

00:17:26.233 either appears in one account or the other,

00:17:28.166 then locking the individual accounts is not sufficient.

 

00:17:31.766 There’s a risk the system

00:17:33.533 is observed during the transfer, and the

00:17:35.700 money either appears in neither account,

00:17:37.933 or in both, depending on how the transfer was implemented.

 

00:17:41.800 Additional locking is needed to prevent this,

00:17:44.633 over and above the composition of the locking

00:17:46.633 needed on the individual accounts.
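
A minimal sketch of the problem, using hypothetical types rather than anything from the lecture slides: each account is correctly protected by its own lock, but an atomic transfer still has to hold both locks at once, and has to acquire them in a consistent order to avoid deadlock.

```rust
use std::sync::Mutex;

struct Account {
    id: u64,
    balance: Mutex<i64>,
}

fn transfer(from: &Account, to: &Account, amount: i64) {
    // Lock in a fixed global order (by account id) so that two concurrent
    // transfers in opposite directions cannot deadlock.
    let (first, second) = if from.id < to.id { (from, to) } else { (to, from) };
    let mut first_bal = first.balance.lock().unwrap();
    let mut second_bal = second.balance.lock().unwrap();

    // Both locks are held for the whole update, so no thread that locks
    // both accounts can observe the money in neither account or in both.
    if std::ptr::eq(first, from) {
        *first_bal -= amount;
        *second_bal += amount;
    } else {
        *second_bal -= amount;
        *first_bal += amount;
    }
}
```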

 

00:17:50.000 There are two approaches to avoid problems

00:17:52.066 with locking and concurrency.

 

00:17:54.633 One is to apply the techniques of functional programming.

 

00:17:58.000 Race conditions don’t occur

00:17:59.966 if only immutable data is shared,

00:18:01.733 and if the program is structured to avoid

00:18:03.900 shared state and side effects.
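
For instance, a small Rust sketch (illustrative, not from the lecture): because the shared data is immutable, it can be read concurrently from several threads without any locks, and without the possibility of a data race.

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Immutable shared data: many threads may read it, none may modify it.
    let table = Arc::new(vec!["red", "green", "blue"]);

    let handles: Vec<_> = (0..3).map(|i| {
        let table = Arc::clone(&table);
        thread::spawn(move || println!("thread {} sees {:?}", i, table[i]))
    }).collect();

    for h in handles {
        h.join().unwrap();
    }
}
```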

 

00:18:06.433 The other is to avoid races by tracking ownership of data.

 

00:18:10.166 Structure programs so that the transfer of data is atomic,

00:18:13.633 so that every object has a single owner at all times,

00:18:16.933 and so that data is either mutable, or visible to others,

00:18:20.766 but never both.
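
A small illustrative sketch of this ownership-transfer style in Rust (again, not from the lecture): sending a value over a channel moves ownership to the receiver, so the sender can no longer mutate data that another thread can see.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let data = vec![1, 2, 3];
        // Sending moves ownership of `data` into the channel; this thread
        // can no longer touch it, so the data is never simultaneously
        // mutable here and visible to the receiver.
        tx.send(data).unwrap();
    });

    let received = rx.recv().unwrap();
    println!("received {:?}", received);
}
```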

 

00:18:23.200 There are trade-offs between the two approaches,

00:18:25.400 which we’ll discuss in detail later in the course.

 

00:18:28.366 What’s interesting, though, is that

00:18:30.233 both eliminate entire classes of concurrency-related bugs,

00:18:33.866 and can be checked for correctness by a compiler.

 

00:18:39.533 This is the key point I want to get across.

 

00:18:43.133 There are many types of problem that frequently occur

00:18:45.533 in systems programs

00:18:47.033 because those systems programs are written in C.

 

00:18:51.066 Problems such as use-after-free bugs, memory leaks,

00:18:54.233 buffer overflows, iterator invalidation, and data races

00:18:58.400 in multi-threaded code, are common in C programs.

 

00:19:02.366 Many are also common in programs

00:19:04.200 written in higher-level languages.

 

00:19:06.633 But, there are new and emerging languages and tools

00:19:10.066 that can not just fix these bugs,

00:19:12.266 but that can eliminate these classes of bug,

00:19:14.633 while maintaining the control

00:19:16.666 and efficiency needed for systems programming.

 

00:19:19.466 There exist programming languages that can flag,

00:19:22.033 at compile time, whether a given piece

00:19:24.166 of code suffers from data races,

00:19:26.566 iterator invalidation, buffer overflows, use after free

00:19:29.600 bugs, and so on.

 

00:19:31.500 It’s not always easy to switch to these languages,

00:19:34.766 but where possible, maybe we should consider doing so?

 

00:19:40.100 In addition, modern type systems are flexible

00:19:42.733 enough to effectively model the problem space

00:19:45.366 in many cases. This allows us to

00:19:48.100 use the types to check our designs for consistency.

 

00:19:52.466 When writing code,

00:19:53.933 define types representing the problem domain.

00:19:56.866 For example, rather than using an int,

00:19:59.766 define a PersonID type, an Age,

00:20:02.533 or a Temperature, or similar.

00:20:05.233 With modern languages,

00:20:06.666 these compile down to exactly the

00:20:08.166 same code as using the generic primitive type would,

00:20:11.633 but with the advantage that the compiler

00:20:13.866 can check that you’re passing the right type of data around.
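
For example, a sketch in Rust (the specific names `PersonId` and `Age` are just illustrations): each wrapper has the same representation as the underlying integer, but the compiler rejects code that passes one where the other is expected.

```rust
// Domain-specific wrappers around primitive types. Each newtype has the
// same representation as the integer it wraps, so there is no run-time cost.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PersonId(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Age(u8);

fn lookup_person(id: PersonId) {
    println!("looking up person {:?}", id);
}

fn main() {
    let id = PersonId(42);
    let age = Age(21);
    lookup_person(id);
    // lookup_person(age);   // would not compile: expected PersonId, found Age
    let _ = age;
}
```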

 

00:20:18.166 Similarly, encode constraints on when an object

00:20:20.933 can be used as part of its type.

 

00:20:23.566 The example here is the ListeningSocket vs ConnectedSocket

00:20:26.966 distinction in networking code we discussed earlier,

00:20:29.633 but the technique is generally applicable.
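
A rough sketch of that distinction in Rust (the method bodies are placeholders, not a real socket implementation): each operation is only defined on the type that represents the state in which it is valid, so calling it in the wrong state is a compile-time error rather than a run-time one.

```rust
struct ListeningSocket { /* file descriptor, etc. */ }
struct ConnectedSocket { /* file descriptor, etc. */ }

impl ListeningSocket {
    fn bind(_addr: &str) -> ListeningSocket {
        ListeningSocket {}
    }

    // Accepting a connection yields a different type, on which read()
    // is available; read() is simply not defined on a ListeningSocket.
    fn accept(&self) -> ConnectedSocket {
        ConnectedSocket {}
    }
}

impl ConnectedSocket {
    fn read(&mut self, _buf: &mut [u8]) -> usize {
        0
    }
}

fn main() {
    let listener = ListeningSocket::bind("0.0.0.0:8080");
    // listener.read(&mut buf);   // would not compile: no such method
    let mut conn = listener.accept();
    let mut buf = [0u8; 16];
    let _n = conn.read(&mut buf);
}
```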

 

00:20:33.733 Modern systems programming languages let you define

00:20:37.000 new types and abstractions easily,

00:20:38.600 and without run-time cost.

 

00:20:40.900 Use this ability:

00:20:42.500 define the types and function signatures first,

00:20:45.400 and let the compiler check your design for consistency,

00:20:48.533 and only then write the body of the functions.
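
One way of working in this style in Rust is sketched below (hypothetical types and signatures): declare the functions with `todo!()` bodies so the compiler can already check that the signatures fit together, then fill in the implementations afterwards.

```rust
// Types-and-signatures-first: the program type-checks before any of the
// function bodies have been written.
struct Request;
struct Response;
struct ParseError;

fn parse_request(_raw: &[u8]) -> Result<Request, ParseError> {
    todo!()   // implementation written once the design checks out
}

fn handle_request(_req: Request) -> Response {
    todo!()
}
```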

 

00:20:51.800 We’re used to debugging code at runtime.

 

00:20:54.766 What I’m suggesting is that if you can represent

00:20:57.366 the key features of the problem

00:20:58.733 domain in your program’s types,

00:21:00.300 then the compiler can also help debug your design.

 

00:21:06.300 What seems clear is that systems programs,

00:21:09.033 and much software in general,

00:21:10.733 have reached the stage where people

00:21:12.500 can’t manage the complexity.

 

00:21:15.100 The C programming language gives precise control

00:21:18.033 over data representation, memory management, and sharing

00:21:21.666 of state. But exercising that control has

00:21:24.800 proven too difficult.

 

00:21:26.933 Bugs and security vulnerabilities abound in software,

00:21:30.233 and pervasive connectivity and concurrency are making

00:21:33.500 the problem worse.

 

00:21:35.566 Emerging, strongly-typed, languages and systems

00:21:38.833 give the same degree of control, with added safety.

 

00:21:42.566 They allow us to use the type system to eliminate

00:21:45.166 certain classes of common bugs,

00:21:47.100 and to model the problem space via the types

00:21:49.900 in a way that helps us detect logic errors early.

 

00:21:53.400 We can improve systems programming.

 

00:21:56.066 In the remainder of this course, we’ll explore how,

00:21:59.066 using the Rust programming language as an example.

 

00:22:03.666 So, to summarise.

 

00:22:05.700 Systems programming is infrastructure programming.

 

00:22:09.133 It’s programming where efficiency

00:22:11.133 and control over data representation matters,

00:22:13.833 and where concurrency and security are pervasive challenges.

 

00:22:17.500 The state of the art in deployed systems is Unix and C.

 

00:22:22.333 Old, but surprisingly flexible, systems

00:22:25.333 that are reaching the end of their life.

 

00:22:27.966 To address the challenge of writing secure

00:22:30.400 and highly concurrent code,

00:22:32.000 we need better tools and techniques to help tame the

00:22:34.500 complexity, and help us debug our designs.

 

00:22:38.233 I believe we’re now starting

00:22:39.633 to see such tools being developed.

 

00:22:41.933 I’ll introduce these in the remainder of this course.

Discussion

The lecture highlighted two papers: “Programming language challenges in systems codes: why systems programmers still use C, and what to do about it”, by Jonathan Shapiro, and “Some were meant for C: The endurance of an unmanageable language”, by Stephen Kell.

These papers explore what a systems programming language is, and discuss some common features of systems languages. In particular, they highlight that performance is critical, but also that systems languages tend to offer low-level control of data representation, memory management, and I/O, and that programmers make use of this control. They suggest that systems languages that attempt to trade performance for safety, or that rely on compiler optimisations to fix data layout, will not be accepted by the broader community. Do you agree with this definition of systems programming? Do you agree that the fallacies are real?

The lecture and reading also suggest that the C programming language is increasingly a liability: it's too easy to introduce security vulnerabilities and trip over undefined behaviour, and it provides insufficient abstraction. But C provides important control over data representation and allows interworking with external data. Do you agree with this critique of C? What features of C do you think work well? What are the most problematic?

The lecture also reviews Moore's law and Dennard scaling, and discusses how these trends in hardware development are driving changes in software. Do you understand this discussion? Do you have questions about the hardware trends?

The lecture suggests that the combination of functional programming and modern programming languages with improved type systems can help address some of the challenges in systems programming. The functional style of programming, with referentially transparent functions, no side effects, and no shared mutable state, is claimed to make code easier to test and reduce the risks of concurrency. Do you agree that we benefit from adopting these techniques in systems languages? How applicable are the ideas from Haskell? Can you write C code in a functional style, and would it help?

The lecture also claims that modern type systems can help to prevent some types of unsafe behaviour, such as buffer overflows, use-after-free bugs, race conditions, and iterator invalidation, and allow us to better model the problem space. Does the suggestion that we should change our approach to writing software, and focus on modelling and debugging the design, rather than debugging the code, make sense?