Networked Systems H (2022-2023)
Lecture 1: The Changing Internet
Lecture 1 introduces the course, and reviews some of the material
covered in the Networks and Operating Systems Essentials course in
Level 2. It discusses what is a network protocol and the concept of
layering as a way of structuring networked systems. It reviews some
important aspects of the physical and data link layers; IPv4 and IPv6
and the operation of the network layer; the UDP and TCP transport
protocols; and the higher layers in the protocol stack. Finally,
it concludes by discussing some of the changes occurring in the
network and some of the challenges forcing such changes, to set
the scene for the later discussion.
Part 1: The Changing Internet
The first part of this lecture introduces the course. It reviews the
aims, objectives, and learning outcomes, and the structure of the
lectures and labs. It outlines the assessment scheme, and the dates
and topics for the assessed exercises. Recommended reading is given.
Slides for part 1
00:00:00.233
Welcome to Networked Systems.
00:00:02.700
For those who don't know me,
00:00:04.033
my name is Colin Perkins,
00:00:05.366
and I'm the lecturer and coordinator for this course.
00:00:08.833
In this first lecture
00:00:10.100
I'll review the aims, objectives,
00:00:11.966
and administration of the course.
00:00:14.533
Then, I'll recap some of the material
00:00:16.333
covered in the Networks and Operating Systems
00:00:18.200
Essentials course in Level 2.
00:00:21.133
Finally, I'll briefly introduce some
00:00:23.066
of the changes occurring in the network
00:00:25.233
to set the scene for the remainder of the course.
00:00:30.500
In this part of the lecture
00:00:31.866
I'll start with some administrative details.
00:00:34.700
I'll talk about the aims, objectives,
00:00:36.633
and intended learning outcomes for the course.
00:00:39.300
I'll outline the content of the lectures and labs,
00:00:41.966
the timetable for the lectures and laboratory sessions,
00:00:44.833
and the structure of the assessed exercises and exam.
00:00:48.600
Finally, I'll highlight
00:00:50.666
the recommended reading for the course.
00:00:55.200
As I mentioned at the start, I'm Colin Perkins
00:00:57.800
and I'm the lecturer and coordinator for the course.
00:01:01.200
If you have questions about the material
00:01:02.866
covered in the course,
00:01:03.933
my email address is on the slide.
00:01:05.900
I'm happy to answer questions via email.
00:01:08.966
We also have office hours
00:01:10.300
and discussion sessions throughout the course,
00:01:12.600
and a chat room in Microsoft Teams.
00:01:16.000
The course materials,
00:01:17.166
including lecture recordings,
00:01:18.766
copies of the slides,
00:01:20.200
lab handouts,
00:01:21.133
and assessed exercises,
00:01:22.866
will be uploaded to Moodle.
00:01:25.500
The material is also on my website
00:01:27.733
at csperkins.org/teaching.
00:01:30.700
The version on my website has lecture transcripts
00:01:33.000
that are difficult to upload to Moodle,
00:01:35.200
as well as the other material.
00:01:39.966
The aims and objectives for the course are four-fold.
00:01:44.500
Firstly, the course aims to introduce
00:01:46.966
the fundamental concepts and theory of communication.
00:01:51.000
We'll build on the material from the Networks
00:01:53.066
and Operating Systems Essentials course in Level 2,
00:01:56.166
covering the concepts in more depth
00:01:58.133
and going into more details about the operation
00:02:00.300
of the Internet.
00:02:02.900
Second, the course aims to
00:02:05.366
give you a good understanding
00:02:06.700
of the technologies that comprise the modern Internet,
00:02:10.000
and an understanding of how,
00:02:11.500
and why, the network is changing.
00:02:14.866
We're in the middle of a rapid
00:02:16.266
period of change in the Internet infrastructure.
00:02:19.200
This course will try to make it
00:02:20.566
clear what's changing,
00:02:21.800
and what are the drivers for those changes.
00:02:25.666
This should give you the ability to evaluate network systems
00:02:28.900
and understand what technologies and approaches
00:02:31.200
are suitable for particular scenarios,
00:02:33.533
so you can advise on what's appropriate to use.
00:02:38.000
Finally, the course will build on the
00:02:40.066
material in the Systems Programming course
00:02:42.233
to introduce low-level network programming,
00:02:44.700
and to give you practice with systems programming in C.
00:02:51.666
By the end of the course
00:02:53.200
you should be able to describe and compare
00:02:55.266
the capabilities of different communications technologies,
00:02:58.633
understand the implications of scale on network systems,
00:03:02.700
and the different quality of service needs
00:03:05.400
of different applications.
00:03:08.200
In concrete terms,
00:03:09.766
this means you should understand what are
00:03:11.633
the different protocols in use in the Internet,
00:03:14.433
and when it's appropriate to use one or the other.
00:03:18.266
For example,
00:03:19.666
you should understand the differences
00:03:21.200
between UDP, TCP, and QUIC,
00:03:24.366
and when it's appropriate to use each of these protocols.
00:03:28.600
You should understand the importance of
00:03:30.366
heterogeneity in the design of the Internet,
00:03:33.266
the importance of layered protocol stacks,
00:03:36.433
and the way that different components of the
00:03:38.300
network are combined to form a whole.
00:03:42.100
Finally, you should be able to write
00:03:44.266
simple low-level communication software in C,
00:03:47.700
showing awareness of good practices
00:03:50.000
for correct and secure programming.
00:03:55.333
The course is organized as 10 lectures
00:03:58.000
and 5 laboratory sessions.
00:04:00.700
There's one lecture a week for 10 weeks,
00:04:03.233
with a pre-recorded lecture
00:04:04.866
and time for discussion each week.
00:04:07.866
Labs are more variable.
00:04:09.766
The first 2 lab exercises
00:04:11.366
are expected to take one week each,
00:04:13.700
then the next 3 exercises should take two weeks each.
00:04:17.600
The labs run for 8 weeks,
00:04:19.233
starting in week 2 of the semester.
00:04:23.733
Lecture recordings will be made available ahead of time,
00:04:27.100
and comprise an hour or so of material,
00:04:29.433
split into several shorter parts each week.
00:04:33.433
Each lecture will be accompanied
00:04:34.966
by a set of discussion questions
00:04:37.200
that will be available on Moodle
00:04:38.633
and on the course website.
00:04:41.166
These discussion questions are not assessed,
00:04:43.500
and are intended to help you understand the material.
00:04:47.500
We have a live lecture session timetabled
00:04:49.866
from 4:00 to 6:00 pm on Thursday afternoons.
00:04:53.366
This is intended for a discussion
00:04:55.100
of the lecture material and questions.
00:04:57.933
You should watch the lecture videos,
00:04:59.666
and think about the discussion questions,
00:05:01.666
before the timetabled sessions.
00:05:05.266
This discussion slot
00:05:06.700
is your main opportunity
00:05:07.833
to ask questions about the material.
00:05:10.566
I strongly encourage you to come prepared
00:05:12.766
to use the slot to talk.
00:05:15.033
The more questions and discussion we have,
00:05:17.300
the more useful it'll be for everyone.
00:05:22.666
There are also weekly labs.
00:05:25.566
The goal of the labs is twofold.
00:05:29.033
First, they are intended to help you practice
00:05:31.333
C programming,
00:05:32.700
building on the material in the systems programming course,
00:05:35.866
and to introduce you to network programming in C.
00:05:40.400
While languages like Python are popular for web development,
00:05:44.100
C is still by far the most
00:05:46.100
widely used language for writing low-level networking code.
00:05:49.766
So it's important that you're familiar with network
00:05:51.933
programming in C.
00:05:54.666
The second goal of the labs
00:05:56.466
is to complement the material covered in the lectures,
00:05:59.566
to allow you to explore certain topics
00:06:02.533
around congestion control
00:06:04.066
and the structure of the network,
00:06:05.600
in more detail.
00:06:07.800
As with the lectures,
00:06:09.366
the lab materials will be made available in advance
00:06:11.800
of the timetabled sessions.
00:06:13.966
The labs are to be completed in your own time.
00:06:17.933
The timetabled sessions,
00:06:19.533
from 1:00-3:00pm on Thursdays,
00:06:21.433
are for live support with the lab exercises.
00:06:24.766
You should try to solve the exercises
00:06:27.033
and think of questions you might need to ask
00:06:29.466
before the timetabled support slot,
00:06:31.966
so the lab demonstrators and I
00:06:33.566
can effectively help you.
00:06:39.266
This course follows the usual model
00:06:41.900
of an exam worth 80% of the marks
00:06:44.500
and an assessed exercise worth the remaining 20%.
00:06:48.800
The assessed exercise will be made available in Lecture 5,
00:06:52.233
and will be due on the same day as Lecture 7 is held.
00:06:56.733
The usual penalties for late submission will be applied,
00:06:59.700
following the University's code of assessment.
00:07:03.233
If you're ill,
00:07:04.166
or have other circumstances that might affect
00:07:06.200
your on-time submission,
00:07:08.033
then you can contact me before the deadline
00:07:09.933
to request an extension.
00:07:12.700
Also, following the School's policy,
00:07:16.066
note that submissions of assessed exercises
00:07:18.433
that do not follow the submission instructions
00:07:20.700
given in the handout
00:07:21.933
will be given a two-band penalty.
00:07:25.000
And I want to emphasize this last point,
00:07:27.166
because some people are surprised each year.
00:07:30.100
The penalties for late submission,
00:07:31.800
and for not following submission instructions,
00:07:34.333
will be strictly enforced.
00:07:40.866
As I said earlier,
00:07:42.000
the final exam is worth 80% of the marks for the course.
00:07:46.433
The format of the exam
00:07:47.966
is that there are 3 questions,
00:07:49.433
and you have to answer all questions.
00:07:52.200
There are past exam papers on Moodle,
00:07:54.466
including one with sample answers.
00:07:57.900
The curriculum has changed over the years,
00:08:00.300
and the pandemic has caused
00:08:02.833
a shift to online open book exams,
00:08:06.733
and together these mean the style of questions has changed.
00:08:10.933
The more recent exam papers,
00:08:12.800
from 2020 onwards,
00:08:14.433
are most representative of
00:08:16.033
the style of questions you can expect.
00:08:19.700
When studying the material,
00:08:21.166
and preparing for the exam,
00:08:23.000
remember that the goal of the course
00:08:24.900
is to help you understand how the Internet works.
00:08:28.133
The aim of the exam is to test that understanding,
00:08:31.600
not simply to test your memory of the details.
00:08:35.133
Explain why,
00:08:37.200
don't just recite what.
00:08:39.933
To achieve an A grade in this course,
00:08:42.600
representing excellent performance,
00:08:45.066
you'll need to demonstrate
00:08:46.233
that you can make a reasoned argument,
00:08:48.266
develop logical conclusions,
00:08:50.600
and apply your learning to new situations,
00:08:53.200
to solve new problems.
00:08:56.100
The material in the lectures and the labs is examinable,
00:08:59.533
but you're also expected to follow the required readings
00:09:02.433
and to develop your broader understanding of the material.
00:09:09.233
So what are those required readings?
00:09:12.833
Well, the slide lists four good books on computer networking.
00:09:17.500
You should read one of them.
00:09:19.866
The books by Peterson and Davie,
00:09:21.900
and by Olivier Bonaventure,
00:09:23.866
are available for free online.
00:09:27.633
The lectures, labs, and discussion
00:09:29.900
can introduce you to the important concepts of networking,
00:09:33.700
but it's also important that you read about,
00:09:36.000
and around, the material covered,
00:09:38.100
so you understand the details.
00:09:40.900
The books explain the material in different ways,
00:09:44.366
and may help illustrate different aspects of the subject,
00:09:47.533
and the different perspectives might make things clear
00:09:50.000
in cases where my explanations do not.
00:09:54.733
In addition to the books,
00:09:56.533
many of the lecture slides
00:09:57.900
also include links to standards documents,
00:10:00.366
research papers, blog posts,
00:10:02.533
or talks that go into more detail about the material.
00:10:07.466
You're not expected to read all of this material,
00:10:10.233
and much of it goes into far more detail than you need,
00:10:13.300
but you should at least look at it
00:10:15.066
to begin to get a feel for the depth of the subject.
00:10:19.866
There's only so much depth
00:10:21.566
that can be covered in a pre-recorded lecture.
00:10:24.666
I've chosen the reading material carefully
00:10:26.966
to complement the lectures.
00:10:29.333
Following along with the reading,
00:10:31.100
and participating in the discussion sessions,
00:10:33.500
will further your understanding.
00:10:37.200
The exam questions
00:10:38.400
will focus on application of the knowledge
00:10:40.466
you gained during the course,
00:10:41.866
not on rote memorization.
00:10:44.766
Reading further,
00:10:46.000
thinking about the ideas,
00:10:47.333
and discussing the material
00:10:48.633
covered in the lectures and the labs is essential.
00:10:55.033
That concludes the review
00:10:56.233
of the course structure and administration.
00:10:59.366
In the remaining parts of this lecture,
00:11:01.433
I'll review the traditional Internet architecture,
00:11:04.200
starting with a discussion of protocols and layering.
00:11:07.466
Then I'll talk about some of the ways in which the network is changing.
Part 2: Protocols and Layers
Part 2 of the lecture introduces the idea of a network protocol, and
the concept of layering as a way of structuring networked systems. It
reviews the 7-layer OSI model, and discusses how it's a useful way of
thinking about systems, even though it's not representative of any
real-world systems.
Slides for part 2
00:00:01.333
I’d like to begin the course by reviewing
00:00:03.366
some of the fundamental principles
00:00:05.100
of networked systems.
00:00:06.900
In particular, I'd like to talk briefly
00:00:09.466
about what is a networked system,
00:00:11.366
and how networked systems are structured in
00:00:13.600
the form of a layered protocol stack.
00:00:17.000
So, what is a networked system?
00:00:20.433
A networked system is a set of cooperating,
00:00:23.233
autonomous, computing devices
00:00:25.433
that exchange data to perform some application goal.
00:00:29.600
I talk about computing devices,
00:00:31.800
because networked systems are not limited to traditional
00:00:34.500
PCs, laptops, or servers.
00:00:37.366
There are far more smart phones and tablets
00:00:39.566
connected to the network
00:00:40.733
than there are laptops and servers, for example.
00:00:44.000
But there are also numerous sensors and controllers
00:00:46.466
that form the Internet of Things:
00:00:48.500
cameras, smart light bulbs, heating controllers,
00:00:51.633
weather stations, medical devices,
00:00:54.566
industrial automation, and so on.
00:00:57.633
The network is comprised of an increasingly
00:00:59.966
diverse set of devices, and has to
00:01:02.166
meet their diverse needs.
00:01:05.000
These devices are autonomous.
00:01:07.166
Each device acts independently
00:01:09.233
with no central control,
00:01:11.100
and can choose what data to send,
00:01:13.133
and when to send it.
00:01:15.333
And these devices are all running different applications
00:01:18.233
with different requirements.
00:01:20.066
They require different things from the network
00:01:22.433
and need different network protocols
00:01:24.466
to support their needs.
00:01:28.000
There are four aspects to the network.
00:01:30.900
The first is communication.
00:01:33.233
How can two devices,
00:01:34.800
that are connected to a single link,
00:01:37.100
reliably exchange data?
00:01:39.966
Second, is networking.
00:01:42.133
How can we combine multiple links
00:01:45.033
to form a wide area network,
00:01:46.866
around a building, a campus, or a region?
00:01:51.200
Third is internetworking.
00:01:53.433
How can we connect multiple networks
00:01:56.133
together to form an internet?
00:01:58.166
How can multiple independently operated
00:02:00.866
networks work together
00:02:02.466
to act as if they were a single global network?
00:02:05.233
How can data be routed across this collection
00:02:07.900
of networks to reach its destination?
00:02:10.666
And, finally, there is the problem of transport.
00:02:14.000
How do the end systems ensure
00:02:16.366
that data is delivered across this network,
00:02:18.333
or networks, with appropriate reliability
00:02:21.366
to meet the needs of the application?
00:02:26.366
Networked systems are fundamentally
00:02:28.266
about communications protocols.
00:02:31.100
A sender is trying to talk to one or more receivers,
00:02:34.233
via some communications channel.
00:02:36.833
It does this by sending messages,
00:02:39.233
that have to fit within the constraints
00:02:40.833
of the communications channel.
00:02:43.166
These constraints provide limits on the speed
00:02:45.700
and reliability of transmission.
00:02:47.866
The channel is not infinitely fast
00:02:50.100
or perfectly reliable.
00:02:53.566
The messages being sent across the channel
00:02:55.966
have a well defined format
00:02:58.300
Much like programming languages,
00:03:00.333
network protocols have well defined syntax
00:03:02.900
that describes the structure and format
00:03:04.833
of the messages that can be sent.
00:03:07.766
They also have well defined semantics
00:03:09.766
that define the meaning of the messages,
00:03:12.066
the order in which they are sent,
00:03:13.733
and the patterns of communication.
00:03:16.600
Together the syntax and semantics of the messages
00:03:19.833
define a network protocol, such as
00:03:23.566
HTTP, TCP/IP, etc.
00:03:27.000
Each protocol has a particular purpose
00:03:29.466
and solves a particular problem.
00:03:32.100
Protocols can be combined,
00:03:34.100
and layered on one another,
00:03:35.700
to gradually raise the level of abstraction,
00:03:38.400
and provide more sophisticated services
00:03:40.533
to the applications.
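To make the split between syntax and semantics a little more concrete, here is a minimal sketch in Python. The message layout, the type codes, and the function names are all made up for illustration and do not correspond to any real protocol. The `encode` and `decode` functions capture the syntax (which fields appear, in what order and size), while `respond` captures a tiny piece of semantics (a request is answered by a reply).

```python
import struct

# Hypothetical header layout, invented for this sketch:
#   1 byte message type, 1 byte flags, 2 byte payload length,
#   all in network (big-endian) byte order.
HEADER = struct.Struct("!BBH")

MSG_REQUEST = 1  # assumed type codes, not from any real protocol
MSG_REPLY = 2


def encode(msg_type: int, flags: int, payload: bytes) -> bytes:
    """Syntax: lay the fields out in the agreed order and sizes."""
    return HEADER.pack(msg_type, flags, len(payload)) + payload


def decode(data: bytes):
    """Parse a message, rejecting anything that breaks the format."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    msg_type, flags, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:]
    if len(payload) != length:
        raise ValueError("length field does not match payload")
    return msg_type, flags, payload


def respond(data: bytes) -> bytes:
    """Semantics: a request is answered by a reply echoing its payload."""
    msg_type, flags, payload = decode(data)
    if msg_type != MSG_REQUEST:
        raise ValueError("only requests get a reply")
    return encode(MSG_REPLY, flags, payload)
```

For example, `encode(MSG_REQUEST, 0, b"hi")` produces a six byte message: four bytes of header followed by the two payload bytes.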
00:03:44.900
The Open Systems Interconnection Reference Model,
00:03:48.133
the OSI Model,
00:03:49.933
is a common way of thinking about protocol layering.
00:03:54.366
The OSI model structures the network
00:03:56.500
as a set of seven layers.
00:03:59.300
At the bottom of the protocol stack
00:04:01.266
is the physical layer.
00:04:03.100
For two devices, two end systems,
00:04:06.200
that are directly connected, as shown on the slide,
00:04:09.566
the physical layer represents
00:04:11.333
the means of interconnection.
00:04:12.933
The type of cable,
00:04:14.333
if it’s a wired link,
00:04:15.800
or the details of the radio channel.
00:04:18.866
The purpose of the physical layer
00:04:20.566
is to exchange data between the devices.
00:04:24.766
Above this sits the data link layer.
00:04:28.366
The link layer structures that data into messages,
00:04:31.233
identifies the devices,
00:04:33.866
and coordinates when each device can send,
00:04:36.466
so they share access to the channel.
00:04:41.266
If a single network comprises more than one
00:04:43.166
physical link, devices known as switches
00:04:45.966
can be inserted into the network.
00:04:48.233
Switches operate at the data link layer,
00:04:51.466
adapting the message framing and arbitrating access
00:04:54.466
to the different channels, in order to
00:04:56.633
bridge the different links together.
00:04:59.966
Examples of this include Ethernet switches,
00:05:02.900
that connect multiple Ethernet links together
00:05:05.400
to form a larger Ethernet network,
00:05:08.200
and Wi-Fi base stations that bridge between Ethernet
00:05:11.033
and Wi-Fi networks.
00:05:15.966
The third layer in the OSI model
00:05:18.100
is the network layer.
00:05:20.500
The network layer supports the interconnection
00:05:22.800
of multiple independently operated networks.
00:05:26.466
It abstracts away the details
00:05:27.900
of the different types of link layer,
00:05:29.833
and combines different networks into one,
00:05:32.933
to give the illusion
00:05:33.966
that there is a single global internetwork.
00:05:37.166
The Internet protocols,
00:05:38.566
IPv4 and IPv6,
00:05:40.833
are examples of network layer protocols,
00:05:43.566
and the devices that connect different networks
00:05:45.633
together are known as routers.
00:05:49.533
Above the network layer,
00:05:51.300
the transport layer ensures that data is delivered
00:05:53.766
with appropriate reliability between end systems.
00:05:57.100
The Transmission Control Protocol,
00:05:59.333
TCP,
00:06:00.333
is a widely used example of a transport protocol.
00:06:04.533
Finally, the Session, Presentation, and Application layers
00:06:09.066
support the coordination of multiple transport connections,
00:06:12.566
describe the data formats used,
00:06:14.733
and support application semantics.
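One way to picture the layering described here is as nested encapsulation: each layer wraps the data from the layer above with its own header on the way down, and the receiver strips the headers in the opposite order. The sketch below is a deliberately simplified illustration. The bracketed text tags stand in for real headers, and the function names are hypothetical.

```python
# Illustrative layer tags only. Real headers carry addresses, checksums,
# sequence numbers, and so on, rather than a text label.
LAYERS = ["transport", "network", "link"]  # order in which headers are added


def encapsulate(app_data: bytes) -> bytes:
    """Sender: each layer prepends its header to what the layer above gave it."""
    frame = app_data
    for layer in LAYERS:
        frame = f"[{layer}]".encode() + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver: strip headers outermost first, checking each is present."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame
```

The outermost header belongs to the lowest layer, so the link layer header is the first thing on the wire and the application data is innermost.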
00:06:20.400
The OSI reference model is a standard
00:06:22.466
model for layered protocol design.
00:06:25.066
It’s important to realise, though,
00:06:27.566
that real networks don’t follow the OSI model.
00:06:31.133
No deployed networked system has seven layers
00:06:34.366
structured in this way.
00:06:36.700
Real systems are more complex.
00:06:39.066
In some cases, especially around the upper layers of
00:06:42.266
the stack, the layers merge together,
00:06:44.700
and the boundaries between them are unclear.
00:06:48.266
In other cases, systems are designed with
00:06:50.766
more layers, or with clearly defined sub-layers,
00:06:54.266
to support features that don't fit neatly
00:06:56.433
into the layer boundaries defined by the OSI model.
00:07:00.433
Tunnelling solutions, that encapsulate data from a
00:07:02.900
lower layer, and transmit it within a higher layer
00:07:05.566
are also common.
00:07:06.766
Virtual Private Networks,
00:07:08.733
VPNs, are a good example of this,
00:07:11.000
and take network layer data,
00:07:13.033
in the form of IP packets,
00:07:14.700
and tunnel them inside a transport layer
00:07:16.700
connection to some other part of the network,
00:07:19.166
giving the illusion that a device
00:07:21.166
is physically connected elsewhere.
00:07:22.900
There’s not a strict progression of layers.
00:07:26.700
We talk about the OSI model
00:07:28.600
because it's an extremely useful way to structure
00:07:30.933
discussion about networks,
00:07:32.566
not because it represents the structure
00:07:34.566
of any particular network.
00:07:38.266
To conclude, the network is a collection
00:07:41.433
of autonomous computing devices,
00:07:43.566
cooperating to exchange
00:07:45.400
messages to support application needs.
00:07:47.866
Those messages are structured in the form
00:07:50.666
of a layered protocol stack,
00:07:52.066
gradually building up features
00:07:54.233
from the exchange of data between directly
00:07:56.233
connected devices
00:07:57.466
until we have the global Internet.
00:08:00.200
The seven layer OSI model doesn’t reflect reality,
00:08:03.300
but is a useful way of thinking
00:08:05.500
about the protocol stack.
00:08:07.333
For this reason, I’ll use it to structure the
00:08:10.000
discussion in the other parts of this lecture
00:08:11.933
starting, in the next part,
00:08:14.000
with a discussion of the physical
00:08:15.600
and data link layers.
Part 3: Physical and Data Link Layers
Part 3 of the lecture reviews the physical and data link layers.
It briefly reviews baseband data encoding, carrier modulation,
and spread spectrum communication. It talks about the limitations
of physical links, the Shannon-Hartley theorem, and the factors that
limit the performance of a communications channel.
It also briefly reviews the data link layer, talking about framing,
addressing, and media access control.
Slides for part 3
00:00:00.900
The physical and data link layers
00:00:03.100
occupy the lowest levels of the protocol stack,
00:00:05.600
and support data transmission
00:00:07.366
between directly connected devices.
00:00:10.066
In the following, I’ll briefly discuss the physical
00:00:12.566
characteristics of network links
00:00:14.366
and the modulation process by which data
00:00:16.266
is transmitted across those links.
00:00:18.500
Then, I’ll talk about features of the data link layer,
00:00:21.266
such as framing, addressing, and media access control.
00:00:27.300
The physical layer describes the properties of
00:00:29.900
the communications channel.
00:00:31.200
It’s the realm of the electrical engineer;
00:00:33.566
of cables, optical fibres,
00:00:35.533
and radio transmission.
00:00:37.666
When considering the physical layer,
00:00:39.666
we discuss the physical properties of the channel,
00:00:42.500
the way bits are encoded onto the channel,
00:00:44.733
and the capacity, and error rate, of the channel.
00:00:48.066
We begin by asking whether the link
00:00:50.066
is wired or wireless?
00:00:52.866
If it is a wired link,
00:00:54.433
we then consider the type of cable
00:00:56.366
or optical fibre used.
00:00:58.000
We ask what voltage is applied to the cable,
00:01:00.700
or what frequency of laser light
00:01:02.566
is sent down the fibre.
00:01:04.366
And we ask how the bits are encoded as
00:01:06.100
variations in that voltage
00:01:07.600
or in the intensity of the laser.
00:01:10.233
If, instead, the channel is wireless,
00:01:12.800
we consider what type of antenna is used,
00:01:15.300
the transmission power, the carrier frequency,
00:01:17.300
and the modulation scheme used to encode
00:01:19.466
data onto the carrier wave.
00:01:22.133
Given these details, we can then estimate
00:01:24.200
the capacity of the channel, and model
00:01:26.533
the physical limitations of the performance
00:01:28.566
of the transmission link.
00:01:32.733
When using a wired link,
00:01:34.533
whether an electrical cable or an optical fibre,
00:01:37.266
the signal to be transmitted is usually
00:01:39.466
directly encoded onto the channel.
00:01:41.966
That is, the voltage applied to an electrical cable,
00:01:44.900
or the brightness of a laser shining down an optical fibre,
00:01:48.266
is changed in a way that directly corresponds to the
00:01:50.733
signal to be transmitted.
00:01:53.433
The signal will occupy a certain frequency
00:01:55.600
range, known as its bandwidth.
00:01:58.200
This is measured in units of Hertz, Hz,
00:02:00.733
and directly corresponds to the complexity and
00:02:03.033
information content of the signal.
00:02:06.100
The more information being transmitted in a given
00:02:08.066
time interval, the greater the bandwidth.
00:02:11.200
A signal directly applied to a channel
00:02:13.366
occupies what’s known as the baseband frequency range:
00:02:15.933
the range starting at 0 Hz,
00:02:18.066
and reaching up to the maximum bandwidth of the signal.
00:02:21.833
Every channel also has a maximum bandwidth it can transmit.
00:02:26.100
This depends on the physical characteristics of the channel.
00:02:29.000
For example, the maximum bandwidth that can be sent
00:02:31.400
over a twisted pair electrical cable
00:02:33.466
depends on the length of the cable,
00:02:35.300
the tightness of the twists,
00:02:36.900
and the thickness of the wires.
00:02:39.333
A channel is only able to transmit a particular signal
00:02:42.266
if the bandwidth of the signal
00:02:43.933
is less than the bandwidth of the channel.
00:02:46.300
There’s a maximum rate at which data can be transmitted,
00:02:49.000
depending on the physical characteristics of the channel.
00:02:52.766
That maximum rate is determined by Nyquist’s theorem.
00:02:56.533
This states that the maximum data rate
00:02:58.433
a channel can support,
00:02:59.866
Rmax,
00:03:01.033
cannot exceed 2B log2(V) bits per second,
00:03:07.133
where B is the maximum bandwidth of the channel,
00:03:10.166
and V is the number of different values each symbol can take.
00:03:14.333
For binary data, each symbol can have
00:03:16.700
one of two values, it can be a zero
00:03:18.800
or a one. Accordingly, V is equal to two.
00:03:23.100
The value of the “log2(V)” term, in this case,
00:03:25.966
evaluates to one,
00:03:27.400
and the maximum data rate is simply
00:03:29.333
twice the channel bandwidth.
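As a quick worked example of the Nyquist limit, the sketch below evaluates Rmax = 2B log2(V) for a given bandwidth and number of symbol values. The function name and the 3000 Hz figure are just illustrative choices, not values from the lecture.

```python
import math


def nyquist_max_rate(bandwidth_hz: float, levels: int) -> float:
    """Upper bound on data rate, in bits per second, for a noise-free channel.

    bandwidth_hz is B, the channel bandwidth in Hz.
    levels is V, the number of distinct values each symbol can take.
    """
    if bandwidth_hz <= 0 or levels < 2:
        raise ValueError("need a positive bandwidth and at least two levels")
    return 2 * bandwidth_hz * math.log2(levels)
```

With binary symbols (V = 2) the log term is 1, so an assumed 3000 Hz channel gives at most 6000 bits per second. Moving to four values per symbol doubles that to 12000 bits per second, since each symbol then carries two bits.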
00:03:34.000
So, how is data encoded onto the channel?
00:03:38.433
The simplest case is what’s known as
00:03:40.833
non-return to zero, NRZ, encoding,
00:03:43.666
as shown in the top figure on the slide.
00:03:46.833
When sending binary data, NRZ encoding
00:03:50.433
directly encodes the signal onto the channel.
00:03:53.433
If the binary value 1 is to
00:03:54.933
be sent over an electrical cable,
00:03:56.600
for example, a high voltage is applied
00:03:58.900
to the cable, whereas a low voltage
00:04:01.166
is applied to send the binary value zero.
00:04:03.633
The receiver simply measures the voltage,
00:04:05.700
and directly translates it into binary values.
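A minimal sketch of NRZ encoding, assuming one voltage sample per bit, might look like the following. The voltage levels and the decision threshold are assumptions chosen for illustration; real links define their own.

```python
HIGH_V = 5.0     # assumed voltage for a binary 1
LOW_V = 0.0      # assumed voltage for a binary 0
THRESHOLD = 2.5  # receiver decides 1 above this, 0 otherwise


def nrz_encode(bits):
    """Map each bit directly to a voltage level, one sample per bit."""
    return [HIGH_V if b else LOW_V for b in bits]


def nrz_decode(samples):
    """Measure each sample against the threshold to recover the bits."""
    return [1 if v > THRESHOLD else 0 for v in samples]
```

Because the receiver only compares against a threshold, modest noise on the line, such as receiving 4.2 V instead of 5.0 V, still decodes correctly.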
00:04:10.600
Non-return to zero encoding is simple,
00:04:12.833
but has the problem that long runs
00:04:14.833
of ones or zeros result in signals
00:04:17.100
that maintain the same value for long periods of time.
00:04:20.666
The example has a run of four consecutive one bits,
00:04:23.733
followed a little later by a run of four consecutive zeros,
00:04:27.133
and each results in a signal that’s unchanging
00:04:29.400
for a significant period of time.
00:04:32.933
At high data rates, and with long
00:04:35.133
consecutive sequences of the same value,
00:04:37.166
it can be difficult to measure the exact
00:04:39.033
time for which the signal is unchanging.
00:04:41.666
This leads to miscounting, where the receiver
00:04:44.066
believes one more, or one less, bit was sent than intended,
00:04:47.600
giving a corrupt signal.
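The miscounting problem can be reproduced with a small simulation. This is a sketch under strong simplifying assumptions: the sender holds each bit for exactly 10 samples, and the receiver samples using its own slightly fast clock and never resynchronises. All names and numbers are illustrative.

```python
def transmit(bits, samples_per_bit=10):
    """Sender: hold each bit's level for samples_per_bit samples."""
    return [b for b in bits for _ in range(samples_per_bit)]


def receive(waveform, receiver_period=10.0):
    """Receiver: sample once per receiver_period, aiming for mid-bit.

    receiver_period is the receiver's idea of the bit length, in samples.
    If it differs from the sender's, the sampling points drift.
    """
    bits = []
    k = 0
    while (k + 0.5) * receiver_period < len(waveform):
        bits.append(waveform[int((k + 0.5) * receiver_period)])
        k += 1
    return bits
```

With a run of 20 ones followed by 20 zeros, a receiver whose clock matches the sender recovers the bits exactly. A receiver running 5% fast (a period of 9.5 samples) reads 21 ones and 21 zeros, one extra bit in each run, because there are no transitions within the runs to correct the drift.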
00:04:50.300
To avoid this, more complex encodings are used.
00:04:54.500
One such scheme is Manchester Encoding.
00:04:57.433
This encodes every bit to be sent as a pair of values.
00:05:01.633
A binary 1 is sent as a high-to-low transition
00:05:04.600
in the signal strength, whereas a binary 0
00:05:07.466
is sent as a low-to-high transition.
00:05:10.366
This avoids the miscounting problem of NRZ encoding,
00:05:13.600
since the signal always changes
00:05:15.433
irrespective of the data being sent,
00:05:17.700
but at the cost of requiring twice as many transitions,
00:05:20.700
and hence using twice the bandwidth.
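Here is a minimal sketch of Manchester encoding using the convention described above, where a 1 is a high-to-low transition and a 0 is low-to-high. Other descriptions of Manchester encoding use the opposite polarity, so treat the mapping as an assumption of this example. Signal levels are represented abstractly as 1 (high) and 0 (low), two per bit.

```python
HIGH, LOW = 1, 0


def manchester_encode(bits):
    """Emit two half-bit levels per bit: 1 -> high,low and 0 -> low,high."""
    levels = []
    for b in bits:
        levels.extend((HIGH, LOW) if b else (LOW, HIGH))
    return levels


def manchester_decode(levels):
    """Recover bits from pairs of half-bit levels."""
    if len(levels) % 2:
        raise ValueError("odd number of half-bit levels")
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("no mid-bit transition")
        bits.append(1 if (first, second) == (HIGH, LOW) else 0)
    return bits
```

Every bit contains a transition in the middle, so the signal never holds the same level for more than two half-bit periods, however long the run of identical bits. The output is twice as long as the input, which is the bandwidth cost mentioned above.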
00:05:23.800
There are many different methods of baseband data encoding,
00:05:27.666
of which the NRZ and Manchester encodings are the simplest.
00:05:31.800
The different encodings trade off increased complexity
00:05:34.600
for better performance.
00:05:39.633
When encoding data onto a wireless channel,
00:05:42.300
carrier modulation is used rather than baseband encoding.
00:05:46.433
This allows multiple signals to be
00:05:48.166
carried on a single channel,
00:05:49.766
each modulated onto a carrier wave
00:05:52.066
transmitted at a different frequency.
00:05:55.233
Carrier modulation shifts the frequency range occupied
00:05:58.166
by a signal up, so it’s centred on a carrier frequency.
00:06:01.633
Instead of occupying the baseband range, from 0 to B Hz,
00:06:05.500
the signal is shifted to occupy
00:06:07.233
a frequency range centred on the carrier frequency.
00:06:10.566
This is done by varying some property of the carrier wave
00:06:13.133
to match the signal being sent.
00:06:15.400
The receiver tunes into the carrier frequency,
00:06:18.133
and measures the variation in the carrier wave
00:06:20.366
to extract the signal.
00:06:23.133
In much the same way as wired links,
00:06:25.266
there are limitations in the bandwidth
00:06:26.833
of the signal that can be sent,
00:06:28.400
depending on the carrier frequency,
00:06:30.266
the type of antenna used,
00:06:31.933
the transmission power,
00:06:33.500
the modulation scheme, etc.
00:06:37.000
Broadcast radio stations use carrier modulation to
00:06:39.800
transmit on different frequencies.
00:06:41.833
The principle is the same
00:06:43.533
for digital transmission, except that
00:06:45.600
it’s data being transmitted, rather than music.
00:06:51.466
There are three types of carrier modulation that can be used.
00:06:55.866
Amplitude modulation varies the loudness of the carrier
00:06:58.900
wave to directly match the signal being transmitted.
00:07:02.566
AM radio works the same way.
00:07:05.666
Amplitude modulation is simple,
00:07:07.733
but can perform poorly,
00:07:09.266
because radio noise takes the form
00:07:11.100
of changes in the loudness of
00:07:12.966
the signal that corrupt the received data.
00:07:16.766
Frequency modulation varies the frequency of
00:07:19.533
the carrier to match the signal.
00:07:21.900
In the example on the slide,
00:07:23.633
it switches between a high frequency,
00:07:25.533
that corresponds to binary zeros,
00:07:27.333
and a lower frequency that corresponds
00:07:29.233
to binary ones.
00:07:30.700
FM radio works in a similar way,
00:07:33.166
varying the frequency to match
00:07:34.766
the speech or music being transmitted.
00:07:37.700
This is slightly more complex than amplitude modulation,
00:07:40.766
but more resistant to noise and interference.
00:07:44.400
Finally, phase modulation shifts forwards or backwards
00:07:47.633
in the cycle of the waveform to indicate different symbols.
00:07:51.566
Real systems tend to use a combination
00:07:53.600
of modulation techniques, perhaps varying both the
00:07:56.566
amplitude and phase, to increase the data rate.
00:08:02.333
Radio signals modulated onto a carrier
00:08:04.333
wave are prone to interference.
00:08:06.933
This is because signals sent at particular frequencies
00:08:09.600
tend to be blocked by vehicles, trees,
00:08:11.900
and people moving around, or by the weather
00:08:14.566
and other radio transmissions,
00:08:16.166
while signals sent on a carrier at a different
00:08:18.633
frequency may be unaffected.
00:08:21.000
The strength of this interference can change rapidly.
00:08:25.166
To avoid this problem, many wireless links
00:08:28.000
use a technique known as spread spectrum communication,
00:08:31.166
where the carrier frequency is changed
00:08:33.000
several times per second, following a pseudo-random sequence
00:08:36.200
known to both the sender and the receiver.
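One way to picture how both ends agree on the hopping pattern is the sketch below. It assumes the sender and receiver share a seed in advance; the function name, the seed, and the channel numbers are all hypothetical, and Python's general-purpose random generator stands in for whatever sequence generator a real radio would use.

```python
import random


def hop_sequence(shared_seed, channels, hops):
    """Return the pseudo-random list of channels to use for the next `hops` hops.

    Both ends call this with the same seed, so they derive the same sequence
    without exchanging it over the air.
    """
    rng = random.Random(shared_seed)   # deterministic for a given seed
    return [rng.choice(channels) for _ in range(hops)]


# Sender and receiver compute the sequence independently.
channels = [1, 6, 11]                  # illustrative channel numbers
sender_hops = hop_sequence(42, channels, 8)
receiver_hops = hop_sequence(42, channels, 8)
```

Because the two sequences are identical, the receiver can retune to each new carrier frequency at the same moment as the sender.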
00:08:39.766
Spread spectrum communication limits
00:08:41.900
the impact of interference,
00:08:43.533
because the transmission will quickly switch away
00:08:46.433
from a poorly performing channel.
00:08:48.500
It adds a lot of complexity,
00:08:50.466
since both sender and receiver need to
00:08:52.700
continually change the carrier frequency,
00:08:55.066
and synchronise what frequencies they use,
00:08:57.666
but greatly improves performance.
00:09:00.666
The concept of spread spectrum communication
00:09:03.600
was invented by Hedy Lamarr,
00:09:05.500
a Hollywood actress turned inventor,
00:09:07.333
during World War 2.
00:09:09.166
It’s now widely used as part of the Wi-Fi standards.
00:09:13.833
The impact of noise on the data rate
00:09:16.400
achievable over a particular channel can be predicted.
00:09:19.800
This is the case whether that noise is due to electrical
00:09:22.166
or radio interference, imperfections in an optical fibre,
00:09:25.366
or other means.
00:09:27.433
In the simplest case, where it’s assumed that
00:09:30.433
the noise affects all frequencies used by
00:09:32.233
the transmission to the same extent,
00:09:34.300
the maximum data rate of the channel
00:09:36.400
can be determined using the Shannon-Hartley theorem.
00:09:39.433
This states that the maximum data rate,
00:09:41.866
Rmax, is equal to the bandwidth of
00:09:44.133
the channel, B, multiplied by the base-2 log
00:09:47.333
of 1 plus the strength of the signal, S,
00:09:49.900
divided by the amount of noise, N.
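Written out, the theorem is Rmax = B × log2(1 + S/N), with B in Hz and the result in bits per second. A small Python sketch of the calculation follows; the function and parameter names are illustrative, and S and N are assumed to be given as linear power values in the same units, not in decibels.

```python
import math


def shannon_capacity(bandwidth_hz, signal_power, noise_power):
    """Maximum data rate in bits per second: B * log2(1 + S/N)."""
    if bandwidth_hz <= 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("bandwidth and noise must be positive, signal non-negative")
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)


# A 1 MHz channel where the signal is three times as strong as the noise:
# log2(1 + 3) = 2, so the limit is 2 bits per second per Hz of bandwidth.
rate = shannon_capacity(1_000_000, 3.0, 1.0)
```

Doubling the bandwidth doubles the limit, whereas doubling the signal power gives a much smaller gain because of the logarithm.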
00:09:53.266
The bandwidth and the amount of noise
00:09:55.300
depend on the channel and the environment.
00:09:58.633
For example, for a wireless link,
00:10:00.733
they depend on the carrier frequency,
00:10:02.733
the type of antenna, the weather,
00:10:04.833
the presence of obstacles between the sender and receiver,
00:10:07.533
and whether there are other
00:10:08.633
simultaneous radio transmissions.
00:10:11.400
The signal strength depends on the amount of power
00:10:14.066
applied at the transmitter.
00:10:16.233
This allows the sender to trade off
00:10:18.066
battery life for performance,
00:10:20.033
saving power by transmitting more slowly.
00:10:26.500
We have seen that the physical layer enables communication.
00:10:29.566
It allows the sender and receiver to exchange
00:10:32.133
a sequence of bits of data across a channel,
00:10:34.033
but assigns no meaning to those bits.
00:10:37.600
The data link layer starts to provide
00:10:39.733
structure to the bitstream provided by the physical layer.
00:10:43.200
It provides framing,
00:10:44.600
splitting the bitstream into individual messages,
00:10:47.366
and gives the ability to detect,
00:10:49.366
and possibly correct,
00:10:50.600
errors in the transmission of those messages.
00:10:53.966
It provides addressing,
00:10:55.566
giving each device an identifier
00:10:57.466
that can be used to indicate
00:10:58.600
what device sent the message
00:11:00.533
and what device, or devices,
00:11:02.566
should act on the message.
00:11:04.766
And, finally, the data link layer provides
00:11:07.233
media access control.
00:11:09.033
It arbitrates access to the channel,
00:11:11.300
to make sure that more than one device
00:11:13.133
doesn’t try to send at the same time,
00:11:15.033
and to ensure that each gets its fair share to transmit.
00:11:21.533
A key role of the data link layer
00:11:23.900
is to separate the bitstream into
00:11:25.233
meaningful frames of data,
00:11:27.500
and to identify the devices that are
00:11:29.066
sending and receiving those frames.
00:11:32.066
For example, if we consider an Ethernet link,
00:11:35.666
the bitstream is split up into frames
00:11:38.166
that contain a number of different elements.
00:11:41.233
First is the start code.
00:11:43.533
This is a preamble, containing a particular pattern
00:11:46.033
that only occurs at the start of a message,
00:11:48.500
and is used to alert the receiver
00:11:50.100
that a new message is starting.
00:11:53.166
This is followed by some header information.
00:11:55.966
The header comprises a source address,
00:11:58.033
specifying the identity of the device sending the frame,
00:12:01.033
and a destination address that identifies the receiver.
00:12:04.666
These are followed by a length field,
00:12:06.366
indicating the amount of data to follow.
00:12:09.766
The data comes next, up to 1500 bytes in length.
00:12:14.633
And, finally, a cyclic redundancy code
00:12:17.166
concludes the packet,
00:12:18.300
allowing the receiver to check
00:12:19.600
if the frame was received correctly.
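The frame structure just described can be sketched as follows. This is a simplified illustration, not the real Ethernet wire format: the start code is omitted, the fields are laid out in the order they are listed above, and the CRC is computed with Python's zlib.crc32 as a stand-in for the frame check. The function names are hypothetical.

```python
import struct
import zlib

MAX_PAYLOAD = 1500   # maximum amount of data in one frame, in bytes


def build_frame(src, dst, payload):
    """Build a simplified frame: addresses, length, data, then a CRC."""
    if len(src) != 6 or len(dst) != 6:
        raise ValueError("addresses must be 6 bytes (48 bits)")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one frame")
    header = src + dst + struct.pack("!H", len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + struct.pack("!I", crc)


def parse_frame(frame):
    """Split a simplified frame into its fields and check the CRC."""
    src, dst = frame[0:6], frame[6:12]
    (length,) = struct.unpack("!H", frame[12:14])
    payload = frame[14:14 + length]
    (crc,) = struct.unpack("!I", frame[14 + length:18 + length])
    received_ok = zlib.crc32(frame[:14 + length]) == crc
    return src, dst, payload, received_ok
```

If any bit of the header or data is flipped in transit, the recomputed CRC no longer matches the transmitted one and the receiver can discard the frame.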
00:12:23.733
The start code provides for synchronisation
00:12:26.233
and timing recovery.
00:12:28.000
It’s a regular pattern that’s only
00:12:29.533
sent at the start of a frame,
00:12:31.033
and allows the receiver to precisely
00:12:32.900
measure the speed at which the frame is being sent.
00:12:38.300
The source and destination addresses
00:12:41.433
identify the devices sending and receiving the message.
00:12:46.200
Each is 48 bits, six bytes, in size,
00:12:49.300
and is globally unique.
00:12:51.633
The addresses are split into two 24-bit parts,
00:12:54.533
one indicating the vendor, and one indicating the device.
00:12:58.933
In this example,
00:13:00.266
the vendor ID of 00:14:51 indicates Apple,
00:13:05.100
and the device ID of 04:27:ea
00:13:08.666
indicates the laptop on which I’m recording this lecture.
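Splitting an address into its two 24-bit halves is straightforward, as this small sketch shows. The function name is hypothetical, and the address is assumed to be written as six colon-separated hexadecimal bytes, as in the example above.

```python
def split_mac(address):
    """Split a 48-bit address into its (vendor, device) halves."""
    octets = address.lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError("expected six colon-separated bytes")
    for octet in octets:
        int(octet, 16)   # raises ValueError if not hexadecimal
    vendor = ":".join(octets[:3])   # first 24 bits: vendor ID
    device = ":".join(octets[3:])   # last 24 bits: device ID
    return vendor, device


vendor, device = split_mac("00:14:51:04:27:ea")
```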
00:13:12.933
Modern operating systems are starting to randomly
00:13:15.433
change the Ethernet addresses each time they
00:13:17.500
connect to the network, to limit tracking
00:13:19.766
and improve privacy.
00:13:22.833
Finally, the data part of an Ethernet frame
00:13:25.666
contains data for the next layer
00:13:27.366
up in the protocol stack, the network layer.
00:13:32.933
The last feature of the data link layer
00:13:35.166
that I want to discuss is media access control.
00:13:38.966
If you have a channel that’s shared
00:13:40.400
between multiple devices, such as a
00:13:42.600
common wireless link or a shared cable,
00:13:44.800
then there’s the risk that two devices
00:13:46.966
can try to send at once.
00:13:49.233
In this slide, for example, devices
00:13:51.566
A and B both try to send a message
00:13:53.566
to device C at the same time.
00:13:56.766
The centre image shows the signals sent
00:13:59.400
by the devices A and B,
00:14:01.300
each of which is sending using NRZ encoding.
00:14:04.966
These are entirely normal signals.
00:14:07.900
The right-hand image shows what’s received at C.
00:14:11.200
This is the superposition of the two signals,
00:14:13.966
the result of adding the two signals together,
00:14:16.433
and is corrupt and meaningless to the receiver.
00:14:20.166
Media access control
00:14:21.600
is the problem of avoiding such collisions.
00:14:26.400
A common way to perform media access
00:14:28.733
control is using a technique known as
00:14:30.433
carrier sense multiple access with collision detection,
00:14:34.066
CSMA/CD.
00:14:36.666
The idea is that when a device wants to send,
00:14:39.466
it first listens to see if another
00:14:41.466
device is sending already.
00:14:43.766
If another transmission is active,
00:14:45.600
then it waits before trying again.
00:14:48.666
If it doesn’t hear anything, it starts to send data.
00:14:52.666
While sending, it listens to see if another
00:14:54.866
device also starts to send.
00:14:57.300
If such a collision occurs,
00:14:58.900
the device stops sending,
00:15:00.600
waits, and tries again.
00:15:03.400
Collisions don’t usually happen,
00:15:05.300
because devices listen before sending,
00:15:07.533
but there’s always some chance that messages
00:15:10.033
might overlap because of the
00:15:11.366
time it takes a message to traverse the network.
00:15:15.766
As we see from the diagram on the right,
00:15:18.333
if device A starts to send,
00:15:20.366
and simultaneously B listens,
00:15:22.900
hears nothing because the message from A
00:15:25.066
hasn’t reached it yet, and starts sending,
00:15:27.700
then there’s the risk that both messages
00:15:29.366
collide and are corrupted.
00:15:32.033
Collisions are more likely to occur
00:15:33.833
in long distance networks,
00:15:35.433
with large propagation delays,
00:15:37.266
but can happen in any network with a shared channel.
00:15:42.300
If a collision occurs,
00:15:44.033
how long should a device wait
00:15:45.500
before trying to re-send a message?
00:15:48.333
Well, devices shouldn’t always wait for the
00:15:50.466
same amount of time.
00:15:52.533
Doing so would run the risk that two devices get stuck,
00:15:55.833
each repeatedly trying to send,
00:15:57.766
waiting the same time,
00:15:59.033
and then colliding again in a loop.
00:16:01.966
The amount of time to wait should be randomised,
00:16:04.166
to avoid deterministic collisions.
00:16:07.900
If such randomisation is used,
00:16:10.133
but another collision occurs after waiting,
00:16:12.500
this suggests that the network is busy.
00:16:15.533
It’s unlikely that two devices will
00:16:17.300
randomly wait for the same time,
00:16:19.500
so subsequent collisions suggest that
00:16:21.333
there are many devices trying to send.
00:16:24.233
A sender should therefore increase the time it waits
00:16:26.600
after each collision,
00:16:27.666
to reduce the overall load on the network.
00:16:31.000
Many data link layer protocols
00:16:32.600
double the wait time after each repeated collision,
00:16:35.066
resetting when a successful transmission occurs.
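The randomised, doubling wait described above might be sketched like this. It is an illustration only: the class name, the unit of time ("slots"), the starting window, and the upper limit are assumptions, not values taken from any particular standard.

```python
import random


class Backoff:
    """Randomised wait that doubles its range after each repeated collision."""

    def __init__(self, base_slots=1, max_slots=1024, rng=None):
        self.base = base_slots
        self.max = max_slots              # assumed cap so waits stay bounded
        self.window = base_slots          # current upper bound on the wait
        self.rng = rng if rng is not None else random.Random()

    def on_collision(self):
        """Pick a random wait, then double the window for next time."""
        wait = self.rng.randint(0, self.window)   # randomised to avoid lock-step
        self.window = min(self.window * 2, self.max)
        return wait

    def on_success(self):
        """A successful transmission resets the window."""
        self.window = self.base
```

A sender would call on_collision() each time its transmission collides, wait for the returned number of slots, and call on_success() once a frame gets through.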
00:16:39.266
This approach, known as CSMA/CD, is widely
00:16:44.066
used, including in Ethernet and WiFi networks.
00:16:47.333
A consequence of it
00:16:49.033
is that devices share access to a channel,
00:16:51.733
and how quickly they can send a message
00:16:53.866
depends on how busy the network is.
00:16:56.466
This introduces some element of unpredictability
00:16:58.833
into the timing of many messages sent over the network.
00:17:04.966
To conclude,
00:17:06.100
the physical layer provides
00:17:07.933
for encoding a sequence of bits onto a channel,
00:17:10.733
but says nothing about the meaning of those bits.
00:17:14.166
The data link layer starts to add structure,
00:17:16.800
separating the bit stream into frames,
00:17:19.166
checking those frames
00:17:20.466
to ensure they were correctly received,
00:17:22.400
identifying devices,
00:17:24.000
and arbitrating access to the channel.
00:17:26.766
Together, the physical and data link layers
00:17:29.400
enable local area networks,
00:17:31.133
where devices connected to a single link can communicate.
00:17:35.466
This forms the basis of the Internet.
00:17:38.733
In the next part, I’ll talk about the network layer,
00:17:42.000
that allows multiple networks to be combined into one.
Part 4: The Network Layer and the Internet Protocols
Part 4 of the lecture is about the network layer and the Internet
Protocols. It reviews the role of the network layer, and the ideas of
addressing, routing, and forwarding. And, it talks about the network
layer in the Internet, and IPv4 and IPv6.
Slides for part 4
00:00:01.466
The network layer allows several independently operated
00:00:04.733
networks to be combined to give the
00:00:06.866
appearance of a single network.
00:00:08.933
It provides an internetworking function that allows us to
00:00:11.600
build an internet.
00:00:13.666
In this part, I’ll talk about the Internet Protocols,
00:00:16.366
IPv4 and IPv6,
00:00:18.933
that provide the network layer in the Internet,
00:00:21.366
and briefly review the network layer concepts
00:00:24.000
of addressing, routing, and forwarding.
00:00:28.466
The network layer is the internetworking point
00:00:31.033
in the protocol stack.
00:00:32.866
The use of a common network layer
00:00:34.466
protocol allows us to decouple the operation
00:00:37.000
of the networks that comprise an internet,
00:00:39.133
from the operation of the applications that
00:00:41.166
run on the internet.
00:00:43.733
It allows each network to make its own choice
00:00:45.800
about what sort of data link
00:00:47.200
and physical layer technologies to use,
00:00:49.733
because that choice is hidden from the
00:00:51.466
applications and transport protocols
00:00:53.033
by the common network layer.
00:00:55.200
It doesn’t matter whether the underlying network
00:00:57.233
is Ethernet, WiFi, optical fibre,
00:00:59.966
or something else,
00:01:01.533
because the differences are hidden
00:01:02.933
from the upper-layer protocols.
00:01:05.533
Similarly, the use of a common network
00:01:07.800
layer makes it easy to deploy different
00:01:10.033
applications and transport protocols.
00:01:12.566
The lower layers must deliver network layer packets,
00:01:15.566
but are unaware of the type of application data
00:01:17.800
contained in those packets.
00:01:20.000
They cannot tell whether the packets being delivered
00:01:22.266
comprise an email message, a web page,
00:01:24.800
a phone call, streaming video, or whatever.
00:01:28.900
This approach is very flexible,
00:01:31.500
and makes it easy to support new physical
00:01:33.300
and data link layer technologies,
00:01:35.500
and new transport protocols and applications,
00:01:38.266
provided they can deliver packets for,
00:01:40.500
and operate over, the common network layer.
00:01:44.266
The disadvantage is that the network
00:01:46.533
layer is not optimised for any one application.
00:01:49.433
It emphasises generality and flexibility
00:01:52.333
to support many different uses,
00:01:54.600
rather than providing optimal performance
00:01:56.366
for any particular use case.
00:01:59.466
In the Internet,
00:02:00.533
the network layer is known as the Internet Protocol, IP.
00:02:04.233
This is the IP part of the
00:02:06.133
well known TCP/IP protocol suite.
00:02:10.100
The Internet Protocol provides a common way
00:02:12.433
to identify devices on the network,
00:02:14.500
using what’s known as an IP address.
00:02:17.400
It provides routing algorithms to direct packets
00:02:19.800
across the network from source to destination.
00:02:22.700
And it forwards those packets in a best effort manner,
00:02:25.333
accepting that the network may be unreliable.
00:02:29.066
At its core, the Internet Protocol
00:02:31.233
provides uniform connectivity.
00:02:33.600
Any host can send data to any other host,
00:02:36.066
subject to firewall policy,
00:02:38.166
but the network makes no quality guarantees.
00:02:43.366
There are two versions of the Internet Protocol in use.
00:02:46.600
The most commonly used version is IP version 4.
00:02:49.966
This was introduced into what became the Internet in 1983.
00:02:54.733
The figure shows the format of an IPv4 packet.
00:02:58.266
This is sent in the payload data section
00:03:00.300
of a data link layer packet,
00:03:01.933
such as an Ethernet or WiFi frame,
00:03:04.400
with the different parts of the IPv4 packet
00:03:06.866
being sent in the order shown,
00:03:08.766
left-to-right, top-to-bottom,
00:03:10.733
starting with the version number,
00:03:12.400
header length, DSCP field, and so on,
00:03:15.700
and concluding with the transport layer data.
00:03:19.533
Key parts of an IPv4 packet are
00:03:21.833
the source and destination addresses,
00:03:24.000
each 32 bits in size,
00:03:26.166
that denote the network interfaces
00:03:28.133
from which the packet was sent,
00:03:29.533
and to which it should be delivered.
00:03:32.200
The use of 32 bits for the address fields
00:03:34.500
allows 2-to-the-power-32 possible addresses,
00:03:37.333
around 4 billion,
00:03:38.900
which is not enough for the current Internet.
00:03:41.800
This is the motivation to switch to IPv6.
00:03:45.633
In addition to addressing,
00:03:47.333
IPv4 provides the fragment identifier,
00:03:50.233
fragment offset,
00:03:51.733
“don’t fragment” (DF),
00:03:53.800
and “more fragments” (MF) fields to allow
00:03:56.233
large IPv4 packets to be split into
00:03:58.966
pieces for delivery over networks
00:04:00.766
that can only deliver small packets.
00:04:03.366
It also includes a Differentiated Services Code Point
00:04:06.833
(DSCP) field to allow packets to request
00:04:10.066
special treatment by the network.
00:04:12.133
For example,
00:04:13.200
a packet that carries video conferencing or gaming data
00:04:16.100
might ask for low latency delivery,
00:04:18.633
while one carrying data that’s part of a background
00:04:20.966
software update might indicate that it’s low priority.
00:04:25.200
The time-to-live (TTL) field prevents packets
00:04:27.533
from circulating forever in the network in case a routing
00:04:30.266
problem causes them to go around in a loop,
00:04:32.566
and the header checksum detects transmission errors.
00:04:36.533
Finally, the upper layer protocol identifier
00:04:38.900
identifies the format of the transport layer data
00:04:41.000
that follows the IPv4 header.
00:04:43.600
This usually indicates that the transport
00:04:45.566
layer data is a TCP segment or a UDP datagram.
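To make the header fields concrete, here is a sketch that unpacks the fixed 20-byte part of an IPv4 header using Python's struct module. The function name and the dictionary keys are illustrative; options, which may follow the fixed header, are ignored, and the checksum is extracted but not verified.

```python
import ipaddress
import struct


def parse_ipv4_header(data):
    """Unpack the fixed 20-byte IPv4 header into a dictionary of fields."""
    if len(data) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes")
    (ver_ihl, dscp_ecn, total_length, identifier, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,       # stored in 32-bit words
        "dscp": dscp_ecn >> 2,
        "total_length": total_length,
        "identifier": identifier,
        "dont_fragment": bool(flags_frag & 0x4000),   # DF flag
        "more_fragments": bool(flags_frag & 0x2000),  # MF flag
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,                         # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }
```

The fields are read in the order they are sent, with the version and header length sharing the first byte and the two 32-bit addresses at the end.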
00:04:52.433
IPv6 was designed to solve the
00:04:54.500
problem that the IPv4 addresses are too small.
00:04:58.033
It replaces the 32 bit addresses used in IPv4
00:05:01.666
with 128 bit addresses.
00:05:04.100
This vastly increases the number of devices
00:05:06.033
that can be added to the network,
00:05:08.433
since each additional bit
00:05:10.000
doubles the number of addresses that are available.
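The scale of the change is easy to check with a little arithmetic. The variable names below are just for illustration.

```python
ipv4_addresses = 2 ** 32     # 32-bit addresses: around 4 billion
ipv6_addresses = 2 ** 128    # 128-bit addresses

# Each extra bit doubles the space, so 96 extra bits multiply it by 2**96.
growth_factor = ipv6_addresses // ipv4_addresses
```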
00:05:13.300
In addition, IPv6 simplifies the header.
00:05:16.900
It removes the support for in-network fragmentation,
00:05:19.733
that was present in IPv4,
00:05:21.866
since it was difficult to implement efficiently,
00:05:24.733
and instead requires the hosts to adjust the size
00:05:26.966
of the packets they send to match the network path.
00:05:30.866
It also removes the header checksum,
00:05:32.500
since it’s usually redundant
00:05:33.900
with the checksum provided by the data link layer.
00:05:37.933
As of late 2020,
00:05:39.900
Google reports that about a third of their users access
00:05:42.433
Google over IPv6.
00:05:45.033
Statistics from Akamai,
00:05:46.633
a large content distribution network,
00:05:49.000
report around 60% of connections
00:05:51.266
to their network from India are over IPv6,
00:05:54.333
around 50% from the US, Germany,
00:05:57.333
Belgium, Greece, Taiwan, and Vietnam,
00:06:00.266
and around 35% from the UK.
00:06:03.766
IPv6 took a long time to start seeing deployment,
00:06:06.833
but its use has greatly accelerated over the last few years.
00:06:14.000
If we have IPv4 and IPv6,
00:06:17.400
you might ask what about IPv5?
00:06:20.900
Well, experiments with packet voice over the ARPAnet,
00:06:24.100
the precursor to the Internet,
00:06:25.966
started in the early 1970s, with the Network Voice Protocol
00:06:30.000
developed by Danny Cohen
00:06:31.433
at the University of Southern California’s
00:06:33.233
Information Sciences Institute.
00:06:36.233
This work eventually led to the Internet Stream Protocol,
00:06:39.100
ST-II, that was an experimental
00:06:41.466
multimedia streaming protocol
00:06:43.166
developed mostly in the 1980s and early 1990s.
00:06:47.333
ST-II ran in parallel to IPv4,
00:06:50.400
and used IP version 5 in its header.
00:06:54.433
ST-II was not widely deployed,
00:06:57.300
but it helped prototype a number of important ideas
00:06:59.933
around multimedia transport over packet networks.
00:07:04.100
Both what worked well, and what didn’t.
00:07:07.300
Steve Casner and Eve Schooler,
00:07:09.533
who both worked with Danny Cohen at ISI,
00:07:12.233
helped lead the development of the next wave of
00:07:14.366
multimedia transport protocols,
00:07:16.400
RTP and SIP,
00:07:18.166
based on experiences with ST-II and earlier protocols.
00:07:22.400
RTP and SIP are extremely widely used
00:07:25.733
in modern video conferencing services,
00:07:28.000
such as Zoom, Webex, and Microsoft Teams,
00:07:31.000
and as the basis for today’s mobile phone networks.
00:07:37.000
IPv4 and IPv6 have differently sized addresses,
00:07:41.533
but otherwise work similarly.
00:07:44.400
In both protocols, an IP address represents
00:07:47.100
the location of a particular network interface.
00:07:50.666
If a device has more than one network interface,
00:07:54.000
for example if it’s a smart phone with both 5G and WiFi,
00:07:57.566
it will have one IP address for each interface.
00:08:01.066
It may also have both IPv4 and IPv6 addresses
00:08:04.900
assigned to some, or all, of its network interfaces.
00:08:08.600
And, in some cases, can have more than one
00:08:10.966
address of each type assigned to an interface.
00:08:13.866
Importantly, the IP address identifies the location
00:08:17.433
at which a network interface is attached to the network.
00:08:20.066
It does not identify the device,
00:08:22.533
and if a device moves to a different place
00:08:24.766
it will acquire a different IP address.
00:08:27.700
IPv4 and IPv6 addresses
00:08:30.600
both comprise a network part and a host part.
00:08:33.933
The network part, often known as the network prefix,
00:08:37.200
identifies the network to which the device is attached,
00:08:40.300
while the host part of the address
00:08:42.066
identifies a particular attachment point on that network.
00:08:45.700
The fraction of the address that identifies
00:08:48.366
the network part, and the fraction left for the host varies,
00:08:52.466
with different networks being assigned
00:08:54.033
differing amounts of address space.
00:08:57.133
As an example,
00:08:58.366
the School of Computing Science operates an IPv4 network
00:09:01.700
in Lilybank Gardens and the Boyd Orr Building
00:09:04.166
where the first 20 bits of the IPv4 address
00:09:06.966
comprise the network part,
00:09:08.600
and the last 12 bits are the host part.
00:09:11.233
IPv4 addresses in the range 130.209.240.0
00:09:17.033
to 130.209.255.255,
00:09:21.433
that all share the same initial 20 bit prefix,
00:09:24.800
are all within that network.
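Python's standard ipaddress module can be used to check this example. In the sketch below the prefix is written in the usual slash notation, 130.209.240.0/20, meaning the first 20 bits are the network part; the helper function name is hypothetical.

```python
import ipaddress

# 20-bit network part, leaving 12 bits for the host part
school_network = ipaddress.ip_network("130.209.240.0/20")

first_address = school_network.network_address     # lowest address in the range
last_address = school_network.broadcast_address    # highest address in the range
host_part_size = school_network.num_addresses      # 2**12 addresses


def in_school_network(address):
    """Return True if the address shares the network's 20-bit prefix."""
    return ipaddress.ip_address(address) in school_network
```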
00:09:27.133
The School also has an IPv6 network,
00:09:29.733
that works similarly, although the addresses are longer.
00:09:34.366
In the wide area,
00:09:35.833
Internet traffic is routed towards its destination
00:09:38.733
looking only at the network part of the IP address.
00:09:41.733
Only when it reaches the destination network
00:09:44.566
do the routers in the network inspect
00:09:46.400
the host part of the address,
00:09:48.066
to find the device that should receive the packet.
00:09:52.166
Finally, it’s important to remember that the network layer
00:09:55.133
does not use names such as example.com.
00:09:58.166
Traffic delivered over the Internet
00:10:00.300
contains source and destination IP addresses,
00:10:03.066
and is routed based on those IP addresses.
00:10:06.200
The host that sends an IP packet
00:10:08.366
resolves the name of the destination to an IP address,
00:10:12.166
and puts that address into the destination address field
00:10:14.733
of the IP packet.
00:10:16.433
The network delivers that packet using only that IP address.
00:10:21.400
The DNS, that resolves names,
00:10:23.500
is just another application that runs on the Internet,
00:10:26.166
and is not fundamental to its operation.
00:10:31.933
The Internet is a network of networks.
00:10:35.533
Each network is administered and operated separately.
00:10:39.133
It acts as what is known as an Autonomous System, an AS.
00:10:43.666
Each AS can choose to use different technologies internally,
00:10:47.400
and can have different rules and policies.
00:10:50.333
The commonality is that all run the Internet Protocol.
00:10:54.300
The University of Glasgow acts as an Autonomous System
00:10:57.000
in the Internet,
00:10:58.500
as do companies such as Google or Facebook,
00:11:01.166
and Internet Service Providers such as BT,
00:11:03.900
Virgin Media, O2, etc.
00:11:09.000
Within a network, the network operator
00:11:11.900
will seek to deliver data to its destination
00:11:14.300
as efficiently as possible.
00:11:16.633
The Internet places no requirements on how they do this,
00:11:20.166
or on what data link layer
00:11:21.500
or physical layer technologies they use.
00:11:24.500
Each network operator is free to use whatever technologies,
00:11:27.666
and whatever routing or forwarding algorithms,
00:11:29.966
that it chooses.
00:11:32.033
Typically, the network operator
00:11:34.133
will seek to ensure traffic follows the shortest path
00:11:36.600
from source to destination across their network,
00:11:39.733
using either a distance vector routing algorithm,
00:11:42.566
or, more likely, a link state routing algorithm
00:11:45.700
such as OSPF.
00:11:47.400
There are a wide variety of different approaches used,
00:11:51.133
though, depending on the size of the network,
00:11:53.766
and the needs of the network operator and its customers.
00:11:59.533
Most Internet traffic is not confined to a single AS.
00:12:03.733
Rather, it’s common for traffic to
00:12:06.066
pass through several Autonomous Systems
00:12:07.933
on its path from source to destination.
00:12:11.333
For example, packets sent from the University of
00:12:13.733
Glasgow to Google will start in the University’s network,
00:12:17.666
then traverse a network known as JANET
00:12:19.800
(the Joint Academic NETwork; the University’s ISP),
00:12:22.666
and then finally reach Google.
00:12:25.666
Each of these three networks is an Autonomous System.
00:12:29.200
Each must cooperate to forward the packets,
00:12:31.633
and find an appropriate route
00:12:33.366
for the data across the network.
00:12:35.666
Indeed, all of the ASes that comprise the Internet
00:12:38.833
cooperate to ensure the network
00:12:40.366
can successfully route data to its destination,
00:12:42.966
wherever that destination is.
00:12:46.200
This cooperation is enabled by
00:12:48.400
the Border Gateway Protocol, BGP.
00:12:51.766
BGP is a routing protocol.
00:12:54.100
It allows each AS to advertise the network
00:12:56.900
prefixes that it owns,
00:12:58.466
to tell the rest of the network where to send packets
00:13:01.400
destined for IP addresses contained within those prefixes.
00:13:05.466
In addition, BGP allows ASes
00:13:07.933
to advertise the routes they can use to reach
00:13:10.000
the network prefixes owned by other ASes.
00:13:13.300
This allows, for example,
00:13:15.266
an ISP to advertise how to reach the IP
00:13:17.766
addresses used by its customers.
00:13:20.433
Similarly, BGP allows ASes to filter out
00:13:23.833
advertisements for network prefixes to which they will not
00:13:26.366
forward traffic.
00:13:29.200
The information exchanged in BGP
00:13:31.733
allows ASes to decide how to route
00:13:33.700
packets across the network.
00:13:36.166
Networks located near the edge of the Internet
00:13:38.866
need to maintain relatively little information
00:13:41.166
to participate in BGP.
00:13:43.333
They simply need to know their own network prefixes,
00:13:46.466
and those of their customers.
00:13:48.400
They then advertise those prefixes
00:13:50.466
to the rest of the network.
00:13:52.566
Those edge ASes know how to route traffic to themselves
00:13:56.066
and their customers,
00:13:57.500
and just pass everything else to a default route
00:13:59.700
that directs it out to the wider network.
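The idea of a small routing table with a default route can be sketched as below. The table contents and next-hop names are made up for illustration, and the lookup uses longest-prefix matching, which is the usual rule for choosing between overlapping entries but is an addition here, not something described above.

```python
import ipaddress

# Hypothetical table for an edge network: its own prefix, one customer,
# and a default route (0.0.0.0/0 matches every IPv4 address).
EDGE_TABLE = [
    ("130.209.240.0/20", "local"),
    ("192.0.2.0/24", "customer-a"),
    ("0.0.0.0/0", "upstream"),
]


def next_hop(table, destination):
    """Return the next hop for the most specific prefix that matches."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in table:
        network = ipaddress.ip_network(prefix)
        if address in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, hop)
    return best[1] if best is not None else None
```

Anything that doesn't match a more specific prefix falls through to the default route, so the table stays small no matter how large the rest of the Internet is.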
00:14:02.933
For networks nearer the core of the Internet,
00:14:05.400
this approach is no longer sufficient.
00:14:08.133
These ASes form the so-called default free zone,
00:14:11.200
the DFZ,
00:14:12.433
where they use BGP to put together something
00:14:15.400
close to a complete map of the Internet,
00:14:17.800
so they know how to forward packets to reach any
00:14:20.000
possible destination.
00:14:22.700
When it comes to BGP routing,
00:14:24.666
and finding the correct route for data to
00:14:26.433
cross the wide area Internet,
00:14:28.733
policy, politics, and economics are often
00:14:31.633
more important than finding the shortest route.
00:14:37.000
A final feature of the network layer is forwarding.
00:14:40.766
Given a route through the network,
00:14:42.833
calculated by BGP for the inter-domain path,
00:14:45.733
and by an intra-domain routing protocol
00:14:47.900
such as OSPF within each network,
00:14:50.600
how are the packets actually forwarded along that path?
00:14:54.700
Forwarding in the Internet follows a best effort approach.
00:14:58.533
A router in the network receives a packet,
00:15:00.966
and makes its best effort to forward it
00:15:02.700
towards its destination.
00:15:05.600
The network is connectionless.
00:15:07.933
A sender doesn’t need to establish a connection
00:15:10.566
or ask permission before sending a packet,
00:15:12.900
and makes no attempt to reserve capacity.
00:15:16.100
As a result, the network makes no guarantees.
00:15:19.533
If there are insufficient resources available to
00:15:22.033
forward a packet, then it may be delayed
00:15:24.400
or will simply be discarded.
00:15:27.466
The figure shows an example,
00:15:29.133
showing the time packets take to traverse the network,
00:15:32.366
with the x-axis showing time since the start
00:15:35.166
of transmission, and the y-axis
00:15:37.533
showing the time taken to receive a response
00:15:39.500
from the destination.
00:15:42.033
As can be seen, the time taken to get a response
00:15:44.566
may vary significantly,
00:15:46.033
depending on the amount of other traffic in the network.
00:15:49.566
Packets may be delayed, reordered, lost,
00:15:52.766
duplicated, or corrupted in transit.
00:15:56.800
In a well engineered network,
00:15:58.800
there is little timing variation,
00:16:00.866
packets are rarely lost or corrupted,
00:16:03.033
and almost never arrive out of order.
00:16:05.733
If the packets traverse a poor quality network,
00:16:08.600
however, behaviour may be less predictable.
00:16:12.666
The Internet can provide extremely high quality,
00:16:16.200
but there’s no requirement that it does so.
00:16:19.000
This gives flexibility.
00:16:20.566
The Internet can encompass all sorts of different networks,
00:16:23.833
not just those in rich countries
00:16:25.733
with well-developed infrastructure.
00:16:28.000
But, it requires applications
00:16:30.300
to be able to cope with unpredictable quality.
00:16:34.500
To summarise,
00:16:36.200
the network layer provides the interworking function
00:16:39.300
that allows networks to cooperate,
00:16:41.366
and come together to build a global internet.
00:16:44.400
It provides a common addressing scheme
00:16:46.566
to identify endpoints of communication,
00:16:49.066
a routing scheme to allow systems to determine
00:16:51.833
the appropriate path for data to take
00:16:53.500
across the internetwork, and a forwarding scheme
00:16:56.333
to move packets towards their destination.
00:16:59.833
In the Internet,
00:17:00.833
the network layer is the Internet Protocols,
00:17:03.466
IPv4 and IPv6.
00:17:06.433
Data is routed between Autonomous Systems using BGP,
00:17:10.266
and within autonomous systems using distance vector
00:17:13.400
or link-state routing protocols, such as OSPF.
00:17:17.433
Finally, packet forwarding happens on a best effort basis,
00:17:21.466
giving the flexibility to operate over any type of link layer
00:17:24.800
at the cost of potentially unpredictable behaviour.
00:17:28.800
The network layer provides connectivity.
00:17:31.666
Above it sits the transport layer,
00:17:33.600
that provides the abstractions that
00:17:35.433
applications use to deliver data.
Part 5: The Transport Layer
Part 5 of the lecture reviews the transport layer in the Internet.
It talks about UDP and TCP, the services they provide to Internet
applications, and their strengths and weaknesses. It reviews TCP
connection establishment and congestion control very briefly.
Slides for part 5
00:00:00.566
The transport layer isolates the applications
00:00:03.566
from the vagaries of the network.
00:00:05.566
It ensures data is delivered with appropriate reliability,
00:00:08.833
and adapts the speed of transmission to match
00:00:11.033
the network capacity.
00:00:13.066
There are two widely used transport layer protocols
00:00:15.466
in the Internet: UDP and TCP.
00:00:19.300
UDP provides an unreliable packet delivery service,
00:00:22.666
essentially exposing the raw IP
00:00:25.033
network model to the applications.
00:00:27.500
It’s useful for applications that prefer
00:00:29.533
timeliness over reliability,
00:00:31.766
and as a building block for developing
00:00:33.433
new transport protocols.
00:00:35.833
TCP provides a reliable service,
00:00:38.300
retransmitting any lost packets,
00:00:40.466
putting reordered data back into the correct order,
00:00:43.200
and adapting its transmission rate
00:00:45.000
to match the available capacity.
00:00:47.166
It’s useful for applications that need data
00:00:49.633
to be delivered reliably
00:00:51.233
and as fast as possible.
00:00:53.566
In this part, I’ll talk about transport layer concepts,
00:00:56.866
and the UDP and TCP transport protocols
00:00:59.466
used in the Internet.
00:01:01.566
I’ll also briefly introduce the idea of congestion control,
00:01:04.700
adapting the sending rate of the transport
00:01:06.600
to match the available network capacity.
00:01:10.766
As discussed in the previous part,
00:01:13.600
an IP network provides only a
00:01:15.466
best effort packet delivery service.
00:01:18.000
IP packets can be lost,
00:01:19.800
duplicated, delayed, or re-ordered in transit.
00:01:24.033
The role of the transport protocol
00:01:26.100
is to isolate applications,
00:01:27.933
as much as is necessary, from the network.
00:01:31.333
The transport protocol demultiplexes traffic
00:01:33.866
destined for different applications.
00:01:35.933
It enhances the network quality of service
00:01:39.633
to offer appropriate reliability for those applications.
00:01:43.233
And it performs congestion control,
00:01:45.033
to adapt the transmission rate
00:01:46.400
to the available network capacity.
00:01:49.433
There are two transport protocols that have
00:01:51.366
been successfully deployed in the Internet.
00:01:53.700
UDP, the user datagram protocol,
00:01:57.200
provides an unreliable service.
00:02:00.200
TCP, the transmission control protocol,
00:02:03.133
provides a reliable, ordered,
00:02:05.333
and congestion controlled service.
00:02:07.966
Applications that run on the Internet
00:02:10.066
use one of these two transport protocols.
00:02:14.300
UDP is the simplest
00:02:16.366
possible transport protocol that can run on the Internet.
00:02:19.666
It exposes the raw IP service to applications,
00:02:23.733
adding only the concept of a port number,
00:02:26.133
to identify different applications running on a single host.
00:02:30.166
Like IPv4 and IPv6, UDP is connectionless.
00:02:35.300
An application doesn’t need to establish a UDP connection.
00:02:39.200
Rather, it can simply send a packet towards
00:02:41.633
a destination without asking permission
00:02:43.733
or establishing a connection.
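A minimal sketch of connectionless sending, using Python's standard socket module over the loopback interface. The function name and the choice of loopback are assumptions made to keep the example self-contained.

```python
import socket

def udp_loopback_demo(message: bytes = b"hello") -> bytes:
    """Send one UDP datagram to ourselves and return what arrives."""
    # Receiver: bind to an ephemeral port on the loopback interface.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    address = receiver.getsockname()

    # Sender: no connect() call and no handshake, just send a datagram.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(message, address)
        data, _ = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

The timeout matters: since UDP gives no delivery guarantee, a receiver that waits forever for a datagram may never return.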
00:02:46.933
UDP datagrams are delivered in a best effort manner.
00:02:50.600
Datagrams might not arrive at all,
00:02:52.800
and if they do arrive, they might not arrive in the order
00:02:55.733
they were sent.
00:02:57.266
The UDP transport protocol
00:02:58.866
doesn’t attempt to correct for this,
00:03:01.033
or to ensure reliable delivery.
00:03:03.733
It’s the responsibility of the application using UDP
00:03:07.133
to reconstruct the ordering, or to detect lost packets.
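One common way for an application to do this, sketched below, is to put its own sequence number in each datagram. The function name and the (sequence number, payload) representation are assumptions for illustration; UDP itself carries no sequence number.

```python
def reorder_and_detect_loss(datagrams, expected_count):
    """Restore sending order and report missing sequence numbers.

    datagrams is a list of (sequence_number, payload) pairs in the
    order they arrived; expected_count is how many were sent.
    """
    received = {}
    for seq, payload in datagrams:
        received.setdefault(seq, payload)  # ignore duplicates
    in_order = [received[seq] for seq in sorted(received)]
    missing = [seq for seq in range(expected_count) if seq not in received]
    return in_order, missing
```

What to do about the missing datagrams (request a retransmission, conceal the gap, or ignore it) is then an application decision.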
00:03:11.566
Similarly, UDP doesn’t perform any form of
00:03:14.700
congestion control.
00:03:16.366
If an application using UDP
00:03:18.566
tries to send faster than the network can deliver packets,
00:03:22.033
then those packets will simply be discarded.
00:03:25.033
UDP doesn’t help the application adapt
00:03:27.533
its sending rate to match the network capacity.
00:03:31.066
Accordingly, applications using UDP
00:03:34.433
must be able to tolerate some loss of data,
00:03:36.933
to receive data out of order,
00:03:39.400
and to be able to estimate and adapt
00:03:41.700
to the available network capacity.
00:03:44.533
Doing this well is extremely difficult,
00:03:47.033
and UDP is not suitable for many applications.
00:03:51.433
Where UDP is useful
00:03:53.233
is when the application prefers timeliness over reliability.
00:03:57.466
Because it doesn’t attempt to retransmit lost data,
00:04:00.533
and doesn’t buffer data
00:04:02.233
to allow it to be put into the correct order,
00:04:04.433
UDP offers the lowest possible latency for
00:04:07.433
data sent across the network.
00:04:10.300
That’s useful for applications like voice-over-IP,
00:04:13.366
video conferencing, and gaming,
00:04:15.633
that can tolerate some data loss but need low latency.
00:04:20.033
For most applications, though,
00:04:22.066
the unreliable nature of UDP
00:04:23.933
makes it poorly suited to their needs.
00:04:28.333
By way of contrast,
00:04:29.966
the TCP protocol provides a fully reliable,
00:04:33.266
ordered, byte stream delivery service
00:04:36.100
that runs over an IP network.
00:04:39.400
TCP is a connection oriented transport protocol.
00:04:42.800
Applications that use TCP as their transport
00:04:46.300
must first set up a connection from sender to receiver,
00:04:49.333
before they can send data.
00:04:51.900
Once a connection is established,
00:04:54.000
the TCP protocol ensures that any lost
00:04:56.533
data is retransmitted,
00:04:58.233
and that any reordered data
00:04:59.866
is put back into the correct order,
00:05:01.600
before delivering it to the receiving application.
00:05:05.600
The TCP protocol also adapts the rate at which
00:05:08.866
data is sent across the network
00:05:10.466
to match the available network capacity,
00:05:13.066
a process known as congestion control.
00:05:16.833
A TCP sender writes a sequence
00:05:19.133
of bytes into a TCP connection,
00:05:21.533
and that exact same sequence of bytes
00:05:23.400
is delivered to the receiver.
00:05:25.066
Reliably. In the order sent.
00:05:27.833
And at the maximum speed the network can support.
00:05:31.933
The overwhelming majority of applications
00:05:34.366
running on the Internet use TCP as their transport protocol.
00:05:39.366
TCP has two limitations.
00:05:42.600
The first is that TCP delivers a sequence of bytes,
00:05:46.100
not a sequence of messages.
00:05:48.400
If a sender writes 2000 bytes of data
00:05:51.200
onto a TCP connection,
00:05:53.500
comprising two messages of 1000 bytes each,
00:05:57.033
then TCP guarantees
00:05:58.966
that those 2000 bytes will be received reliably,
00:06:01.933
and in the order sent.
00:06:03.733
It does not guarantee that they will be received as
00:06:06.233
two messages of 1000 bytes each.
00:06:09.500
Indeed, it’s entirely possible that TCP
00:06:11.966
will deliver the data as a block of 1500 bytes
00:06:15.033
followed by a separate block containing
00:06:17.300
the remaining 500 bytes.
00:06:20.000
Applications that care about message boundaries must
00:06:23.233
structure the data they send over a TCP connection
00:06:26.300
to allow those message boundaries to be reconstructed.
00:06:30.066
Since reading and writing data to a file
00:06:32.600
also doesn’t preserve message boundaries,
00:06:35.266
doing this tends not to be difficult.
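One simple way to do this, sketched here, is to prefix each message with its length. The 4-byte big-endian length field and the frame and Deframer names are assumptions for the example; other framings, such as delimiter-based ones, work too.

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message

class Deframer:
    """Rebuild messages from arbitrary chunks of a byte stream."""

    def __init__(self):
        self.buffer = b""

    def feed(self, chunk: bytes):
        """Add received bytes and return any complete messages."""
        self.buffer += chunk
        messages = []
        while len(self.buffer) >= 4:
            (length,) = struct.unpack("!I", self.buffer[:4])
            if len(self.buffer) < 4 + length:
                break  # wait for the rest of this message
            messages.append(self.buffer[4:4 + length])
            self.buffer = self.buffer[4 + length:]
        return messages
```

If the receiver is handed a block of 1500 bytes followed by the rest of the stream, the deframer still returns the two original 1000-byte messages, one from each call to feed().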
00:06:39.433
The other limitation of TCP is that,
00:06:42.133
because it delivers data reliably and in order,
00:06:45.100
it must retransmit any lost packets.
00:06:48.333
This delays any data following those lost packets,
00:06:51.700
since it can’t be delivered to the application
00:06:53.866
until the retransmission arrives.
00:06:56.966
This is an unavoidable trade-off,
00:06:58.800
if data is to be delivered reliably,
00:07:00.700
and in the order sent, across an unreliable network,
00:07:04.000
but does mean that TCP trades latency for reliability.
00:07:11.466
TCP delivers data in the form of segments,
00:07:14.600
each delivered within an IP packet.
00:07:17.966
Each TCP segment
00:07:20.033
includes source and destination port numbers,
00:07:22.733
that identify the applications
00:07:24.966
that send and receive the data.
00:07:27.300
Each TCP segment also includes a sequence number,
00:07:30.433
that counts the number of bytes of data sent
00:07:32.900
and allows the receiver to detect loss
00:07:35.166
and reconstruct the original sending order,
00:07:37.600
and an acknowledgement number
00:07:39.400
that indicates the next sequence number the receiver expects.
00:07:43.033
TCP segments also include the receiver window size,
00:07:46.900
that indicates the amount of buffer space
00:07:48.833
the receiver has to store incoming TCP segments,
00:07:52.100
a checksum to detect packet corruption,
00:07:55.366
a set of flags to manage connection setup,
00:07:58.166
and an urgent pointer to support
00:07:59.900
advance delivery of important data.
00:08:02.766
The actual TCP payload data follows this header.
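To make the layout concrete, here is a sketch that unpacks the fixed 20-byte part of a TCP header with Python's struct module. The TCPHeader type and parse_tcp_header function are names invented for this example, and TCP options, which can follow the fixed header, are ignored.

```python
import struct
from typing import NamedTuple

class TCPHeader(NamedTuple):
    source_port: int
    destination_port: int
    sequence_number: int
    acknowledgement_number: int
    header_length: int  # in bytes, including any options
    flags: int
    window_size: int
    checksum: int
    urgent_pointer: int

# Flag bits used during connection setup.
SYN, ACK = 0x02, 0x10

def parse_tcp_header(segment: bytes) -> TCPHeader:
    """Unpack the fixed 20-byte TCP header at the start of a segment."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_length = (offset_flags >> 12) * 4  # data offset is in 32-bit words
    flags = offset_flags & 0x00FF             # the eight control bits
    return TCPHeader(src, dst, seq, ack, header_length,
                     flags, window, checksum, urgent)
```

The payload starts header_length bytes into the segment, so segment[header.header_length:] gives the data carried.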
00:08:07.400
As can be seen,
00:08:08.800
TCP packets carry a lot of information
00:08:11.366
in addition to that carried in an IP packet header.
00:08:15.066
TCP is a complex and sophisticated transport protocol,
00:08:18.500
that adds a lot of features to IP
00:08:20.800
in order to ensure that data is delivered effectively.
00:08:25.933
A TCP connection proceeds in three stages.
00:08:30.933
At the start of the connection
00:08:32.866
is an initial three-way handshake
00:08:34.766
that establishes the connection.
00:08:36.866
An initial TCP packet is sent
00:08:39.566
from the TCP client to the server,
00:08:41.666
with the SYN (“synchronise”) bit
00:08:44.500
set in the TCP header,
00:08:46.233
to indicate the start of a connection.
00:08:49.266
The server sends a response,
00:08:51.566
with its SYN bit set,
00:08:54.233
to indicate that it’s willing to establish a connection.
00:08:57.433
This response also has the ACK bit set,
00:09:00.300
indicating that the acknowledgement number is valid,
00:09:02.900
because it acknowledges receipt of that initial packet.
00:09:07.000
The client completes the three-way
00:09:09.133
SYN - SYN+ACK - ACK handshake
00:09:11.266
by sending an acknowledgement to the server.
00:09:13.800
This establishes the connection.
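Applications don't send the SYN and ACK packets themselves; the operating system performs the handshake when they call the sockets API. The sketch below shows this on the loopback interface, with a function name and timeouts chosen for the example.

```python
import socket

def tcp_loopback_demo(message: bytes = b"hello") -> bytes:
    """Open a TCP connection to ourselves, send a message, return it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # ephemeral port on loopback
    server.listen(1)
    server.settimeout(2.0)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        # The kernel carries out the SYN, SYN+ACK, ACK exchange here.
        client.connect(server.getsockname())
        connection, _ = server.accept()
        connection.settimeout(2.0)

        client.sendall(message)
        received = b""
        while len(received) < len(message):
            chunk = connection.recv(4096)
            if not chunk:
                break
            received += chunk
        connection.close()
        return received
    finally:
        client.close()
        server.close()
```

The receive loop is there because recv() may return fewer bytes than were sent in one call.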
00:09:17.200
The client and server can then exchange data.
00:09:20.933
In this example, the client sends data to the server,
00:09:24.366
and the server acknowledges receipt of that data.
00:09:27.633
The initial packet had sequence number 0,
00:09:30.933
and each packet includes 1500 bytes of data.
00:09:35.133
We see the server sending acknowledgement packets,
00:09:37.900
and delivering data to the application in response
00:09:40.300
to recv() calls, as the data packets arrive.
00:09:44.366
We also see that the client
00:09:46.066
can send some number of data packets,
00:09:48.100
known as its congestion window,
00:09:50.100
before it receives an acknowledgement.
00:09:52.966
In this example, the congestion window is 3000 bytes:
00:09:56.533
the client is allowed to send up to 3000 bytes of data
00:09:59.633
without receiving an acknowledgement.
00:10:04.000
The packet with sequence number 4500,
00:10:07.533
sent from client to server, is lost.
00:10:10.800
As a result, when the next packet,
00:10:13.300
with sequence number 6000, arrives
00:10:16.033
the server generates another acknowledgment
00:10:18.433
indicating that it’s still expecting packet 4500.
00:10:23.233
This continues until three duplicate
00:10:25.166
acknowledgements have been received,
00:10:26.800
when the client retransmits the lost packet.
00:10:30.200
Eventually, the missing data arrives at the server.
00:10:33.933
This fills the gap, and the missing data,
00:10:36.533
and the three following packets that were already received,
00:10:39.533
are delivered to the application.
00:10:42.166
The server sees a delay,
00:10:43.866
while the missing data is retransmitted
00:10:46.000
then receives a burst of data.
00:10:48.533
Message boundaries are not preserved.
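The receiver's side of this example can be sketched as a small simulation. It assumes, as in the example, that every segment carries 1500 bytes; the receive function and its return format are inventions for illustration, not how a real TCP stack is structured.

```python
SEGMENT_SIZE = 1500  # bytes per segment, as in the example

def receive(arrivals):
    """Return (acks, deliveries) for segments arriving in the given order.

    arrivals is a list of segment sequence numbers. Each ACK is the
    next byte the receiver expects. Each delivery lists the segments
    passed to the application when that segment arrived.
    """
    expected = 0
    buffered = set()
    acks, deliveries = [], []
    for seq in arrivals:
        delivered = []
        if seq >= expected:
            buffered.add(seq)
        # Deliver every segment that is now contiguous with earlier data.
        while expected in buffered:
            buffered.remove(expected)
            delivered.append(expected)
            expected += SEGMENT_SIZE
        acks.append(expected)
        deliveries.append(delivered)
    return acks, deliveries
```

Calling receive([0, 1500, 3000, 6000, 7500, 9000, 4500]) models the loss of segment 4500: the receiver acknowledges 4500 once normally and then three more times as duplicates, delivers nothing while the gap is open, and delivers four segments in one burst when the retransmission arrives.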
00:10:52.433
Finally, the connection is closed
00:10:54.333
using a three-way handshake
00:10:55.566
similar to that used to open the connection.
00:10:58.866
We’ll talk more about how TCP establishes connections,
00:11:02.166
and reliably transmits data, in later lectures.
00:11:08.533
The TCP protocol includes a sophisticated algorithm,
00:11:12.000
known as congestion control,
00:11:14.000
that adapts its transmission rate
00:11:15.766
to match the available network capacity.
00:11:18.766
This operates in two phases.
00:11:21.533
The first phase is known as slow-start.
00:11:24.800
At the start of a connection,
00:11:26.666
TCP starts by sending slowly,
00:11:29.066
but increases its sending rate exponentially
00:11:31.633
until it reaches the network capacity.
00:11:33.933
Once it’s sending at a speed
00:11:36.200
that matches the available network capacity,
00:11:38.533
TCP switches to a congestion avoidance phase,
00:11:41.666
adapting its sending rate following a sawtooth pattern
00:11:45.033
that allows it to gradually adapt to changes in capacity.
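The shape of these two phases can be seen in a deliberately simplified simulation. It assumes a fixed capacity measured in segments per round trip, treats any window above that capacity as causing a loss, and halves the window on loss; real TCP congestion control has many more details than this.

```python
def simulate_cwnd(capacity: int, rounds: int, initial: int = 1):
    """Return the congestion window, in segments, for each round trip."""
    cwnd = initial
    slow_start = True
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd > capacity:
            # Sent more than the network could carry: treat as a loss,
            # halve the window, and leave slow-start.
            cwnd = max(1, cwnd // 2)
            slow_start = False
        elif slow_start:
            cwnd *= 2   # exponential growth during slow-start
        else:
            cwnd += 1   # linear growth during congestion avoidance
    return history
```

With a capacity of 20 segments, the window doubles each round trip until it overshoots, then repeatedly climbs by one segment per round trip and halves on loss, which gives the sawtooth.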
00:11:49.166
There are a lot of subtle details
00:11:51.300
in TCP congestion control behaviour,
00:11:53.233
that we’ll talk about later in the course.
00:11:56.033
What’s important for now is that TCP can effectively
00:11:59.366
adjust the rate at which data is sent
00:12:01.200
to match changes in the available network capacity.
00:12:04.533
The algorithm it uses to do this looks simple at first glance,
00:12:08.600
but actually contains a lot of complex features,
00:12:11.433
and non-obvious behaviours, and is highly effective.
00:12:17.266
To conclude,
00:12:18.933
the transport layer protocols adapt the service
00:12:22.533
provided by the network layer
00:12:23.866
to meet the needs of applications.
00:12:26.266
The Internet provides two transport protocols,
00:12:28.833
TCP and UDP.
00:12:31.033
TCP is a complex and sophisticated protocol,
00:12:34.200
that is highly optimised to deliver reliable data quickly.
00:12:38.066
It’s well suited to the overwhelming majority
00:12:40.800
of Internet applications.
00:12:42.766
UDP is much less sophisticated,
00:12:45.400
and essentially exposes the best effort
00:12:47.433
packet delivery service
00:12:48.666
offered by the underlying IP layer to the application.
00:12:51.700
It’s useful when developing applications that
00:12:54.100
prefer timeliness to reliability,
00:12:56.566
or as a basis for building new transport protocols,
00:12:59.700
but is very difficult to use effectively.
00:13:02.866
The transport layer is the last
00:13:05.100
general purpose layer in the protocol stack.
00:13:07.533
Above it sit the application protocols,
00:13:10.033
that we’ll discuss briefly in the next part.
Part 6: Higher Layer Protocols
Part 6 of the lecture reviews the higher layers of the protocol stack.
It talks about the role of the session layer in managing connections;
it reviews the way the presentation layer supports different data
formats and format negotiation; and it reviews the role of the application
layer in supporting the application logic. Finally, it discusses the
importance of standardisation in ensuring interoperability.
Slides for part 6
00:00:00.000
The higher layers of the OSI reference
00:00:03.696
model – the session, presentation, and application
00:00:06.393
layers – provide services to help applications
00:00:09.089
manage transport connections, data formats, and the
00:00:11.785
application logic.
00:00:12.556
In this part, I’ll talk briefly about
00:00:15.352
some issues to consider around higher layer
00:00:18.048
protocols, and about the importance of protocol
00:00:20.744
standards for interoperability.
00:00:23.000
The OSI reference model defines three layers
00:00:25.878
above the transport layer. These are the
00:00:28.655
session, presentation, and application layers.
00:00:30.639
The goal of protocols at these layers
00:00:33.517
is to support the needs of applications.
00:00:36.294
They manage transport connections, name and locate
00:00:39.072
resources used by the application, describe and
00:00:41.850
negotiate data formats, and present the data
00:00:44.627
in an appropriate manner. Essentially, they translate
00:00:47.405
the application’s needs into protocol mechanisms.
00:00:49.786
The protocols used at these layers tend
00:00:52.663
to be quite tightly bound to particular
00:00:55.441
classes of application, and are less general
00:00:58.218
purpose than protocols at the transport layer
00:01:00.996
and below. As a result, the boundaries
00:01:03.774
between these layers tend to be less
00:01:06.551
clear, and many systems implement these higher
00:01:09.329
layer protocols without a clear distinction between
00:01:12.106
the layers.
00:01:14.000
The session layer is about managing transport
00:01:16.991
layer connections.
00:01:17.816
Some applications are straightforward, with a
00:01:20.807
single client connecting to a single server,
00:01:23.698
or to a small set of servers.
00:01:26.588
HTTP is an example of such a
00:01:29.479
protocol. This class of protocols tends to
00:01:32.369
need relatively little session layer support,
00:01:34.847
often limited to being able to re-use
00:01:37.738
a transport layer connection to send and
00:01:40.628
receive multiple messages.
00:01:41.867
Other systems are more complex, involving multiple
00:01:44.858
clients and servers collaborating to meet the
00:01:47.748
application needs. Examples include video conferencing and
00:01:50.639
chat applications, and multiplayer games, that use
00:01:53.529
a cluster of servers to support end-user
00:01:56.420
applications. The session layer in these applications
00:01:59.311
tends to be concerned with finding the
00:02:02.201
participants in a call or players in
00:02:05.092
the game, and forwarding messages between those
00:02:07.982
participants and between the servers supporting each
00:02:10.873
user.
00:02:12.036
Peer-to-peer applications tend to have still more
00:02:15.026
sophisticated session layer features. These applications have
00:02:17.917
the problem of how to set up peer-to-peer
00:02:20.808
connections in the presence of firewalls and
00:02:23.698
network address translators, and often need to
00:02:26.589
adapt to rapid changes in group membership.
00:02:29.479
BitTorrent is a well known example of
00:02:32.370
this class of application.
00:02:34.022
Finally, there are applications that use multicast
00:02:37.012
or broadcast transmission, where the network can
00:02:39.903
send messages to a group of receivers.
00:02:42.794
Like peer-to-peer applications, these tend to require
00:02:45.684
the session layer to manage group membership
00:02:48.575
changes.
00:02:49.738
Each session layer protocol is different,
00:02:52.315
and they have relatively little in common
00:02:55.206
with each other, since they tend to
00:02:58.096
be closely tied to particular classes of
00:03:00.987
application.
00:03:01.000
The presentation layer sits above the session
00:03:03.900
layer, and manages the presentation, representation,
00:03:06.300
and conversion of data.
00:03:07.900
Presentation layer features include the media type
00:03:10.800
descriptions that web servers and video conferencing
00:03:13.600
tools use to describe data formats.
00:03:16.000
They specify that a page is in
00:03:18.800
HTML, whether an image is a JPEG
00:03:21.600
or a PNG, or that the video
00:03:24.400
is compressed with H.264 rather than VP8.
00:03:27.200
And they allow applications to describe the
00:03:30.000
data formats they support, and to negotiate
00:03:32.800
an agreed format with the other participants
00:03:35.600
in the session.
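At its simplest, format negotiation means one side offering the formats it can use, in order of preference, and the other side picking one it also supports. The negotiate function below is a sketch of that idea only; the media type names are used as example labels, and real protocols define their own offer and answer rules.

```python
def negotiate(offered, supported):
    """Return the first offered media type the other side supports.

    offered is a list in preference order; supported is a set.
    Returns None if there is no format in common.
    """
    for media_type in offered:
        if media_type in supported:
            return media_type
    return None
```

For example, an offer of ["video/VP8", "video/H264"] to a peer that supports only H.264 video results in "video/H264" being chosen.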
00:03:36.800
Other presentation layer features include channel encodings,
00:03:39.700
where data is adapted to fit the
00:03:42.500
limitations of the communications channel. An example
00:03:45.300
is email, where the original design only
00:03:48.100
supported textual data in ASCII format.
00:03:50.500
When support for email attachments was added,
00:03:53.300
those attachments had to look like text,
00:03:56.100
so they could pass through mail servers
00:03:58.900
that hadn’t yet been upgraded. A channel
00:04:01.700
encoding scheme, known as BASE64 encoding,
00:04:04.100
was developed to do this, allowing arbitrary
00:04:06.900
data to be converted to a text-based
00:04:09.700
format for transmission.
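Python's standard base64 module implements this encoding, so the idea is easy to try. The two wrapper function names are just for this example.

```python
import base64

def to_text_safe(data: bytes) -> str:
    """Encode arbitrary bytes as ASCII text using Base64."""
    return base64.b64encode(data).decode("ascii")

def from_text_safe(text: str) -> bytes:
    """Recover the original bytes from Base64 text."""
    return base64.b64decode(text)
```

Each group of three input bytes becomes four ASCII characters, so the encoded form is about a third larger than the original data. That is the price of passing through a text-only channel.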
00:04:10.900
Finally, the presentation layer is often where
00:04:13.800
support for internationalisation is implemented. There are
00:04:16.600
two sorts of concerns here. One is
00:04:19.400
around labelling the character set used to
00:04:22.200
represent textual data, whether it is Unicode
00:04:25.000
text in UTF-8 format, or some national
00:04:27.800
character set such as ASCII, Latin1,
00:04:30.200
or the Big5 system used in Taiwan
00:04:33.000
and Hong Kong. The other is around
00:04:35.800
labelling the language, and possibly the regional
00:04:38.600
variant or dialect, that is used.
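A short sketch of why the character set label matters. The same text produces different bytes in different character sets, and decoding with the wrong label gives the wrong text; the sample string and the decode helper are chosen for illustration.

```python
text = "café"
utf8_bytes = text.encode("utf-8")      # é is encoded as two bytes
latin1_bytes = text.encode("latin-1")  # é is encoded as one byte

def decode(data: bytes, charset: str) -> str:
    """Decode received bytes using the character set named in the label."""
    return data.decode(charset)
```

Decoding utf8_bytes with the label "utf-8" recovers the original text, while decoding the same bytes as "latin-1" succeeds but yields different characters, with no error to warn the application.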
00:04:41.000
The problems addressed by the presentation layer
00:04:43.900
are wide-ranging, and tend to relate to
00:04:46.700
the broader system, and to the environment
00:04:49.500
in which the application operates, not just
00:04:52.300
to the network communication.
00:04:55.000
The final layer in the 7-layer OSI
00:04:58.119
reference model is the application layer.
00:05:00.706
It is here that protocol functions that
00:05:03.725
are specific to the application logic are
00:05:06.744
implemented. The application layer protocols deliver email
00:05:09.763
messages, retrieve web pages, stream video,
00:05:12.350
support multiplayer games, and so on.
00:05:14.938
By definition, such messages are entirely application
00:05:18.056
specific, and there is little that is
00:05:21.075
general to say here.
00:05:24.000
The OSI model is a reasonable way
00:05:27.116
of thinking about network protocols. Having a
00:05:30.132
common model helps to frame and organise
00:05:33.148
discussion of network protocols. Real networks are
00:05:36.164
more complex, and the layer boundaries are
00:05:39.180
less clear cut than the model suggests,
00:05:42.196
especially around the higher layers, but the
00:05:45.212
OSI model is close enough to reality
00:05:48.228
to be useful.
00:05:49.521
But, it misses two key layers:
00:05:52.206
the financial and the political.
00:05:54.360
Network protocols are only successful to the
00:05:57.476
extent that they enable different devices to
00:06:00.492
communicate. This requires interoperability between different implementations
00:06:03.508
of a protocol, implemented by different groups
00:06:06.524
of people.
00:06:07.386
Getting the incentives right, so that different
00:06:10.502
vendors can work together to ensure their
00:06:13.518
products interoperate, is a problem that rapidly
00:06:16.534
goes beyond the technical, and into the
00:06:19.550
realms of organisational politics, market forces,
00:06:22.135
regulation, and economics.
00:06:23.427
It’s an area where standards setting organisations,
00:06:26.543
such as the Internet Engineering Task Force,
00:06:29.559
the International Telecommunication Union, the World Wide Web
00:06:32.575
Consortium, the 3rd Generation Partnership Project,
00:06:35.161
the Moving Picture Experts Group, and others,
00:06:38.177
play an important role.
00:06:41.000
Network protocol standards are a very human
00:06:44.141
process. They’re the result of much discussion
00:06:47.181
and negotiation, much argument and compromise.
00:06:49.787
The IETF describes the outcome as “rough
00:06:52.828
consensus and running code”. The Internet works
00:06:55.869
because thousands of engineers spend the effort
00:06:58.909
to make sure it works, to make
00:07:01.950
sure their products work together, and that
00:07:04.991
the protocols are described well enough to
00:07:08.031
support interoperability.
00:07:10.000
This concludes our brief review of the
00:07:12.260
Internet protocols. In the next part,
00:07:14.111
we’ll start to think about how the
00:07:16.271
network is changing and evolving, to set
00:07:18.431
the scene for the remainder of the
00:07:20.591
course.
Part 7: The Changing Internet
Part 7, the final part of this lecture, moves on from the review to
talk about some of the changes occurring in the Internet, to begin to
set the scene for later lectures. It talks about the assumptions
made in the design of the network, and discusses some changes that
mean those assumptions don't necessarily hold in the modern Internet.
In particular, it talks about IPv4 address exhaustion, challenges
in establishing connectivity, increasing device mobility, hypergiants
and centralisation, the need to support real-time applications, and
challenges in securing the network. Finally, it briefly discusses
some of the difficulties faced in upgrading a globally deployed running
network.
Slides for part 7
00:00:00.000
We are in the middle of a
00:00:03.689
period of rapid change in the Internet
00:00:06.378
infrastructure. In this part, I want to
00:00:09.067
highlight some ways in which the network
00:00:11.756
is changing, to set the scene for
00:00:14.444
the remainder of the course.
00:00:16.365
In particular, I want to talk about
00:00:19.154
the exhaustion of the IPv4 address space,
00:00:21.843
and the accelerating transition to IPv6.
00:00:24.148
The increasing proportion of wireless, mobile,
00:00:26.552
Internet endpoints, such as smartphones, and the
00:00:29.241
implications of this shift to a network
00:00:31.930
of mobile devices.
00:00:33.083
The increasing centralisation of the network around
00:00:35.871
a small number of hypergiant content providers.
00:00:38.560
The increasing use of real-time applications,
00:00:40.965
and the corresponding need for low-latency transport.
00:00:43.654
And, finally, new approaches to protocol design
00:00:46.443
to support innovation in the face of
00:00:49.132
network ossification.
00:00:51.000
The designers of the Internet made certain
00:00:53.791
assumptions.
00:00:54.925
They assumed that devices would generally be
00:00:57.716
located at a fixed location in the
00:01:00.407
network, and would have a small number
00:01:03.099
of network interfaces that could be given
00:01:05.790
persistent and globally unique addresses.
00:01:07.712
They assumed the network, and the services
00:01:10.503
that run on it, would be operated
00:01:13.194
by many different organisations, working as peers,
00:01:15.885
and that there would be no central
00:01:18.576
points of control.
00:01:19.729
They assumed that a best effort packet delivery
00:01:22.520
service would provide sufficient quality, and that
00:01:25.211
applications would be adaptive, and able to
00:01:27.902
cope with changes in the network capacity
00:01:30.593
and available bandwidth.
00:01:31.746
They assumed that the network was trusted
00:01:34.537
and secure.
00:01:35.306
And they assumed that innovation would happen
00:01:38.097
at the edges, and that the network
00:01:40.788
itself would provide only a simple packet
00:01:43.479
delivery service.
00:01:44.248
These assumptions generally made sense for a
00:01:47.039
research network in the mid-1980s, when the
00:01:49.730
original design of the Internet protocols was
00:01:52.421
being finalised. Do they still make sense
00:01:55.112
for a network today?
00:01:57.000
The first assumption was that devices were
00:02:00.231
located at fixed places in the network,
00:02:03.362
and that each device had a small
00:02:06.493
number of network interfaces, that could be
00:02:09.624
given persistent and globally unique IP addresses.
00:02:12.755
This gives the desirable property that every
00:02:15.986
device is addressable by every other device.
00:02:19.117
It assumes a network of peers,
00:02:21.800
with no conceptual difference between clients and
00:02:24.931
servers, where any device connected to the
00:02:28.062
network can take either role, depending on
00:02:31.193
the software it runs. And where,
00:02:33.877
in principle, any device can connect to
00:02:37.008
any other device.
00:02:38.350
Of course, addressable doesn’t necessarily imply reachable,
00:02:41.580
and the Internet has long supported firewalls
00:02:44.711
to provide access control, by blocking access
00:02:47.842
to certain devices, but the ability for
00:02:50.973
any device to be a server empowers
00:02:54.104
end-users.
00:02:55.301
If any device can act as a
00:02:58.532
server, it ought to be possible to
00:03:01.663
run a website, or other service,
00:03:04.347
anywhere in the network, including on a
00:03:07.478
home machine. There should be no requirement
00:03:10.609
to pay for a dedicated server,
00:03:13.293
located in a managed data centre,
00:03:15.976
to host a website or other service.
00:03:19.107
Unfortunately, this assumption failed.
00:03:20.996
It failed because IPv4 had insufficient addresses
00:03:24.227
to support it, and because devices became
00:03:27.358
mobile.
00:03:28.555
The problem with the lack of addresses
00:03:31.786
in IPv4 results in many hosts sharing
00:03:34.917
an IP address with others, using a technique
00:03:38.048
known as network address translation (NAT).
00:03:40.732
In theory, IPv6 will solve this problem
00:03:43.863
by providing enough addresses for each host,
00:03:46.994
but IPv6 has been slow to deploy.
00:03:50.125
The result of this lack of addresses
00:03:53.356
is that connectivity becomes difficult.
00:03:55.592
It’s increasingly necessary to try both IPv4
00:03:58.823
and IPv6 addresses when connecting to a
00:04:01.954
device, perhaps racing connection attempts in parallel
00:04:05.085
to get good performance – a technique
00:04:08.216
known as Happy Eyeballs because it improves
00:04:11.347
end-user web browsing performance.
00:04:13.583
It’s also necessary to think about NAT
00:04:16.814
traversal. A client behind a NAT can
00:04:19.945
easily connect to a server on the
00:04:23.076
public Internet but, as we’ll discuss in
00:04:26.207
Lecture 2, it’s difficult to connect to
00:04:29.338
a device located behind a NAT,
00:04:32.022
and to build peer-to-peer applications.
00:04:34.258
This combination greatly increases the complexity of
00:04:37.489
establishing a connection.
00:04:38.831
This makes networked applications slower, and less
00:04:42.062
reliable. And, perhaps more importantly, it forces
00:04:45.193
server applications to run in data centres,
00:04:48.324
and discourages peer-to-peer applications. This forces reliance
00:04:51.454
on cloud services, and encourages centralisation of
00:04:54.585
Internet services onto large cloud providers,
00:04:57.269
such as Amazon Web Services and Google.
00:05:00.000
How severe is the shortage of IPv4
00:05:03.391
addresses?
00:05:04.612
Well, IP address assignment follows a hierarchical
00:05:08.003
model. The Internet Assigned Numbers Authority (IANA)
00:05:11.295
assigns blocks of IP addresses to Regional
00:05:14.586
Internet Registries. Those regional registries then assign
00:05:17.878
addresses to ISPs, and other organisations within
00:05:21.169
their region, and those in turn allocate
00:05:24.461
addresses to end users.
00:05:26.341
The IANA assigned the last available blocks
00:05:29.733
of IP addresses to the regional registries
00:05:33.024
in 2011 – around ten years ago.
00:05:36.316
Since then, the regional registries have gradually
00:05:39.707
been running down their pool of available
00:05:42.999
addresses. As of late 2020, the regional
00:05:46.290
registries for Europe, North America, Latin America,
00:05:49.582
and the Asia-Pacific Region have entirely run
00:05:52.873
out of available IPv4 addresses. Africa is
00:05:56.165
projected to run out of IPv4 addresses
00:05:59.456
during 2021.
00:06:00.397
There is now a thriving market in
00:06:03.788
the transfer of IPv4 addresses, with networks
00:06:07.080
that were previously assigned IPv4 addresses,
00:06:09.901
and have more than they need,
00:06:12.722
selling the addresses on to others.
00:06:15.544
As of late 2020, a single IPv4
00:06:18.835
address can be sold for around $25.
00:06:22.127
The IPv4 address space owned by the
00:06:25.418
University of Glasgow, for example, is worth
00:06:28.710
around $1,600,000.
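The two figures quoted above can be cross-checked with a little arithmetic. This is a minimal sketch, assuming that the holding is a single /16-sized block; the block size is an assumption made for illustration and is not stated in the lecture. The names `PRICE_PER_ADDRESS_USD` and `block_value_usd` are hypothetical.

```python
# Rough value of an IPv4 block at the per-address price quoted in the lecture.
PRICE_PER_ADDRESS_USD = 25  # late-2020 figure from the lecture


def block_value_usd(prefix_length: int) -> int:
    """Return the market value of an IPv4 block with the given prefix length."""
    # An IPv4 address has 32 bits, so a /N block holds 2**(32 - N) addresses.
    num_addresses = 2 ** (32 - prefix_length)
    return num_addresses * PRICE_PER_ADDRESS_USD


# ASSUMPTION: a single /16-sized block, chosen only for illustration.
value = block_value_usd(16)
print(f"${value:,}")  # 65,536 addresses x $25 = $1,638,400
```

Under that assumption the result is in line with the "around $1,600,000" figure.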
00:06:30.000
Despite this shortage of IPv4 addresses,
00:06:32.821
adoption of IPv6 has been relatively slow,
00:06:35.995
although it’s now reaching critical mass.
00:06:38.716
For example, as shown on the slide,
00:06:41.991
Google currently reports that around a third
00:06:45.165
of its users are on IPv6.
00:06:47.886
Availability of IPv6 is highly variable,
00:06:50.707
since network operators tend to switch their
00:06:53.881
customers all at once. Generally, either all
00:06:57.056
the users in a particular network have
00:07:00.230
IPv6, or none of them. In the
00:07:03.405
UK, for example, most mobile networks assign
00:07:06.579
IPv6 addresses to smartphones on their networks
00:07:09.753
by default, but many residential ISPs and
00:07:12.928
businesses still use IPv4. What type of
00:07:16.102
address you get depends on how you
00:07:19.277
connect to the network.
00:07:21.091
Other countries follow similar patterns, with some
00:07:24.365
networks switching wholesale to IPv6, while others
00:07:27.540
remain on IPv4.
00:07:30.000
This mixed use of IPv4 and IPv6,
00:07:33.276
with many IPv4 hosts being located behind
00:07:36.453
network address translators, greatly complicates connection establishment.
00:07:39.629
In the IPv4 Internet, peer-to-peer applications must
00:07:42.905
perform a complex process to discover NAT
00:07:46.082
bindings, exchange candidate addresses with their peer,
00:07:49.258
and probe to establish what addresses are
00:07:52.434
usable for a connection.
00:07:54.249
This uses a set of protocols,
00:07:57.072
known as STUN and TURN, and the
00:08:00.248
assistance of a central server with a
00:08:03.424
globally unique public IPv4 address, to detect
00:08:06.601
the presence of network address translation,
00:08:09.323
to determine what type of address translation
00:08:12.499
is being performed, and to derive a
00:08:15.676
set of candidate addresses that can be
00:08:18.852
used for peer-to-peer communication.
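The NAT binding discovery step can be sketched in code. The following is a minimal illustration, not a complete STUN client: it builds a STUN Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of a response, which carries the public address and port that the server saw. It handles IPv4 only, and omits sending the request, retransmission, authentication, and TURN. The function names are hypothetical; the constants follow the STUN message format.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value in every STUN header
BINDING_REQUEST = 0x0001       # message type: Binding Request
XOR_MAPPED_ADDRESS = 0x0020    # attribute holding the reflexive address


def build_binding_request(transaction_id=None):
    """Return (message, transaction_id) for a Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    # Header: type (16 bits), attribute length (16 bits), magic cookie (32 bits).
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_xor_mapped_address(message):
    """Return (ip, port) from the first IPv4 XOR-MAPPED-ADDRESS, or None."""
    offset = 20  # attributes start after the 20-byte header
    while offset + 4 <= len(message):
        attr_type, attr_len = struct.unpack_from("!HH", message, offset)
        value = message[offset + 4: offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack_from("!HI", value, 2)
            # The port and address are XORed with the magic cookie.
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = struct.pack("!I", xaddr ^ MAGIC_COOKIE)
            return socket.inet_ntoa(addr), port
        # Attributes are padded to a multiple of four bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

A host that compares the address returned by `parse_xor_mapped_address` with its own local address can tell whether a NAT sits on the path: if the two differ, the address has been translated.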
00:08:21.575
The peers use the central server to
00:08:24.751
exchange these candidate addresses with each other,
00:08:27.927
and then follow an algorithm known as
00:08:31.103
Interactive Connectivity Establishment (ICE) to set up direct,
00:08:34.280
low-latency, peer-to-peer flows.
00:08:35.641
This works, most of the time,
00:08:38.464
but is complicated, slow, power hungry,
00:08:41.186
and generates a lot of otherwise unnecessary
00:08:44.362
traffic – and all because there aren’t
00:08:47.539
enough IPv4 addresses.
00:08:50.000
The fix, of course, is to move
00:08:53.290
to IPv6.
00:08:54.201
Unfortunately, the move to IPv6 will take
00:08:57.490
a long time. And, while it’s happening,
00:09:00.680
there will be some devices, and some
00:09:03.869
networks, that support only IPv4, some that
00:09:07.059
support only IPv6, and some that support
00:09:10.248
both.
00:09:11.454
To reach users on both IPv4 and
00:09:14.744
IPv6, popular services tend to be hosted
00:09:17.933
on servers that have both IPv4 and
00:09:21.123
IPv6 addresses. This is known as dual
00:09:24.312
stack hosting, and it further encourages centralisation
00:09:27.502
onto large hosting providers with the resources
00:09:30.691
to provide both types of address.
00:09:33.425
To get good performance, clients must try
00:09:36.715
to connect using both IPv4 and IPv6
00:09:39.904
addresses simultaneously, or near simultaneously. This further
00:09:43.094
complicates connection setup, making it harder to
00:09:46.283
write networked applications.
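As an illustration of what racing IPv4 and IPv6 connection attempts involves, here is a minimal sketch of the Happy Eyeballs idea mentioned earlier. It uses the `happy_eyeballs_delay` and `interleave` options of Python's `asyncio` (available since Python 3.8) rather than implementing the race by hand. The helper names `interleave_by_family` and `connect_dual_stack`, and the 0.25 second delay, are assumptions chosen for the example.

```python
import asyncio
import socket
from itertools import zip_longest


def interleave_by_family(addrinfos):
    """Reorder getaddrinfo()-style results so address families alternate, IPv6 first."""
    v6 = [ai for ai in addrinfos if ai[0] == socket.AF_INET6]
    v4 = [ai for ai in addrinfos if ai[0] == socket.AF_INET]
    ordered = []
    for a, b in zip_longest(v6, v4):
        if a is not None:
            ordered.append(a)
        if b is not None:
            ordered.append(b)
    return ordered


async def connect_dual_stack(host, port, delay=0.25):
    """Open a TCP connection, letting asyncio stagger attempts across both families.

    If an attempt has not completed within `delay` seconds, the next candidate
    address is tried in parallel, and the first connection to succeed is used.
    """
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=delay, interleave=1
    )
    return reader, writer
```

The ordering helper shows why a client does not simply try every address in the order the resolver returned them: alternating families means that a broken IPv6 path delays the connection by one short timer rather than by a full connection timeout.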
00:09:48.000
The Internet Protocols are designed such that
00:09:50.907
IP addresses encode the location of a
00:09:53.714
network interface within the network. An IP
00:09:56.521
address does not represent a device,
00:09:58.927
it represents a location where a device
00:10:01.733
can attach to the network.
00:10:03.738
If a device is attached to the
00:10:06.645
network via a wireless connection, but moves
00:10:09.452
so that it changes the WiFi or
00:10:12.259
4G base station to which it connects,
00:10:15.066
then that device will be assigned a
00:10:17.873
new IP address.
00:10:19.076
This has some privacy benefits, but makes
00:10:21.983
it difficult to maintain long-lived connections with
00:10:24.790
that device. For example, TCP connections fail,
00:10:27.597
and must be re-established by the application,
00:10:30.403
when a device moves. And UDP applications
00:10:33.210
need to coordinate with their peers to
00:10:36.017
change the IP addresses they use for
00:10:38.824
communication.
00:10:39.975
Applications that want to maintain long-lived connections,
00:10:42.882
or that want to accept incoming connections,
00:10:45.689
must deal with the complexity of changing
00:10:48.496
IP addresses, and the need to signal
00:10:51.303
such changes to their peers and reestablish
00:10:54.110
connections as the device moves.
00:10:56.115
Something has to keep track of where
00:10:59.021
each device is, in order to route
00:11:01.828
traffic. The assumptions in the design of
00:11:04.635
the Internet mean that complexity is visible
00:11:07.442
to applications, rather than hidden inside the
00:11:10.249
network.
00:11:11.000
As we’ve seen, several aspects of the
00:11:14.324
Internet’s design push toward centralisation.
00:11:16.627
The network topology is gradually flattening,
00:11:19.491
and moving away from a complex mesh
00:11:22.715
of peer connections, towards a hub-and-spoke model
00:11:25.939
centred around a small number of large,
00:11:29.164
centralised, services, directly connecting to so-called “eyeball”
00:11:32.388
networks, at the edge of the network,
00:11:35.612
where consumers of those services live.
00:11:38.376
This enables the set of hypergiant content
00:11:41.700
providers, including Google, Facebook, Amazon, Akamai,
00:11:44.464
Apple, Netflix, and so on, to dominate,
00:11:47.688
and makes it difficult for new competitors
00:11:50.912
to gain a foothold. This has implications
00:11:54.136
for network neutrality, competition, and innovation.
00:11:58.000
We’re also seeing steady growth of real-time
00:12:01.077
traffic, with streaming video being, by far,
00:12:04.055
the dominant type of traffic in the
00:12:07.032
network – while still growing at ~40%
00:12:10.009
year on year.
00:12:11.285
Streaming video has reasonably strict timing and
00:12:14.362
quality constraints, and these push network operators
00:12:17.340
to improve the quality of their networks,
00:12:20.317
and push streaming video content providers to
00:12:23.294
peer directly with the residential edge networks.
00:12:26.271
This is a further incentive towards centralisation,
00:12:29.249
and the flattening of the network, since such
00:12:32.226
direct peerings make it easier to ensure
00:12:35.203
high-quality video is delivered to viewers.
00:12:37.755
Increases in streaming video are also driving
00:12:40.832
changes in TCP congestion control, such as
00:12:43.810
Google’s BBR algorithm, and the development of
00:12:46.787
TCP replacements, such as QUIC, both aimed
00:12:49.764
at reducing latency and increasing quality.
00:12:52.316
Lectures 4 and 5 will talk about
00:12:55.294
some of these developments.
00:12:56.995
WebRTC-based video conferencing services, such as Zoom,
00:13:00.072
Webex, and Microsoft Teams, have even stricter
00:13:03.049
latency requirements.
00:13:05.000
The COVID-19 pandemic has accelerated these effects.
00:13:08.345
This graph shows measurements of Internet traffic
00:13:11.691
as the initial lockdowns started in March
00:13:14.936
2020. It shows that many residential networks
00:13:18.182
saw a 20-25% increase in the total
00:13:21.427
amount of Internet traffic they were carrying,
00:13:24.673
as people shifted to working from home.
00:13:27.918
It also shows a corresponding drop in
00:13:31.164
mobile traffic.
00:13:32.091
That shift in traffic wasn’t evenly distributed,
00:13:35.436
though.
00:13:37.000
This second graph shows the amount of
00:13:40.090
video data being sent over Webex,
00:13:42.653
one of the popular video conferencing platforms,
00:13:45.643
over a similar time period. Usage grew
00:13:48.633
by around a factor of 20 in
00:13:51.623
less than a week, and has continued
00:13:54.614
growing since.
00:13:55.468
Impressively, the Internet was flexible and robust
00:13:58.558
enough to support this rapid change in
00:14:01.548
how it is used.
00:14:03.257
The question is, can we maintain such
00:14:06.347
flexibility while also improving quality? Lectures 6
00:14:09.337
and 7 discuss this topic further.
00:14:13.000
The final shift in assumptions has been
00:14:16.191
around security.
00:14:17.074
The Internet Protocols were originally designed to
00:14:20.265
support a research network, with a relatively
00:14:23.355
small set of users, who had reasonably
00:14:26.446
closely aligned goals, and provided little in
00:14:29.537
the way of security.
00:14:31.303
Over the years, the protocols have changed
00:14:34.494
to provide increasingly sophisticated security and protection
00:14:37.585
from attacks. The Edward Snowden revelations accelerated
00:14:40.675
that trend, by increasing awareness of large-scale
00:14:43.766
government surveillance, but the increase in security
00:14:46.857
started before that, in response to hacking
00:14:49.948
and criminal activity.
00:14:51.272
A significant challenge going forward will be
00:14:54.463
in balancing the needs of law enforcement
00:14:57.554
to access and monitor some traffic,
00:15:00.203
in a targeted manner, while preserving privacy
00:15:03.294
and protecting against attackers.
00:15:05.060
We’ll talk about these topics more in
00:15:08.251
Lectures 3, 4, 8, and 9.
00:15:12.000
The Internet is now a globally deployed
00:15:15.103
network. Like any large system, it becomes
00:15:18.105
increasingly ossified, increasingly difficult to change,
00:15:20.679
over time.
00:15:21.537
The slow transition from IPv4 to IPv6
00:15:24.640
is one example of this ossification.
00:15:27.214
Another example would be the difficulty in
00:15:30.316
updating TCP, to better support low-latency services
00:15:33.319
and improve performance. The widespread use of
00:15:36.322
NATs, firewalls, and other middleboxes makes such
00:15:39.325
changes surprisingly difficult. We’re now starting to
00:15:42.327
see serious attempts to replace TCP,
00:15:44.901
with protocols such as QUIC that employ
00:15:47.904
pervasive encryption and tunnel over UDP to
00:15:50.907
avoid such interference by the legacy network,
00:15:53.909
but it’s not clear that these will
00:15:56.912
succeed.
00:15:58.091
Finally, we must consider the effects of
00:16:01.194
the push towards centralised services and applications,
00:16:04.196
driven by both technical and business considerations,
00:16:07.199
and whether these are beneficial to consumers
00:16:10.202
and users of the network, or not.
00:16:13.205
Is this shift towards a small number
00:16:16.207
of hypergiant service providers an inevitable consequence
00:16:19.210
of the design of the network,
00:16:21.784
or of the business and regulatory environment,
00:16:24.787
and to what extent should we attempt
00:16:27.789
to influence it through technological change in
00:16:30.792
the network?
00:16:32.000
The way the network is used is
00:16:34.890
changing, and the technologies that support that
00:16:37.580
use are necessarily shifting too. The network
00:16:40.269
has become more fragmented, there are more
00:16:42.959
serious security threats, more demanding applications,
00:16:45.265
and some significant shifts in the devices
00:16:47.955
and technologies we use to access the
00:16:50.644
network.
00:16:51.779
In the rest of this course,
00:16:54.184
I want to start to discuss how
00:16:56.874
the protocols that form the Internet are
00:16:59.564
evolving to meet these needs, and to
00:17:02.254
highlight some of the open issues and
00:17:04.944
challenges still to be addressed.
00:17:06.865
We’ll start, in the next lecture,
00:17:09.270
by discussing the increasing fragmentation of the
00:17:11.960
network and its implications for connection establishment.
Discussion
The goal of the Networked Systems (H) course is to discuss how the
Internet is changing to support more devices, to improve real-time
and low-latency applications, and to increase security.
The recording for lecture 1 reviewed some prior material, which should
be somewhat familiar to you from the Networks and Operating Systems
Essentials course in Level 2, and introduced some of the changes we'll
discuss in the remainder of the course. The live discussion session
will briefly recap this review, then discuss the following points.
IPv4, IPv6, and NAT:
One of the ongoing changes in the network is the transition from IPv4
to IPv6. The lecture presented data from Google showing that about 40%
of their traffic is running over IPv6. Does your home network support
IPv6? What about your mobile provider?
Try https://ipv6-test.com to find out.
Due to the shortage of IPv4 addresses, many networks use NAT to share
IP addresses. We'll talk more about this in Lecture 2, but for now,
find out whether your home network uses IPv4 with a NAT. There are
instructions for finding your machine’s local IP address online.
If it's using one of the private address ranges (10.0.0.0 - 10.255.255.255,
172.16.0.0 - 172.31.255.255, or 192.168.0.0 - 192.168.255.255) your
home network is using a NAT.
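The check described above can also be done programmatically. This is a minimal sketch using Python's standard `ipaddress` module; the function name `uses_private_range` and the example addresses are hypothetical, and the three networks are the private ranges listed above written in prefix notation.

```python
import ipaddress

# The three private IPv4 ranges listed above, in prefix notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 - 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 - 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 - 192.168.255.255
]


def uses_private_range(address: str) -> bool:
    """Return True if the address falls in one of the private IPv4 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


# Example: substitute your machine's local IP address here.
print(uses_private_range("192.168.1.10"))
```

If the function returns True for your machine's local address, your home network is most likely using a NAT.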
Google “what is my IP” to find your public IPv4 and IPv6 address, and
compare these with the addresses your network uses internally.
Does it matter that we’re running out of IPv4 addresses?
Real-time Applications:
Streaming video is the majority of Internet traffic, and video
conferencing providers saw a massive traffic increase due to the
pandemic. The lecture presented some data to illustrate this, and we'll
talk about the issues more in the rest of the course.
How well have video conferencing apps worked for you?
Do you see frequent quality problems?
Hypergiants, Centralisation, and Security:
Internet topology is flattening and becoming increasingly centralised,
with direct connections from “eyeball” networks to massive content
providers – Google, Facebook, Amazon, Apple, Akamai, etc.
What are the implications for network neutrality, competition,
innovation, privacy, freedom of speech, pervasive monitoring, and
security?