QUIC Multiplexing, Peer-to-Peer, and WebRTC
The initial focus of QUIC development has been on client-server use, primarily as a transport for HTTP/2. In the long term, however, if QUIC is to become a general purpose transport, it must be usable by peer-to-peer applications. This requires support for NAT traversal, for which the IETF has developed the STUN protocol and the ICE algorithm. The version of QUIC specified in draft-ietf-quic-transport-07 can't easily support this, since its packets are formatted such that they're difficult to distinguish from STUN packets. This post outlines the problems and proposes changes to simplify demultiplexing QUIC and STUN packets. It also considers how to distinguish QUIC packets from other protocols such as those used by WebRTC.
Establishing a peer-to-peer connection in the presence of NATs is a multi-step process. First, a host gathers its local interface addresses, then uses the STUN protocol to probe for the presence of NATs and to discover how they translate IP addresses and ports ("binding discovery"). Next, the host shares its list of candidate addresses, both local and as translated by the NAT, with its peer using some indirect signalling channel. Finally, it systematically checks connectivity between each of its local addresses and each of its peer's candidate addresses, to determine if a usable peer-to-peer path exists. The ICE algorithm describes the order in which these checks are performed.
For the purposes of establishing a peer-to-peer QUIC connection, the important point is that the STUN packets used for binding discovery must be sent using the same 5-tuple (UDP with the same local and remote IP addresses and ports) as the QUIC packets. Accordingly, it must be possible to efficiently demultiplex QUIC packets and STUN packets at the receiver.
How can QUIC and STUN be demultiplexed? One way would be to rely on the fact that both protocols use per-packet authentication, but implement it using different mechanisms, so that QUIC packets won't authenticate as STUN, and vice versa. This is computationally expensive, however, so a more efficient solution is desirable.
The WebRTC community has similar demultiplexing concerns, and has developed a heuristic that can distinguish the protocols they use based on known values in the first octet of the packet. For example, RTP and RTCP packets have a fixed version number (0x02) in the top two bits, therefore their first octet must be in the range 128-191. Similar analysis has been done for STUN, ZRTP, DTLS, and TURN, in some cases slightly restricting their usable extension space, to get the table of possible values for their first octet shown below:
            +----------------+
            |        [0..3] -+--> forward to STUN
            |                |
            |      [16..19] -+--> forward to ZRTP
incoming    |                |
packet -->  |      [20..63] -+--> forward to DTLS
            |                |
            |      [64..79] -+--> forward to TURN Channel
            |                |
            |    [128..191] -+--> forward to RTP/RTCP
            +----------------+
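The table can be read as a simple range lookup on the first octet of each incoming datagram. The following is a minimal sketch of that idea in Python, not taken from any WebRTC implementation; the names `DEMUX_TABLE` and `classify_packet` are hypothetical and chosen only for illustration.

```python
from typing import Optional

# First-octet ranges from the table above (inclusive bounds).
DEMUX_TABLE = [
    (0, 3, "STUN"),
    (16, 19, "ZRTP"),
    (20, 63, "DTLS"),
    (64, 79, "TURN Channel"),
    (128, 191, "RTP/RTCP"),
]


def classify_packet(datagram: bytes) -> Optional[str]:
    """Return the protocol that should receive this datagram, or None."""
    if not datagram:
        return None  # empty datagram: there is no first octet to inspect
    first = datagram[0]
    for low, high, protocol in DEMUX_TABLE:
        if low <= first <= high:
            return protocol
    return None  # value not claimed by any protocol in the table
```

A receiver would call `classify_packet` on each datagram and hand it to the matching protocol handler, which then applies its own validity checks.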
This demultiplexing approach cannot be directly used with draft-ietf-quic-transport-07. QUIC defines two packet types: long header and short header packets. The format of long header packets is shown below. They have the high bit of the first octet set to 1 and the lower seven bits as a packet type field, with types 0x01-0x06 defined, giving first octets in the range 129-134, conflicting with RTP.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|1|   Type (7)  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                       Connection ID (64)                      +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Packet Number (32)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Version (32)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Payload (*)                        ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first octet of QUIC short header packets, shown below, starts with a zero bit, followed by a Connection ID Flag (C) that is set to 1 if the packet contains a Connection ID field, a Key Phase Flag (K), and a packet type, with types 0x01-0x03 defined. Depending on the values of the C and K flag bits and the packet type, this gives a first octet in the range 1-3, 33-35, 65-67, or 97-99. The first three of these ranges conflict with STUN, DTLS, and TURN respectively.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|0|C|K| Type (5)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     [Connection ID (64)]                      +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Packet Number (8/16/32)                ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Protected Payload (*)                   ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
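To make the two first-octet layouts concrete, the sketch below decodes the first octet of a draft-07 packet into its fields, following the bit positions in the diagrams above. It is an illustration only: the `FirstOctet` type and `parse_first_octet` function are hypothetical names, not part of any QUIC implementation.

```python
from typing import NamedTuple, Optional


class FirstOctet(NamedTuple):
    long_header: bool             # True if the high bit is set
    conn_id_flag: Optional[int]   # C bit (short header only)
    key_phase: Optional[int]      # K bit (short header only)
    packet_type: int              # 7 bits (long) or 5 bits (short)


def parse_first_octet(octet: int) -> FirstOctet:
    """Split a draft-07 first octet into header form, flags, and type."""
    if octet & 0x80:
        # Long header: 1 | Type (7)
        return FirstOctet(True, None, None, octet & 0x7F)
    # Short header: 0 | C | K | Type (5)
    return FirstOctet(False, (octet >> 6) & 1, (octet >> 5) & 1, octet & 0x1F)
```

For example, `parse_first_octet(129)` yields a long header with packet type 0x01, yet 129 is also a valid first octet for RTP, which is the collision described above.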
It's possible, however, to renumber QUIC packets in a way that avoids the majority of the collisions. For long header packets, the packet type field can be renumbered to count down from 0x7F-0x7B rather than up from 0x01-0x06. Similarly, for short header packets, the types can be renumbered to count down from 0x1F-0x1D rather than up from 0x01-0x03. Finally, in short header packets, the sense of the C bit can be inverted, becoming an Omit Connection ID Flag that's set to 1 if the Connection ID field is omitted.
With these changes, QUIC long header packets will have a first octet in the range 251-255, avoiding the previous conflict with RTP. QUIC short header packets will have a first octet in the range 93-95 or 125-127 if the connection ID is omitted, avoiding conflicts, but will have a first octet in the range 29-31 or 61-63 if the connection ID is present, and so will conflict with DTLS.
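These ranges can be checked mechanically. The sketch below builds the first octet for each renumbered packet type (long header types 0x7B-0x7F, short header types 0x1D-0x1F, with the Omit Connection ID flag in the second-highest bit, as described in the text) and reports any value that falls in a range already claimed by a WebRTC protocol. The function and variable names are hypothetical.

```python
# First-octet ranges claimed by existing protocols (WebRTC demultiplexing table).
CLAIMED = {
    "STUN": range(0, 4),
    "ZRTP": range(16, 20),
    "DTLS": range(20, 64),
    "TURN Channel": range(64, 80),
    "RTP/RTCP": range(128, 192),
}

LONG_TYPES = range(0x7B, 0x80)   # renumbered long header types, 0x7F down to 0x7B
SHORT_TYPES = range(0x1D, 0x20)  # renumbered short header types, 0x1F down to 0x1D


def long_octet(ptype: int) -> int:
    """First octet of a long header packet: 1 | Type (7)."""
    return 0x80 | ptype


def short_octet(omit_cid: int, key_phase: int, ptype: int) -> int:
    """First octet of a short header packet: 0 | O | K | Type (5)."""
    return (omit_cid << 6) | (key_phase << 5) | ptype


def conflicts(octets):
    """Map each conflicting first octet to the protocol that already claims it."""
    return {o: name for o in octets for name, r in CLAIMED.items() if o in r}


long_octets = [long_octet(t) for t in LONG_TYPES]
no_cid = [short_octet(1, k, t) for k in (0, 1) for t in SHORT_TYPES]
with_cid = [short_octet(0, k, t) for k in (0, 1) for t in SHORT_TYPES]

print(conflicts(long_octets))  # no conflicts expected
print(conflicts(no_cid))       # no conflicts expected
print(conflicts(with_cid))     # every value lands in the DTLS range
```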
Does this solve the demultiplexing problem for peer-to-peer QUIC? Yes, since it allows efficient demultiplexing of QUIC and STUN based on the value of the first octet of the packet: none of the renumbered QUIC values fall in the STUN range 0-3. Each protocol will then perform its own integrity check, and so will catch any misdirected packets.
It might also be useful to run QUIC and WebRTC on a single UDP port, for example to serve WebRTC media from a QUIC-based web server, or to replace the WebRTC data channel with a QUIC-based alternative. With the QUIC packets renumbered as suggested here, it becomes possible to efficiently demultiplex WebRTC traffic and QUIC, provided the QUIC flows omit the connection ID in short header packets.
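One way to picture such a combined demultiplexer is a 256-entry lookup table indexed by the first octet, so that each datagram is routed with a single array access. The sketch below assumes the renumbered QUIC values given above (93-95 and 125-127 for short headers without a connection ID, 251-255 for long headers); `ROUTES`, `build_lookup`, and `route` are hypothetical names used only for illustration.

```python
from typing import List, Optional, Tuple

# Inclusive first-octet ranges: WebRTC protocols plus renumbered QUIC.
ROUTES: List[Tuple[int, int, str]] = [
    (0, 3, "STUN"),
    (16, 19, "ZRTP"),
    (20, 63, "DTLS"),
    (64, 79, "TURN Channel"),
    (93, 95, "QUIC short header"),    # connection ID omitted, K=0
    (125, 127, "QUIC short header"),  # connection ID omitted, K=1
    (128, 191, "RTP/RTCP"),
    (251, 255, "QUIC long header"),
]


def build_lookup(routes: List[Tuple[int, int, str]]) -> List[Optional[str]]:
    """Expand the ranges into a 256-entry table, rejecting overlaps."""
    table: List[Optional[str]] = [None] * 256
    for low, high, handler in routes:
        for octet in range(low, high + 1):
            if table[octet] is not None:
                raise ValueError(
                    f"octet {octet} claimed by both {table[octet]} and {handler}"
                )
            table[octet] = handler
    return table


LOOKUP = build_lookup(ROUTES)


def route(datagram: bytes) -> Optional[str]:
    """Return the handler for a datagram, or None if it is empty or unclaimed."""
    return LOOKUP[datagram[0]] if datagram else None
```

Adding the ranges for short header packets that carry a connection ID (29-31 and 61-63) to `ROUTES` makes `build_lookup` raise an error, because those values are already claimed by DTLS. This is why the combined scheme depends on the connection ID being omitted.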
I submitted pull request #956 to the QUIC base drafts to implement the changes outlined here. The issues mentioned are further considered in a recent submission to the IETF (draft-aboba-avtcore-quic-multiplexing-01) that I prepared along with Bernard Aboba and Peter Thatcher. I gave a presentation on this in the AVTCORE working group at IETF 100 (slides) and Martin Thomson gave a related presentation in the QUIC working group.
In the long run, this approach to demultiplexing protocols on a single UDP port is clearly not sustainable. If we're to avoid reinventing STUN within every new protocol, it seems appropriate to define a standard way to demultiplex STUN packets from some sort of shim layer that can signal the actual next layer protocol to be demultiplexed in an extensible manner. That is, to introduce a path layer above UDP that handles NAT traversal and upper layer protocol identification, and to run new protocols above that. It may be too late to do this for QUIC, and it's certainly too late for WebRTC, but the current approach does not look sustainable.