Research: Real Time Collaboration on the World Wide Web (WebRTC)

RTP Multiplexing Architecture (-00 working group draft)

Posted on in category rtcweb

Our draft on Guidelines for using the Multiplexing Features of RTP has been adopted as a work item of the AVTCORE working group. We've submitted an updated version of the draft to reflect this change in status. There are no technical changes in this version.


IETF 86 — Presentation on Use of RTCP XR in WebRTC

Posted on in category rtcweb

I will be giving a short presentation on the use of RTCP XR reports in WebRTC at the IETF 86 meeting in Orlando on 12 March 2013. This will focus on two of the open issues in our rtp-usage draft: what RTCP XR blocks are required for performance monitoring, and what congestion feedback is needed to support pre-RMCAT congestion control (in particular, is any RTCP XR feedback needed to support this).


Web Real-Time Communication (WebRTC): Media Transport and Use of RTP (-06)

Posted on in category rtcweb

We submitted version -06 of our draft on WebRTC Media Transport and Use of RTP just before the deadline for IETF 86. The changes in this version of the draft are as follows:

  • Expand and clarify discussion of RTP session multiplexing in Section 4.4
  • Add Section 7.2 on RTCP extensions for congestion control
  • Clarify Section 12.1 on RTP Sessions and PeerConnections
  • Expand Section 12.4 on SSRC collision detection
  • Rewrite and clarify Section 12.5 on Contributing Sources and the CSRC list
  • Rewrite and clarify Section 12.9 on differentiated treatment of flows
  • Expand security considerations
There is no agenda time set aside for general discussion of the draft in the Orlando IETF meeting, but there will be a discussion on the open issue around the use of RTCP XR blocks in the RTCWEB working group.


RTP Multiplexing Architecture (-03)

Posted on in category rtcweb

We submitted an update to our draft on Guidelines for using the Multiplexing Features of RTP. This is a relatively minor update, mostly removing text on new RTP topologies, which is now in the RFC5117bis draft. We are not planning to discuss this draft at IETF 86, but will work further on it later in the year.


Multiple Media Types in an RTP Session (-02)

Posted on in category rtcweb

Our draft on Multiple Media Types in an RTP Session was updated for IETF 86. This revision adds sections 6.4.1 on Timing out SSRCs and 6.4.2 on Tuning RTCP transmissions, and fixes the use of RFC 2119 normative language in some places, but is otherwise a minor update.


Multiple RTP Sessions on a Single Lower-Layer Transport (-05)

Posted on in category rtcweb

We submitted another minor update to our draft on Multiple RTP Sessions on a Single Lower-Layer Transport. The only changes are to correct minor editorial problems. We hope to progress this draft further later in the year, but will not be discussing it at IETF 86 in Orlando.


IETF 85 — Presentation on Multiple RTP Sessions on a Single Lower-layer Transport

Posted on in category rtcweb

Magnus Westerlund gave a brief presentation about our draft on Multiple RTP Sessions on a Single Lower-Layer Transport in the AVTCORE working group at IETF 85 in Atlanta. This talk highlights the changes since the last meeting, and calls for working group adoption.


IETF 85 — Presentation on Multiple Media Types in an RTP Session

Posted on in category rtcweb

I gave a presentation about our draft on Multiple Media Types in an RTP Session at the IETF 85 Meeting in Atlanta on 5 November 2012. This talk summarises the changes made to the draft since the last IETF meeting, and highlights open issues around the choice of RTCP reporting interval and the SSRC timeout rules for RTP sessions with widely varying media bandwidth.


Web Real-Time Communication (WebRTC): Media Transport and Use of RTP (-05)

Posted on in category rtcweb

We submitted an updated version of our draft on WebRTC Media Transport and Use of RTP just before the deadline for IETF 85. The changes in this version of the draft are as follows:

  • Use RFC 2119 terminology by reference, rather than by copying the definitions.
  • In Section 4.2 on the Choice of RTP Profile, note that the RTP/SAVPF profile with the updated list of recommended codecs is mandated, not the standard RTP/SAVPF profile. Update Section 4.3 on the Choice of RTP Payload Formats to match.
  • In Section 4.6, clarify that the use of non-compound RTCP packets MUST be negotiated on the signalling channel before use, and that implementations are REQUIRED to support compound RTCP feedback packets if the remote endpoint does not agree to use non-compound RTCP packets.
  • In Section 4.9, remove the reference to RFC 6222 and instead reference the RFC 6222bis draft.
  • Update references to RFC 5117 to point to the RTP Topologies update draft.
  • In Section 5.1.1, clarify that a WebRTC sender is REQUIRED to understand and react to FIR messages it receives, but that sending FIR messages is OPTIONAL.
  • Rewrite Section 7 on rate control and media adaptation for clarity. Merge the previous Sections 7.1 and 7.2 into a single new section, and try to better explain the relationship between the RTP circuit breakers, the signalled SDP bandwidth limitations, and any RTP/AVPF TMMBR messages.
  • Add Section 13 on Open Issues.
  • Revise and expand Section 15 on Security Considerations.
In addition, there are numerous minor editorial corrections throughout the draft.


Multiple Media Types in an RTP Session (-01)

Posted on in category rtcweb

Our draft on Multiple Media Types in an RTP Session was adopted as a work item of the IETF AVTCORE working group during the IETF 84 meeting in Vancouver in July 2012. The -00 working group draft brings the references up-to-date, but is otherwise identical to the last individual submission.

We submitted an updated draft just prior to the IETF 85 submission cut-off. This version makes the following changes:

  • Note that RFC 3551 mandates that audio and video are run on separate RTP sessions, and so needs to be updated if we are to allow multiple media types in a single RTP session. Section 6.1 is updated to specify this.
  • Update Section 3.3, to clarify what is meant by architectural equality. In particular, note that RTP requires a common session bandwidth, so any flows that are multiplexed together in an RTP session need to have similar bandwidth requirements.
  • Update Section 5.1 to further clarify that all flows that are to be multiplexed together need to have similar bandwidth requirements, due to the requirement for a common RTCP reporting interval.
  • Update Sections 6.3 and 7.2 to clarify use with the RTP payload format for Generic FEC.
  • Update Section 6.4 to discuss the constraints imposed by the single RTCP reporting interval.
This draft will be discussed in the AVTCORE working group meeting at IETF 85 in Atlanta in November 2012.


Multiple RTP Sessions on a Single Lower-Layer Transport (-04)

Posted on in category rtcweb

We submitted a minor update to our draft on Multiple RTP Sessions on a Single Lower-Layer Transport. The main changes in this version are:

  • Make the prefix-based shim the preferred solution, following the discussions at IETF 84 and elsewhere. Define the shim as being two octets, with a 14-bit session ID and 2 bits to avoid clashes with other protocols.
  • Clarify that the prefix needs to be chosen to avoid clashes with any STUN or DTLS packets multiplexed on the same port.
  • Note that the signalling is waiting on the outcome of the BUNDLE discussions in the IETF MMUSIC working group.
  • Note that DTLS-SRTP and MIKEY key derivation might need to be updated to avoid having to derive new keys for each multiplexed session. Details are to-be-determined.
This draft will be discussed in the AVTCORE working group meeting at IETF 85 in Atlanta, where we hope it will be adopted as a working group item.
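As a rough illustration of the two-octet prefix described above, the following sketch packs a 14-bit session ID together with 2 further bits in front of an RTP packet. The bit layout (the 2 bits in the most significant positions, followed by the session ID), the default value used for those 2 bits, and the function names are all assumptions made for illustration; the draft itself defines the actual format.

```python
import struct
from typing import Tuple

SESSION_ID_BITS = 14
SESSION_ID_MASK = (1 << SESSION_ID_BITS) - 1  # 0x3FFF


def add_shim(session_id: int, rtp_packet: bytes, marker_bits: int = 0b11) -> bytes:
    """Prepend a two-octet shim: 2 marker bits, then a 14-bit session ID.

    The layout and the default marker value are hypothetical.
    """
    if not 0 <= session_id <= SESSION_ID_MASK:
        raise ValueError("session ID must fit in 14 bits")
    if not 0 <= marker_bits <= 0b11:
        raise ValueError("marker must fit in 2 bits")
    shim = (marker_bits << SESSION_ID_BITS) | session_id
    # "!H" is a 16-bit unsigned integer in network byte order.
    return struct.pack("!H", shim) + rtp_packet


def strip_shim(datagram: bytes) -> Tuple[int, int, bytes]:
    """Return (marker bits, session ID, remaining packet)."""
    if len(datagram) < 2:
        raise ValueError("datagram is shorter than the two-octet shim")
    (shim,) = struct.unpack("!H", datagram[:2])
    return shim >> SESSION_ID_BITS, shim & SESSION_ID_MASK, datagram[2:]
```

A receiver would call strip_shim() on each incoming datagram and hand the remaining packet to the RTP session selected by the session ID; how the marker bits are checked against STUN and DTLS traffic is left out of this sketch.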


IETF 84 — Presentation on WebRTC Media Transport and Use of RTP

Posted on in category rtcweb

The slides from my presentation on Media Transport and Use of RTP to the IETF RTCWEB working group at IETF 84 are now available. These review the mailing list discussion since the interim meeting, and highlight open issues with the draft.


IETF 84 — Presentations on RTP Multiplexing

Posted on in category rtcweb

Magnus Westerlund and I gave three presentations about our joint work on RTP multiplexing in the AVTCORE working group meeting at IETF 84 in Vancouver. Firstly, I gave a presentation on Guidelines for Using the Multiplexing Features of RTP. I reviewed the contents of the draft, and outlined my view of the problems with the current version of the draft. I suggested that the way forward may be to revise RFC 5117 to document the new approaches to building RTP topologies that have been seen, allowing us to simplify this draft and ensure that clear guidelines are provided for how to use the multiplexing features of RTP.

Magnus Westerlund spoke about Multiple Media Types in an RTP Session, outlining some of the issues and constraints around including audio and video media in a single RTP session.

Finally, Magnus gave a talk on Multiple RTP Sessions Over One Transport. He reviewed the shim-based approach to multiplexing separate RTP sessions over a single lower-layer transport flow, and outlined the open design issues with the shim, signalling, and keying.


Multiple Media Types in an RTP Session

Posted on in category rtcweb

We've submitted a new draft about Multiple Media Types in an RTP Session. This describes how audio, video, and text content can be multiplexed together into a single RTP session. This multiplexing is prohibited by the RTP specification and the RTP Profile for Audio and Video Conferences with Minimal Control, but is desired by some in the WebRTC community. Our new draft discusses why it might be desirable to allow this functionality, and outlines the changes to the RTP specification needed to enable it. This is an early draft for discussion at the IETF 84 meeting in Vancouver, and is not yet suitable for implementation.


Web Real-Time Communication (WebRTC): Media Transport and Use of RTP (-04)

Posted on in category rtcweb

We've submitted an update to our draft on RTP usage in the RTCWeb context. This version attempts to reflect the outcomes of the IETF RTCWeb working group interim meeting that was held in Kista on 12-13 June 2012. This is a major update to the draft, touching almost every section of the text. The changes are too numerous to describe in detail, but a complete diff from the previous version is available. After the extensive discussion in the interim meeting, we are not planning to present this draft in the IETF 84 meeting in Vancouver; discussion should take place on the IETF RTCWeb working group's mailing list instead. If there are still unresolved issues, we will discuss the draft further at IETF 85 in Atlanta in November 2012.


RTP Multiplexing Architecture (-02)

Posted on in category rtcweb

We submitted an updated version of our draft on Guidelines for using the Multiplexing Features of RTP (previously entitled the RTP Multiplexing Architecture). This is a major rewrite of the draft; a complete diff from the previous version is available showing the changes.


Multiple RTP Sessions on a Single Lower-Layer Transport (-03)

Posted on in category rtcweb

We submitted an update to our draft on Multiple RTP Sessions on a Single Lower-Layer Transport. This draft proposes a shim-layer solution for multiplexing several RTP sessions onto a single transport-layer flow. Compared to the other RTP multiplexing proposals, it has the benefit of preserving the full semantics of RTP, but at the cost of some small overhead. This version of the draft expands the motivation for the work and the derived requirements and design considerations. The specification is also updated to move the shim layer to the start of the packet, immediately after the UDP header. This draft will be discussed in the AVTCORE working group meeting at IETF 84 in Vancouver in July 2012.


IETF RTCWeb Interim Meeting

Posted on in category rtcweb

I attended the IETF RTCWeb working group interim meeting, hosted by Ericsson in Kista, Sweden, on 12-13 June 2012 (I was unfortunately only able to make the first day, since I had to be at the Cisco Workshop on Adaptive Media Transport in San Jose, CA, on 14-15 June). I was involved in two presentations at the RTCWeb interim meeting. Firstly, Magnus Westerlund gave a talk on the RTP API and Topologies for WebRTC conferencing, to help understand what RTP topologies make sense in that environment, how middleboxes (such as mixers) can be used, and how these topologies can be reflected in the JavaScript APIs. Then I gave a longer presentation and led a discussion on media transport and use of RTP, trying to turn the requirements into a concrete list of protocol features that are required to be implemented in all web browsers that support the WebRTC standards.


Web Real-Time Communication (WebRTC): Media Transport and Use of RTP (-03)

Posted on in category rtcweb

We've submitted an update to our draft on RTP usage in the RTCWeb context, for discussion at the IETF RTCWeb working group's interim meeting in Kista on 12-13 June 2012. This new version includes a greatly expanded discussion of RTP topologies, some more concrete recommendations on use of the core RTP protocol, and various other clarifications throughout. This is still work in progress, and will likely evolve significantly after the discussion at the interim meeting.


IETF 83 — Presentations on RTP Multiplexing

Posted on in category rtcweb

Magnus Westerlund presented our drafts on Options for Multiplexing Multiple Media Flows in RTP and Multiplexing RTP Sessions on a Single Transport in the AVTCORE working group session at IETF 83. These are part of the ongoing discussion of how to multiplex different media streams onto RTP sessions in the RTCWeb context.


RTP Multiplexing Architecture (-01)

Posted on in category rtcweb

We submitted a new version of the RTP Multiplexing Architecture draft a few days ago. This draft discusses Options for Multiplexing Multiple Media Flows in RTP, considering which multiplexing points are available in RTP, and offering guidance on when to use each of them. In particular, the aim is to help understand the trade-offs between the various alternatives for multiplexing, to document options for new application categories, and to understand their impact on the installed base. This version of the draft has been extensively rewritten to try to clarify the issues and arguments.

  • Magnus Westerlund, Bo Burman, and Colin Perkins, RTP Multiplexing Architecture, Internet Engineering Task Force, March 2012, Work in progress (draft-westerlund-avtcore-multiplex-architecture-01.txt).

Multiple RTP Sessions on a Single Lower-Layer Transport (-02)

Posted on in category rtcweb

This document specifies how multiple RTP sessions are to be multiplexed on the same lower-layer transport, e.g. a UDP flow. It discusses various requirements that have been raised and their feasibility, which results in a solution with a certain applicability. A solution is recommended and that solution is provided in more detail, including signalling and examples.

This update adds an explicit comparison of the various options for multiplexing multiple RTP sessions on a single transport, to outline how they meet the requirements. The recommendations have not changed, but the discussion for why they are made is hopefully now clearer.


Web Real-Time Communication (WebRTC): Media Transport and Use of RTP (-02)

Posted on in category rtcweb

We've just posted an update to our draft on the use of RTP in the RTCWeb context. This is a minor update, consisting almost entirely of editorial cleanups. The only technical changes are to make RFC 6222 support REQUIRED rather than RECOMMENDED, and to replace the discussion of congestion control requirements with references to Randell Jesup's draft on Congestion Control Requirements For Real Time Media, and my new draft on RTP Congestion Control: Circuit Breakers for Unicast Sessions.


RTP Congestion Control: Circuit Breakers for Unicast Sessions (-00)

Posted on in category rtcweb

The Real-time Transport Protocol (RTP) is widely used in voice-over-IP, video teleconferencing, and telepresence systems. Many of these systems run over best-effort IP networks, and can suffer from packet loss and increased latency due to network congestion. Designing effective RTP congestion control algorithms, to adapt the transmission of RTP-based media to match the available network capacity, while also maintaining the user experience, is a difficult but important problem. Many such congestion control and media adaptation algorithms have been proposed, but to date there is no consensus on the correct approach, or even that a single standard algorithm is desirable.

This memo does not attempt to propose a new RTP congestion control algorithm. Rather, it proposes a minimal set of “circuit breaker” conditions under which there should be general agreement that an RTP flow is causing serious congestion, and should cease transmission. It is hoped that future standards-track congestion control algorithms for RTP will operate within the envelope defined by this memo.


IETF 82 — Presentations on RTP Multiplexing

Posted on in category rtcweb

I presented an overview of the RTP Multiplexing Architecture in the AVTCORE session at IETF 82. The presentation outlines the rationale behind the RTP multiplexing architecture, summarises the issues discussed in the draft, and briefly reviews the guidelines, and proposed protocol clarifications and extensions. It concludes by asking if the working group believes our RTP multiplexing draft is an appropriate starting point for discussion and further work.

Magnus Westerlund presented our draft on Multiple RTP Sessions over a Single Transport Flow. This discusses how various multiplexing options meet the requirements for multiplexing, feeding into the discussions in the AVTCORE and RTCWeb working groups.


Web Real-Time Communication (WebRTC): Media Transport and Use of RTP (-01)

Posted on in category rtcweb

We've updated our draft on RTP Requirements for RTC-Web, and renamed it to better reflect the content. Changes in this version include:

  • Update title, rewrite Abstract and Introduction, and restructure the draft
  • Update section on Rate Control and Media Adaptation
  • Update section on Security Considerations
  • Update section on Use of RTP: Core Protocols
  • Add some initial discussion of multiplexing several RTP sessions onto a single lower-layer transport, primarily referencing other drafts for the content.
  • Address comments by Harald Alvestrand from the mailing list (29 August 2011)
The main technical changes are in Section 8, and focus on the requirements for Rate Control and Media Adaptation, the limitations of RTCP when used for this purpose without RTP/AVPF, and interoperability with legacy systems. There are also smaller technical changes, and numerous editorial fixes, throughout.


Multiple RTP Sessions on a Single Lower-Layer Transport (-01)

Posted on in category rtcweb

This document specifies how multiple RTP sessions are to be multiplexed on the same lower-layer transport, e.g. a UDP flow. It discusses various requirements that have been raised and their feasibility, which results in a solution with a certain applicability. A solution is recommended and that solution is provided in more detail, including signalling and examples.

I joined Magnus as co-author for the -01 version of this draft. There are changes throughout, but the main focus has been to expand the discussion of the single RTP session proposal in Section 4.3.


RTP Multiplexing Architecture (-00)

Posted on in category rtcweb

RTP has always been a protocol that supports multiple participants each sending their own media streams in an RTP session. To do this, it relies on three main multiplexing points: RTP session, SSRC, and Payload Type. However, most uses of RTP to date have been simpler, often with only a single SSRC in each direction, and a single RTP session per media type. More recently, however, the more complex use cases have started to be more common, and hence more guidance on how to use RTP in these cases is needed. This new draft analyzes a number of cases and discusses the usage of the various multiplexing points and the need for functionality when defining RTP/RTCP extensions that use multiple RTP streams and multiple RTP sessions. This developed from the RTCweb discussion of session multiplexing, but is heading in a more generally applicable direction, and may impact the IETF CLUE working group, as well as general RTP sessions involving multiple participants and complex topologies.

  • Magnus Westerlund, Bo Burman, and Colin Perkins, RTP Multiplexing Architecture, Internet Engineering Task Force, October 2011, Work in progress (draft-westerlund-avtcore-multiplex-architecture-00.txt).

RTP Requirements for RTC-Web (draft-ietf-rtcweb-rtp-usage-00)

Posted on in category rtcweb

This draft was accepted as a working group draft after the RTCWeb interim meeting. This version updates the draft filename and date to reflect that, but makes no other changes.

  • Colin Perkins, Magnus Westerlund, and Jörg Ott, RTP Requirements for RTC-Web, Internet Engineering Task Force, September 2011, Work in progress (draft-ietf-rtcweb-rtp-usage-00.txt).

RTP with TCP Friendly Rate Control (-01)

Posted on in category rtcweb

This version is submitted to keep the draft from expiring. No changes other than the date and version number.

  • Ladan Gharai and Colin Perkins, RTP with TCP Friendly Rate Control, Internet Engineering Task Force, September 2011, Work in progress (draft-gharai-avtcore-rtp-tfrc-01.txt).

RTP Requirements for RTC-Web (-03)

Posted on in category rtcweb

This version makes the following changes relative to the -02 draft:

  • Removed discussion of multiplexing for RTP sessions entirely, pending submission of a proposal for multiplexing audio and video within an RTP session (as discussed at IETF 81).
  • Added a reference to draft-ietf-avtcore-srtp-vbr-audio-03 in Security Considerations.
  • Added a recommendation that draft-ietf-avtcore-srtp-encrypted-header-ext-00 be used when the client-to-mixer and mixer-to-client audio level indication RTP header extensions are in use in SRTP encrypted sessions.
  • Added Slice Loss Indication and Reference Picture Selection Indication messages as OPTIONAL loss tolerance mechanisms.

These are intended to address the comments and discussion during the IETF 81 meeting in Quebec City.

  • Colin Perkins, Magnus Westerlund, and Jörg Ott, RTP Requirements for RTC-Web, Internet Engineering Task Force, August 2011, Work in progress (draft-perkins-rtcweb-rtp-usage-03.txt).

Multiplexing and RTP Sessions

Posted on in category rtcweb

The following is extracted from version -02 of our RTP Requirements for RTC-Web draft, and describes some of the issues relating to RTP session multiplexing. This was discussed at IETF 81 in Quebec City. The linked set of slides on Multiplexing RTP Sessions were prepared for the RTC-Web working group session at that meeting, but were not presented to the working group: discussion in a break-out meeting covered the key points.

Expected Topologies

As RTC-Web is focused on peer-to-peer connections established from clients in web browsers, the following topologies, discussed further in RTP Topologies [RFC5117], are primarily considered. The topologies are depicted and briefly explained here for the convenience of the reader.

The point to point topology (Figure 1) is going to be very common in single user to single user applications.

Figure 1: Point to Point

For small multiparty sessions it is practical enough to create RTP sessions by letting every participant send individual unicast RTP/UDP flows to each of the other participants (Figure 2). This is called multi-unicast and is unfortunately not discussed in the RTP Topologies RFC. This topology has the benefit of not requiring central nodes. The downside is that it increases the bandwidth used at each sender, by requiring one copy of the media streams for each participant in the session other than the sender itself. Thus this is limited to scenarios with few end-points unless the media is very low bandwidth.

Figure 2: Multi-Unicast

It needs to be noted that, if this topology is to be supported by the RTC-Web framework, it needs to be possible to connect one RTP session to multiple established peer to peer flows that are individually established.

An RTP mixer (Figure 3) is a centralised point that selects or mixes content in a conference to optimise the RTP session so that each end-point only needs to connect to one entity, the mixer. The mixer also reduces the bit-rate needs, as the media sent from the mixer to the end-point can be optimised in different ways. These optimisations include methods like only choosing media from the currently most active speaker, or mixing together audio so that only one audio stream is required instead of 3 in the depicted scenario. The downside of the mixer is that someone is required to provide the actual mixer.

Figure 3: RTP Mixer with Only Unicast Paths

If one wants a less complex central node it is possible to use a relay (called a Transport Translator) (Figure 4) that takes on the role of forwarding the media to the other end-points but doesn't perform any media processing. It simply forwards the media from each end-point to all the others. Thus an end-point A will only need to send its media once to the relay, but it will still receive 3 RTP streams with the media if B, C and D are all currently transmitting.

Figure 4: RTP Translator with Only Unicast Paths
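The trade-offs between the multi-unicast, mixer, and relay topologies can be made concrete by counting streams per end-point. The sketch below is a simplified model, assuming that each of n participants sends exactly one media stream and that a mixer delivers a single mixed stream to each end-point; the function name and topology labels are hypothetical.

```python
from typing import Tuple


def streams_per_endpoint(topology: str, n: int) -> Tuple[int, int]:
    """Return (streams sent, streams received) by one end-point among n participants."""
    if n < 2:
        raise ValueError("a session needs at least two participants")
    others = n - 1
    if topology == "multi-unicast":
        # One copy of the media is sent to, and one stream received from, every other participant.
        return others, others
    if topology == "mixer":
        # A single stream goes to the mixer, and a single mixed stream comes back.
        return 1, 1
    if topology == "relay":
        # The media is sent once to the relay, which forwards every other participant's stream.
        return 1, others
    raise ValueError("unknown topology: " + topology)
```

For the four-participant scenario in the text (A, B, C and D), streams_per_endpoint("relay", 4) gives one stream sent and three received, while the multi-unicast case requires three copies to be sent.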

To support a legacy end-point (B) that doesn't fulfil the requirements of RTC-Web it is possible to insert a Translator (Figure 5) that takes on the role of ensuring that, from A's perspective, B looks like a fully compliant end-point. Thus it is the combination of the Translator and B that looks like the end-point B. The intention is that the presence of the translator is transparent to A; however, it is not certain that this is possible. This case is therefore included so that it can be discussed whether any mechanism specified for use in RTC-Web results in such issues, and how to handle them.

Figure 5: RTP Translator Towards Legacy End-Point

RTP Multiplexing Points

There are three fundamental points of multiplexing within the RTP framework:

  • Use of separate RTP Sessions: The first, and the most important, multiplexing point is the RTP session. This multiplexing point does not have an identifier within the RTP protocol itself, but instead relies on the lower layer to separate the different RTP sessions. This is most often done by separating different RTP sessions onto different UDP ports, or by sending to different IP multicast addresses. The distinguishing feature of an RTP session is that it has a separate SSRC identifier space; a single RTP session can span multiple transport connections provided packets are gatewayed such that participants are known to each other. Different RTP sessions are used to separate different types of media within a multimedia session. For example, audio and video flows are sent on separate RTP sessions. Completely different usages of the same media type, e.g. video of the presenter and video of the slides, also benefit from being separated.
  • Multiplexing using the SSRC within an RTP session: The second multiplexing point is the SSRC that separates different sources of media within a single RTP session. An example might be different participants in a multiparty teleconference, or different camera views of a presentation. In most cases, each participant within an RTP session has a single SSRC, although this may change over time if collisions are detected. However, in some more complex scenarios participants may generate multiple media streams of the same type simultaneously (e.g., if they have two cameras, and so send two video streams at once) and so will have more than one SSRC in use at once. The RTCP CNAME can be used to distinguish between a single participant using two SSRC values (where the RTCP CNAME will be the same for each SSRC), and two participants (who will have different RTCP CNAMEs).
  • Multiplexing using the Payload Type within an RTP session: If different media encodings of the same media type (audio, video, text, etc) are to be used at different times within an RTP session, for example a single participant that can switch between two different audio codecs, the payload type is used to identify how the media from that particular source is encoded. When changing media formats within an RTP Session, the SSRC of the sender remains unchanged, but the RTP Payload Type changes to indicate the change in media format.

These multiplexing points are a fundamental part of the design of RTP and are discussed in Section 5.2 of [RFC3550]. Of special importance is the need to separate different RTP sessions using a multiplexing mechanism at some lower layer than RTP, rather than trying to combine several RTP sessions implicitly into one lower layer flow. This will be further discussed in the next section.
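To make the three multiplexing points concrete, the sketch below derives a demultiplexing key for a received packet. The SSRC and Payload Type are read from the 12-octet fixed RTP header defined in RFC 3550, while the RTP session is identified by the lower layer; using the local UDP port for that purpose, and the names MuxKey and mux_key, are assumptions made for illustration.

```python
import struct
from typing import NamedTuple


class MuxKey(NamedTuple):
    session_port: int  # RTP session: not carried in RTP, taken from the lower layer
    ssrc: int          # media source within the RTP session
    payload_type: int  # encoding currently used by that source


def mux_key(local_udp_port: int, packet: bytes) -> MuxKey:
    """Extract the three multiplexing points for one received RTP packet."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-octet fixed RTP header")
    # Fixed header: V/P/X/CC octet, M/PT octet, sequence number, timestamp, SSRC.
    first, second, _seq, _timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    # The low 7 bits of the second octet are the payload type; the top bit is the marker.
    return MuxKey(local_udp_port, ssrc, second & 0x7F)
```

Note that a change of codec alters only payload_type, and a second camera adds a new ssrc, while session_port changes only when the packet belongs to a different RTP session.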

RTP Session Multiplexing

In today's networks, with prolific use of Network Address Translators (NAT) and Firewalls (FW), there is a desire to reduce the number of transport layer ports used by a real-time media application using RTP. This has led some to suggest multiplexing two or more RTP sessions on a single transport layer flow, using either the Payload Type or SSRC to demultiplex the sessions, in violation of the rules outlined above. This is not the first time that people have looked at RTP and questioned the need to use separate RTP sessions for different media types, or, further, the potential need to separate different media streams of the same type into different sessions due to their different purposes. Section 5.2 of [RFC3550] outlines some of those problems; we elaborate on that discussion, and on other problems that occur if one violates this part of the RTP design and architecture.

Why RTP Sessions Should be Demultiplexed by the Transport

As discussed in Section 5.2 of [RFC3550], multiplexing several RTP sessions (e.g., audio and video) onto a single transport layer flow introduces the following problems:

  • Payload Identification: If two RTP sessions of the same type are multiplexed onto a single transport layer flow using the same SSRC but relying on the Payload Type to distinguish the sessions, and one were to change encodings and thus acquire a different RTP payload type, there would be no general way of identifying which stream had changed encodings. This can be avoided by partitioning the SSRC space between the two sessions, but that causes other problems as discussed below.
  • Timing and Sequence Number Space: An RTP SSRC is defined to identify a single timing and sequence number space. Interleaving multiple payload types would require different timing spaces if the media clock rates differ and would require different sequence number spaces to tell which payload type suffered packet loss. Using multiple clock rates in a single RTP session is problematic, as discussed in [I-D.ietf-avtext-multiple-clock-rates]. This can be avoided by partitioning the SSRC space between the two sessions, but that causes other problems as discussed below.
  • RTCP Reception Reports: RTCP sender reports and receiver reports can only describe one timing and sequence number space per SSRC, and do not carry a payload type field. Multiplexing sessions based on the payload type breaks RTCP. This can be avoided by partitioning the SSRC space between the two sessions, but that causes other problems as discussed below.
  • RTP Mixers: Multiplexing RTP sessions of incompatible media type (e.g., audio and video) onto a single transport layer flow breaks the operation of RTP mixers, since they are unable to combine the flows together.
  • RTP Translators: Multiplexing RTP sessions of incompatible media type (e.g., audio and video) onto a single transport layer flow breaks the operation of some types of RTP translator, for example media transcoders, which rely on the RTP requirement that all media are of the same type.
  • Quality of Service: Carrying multiple media in one RTP session precludes the use of different network paths or flow-based network resource allocations where appropriate. It also makes reception of a subset of the media, for example just audio if video would exceed the available bandwidth, difficult without the use of an RTP translator within the network to filter out the unwanted media. Such translators are difficult to combine with media security functions unless they are trusted devices (and included in the key exchange).
  • Separate Endpoints: Multiplexing several sessions into one transport layer flow prevents use of a distributed endpoint implementation, where audio and video are rendered by different processes and/or systems.
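The Payload Identification and RTCP Reception Reports items above can be illustrated with a minimal Python sketch. It assumes the fixed RTP header and reception report block layouts from RFC 3550; the function names are illustrative only and are not taken from the draft. The point is that an RTP data packet carries a payload type, while a reception report block identifies its subject only by SSRC, so two sessions sharing an SSRC and separated only by payload type cannot be told apart in RTCP.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (layout per RFC 3550, Section 5.1)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,          # top two bits of the first byte
        "payload_type": b1 & 0x7F,   # low seven bits of the second byte
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def parse_report_block(block: bytes) -> dict:
    """Decode one 24-byte RTCP reception report block (RFC 3550, Section 6.4.1).

    The cumulative loss field is signed in the RFC; it is treated as
    unsigned here to keep the sketch short.
    """
    if len(block) < 24:
        raise ValueError("report block shorter than 24 bytes")
    ssrc, loss, ext_seq, jitter, lsr, dlsr = struct.unpack("!IIIIII", block[:24])
    return {
        "ssrc": ssrc,                     # the only identifier of the reported source
        "fraction_lost": loss >> 24,
        "cumulative_lost": loss & 0xFFFFFF,
        "highest_seq": ext_seq,
        "jitter": jitter,
        "last_sr": lsr,
        "delay_since_last_sr": dlsr,
    }
```

Note that the dictionary returned by `parse_report_block` has no payload type entry, because the wire format has no such field: a receiver can attribute loss and jitter to an SSRC, but not to one of several payload types sharing that SSRC.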

We do note that some of the above issues are resolved as long as there is explicit separation of the RTP sessions when transported over the same lower layer transport, for example by inserting a multiplexing layer in between the lower transport and the RTP/RTCP headers. However, a number of the above issues are not resolved by this.

In the RTCWEB context, i.e. web browsers running on various end-points, it might appear unlikely that flow-based QoS is available on the end-points that will support RTCWEB. We don't disagree that users in the common case, in their home network or at WiFi hotspots, are unlikely to have flow-based QoS available. However, if one considers enterprise users, especially those using intranet applications, the availability of, and desire to use, QoS is not implausible. There are also web users on networks that are more resource-constrained than wired and WiFi networks, for example cellular networks. The current access network QoS mechanisms for user traffic in cellular technology from 3GPP are flow based.

RTP's design hasn't been changed, although session multiplexing related topics have been discussed at various points of RTP's 20 year history. The fact is that numerous RTP mechanisms and extensions have been defined assuming that one can perform session multiplexing when needed. Mechanisms that have been identified as problematic if one doesn't do session separation are:

  • Scalability: RTP was built with media scalability in consideration. The simplest way of achieving separation between different scalability layers is placing them in different RTP sessions, and using the same SSRC and CNAME in each session to bind them together. This is most commonly done in multicast, and not particularly applicable to RTC-Web, but gatewaying of such a session would then require more alterations and likely stateful translation.
  • RTP Retransmission in Session Multiplexing mode: RTP Retransmission does have a mode for session multiplexing. This would not be the main mode used in RTC-Web, but for interoperability and reduced cost in translation, support for different RTP sessions is beneficial.
  • Forward Error Correction: The RTP Payload Format for Generic Forward Error Correction, and its update, can only be used on media formats that produce RTP packets smaller than half the MTU if the FEC flow and the media flow being protected are to be sent in the same RTP session. This is because the SSRC value of the original flow is recovered from the FEC packet's SSRC field, so sending both in one session relies on the RTP Payload for Redundant Audio Data, which carries the FEC and media data in the same packet. Anything that wants to use these formats with RTP payloads close to the MTU therefore needs to put the FEC data in a separate RTP session from the original transmissions. The usage of this type of FEC data has not been decided on in RTC-Web.
  • SSRC Allocation and Collision: The SSRC identifier is a random 32-bit number that is required to be globally unique within an RTP session, and that is reallocated to a new random value if an SSRC collision occurs between participants. If two or more RTP sessions share a transport layer flow, there is no guarantee that their choice of SSRC values will be distinct, and there is no way in standard RTP to signal which SSRC values are used by which RTP session. RTP is explicitly a group-based communication protocol, and new participants can join an RTP session at any time; these new participants may choose SSRC values that conflict with the SSRC values used in any of the multiplexed RTP sessions. This problem can be avoided by partitioning the SSRC space, and signalling how the space is to be subdivided, but this is not backwards compatible with any existing RTP system. In addition, subdividing the SSRC space makes it difficult to gateway between multiplexed RTP sessions and standard RTP sessions: the standard sessions may use parts of the SSRC space reserved in the multiplexed RTP sessions, requiring the gateway to rewrite RTCP packets, as well as the SSRC and CSRC list in RTP data packets. Rewriting RTCP is a difficult task, especially when one considers extensions such as RTCP XR.
  • Conflicting RTCP Report Types: The extension mechanisms used in RTCP depend on separation of RTP sessions for different media types. For example, the RTCP Extended Report block for VoIP is suitable for conversational audio, but clearly not useful for video. This may cause unusable or unwanted reports to be generated for some streams, wasting capacity and confusing monitoring systems. While this problem may be unlikely for VoIP reports, it may be an issue for the more detailed media-agnostic reports, which are sometimes used for different media types. It also makes the implementation of RTCP more complex, since partitioning the SSRC space by media type needs to be done not only on the media processing side, but also on the RTCP reporting side.
  • RTCP Reporting and Scheduling: The RTCP reporting interval and its packet scheduling will be affected if several RTP sessions are multiplexed onto the same transport layer flow. The reporting interval is determined by the session bandwidth, and the reporting interval chosen for a high-rate video session will be different to the interval chosen by a low-rate VoIP session. If such sessions are multiplexed, then participants in one session will see the SSRC values of the other session. This will cause them to overestimate the number of participants in the session by a factor of two, thus doubling their RTCP reporting interval, and making their feedback less timely. In the worst case, when an RTP session with very low RTCP bandwidth is multiplexed with an RTP session with high RTCP bandwidth, this may cause repeated RTCP timer reconsideration, leading to the members of the low bandwidth session timing out. Participants in an RTP session configured with high bandwidth (and short RTCP reporting interval) will see RTCP reports from participants in the low bandwidth session much less often than expected, potentially causing them to repeatedly time out and re-create state for those participants. The split of RTCP bandwidth between senders and receivers (where at least 25% of the RTCP bandwidth is allocated to senders) will be disrupted if a session with few senders (e.g., a VoIP session) is multiplexed with a session with many senders (e.g., a video session). These issues can be resolved if the partition of the SSRC space is signalled, but this is not backwards compatible with any existing RTP system. The partition would require re-implementing large parts of the RTCP processing to take the individual sessions into account.
  • Sampling Group Membership: The mechanism defined in RFC2762 to sample the group membership, allowing participants to keep less state, assumes a single flat 32-bit SSRC space, and breaks if the SSRC space is shared between several RTP sessions.
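The effect on the RTCP reporting interval described above can be sketched in a few lines of Python. This is a simplified, deterministic approximation of the interval calculation in RFC 3550: it ignores the sender/receiver bandwidth split, the randomisation factor, and timer reconsideration. The function name, parameter names, and example figures are assumptions chosen for illustration.

```python
def rtcp_interval(members: int, avg_rtcp_size: float, rtcp_bw: float,
                  t_min: float = 5.0) -> float:
    """Simplified RTCP reporting interval in seconds.

    members       -- number of participants the endpoint believes are present
    avg_rtcp_size -- average RTCP packet size in bytes
    rtcp_bw       -- RTCP bandwidth in bytes per second
    t_min         -- minimum interval (5 seconds by default)
    """
    if members < 1 or avg_rtcp_size <= 0 or rtcp_bw <= 0:
        raise ValueError("members, avg_rtcp_size and rtcp_bw must be positive")
    return max(t_min, members * avg_rtcp_size / rtcp_bw)


# Hypothetical VoIP session: 400 bytes/s of RTCP bandwidth, 100-byte reports.
alone = rtcp_interval(members=50, avg_rtcp_size=100, rtcp_bw=400)
# The same endpoint also counting 50 SSRCs from a multiplexed second session.
multiplexed = rtcp_interval(members=100, avg_rtcp_size=100, rtcp_bw=400)
print(alone, multiplexed)  # 12.5 25.0
```

Under these assumed figures, counting the other session's SSRCs doubles the interval from 12.5 to 25 seconds, which is the loss of feedback timeliness the list describes.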

As can be seen, the requirement that separate RTP sessions are carried in separate transport-layer flows is fundamental to the design of RTP. Due to this design principle, implementors of various services or applications using RTP have not commonly violated this model, and have separated RTP sessions onto different transport layer flows. After 15 years of deployment of RTP in its current form, any move to change this assumption must carefully consider the backwards compatibility problems that this will cause. In particular, since widespread use of multiplexed RTP sessions in RTC-Web will almost certainly cause their use in other scenarios, the discussion regarding compatibility must be wider than just whether multiplexing works for the extremely limited subset of RTP use cases currently being considered in the RTC-Web group. Any such multiplexing extension to RTP must therefore be developed by the AVTCORE working group, since it has much broader applicability and scope than RTC-Web.

Arguments for a single transport flow

The arguments we are aware of for why it is desirable to use a single underlying transport (e.g., UDP) flow for all media, rather than one flow for each type of media, are the following:

  • End-Point Port Consumption: A given IP address has only 16 bits of port space per transport protocol, shared by every consumer of ports on the machine. This is normally never an issue for an end-user machine. It can become an issue for servers that have a large number of simultaneous flows. However, in RTCWEB, where we will use authenticated STUN requests, a server can serve multiple end-points from the same local port, and use the whole 5-tuple (source and destination address, source and destination port, protocol) as the identifier of flows. Thus, in theory, the minimal number of media server ports needed is the maximum number of simultaneous RTP sessions a single end-point may use, although in practice implementations will probably benefit from using more.
  • NAT State: If an end-point is behind a NAT, each flow it generates to an external address will result in state on that NAT. That state is a limited resource: in home or SOHO NATs it is limited by memory or processing, while in large-scale NATs serving many internal end-points the available ports run out. We see this primarily as a problem for larger centralised NATs, where end-point independent mapping requires each flow mapping to use one port on the external IP address, thus limiting the maximum aggregation of internal users per external IP address. However, we would like to point out that an RTCWEB session with audio and video is likely to use 2 or 3 UDP flows. This can be contrasted with certain web applications that can result in 100+ TCP flows being opened to various servers. Admittedly, TCP mappings are recovered more quickly due to the explicit session teardown when no longer needed, but at the same time several web sites may be communicating simultaneously in various browser tabs. So the question is whether the UDP mapping space is as heavily used as the TCP mapping space, and whether TCP will continue to be the limiting factor for the number of internal users a particular NAT can support.
  • NAT Traversal taking additional time: When doing NAT/FW traversal it takes additional time to open additional ports. This time is spent in the phase between agreeing to communicate and the media path being established, which is fairly critical. Following the specified ICE procedures, the best case for the extra time is 1.5*RTT + Ta*(Additional_Flows-1), where Ta is the pacing timer, which ICE specifies to be no smaller than 20 ms. That assumes a message in one direction, and then an immediate triggered check back. This is because ICE first finds one candidate pair that works before establishing multiple flows. Thus, there is no extra time until one has found a working candidate pair; from that point, the only extra time is that needed to establish the additional flows in parallel, which in most cases means 1 or 2 additional flows.
  • NAT Traversal Failure Rate: When more than a single flow needs to be established through the NAT, there is some risk that one succeeds in establishing the first flow but fails with one or more of the additional flows. The risk of this happening is hard to quantify. However, it should be fairly low, as one has just successfully established one flow from the same interfaces. Thus only rare events, such as NAT resource overload or the selection of particular port numbers that are filtered, should be reasons for failure.
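The best-case setup delay quoted in the NAT traversal item above, 1.5*RTT + Ta*(Additional_Flows-1), can be evaluated with a small Python helper. The function name and the example round-trip time are assumptions for illustration, and returning zero when there are no additional flows is a choice made for this sketch rather than something the formula itself defines.

```python
def extra_ice_setup_ms(rtt_ms: float, additional_flows: int,
                       ta_ms: float = 20.0) -> float:
    """Best-case extra setup time, in milliseconds, for additional flows.

    Implements 1.5*RTT + Ta*(Additional_Flows-1) from the text above.
    ta_ms is the ICE pacing timer, which must be at least 20 ms.
    """
    if ta_ms < 20.0:
        raise ValueError("the pacing timer Ta must be no smaller than 20 ms")
    if rtt_ms < 0 or additional_flows < 0:
        raise ValueError("rtt_ms and additional_flows must be non-negative")
    if additional_flows == 0:
        # Assumption of this sketch: a single flow incurs no extra delay.
        return 0.0
    return 1.5 * rtt_ms + ta_ms * (additional_flows - 1)


# Hypothetical path with a 100 ms round-trip time.
print(extra_ice_setup_ms(100, 1))  # 150.0
print(extra_ice_setup_ms(100, 2))  # 170.0
```

With an assumed 100 ms round-trip time, one additional flow costs 150 ms and two cost 170 ms, so the round-trip term dominates and each further flow adds only one pacing interval.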

Summary

As we have noted in the preceding sections, implicit multiplexing of multiple RTP sessions onto a single transport flow raises a large number of backwards compatibility issues. It has been argued that these issues are either not important, since the RTP features disrupted are not of interest to the current set of RTC-Web use cases, or can be solved by somehow explicitly dividing the SSRC space into different regions for different RTP sessions. We believe the first argument is short-sighted: those RTP features may not be important today, but the successful deployment of simple RTC-Web applications will generate interest in trying more advanced scenarios, which may well need those features. Partitioning the SSRC space to separate RTP sessions results in a new set of issues, the biggest of which, from our point of view, is that it effectively creates a new variant of the RTP protocol, which is incompatible with standard RTP. Having two different variants of the core functionality of RTP will make it much more difficult to develop future protocol extensions, and the new variant will likely also have a different set of extensions that work. In addition, the two versions aren't directly interoperable, and will force anyone that wants to interconnect the two versions to deploy (complex) gateways. It also reduces the common user base and interest in maintaining and developing either version.

On the other hand, we are sympathetic to the argument that using a single transport flow saves some time in setup processing, saves some resources on NATs and FWs that are in between the communicating end-points, and may have a somewhat higher success rate of session establishment.

Thus we consider it required that RTP sessions are multiplexed using an explicit mechanism. We strongly recommend that the mechanism used to accomplish this multiplexing is to use unique UDP flows for each RTP session, based on simplicity and interoperability. However, we can accept a WG consensus that using a single transport layer flow between peers is the default, and that the fallback of using separate UDP flows is also supported, under one constraint: that the RTP sessions are explicitly multiplexed in such a way that existing mechanisms or extensions to RTP are not prevented from working, and that the solution does not result in an alternative variant of RTP being created (i.e., it must not disrupt RTCP processing, and the RTP semantics). In this latter case we recommend that some type of multiplexing layer is inserted between the UDP flow and the RTP/RTCP headers to separate the RTP sessions, since removing this shim layer and gatewaying to standard RTP sessions is simpler than trying to separate RTP sessions that are multiplexed together to gateway them to standard RTP sessions.
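To make the shim-layer recommendation concrete, the following Python sketch shows one way such a multiplexing layer could look. It is purely hypothetical: the one-byte session identifier, the function names, and the `ShimDemux` class are assumptions made for illustration and do not correspond to any format defined in the draft or elsewhere. The RTP and RTCP packets themselves are carried unmodified, which is why a gateway to standard RTP sessions only needs to strip the prefix.

```python
from typing import Callable, Dict, Tuple


def shim_wrap(session_id: int, rtp_or_rtcp: bytes) -> bytes:
    """Prefix an unmodified RTP/RTCP packet with a hypothetical 1-byte session ID."""
    if not 0 <= session_id <= 255:
        raise ValueError("session_id must fit in one byte")
    return bytes([session_id]) + rtp_or_rtcp


def shim_unwrap(datagram: bytes) -> Tuple[int, bytes]:
    """Split a received datagram into (session ID, original RTP/RTCP packet)."""
    if len(datagram) < 1:
        raise ValueError("empty datagram")
    return datagram[0], datagram[1:]


class ShimDemux:
    """Routes each datagram on a shared UDP flow to its own RTP session handler."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[bytes], None]] = {}

    def register(self, session_id: int, handler: Callable[[bytes], None]) -> None:
        self._handlers[session_id] = handler

    def dispatch(self, datagram: bytes) -> None:
        session_id, packet = shim_unwrap(datagram)
        handler = self._handlers.get(session_id)
        if handler is None:
            raise KeyError(f"no RTP session registered for id {session_id}")
        # Each handler sees only its own session's packets, so SSRC spaces
        # and RTCP processing stay separate per session.
        handler(packet)
```

Because demultiplexing happens before any RTP or RTCP processing, each session keeps its own SSRC space, membership count, and reporting interval, which is the property the constraint above asks for.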


RTP Media for RTCWeb

Posted on in category rtcweb

Slides presented in the RTC-Web meeting at IETF 81 on the non-multiplexing related RTP functions for RTC-Web. These represent a combined view of the required RTP features from the authors of draft-perkins-rtcweb-rtp-usage-02 and draft-cbran-rtcweb-media-00.


RTP Requirements for RTC-Web (-02)

Posted on in category rtcweb

This version of the RTP requirements for RTC-Web draft includes a greatly expanded discussion of the requirements for multiplexing RTP sessions on different UDP ports. This is intended for discussion at IETF 81 in Quebec City.

  • Colin Perkins, Magnus Westerlund, and Jörg Ott, RTP Requirements for RTC-Web, Internet Engineering Task Force, July 2011, Work in progress (draft-perkins-rtcweb-rtp-usage-02.txt).

RTP Requirements for RTC-Web (-01)

Posted on in category rtcweb

This is our contribution to the RTC-Web interim meeting.

  • Colin Perkins, Magnus Westerlund, and Jörg Ott, RTP Requirements for RTC-Web, Internet Engineering Task Force, June 2011, Work in progress (draft-perkins-rtcweb-rtp-usage-01.txt).

RTP Requirements for RTC-Web (-00)

Posted on in category rtcweb

This is an initial draft discussing which RTP features and extensions are appropriate for use in the RTC-Web context. The idea is to stimulate discussion, and (hopefully) to ensure that the RTC-Web solution uses a modern RTP protocol stack.

  • Colin Perkins, Magnus Westerlund, and Jörg Ott, RTP Requirements for RTC-Web, Internet Engineering Task Force, March 2011, Work in progress (draft-perkins-rtcweb-rtp-usage-00.txt).