draft-ietf-rmcat-rtp-cc-feedback-06.txt   draft-ietf-rmcat-rtp-cc-feedback-07.txt 
Network Working Group C. S. Perkins Network Working Group C. S. Perkins
Internet-Draft University of Glasgow Internet-Draft University of Glasgow
Intended status: Informational 12 July 2021 Intended status: Informational 25 October 2021
Expires: 13 January 2022 Expires: 28 April 2022
Sending RTP Control Protocol (RTCP) Feedback for Congestion Control in Sending RTP Control Protocol (RTCP) Feedback for Congestion Control in
Interactive Multimedia Conferences Interactive Multimedia Conferences
draft-ietf-rmcat-rtp-cc-feedback-06 draft-ietf-rmcat-rtp-cc-feedback-07
Abstract Abstract
This memo discusses the types of congestion control feedback that it This memo discusses the types of congestion control feedback that it
is possible to send using the RTP Control Protocol (RTCP), and their is possible to send using the RTP Control Protocol (RTCP), and their
suitability of use in implementing congestion control for unicast suitability of use in implementing congestion control for unicast
multimedia applications. multimedia applications.
Status of This Memo Status of This Memo
skipping to change at page 1, line 34 skipping to change at page 1, line 34
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 13 January 2022. This Internet-Draft will expire on 28 April 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 2, line 12 skipping to change at page 2, line 12
as described in Section 4.e of the Trust Legal Provisions and are as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Possible Models for RTCP Feedback . . . . . . . . . . . . . . 2 2. Possible Models for RTCP Feedback . . . . . . . . . . . . . . 2
3. What Feedback is Achievable With RTCP? . . . . . . . . . . . 4 3. What Feedback is Achievable With RTCP? . . . . . . . . . . . 4
3.1. Scenario 1: Voice Telephony . . . . . . . . . . . . . . . 4 3.1. Scenario 1: Voice Telephony . . . . . . . . . . . . . . . 4
3.2. Scenario 2: Point-to-Point Video Conference . . . . . . . 7 3.2. Scenario 2: Point-to-Point Video Conference . . . . . . . 7
3.3. Scenario 3: Group Video Conference . . . . . . . . . . . 11
3.4. Scenario 4: Screen Sharing . . . . . . . . . . . . . . . 12
4. Discussion and Conclusions . . . . . . . . . . . . . . . . . 12 4. Discussion and Conclusions . . . . . . . . . . . . . . . . . 12
5. Security Considerations . . . . . . . . . . . . . . . . . . . 12 5. Security Considerations . . . . . . . . . . . . . . . . . . . 12
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 13 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 13
8. Informative References . . . . . . . . . . . . . . . . . . . 13 8. Informative References . . . . . . . . . . . . . . . . . . . 13
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 14 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 14
1. Introduction 1. Introduction
The deployment of WebRTC systems [RFC8825] has resulted in high- The deployment of WebRTC systems [RFC8825] has resulted in high-
quality video conferencing seeing extremely wide use. To ensure the quality video conferencing seeing extremely wide use. To ensure the
stability of the network in the face of this use, WebRTC systems need stability of the network in the face of this use, WebRTC systems need
to use some form of congestion control for their RTP-based media to use some form of congestion control for their RTP-based media
skipping to change at page 3, line 40 skipping to change at page 3, line 36
and received packets, with reception times, over that RTT. As long and received packets, with reception times, over that RTT. As long
as feedback is sent frequently enough that the control loop is as feedback is sent frequently enough that the control loop is
stable, and the sender is kept informed when data leaves the network stable, and the sender is kept informed when data leaves the network
(to provide an equivalent to ACK clocking in TCP), it is not (to provide an equivalent to ACK clocking in TCP), it is not
necessary to report on every packet at the instant it is received necessary to report on every packet at the instant it is received
(indeed, it is unlikely that a video codec can react instantly to a (indeed, it is unlikely that a video codec can react instantly to a
rate change anyway, and there is little point in providing feedback rate change anyway, and there is little point in providing feedback
more often than the codec can adapt). more often than the codec can adapt).
The amount of overhead due to congestion control feedback that is The amount of overhead due to congestion control feedback that is
considered acceptable has to be determined. RTCP data is sent in considered acceptable has to be determined. RTCP feedback is sent in
separate packets to RTP data, and this has some cost in terms of separate packets to RTP data, and this has some cost in terms of
additional header overhead compared to protocols that piggyback additional header overhead compared to protocols that piggyback
feedback on return path data packets. The RTP standards have long feedback on return path data packets. The RTP standards have long
said that a 5% overhead for RTCP traffic is generally acceptable, while said that a 5% overhead for RTCP traffic is generally acceptable, while
providing the ability to change this fraction. Is this still the providing the ability to change this fraction. Is this still the
case for congestion control feedback? Or is there a desire to either case for congestion control feedback? Or is there a desire to either
see more responsive feedback and congestion control, possibly with see more responsive feedback and congestion control, possibly with
a higher overhead, or is lower overhead wanted, accepting that this a higher overhead, or is lower overhead wanted, accepting that this
might reduce responsiveness of the congestion control algorithm? might reduce responsiveness of the congestion control algorithm?
Finally, the details of how much, and what, data is to be sent in Finally, the details of how much, and what, data is to be sent in
each report will affect the frequency and/or overhead of feedback. each report will affect the frequency and/or overhead of feedback.
There is a fundamental trade-off that the more frequently feedback There is a fundamental trade-off that the more frequently feedback
packets are sent, the less data can be included in each packet to packets are sent, the less data can be included in each packet to
keep the overhead constant. Does the congestion control need high keep the overhead constant. Does the congestion control need high
rate but simple feedback (e.g., like TCP acknowledgements), or is it rate but simple feedback (e.g., like TCP acknowledgements), or is it
acceptable to send more complex feedback less often? acceptable to send more complex feedback less often?
3. What Feedback is Achievable With RTCP? 3. What Feedback is Achievable With RTCP?
The following sections illustrate how the RTCP congestion control
feedback report [RFC8888] can be used in different scenarios, and
illustrate the overheads of this approach.
3.1. Scenario 1: Voice Telephony 3.1. Scenario 1: Voice Telephony
In many ways, point-to-point voice telephony is the simplest scenario In many ways, point-to-point voice telephony is the simplest scenario
for congestion control, since there is only a single media stream to for congestion control, since there is only a single media stream to
control. It's complicated, however, by severe bandwidth constraints control. It's complicated, however, by severe bandwidth constraints
on the feedback, to keep the overhead manageable. on the feedback, to keep the overhead manageable.
Assume a two-party point-to-point voice-over-IP call, using RTP over Assume a two-party point-to-point voice-over-IP call, using RTP over
UDP/IP. A rate adaptive speech codec, such as Opus, is used, encoded UDP/IP. A rate adaptive speech codec, such as Opus, is used, encoded
into RTP packets in frames of duration Tf seconds (Tf = 20ms in many into RTP packets in frames of duration Tf seconds (Tf = 20ms in many
skipping to change at page 4, line 35 skipping to change at page 4, line 39
control algorithm requires feedback every Nr frames, i.e., every Nr * control algorithm requires feedback every Nr frames, i.e., every Nr *
Tf seconds, to ensure effective control. Both parties in the call Tf seconds, to ensure effective control. Both parties in the call
send speech data or comfort noise with sufficient frequency that they send speech data or comfort noise with sufficient frequency that they
are counted as senders for the purpose of the RTCP reporting interval are counted as senders for the purpose of the RTCP reporting interval
calculation. calculation.
RTCP feedback packets can be full, compound, RTCP feedback packets, RTCP feedback packets can be full, compound, RTCP feedback packets,
or non-compound RTCP packets. A compound RTCP packet is sent once or non-compound RTCP packets. A compound RTCP packet is sent once
for every Nnc non-compound RTCP packets. for every Nnc non-compound RTCP packets.
Compound RTCP packets contain a Sender Report (SR) packet and a Compound RTCP packets contain a Sender Report (SR) packet, a Source
Source Description (SDES) packet, and an RTP Congestion Control Description (SDES) packet, and an RTP Congestion Control Feedback
Feedback (CCFB) packet [RFC8888]. Non-compound RTCP packets contain (CCFB) packet [RFC8888]. Non-compound RTCP packets contain only the
only the CCFB packet. Since each participant sends only a single CCFB packet. Since each participant sends only a single RTP media
media stream, the extensions for RTCP report aggregation [RFC8108] stream, the extensions for RTCP report aggregation [RFC8108] and
and reporting group optimisation [RFC8861] are not used. reporting group optimisation [RFC8861] are not used.
Within each compound RTCP packet, the SR packet will contain a sender Within each compound RTCP packet, the SR packet will contain a sender
information block (28 octets) and a single reception report block (24 information block (28 octets) and a single reception report block (24
octets), for a total of 52 octets. A minimal SDES packet will octets), for a total of 52 octets. A minimal SDES packet will
contain a header (4 octets) and a single chunk containing an SSRC (4 contain a header (4 octets) and a single chunk containing an SSRC (4
octets) and a CNAME item, and if the recommendations for choosing the octets) and a CNAME item, and if the recommendations for choosing the
CNAME [RFC7022] are followed, the CNAME item will comprise a 2 octet CNAME [RFC7022] are followed, the CNAME item will comprise a 2 octet
header, 16 octets of data, and 2 octets of padding, for a total SDES header, 16 octets of data, and 2 octets of padding, for a total SDES
packet size of 28 octets. The CCFB packet contains an RTCP header packet size of 28 octets. The CCFB packet contains an RTCP header
and SSRC (8 octets), a report timestamp (4 octets), the SSRC, and SSRC (8 octets), a report timestamp (4 octets), the SSRC,
beginning and ending sequence numbers (8 octets), and 2*Nr octets of beginning and ending sequence numbers (8 octets), and 2*Nr octets of
reports, for a total of 20 + 2*Nr octets. If IPv4 is used, with no reports, for a total of 20 + 2*Nr octets. The compound Secure RTCP
IP options, the UDP/IP header will be 28 octets in size. This gives packet will include 4 octets of trailer followed by an 80 bit (10
a total compound RTCP packet size of Sc = 128 + 2*Nr octets. octet) authentication tag if HMAC-SHA1 authentication is used. If
IPv4 is used, with no IP options, the UDP/IP header will be 28 octets
in size. This gives a total compound RTCP packet size of Sc = 142 +
2*Nr octets.
The non-compound RTCP packets will comprise just the CCFB packet with The non-compound RTCP packets will comprise just the CCFB packet,
a UDP/IP header. It can be seen that these packets will be Snc = 48 SRTCP trailer and authentication tag, and a UDP/IP header. It can be
+ 2*Nr octets in size. seen that these packets will be Snc = 62 + 2*Nr octets in size.
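The octet counts enumerated above can be tallied mechanically. The following Python sketch is illustrative only: the constant and function names are hypothetical and do not appear in the draft. The `secure` flag switches between the -07 totals, which include the 4 octet SRTCP trailer and the 10 octet HMAC-SHA1 authentication tag, and the -06 totals, which omit them.

```python
# Component sizes in octets, as enumerated in the text.
# All names here are illustrative assumptions, not taken from the draft.
SR = 28 + 24            # sender information block + one reception report block
SDES = 28               # header, SSRC, CNAME item per RFC 7022, padding
CCFB_FIXED = 8 + 4 + 8  # RTCP header and SSRC, report timestamp, per-SSRC header
SRTCP = 4 + 10          # SRTCP trailer + 80 bit HMAC-SHA1 tag (-07 only)
UDP_IPV4 = 28           # UDP/IPv4 header with no IP options


def compound_size(nr, secure=True):
    """Size Sc of a compound RTCP packet reporting on nr frames."""
    size = SR + SDES + CCFB_FIXED + 2 * nr + UDP_IPV4
    return size + (SRTCP if secure else 0)


def non_compound_size(nr, secure=True):
    """Size Snc of a non-compound RTCP packet carrying only the CCFB report."""
    size = CCFB_FIXED + 2 * nr + UDP_IPV4
    return size + (SRTCP if secure else 0)


if __name__ == "__main__":
    for nr in (2, 4, 8, 16):
        print(nr, compound_size(nr), non_compound_size(nr))
```

With `nr = 0` the functions return the constant terms of the expressions in the text: 142 and 62 octets with SRTCP, 128 and 48 octets without.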
The RTCP reporting interval calculation ([RFC3550], Section 6.2) for The RTCP reporting interval calculation ([RFC3550], Section 6.2) for
a two-party session where both participants are senders, reduces to a two-party session where both participants are senders, reduces to:
Trtcp = n * Srtcp/Brtcp where Srtcp = (Sc + Nnc * Snc)/(1 + Nnc) is
the average RTCP packet size in octets, Brtcp is the bandwidth
allocated to RTCP in octets per second, and n is the number of
participants (n=2 in this scenario).
To ensure a report is sent every Nr frames, it is necessary to set Trtcp = n * Srtcp / Brtcp
the RTCP reporting interval Trtcp = Nr * Tf, which when substituted
into the previous gives Nr * Tf = n * Srtcp/Brtcp.
Solving for the RTCP bandwidth, Brtcp, and expanding the definition where Srtcp = (Sc + Nnc * Snc)/(1 + Nnc) is the average RTCP packet
of Srtcp gives Brtcp = (n * (Sc + Nnc * Snc))/(Nr * Tf * (1 + Nnc)). size in octets, Brtcp is the bandwidth allocated to RTCP in octets
per second, and n is the number of participants in the RTP session
(in this scenario, n = 2).
To ensure an RTCP report containing congestion control feedback is
sent after every Nr frames of audio, it is necessary to set the RTCP
reporting interval Trtcp = Nr * Tf, which when substituted into the
previous gives Nr * Tf = n * Srtcp/Brtcp. Solving this to give the
RTCP bandwidth, Brtcp, and expanding the definition of Srtcp gives:
Brtcp = (n * (Sc + Nnc * Snc))/(Nr * Tf * (1 + Nnc)).
If we assume every report is a compound RTCP packet (i.e., Nnc = 0), If we assume every report is a compound RTCP packet (i.e., Nnc = 0),
the frame duration Tf = 20ms, and an RTCP report is sent for every the frame duration Tf = 20ms, and an RTCP report is sent for every
second frame (i.e., 25 RTCP reports per second), this expression second frame (i.e., 25 RTCP reports per second), this gives an RTCP
gives the needed RTCP bandwidth Brtcp = 51.6kbps. Increasing the feedback bandwidth, Brtcp = 57kbps. Increasing the frame duration,
frame duration, or reducing the frequency of reports, reduces the or reducing the frequency of reports, will reduce the RTCP bandwidth
RTCP bandwidth, as shown below: as shown in Table 1.
+==============+=============+================+ +==============+=============+================+
| Tf (seconds) | Nr (frames) | rtcp_bw (kbps) | | Tf (seconds) | Nr (frames) | rtcp_bw (kbps) |
+==============+=============+================+ +==============+=============+================+
| 20ms | 2 | 51.6 | | 0.020 | 2 | 57.0 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 20ms | 4 | 26.6 | | 0.020 | 4 | 29.3 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 20ms | 8 | 14.1 | | 0.020 | 8 | 15.4 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 20ms | 16 | 7.8 | | 0.020 | 16 | 8.5 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 2 | 17.2 | | 0.060 | 2 | 19.0 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 4 | 8.9 | | 0.060 | 4 | 9.8 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 8 | 4.7 | | 0.060 | 8 | 5.1 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 16 | 2.6 | | 0.060 | 16 | 2.8 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
Table 1: Required RTCP bandwidth for VoIP Table 1: RTCP bandwidth needed for VoIP
feedback feedback
The final row of the table (60ms frames, report every 16 frames) The final row of Table 1 (60ms frames, report every 16 frames) sends
sends RTCP reports once per second, giving an RTCP bandwidth of RTCP reports once per second, giving an RTCP bandwidth overhead of
2.6kbps. 2.8kbps.
The overhead can be reduced by sending some reports in non-compound The overhead can be reduced by sending some reports in non-compound
RTCP packets [RFC5506]. For example, if we alternate compound and RTCP packets [RFC5506]. For example, if we alternate compound and
non-compound RTCP packets, i.e., Nnc = 1, the calculation gives: non-compound RTCP packets, i.e., Nnc = 1, the calculation gives the
results shown in Table 2.
+==============+=============+================+ +==============+=============+================+
| Tf (seconds) | Nr (frames) | rtcp_bw (kbps) | | Tf (seconds) | Nr (frames) | rtcp_bw (kbps) |
+==============+=============+================+ +==============+=============+================+
| 20ms | 2 | 35.9 | | 0.020 | 2 | 41.4 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 20ms | 4 | 18.8 | | 0.020 | 4 | 21.5 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 20ms | 8 | 10.2 | | 0.020 | 8 | 11.5 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 20ms | 16 | 5.9 | | 0.020 | 16 | 6.5 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 2 | 12.0 | | 0.060 | 2 | 13.8 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 4 | 6.2 | | 0.060 | 4 | 7.2 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 8 | 3.4 | | 0.060 | 8 | 3.8 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
| 60ms | 16 | 2.0 | | 0.060 | 16 | 2.2 |
+--------------+-------------+----------------+ +--------------+-------------+----------------+
Table 2: Required RTCP bandwidth for VoIP Table 2: Required RTCP bandwidth for VoIP
feedback (alternating compound and non- feedback (alternating compound and non-
compound reports) compound reports)
The RTCP bandwidth needed for 60ms frames, reporting every 16 frames The RTCP bandwidth needed for 60ms frames, reporting every 16 frames
(once per second), can be seen to drop to 2.0kbps. This calculation (once per second), can be seen to drop to 2.2kbps. This calculation
can be repeated for other patterns of compound and non-compound RTCP can be repeated for other patterns of compound and non-compound RTCP
packets, feedback frequency, and frame duration, as needed. packets, feedback frequency, and frame duration, as needed.
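As a minimal sketch of how this calculation can be repeated for other parameters, the following Python function evaluates the Brtcp expression given earlier using the -07 packet sizes (Sc = 142 + 2*Nr, Snc = 62 + 2*Nr). The function name and argument names are hypothetical. The conversion to kbps assumes 1 kbps = 1024 bits per second; this is an assumption, chosen because it reproduces the tabulated values, and is not stated in the draft.

```python
def rtcp_bandwidth_kbps(tf, nr, nnc=0, n=2):
    """Evaluate Brtcp = (n * (Sc + Nnc * Snc)) / (Nr * Tf * (1 + Nnc)).

    tf  -- frame duration in seconds
    nr  -- number of frames between RTCP reports
    nnc -- non-compound packets sent per compound packet
    n   -- number of participants in the RTP session
    """
    sc = 142 + 2 * nr                     # compound packet size, octets (-07)
    snc = 62 + 2 * nr                     # non-compound packet size, octets (-07)
    srtcp = (sc + nnc * snc) / (1 + nnc)  # average RTCP packet size, octets
    brtcp = n * srtcp / (nr * tf)         # octets per second
    # Assumption: 1 kbps = 1024 bits/s, which matches the tables in the text.
    return brtcp * 8 / 1024


if __name__ == "__main__":
    for tf in (0.020, 0.060):
        for nr in (2, 4, 8, 16):
            all_compound = rtcp_bandwidth_kbps(tf, nr, nnc=0)
            alternating = rtcp_bandwidth_kbps(tf, nr, nnc=1)
            print(f"{tf:.3f} {nr:2d} {all_compound:5.1f} {alternating:5.1f}")
```

Calling the function with `nnc=0` corresponds to Table 1 and with `nnc=1` to Table 2; other values of `nnc`, `tf`, and `nr` cover the remaining patterns mentioned above.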
Note: To achieve the RTCP transmission intervals above the RTP/SAVPF Note: To achieve the RTCP transmission intervals above the RTP/SAVPF
profile with T_rr_interval=0 is used, since even when using the profile with T_rr_interval=0 is used, since even when using the
reduced minimal transmission interval, the RTP/SAVP profile would reduced minimal transmission interval, the RTP/SAVP profile would
only allow sending RTCP at most every 0.11s (every third frame of only allow sending RTCP at most every 0.11s (every third frame of
video). Using RTP/SAVPF with T_rr_interval=0 however is capable of video). Using RTP/SAVPF with T_rr_interval=0 however is capable of
fully utilizing the configured 5% RTCP bandwidth fraction. fully utilizing the configured 5% RTCP bandwidth fraction.
3.2. Scenario 2: Point-to-Point Video Conference 3.2. Scenario 2: Point-to-Point Video Conference
Consider a point to point video call between two end systems. There Consider a point to point video call between two end systems. There
will be four RTP flows in this scenario, two audio and two video, will be four RTP flows in this scenario, two audio and two video,
with all four flows being active for essentially all the time (the with all four flows being active for essentially all the time (the
audio flows will likely use voice activity detection and comfort audio flows will likely use voice activity detection and comfort
noise to reduce the packet rate during silent periods, and does not noise to reduce the packet rate during silent periods, but this does
cause the transmissions to stop). not cause the transmissions to stop).
Assume all four flows are sent in a single RTP session, each using a Assume all four flows are sent in a single RTP session, each using a
separate SSRC; the RTCP reports from co-located audio and video SSRCs separate SSRC. The RTCP reports from the co-located audio and video
at each end point are aggregated [RFC8108]; the optimisations in SSRCs at each end point are aggregated [RFC8108], the optimisations
[RFC8861] are used; and congestion control feedback is sent in [RFC8861] are used, and RTCP congestion control feedback is sent
[RFC8888]. [RFC8888].
When all members are senders, the RTCP timing rules in Section 6.2 When all members are senders, the RTCP reporting interval calculation
and 6.3 of [RFC3550] and [RFC4585] reduce to: in Section 6.2 and 6.3 of [RFC3550] and [RFC4585] reduces to:
rtcp_interval = avg_rtcp_size * n / rtcp_bw Trtcp = n * Srtcp / Brtcp
where n is the number of members in the session, the avg_rtcp_size is where n is the number of members in the session, Srtcp is the average
measured in octets, and the rtcp_bw is the bandwidth available for RTCP packet size in octets, and the Brtcp is the RTCP bandwidth in
RTCP, measured in octets per second (this will typically be 5% of the octets per second.
session bandwidth).
The average RTCP size will depend on the amount of feedback that is The average RTCP packet size, Srtcp, depends on the amount of
sent in each RTCP packet, on the number of members in the session, on feedback sent in each RTCP packet, on the number of members in the
the size of source description (RTCP SDES) information sent, and on session, on the size of source description (RTCP SDES) information
the amount of congestion control feedback sent in each packet. sent, and on the amount of congestion control feedback sent in each
packet.
As a baseline, each RTCP packet will be a compound RTCP packet that As a baseline, each RTCP packet will be a compound RTCP packet that
contains an aggregate of a compound RTCP packet generated by the contains an aggregate of a compound RTCP packet generated by the
video SSRC and a compound RTCP packet generated by the audio SSRC. video SSRC and a compound RTCP packet generated by the audio SSRC.
Since the RTCP reporting group extensions are used, one of these When the RTCP reporting group extensions are used, one of these SSRCs
SSRCs will be a reporting SSRC, and the other will delegate its will be a reporting SSRC, to which the other SSRC will have delegated
reports to that. its reports. No non-compound RTCP packets are sent.
The aggregated compound RTCP packet from the non-reporting SSRC will The aggregated compound RTCP packet from the non-reporting SSRC will
contain an RTCP SR packet, an RTCP SDES packet, and an RTCP RGRS contain an RTCP SR packet, an RTCP SDES packet, and an RTCP RGRS
packet. The RTCP SR packet contains the 28 octet header and sender packet. The RTCP SR packet contains the 28 octet header and sender
information, but no report blocks (since the reporting is delegated). information, but no report blocks (since the reporting is delegated).
The RTCP SDES packet will comprise a header (4 octets), originating The RTCP SDES packet will comprise a header (4 octets), originating
SSRC (4 octets), a CNAME chunk, a terminating chunk, and any padding. SSRC (4 octets), a CNAME chunk, a terminating chunk, and any padding.
If the CNAME follows [RFC7022] and [RFC8834] it will be 18 octets in If the CNAME follows [RFC7022] and [RFC8834] it will be 18 octets in
size, and will need 1 octet of padding, making the SDES packet 28 size, and will need 1 octet of padding, making the SDES packet 28
octets in size. The RTCP RGRS packet will be 12 octets in size. octets in size. The RTCP RGRS packet will be 12 octets in size.
skipping to change at page 9, line 7 skipping to change at page 9, line 7
congestion control feedback packet. The RTCP SR packet will contain congestion control feedback packet. The RTCP SR packet will contain
two report blocks, one for each of the remote SSRCs (the report for two report blocks, one for each of the remote SSRCs (the report for
the other local SSRC is suppressed by the reporting group extension), the other local SSRC is suppressed by the reporting group extension),
for a total of 28 + (2 * 24) = 76 octets. The RTCP SDES packet will for a total of 28 + (2 * 24) = 76 octets. The RTCP SDES packet will
comprise a header (4 octets), originating SSRC (4 octets), a CNAME comprise a header (4 octets), originating SSRC (4 octets), a CNAME
chunk, an RGRP chunk, a terminating chunk, and any padding. If the chunk, an RGRP chunk, a terminating chunk, and any padding. If the
CNAME follows [RFC7022] and [RFC8834] it will be 18 octets in size. CNAME follows [RFC7022] and [RFC8834] it will be 18 octets in size.
The RGRP chunk similarly comprises 18 octets, and 3 octets of padding The RGRP chunk similarly comprises 18 octets, and 3 octets of padding
are needed, for a total of 48 octets. The RTCP congestion control are needed, for a total of 48 octets. The RTCP congestion control
feedback (CCFB) report comprises an 8 octet RTCP header and SSRC,an 4 feedback (CCFB) report comprises an 8 octet RTCP header and SSRC, a 4
report timestamp, then for each of the remote audio and video SSRCs, octet report timestamp, and for each of the remote audio and video
an 8 octet report header, and 2 octets per packet reported upon, and SSRCs, an 8 octet report header, and 2 octets per packet reported
padding to a 4 octet boundary if needed; that is 8 + 4 + 8 + (2 * Nv) upon, and padding to a 4 octet boundary if needed; that is 8 + 4 + 8
+ 8 + (2 * Na) where Nv is the number of video packets per report, + (2 * Nv) + 8 + (2 * Na) where Nv is the number of video packets per
and Na is the number of audio packets per report. report, and Na is the number of audio packets per report.
The complete compound RTCP packet contains the RTCP packets from both The complete compound RTCP packet contains the RTCP packets from both
the reporting and non-reporting SSRCs, an SRTP authentication tag, the reporting and non-reporting SSRCs, an SRTCP trailer and
and a UDP/IPv4 header. The size of this RTCP packet is therefore: authentication tag, and a UDP/IPv4 header. The size of this RTCP
248 + (2 * Nv) + (2 * Na) octets. Since the aggregate RTCP packet packet is therefore: 262 + (2 * Nv) + (2 * Na) octets. Since the
contains reports from two SSRCs, the RTCP packet size is halved aggregate RTCP packet contains reports from two SSRCs, the RTCP
before use [RFC8108]. Accordingly, we define Sc = (248 + (2 * Nv) + packet size is halved before use [RFC8108]. Accordingly, the size of
(2 * Na))/2 for this scenario. the RTCP packets is:
How many packets does the RTCP XR congestion control feedback packet Srtcp = (262 + (2 * Nv) + (2 * Na)) / 2
report on? This is obviously highly dependent on the choice of codec
and encoding parameters, and might be quite bursty if the codec sends
I-frames from which later frames are predicted. For now though,
assume constant rate media with an MTU around 1500 octets, with
reports for both audio and video being aggregated and sent to align
with video frames. This gives the following, assuming Nr =1 and Nnc
= 0 (i.e., send a compound RTCP packet for each video frame, and no
non-compound packets), and using the calculation from Scenario 1:
Brtcp = (n * (Sc + Nnc * Snc))/(Nr * Tf * (1 + Nnc))
+===========+=======+=============+=============+===============+
| Data Rate | Video | Video | Audio | Required RTCP |
| (kbps) | Frame | Packets per | Packets per | bandwidth: |
| | Rate | Report: Nv | Report: Na | Brtcp (kbps) |
+===========+=======+=============+=============+===============+
| 100 | 8 | 1 | 6 | 32.8 (32%) |
+-----------+-------+-------------+-------------+---------------+
| 200 | 16 | 1 | 3 | 64.0 (32%) |
+-----------+-------+-------------+-------------+---------------+
| 350 | 30 | 1 | 2 | 119.1 (34%) |
+-----------+-------+-------------+-------------+---------------+
| 700 | 30 | 2 | 2 | 120.0 (17%) |
+-----------+-------+-------------+-------------+---------------+
| 700 | 60 | 1 | 1 | 236.2 (33%) |
+-----------+-------+-------------+-------------+---------------+
| 1024 | 30 | 3 | 2 | 120.9 (11%) |
+-----------+-------+-------------+-------------+---------------+
| 1400 | 60 | 2 | 1 | 238.1 (17%) |
+-----------+-------+-------------+-------------+---------------+
| 2048 | 30 | 6 | 2 | 123.8 ( 6%) |
+-----------+-------+-------------+-------------+---------------+
| 2048 | 60 | 3 | 1 | 240.0 (11%) |
+-----------+-------+-------------+-------------+---------------+
| 4096 | 30 | 12 | 2 | 129.4 ( 3%) |
+-----------+-------+-------------+-------------+---------------+
| 4096 | 60 | 6 | 1 | 245.6 ( 5%) |
+-----------+-------+-------------+-------------+---------------+
[draft -06] Table 3: Required RTCP bandwidth, reporting on every frame

[draft -06] The RTCP bandwidth needed scales inversely with Nr. That
is, it is halved if Nr=2 (report on every second packet), is reduced
to one-third if Nr=3 (report on every third packet), and so on.

[draft -06] The needed RTCP bandwidth scales as a percentage of the
data rate following the ratio of the frame rate to the data rate. As
can be seen from the table above, the RTCP bandwidth needed is a
significant fraction of the media rate, if reporting on every frame
for low rate video. This can be solved by reporting less often at
lower rates. For example, to report on every frame of 100kbps/8fps
video requires the RTCP bandwidth to be 21% of the media rate;
reporting every fourth frame (i.e., twice per second) reduces this
overhead to 5%.

[draft -07] How many RTP packets does the RTCP XR congestion control
feedback packet included in these compound RTCP packets report on?
That is, what are the values of Nv and Na? This depends on the RTCP
reporting interval, Trtcp, the video bit rate and frame rate, Rf, the
audio bit rate and framing interval, and whether the receiver chooses
to send congestion control feedback in each RTCP packet it sends.

[draft -07] To simplify the calculation, assume it is desired to send
one RTCP report for each frame of video received (i.e., Trtcp = 1 / Rf)
and to include a congestion control feedback packet in each report.
Assume that video has constant bit rate and frame rate, and that each
packet has to fit into a 1500 octet MTU. Further, assume that the
audio takes negligible bandwidth, and that the audio framing interval
can be varied within reasonable bounds, so that an integral number of
audio frames align with video frame boundaries.

[draft -07] Table 3 shows the resulting values of Nv and Na, the
number of video and audio packets covered by each congestion control
feedback report, for a range of data rates and video frame rates,
assuming congestion control feedback is sent once per video frame.
The table also shows the result of inverting the RTCP reporting
interval calculation to find the corresponding RTCP bandwidth, Brtcp.
The RTCP bandwidth is given in kbps and as a fraction of the data
rate.

[draft -07] It can be seen that, for example, with a data rate of
1024 kbps and video sent at 30 frames-per-second, the RTCP congestion
control feedback report sent for each video frame will include reports
on 3 video packets and 2 audio packets. The RTCP bandwidth needed to
sustain this reporting rate is 127.5 kbps (12% of the data rate).
This assumes an audio framing interval of 16.67ms, so that two audio
packets are sent for each video frame.
+===========+==========+=============+=============+===============+
| Data Rate | Video    | Video       | Audio       | Required RTCP |
| (kbps)    | Frame    | Packets per | Packets per | bandwidth:    |
|           | Rate: Rf | Report: Nv  | Report: Na  | Brtcp (kbps)  |
+===========+==========+=============+=============+===============+
| 100       | 8        | 1           | 6           | 34.5 (34%)    |
+-----------+----------+-------------+-------------+---------------+
| 200       | 16       | 1           | 3           | 67.5 (33%)    |
+-----------+----------+-------------+-------------+---------------+
| 350       | 30       | 1           | 2           | 125.6 (35%)   |
+-----------+----------+-------------+-------------+---------------+
| 700       | 30       | 2           | 2           | 126.6 (18%)   |
+-----------+----------+-------------+-------------+---------------+
| 700       | 60       | 1           | 1           | 249.4 (35%)   |
+-----------+----------+-------------+-------------+---------------+
| 1024      | 30       | 3           | 2           | 127.5 (12%)   |
+-----------+----------+-------------+-------------+---------------+
| 1400      | 60       | 2           | 1           | 251.2 (17%)   |
+-----------+----------+-------------+-------------+---------------+
| 2048      | 30       | 6           | 2           | 130.3 ( 6%)   |
+-----------+----------+-------------+-------------+---------------+
| 2048      | 60       | 3           | 1           | 253.1 (12%)   |
+-----------+----------+-------------+-------------+---------------+
| 4096      | 30       | 12          | 2           | 135.9 ( 3%)   |
+-----------+----------+-------------+-------------+---------------+
| 4096      | 60       | 6           | 1           | 258.8 ( 6%)   |
+-----------+----------+-------------+-------------+---------------+
[draft -07] Table 3: Required RTCP bandwidth, reporting on every frame
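The Brtcp column can be read as the reporting interval relation solved
for bandwidth. The following is a minimal sketch in Python, assuming
the simplified relation Trtcp = n * Savg / Brtcp with one report per
video frame (Trtcp = 1 / Rf). The function and parameter names are
illustrative. The SSRC count of 4, the average packet size of 136
octets, and the convention 1 kbps = 1024 bit/s are assumptions chosen
so that the result lands on the 1024 kbps / 30 fps row; none of them
is stated in this section.

```python
def rtcp_bandwidth_kbps(n_ssrcs, avg_packet_octets, frame_rate,
                        bits_per_kbit=1024):
    """Solve Trtcp = n * Savg / Brtcp for Brtcp, with Trtcp = 1 / Rf.

    The relation is an assumed simplified form of the RTCP reporting
    interval calculation; all names here are hypothetical.
    """
    # One report per frame means Rf reports per second from each SSRC.
    octets_per_second = n_ssrcs * avg_packet_octets * frame_rate
    return octets_per_second * 8 / bits_per_kbit


def fraction_of_data_rate(b_rtcp_kbps, data_rate_kbps):
    """RTCP bandwidth expressed as a fraction of the media data rate."""
    return b_rtcp_kbps / data_rate_kbps


# Assumed inputs (not from the text): 4 SSRCs, 136-octet average packet.
b = rtcp_bandwidth_kbps(n_ssrcs=4, avg_packet_octets=136, frame_rate=30)
print(f"{b:.1f} kbps ({fraction_of_data_rate(b, 1024):.0%})")
```

With these assumed inputs the sketch prints 127.5 kbps (12%). Because
the bandwidth is linear in the frame rate, reporting on every second
frame (passing frame_rate / 2) halves the result for the same average
packet size.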
[draft -06] Use of reduced size RTCP [RFC5506] would allow the SR and
SDES packets to be omitted from some reports. These "non-compound"
(actually, compound but reduced size in this case) RTCP packets would
contain an RTCP RGRS packet from the non-reporting SSRC, and an RTCP
SDES RGRP packet and a congestion control feedback packet from the
reporting SSRC. This will be 12 + 28 + 12 + 8 + 2*Nv + 8 + 2*Na
octets, plus UDP/IP header. That is, Snc = (96 + 2*Nv + 2*Na)/2.
Repeating the analysis above, but alternating compound and
non-compound reports, i.e., setting Nnc = 1, gives:

[draft -07] Use of reduced size RTCP [RFC5506] would allow the SR and
SDES packets to be omitted from some reports. These "non-compound"
(actually, compound but reduced size in this case) RTCP packets would
contain an RTCP RGRS packet from the non-reporting SSRC, and an RTCP
SDES RGRP packet and a congestion control feedback packet from the
reporting SSRC. This will be 12 + 28 + 12 + 8 + 2*Nv + 8 + 2*Na
octets, plus the SRTCP trailer and authentication tag, and a UDP/IP
header. That is, the size of the non-compound packets would be (110
+ 2*Nv + 2*Na)/2 octets. Repeating the analysis above, but
alternating compound and non-compound reports gives results as shown
in Table 4.
+===========+==========+=============+=============+===============+===============+
| Data Rate | Video    | Video       | Audio       | Required RTCP | Required RTCP |
| (kbps)    | Frame    | Packets per | Packets per | bandwidth:    | bandwidth:    |
|           | Rate: Rf | Report: Nv  | Report: Na  | Brtcp (kbps), | Brtcp (kbps), |
|           |          |             |             | draft -06     | draft -07     |
+===========+==========+=============+=============+===============+===============+
| 100       | 8        | 1           | 6           | 23.2 (23%)    | 24.1 (24%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 200       | 16       | 1           | 3           | 45.0 (22%)    | 46.8 (23%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 350       | 30       | 1           | 2           | 83.4 (23%)    | 86.7 (24%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 700       | 30       | 2           | 2           | 84.4 (12%)    | 87.7 (12%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 700       | 60       | 1           | 1           | 165.0 (23%)   | 171.6 (24%)   |
+-----------+----------+-------------+-------------+---------------+---------------+
| 1024      | 30       | 3           | 2           | 85.3 ( 8%)    | 88.6 ( 8%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 1400      | 60       | 2           | 1           | 166.9 (11%)   | 173.4 (12%)   |
+-----------+----------+-------------+-------------+---------------+---------------+
| 2048      | 30       | 6           | 2           | 88.1 ( 4%)    | 91.4 ( 4%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 2048      | 60       | 3           | 1           | 168.8 ( 8%)   | 175.3 ( 8%)   |
+-----------+----------+-------------+-------------+---------------+---------------+
| 4096      | 30       | 12          | 2           | 93.8 ( 2%)    | 97.0 ( 2%)    |
+-----------+----------+-------------+-------------+---------------+---------------+
| 4096      | 60       | 6           | 1           | 174.4 ( 4%)   | 180.9 ( 4%)   |
+-----------+----------+-------------+-------------+---------------+---------------+
Table 4: Required RTCP bandwidth, reporting on every frame, with
reduced-size reports
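The non-compound packet size used for Table 4 can be tallied term by
term. The following is a minimal sketch in Python that mirrors the
expressions (96 + 2*Nv + 2*Na)/2 and (110 + 2*Nv + 2*Na)/2 as written,
including the final division by two. The names are illustrative, and
the split of the per-packet overhead into 28 octets of UDP/IPv4 header
and 14 octets of SRTCP trailer plus authentication tag is an
assumption inferred from the constants 96 and 110, not something this
section states.

```python
# Fixed RTCP terms listed in the text: 12 + 28 + 12 + 8 + 8 octets.
RTCP_FIXED_OCTETS = 12 + 28 + 12 + 8 + 8
UDP_IP_OCTETS = 28   # assumption: 8-octet UDP + 20-octet IPv4 header
SRTCP_OCTETS = 14    # assumption: SRTCP trailer + authentication tag


def non_compound_size(nv, na, srtcp=True):
    """Size expression for a reduced-size report covering nv video
    and na audio packets, following the formula in the text."""
    total = RTCP_FIXED_OCTETS + 2 * nv + 2 * na + UDP_IP_OCTETS
    if srtcp:
        total += SRTCP_OCTETS
    # The division by two is carried over from the text as written.
    return total / 2


print(non_compound_size(3, 2))               # (110 + 6 + 4) / 2 = 60.0
print(non_compound_size(3, 2, srtcp=False))  # (96 + 6 + 4) / 2 = 53.0
```

Each additional packet reported adds only 2 octets before the
division, so the fixed headers dominate the size at every data rate in
the table.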
The use of reduced-size RTCP gives a noticeable reduction in the
needed RTCP bandwidth, and can be combined with reporting every few
frames rather than every frame. Overall, it is clear that the RTCP
overhead can be reasonable across the range of data and frame rates,
if RTCP is configured carefully.
3.3. Scenario 3: Group Video Conference
(tbd)
3.4. Scenario 4: Screen Sharing
(tbd)
4. Discussion and Conclusions

[draft -06] RTCP as it is currently specified cannot be used to send
per-packet congestion feedback. RTCP can, however, be used to send
congestion feedback on each frame of video sent, provided the session
bandwidth exceeds a couple of megabits per second (the exact rate
depending on the number of session participants, the RTCP bandwidth
fraction, and what RTCP extensions are enabled, and how much detail of
feedback is needed). For lower rate sessions, the overhead of
reporting on every frame becomes high, but can be reduced to something
reasonable by sending reports once per N frames (e.g., every second
frame), or by sending non-compound RTCP reports in between the regular
reports.

[draft -07] RTCP as it is currently specified cannot be used to send
per-packet congestion feedback with reasonable overhead.

[draft -07] RTCP can, however, be used to send congestion feedback on
each frame of video sent, provided the session bandwidth exceeds a
couple of megabits per second (the exact rate depending on the number
of session participants, the RTCP bandwidth fraction, and what RTCP
extensions are enabled, and how much detail of feedback is needed).
For lower rate sessions, the overhead of reporting on every frame
becomes high, but can be reduced to something reasonable by sending
reports once per N frames (e.g., every second frame), or by sending
non-compound RTCP reports in between the regular reports.

[draft -06] If it is desired to use RTCP in something close to its
current form for congestion feedback in WebRTC, the multimedia
congestion control algorithm needs to be designed to work with
feedback sent every few frames, since that fits within the limitations
of RTCP. That feedback can be a little more complex than just an
acknowledgement, provided care is taken to consider the impact of the
extra feedback on the overhead, possibly allowing for a degree of
semantic feedback, meaningful to the codec layer as well as the
congestion control algorithm.

[draft -07] If it is desired to use RTCP in something close to its
current form for congestion feedback in WebRTC, the multimedia
congestion control algorithm needs to be designed to work with
feedback sent every few frames, since that fits within the limitations
of RTCP. The provided feedback will be more detailed than just an
acknowledgement, however, and will provide a loss bitmap, relative
arrival time, and received ECN marks, for each packet sent. This will
allow congestion control that is effective, if slowly responsive, to
be implemented.

The format described in [RFC8888] seems sufficient for the needs of
congestion control feedback. There is little point optimising this
format: the main overhead comes from the UDP/IP headers and the other
RTCP packets included in the compound packets, and can be lowered by
using the [RFC5506] extensions and sending reports less frequently.

Further study of the scenarios of interest is needed, to ensure that
the analysis presented is applicable to other media topologies, and
to sessions with different data rates and sizes of membership.
 End of changes. 53 change blocks. 
191 lines changed or deleted 207 lines changed or added
