draft-perkins-rtp-redundancy-00.txt   draft-perkins-rtp-redundancy-01.txt 
Expire in six months
INTERNET-DRAFT                           Mark Handley
draft-perkins-rtp-redundancy-00.txt      Vicky Hardman
                                         Isidor Kouvelas
                                         Colin Perkins
                                         University College London
                                         Jean-Chrysostome Bolot
                                         Andres Vega-Garcia
                                         Sacha Fosse-Parisis
                                         INRIA Sophia Antipolis
4 July 1996
Expires: 9 January 1997
Payload Format Issues for Redundant Encodings in RTP
draft-perkins-rtp-redundancy-01.txt
Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its
areas, and its working groups. Note that other groups may also
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as ``work in progress.''

To learn the current status of any Internet-Draft, please check
the ``1id-abstracts.txt'' listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
(Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
Coast), or ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Comments are solicited and should be addressed to the authors
and/or the AVT working group's mailing list at rem-conf@es.net.

Abstract

This document describes a payload type for use with
the real-time transport protocol (RTP), version 2, for
encoding redundant data. The primary motivation for
the scheme described herein is the development of
audio conferencing tools for use with lossy packet
networks such as the Internet Mbone, although this
scheme is not limited to such applications.

1 Introduction

If multimedia conferencing is to become widely used by the
Internet Mbone community, users must perceive the quality to be
sufficiently good for most applications. We have identified
a number of problems which impair the quality of conferences,
the most significant of which is packet loss over the Mbone.
Packet loss is a persistent problem, particularly given the
increasing popularity, and therefore increasing load, of the
Internet. The disruption of speech intelligibility even at
low loss rates which is currently experienced may convince a
whole generation of users that multimedia conferencing over the
Internet is not viable. The addition of redundancy to the
data stream is offered as a solution [1]. If a packet is
lost then the missing information may be reconstructed at the
receiver from the redundant data that arrives in the following
packet(s).

This draft proposes an RTP payload format for the transmission
of data encoded in a redundant fashion. Although the
primary use of this packet format to date has been in audio
applications, the packet format specified is quite general, and
is not limited to these applications.

2 Packetisation problem

The main requirements for a redundant encoding scheme under RTP
are as follows:

o Packets have to carry a primary encoding and one or more
  redundant encodings.

o As a multitude of encodings may be used for redundant
  information, each block of redundant encoding has to have
  an encoding type identifier.

o As the use of variable size encodings is desirable,
  each encoded block in the packet has to have a length
  indicator.

o The RTP header provides a timestamp field that corresponds
  to the time of creation of the encoded data. When
  redundant encodings are used this timestamp field can refer
  to the time of creation of the primary encoding data.
  Redundant blocks of data will correspond to different time
  intervals than the primary data. Each block of redundant
  encoding has to have its own timestamp. To reduce the
  number of bytes needed to carry the timestamp, it can
  be encoded as the difference between the timestamp of the
  redundant encoding and the timestamp of the primary.

There are two essential means by which redundant audio may be
added to the standard RTP specification: a header extension
may hold the redundancy, or one, or more, additional payload
types may be defined. These are now discussed in turn.

3 Use of RTP Header Extension

The RTP specification [2] states that applications should be
prepared to ignore a header extension. Including all the
redundancy information for a packet in a header extension
would make it easy for applications that do not implement
redundancy to discard it and just process the primary encoding
data. There are, however, a number of disadvantages with this
scheme:

o There is a large overhead from the number of bytes needed
  for the extension header (4 bytes) and the possible padding
  that is needed at the end of the extension to round up to
  a four byte boundary (up to 3 bytes). Even for longer
  duration packets the overhead is considerable, especially
  when high compression encodings are used.

o Use of the header extension limits applications to a single
  redundant encoding, unless further structure is introduced
  into the extension. This would result in further overhead.

For these reasons, the use of RTP header extension to hold
redundant audio encodings is disregarded.

4 Use Of Additional RTP Payload Types

Currently the RTP profile for audio and video conferences [3]
lists a set of payload types and provides for a dynamic range
of 32 encodings that may be defined through a conference
control protocol. This leads to two possible schemes for
assigning additional RTP payload types for redundant audio
applications:

1. A dynamic encoding scheme may be defined, for each
   combination of primary/redundant payload types, using the
   RTP dynamic payload type range (96-127).

2. A fixed payload type may be defined to represent a packet
   with redundancy. This may then be assigned to either a
   static RTP payload type, or the payload type for this may
   be assigned dynamically.

4.1 Dynamic Encoding Schemes

It is possible to define a set of payload types that signify
a particular combination of primary and secondary encodings for
each of the 32 dynamic payload types provided. This would
be a slightly restrictive yet feasible solution for packets
with a single block of redundancy as the number of possible
combinations is not too large. However, the need for multiple
blocks of redundancy greatly increases the number of encoding
combinations and makes this solution not viable.

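To give a rough sense of this growth, the following Python sketch
counts the combinations that would each need their own payload
type. It assumes, purely for illustration, that any encoding may
appear in any position; the function name and the figure of five
encodings are hypothetical and are not taken from this document.

```python
DYNAMIC_PAYLOAD_TYPES = 32  # size of the dynamic range given in the text


def combinations_needed(encodings: int, redundant_blocks: int) -> int:
    """Count primary/redundant combinations, one payload type each.

    Assumes every encoding may be used as the primary and in every
    redundant block (an illustrative simplification).
    """
    return encodings ** (redundant_blocks + 1)


if __name__ == "__main__":
    # Five encodings is an arbitrary, hypothetical working set.
    for blocks in (1, 2, 3):
        needed = combinations_needed(5, blocks)
        verdict = "fits" if needed <= DYNAMIC_PAYLOAD_TYPES else "does not fit"
        print(f"{blocks} redundant block(s): {needed} combinations, {verdict}")
```

Under these assumptions a single block of redundancy fits within
the dynamic range, whereas two or more blocks do not.
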
A modified version of the above solution could be to decide
prior to the beginning of a conference on a set of 32
encoding combinations that will be used for the duration of the
conference. All tools in the conference can be initialized
with this working set of encoding combinations. Communication
of the working set can be made through the use of an external
mechanism such as SDR. Setup is complicated as great care needs
to be taken in starting tools with identical parameters. This
scheme is more efficient as only one byte is used to identify
combinations of encodings.

4.2 Payload type to mean packet-with-redundancy

A more flexible solution would be to have only one payload
type to signify a packet with redundancy and have each of
the encoding blocks in the packet contain its own payload
type field: such a packet acts as a container, encapsulating
multiple packets into one.

Such a scheme is flexible, since any number of redundant
encodings may be enclosed within a single packet. There is,
however, a small overhead since each encapsulated packet must
be preceded by a header indicating the type of data enclosed.

This is the preferred solution, since it is flexible and
extensible, and has a relatively low overhead. The remainder
of this document describes this solution.

5 RTP payload type for redundant data

The assignment of an RTP payload type for this new packet
format is outside the scope of this document, and will not
be specified here. An RTP packet containing redundant data
shall have a standard RTP header, with payload type indicating
redundancy. The other fields of the RTP header relate to the
primary data block of the redundant data.

Following the RTP header are a number of additional headers,
specified in the figure below, which specify the contents of
each of the encodings carried by the packet. Following these
additional headers are a number of data blocks, which contain
the standard RTP payload data for these encodings. It is noted
that all the headers are aligned to a 32 bit boundary, but that
the payload data will typically not be aligned.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   block PT  |  timestamp offset         |   block length    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The bits in the header are specified as follows:

F: 1 bit
  First bit in header indicates whether another encoding block
  follows. If 1, further redundant data blocks follow; if 0, this
  is the last data block.

block PT: 7 bits
  Payload type for this block.

timestamp offset: 14 bits
  Unsigned offset of timestamp of this block relative to timestamp
  given in RTP header. The use of an unsigned offset implies that
  redundant data must be sent after the primary data, and is hence
  a time to be subtracted from the current timestamp to determine
  the timestamp of the data for which this block is the redundancy.

block length: 10 bits
  Length in bytes of the next redundant data block excluding
  header.

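As an illustration of the field layout described above, a minimal
Python sketch of packing and unpacking the 32 bit block header
follows. The names RedundantHeader, pack_header and unpack_header
are hypothetical, and network byte order is assumed, as is usual
for RTP headers.

```python
import struct
from typing import NamedTuple


class RedundantHeader(NamedTuple):
    follows: bool          # F: another encoding block follows
    block_pt: int          # block PT, 7 bits
    timestamp_offset: int  # unsigned timestamp offset, 14 bits
    block_length: int      # block length in bytes, 10 bits


def pack_header(h: RedundantHeader) -> bytes:
    """Pack the four fields into one 32 bit word, F in the top bit."""
    if not 0 <= h.block_pt < (1 << 7):
        raise ValueError("block PT does not fit in 7 bits")
    if not 0 <= h.timestamp_offset < (1 << 14):
        raise ValueError("timestamp offset does not fit in 14 bits")
    if not 0 <= h.block_length < (1 << 10):
        raise ValueError("block length does not fit in 10 bits")
    word = ((int(h.follows) << 31) | (h.block_pt << 24)
            | (h.timestamp_offset << 10) | h.block_length)
    return struct.pack("!I", word)


def unpack_header(data: bytes) -> RedundantHeader:
    """Inverse of pack_header; reads the first four bytes of data."""
    (word,) = struct.unpack("!I", data[:4])
    return RedundantHeader(
        follows=bool(word >> 31),
        block_pt=(word >> 24) & 0x7F,
        timestamp_offset=(word >> 10) & 0x3FFF,
        block_length=word & 0x3FF,
    )
```

Because the 14 bit and 10 bit fields straddle byte boundaries, the
sketch treats the header as a single word and uses shifts and masks
rather than per-byte fields.
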
It is noted that this limits the use of redundant data
slightly: it is not possible to send redundancy before the
primary encoding. This may possibly affect schemes, such as
GSM audio, where a low bandwidth coding suitable for redundancy
is produced early in the encoding process, and hence could
feasibly be transmitted early. The addition of a sign bit
would unacceptably reduce the range of the timestamp offset,
and increasing the size of the field above 14 bits limits the
block length field. It seems that limiting redundancy to be
transmitted after the primary will cause fewer problems than
limiting the size of the other fields.

It is noted that the block length and timestamp offset are 10
bits and 14 bits respectively, rather than the more obvious 8
and 16 bits. Whilst such an encoding complicates parsing the
header information slightly, and adds some additional processing
overhead, there are a number of problems involved with the more
obvious choice. An 8 bit block length field is sufficient for
most, but not all, possible encodings: for example, 80ms PCM
and DVI audio packets comprise more than 256 bytes, and cannot
be encoded with a single byte length field. It is possible
to impose additional structure on the block length field (for
example, the high bit set could imply the lower 7 bits code a
length in words, rather than bytes); however, such schemes are
complex. The use of a 10 bit block length field retains
simplicity and provides an enlarged range, at the expense of a
reduced range of timestamp values. A 14 bit timestamp value
does, however, allow for 4.5 complete packets delay with 48 kHz
audio, more at lower sampling rates, and it is felt that this
is sufficient.

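The ranges involved can be checked with a short Python sketch. It
assumes 8 kHz sampling with 8 bit PCM and 4 bit DVI samples,
ignores any per-packet codec header, and takes the timestamp to
count sampling instants; the function names are hypothetical. It
does not attempt to reproduce the packet count quoted above, which
depends on a packet duration that the text does not state.

```python
LENGTH_BITS = 10
OFFSET_BITS = 14

MAX_BLOCK_LENGTH = (1 << LENGTH_BITS) - 1  # largest length in bytes
MAX_TS_OFFSET = (1 << OFFSET_BITS) - 1     # largest offset in ticks


def payload_bytes(duration_ms: int, sample_rate_hz: int,
                  bits_per_sample: int) -> int:
    """Size in bytes of a block of fixed-rate samples."""
    samples = sample_rate_hz * duration_ms // 1000
    return samples * bits_per_sample // 8


def max_delay_ms(sample_rate_hz: int) -> float:
    """Longest delay the offset field can express, in milliseconds."""
    return MAX_TS_OFFSET * 1000 / sample_rate_hz


if __name__ == "__main__":
    for name, bits in (("PCM", 8), ("DVI", 4)):
        size = payload_bytes(80, 8000, bits)
        print(f"80ms {name}: {size} bytes, "
              f"8 bit field ok: {size <= 255}, "
              f"10 bit field ok: {size <= MAX_BLOCK_LENGTH}")
    for rate in (8000, 48000):
        print(f"{rate} Hz: offset covers up to {max_delay_ms(rate):.1f} ms")
```
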
The primary encoding block should be placed last in the packet.
It is therefore required to omit the timestamp and block-length
fields from the header of this block, since they may be
determined from the RTP header and overall packet length. The
header for the primary (final) block comprises only a zero
marker bit, and the block payload type information, a total of
8 bits. This is illustrated below:

 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|0|   block PT  |
+-+-+-+-+-+-+-+-+

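A receiver might walk the headers and data blocks roughly as in
the following Python sketch. It assumes that the payload passed in
starts immediately after the RTP header, that data blocks appear in
the same order as their headers with the primary last, and that
timestamps wrap modulo 2^32; the function name is hypothetical.

```python
from typing import List, Tuple

Block = Tuple[int, int, bytes]  # (payload type, timestamp, data)


def parse_redundant_payload(payload: bytes, rtp_timestamp: int) -> List[Block]:
    """Split a redundancy payload into its encoded blocks."""
    headers = []  # (block PT, timestamp offset, block length)
    pos = 0
    while True:
        if pos >= len(payload):
            raise ValueError("missing primary block header")
        if not payload[pos] & 0x80:
            break  # F = 0: this is the one byte primary header
        if pos + 4 > len(payload):
            raise ValueError("truncated redundant block header")
        word = int.from_bytes(payload[pos:pos + 4], "big")
        headers.append(((word >> 24) & 0x7F,
                        (word >> 10) & 0x3FFF,
                        word & 0x3FF))
        pos += 4
    primary_pt = payload[pos] & 0x7F
    pos += 1

    blocks: List[Block] = []
    for pt, offset, length in headers:
        data = payload[pos:pos + length]
        if len(data) != length:
            raise ValueError("truncated redundant data block")
        # The offset is subtracted from the RTP header timestamp.
        blocks.append((pt, (rtp_timestamp - offset) & 0xFFFFFFFF, data))
        pos += length
    # The primary block takes whatever remains of the packet.
    blocks.append((primary_pt, rtp_timestamp, payload[pos:]))
    return blocks
```

The length of the primary block is never read from a header; it is
simply the remainder of the payload, which is why the block-length
field can be omitted for it.
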
6 Limitations

The RTP marker bit is not preserved. That is, a redundant copy
of the RTP marker is not sent, hence if the primary (containing
this marker) is lost, the marker is lost. It is not thought
that this will cause undue problems: even if the marker bit
were transmitted with the redundant information, there would be
the possibility of its loss, so applications would still have
to be written with this in mind.

7 Example Packet

A redundant audio data packet containing DVI4 (8 kHz) primary,
and a single block of redundancy encoded using 8 kHz LPC, is
illustrated:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |   sequence number of primary  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              timestamp of primary encoding                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at page 7, line 37 skipping to change at line 311
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                                                               +
|                DVI4 encoded primary data                      |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

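A packet of this shape could be assembled as in the following
Python sketch. The static payload types 5 (DVI4) and 7 (LPC) are
those of the audio/video profile [3]. Everything else is an
assumption made for illustration: the payload type 96 standing for
redundancy, the function name, and the block sizes and timestamp
offset used in the usage lines.

```python
import struct

RED_PT = 96   # hypothetical dynamic payload type meaning "redundancy"
DVI4_PT = 5   # static payload types from the A/V profile [3]
LPC_PT = 7


def build_example_packet(seq: int, timestamp: int, ssrc: int,
                         primary: bytes, redundant: bytes,
                         ts_offset: int) -> bytes:
    """Build an RTP packet with one LPC redundant block and a DVI4 primary."""
    if not 0 <= ts_offset < (1 << 14):
        raise ValueError("timestamp offset does not fit in 14 bits")
    if len(redundant) >= (1 << 10):
        raise ValueError("redundant block does not fit a 10 bit length")
    # Fixed 12 byte RTP header: V=2, P=0, X=0, CC=0, M=0.
    rtp_header = struct.pack("!BBHII", 2 << 6, RED_PT, seq, timestamp, ssrc)
    # 32 bit header for the redundant block, with F = 1.
    red_header = struct.pack(
        "!I", (1 << 31) | (LPC_PT << 24) | (ts_offset << 10) | len(redundant))
    # 8 bit header for the primary block, with F = 0.
    primary_header = bytes([DVI4_PT])
    return rtp_header + red_header + primary_header + redundant + primary


if __name__ == "__main__":
    # Placeholder data; real blocks would come from the codecs.
    packet = build_example_packet(1, 1000, 0x1234, b"D" * 84, b"L" * 14, 160)
    print(len(packet), packet[:17].hex())
```
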
8 Authors' Addresses

Mark Handley/Vicky Hardman/Isidor Kouvelas/Colin Perkins
Department of Computer Science
University College London
London WC1E 6BT
United Kingdom
Email: {M.Handley|V.Hardman|I.Kouvelas|C.Perkins}@cs.ucl.ac.uk

Jean-Chrysostome Bolot/Andres Vega-Garcia/Sacha Fosse-Parisis
INRIA Sophia Antipolis
2004 Route des Lucioles, BP 93
06902 Sophia Antipolis
France
Email: bolot@sophia.inria.fr

9 References

[1] Hardman, V. J., Sasse, M. A., Handley, M. and Watson, A.,
    Reliable Audio for Use over the Internet, Proceedings INET'95,
    Honolulu, Oahu, Hawaii, September 1995.
    http://www.isoc.org/in95prc/

[2] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson,
    RTP: A Transport Protocol for Real-Time Applications, RFC 1889,
    January 1996

[3] H. Schulzrinne, RTP Profile for Audio and Video Conferences
    with Minimal Control, RFC 1890, January 1996