draft-ietf-avt-uncomp-video-02.txt | draft-ietf-avt-uncomp-video-03.txt | |||
---|---|---|---|---|
Internet Engineering Task Force AVT WG | Internet Engineering Task Force AVT WG | |||
INTERNET-DRAFT Ladan Gharai | INTERNET-DRAFT Ladan Gharai | |||
draft-ietf-avt-uncomp-video-02.txt Colin Perkins | draft-ietf-avt-uncomp-video-03.txt USC/ISI | |||
USC/ISI | Colin Perkins | |||
27 February 2003 | University of Glasgow | |||
Expires: August 2003 | 29 June 2003 | |||
Expires: December 2003 | ||||
RTP Payload Format for Uncompressed Video | RTP Payload Format for Uncompressed Video | |||
Status of this Memo | Status of this Memo | |||
This document is an Internet-Draft and is in full conformance with all | This document is an Internet-Draft and is in full conformance with | |||
provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
Internet-Drafts are working documents of the Internet Engineering Task | Internet-Drafts are working documents of the Internet Engineering | |||
Force (IETF), its areas, and its working groups. Note that other groups | Task Force (IETF), its areas, and its working groups. Note that | |||
may also distribute working documents as Internet-Drafts. | other groups may also distribute working documents as Internet- | |||
Drafts. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference material | time. It is inappropriate to use Internet-Drafts as reference | |||
or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
Abstract | Abstract | |||
This memo specifies a packetization scheme for encapsulating | This memo specifies a packetization scheme for encapsulating | |||
uncompressed video into a payload format for the Real-time | uncompressed video into a payload format for the Real-time | |||
Transport Protocol, RTP. It supports a range of standard- and | Transport Protocol, RTP. It supports a range of standard- and | |||
high-definition video formats, including common television | high-definition video formats, including common television | |||
formats such as ITU BT.601, SMPTE 274M and SMPTE 296M. The | formats such as ITU BT.601, and standards from the Society of | |||
format is designed to be extensible as new video formats are | Motion Picture and Television Engineers (SMPTE), such as SMPTE | |||
developed. | 274M and SMPTE 296M. The format is designed to be applicable | |||
and extensible to new video formats as they are developed. | ||||
1. Introduction | 1. Introduction | |||
[Note to RFC Editor: All references to RFC XXXX are to be replaced with | [Note to RFC Editor: All references to RFC XXXX are to be replaced | |||
the RFC number of this memo, when published] | with the RFC number of this memo, when published] | |||
This memo defines a scheme to packetize uncompressed, studio-quality, | This memo defines a scheme to packetize uncompressed, studio-quality, | |||
video streams for transport using RTP [RTP]. It supports a range of | video streams for transport using RTP [RTP]. It supports a range of | |||
standard and high definition video formats, including ITU-R BT.601 | standard and high definition video formats, including ITU-R BT.601 | |||
[601], SMPTE 274M [274] and SMPTE 296M [296]. | [601], SMPTE 274M [274] and SMPTE 296M [296]. | |||
Formats for uncompressed standard definition television are defined by | Formats for uncompressed standard definition television are defined | |||
ITU Recommendation BT.601 [601] along with bit-serial and parallel | by ITU Recommendation BT.601 [601] along with bit-serial and parallel | |||
interfaces in Recommendation BT.656 [656]. These formats allow both 625 | interfaces in Recommendation BT.656 [656]. These formats allow both | |||
line and 525 line operation, with 720 samples per digital active line, | 625 line and 525 line operation, with 720 samples per digital active | |||
4:2:2 color sub-sampling, and 8- or 10-bit digital representation. | line, 4:2:2 color sub-sampling, and 8- or 10-bit digital | |||
representation. | ||||
The representation of uncompressed high definition television is | The representation of uncompressed high definition television is | |||
specified in SMPTE standards 274M [274] and 296M [296]. SMPTE 274M | specified in SMPTE standards 274M [274] and 296M [296]. SMPTE 274M | |||
defines a family of scanning systems with an image format of 1920x1080 | defines a family of scanning systems with an image format of | |||
pixels with progressive and interlaced scanning, while SMPTE 296M | 1920x1080 pixels with progressive and interlaced scanning, while | |||
standard defines systems with an image size of 1280x720 pixels and only | SMPTE 296M defines systems with an image size of 1280x720 pixels and | |||
progressive scanning. In progressive scanning, scan lines are displayed | only progressive scanning. In progressive scanning, scan lines are | |||
in sequence from top to bottom of a full frame. In interlaced scanning, | displayed in sequence from top to bottom of a full frame. In | |||
a frame is divided into its odd and even scan lines (called fields) and | interlaced scanning, a frame is divided into its odd and even scan | |||
the two fields are displayed in succession. | lines (called fields) and the two fields are displayed in succession. | |||
SMPTE 274M and 296M define images with aspect ratios of 16:9, and define | SMPTE 274M and 296M define images with aspect ratios of 16:9, and | |||
the digital representation for RGB and YCbCr components. In the case of | define the digital representation for RGB and YCbCr components. In | |||
YCbCr components, the Cb and Cr components are horizontally sub-sampled | the case of YCbCr components, the Cb and Cr components are | |||
by a factor of two (4:2:2 color encoding). | horizontally sub-sampled by a factor of two (4:2:2 color encoding). | |||
Although these formats differ in their details, they are structurally | Although these formats differ in their details, they are structurally | |||
very similar. This memo specifies a payload format to encapsulate these, | very similar. This memo specifies a payload format to encapsulate | |||
and other similar, video formats for transport within RTP. | these, and other similar, video formats for transport within RTP. | |||
2. Conventions Used in this Document | 2. Conventions Used in this Document | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119[2119]. | document are to be interpreted as described in RFC 2119 [2119]. | |||
3. Payload Design | 3. Payload Design | |||
Each scan line of digital video is packetized into one or more | Each scan line of digital video is packetized into one or more RTP | |||
(depending on the network MTU) RTP packets. A single RTP packet MAY also | packets. If the data for a complete scan line exceeds the network | |||
contain data for more than one scan line. Only the active samples are | MTU, the scan line SHOULD be fragmented into multiple RTP packets, | |||
included in the RTP payload, inactive samples and the contents of | each smaller than the MTU. A single RTP packet MAY contain data for | |||
horizontal and vertical blanking SHOULD NOT be transported. Scan line | more than one scan line. Only the active samples are included in the | |||
numbers are included in the RTP payload header, along with a field | RTP payload: inactive samples and the contents of horizontal and | |||
identifier for interlaced video. | vertical blanking SHOULD NOT be transported. Scan line numbers are | |||
included in the RTP payload header, along with a field identifier for | ||||
interlaced video. | ||||
For SMPTE 296M format video, valid scan line numbers are from 26 | For SMPTE 296M format video, valid scan line numbers are from 26 | |||
through 745, inclusive. For progressive scan SMPTE 274M format | through 745, inclusive. For progressive scan SMPTE 274M format | |||
video, valid scan lines are from scan line 42 through 1121 | video, valid scan lines are from scan line 42 through 1121 | |||
inclusive. For interlaced scan, valid scan line numbers for field | inclusive. For interlaced scan SMPTE 274M format video, valid scan | |||
one (F=0) are from 21 to 560 and valid scan line numbers for the | line numbers for field one (F=0) are from 21 to 560 and valid scan | |||
second field (F=1) are from 584 to 1123. For ITU-R BT.601 format | line numbers for the second field (F=1) are from 584 to 1123. For | |||
video, the blanking intervals defined in BT.656 are used: for 625 | ITU-R BT.601 format video, the blanking intervals defined in BT.656 | |||
line video, lines 24 to 310 of field one (F=0) and 337 to 623 of | are used: for 625 line video, lines 24 to 310 of field one (F=0) | |||
the second field (F=1) are valid; for 525 line video, lines 21 to | and 337 to 623 of the second field (F=1) are valid; for 525 line | |||
263 of the first field, and 284 to 525 of the second field are | video, lines 21 to 263 of the first field, and 284 to 525 of the | |||
valid. Other formats (e.g. [372]) may define different ranges of | second field are valid. Other formats (e.g. [372]) may define | |||
active lines. | different ranges of active lines. | |||
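The active line ranges listed above can be collected into a lookup table. The following is a minimal sketch, not part of either draft: the format labels and the function name is_active_line are hypothetical names chosen for illustration, and only the ranges quoted in the text are included.

```python
# Active line ranges quoted in the text above, keyed by (format label, F bit).
# The format labels are hypothetical names invented for this sketch.
ACTIVE_LINES = {
    ("SMPTE296M", 0): (26, 745),
    ("SMPTE274M-progressive", 0): (42, 1121),
    ("SMPTE274M-interlaced", 0): (21, 560),
    ("SMPTE274M-interlaced", 1): (584, 1123),
    ("BT601-625", 0): (24, 310),
    ("BT601-625", 1): (337, 623),
    ("BT601-525", 0): (21, 263),
    ("BT601-525", 1): (284, 525),
}


def is_active_line(fmt, field, line_no):
    """Return True if line_no lies in the active range for fmt and field."""
    bounds = ACTIVE_LINES.get((fmt, field))
    if bounds is None:
        return False  # unknown format/field combination
    first, last = bounds
    return first <= line_no <= last
```

A receiver could use such a check to discard scan line headers whose line number falls outside the active picture area. Other formats would need their own entries.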
The payload header contains a 16 bit extension to the standard 16 bit | ||||
RTP sequence number, thereby extending the sequence number to 32 bits | ||||
and enabling RTP to accommodate high data rates. This is necessary as | ||||
the 16 bit RTP sequence number will roll-over very quickly for high data | ||||
rates. For example, for a 1 Gbps video stream with packet sizes of at | ||||
least one thousand octets, the standard RTP sequence number will roll-over in 0.5 | ||||
seconds, which can be a problem for detecting loss and out of order | ||||
packets particularly in instances where the round trip time is greater | ||||
than half a second. The extended 32 bit number allows for a longer wrap- | ||||
around time of approximately nine hours. | ||||
It is desirable for the video to be both octet aligned when packetized, | ||||
and to adhere to the principles of application level framing [ALF] by | ||||
ensuring that the samples relating to a single pixel are not fragmented | ||||
across two packets. | ||||
Samples may be transferred as 8, 10, 12 or 16 bit values. For 10 bit and | The payload header contains a 16 bit extension to the standard 16 bit | |||
12 bit payloads, care must be taken to pack an appropriate number of | RTP sequence number, thereby extending the sequence number to 32 bits | |||
samples per packet, such that the payload is also octet aligned. For RGB | and enabling the payload format to accommodate high data rates. This | |||
video, it is desirable that the samples corresponding to a single pixel | is necessary as the 16 bit RTP sequence number will roll-over very | |||
are not fragmented across packets. Similarly, for YCrCb video, it is | quickly for high data rates. For example, for a 1 Gbps video stream | |||
desirable that luminance and chrominance values are not fragmented | with packet sizes of at least one thousand octets, the standard RTP | |||
across packets. | sequence number will roll-over in 0.5 seconds, which can be a problem for | |||
detecting loss and out of order packets particularly in instances | ||||
where the round trip time is greater than half a second. The extended | ||||
32 bit number allows for a longer wrap-around time of approximately | ||||
nine hours. | ||||
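The wrap-around figures quoted above can be checked with a short calculation. This is an illustrative sketch only: the function name wrap_time_seconds is hypothetical, and it assumes every packet carries exactly packet_octets octets, which is the worst case for the "at least one thousand octets" example.

```python
def wrap_time_seconds(bit_rate_bps, packet_octets, seq_bits):
    """Seconds until a seq_bits-wide sequence number wraps around."""
    packets_per_second = bit_rate_bps / (packet_octets * 8)
    return (2 ** seq_bits) / packets_per_second


# 1 Gbps stream, 1000 octet packets: 125000 packets per second.
short_wrap = wrap_time_seconds(1e9, 1000, 16)   # about half a second
long_wrap = wrap_time_seconds(1e9, 1000, 32)    # about nine and a half hours
print(short_wrap, long_wrap / 3600)
```

With larger packets the packet rate drops and both wrap-around times grow in proportion.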
For example, in YCrCb video with 4:1:1 color subsampling, each group of | Each scan line comprises an integer number of pixels. Each pixel | |||
4 pixels is represented by 6 values, Y1 Y2 Y3 Y4 Cr Cb. These should be | is represented by a number of samples. Samples may be coded as 8, 10, | |||
packetized such that these values are not fragmented across a packet | 12 or 16 bit values. A sample may represent color or luminance | |||
boundary. With 10 bit words this is a 60 bit value which is not octet | components of the video. Color samples may be shared between | |||
aligned. To be both octet aligned, and appropriately framed, pixels must | adjacent pixels. The sharing of color samples between adjacent pixels | |||
be framed in 2 groups of 4 pixels, thereby becoming octet aligned on a | is known as color sub-sampling. This is typically done in the YCbCr | |||
15 octet boundary. This length is referred to as the pixel group | color space for the purpose of reducing the size of an image. | |||
("pgroup"), and it is conveyed in the SDP parameters. Tables 1 to 4 | ||||
display the pgroup values, in octets, for a range of color samplings and | ||||
word lengths. | ||||
When packetizing digital active line content, video data MUST NOT be | Pixels that share sample values MUST be transported together as a | |||
fragmented within a pgroup. | pixel group. If 10 bit or 12 bit samples are used, each pixel may | |||
also comprise a non-integer number of octets. In this case, several | ||||
pixels MUST be combined into an octet aligned pixel group for | ||||
transmission. These restrictions simplify the operation of receivers | ||||
by ensuring that a complete payload is octet aligned, and that | ||||
samples relating to a single pixel are not fragmented across multiple | ||||
packets [ALF]. | ||||
Video content is almost always associated with additional information | For example, in YCbCr video with 4:1:1 color sub-sampling, each group | |||
such as audio tracks, time code, etc. In professional digital video | of 4 adjacent pixels comprises 6 samples, Y1 Y2 Y3 Y4 Cr Cb, with the | |||
applications this data is commonly embedded in non-active portions of | Cr and Cb values being shared between all 4 pixels. If samples are 8 | |||
the video stream (horizontal and vertical blanking periods) so that | bit values, the result is a group of 4 pixels comprising 6 octets. | |||
precise and robust synchronization is maintained. This payload format | If, however, samples are 10 bit values, the resulting 60 bit group is | |||
requires that applications using such synchronized ancillary data MUST | not octet aligned. To be both octet aligned and appropriately | |||
deliver it in separate RTP sessions which operate concurrently with the | framed, two groups of 4 adjacent pixels must be collected, thereby | |||
video session. The normal RTP mechanisms SHOULD be used to synchronize | becoming octet aligned on a 15 octet boundary. This length is | |||
the media. | referred to as the pixel group size ("pgroup"). | |||
8 bit words | Formally, the "pgroup" parameter is the size in octets of the | |||
Color ---------------------------------------- | smallest grouping of pixels such that 1) the grouping comprises an | |||
Subsampling Pixels #words octet alignment #samples pgroup | integer number of octets; and 2) if color sub-sampling is used, | |||
octets | samples are only shared within the grouping. When packetizing digital | |||
+-----------+------+---+ +------+---------------+---------------+ | active line content, video data MUST NOT be fragmented within a | |||
|monochrome | 1 |P/I| | 1x8 | 8/8 = 1 | 1 | 1 | | pgroup. | |||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:1:1 | 4 |P/I| | 6x8 | 6x8/8 = 6 | 6 | 6 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:2:0 | 4 | P | | 6x8 | 6x8/8 = 6 | 6 | 6 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:2:0 | 4 | I | | 4x8 | 4x8/8 = 4 | 4 | 4 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:2:2 | 2 |P/I| | 4x8 | 4x8/8 = 4 | 4 | 4 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:4:4 | 1 |P/I| | 3x8 | 3x8/8 = 3 | 3 | 3 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:4:4:4 | 1 |P/I| | 4x8 | 4x8/8 = 4 | 4 | 4 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
Table 1: pgroup values for 8 bit sampling | ||||
10 bit words | ||||
Color ---------------------------------------- | ||||
Subsampling Pixels #words octet alignment #samples pgroup | ||||
octets | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
|monochrome | 4 |P/I| | 4x10 | 40/8 = 5 | 4 | 5 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:1:1 | 4 |P/I| | 6x10 | 2x60/8 = 15 | 12 | 15 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:2:0 | 4 | P | | 6x10 | 2x60/8 = 15 | 12 | 15 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:2:0 | 4 | I | | 4x10 | 40/8 = 5 | 4 | 5 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:2:2 | 2 |P/I| | 4x10 | 40/8 = 5 | 4 | 5 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:4:4 | 1 |P/I| | 3x10 | 4x30/8 = 15 | 12 | 15 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
| 4:4:4:4 | 1 |P/I| | 4x10 | 40/8 = 5 | 4 | 5 | | ||||
+-----------+------+---+ +------+---------------+---------------+ | ||||
Table 2: pgroup values for 10 bit sampling | ||||
12 bit words | Video content is almost always associated with additional information | |||
Color ---------------------------------------- | such as audio tracks, time code, etc. In professional digital video | |||
Subsampling Pixels #words octet alignment #samples pgroup | applications this data is commonly embedded in non-active portions of | |||
octets | the video stream (horizontal and vertical blanking periods) so that | |||
+-----------+------+---+ +------+---------------+---------------+ | precise and robust synchronization is maintained. This payload format | |||
|monochrome | 2 |P/I| | 2x12 | 2x12/8 = 3 | 2 | 3 | | requires that applications using such synchronized ancillary data | |||
+-----------+------+---+ +------+---------------+-------+-------+ | MUST deliver it in separate RTP sessions which operate concurrently | |||
| 4:1:1 | 4 |P/I| | 6x12 | 72/8 = 9 | 6 | 9 | | with the video session. The normal RTP mechanisms SHOULD be used to | |||
+-----------+------+---+ +------+---------------+-------+-------+ | synchronize the media. | |||
| 4:2:0 | 4 | P | | 6x12 | 72/8 = 9 | 6 | 9 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:2:0 | 4 | I | | 4x12 | 48/8 = 6 | 4 | 6 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:2:2 | 2 |P/I| | 4x12 | 48/8 = 6 | 4 | 6 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:4:4 | 2 |P/I| | 6x12 | 2x36/8 = 9 | 6 | 9 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:4:4:4 | 1 |P/I| | 4x12 | 48/8 = 6 | 4 | 6 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
Table 3: pgroup values for 12 bit sampling | ||||
16 bit words | ||||
Color -------------------------------------- | ||||
Subsampling Pixels #words octet alignment samples pgroup | ||||
octets | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
|monochrome | 1 |P/I| | 1x16 | 16/8 = 2 | 1 | 2 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:1:1 | 4 |P/I| | 6x16 | 6x16/8 = 12 | 6 | 12 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:2:0 | 4 | P | | 6x16 | 6x16/8 = 12 | 6 | 12 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:2:0 | 4 | I | | 4x16 | 4x16/8 = 8 | 4 | 8 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:2:2 | 2 |P/I| | 4x16 | 4x16/8 = 8 | 4 | 8 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:4:4 | 1 |P/I| | 3x16 | 3x16/8 = 6 | 3 | 6 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
| 4:4:4:4 | 1 |P/I| | 4x16 | 4x16/8 = 8 | 4 | 8 | | ||||
+-----------+------+---+ +------+---------------+-------+-------+ | ||||
Table 4: pgroup values for 16 bit sampling | ||||
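The pgroup values in the tables follow from the formal definition: take the smallest set of pixels that shares samples, then repeat it until the total bit count is a whole number of octets. The sketch below is one way to compute this, assuming the caller supplies the number of samples in that smallest sharing set (for example 6 for 4:1:1, 4 for 4:2:2, 3 for 4:4:4). The function name pgroup_octets is hypothetical.

```python
from math import gcd


def pgroup_octets(samples_per_group, bits_per_sample):
    """Size in octets of the smallest octet aligned repetition of a group."""
    group_bits = samples_per_group * bits_per_sample
    # Least common multiple of the group size in bits and 8 bits per octet.
    aligned_bits = group_bits * 8 // gcd(group_bits, 8)
    return aligned_bits // 8


# 4:1:1 with 10 bit samples: 60 bits per group, two groups give 15 octets.
print(pgroup_octets(6, 10))
```

For 8 and 16 bit samples the group is already octet aligned, so the result is simply the sample count times the sample size in octets.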
4. RTP Packetization | 4. RTP Packetization | |||
The standard RTP header is followed by a 4 octet payload header that | The standard RTP header is followed by a 2 octet payload header that | |||
extends the RTP Sequence Number, and by a 6 octet payload header for | extends the RTP Sequence Number, and by a 6 octet payload header for | |||
each line (or partial line) of video included. One or more lines, or | each line (or partial line) of video included. One or more lines, or | |||
partial lines, of video data follow. This format makes the payload | partial lines, of video data follow. This format makes the payload | |||
header 32 bit aligned in the common case, where one scan line (fragment) | header 32 bit aligned in the common case, where one scan line (or | |||
of video is included in each RTP packet. | fragment) of video is included in each RTP packet. | |||
For example, if two lines of video are encapsulated, the payload format | For example, if two lines of video are encapsulated, the payload | |||
will be as shown in Figure 1. | format will be as shown in Figure 1. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| V |P|X| CC |M| PT | Sequence Number | | | V |P|X| CC |M| PT | Sequence Number | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Time Stamp | | | Time Stamp | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| SSRC | | | SSRC | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Extended Sequence Number | Length | | | Extended Sequence Number | Length | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
|F| Scan Line No |C| Scan Offset | | |F| Line No |C| Offset | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Length |F| Scan Line No | | | Length |F| Line No | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
|C| Scan Offset | . | |C| Offset | . | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . | |||
. . | . . | |||
. Two (partial) lines of video data . | . Two (partial) lines of video data . | |||
. . | . . | |||
+---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
Figure 1: RTP Payload Format showing two (partial) lines of video | Figure 1: RTP Payload Format showing two (partial) lines of video | |||
4.1. The RTP Header | 4.1. The RTP Header | |||
The fields of the fixed RTP header have their usual meaning, with the | The fields of the fixed RTP header have their usual meaning, with the | |||
following additional notes: | following additional notes: | |||
Payload Type (PT): 7 bits | Payload Type (PT): 7 bits | |||
A dynamically allocated payload type field which designates the | A dynamically allocated payload type field which designates the | |||
payload as uncompressed video. | payload as uncompressed video. | |||
Timestamp: 32 bits | Timestamp: 32 bits | |||
For progressive scan video, the timestamp denotes the sampling | For progressive scan video, the timestamp denotes the sampling | |||
instant of the frame to which the RTP packet belongs. Packets MUST | instant of the frame to which the RTP packet belongs. Packets MUST | |||
NOT include data from multiple frames, and all packets belonging to | NOT include data from multiple frames, and all packets belonging to | |||
the same frame MUST have the same timestamp. | the same frame MUST have the same timestamp. | |||
For interlaced video, the timestamp denotes the sampling instant of | For interlaced video, the timestamp denotes the sampling instant of | |||
the field to which the RTP packet belongs. Packets MUST NOT | the field to which the RTP packet belongs. Packets MUST NOT | |||
include data from multiple fields, and all packets belonging to the | include data from multiple fields, and all packets belonging to the | |||
same field MUST have the same timestamp. Use of field timestamps, | same field MUST have the same timestamp. Use of field timestamps, | |||
rather than a frame timestamp and field indicator bit, is | rather than a frame timestamp and field indicator bit, is | |||
needed to support reverse 3-2 pulldown. | needed to support reverse 3-2 pulldown. | |||
A 90 kHz timestamp MUST be used in both cases. If the sampling | A 90 kHz timestamp MUST be used in both cases. If the sampling | |||
instant does not correspond to an integer value of the clock (as | instant does not correspond to an integer value of the clock (as | |||
may be the case when interleaving), the value SHALL be truncated to | may be the case when interleaving), the value SHALL be truncated to | |||
the next lowest integer. | the next lowest integer. | |||
Marker bit (M): 1 bit | Marker bit (M): 1 bit | |||
The Marker bit denotes the end of a video frame, and MUST be set to | The Marker bit denotes the end of a video frame, and MUST be set to | |||
1 for the last packet of the video frame. It MUST be set to 0 for | 1 for the last packet of the video frame. It MUST be set to 0 for | |||
other packets. | other packets. | |||
Sequence Number: 16 bits | Sequence Number: 16 bits | |||
The low order bits for RTP sequence number. The standard 16 bit | The low order bits for RTP sequence number. The standard 16 bit | |||
sequence number is augmented with another 16 bits in the payload | sequence number is augmented with another 16 bits in the payload | |||
header, in order to avoid problems due to wrap-around when operating | header, in order to avoid problems due to wrap-around when operating | |||
at high data rates. | at high data rates. | |||
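The timestamp rule above (90 kHz clock, truncated to the next lowest integer) can be illustrated as follows. This is a sketch under stated assumptions: the function name rtp_timestamp, the index and rate parameters, and the initial offset parameter are all hypothetical; exact rational arithmetic is used so that rates such as 60000/1001 do not accumulate rounding error.

```python
from fractions import Fraction
from math import floor

RTP_CLOCK_HZ = 90000  # clock rate required by the text above


def rtp_timestamp(index, rate, initial=0):
    """Timestamp of frame (or field) number index at the given rate in Hz.

    The sampling instant is truncated to an integer clock value and the
    result wraps at 32 bits. 'initial' stands in for the stream's starting
    timestamp and is an assumption of this sketch.
    """
    ticks = Fraction(index) * RTP_CLOCK_HZ / Fraction(rate)
    return (initial + floor(ticks)) % 2**32


# At 60000/1001 Hz each step is 1501.5 ticks, so truncation is visible.
rate = Fraction(60000, 1001)
print([rtp_timestamp(n, rate) for n in range(4)])
```

All packets of one frame (or one field, for interlaced video) would carry the same value returned for that index.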
4.2. Payload Header | 4.2. Payload Header | |||
Extended Sequence Number : 16 bits | Extended Sequence Number : 16 bits | |||
The high order bits of the extended 32 bit sequence number, in | The high order bits of the extended 32 bit sequence number, in | |||
network byte order. | network byte order. | |||
Scan Line No : 15 bits | Length: 16 bits | |||
Number of octets of data included from this scan line, in network | ||||
byte order. This MUST be a multiple of the pgroup value. | ||||
Line No : 15 bits | ||||
Scan line number of encapsulated data, in network byte order. | Scan line number of encapsulated data, in network byte order. | |||
Successive RTP packets MAY contain parts of the same scan line | Successive RTP packets MAY contain parts of the same scan line | |||
(with an incremented RTP sequence number, but the same timestamp), | (with an incremented RTP sequence number, but the same timestamp), | |||
if it is necessary to fragment a line. | if it is necessary to fragment a line. | |||
Scan Offset : 15 bits | Offset : 15 bits | |||
Scan offset of the first sample in the payload data. If YCrCb | ||||
format data is being transported, this is the offset of the co- | ||||
sited luminance sample and if RGB format data is being transported | ||||
it is the offset of the red sample. The value is in network byte | ||||
order, and the offset has a value of zero if the first sample in | ||||
the payload corresponds to the start of the line. | ||||
Length: 16 bits | ||||
Number of octets of data included from this scan line, in network | Offset of the first pixel of the payload data within the scan line. | |||
byte order. This MUST be a multiple of the pgroup value. | If YCbCr format data is being transported, this is the pixel offset | |||
of the co-sited luminance sample; if RGB format data is being | ||||
transported it is the pixel offset of the red sample. The value is | ||||
in network byte order. The offset has a value of zero if the first | ||||
sample in the payload corresponds to the start of the line, and | ||||
increments by one for each pixel. | ||||
Field Identification (F): 1 bit | Field Identification (F): 1 bit | |||
Identifies which field the scan line belongs to, for interlaced | Identifies which field the scan line belongs to, for interlaced | |||
data. F=0 identifies the first field and F=1 the second field. | data. F=0 identifies the first field and F=1 the second field. | |||
For progressive scan data (e.g. SMPTE 296M format video), F MUST | For progressive scan data (e.g. SMPTE 296M format video), F MUST | |||
always be set to zero. | always be set to zero. | |||
Continuation (more lines) bit (C): 1 bit | Continuation (C): 1 bit | |||
Determines if an additional scan line header follows the current | Determines if an additional scan line header follows the current | |||
scan line header in the RTP packet. Set to 1 if an additional | scan line header in the RTP packet. Set to 1 if an additional | |||
header follows, implying that the RTP packet is carrying data for | header follows, implying that the RTP packet is carrying data for | |||
more than one scan line. Set to 0 otherwise. An unlimited number | more than one scan line. Set to 0 otherwise. An unlimited number | |||
of scan lines MAY be included, up to the path MTU limit. The only | of scan lines MAY be included, up to the path MTU limit. The only | |||
way to determine the number of scan lines included per packet is to | way to determine the number of scan lines included per packet is to | |||
parse the payload headers. | parse the payload headers. | |||
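The field layout described above can be parsed as sketched below. This is an illustration, not a reference implementation: it assumes the layout drawn in Figure 1 (a 2 octet Extended Sequence Number followed by one 6 octet header per scan line, in the order Length, F plus Line No, C plus Offset), and the function name parse_payload_header and the dictionary keys are hypothetical.

```python
import struct


def parse_payload_header(payload):
    """Parse the payload header that follows the fixed RTP header.

    Returns (ext_seq, lines, data_start), where lines is a list of dicts
    and data_start is the index of the first octet of video data.
    """
    if len(payload) < 2:
        raise ValueError("truncated payload header")
    (ext_seq,) = struct.unpack_from("!H", payload, 0)
    pos = 2
    lines = []
    while True:
        if len(payload) < pos + 6:
            raise ValueError("truncated scan line header")
        length, word1, word2 = struct.unpack_from("!HHH", payload, pos)
        pos += 6
        lines.append({
            "length": length,           # octets of data for this line
            "field": word1 >> 15,       # F bit
            "line_no": word1 & 0x7FFF,  # 15 bit scan line number
            "offset": word2 & 0x7FFF,   # 15 bit pixel offset
        })
        if not (word2 >> 15):           # C bit clear: last line header
            break
    return ext_seq, lines, pos
```

The full 32 bit sequence number would then be (ext_seq << 16) | seq, where seq is the 16 bit value from the fixed RTP header. Because the number of line headers is not signalled anywhere else, the loop over the C bit is the only way to find where the video data starts.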
4.3. Payload Data | 4.3. Payload Data | |||
Depending on the video format, each RTP packet can include either a | Depending on the video format, each RTP packet can include either a | |||
single complete scan line, a single fragment of a scan line, or one (or | single complete scan line, a single fragment of a scan line, or one | |||
more) complete scan lines plus a fragment of a scan line. Every scan | (or more) complete scan lines and scan line fragments. The length of | |||
line or scan line fragment MUST begin at an octet boundary in the | each scan line or scan line fragment MUST be an integer multiple of | |||
payload data. Scan lines SHOULD be fragmented so that the resulting RTP | the pgroup size in octets. Scan lines SHOULD be fragmented so that | |||
packet is smaller than the path MTU. | the resulting RTP packet is smaller than the path MTU. | |||
It is possible that the scan line length is not evenly divisible by the | It is possible that the scan line length is not evenly divisible by | |||
number of pixels in a pgroup, so the final pixel data of a scan line | the number of pixels in a pgroup, so the final pixel data of a scan | |||
does not align to either an octet or pgroup boundary. Nonetheless the | line does not align to either an octet or pgroup boundary. | |||
payload MUST contain a whole number of pgroups; the sender MUST fill the | Nonetheless the payload MUST contain a whole number of pgroups; the | |||
remaining bits of the final pgroup with zero and the receiver MUST | sender MUST fill the remaining bits of the final pgroup with zero and | |||
ignore the fill data. (In effect, the trailing edge of the image is | the receiver MUST ignore the fill data. (In effect, the trailing edge | |||
black-filled to a pgroup boundary.) | of the image is black-filled to a pgroup boundary.) | |||
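As an illustration of the whole-pgroup rule, the following sketch computes how many octets one complete scan line occupies and how many pixel positions in the final pgroup are fill. The function name and parameters are hypothetical; the pgroup geometry (pixels per pgroup, octets per pgroup) is supplied by the caller.

```python
def scan_line_octets(width, pixels_per_pgroup, pgroup_octets):
    """Return (octets, fill_pixels) for one complete scan line.

    The line is rounded up to a whole number of pgroups; fill_pixels is
    the number of pixel positions in the last pgroup that carry zero fill.
    """
    pgroups = -(-width // pixels_per_pgroup)   # ceiling division
    fill_pixels = pgroups * pixels_per_pgroup - width
    return pgroups * pgroup_octets, fill_pixels
```

For example, with a pgroup of 2 pixels in 5 octets, a 1280 pixel line needs 640 pgroups and no fill, while a 1281 pixel line needs 641 pgroups with one pixel position of fill.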
If the video is in YUV format, the packing of samples into the payload | For RGB format video, samples are packed in order Red-Green-Blue. For | |||
depends on the color sub-sampling used. For RGB format video, there is a | BGR format video, samples are packed in order Blue-Green-Red. For | |||
single packing scheme. | both formats, if 8 bit samples are used, the pgroup is 3 octets. If | |||
10 bit samples are used, samples from 4 adjacent pixels form 15 octet | ||||
pgroups. If 12 bit samples are used, samples from 2 adjacent pixels | ||||
form 9 octet pgroups. If 16 bit samples are used, each pixel forms a | ||||
separate 6 octet pgroup. | ||||
For RGB format video, samples are packed in order Red-Green-Blue. All | For RGBA format video, samples are packed in order Red-Green-Blue- | |||
samples are the same bit size, which may be 8, 10, 12, or 16 bits. If 8 | Alpha. For 8, 10, 12, or 16 bit samples, each pixel forms its own | |||
bit samples are used, the pgroup is 3 octets. If 10 bit samples are | pgroup, with octet sizes of 4, 5, 6 and 8 respectively. | |||
used, samples from adjacent pixels are packed with no padding, and the | ||||
pgroup is 15 octets (4 pixels). Refer to Tables 1 thru 4. | ||||
For RGBA format video, samples are packed in order Red-Green-Blue-Alpha. | If the video is in YCbCr format, the packing of samples into the | |||
All samples are the same bit size, which may be 8, 10, 12, or 16 bits. | payload depends on the color sub-sampling used. | |||
For pgroups refer to Tables 1 thru 4. | ||||
For YUV 4:4:4 format video, samples are packed in order Cb-Y-Cr for both | For YCbCr 4:4:4 format video, samples are packed in order Cb-Y-Cr for | |||
interlaced and progressive frames. Each sample is either an 8, 10, 12 or | both interlaced and progressive frames. If 8 bit samples are used, | |||
16 bit value. For relevant pgroups refer to Tables 1 to 4. | the pgroup is 3 octets. If 10 bit samples are used, samples from 4 | |||
adjacent pixels form 15 octet pgroups. If 12 bit samples are used, | ||||
samples from 2 adjacent pixels form 9 octet pgroups. If 16 bit | ||||
samples are used, each pixel forms a separate 6 octet pgroup. | ||||
For YUV 4:2:2 format video, the Cb and Cr components are horizontally | For YCbCr 4:2:2 format video, the Cb and Cr components are | |||
sub-sampled by a factor of two (each Cb and Cr samples corresponds to | horizontally sub-sampled by a factor of two (each Cb and Cr sample | |||
two Y components). Samples are packed in order Cb0-Y0-Cr0-Y1 for both | corresponds to two Y components). Samples are packed in order | |||
interlaced and progressive scan lines. Samples are either an 8, 10, 12 | Cb0-Y0-Cr0-Y1 for both interlaced and progressive scan lines. For 8, | |||
or 16 bit value. For relevant pgroups refer to Tables 1 to 4. | 10, 12 or 16 bit samples, the pgroup is formed from two adjacent | |||
pixels (4, 5, 6 or 8 octets respectively). | ||||
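The pgroup sizes quoted for RGB, RGBA, 4:4:4 and 4:2:2 video are consistent with packing samples back to back, with no padding, until an octet boundary is reached. The sketch below reproduces those figures under that assumption; the function name and the notion of a "unit" (one pixel for RGB, RGBA and 4:4:4, one two-pixel Cb-Y-Cr-Y group for 4:2:2) are introduced here for illustration only.

```python
from math import gcd

def pgroup_size(samples_per_unit, depth):
    """Return (units, octets) for the smallest octet-aligned group.

    samples_per_unit: samples in the smallest repeating unit
                      (3 for RGB or 4:4:4, 4 for RGBA or 4:2:2).
    depth:            bits per sample.
    """
    bits = samples_per_unit * depth
    units = 8 // gcd(bits, 8)      # units needed to reach an octet boundary
    return units, units * bits // 8
```

With 10 bit RGB, one pixel is 30 bits, so four pixels (120 bits) are needed to reach an octet boundary, giving the 15 octet pgroup. With 10 bit 4:2:2, one Cb0-Y0-Cr0-Y1 unit is already 40 bits, giving 5 octets.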
For YUV 4:1:1 format video, the Cb and Cr components are horizontally | For YCbCr 4:1:1 format video, the Cb and Cr components are | |||
sub-sampled by a factor of four (each Cb and Cr sample corresponds to | horizontally sub-sampled by a factor of four (each Cb and Cr sample | |||
four Y components). Samples are packed in order Cb0-Y0-Y1-Cr0-Y2-Y3 for | corresponds to four Y components). Samples are packed in order | |||
both interlaced and progressive scan lines. Samples are either an 8, 10, | Cb0-Y0-Y1-Cr0-Y2-Y3 for both interlaced and progressive scan lines. | |||
12 or 16 bit value. For relevant pgroups refer to Tables 1 to 4. | For 8, 10, 12 or 16 bit samples, the pgroup is formed from four | |||
adjacent pixels (6, 15, 9 or 12 octets respectively). | ||||
For YUV 4:2:0 video, the Cb and Cr components are sub-sampled by a | For YCbCr 4:2:0 video, the Cb and Cr components are sub-sampled by a | |||
factor of two both horizontally and vertically. Therefore chrominance | factor of two both horizontally and vertically. Therefore chrominance | |||
values are shared between certain adjacent lines. Figure 2 illustrates | samples are shared between certain adjacent lines. Figure 2 shows | |||
the composition of luminance and chrominance values for 6x6 pixel grid | the composition of luminance and chrominance samples for a 6x6 pixel | |||
in 4:2:0 YUV video. | grid of 4:2:0 YCbCr video. The pixel group is a group of four pixels | |||
arranged in a 2x2 matrix. The octet size of the pgroup for | ||||
progressive scan 4:2:0 video with sample sizes of 8, 10, 12 and 16 | ||||
bits is 6, 5, 9 and 12 octets respectively. For interlaced 4:2:0 | ||||
video the corresponding pgroups are 4, 5, 6 and 8 octets. | ||||
line 0: Y00 Y01 Y02 Y03 Y04 Y05 | line 0: Y00 Y01 Y02 Y03 Y04 Y05 | |||
Cb00 Cr00 Cb01 Cr01 Cb02 Cr02 | Cb00 Cr00 Cb01 Cr01 Cb02 Cr02 | |||
line 1: Y10 Y11 Y12 Y13 Y14 Y15 | line 1: Y10 Y11 Y12 Y13 Y14 Y15 | |||
line 2: Y20 Y21 Y22 Y23 Y24 Y25 | line 2: Y20 Y21 Y22 Y23 Y24 Y25 | |||
Cb10 Cr10 Cb11 Cr11 Cb12 Cr12 | Cb10 Cr10 Cb11 Cr11 Cb12 Cr12 | |||
line 3: Y30 Y31 Y32 Y33 Y34 Y35 | line 3: Y30 Y31 Y32 Y33 Y34 Y35 | |||
line 4: Y40 Y41 Y42 Y43 Y44 Y45 | line 4: Y40 Y41 Y42 Y43 Y44 Y45 | |||
Cb20 Cr20 Cb21 Cr21 Cb22 Cr22 | Cb20 Cr20 Cb21 Cr21 Cb22 Cr22 | |||
line 5: Y50 Y51 Y52 Y53 Y54 Y55 | line 5: Y50 Y51 Y52 Y53 Y54 Y55 | |||
Figure 2: Chrominance and luminance compostion in 4:2:0 YUV video. | Figure 2: Chrominance/luminance composition in 4:2:0 YCbCr video | |||
Transport of progressive scan 4:2:0 YUV video entails the transport of | When packetizing progressive scan 4:2:0 YCbCr video, samples from two | |||
two scan lines together such that: | consecutive scan lines are included in each packet. The scan line | |||
number in the payload header is set to that of the first scan line of | ||||
the pair: | ||||
line 0,1: | line 0/1: | |||
Y00-Y01-Y10-Y11-Cb00-Cr00 Y02-Y03-Y12-Y13-Cb01-Cr01 | Y00-Y01-Y10-Y11-Cb00-Cr00 Y02-Y03-Y12-Y13-Cb01-Cr01 | |||
Y04-Y05-Y14-Y15-Cb02-Cr02 | Y04-Y05-Y14-Y15-Cb02-Cr02 | |||
line 2,3: | line 2/3: | |||
Y20-Y21-Y30-Y31-Cb10-Cr10 Y22-Y23-Y32-Y33-Cb11-Cr11 | Y20-Y21-Y30-Y31-Cb10-Cr10 Y22-Y23-Y32-Y33-Cb11-Cr11 | |||
Y24-Y25-Y34-Y35-Cb12-Cr12 | Y24-Y25-Y34-Y35-Cb12-Cr12 | |||
line 4,5: | line 4/5: | |||
Y40-Y41-Y50-Y51-Cb20-Cr20 Y42-Y43-Y52-Y53-Cb21-Cr21 | Y40-Y41-Y50-Y51-Cb20-Cr20 Y42-Y43-Y52-Y53-Cb21-Cr21 | |||
Y44-Y45-Y54-Y55-Cb22-Cr22 | Y44-Y45-Y54-Y55-Cb22-Cr22 | |||
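The sample ordering listed above for progressive scan 4:2:0 can be generated mechanically. The sketch below is illustrative only: it emits sample labels in the same notation as Figure 2 (Y followed by line and column, Cb/Cr followed by chroma row and column) rather than real sample data, and the function name is hypothetical.

```python
def pack_420_progressive(first_line, width):
    """Return the pgroups (as lists of labels) for a pair of scan lines.

    first_line must be even; each pgroup covers a 2x2 block of pixels
    and carries four Y samples followed by the shared Cb and Cr samples.
    """
    if first_line % 2 or width % 2:
        raise ValueError("first_line and width must be even")
    row = first_line // 2              # chroma row shared by both lines
    pgroups = []
    for x in range(0, width, 2):
        col = x // 2                   # chroma column for this 2x2 block
        pgroups.append([
            f"Y{first_line}{x}", f"Y{first_line}{x + 1}",
            f"Y{first_line + 1}{x}", f"Y{first_line + 1}{x + 1}",
            f"Cb{row}{col}", f"Cr{row}{col}",
        ])
    return pgroups
```

Joining each pgroup with hyphens for first_line values 0, 2 and 4 and a width of 6 reproduces the three listings above.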
For interlaced transportm chrominance values are transported with every | For interlaced transport chrominance samples are transported with | |||
other line: | every other line: | |||
field 0: | field 0: | |||
line 0: Y00-Y01-Cb00-Cr00 Y02-Y03-Cb01-Cr01 Y04-Y05-Cb02-Cr02 | line 0: Y00-Y01-Cb00-Cr00 Y02-Y03-Cb01-Cr01 Y04-Y05-Cb02-Cr02 | |||
line 2: Y20-Y21 Y22-Y23 Y24-Y25 | line 2: Y20-Y21 Y22-Y23 Y24-Y25 | |||
line 4: Y40-Y41-Cb20-Cr20 Y42-Y43-Cb21-Cr21 Y44-Y45-Cb22-Cr22 | line 4: Y40-Y41-Cb20-Cr20 Y42-Y43-Cb21-Cr21 Y44-Y45-Cb22-Cr22 | |||
field 1: | field 1: | |||
line 1: Y10-Y11 Y12-Y13 Y14-Y15 | line 1: Y10-Y11 Y12-Y13 Y14-Y15 | |||
line 3: Y30-Y31-Cb10-Cr10 Y32-Y33-Cb11-Cr11 Y34-Y35-Cb12-Cr12 | line 3: Y30-Y31-Cb10-Cr10 Y32-Y33-Cb11-Cr11 Y34-Y35-Cb12-Cr12 | |||
line 5: Y50-Y51 Y52-Y53 Y54-Y55 | line 5: Y50-Y51 Y52-Y53 Y54-Y55 | |||
5. RTCP Considerations | 5. RTCP Considerations | |||
RTCP SHOULD be used as specified in RFC1889 [RTP], which specifies two | RTCP SHOULD be used as specified in RFC1889 [RTP], which specifies | |||
limits on the RTCP packet rate: RTCP bandwidth should be limited to 5% | two limits on the RTCP packet rate: RTCP bandwidth should be limited | |||
of the data rate, and the minimum for the average of the randomized | to 5% of the data rate, and the minimum for the average of the | |||
intervals between RTCP packets should be 5 seconds. Considering the | randomized intervals between RTCP packets should be 5 seconds. | |||
high data rate of many uncompressed video formats, the minimum interval | Considering the high data rate of many uncompressed video formats, | |||
is the governing factor in many cases. | the minimum interval is the governing factor in many cases. | |||
It should be noted that the sender's octet count in SR packets and the | It should be noted that the sender's octet count in SR packets and | |||
cumulative number of packets lost will wrap around quickly for high | the cumulative number of packets lost will wrap around quickly for | |||
data rate streams. This means these two fields may not accurately | high data rate streams. This means these two fields may not | |||
represent octet count and number of packets lost since the beginning of | accurately represent octet count and number of packets lost since the | |||
transmission, as defined in RFC 1889. Therefore for network monitoring | beginning of transmission, as defined in RFC 1889. Therefore for | |||
purposes other means of keeping track of these variables SHOULD be used. | network monitoring purposes other means of keeping track of these | |||
variables SHOULD be used. | ||||
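To give a sense of how quickly the wrap-around happens, the following rough calculation estimates the time for the 32 bit sender's octet count to wrap. It treats the quoted stream rate as the payload rate, which is an approximation, and the function name is hypothetical.

```python
def octet_count_wrap_seconds(bits_per_second):
    """Seconds until a 32 bit octet counter wraps at the given rate."""
    return (2 ** 32 * 8) / bits_per_second
```

Under this approximation a 1 Gbps stream wraps the counter in roughly 34 seconds and a 270 Mbps stream in a little over two minutes, so a monitor that samples less often than that cannot tell how many times the counter has wrapped.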
6. IANA Considerations | 6. IANA Considerations | |||
6.1. MIME type registration | 6.1. MIME type registration | |||
MIME media type name: video | MIME media type name: video | |||
MIME subtype name: raw | MIME subtype name: raw | |||
Required parameters: | Required parameters: | |||
rate: The RTP timestamp clock rate. Applications using this payload | rate: The RTP timestamp clock rate. Applications using this payload | |||
format MUST be 90000 for this format. | format MUST use a value of 90000. | |||
pgroup: The number of octets per the pixel group. See section 3 of | ||||
RFC XXXX. | ||||
color-mode: Determines the color mode of the video stream. Valid | color-mode: Determines the color mode of the video stream. | |||
values for this parameter are: RGB, RGBA, and YUV. | Currently defined values are: RGB, RGBA, and YCbCr. New values may | |||
be registered as described in section 6.2 of RFC XXXX. | ||||
sub-sampling: Determines the type of color sub-sampling of the | sub-sampling: Determines the type of color sub-sampling of the | |||
video stream. Valid values are: mono, 4:1:1, 4:2:0, 4:2:2, 4:4:4 | video stream. Currently defined values are: mono, 4:1:1, 4:2:0, | |||
and 4:4:4:4. | 4:2:2, and 4:4:4. New values may be registered as described in | |||
section 6.2 of RFC XXXX. | ||||
width: Determines the number of pixels per line. This is an integer | width: Determines the number of pixels per line. This is an integer | |||
between 1 and 32767. | between 1 and 32767. | |||
height: Determines the number of lines per frame. This is an | height: Determines the number of lines per frame. This is an | |||
integer between 1 and 32767. | integer between 1 and 32767. | |||
depth: Determines the number of bits per sample. This is a decimal | depth: Determines the number of bits per sample. This is an | |||
integer; typical values include 8, 10, 12, and 16. | integer with typical values including 8, 10, 12, and 16. | |||
colorimetry: This parameter defines the set of colorimetric | colorimetry: This parameter defines the set of colorimetric | |||
specfications and other transfer characteristics for the video | specifications and other transfer characteristics for the video | |||
source, by reference to an external specification. Valid values and | source, by reference to an external specification. Valid values and | |||
their specification are: | their specification are: | |||
BT601-5 ITU Recommendation BT.601-5 [601] | BT601-5 ITU Recommendation BT.601-5 [601] | |||
BT709-2 ITU Recommendation BT.709-2 [709] | BT709-2 ITU Recommendation BT.709-2 [709] | |||
SMPTE240M SMPTE standard 240M [240M] | SMPTE240M SMPTE standard 240M [240] | |||
NTSC The NTSC specification [NTSC] | ||||
PAL The PAL specificaiton [PAL] | ||||
New values may be registered as described in section 6.2 of RFC | New values may be registered as described in section 6.2 of RFC | |||
XXXX. | XXXX. | |||
Optional parameters: | Optional parameters: | |||
Interlace: If this optional parameter is present it indicates that | Interlace: If this OPTIONAL parameter is present, it indicates that | |||
the video stream is interlaced. If absent, progressive scan is | the video stream is interlaced. If absent, progressive scan is | |||
implied. | implied. | |||
Encoding considerations: Uncompressed video can be transmitted with RTP | Encoding considerations: | |||
as specified in RFC XXXX. No file format is defined at this time. | ||||
Security considerations: See section 9 of RFC XXXX. | Uncompressed video can be transmitted with RTP as specified in RFC | |||
XXXX. No file format is defined at this time. | ||||
Interoperability considerations: NONE. | Security considerations: See section 9 of RFC XXXX. | |||
Published specification: RFC XXXX. | Interoperability considerations: NONE. | |||
Applications which use this media type: Video communication. | Published specification: RFC XXXX. | |||
Additional information: None | Applications which use this media type: Video communication. | |||
Magic number(s): None | Additional information: None | |||
File extension(s): None | Magic number(s): None | |||
Macintosh File Type Code(s): None | File extension(s): None | |||
Person & email address to contact for further information: | Macintosh File Type Code(s): None | |||
Person & email address to contact for further information: | ||||
Ladan Gharai <ladan@isi.edu> | Ladan Gharai <ladan@isi.edu> | |||
IETF Audio/Video Transport working group. | IETF Audio/Video Transport working group. | |||
Intended usage: COMMON | Intended usage: COMMON | |||
Author/Change controller: Ladan Gharai <ladan@isi.edu> | Author/Change controller: Ladan Gharai <ladan@isi.edu> | |||
6.2. Parameter Registration | 6.2. Parameter Registration | |||
New values of the "colorimetry" parameter MAY be registered with the | New values of the "sampling" parameter MAY be registered with the | |||
IANA provided they reference an RFC or other permanent and readily | IANA provided they reference an RFC or other permanent and readily | |||
available specification (the Specification Required policy of RFC 2434 | available specification (the Specification Required policy of RFC | |||
[2434]). | 2434 [2434]). A new registration MUST define the packing order of | |||
samples and the valid combinations of color and sub-sampling modes. | ||||
New values of the "colorimetry" parameter MAY be registered with the | ||||
IANA provided they reference an RFC or other permanent and readily | ||||
available specification of colorimetric parameters and other | ||||
applicable transfer characteristics (the Specification Required | ||||
policy of RFC 2434 [2434]). | ||||
7. Mapping to SDP Parameters | 7. Mapping to SDP Parameters | |||
Parameters are mapped to SDP [SDP] as in the following example: | Parameters are mapped to SDP [SDP] as in the following example: | |||
m=video 30000 RTP/AVP 112 | m=video 30000 RTP/AVP 112 | |||
a=rtpmap:112 raw/90000 | a=rtpmap:112 raw/90000 | |||
a=fmtp:112 color-mode=YUV; sub-sampling=4:2:2; width=1280; height=720; | a=fmtp:112 sampling=YUV-4:2:2; width=1280; height=720; depth=10; | |||
depth=10; colorimetry=BT.709-2 | colorimetry=BT.709-2 | |||
In this example, a dynamic payload type 111 is used for uncompressed | In this example, a dynamic payload type 112 is used for uncompressed | |||
video. The RTP sampling clock is 90kHz. Note that the "a=fmtp:" line | video. The RTP sampling clock is 90kHz. Note that the "a=fmtp:" line | |||
has been wrapped to fit this page, and will be a single long line in the | has been wrapped to fit this page, and will be a single long line in | |||
SDP file. | the SDP file. | |||
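A receiver has to split the unwrapped "a=fmtp:" line back into individual parameters. The sketch below shows one simple way to do so for semicolon-separated name=value pairs like those in the example; it is not a full SDP parser, performs no validation of the values, and the function name is hypothetical.

```python
def parse_fmtp(line):
    """Split an unwrapped a=fmtp: line into (payload_type, parameters)."""
    prefix, _, params = line.partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an fmtp attribute")
    payload_type = int(prefix[len("a=fmtp:"):])
    result = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        result[name.strip()] = value.strip()
    return payload_type, result
```

Applied to the newer form of the example, joined into a single line, this yields payload type 112 and a dictionary with the keys sampling, width, height, depth and colorimetry, all with string values.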
8. Security Considerations | 8. Security Considerations | |||
RTP packets using the payload format defined in this specification are | RTP packets using the payload format defined in this specification | |||
subject to the security considerations discussed in the RTP | are subject to the security considerations discussed in the RTP | |||
specification, and any appropriate RTP profile. This implies that | specification, and any appropriate RTP profile. This implies that | |||
confidentiality of the media streams is achieved by encryption. | confidentiality of the media streams is achieved by encryption. | |||
This payload type does not exhibit any significant non-uniformity in the | This payload type does not exhibit any significant non-uniformity in | |||
receiver side computational complexity for packet processing to cause a | the receiver side computational complexity for packet processing to | |||
potential denial-of-service threat. | cause a potential denial-of-service threat. | |||
It is important to note that uncompressed video can have immense | It is important to note that uncompressed video can have immense | |||
bandwidth requirements (up to 270 Mbps for standard definition video, and | bandwidth requirements (up to 270 Mbps for standard definition video, | |||
approximately 1 Gbps for high definition video). This is sufficient to | and approximately 1 Gbps for high definition video). This is | |||
cause potential for denial-of-service if transmitted onto most currently | sufficient to cause potential for denial-of-service if transmitted | |||
available Internet paths. | onto most currently available Internet paths. | |||
Accordingly, if best-effort service is being used, users of this payload | Accordingly, if best-effort service is being used, users of this | |||
format SHOULD monitor packet loss to ensure that the packet loss rate is | payload format SHOULD monitor packet loss to ensure that the packet | |||
within acceptable parameters. Packet loss is considered acceptable if a | loss rate is within acceptable parameters. Packet loss is considered | |||
TCP flow across the same network path, and experiencing the same network | acceptable if a TCP flow across the same network path, and | |||
conditions, would achieve an average throughput, measured on a | experiencing the same network conditions, would achieve an average | |||
reasonable timescale, that is not less than the RTP flow is achieving. | throughput, measured on a reasonable timescale, that is not less than | |||
This condition can be satisfied by implementing congestion control | the RTP flow is achieving. This condition can be satisfied by | |||
mechanisms to adapt the transmission rate (or the number of layers | implementing congestion control mechanisms to adapt the transmission | |||
subscribed for a layered multicast session), or by arranging for a | rate (or the number of layers subscribed for a layered multicast | |||
receiver to leave the session if the loss rate is unacceptably high. | session), or by arranging for a receiver to leave the session if the | |||
loss rate is unacceptably high. | ||||
This payload format may also be used in networks which provide quality | This payload format may also be used in networks which provide | |||
of service guarantees. If enhanced service is being used, receivers | quality of service guarantees. If enhanced service is being used, | |||
SHOULD monitor packet loss to ensure that the service that was requested | receivers SHOULD monitor packet loss to ensure that the service that | |||
is actually being delivered. If it is not, then they SHOULD assume that | was requested is actually being delivered. If it is not, then they | |||
they are receiving best-effort service and behave accordingly. | SHOULD assume that they are receiving best-effort service and behave | |||
accordingly. | ||||
9. Relation to RFC 2431 | 9. Relation to RFC 2431 | |||
In comparison with RFC 2431 this memo specifies support for a wider | In comparison with RFC 2431 this memo specifies support for a wider | |||
variety of uncompressed video, in terms of frame size, color subsampling | variety of uncompressed video, in terms of frame size, color sub- | |||
and sample sizes. While [BT656] can transport up to 4096 scan lines and | sampling and sample sizes. While [BT656] can transport up to 4096 | |||
2048 pixels per line, our payload type can support up to 64k scan lines | scan lines and 2048 pixels per line, our payload type can support up | |||
and pixels per line. Also, RFC 2431 only addresses 4:2:2 YUV data, while | to 64k scan lines and pixels per line. Also, RFC 2431 only addresses | |||
this memo covers YUV and RGB and most common color subsampling schemes. | 4:2:2 YCbCr data, while this memo covers YCbCr and RGB and most | |||
Given the variety of video types that we cover, this memo also assumes | common color sub-sampling schemes. Given the variety of video types | |||
out-of-band signaling for sample size and data types (RFC 2431 uses in | that we cover, this memo also assumes out-of-band signaling for | |||
band signaling). | sample size and data types (RFC 2431 uses in band signaling). | |||
10. Relation to RFC YYYY | 10. Relation to RFC 3497 | |||
(tbd) | RFC 3497 [292RTP] specifies an RTP payload format for encapsulating | |||
SMPTE 292M video. The SMPTE 292M standard defines a bit-serial | ||||
digital interface for local area High Definition Television (HDTV) | ||||
transport. As a transport medium, SMPTE 292M utilizes 10 bit words | ||||
and a fixed 1.485Gbps (and 1.485/1.001Gbps) data rate. SMPTE 292M is | ||||
typically used in the broadcast industry for the transport of other | ||||
video formats such as SMPTE 260M, SMPTE 295M, SMPTE 274M and SMPTE | ||||
296M. | ||||
Relation [292RTP] | RFC 3497 defines a circuit emulation for the transport of SMPTE 292M | |||
over RTP. It is very specific to SMPTE 292 and has been designed to | ||||
be interoperable with existing broadcast equipment with a constant | ||||
rate of 1.485Gbps. | ||||
RFC XXXX defines a flexible native packetization scheme which can | ||||
packetize any uncompressed video, at varying data rates. In addition, | ||||
unlike RFC 3497, RFC XXXX only transports active video pixels (i.e. | ||||
horizontal and vertical blanking are not transported). | ||||
11. Full Copyright Statement | 11. Full Copyright Statement | |||
Copyright (C) The Internet Society (2003). All Rights Reserved. | Copyright (C) The Internet Society (2003). All Rights Reserved. | |||
This document and translations of it may be copied and furnished to | This document and translations of it may be copied and furnished to | |||
others, and derivative works that comment on or otherwise explain it or | others, and derivative works that comment on or otherwise explain it | |||
assist in its implementation may be prepared, copied, published and | or assist in its implementation may be prepared, copied, published | |||
distributed, in whole or in part, without restriction of any kind, | and distributed, in whole or in part, without restriction of any | |||
provided that the above copyright notice and this paragraph are included | kind, provided that the above copyright notice and this paragraph are | |||
on all such copies and derivative works. | included on all such copies and derivative works. | |||
However, this document itself may not be modified in any way, such as by | However, this document itself may not be modified in any way, such as | |||
removing the copyright notice or references to the Internet Society or | by removing the copyright notice or references to the Internet | |||
other Internet organizations, except as needed for the purpose of | Society or other Internet organizations, except as needed for the | |||
developing Internet standards in which case the procedures for | purpose of developing Internet standards in which case the procedures | |||
copyrights defined in the Internet Standards process must be followed, | for copyrights defined in the Internet Standards process must be | |||
or as required to translate it into languages other than English. | followed, or as required to translate it into languages other than | |||
English. | ||||
The limited permissions granted above are perpetual and will not be | The limited permissions granted above are perpetual and will not be | |||
revoked by the Internet Society or its successors or assigns. | revoked by the Internet Society or its successors or assigns. | |||
This document and the information contained herein is provided on an "AS | This document and the information contained herein is provided on an | |||
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK | "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING | |||
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT | TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING | |||
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT | BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION | |||
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR | HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF | |||
FITNESS FOR A PARTICULAR PURPOSE." | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." | |||
12. Acknowledgements | ||||
The authors are grateful to Philippe Gentric and Chuck Harrison for | 12. Acknowledgments | |||
their feedback. | ||||
The authors are grateful to Philippe Gentric and Chuck Harrison for | ||||
their feedback. | ||||
13. Authors' Addresses | 13. Authors' Addresses | |||
Ladan Gharai <ladan@isi.edu> | Ladan Gharai <ladan@isi.edu> | |||
Colin Perkins <csp@csperkins.org> | USC Information Sciences Institute | |||
3811 N. Fairfax Drive, #200 | ||||
Arlington, VA 22203 | ||||
USA | ||||
USC Information Sciences Institute | Colin Perkins <csp@csperkins.org> | |||
3811 N. Fairfax Drive, #200 | University of Glasgow | |||
Arlington, VA 22203 | Department of Computing Science | |||
USA | 17 Lilybank Gardens | |||
Glasgow G12 8QQ | ||||
United Kingdom | ||||
Normative References | Normative References | |||
[RTP] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, | [RTP] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, | |||
"RTP: A Transport Protocol for Real-Time Applications", | "RTP: A Transport Protocol for Real-Time Applications", | |||
Internet Engineering Task Force, RFC 1889, January 1996. | Internet Engineering Task Force, RFC 1889, January 1996. | |||
[2119] S. Bradner, "Key words for use in RFCs to Indicate Requirement | [2119] S. Bradner, "Key words for use in RFCs to Indicate | |||
Levels", RFC 2119. | Requirement Levels", RFC 2119. | |||
[2434] T. Narten and H. Alvestrand, "Guidelines for Writing an IANA | [2434] T. Narten and H. Alvestrand, "Guidelines for Writing an IANA | |||
Considerations Section in RFCs", RFC 2434, October 1998. | Considerations Section in RFCs", RFC 2434, October 1998. | |||
Informative References | Informative References | |||
[274] Society of Motion Picture and Television Engineers, | [274] Society of Motion Picture and Television Engineers, | |||
1920x1080 Scanning and Analog and Parallel Digital Interfaces | "1920x1080 Scanning and Analog and Parallel Digital | |||
for Multiple Picture Rates, SMPTE 274M-1998. | Interfaces | |||
for Multiple Picture Rates", SMPTE 274M-1998. | ||||
[268] Society of Motion Picture and Television Engineers, | ||||
File Format for Digital Moving Picture Exchange (DPX), | ||||
SMPTE 268M-1994. (Currently under revision.) | ||||
[296] Society of Motion Picture and Television Engineers, | ||||
1280x720 Scanning, Analog and Digital Representation and Analog | ||||
Interfaces, SMPTE 296M-1998. | ||||
[372] Society of Motion Picture and Television Engineers, | [268] Society of Motion Picture and Television Engineers, | |||
Dual Link 292M Interface for 1920 x 1080 Picture Raster, | "File Format for Digital Moving Picture Exchange (DPX)", | |||
SMPTE 372M-2002. | SMPTE 268M-1994. (Currently under revision.) | |||
[ALF] Clark, D. D., and Tennenhouse, D. L., "Architectural | [296] Society of Motion Picture and Television Engineers, | |||
Considerations for a New Generation of Protocols", In | "1280x720 Scanning, Analog and Digital Representation and | |||
Proceedings of SIGCOMM '90 (Philadelphia, PA, Sept. 1990), ACM. | Analog Interfaces", SMPTE 296M-1998. | |||
[SDP] M. Handley and V. Jacobson, "SDP: Session Description Protocol", | [372] Society of Motion Picture and Television Engineers, | |||
RFC 2327, April 1998. | "Dual Link 292M Interface for 1920 x 1080 Picture Raster", | |||
SMPTE 372M-2002. | ||||
[BT656] D. Tynan, "RTP Payload Format for BT.656 Video Encoding", | [ALF] Clark, D. D., and Tennenhouse, D. L., "Architectural | |||
Internet Engineering Task Force, RFC 2431, October 1998. | Considerations for a New Generation of Protocols", In | |||
Proceedings of SIGCOMM '90 (Philadelphia, PA, Sept. 1990), | ||||
ACM. | ||||
[292RTP] L. Gharai et al., "RTP Payload Format for SMPTE 292M Video", | [SDP] M. Handley and V. Jacobson, "SDP: Session Description | |||
Internet Draft, draft-ietf-avt-smpte292-video-07.txt, | Protocol", RFC 2327, April 1998. | |||
Work in progress. | ||||
[601] International Telecommunication Union, "Studio encoding | [BT656] D. Tynan, "RTP Payload Format for BT.656 Video Encoding", | |||
parameters of digital television for standard 4:3 and wide | Internet Engineering Task Force, RFC 2431, October 1998. | |||
screen 16:9 aspect ratios", Recommendation BT.601, October 1995. | ||||
[656] International Telecommunication Union, "Interfaces for Digital | [292RTP] L. Gharai et al., "RTP Payload Format for SMPTE 292M Video", | |||
Component Video Signals in 525-line and 625-line Television | RFC 3497, March 2003. | |||
Systems Operating at the 4:2:2 Level of Recommendation ITU-R | ||||
BT.601 (Part A)", Recommendation BT.656, April 1998. | ||||
[22028] ISO TC42 (Photography), Photography and graphic technology - | [601] International Telecommunication Union, "Studio encoding | |||
Extended colour encodings for digital image storage, | parameters of digital television for standard 4:3 and wide | |||
manipulation and interchange - Part 1: Architecture and | screen 16:9 aspect ratios", Recommendation BT.601, October | |||
requirements, ISO/CD 22028-1, Work in Progress. | 1995. | |||
[709] ITU Recommendation BT.709-2 | [656] International Telecommunication Union, "Interfaces for | |||
Digital Component Video Signals in 525-line and 625-line | ||||
Television Systems Operating at the 4:2:2 Level of | ||||
Recommendation ITU-R BT.601 (Part A)", Recommendation | ||||
BT.656, April 1998. | ||||
[240M] SMPTE Standard 240M | [22028] ISO TC42 (Photography), Photography and graphic technology - | |||
Extended colour encodings for digital image storage, | ||||
manipulation and interchange - Part 1: Architecture and | ||||
requirements, ISO/CD 22028-1, Work in Progress. | ||||
[NTSC] (tbd) | [709] International Telecommunication Union, "Parameter Values for | |||
HDTV Standards for Production and International Programme | ||||
Exchange", Recommendation BT.709-2 | ||||
[PAL] (tbd) | [240] Society of Motion Picture and Television Engineers, | |||
"Television - Signal Parameters - 1125-Line High-Definition | ||||
Production", SMPTE 240M-1999. | ||||
End of changes. 116 change blocks. | ||||
442 lines changed or deleted | 420 lines changed or added | |||