draft-ietf-avt-uncomp-video-00.txt | draft-ietf-avt-uncomp-video-01.txt | |||
---|---|---|---|---|
Internet Engineering Task Force AVT WG | Internet Engineering Task Force AVT WG | |||
INTERNET-DRAFT Ladan Gharai | INTERNET-DRAFT Ladan Gharai | |||
draft-ietf-avt-uncomp-video-00.txt Colin Perkins | draft-ietf-avt-uncomp-video-01.txt Colin Perkins | |||
USC/ISI | USC/ISI | |||
17 October 2002 | 3 November 2002 | |||
Expires: April 2003 | Expires: May 2003 | |||
RTP Payload Format for Uncompressed Video | RTP Payload Format for Uncompressed Video | |||
Status of this Memo | Status of this Memo | |||
This document is an Internet-Draft and is in full conformance with all | This document is an Internet-Draft and is in full conformance with all | |||
provisions of Section 10 of RFC2026. | provisions of Section 10 of RFC2026. | |||
Internet-Drafts are working documents of the Internet Engineering Task | Internet-Drafts are working documents of the Internet Engineering Task | |||
Force (IETF), its areas, and its working groups. Note that other groups | Force (IETF), its areas, and its working groups. Note that other groups | |||
skipping to change at page 1, line 35 ¶ | skipping to change at page 1, line 34 ¶ | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
Abstract | Abstract | |||
This memo specifies a packetization scheme for encapsulating | This memo specifies a packetization scheme for encapsulating | |||
uncompressed HDTV as defined by SMPTE 274M and SMPTE 296M into | uncompressed video into a payload format for the Real-time | |||
a payload format for the Real-Time Transport Protocol (RTP). | Transport Protocol, RTP. It supports a range of standard- and | |||
SMPTE 274M and SMPTE 296M define the analog and digital | high-definition video formats, including common television | |||
representation of HDTV with image formats of 1920x1080 and | formats such as ITU BT.601, SMPTE 274M and SMPTE 296M. The | |||
1280x720, respectively. The payload has been designed such | format is designed to be extensible as new video formats are | |||
that it may scale to future higher resolutions, such as | developed. | |||
Digital Cinema. | ||||
1. Introduction | 1. Introduction | |||
This memo defines a scheme to packetize uncompressed, studio-quality, | This memo defines a scheme to packetize uncompressed, studio-quality, | |||
video streams for transport using RTP [RTP]. It supports a range of | video streams for transport using RTP [RTP]. It supports a range of | |||
standard and high definition video formats, including ITU-R BT.601 | standard and high definition video formats, including ITU-R BT.601 | |||
[601], SMPTE 274M [274] and SMPTE 296M [296]. | [601], SMPTE 274M [274] and SMPTE 296M [296]. | |||
Formats for uncompressed standard definition television are defined by | Formats for uncompressed standard definition television are defined by | |||
ITU Recommendation BT.601 [601] along with bit-serial and parallel | ITU Recommendation BT.601 [601] along with bit-serial and parallel | |||
skipping to change at page 2, line 43 ¶ | skipping to change at page 2, line 43 ¶ | |||
Although these formats differ in their details, they are structurally | Although these formats differ in their details, they are structurally | |||
very similar. This memo specifies a payload format to encapsulate these, | very similar. This memo specifies a payload format to encapsulate these, | |||
and other similar, video formats for transport within RTP. | and other similar, video formats for transport within RTP. | |||
2. Conventions Used in this Document | 2. Conventions Used in this Document | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [2119]. | document are to be interpreted as described in RFC 2119 [2119]. | |||
3. Payload Design | 3. Applicability Statement | |||
This RTP payload format is designed to transport uncompressed, studio | ||||
quality, video streams. Such content can be very high bandwidth and, by | ||||
definition, is not congestion controlled. The intended use of this | ||||
format is within a production facility or on a suitably connected | ||||
private network that is specifically engineered to support this content. | ||||
This format is NOT RECOMMENDED for use on public network links, unless | ||||
those links support appropriate quality of service guarantees. See also | ||||
Section 10 "Security Considerations". | ||||
4. Payload Design | ||||
Each scan line of digital video is packetized into one or more | Each scan line of digital video is packetized into one or more | |||
(depending on the current MTU) RTP packets. A single RTP packet MAY also | (depending on the current MTU) RTP packets. A single RTP packet MAY also | |||
contain data for more than one scan line. Only the active samples are | contain data for more than one scan line. Only the active samples are | |||
included in the RTP payload; inactive samples and the contents of | included in the RTP payload; inactive samples and the contents of | |||
horizontal and vertical blanking SHOULD NOT be transported. Scan line | horizontal and vertical blanking SHOULD NOT be transported. Scan line | |||
numbers are included in the RTP payload header, along with a field | numbers are included in the RTP payload header, along with a field | |||
identifier for interlaced video. | identifier for interlaced video. | |||
For SMPTE 296M format video, valid scan line numbers are from 26 | For SMPTE 296M format video, valid scan line numbers are from 26 | |||
through 745, inclusive. For progressive scan SMPTE 274M format | through 745, inclusive. For progressive scan SMPTE 274M format | |||
video, valid scan lines are from scan line 42 through 1121 | video, valid scan lines are from scan line 42 through 1121 | |||
inclusive. For interlaced scan, valid scan line numbers for field | inclusive. For interlaced scan, valid scan line numbers for field | |||
one (F=0) are from 21 to 560 and valid scan line numbers for the | one (F=0) are from 21 to 560 and valid scan line numbers for the | |||
second field (F=1) are from 584 to 1123. For ITU-R BT.601 format | second field (F=1) are from 584 to 1123. For ITU-R BT.601 format | |||
video, the blanking intervals defined in BT.656 are used: for 625 | video, the blanking intervals defined in BT.656 are used: for 625 | |||
line video, lines 24 to 310 of field one (F=0) and 337 to 623 of | line video, lines 24 to 310 of field one (F=0) and 337 to 623 of | |||
the second field (F=1) are valid; for 525 line video, lines 21 to | the second field (F=1) are valid; for 525 line video, lines 21 to | |||
263 of the first field, and 284 to 525 of the second field are | 263 of the first field, and 284 to 525 of the second field are | |||
valid. Other formats may define different ranges of active lines. | valid. Other formats (e.g. [372]) may define different ranges of | |||
active lines. | ||||
Sample values for pixels may be transferred as 8 bit or 10 bit values. | It is desirable for the video to be both octet aligned when packetized, | |||
For 10 bit payloads, care must be taken such that the payload is also | and to adhere to the principles of application level framing [ALF] by | |||
octet aligned. | ensuring that the samples relating to a single pixel are not fragmented | |||
across two packets. | ||||
However, for video content it is desirable for the video to be both | Samples may be transferred as 8, 10, 12 or 16 bit values. For 10 bit and | |||
octet aligned when packetized and also adhere to the principles of | 12 bit payloads, care must be taken to pack an appropriate number of | |||
application level framing [ALF]. For YCrCb video, the ALF principle | samples per packet, such that the payload is also octet aligned. For RGB | |||
translates into not fragmenting related luminance and chrominance values | video, it is desirable that the samples corresponding to a single pixel | |||
across packets. For example, with 4:2:0 color subsampling each group of | are not fragmented across packets. Similarly, for YCrCb video, it is | |||
4 pixels is represented by 6 values, Y1 Y2 Y3 Y4 Cr Cb, and video | desirable that luminance and chrominance values are not fragmented | |||
content should be packetized such that these values are not fragmented | across packets. | |||
across a packet boundary. With 10 bit words this is a 60 bit value which | ||||
is not octet aligned. To be both octet aligned, and appropriately | ||||
framed, pixels must be framed in 2 groups of 4, thereby becoming octet | ||||
aligned on a 15 octet boundary. This length is referred to as the pixel | ||||
group ("pgroup"), and it is conveyed in the SDP parameters. Tables 1 and | ||||
2 display the pgroup value for 4:2:2 and 4:4:4 color samplings, for 10 | ||||
bit and 8 bit words. | ||||
10 bit words | For example, in YCrCb video with 4:2:0 color subsampling, each group of | |||
Color -------------------------------- | 4 pixels is represented by 6 values, Y1 Y2 Y3 Y4 Cr Cb. These should be | |||
Subsampling Pixels #words octet alignment pgroup | packetized such that these values are not fragmented across a packet | |||
+-----------+------+ +------+---------------+-------+ | boundary. With 10 bit words this is a 60 bit value which is not octet | |||
| 4:2:0 | 4 | | 6x10 | 2x60/8 = 15 | 15 | | aligned. To be both octet aligned, and appropriately framed, pixels must | |||
+-----------+------+ +------+---------------+-------+ | be framed in 2 groups of 4, thereby becoming octet aligned on a 15 octet | |||
| 4:2:2 | 2 | | 4x10 | 40/8 = 5 | 5 | | boundary. This length is referred to as the pixel group ("pgroup"), and | |||
+-----------+------+ +------+---------------+-------+ | it is conveyed in the SDP parameters. Tables 1 to 4 display the pgroup | |||
| 4:4:4 | 1 | | 3x10 | 4x30/8 = 15 | 15 | | values for a range of color samplings and word lengths. | |||
+-----------+------+ +------+---------------+-------+ | ||||
10 bit words | ||||
Color -------------------------------- | ||||
Subsampling Pixels #words octet alignment pgroup | ||||
+-----------+------+ +------+---------------+-------+ | ||||
|monochrome | 4 | | 4x10 | 40/8 = 5 | 5 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:2:0 | 4 | | 6x10 | 2x60/8 = 15 | 15 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:2:2 | 2 | | 4x10 | 40/8 = 5 | 5 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4 | 1 | | 3x10 | 4x30/8 = 15 | 15 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4:4 | 1 | | 4x10 | 40/8 = 5 | 5 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
Table 1: pgroup values for 10 bit sampling | Table 1: pgroup values for 10 bit sampling | |||
8 bit words | ||||
Color -------------------------------- | 8 bit words | |||
Subsampling Pixels #words octet alignment pgroup | Color -------------------------------- | |||
+-----------+------+ +------+---------------+-------+ | Subsampling Pixels #words octet alignment pgroup | |||
| 4:2:0 | 4 | | 6x8 | 6x8/8 = 6 | 6 | | +-----------+------+ +------+---------------+-------+ | |||
+-----------+------+ +------+---------------+-------+ | |monochrome | 1 | | 1x8 | 8/8 = 1 | 1 | | |||
| 4:2:2 | 2 | | 4x8 | 4x8/8 = 4 | 4 | | +-----------+------+ +------+---------------+-------+ | |||
+-----------+------+ +------+---------------+-------+ | | 4:2:0 | 4 | | 6x8 | 6x8/8 = 6 | 6 | | |||
| 4:4:4 | 1 | | 3x8 | 3x8/8 = 3 | 3 | | +-----------+------+ +------+---------------+-------+ | |||
+-----------+------+ +------+---------------+-------+ | | 4:2:2 | 2 | | 4x8 | 4x8/8 = 4 | 4 | | |||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4 | 1 | | 3x8 | 3x8/8 = 3 | 3 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4:4 | 1 | | 4x8 | 4x8/8 = 4 | 4 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
Table 2: pgroup values for 8 bit sampling | Table 2: pgroup values for 8 bit sampling | |||
12 bit words | ||||
Color -------------------------------- | ||||
Subsampling Pixels #words octet alignment pgroup | ||||
+-----------+------+ +------+---------------+-------+ | ||||
|monochrome | 2 | | 2x12 | 2x12/8 = 3 | 3 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:2:0 | 4 | | 6x12 | 72/8 = 9 | 9 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:2:2 | 2 | | 4x12 | 48/8 = 6 | 6 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4 | 2 | | 6x12 | 2x36/8 = 9 | 9 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4:4 | 1 | | 4x12 | 48/8 = 6 | 6 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
Table 3: pgroup values for 12 bit sampling | ||||
16 bit words | ||||
Color -------------------------------- | ||||
Subsampling Pixels #words octet alignment pgroup | ||||
+-----------+------+ +------+---------------+-------+ | ||||
|monochrome | 1 | | 1x16 | 16/8 = 2 | 2 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:2:0 | 4 | | 6x16 | 6x16/8 = 12 | 12 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:2:2 | 2 | | 4x16 | 4x16/8 = 8 | 8 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4 | 1 | | 3x16 | 3x16/8 = 6 | 6 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
| 4:4:4:4 | 1 | | 4x16 | 4x16/8 = 8 | 8 | | ||||
+-----------+------+ +------+---------------+-------+ | ||||
Table 4: pgroup values for 16 bit sampling | ||||
When packetizing digital active line content, video data MUST NOT be | When packetizing digital active line content, video data MUST NOT be | |||
fragmented within a pgroup. | fragmented within a pgroup. | |||
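The pgroup values in Tables 1 to 4 can be reproduced by rounding each base pixel group up to the nearest octet boundary. The following is a minimal sketch in Python, not part of either draft; the names SAMPLING and pgroup are hypothetical, and the per-group sample counts are taken from the "#words" columns of the tables (using the 8 bit and 16 bit rows as the unreplicated base group).

```python
from math import gcd

# (pixels, samples) in one base pixel group for each color sampling,
# as listed in the 8 bit and 16 bit tables. Hypothetical helper table.
SAMPLING = {
    "monochrome": (1, 1),
    "4:2:0": (4, 6),      # Y1 Y2 Y3 Y4 Cr Cb
    "4:2:2": (2, 4),
    "4:4:4": (1, 3),
    "4:4:4:4": (1, 4),
}

def pgroup(sampling, depth):
    """Return (octets, pixels) for the smallest octet-aligned pixel group."""
    pixels, samples = SAMPLING[sampling]
    bits = samples * depth
    # Least common multiple of the group size in bits and 8 bits per octet.
    aligned_bits = bits * 8 // gcd(bits, 8)
    return aligned_bits // 8, pixels * (aligned_bits // bits)

if __name__ == "__main__":
    for depth in (8, 10, 12, 16):
        for sampling in SAMPLING:
            print(depth, sampling, pgroup(sampling, depth))
```

For 10 bit 4:2:0 this yields 15 octets covering 8 pixels, which is the "2 groups of 4" case described in the text.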
4. RTP Packetization | Video content is almost always associated with additional information | |||
such as audio tracks, time code, etc. In professional digital video | ||||
applications this data is commonly embedded in non-video portions of | ||||
the data stream (horizontal and vertical blanking periods) so that | ||||
precise and robust synchronization is maintained. This payload format | ||||
envisions that applications requiring such synchronized ancillary data | ||||
should deliver it in separate RTP sessions which operate concurrently | ||||
with the video session. The normal RTP mechanisms SHOULD be used to | ||||
synchronize the media. | ||||
The standard RTP header is followed by a 8 octet payload header for each | 5. RTP Packetization | |||
line (or partial line) of video included. One or more lines, or partial | ||||
lines, of payload data follow. For example, if two lines of video are | The standard RTP header is followed by an 8 octet payload header for | |||
encapsulated, the payload format will be as shown in Figure 1. | each line (or partial line) of video included. One or more lines, or | |||
partial lines, of payload data follow. For example, if two lines of | ||||
video are encapsulated, the payload format will be as shown in Figure 1. | ||||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| V |P|X| CC |M| PT | Sequence No | | | V |P|X| CC |M| PT | Sequence No | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Time Stamp | | | Time Stamp | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| SSRC | | | SSRC | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Scan Line No | Scan Offset | | | Scan Line No | Scan Offset | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Length |F|M| Z | | | Length |F|C| Z | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Scan Line No | Scan Offset | | | Scan Line No | Scan Offset | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Length |F|M| Z | | | Length |F|C| Z | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
. . | . . | |||
. Two (partial) lines of video data . | . Two (partial) lines of video data . | |||
. . | . . | |||
+---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
Figure 1: RTP Payload Format showing two (partial) lines of video | Figure 1: RTP Payload Format showing two (partial) lines of video | |||
4.1. The RTP Header | 5.1. The RTP Header | |||
The fields of the fixed RTP header have their usual meaning, with the | The fields of the fixed RTP header have their usual meaning, with the | |||
following additional notes: | following additional notes: | |||
Payload Type (PT): 7 bits | Payload Type (PT): 7 bits | |||
A dynamically allocated payload type field which designates the | A dynamically allocated payload type field which designates the | |||
payload as uncompressed video. | payload as uncompressed video. | |||
Timestamp: 32 bits | Timestamp: 32 bits | |||
A 90 kHz timestamp MUST be used to denote the sampling instant of | A 90 kHz timestamp MUST be used to denote the sampling instant of | |||
the video frame to which the RTP packet belongs. Packets MUST NOT | the video frame to which the RTP packet belongs. Packets MUST NOT | |||
include data from multiple frames, and all packets belonging to the | include data from multiple frames, and all packets belonging to the | |||
same frame MUST have the same timestamp. | same frame MUST have the same timestamp. | |||
TBD: Consider whether the two fields of interlaced video MAY have | ||||
distinct timestamps. In some ways this is more "natural" for true | ||||
interlaced video and distinguishes it from "progressive segmented | ||||
frame" (PsF) mode in which the two fields really do refer to the | ||||
same time instant. | ||||
Marker bit (M): 1 bit | Marker bit (M): 1 bit | |||
The Marker bit denotes the end of a video frame, and MUST be set to | The Marker bit denotes the end of a video frame, and MUST be set to | |||
1 for the last packet of the video frame. It MUST be set to 0 for | 1 for the last packet of the video frame. It MUST be set to 0 for | |||
other packets. | other packets. | |||
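As an illustration of the timestamp and marker bit rules above, the following Python sketch derives a 90 kHz timestamp for each frame and marks the last packet of a frame. It is a sketch under stated assumptions, not text from either draft: the function names and the start parameter (standing in for the initial RTP timestamp offset) are hypothetical, and the frame rate is passed as a fraction so that rates such as 30000/1001 stay exact.

```python
CLOCK_HZ = 90_000  # RTP timestamp clock required by this payload format

def frame_timestamp(frame_index, fps_num, fps_den=1, start=0):
    """Timestamp shared by every packet of one frame, wrapped to 32 bits."""
    ticks = frame_index * CLOCK_HZ * fps_den // fps_num
    return (start + ticks) & 0xFFFFFFFF

def marker_bits(packets_in_frame):
    """Marker bit per packet: 1 only on the last packet of the frame."""
    return [1 if i == packets_in_frame - 1 else 0
            for i in range(packets_in_frame)]

if __name__ == "__main__":
    print(frame_timestamp(1, 30))           # one frame period at 30 fps
    print(frame_timestamp(1, 30000, 1001))  # one frame period at 29.97 fps
    print(marker_bits(3))
```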
4.2. Payload Header | 5.2. Payload Header | |||
Scan Line No : 16 bits | Scan Line No : 16 bits | |||
Scan line number of encapsulated data in network byte order. | Scan line number of encapsulated data in network byte order. | |||
Successive RTP packets MAY contain parts of the same scan line | Successive RTP packets MAY contain parts of the same scan line | |||
(with an incremented RTP sequence number, but the same timestamp), | (with an incremented RTP sequence number, but the same timestamp), | |||
if it is necessary to fragment a line. | if it is necessary to fragment a line. | |||
Scan Offset : 16 bits | Scan Offset : 16 bits | |||
Sample number of the co-sited luminance sample (if YUV format data | Scan offset of the first sample in the payload data. If YCrCb | |||
is being transported), or the red sample (if RGB format data is | format data is being transported, this is the offset of the co- | |||
transported) where the scan line is fragmented, in network byte | sited luminance sample and if RGB format data is being transported | |||
order. | it is the offset of the red sample. The value is in network byte | |||
order, and the offset has a value of zero if the first sample in | ||||
the payload corresponds to the start of the line. | ||||
Length: 16 bits | Length: 16 bits | |||
Number of octets of data included. This MUST be a multiple of the | Number of octets of data included. This MUST be a multiple of the | |||
pgroup value. | pgroup value. | |||
Field Identification (F): 1 bit | Field Identification (F): 1 bit | |||
Identifies which field the scan line belongs to, for interlaced | Identifies which field the scan line belongs to, for interlaced | |||
data. F=0 identifies the first field and F=1 the second field. | data. F=0 identifies the first field and F=1 the second field. | |||
For progressive data (SMPTE 296M) F MUST always be set to zero. | For progressive scan data (e.g. SMPTE 296M format video), F MUST | |||
always be set to zero. | ||||
Follow On (more lines) bit (M): 1 bit | Continuation (more lines) bit (C): 1 bit | |||
Determines if an additional payload header follows the current | Determines if an additional payload header follows the current | |||
header in the RTP packet. Set to 1 if an additional header follows, | header in the RTP packet. Set to 1 if an additional header follows, | |||
implying that the RTP packet is carrying data for more than one | implying that the RTP packet is carrying data for more than one | |||
scan line. Set to 0 otherwise. | scan line. Set to 0 otherwise. | |||
Reserved (Z): 14 bits | Reserved (Z): 14 bits | |||
These bits SHOULD be set to zero by the sender and MUST be ignored | These bits SHOULD be set to zero by the sender and MUST be ignored | |||
by receivers. | by receivers. | |||
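The 8 octet payload header described above can be packed and parsed with fixed-width fields in network byte order. The sketch below is illustrative only and uses the -01 naming (C for the continuation bit); the function names and the dictionary layout are hypothetical. It assumes, as Figure 1 shows, that all payload headers precede the video data.

```python
import struct

# Scan Line No, Scan Offset, Length, then F (1 bit) | C (1 bit) | Z (14 bits)
HEADER = struct.Struct("!HHHH")  # "!" selects network byte order, 8 octets

def pack_line_header(line_no, offset, length, field=0, cont=0):
    """Build one payload header; the reserved Z bits are sent as zero."""
    flags = ((field & 1) << 15) | ((cont & 1) << 14)
    return HEADER.pack(line_no, offset, length, flags)

def parse_line_headers(payload):
    """Return (headers, data_start) for the headers at the start of payload."""
    headers, pos = [], 0
    while True:
        line_no, offset, length, flags = HEADER.unpack_from(payload, pos)
        pos += HEADER.size
        cont = (flags >> 14) & 1
        # Z (the low 14 bits) is ignored, as required of receivers.
        headers.append({"line": line_no, "offset": offset,
                        "length": length, "field": flags >> 15})
        if not cont:
            return headers, pos

if __name__ == "__main__":
    hdrs = (pack_line_header(26, 0, 15, cont=1) +
            pack_line_header(27, 0, 15, cont=0))
    print(parse_line_headers(hdrs + bytes(30)))
```

The loop stops at the first header whose continuation bit is 0, so data_start is the offset at which the first (partial) line of video data begins.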
4.3. Payload Data | 5.3. Payload Data | |||
Depending on the video format, each RTP packet can include either a | Depending on the video format, each RTP packet can include either a | |||
single complete scan line, a single fragment of a scan line, or one (or | single complete scan line, a single fragment of a scan line, or one (or | |||
more) complete scan lines plus a fragment of a scan line. | more) complete scan lines plus a fragment of a scan line. Every scan | |||
line or scan line fragment MUST begin at an octet boundary in the | ||||
payload data. | ||||
If the video is in YUV format, the packing of samples into the payload | If the video is in YUV format, the packing of samples into the payload | |||
depends on the color sub-sampling used. For RGB format video, there is a | depends on the color sub-sampling used. For RGB format video, there is a | |||
single packing scheme. | single packing scheme. | |||
For RGB format video, samples are packed in order Red-Green-Blue. Each | For RGB format video, samples are packed in order Red-Green-Blue. All | |||
sample is either an 8 bit or a 10 bit value. If 8 bit samples are used, | samples are the same bit size, which may be 8, 10, 12, or 16 bits. If 8 | |||
the pgroup is 3 octets. If 10 bit samples are used, samples from | bit samples are used, the pgroup is 3 octets. If 10 bit samples are | |||
adjacent pixels are packed with no padding, and the pgroup is 15 octets | used, samples from adjacent pixels are packed with no padding, and the | |||
(4 pixels). | pgroup is 15 octets (4 pixels). Refer to Tables 1 through 4. | |||
For RGBA format video, samples are packed in order Red-Green-Blue-Alpha. | ||||
All samples are the same bit size, which may be 8, 10, 12, or 16 bits. | ||||
Refer to Tables 1 through 4. | ||||
For YUV 4:4:4 format video, samples are packed in order Cb-Y-Cr. Each | For YUV 4:4:4 format video, samples are packed in order Cb-Y-Cr. Each | |||
sample is either an 8 bit or a 10 bit value. If 8 bit samples are used, | sample is either an 8 bit or a 10 bit value. If 8 bit samples are used, | |||
the pgroup is 3 octets. If 10 bit samples are used, samples from | the pgroup is 3 octets. If 10 bit samples are used, samples from | |||
adjacent pixels are packed with no padding, and the pgroup is 15 octets | adjacent pixels are packed with no padding, and the pgroup is 15 octets | |||
(4 pixels). | (4 pixels). | |||
For YUV 4:2:2 format video, the Cb and Cr components are horizontally | For YUV 4:2:2 format video, the Cb and Cr components are horizontally | |||
sub-sampled by a factor of two (each Cb and Cr sample corresponds to | sub-sampled by a factor of two (each Cb and Cr sample corresponds to | |||
two Y components). Samples are packed in order Cb0-Y0-Cr0-Y1. If 8 bit | two Y components). Samples are packed in order Cb0-Y0-Cr0-Y1. If 8 bit | |||
samples are used, the pgroup is 4 octets. If 10 bit samples are used, | samples are used, the pgroup is 4 octets. If 10 bit samples are used, | |||
the pgroup is 5 octets. | the pgroup is 5 octets. | |||
(tbd: YUV 4:2:0 format video) | (tbd: YUV 4:2:0 format video) | |||
5. Required Parameters | It is possible that the scan line length is not evenly divisible by the | |||
number of pixels in a pgroup, so the final pixel data of a scan line | ||||
does not align to either an octet or pgroup boundary. Nonetheless the | ||||
payload MUST contain a whole number of pgroups; the sender MUST fill the | ||||
remaining bits of the final pgroup with zero and the receiver MUST | ||||
ignore the fill data. (In effect, the trailing edge of the image is | ||||
black-filled to a pgroup boundary.) | ||||
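The fill rule above amounts to rounding the line width up to a whole number of pgroups. A minimal Python sketch follows, assuming the pgroup size in octets and pixels is already known (for example from Tables 1 to 4); the function name padded_line is hypothetical and does not appear in either draft.

```python
def padded_line(width_pixels, pgroup_octets, pgroup_pixels):
    """Return (octets, fill_pixels) for one complete scan line.

    octets is the payload length of the whole line, always a multiple of
    the pgroup size; fill_pixels is how many zero-filled pixel positions
    the sender appends and the receiver ignores.
    """
    groups = -(-width_pixels // pgroup_pixels)  # ceiling division
    return groups * pgroup_octets, groups * pgroup_pixels - width_pixels

if __name__ == "__main__":
    # 10 bit RGB or YUV 4:4:4: pgroup is 15 octets covering 4 pixels.
    print(padded_line(1920, 15, 4))  # width divides evenly, no fill
    print(padded_line(1918, 15, 4))  # two fill pixels complete the pgroup
```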
6. Required Parameters | ||||
(tbd) | (tbd) | |||
Parameters are: color mode (RGB/YUV), color sub-sampling | Parameters are: color mode (RGB/YUV), color sub-sampling | |||
(4:4:4, 4:2:2, 4:2:0), lines per frame, pixels per line, and | (4:4:4, 4:2:2, 4:2:0), lines per frame, pixels per line, bits | |||
scan mode (progressive or interlaced). Propose to map these to | per sample and scan mode (progressive or interlaced). Propose | |||
SDP a=fmtp: values. | to map these to SDP a=fmtp: values. | |||
6. RTCP Considerations | Optional parameters are: colorimetry (primaries, whitepoint, | |||
reference medium), transfer function (log, gamma, toe | ||||
treatment, black offset), image orientation, capture temporal | ||||
mode (field integration, frame integration, spot scan, | ||||
pushbroom scan). [286], [22028] | ||||
7. RTCP Considerations | ||||
RFC1889 recommends transmission of RTCP packets every 5 seconds or at a | RFC1889 recommends transmission of RTCP packets every 5 seconds or at a | |||
reduced minimum in seconds of 360 divided by the session bandwidth in | reduced minimum in seconds of 360 divided by the session bandwidth in | |||
kilobits/second. At 1.485 Gbps (the uncompressed HDTV rate) the reduced | kilobits/second. At 1.485 Gbps (the uncompressed HDTV rate) the reduced | |||
minimum interval computes to 0.2ms or 4028 packets per second. | minimum interval computes to 0.2ms or 4028 packets per second. | |||
It should be noted that the sender's octet count in SR packets wraps | It should be noted that the sender's octet count in SR packets wraps | |||
around in 23 seconds, and that the cumulative number of packets lost | around in 23 seconds, and that the cumulative number of packets lost | |||
wraps around in 93 seconds. This means these two fields cannot | wraps around in 93 seconds. This means these two fields cannot | |||
accurately represent octet count and number of packets lost since the | accurately represent octet count and number of packets lost since the | |||
beginning of transmission, as defined in RFC 1889. Therefore for network | beginning of transmission, as defined in RFC 1889. Therefore for network | |||
monitoring purposes other means of keeping track of these variables | monitoring purposes other means of keeping track of these variables | |||
SHOULD be used. | SHOULD be used. | |||
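The interval and the octet count wrap time quoted above follow from simple arithmetic, sketched here in Python. This is a rough check, assuming the stream occupies the full 1.485 Gbps and that the sender's octet count is the 32 bit field of the RTCP sender report; the function names are hypothetical.

```python
HDTV_RATE_BPS = 1.485e9  # uncompressed HDTV rate used in the text

def rtcp_reduced_minimum_s(bandwidth_kbps):
    """Reduced minimum RTCP interval: 360 / session bandwidth in kbit/s."""
    return 360.0 / bandwidth_kbps

def octet_count_wrap_s(rate_bps):
    """Seconds until a 32 bit octet counter wraps at the given bit rate."""
    return (2 ** 32) * 8 / rate_bps

if __name__ == "__main__":
    print(rtcp_reduced_minimum_s(HDTV_RATE_BPS / 1000) * 1000, "ms")
    print(octet_count_wrap_s(HDTV_RATE_BPS), "s")
```

The packet loss wrap time is not reproduced here because it depends on the packet size, which the text does not state.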
7. IANA Considerations | 8. IANA Considerations | |||
This memo defines a new RTP payload format and associated MIME type. | This memo defines a new RTP payload format and associated MIME type. | |||
The MIME registration form is enclosed below: | The MIME registration form is enclosed below: | |||
MIME media type name: video | MIME media type name: video | |||
MIME subtype name: raw | MIME subtype name: raw | |||
Required parameters: rate | Required parameters: rate | |||
skipping to change at page 8, line 32 ¶ | skipping to change at page 11, line 5 ¶ | |||
Person & email address to contact for further information: | Person & email address to contact for further information: | |||
Ladan Gharai <ladan@isi.edu> | Ladan Gharai <ladan@isi.edu> | |||
IETF AVT working group. | IETF AVT working group. | |||
Intended usage: COMMON | Intended usage: COMMON | |||
Author/Change controller: | Author/Change controller: | |||
Ladan Gharai <ladan@isi.edu> | Ladan Gharai <ladan@isi.edu> | |||
8. Mapping to SDP Parameters | 9. Mapping to SDP Parameters | |||
Parameters are mapped to SDP [SDP] as follows: | Parameters are mapped to SDP [SDP] as follows: | |||
m=video 30000 RTP/AVP 111 | m=video 30000 RTP/AVP 111 | |||
a=rtpmap:111 raw/90000 | a=rtpmap:111 raw/90000 | |||
a=fmtp:111 (tbd) | a=fmtp:111 (tbd) | |||
In this example, a dynamic payload type 111 is used for uncompressed | In this example, a dynamic payload type 111 is used for uncompressed | |||
video. The RTP sampling clock is 90kHz. | video. The RTP sampling clock is 90kHz. | |||
9. Security Considerations | 10. Security Considerations | |||
RTP packets using the payload format defined in this specification are | RTP packets using the payload format defined in this specification are | |||
subject to the security considerations discussed in the RTP | subject to the security considerations discussed in the RTP | |||
specification, and any appropriate RTP profile. This implies that | specification, and any appropriate RTP profile. This implies that | |||
confidentiality of the media streams is achieved by encryption. | confidentiality of the media streams is achieved by encryption. | |||
This payload type does not exhibit any significant non-uniformity in the | This payload type does not exhibit any significant non-uniformity in the | |||
receiver side computational complexity for packet processing to cause a | receiver side computational complexity for packet processing to cause a | |||
potential denial-of-service threat. | potential denial-of-service threat. | |||
It is to be noted that uncompressed video can have immense bandwidth | It is important to note that uncompressed video can have immense | |||
requirements (270 Mbps for standard definition video, and approximately | bandwidth requirements (270 Mbps for standard definition video, and | |||
1 Gbps for high definition video). This is sufficient to cause potential | approximately 1 Gbps for high definition video), and is not congestion | |||
for denial-of-service if transmitted onto most currently available | controlled. This is sufficient to cause potential for denial-of-service | |||
Internet paths. In the absence from the standards track of a suitable | if transmitted onto most currently available Internet paths. Use of the | |||
congestion control mechanism for flows of this sort, use of the payload | payload format defined here MUST be narrowly limited to suitably | |||
SHOULD be narrowly limited to suitably connected network endpoints, or | connected private networks, or to networks where quality of service | |||
to networks where QoS guarantees are available, and great care taken | guarantees are available. This potential threat is common to all high | |||
with the scope of multicast transmissions. This potential threat is | rate applications without congestion control. | |||
common to all high bit rate applications without congestion control. | ||||
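The scale of these rates can be checked with a rough calculation. The sketch below is illustrative only: the frame sizes, the 20 bits per pixel (10-bit 4:2:2 sampling) and the 30000/1001 frame rate are assumptions rather than values taken from the drafts, and it counts active picture data only, with no RTP, UDP or IP overhead. The results are of the same order as the figures quoted above, not identical to them, since they depend on the assumed format.

```python
from fractions import Fraction


def active_video_bps(width, height, bits_per_pixel, frame_rate):
    """Bits per second needed for the active picture area alone."""
    return float(width * height * bits_per_pixel * frame_rate)


# Assumed formats: 10-bit 4:2:2 sampling averages 20 bits per pixel.
NTSC_RATE = Fraction(30000, 1001)  # roughly 29.97 frames per second
sd_bps = active_video_bps(720, 486, 20, NTSC_RATE)
hd_bps = active_video_bps(1920, 1080, 20, NTSC_RATE)

print(f"standard definition: {sd_bps / 1e6:.0f} Mbps")
print(f"high definition:     {hd_bps / 1e9:.2f} Gbps")
```

Either rate exceeds the capacity of many network paths, which is why the text restricts use of the payload format to suitably provisioned networks.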
10. Relation to RFC 2431 | 11. Relation to RFC 2431 | |||
(tbd) [BT656] | In comparison with RFC 2431 this memo specifies support for a wider | |||
variety of uncompressed video, in terms of frame size, color subsampling | ||||
and sample sizes. While [BT656] can transport up to 4096 scan lines and | ||||
2048 pixels per line, our payload type can support up to 64k scan lines | ||||
and pixels per line. Also, RFC 2431 only addresses 4:2:2 YUV data, while | ||||
this memo covers YUV and RGB and most common color subsampling schemes. | ||||
Given the variety of video types that we cover, this memo also assumes | ||||
out-of-band signaling for sample size and data types (RFC 2431 uses | ||||
in-band signaling). | ||||
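The size limits in this comparison can be expressed as a simple range check. The sketch below is illustrative: the names LIMITS and fits are hypothetical, "64k" is read as 65536 (an assumption), and the limits are treated as inclusive upper bounds on the frame dimensions.

```python
# Maximum frame dimensions stated in the comparison above.
# "64k" is interpreted as 65536; that reading is an assumption.
LIMITS = {
    "RFC 2431": {"scan_lines": 4096, "pixels_per_line": 2048},
    "this memo": {"scan_lines": 65536, "pixels_per_line": 65536},
}


def fits(fmt, pixels_per_line, scan_lines):
    """Return True if a frame of the given size is within the format's limits."""
    limit = LIMITS[fmt]
    return (pixels_per_line <= limit["pixels_per_line"]
            and scan_lines <= limit["scan_lines"])


# A 1920x1080 frame is within both limits; a hypothetical 4096x2160
# frame exceeds the RFC 2431 limit on pixels per line.
for size in [(1920, 1080), (4096, 2160)]:
    for fmt in LIMITS:
        print(fmt, size, fits(fmt, *size))
```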
11. Full Copyright Statement | 12. Full Copyright Statement | |||
Copyright (C) The Internet Society (2002). All Rights Reserved. | Copyright (C) The Internet Society (2002). All Rights Reserved. | |||
This document and translations of it may be copied and furnished to | This document and translations of it may be copied and furnished to | |||
others, and derivative works that comment on or otherwise explain it or | others, and derivative works that comment on or otherwise explain it or | |||
assist in its implementation may be prepared, copied, published and | assist in its implementation may be prepared, copied, published and | |||
distributed, in whole or in part, without restriction of any kind, | distributed, in whole or in part, without restriction of any kind, | |||
provided that the above copyright notice and this paragraph are included | provided that the above copyright notice and this paragraph are included | |||
on all such copies and derivative works. | on all such copies and derivative works. | |||
skipping to change at page 10, line 6 ¶ | skipping to change at page 12, line 33 ¶ | |||
The limited permissions granted above are perpetual and will not be | The limited permissions granted above are perpetual and will not be | |||
revoked by the Internet Society or its successors or assigns. | revoked by the Internet Society or its successors or assigns. | |||
This document and the information contained herein is provided on an "AS | This document and the information contained herein is provided on an "AS | |||
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK | IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK | |||
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT | FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT | |||
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT | LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT | |||
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR | INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR | |||
FITNESS FOR A PARTICULAR PURPOSE." | FITNESS FOR A PARTICULAR PURPOSE." | |||
12. Authors' Addresses | 13. Acknowledgements | |||
The authors are grateful to Philippe Gentric and Chuck Harrison for | ||||
their feedback. | ||||
14. Authors' Addresses | ||||
Ladan Gharai <ladan@isi.edu> | Ladan Gharai <ladan@isi.edu> | |||
Colin Perkins <csp@isi.edu> | Colin Perkins <csp@csperkins.org> | |||
USC Information Sciences Institute | USC Information Sciences Institute | |||
3811 N. Fairfax Drive | 3811 N. Fairfax Drive, #200 | |||
Arlington, VA 22203-1695 | Arlington, VA 22203 | |||
USA | USA | |||
Bibliography | Bibliography | |||
[274] Society of Motion Picture and Television Engineers, | [274] Society of Motion Picture and Television Engineers, | |||
1920x1080 Scanning and Analog and Parallel Digital Interfaces | 1920x1080 Scanning and Analog and Parallel Digital Interfaces | |||
for Multiple Picture Rates, SMPTE 274M-1998. | for Multiple Picture Rates, SMPTE 274M-1998. | |||
[268] Society of Motion Picture and Television Engineers, | ||||
File Format for Digital Moving Picture Exchange (DPX), | ||||
SMPTE 268M-1994. (Currently under revision.) | ||||
[296] Society of Motion Picture and Television Engineers, | [296] Society of Motion Picture and Television Engineers, | |||
1280x720 Scanning, Analog and Digital Representation and Analog | 1280x720 Scanning, Analog and Digital Representation and Analog | |||
Interfaces, SMPTE 296M-1998. | Interfaces, SMPTE 296M-1998. | |||
[372] Society of Motion Picture and Television Engineers, | ||||
Dual Link 292M Interface for 1920 x 1080 Picture Raster, | ||||
SMPTE 372M-2002. | ||||
[2119] S. Bradner, "Key words for use in RFCs to Indicate Requirement | [2119] S. Bradner, "Key words for use in RFCs to Indicate Requirement | |||
Levels", RFC 2119. | Levels", RFC 2119. | |||
[ALF] Clark, D. D., and Tennenhouse, D. L., "Architectural | [ALF] Clark, D. D., and Tennenhouse, D. L., "Architectural | |||
Considerations for a New Generation of Protocols", In | Considerations for a New Generation of Protocols", In | |||
Proceedings of SIGCOMM '90 (Philadelphia, PA, Sept. 1990), ACM. | Proceedings of SIGCOMM '90 (Philadelphia, PA, Sept. 1990), ACM. | |||
[SDP] M. Handley and V. Jacobson, "SDP: Session Description Protocol", | [SDP] M. Handley and V. Jacobson, "SDP: Session Description Protocol", | |||
RFC 2327, April 1998. | RFC 2327, April 1998. | |||
[BT656] D. Tynan, "RTP Payload Format for BT.656 Video Encoding", | [BT656] D. Tynan, "RTP Payload Format for BT.656 Video Encoding", | |||
Internet Engineering Task Force, RFC 2431, October 1998. | Internet Engineering Task Force, RFC 2431, October 1998. | |||
[RTP] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, | [RTP] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, | |||
"RTP: A Transport Protocol for Real-Time Applications", | "RTP: A Transport Protocol for Real-Time Applications", | |||
Internet Engineering Task Force, RFC 1889, January 1996. | Internet Engineering Task Force, RFC 1889, January 1996. | |||
[292RTP] L. Gharai et al., "RTP Payload Format for SMPTE 292M Video", | ||||
Internet Draft, draft-ietf-avt-smpte292-video-07.txt, | ||||
Work in progress. | ||||
[601] International Telecommunication Union, "Studio encoding | [601] International Telecommunication Union, "Studio encoding | |||
parameters of digital television for standard 4:3 and wide | parameters of digital television for standard 4:3 and wide | |||
screen 16:9 aspect ratios", Recommendation BT.601, October 1995. | screen 16:9 aspect ratios", Recommendation BT.601, October 1995. | |||
[656] International Telecommunication Union, "Interfaces for Digital | [656] International Telecommunication Union, "Interfaces for Digital | |||
Component Video Signals in 525-line and 625-line Television | Component Video Signals in 525-line and 625-line Television | |||
Systems Operating at the 4:2:2 Level of Recommendation ITU-R | Systems Operating at the 4:2:2 Level of Recommendation ITU-R | |||
BT.601 (Part A)", Recommendation BT.656, April 1998. | BT.601 (Part A)", Recommendation BT.656, April 1998. | |||
[22028] ISO TC42 (Photography), Photography and graphic technology - | ||||
Extended colour encodings for digital image storage, | ||||
manipulation and interchange - Part 1: Architecture and | ||||
requirements, ISO/CD 22028-1, Work in Progress. | ||||
End of changes. 45 change blocks. | ||||
101 lines changed or deleted | 216 lines changed or added | |||