draft-ietf-avt-uncomp-video-03.txt   draft-ietf-avt-uncomp-video-04.txt 
Internet Engineering Task Force AVT WG Internet Engineering Task Force AVT WG
INTERNET-DRAFT Ladan Gharai INTERNET-DRAFT Ladan Gharai
draft-ietf-avt-uncomp-video-03.txt USC/ISI draft-ietf-avt-uncomp-video-04.txt USC/ISI
Colin Perkins Colin Perkins
University of Glasgow University of Glasgow
Expires: December 2003 26 October 2003
RTP Payload Format for Uncompressed Video RTP Payload Format for Uncompressed Video
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
skipping to change at page 2, line 28 skipping to change at page 2, line 28
interfaces in Recommendation BT.656 [656]. These formats allow both interfaces in Recommendation BT.656 [656]. These formats allow both
625 line and 525 line operation, with 720 samples per digital active 625 line and 525 line operation, with 720 samples per digital active
line, 4:2:2 color sub-sampling, and 8- or 10-bit digital line, 4:2:2 color sub-sampling, and 8- or 10-bit digital
representation. representation.
The representation of uncompressed high definition television is The representation of uncompressed high definition television is
specified in SMPTE standards 274M [274] and 296M [296]. SMPTE 274M specified in SMPTE standards 274M [274] and 296M [296]. SMPTE 274M
defines a family of scanning systems with an image format of defines a family of scanning systems with an image format of
1920x1080 pixels with progressive and interlaced scanning, while 1920x1080 pixels with progressive and interlaced scanning, while
SMPTE 296M defines systems with an image size of 1280x720 pixels and SMPTE 296M defines systems with an image size of 1280x720 pixels and
only progressive scanning. In progressive scanning, scan lines are progressive scanning. In progressive scanning, scan lines are
displayed in sequence from top to bottom of a full frame. In displayed in sequence from top to bottom of a full frame. In
interlaced scanning, a frame is divided into its odd and even scan interlaced scanning, a frame is divided into its odd and even scan
lines (called fields) and the two fields are displayed in succession. lines (called fields) and the two fields are displayed in succession.
SMPTE 274M and 296M define images with aspect ratios of 16:9, and SMPTE 274M and 296M define images with aspect ratios of 16:9, and
define the digital representation for RGB and YCbCr components. In define the digital representation for RGB and YCbCr components. In
the case of YCbCr components, the Cb and Cr components are the case of YCbCr components, the Cb and Cr components are
horizontally sub-sampled by a factor of two (4:2:2 color encoding). horizontally sub-sampled by a factor of two (4:2:2 color encoding).
Although these formats differ in their details, they are structurally Although these formats differ in their details, they are structurally
skipping to change at page 3, line 32 skipping to change at page 3, line 32
line numbers for the second field (F=1) are from 584 to 1123. For line numbers for the second field (F=1) are from 584 to 1123. For
ITU-R BT.601 format video, the blanking intervals defined in BT.656 ITU-R BT.601 format video, the blanking intervals defined in BT.656
are used: for 625 line video, lines 24 to 310 of field one (F=0) are used: for 625 line video, lines 24 to 310 of field one (F=0)
and 337 to 623 of the second field (F=1) are valid; for 525 line and 337 to 623 of the second field (F=1) are valid; for 525 line
video, lines 21 to 263 of the first field, and 284 to 525 of the video, lines 21 to 263 of the first field, and 284 to 525 of the
second field are valid. Other formats (e.g. [372]) may define second field are valid. Other formats (e.g. [372]) may define
different ranges of active lines. different ranges of active lines.
The payload header contains a 16 bit extension to the standard 16 bit The payload header contains a 16 bit extension to the standard 16 bit
RTP sequence number, thereby extending the sequence number to 32 bits RTP sequence number, thereby extending the sequence number to 32 bits
and enabling the payload format to accommodate high data rates. This and enabling the payload format to accommodate high data rates
is necessary as the 16 bit RTP sequence number will roll-over very without ambiguity. This is necessary as the 16 bit RTP sequence
quickly for high data rates. For example, for a 1 Gbps video stream number will roll-over very quickly for high data rates. For example,
with packet sizes of at least one thousand octets, the standard RTP for a 1 Gbps video stream with packet sizes of at least one thousand
sequence number will roll-over in 0.5 seconds, which can be a problem for octets, the standard RTP sequence number will roll-over in 0.5 seconds, which
detecting loss and out of order packets particularly in instances can be a problem for detecting loss and out of order packets
where the round trip time is greater than half a second. The extended particularly in instances where the round trip time is greater than
32 bit number allows for a longer wrap-around time of approximately half a second. The extended 32 bit number allows for a longer wrap-
nine hours. around time of approximately nine hours.
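The wrap-around figures above can be checked with a few lines of arithmetic. The following is a rough sketch only: it assumes exactly 1 Gbps and a fixed packet size of 1000 octets, and the function name wrap_seconds is illustrative, not part of this memo.

```python
# Rough check of the wrap-around times quoted in the text.
# Assumptions (not from the memo): exactly 1 Gbps, fixed 1000 octet packets.
BIT_RATE = 1_000_000_000   # bits per second
PACKET_OCTETS = 1000       # octets per packet, assumed constant


def wrap_seconds(seq_bits, bit_rate=BIT_RATE, packet_octets=PACKET_OCTETS):
    """Seconds until a sequence number of seq_bits bits wraps around."""
    packets_per_second = bit_rate / (packet_octets * 8)
    return (2 ** seq_bits) / packets_per_second


short_wrap = wrap_seconds(16)               # standard RTP sequence number
long_wrap_hours = wrap_seconds(32) / 3600   # extended 32 bit sequence number
```

Under these assumptions the 16 bit number wraps in roughly half a second, and the 32 bit number in somewhat over nine hours, in line with the figures quoted above. Larger packets lengthen both intervals proportionally.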
Each scan line comprises an integer number of pixels. Each pixel Each scan line comprises an integer number of pixels. Each pixel
is represented by a number of samples. Samples may be coded as 8, 10, is represented by a number of samples. Samples may be coded as 8, 10,
12 or 16 bit values. A sample may represent color or luminance 12 or 16 bit values. A sample may represent a color component or a
components of the video. Color samples may be shared between luminance component of the video. Color samples may be shared
adjacent pixels. The sharing of color samples between adjacent pixels between adjacent pixels. The sharing of color samples between
is known as color sub-sampling. This is typically done in the YCbCr adjacent pixels is known as color sub-sampling. This is typically
color space for the purpose of reducing the size of an image. done in the YCbCr color space for the purpose of reducing the size of
the image data.
Pixels that share sample values MUST be transported together as a Pixels that share sample values MUST be transported together as a
pixel group. If 10 bit or 12 bit samples are used, each pixel may "pixel group". If 10 bit or 12 bit samples are used, each pixel may
also comprise a non-integer number of octets. In this case, several also comprise a non-integer number of octets. In this case, several
pixels MUST be combined into an octet aligned pixel group for pixels MUST be combined into an octet aligned pixel group for
transmission. These restrictions simplify the operation of receivers transmission. These restrictions simplify the operation of receivers
by ensuring that a complete payload is octet aligned, and that by ensuring that the complete payload is octet aligned, and that
samples relating to a single pixel are not fragmented across multiple samples relating to a single pixel are not fragmented across multiple
packets [ALF]. packets [ALF].
For example, in YCbCr video with 4:1:1 color sub-sampling, each group For example, in YCbCr video with 4:1:1 color sub-sampling, each group
of 4 adjacent pixels comprises 6 samples, Y1 Y2 Y3 Y4 Cr Cb, with the of 4 adjacent pixels comprises 6 samples, Y1 Y2 Y3 Y4 Cr Cb, with the
Cr and Cb values being shared between all 4 pixels. If samples are 8 Cr and Cb values being shared between all 4 pixels. If samples are 8
bit values, the result is a group of 4 pixels comprising 6 octets. bit values, the result is a group of 4 pixels comprising 6 octets.
If, however, samples are 10 bit values, the resulting 60 bit group is If, however, samples are 10 bit values, the resulting 60 bit group is
not octet aligned. To be both octet aligned and appropriately not octet aligned. To be both octet aligned and appropriately
framed, two groups of 4 adjacent pixels must be collected, thereby framed, two groups of 4 adjacent pixels must be collected, thereby
skipping to change at page 5, line 49 skipping to change at page 5, line 50
For progressive scan video, the timestamp denotes the sampling For progressive scan video, the timestamp denotes the sampling
instant of the frame to which the RTP packet belongs. Packets MUST instant of the frame to which the RTP packet belongs. Packets MUST
NOT include data from multiple frames, and all packets belonging to NOT include data from multiple frames, and all packets belonging to
the same frame MUST have the same timestamp. the same frame MUST have the same timestamp.
For interlaced video, the timestamp denotes the sampling instant of For interlaced video, the timestamp denotes the sampling instant of
the field to which the RTP packet belongs. Packets MUST NOT the field to which the RTP packet belongs. Packets MUST NOT
include data from multiple fields, and all packets belonging to the include data from multiple fields, and all packets belonging to the
same field MUST have the same timestamp. Use of field timestamps, same field MUST have the same timestamp. Use of field timestamps,
rather than a frame timestamp and and field indicator bit, is rather than a frame timestamp and field indicator bit, is needed to
needed to support reverse 3-2 pulldown. support reverse 3-2 pulldown.
A 90 kHz timestamp MUST be used in both cases. If the sampling A 90 kHz timestamp SHOULD be used in both cases. If the sampling
instant does not correspond to an integer value of the clock (as instant does not correspond to an integer value of the clock (as
may be the case when interleaving, the value SHALL be truncated to may be the case when interleaving) the value SHALL be truncated to
the next lowest integer). the next lowest integer, with no loss of information.
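As an illustration of the truncation rule, the sketch below derives 90 kHz timestamps for successive fields. It is a sketch under stated assumptions: the field rate of 60000/1001 fields per second is chosen only as an example, the random initial timestamp offset required by RTP is omitted, and rtp_timestamp is a hypothetical helper name.

```python
from fractions import Fraction
import math

CLOCK_RATE = 90000  # RTP timestamp clock in Hz, as stated in the text


def rtp_timestamp(index, rate, clock_rate=CLOCK_RATE):
    """Timestamp of the index-th frame or field, relative to the first.

    The exact sampling instant is computed as a fraction and then
    truncated to the next lowest integer.
    """
    return math.floor(Fraction(index) * clock_rate / rate)


# Assumed example rate: 60000/1001 fields per second (about 59.94 Hz),
# which gives 1501.5 clock ticks per field.
field_rate = Fraction(60000, 1001)
stamps = [rtp_timestamp(n, field_rate) for n in range(4)]
```

With this rate every second field falls between two clock ticks and is truncated, so the differences between successive timestamps alternate between 1501 and 1502 ticks.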
Marker bit (M): 1 bit Marker bit (M): 1 bit
The Marker bit denotes the end of a video frame, and MUST be set to If progressive scan video is being transmitted, the marker bit
1 for the last packet of the video frame. It MUST be set to 0 for denotes the end of a video frame. If interlaced video is being
other packets. transmitted, it denotes the end of the field. The marker bit MUST
be set to 1 for the last packet of the video frame/field. It MUST
be set to 0 for other packets.
Sequence Number: 16 bits Sequence Number: 16 bits
The low order bits for RTP sequence number. The standard 16 bit The low order bits for RTP sequence number. The standard 16 bit
sequence number is augmented with another 16 bits in the payload sequence number is augmented with another 16 bits in the payload
header, in order to avoid problems due to wrap-around when operating header in order to avoid problems due to wrap-around when operating at
at high data rates. high data rates.
4.2. Payload Header 4.2. Payload Header
Extended Sequence Number : 16 bits Extended Sequence Number : 16 bits
The high order bits of the extended 32 bit sequence number, in The high order bits of the extended 32 bit sequence number, in
network byte order. network byte order.
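A receiver can combine the two 16 bit halves into a single 32 bit value as sketched below. The function names are illustrative only; the inputs are assumed to have already been converted from network byte order to host integers.

```python
def full_sequence_number(ext_high, rtp_low):
    """Combine the Extended Sequence Number (high order 16 bits, from the
    payload header) with the RTP sequence number (low order 16 bits)."""
    if not (0 <= ext_high <= 0xFFFF and 0 <= rtp_low <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (ext_high << 16) | rtp_low


def split_sequence_number(seq32):
    """Split a 32 bit sequence number into (high, low) 16 bit halves,
    wrapping modulo 2**32 as a sender would."""
    seq32 &= 0xFFFFFFFF
    return seq32 >> 16, seq32 & 0xFFFF
```

For example, the packet following RTP sequence number 65535 with extension 0 carries RTP sequence number 0 with extension 1.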
Length: 16 bits Length: 16 bits
skipping to change at page 6, line 46 skipping to change at page 6, line 49
Scan line number of encapsulated data, in network byte order. Scan line number of encapsulated data, in network byte order.
Successive RTP packets MAY contain parts of the same scan line Successive RTP packets MAY contain parts of the same scan line
(with an incremented RTP sequence number, but the same timestamp), (with an incremented RTP sequence number, but the same timestamp),
if it is necessary to fragment a line. if it is necessary to fragment a line.
Offset : 15 bits Offset : 15 bits
Offset of the first pixel of the payload data within the scan line. Offset of the first pixel of the payload data within the scan line.
If YCbCr format data is being transported, this is the pixel offset If YCbCr format data is being transported, this is the pixel offset
of the co-sited luminance sample; if RGB format data is being of the luminance sample; if RGB format data is being transported it
transported it is the pixel offset of the red sample. The value is is the pixel offset of the red sample; if BGR format data is being
transported it is the pixel offset of the blue sample. The value is
in network byte order. The offset has a value of zero if the first in network byte order. The offset has a value of zero if the first
sample in the payload corresponds to the start of the line, and sample in the payload corresponds to the start of the line, and
increments by one for each pixel. increments by one for each pixel.
Field Identification (F): 1 bit Field Identification (F): 1 bit
Identifies which field the scan line belongs to, for interlaced Identifies which field the scan line belongs to, for interlaced
data. F=0 identifies the first field and F=1 the second field. data. F=0 identifies the first field and F=1 the second field.
For progressive scan data (e.g. SMPTE 296M format video), F MUST For progressive scan data (e.g. SMPTE 296M format video), F MUST
always be set to zero. always be set to zero.
Continuation (C): 1 bit Continuation (C): 1 bit
Determines if an additional scan line header follows the current Determines if an additional scan line header follows the current
scan line header in the RTP packet. Set to 1 if an additional scan line header in the RTP packet. Set to 1 if an additional
header follows, implying that the RTP packet is carrying data for header follows, implying that the RTP packet is carrying data for
more than one scan line. Set to 0 otherwise. An unlimited number more than one scan line. Set to 0 otherwise. Several scan lines
of scan lines MAY be included, up to the path MTU limit. The only MAY be included in a single packet, up to the path MTU limit. The
way to determine the number of scan lines included per packet is to only way to determine the number of scan lines included per packet
parse the payload headers. is to parse the payload headers.
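Parsing the payload headers might look like the following sketch. The bit layout is an assumption, since the header diagram is not reproduced in this excerpt: a 16 bit Extended Sequence Number, followed by one 6 octet header per scan line consisting of Length (16 bits), F (1 bit) with Line No (15 bits), and C (1 bit) with Offset (15 bits). The function name and dictionary keys are illustrative.

```python
import struct


def parse_payload_header(payload):
    """Return (ext_seq, line_headers, header_length) for one RTP payload.

    Assumed layout (not shown in this excerpt): 16 bit Extended Sequence
    Number, then repeated 6 octet scan line headers:
        Length (16) | F (1) + Line No (15) | C (1) + Offset (15)
    All fields are in network byte order.
    """
    (ext_seq,) = struct.unpack_from("!H", payload, 0)
    pos = 2
    lines = []
    while True:
        # struct.error is raised if the payload is too short.
        length, word1, word2 = struct.unpack_from("!HHH", payload, pos)
        pos += 6
        lines.append({
            "length": length,            # octets of data for this line
            "field": word1 >> 15,        # F bit
            "line_no": word1 & 0x7FFF,   # scan line number
            "offset": word2 & 0x7FFF,    # pixel offset within the line
        })
        if not (word2 >> 15):            # C bit clear: last header
            break
    return ext_seq, lines, pos
```

The scan line data follows the last header, starting at the returned header_length, with each segment's size given by the corresponding Length field.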
4.3. Payload Data 4.3. Payload Data
Depending on the video format, each RTP packet can include either a Depending on the video format, each RTP packet can include either a
single complete scan line, a single fragment of a scan line, or one single complete scan line, a single fragment of a scan line, or one
(or more) complete scan lines and scan line fragments. The length of (or more) complete scan lines and scan line fragments. The length of
each scan line or scan line fragment MUST be an integer multiple of each scan line or scan line fragment MUST be an integer multiple of
the pgroup size in octets. Scan lines SHOULD be fragmented so that the pgroup size in octets. Scan lines SHOULD be fragmented so that
the resulting RTP packet is smaller than the path MTU. the resulting RTP packet is smaller than the path MTU.
skipping to change at page 7, line 44 skipping to change at page 7, line 48
Nonetheless the payload MUST contain a whole number of pgroups; the Nonetheless the payload MUST contain a whole number of pgroups; the
sender MUST fill the remaining bits of the final pgroup with zero and sender MUST fill the remaining bits of the final pgroup with zero and
the receiver MUST ignore the fill data. (In effect, the trailing edge the receiver MUST ignore the fill data. (In effect, the trailing edge
of the image is black-filled to a pgroup boundary.) of the image is black-filled to a pgroup boundary.)
For RGB format video, samples are packed in order Red-Green-Blue. For For RGB format video, samples are packed in order Red-Green-Blue. For
BGR format video, samples are packed in order Blue-Green-Red. For BGR format video, samples are packed in order Blue-Green-Red. For
both formats, if 8 bit samples are used, the pgroup is 3 octets. If both formats, if 8 bit samples are used, the pgroup is 3 octets. If
10 bit samples are used, samples from 4 adjacent pixels form 15 octet 10 bit samples are used, samples from 4 adjacent pixels form 15 octet
pgroups. If 12 bit samples are used, samples from 2 adjacent pixels pgroups. If 12 bit samples are used, samples from 2 adjacent pixels
form 9 octet pgroups. If 16 bits samples are used, each pixel forms a form 9 octet pgroups. If 16 bit samples are used, each pixel forms a
separate 6 octet pgroup. separate 6 octet pgroup.
For RGBA format video, samples are packed in order Red-Green-Blue- For RGBA format video, samples are packed in order Red-Green-Blue-
Alpha. For 8, 10, 12, or 16 bit samples, each pixel forms its own Alpha. For BGRA format video, samples are packed in order Blue-
pgroup, with octet sizes of 4, 5, 6 and 8 respectively. Green-Red-Alpha. For 8, 10, 12, or 16 bit samples, each pixel forms
its own pgroup, with octet sizes of 4, 5, 6 and 8 respectively.
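The pgroup sizes listed above follow from finding the smallest number of pixels whose samples occupy a whole number of octets. The sketch below reproduces that arithmetic; the helper name pgroup and its parameters are illustrative, and it assumes the smallest octet aligned group is always the one intended.

```python
from math import gcd


def pgroup(samples_per_unit, pixels_per_unit, depth):
    """Return (octets, pixels) for the smallest octet aligned pixel group.

    samples_per_unit: samples in the smallest set of pixels that share
                      samples (3 for RGB, 4 for RGBA, 6 for YCbCr 4:1:1)
    pixels_per_unit:  pixels covered by that set (1 for RGB and RGBA,
                      4 for YCbCr 4:1:1)
    depth:            bits per sample
    """
    bits = samples_per_unit * depth
    repeat = 8 // gcd(bits, 8)   # units needed to reach an octet boundary
    return bits * repeat // 8, pixels_per_unit * repeat


rgb_sizes = {d: pgroup(3, 1, d) for d in (8, 10, 12, 16)}
rgba_sizes = {d: pgroup(4, 1, d) for d in (8, 10, 12, 16)}
```

For 10 bit RGB this gives 15 octets covering 4 pixels, and for 12 bit RGB 9 octets covering 2 pixels, matching the values given in the text.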
If the video is in YCbCr format, the packing of samples into the If the video is in YCbCr format, the packing of samples into the
payload depends on the color sub-sampling used. payload depends on the color sub-sampling used.
For YCbCr 4:4:4 format video, samples are packed in order Cb-Y-Cr for For YCbCr 4:4:4 format video, samples are packed in order Cb-Y-Cr for
both interlaced and progressive frames. If 8 bit samples are used, both interlaced and progressive frames. If 8 bit samples are used,
the pgroup is 3 octets. If 10 bit samples are used, samples from 4 the pgroup is 3 octets. If 10 bit samples are used, samples from 4
adjacent pixels form 15 octet pgroups. If 12 bit samples are used, adjacent pixels form 15 octet pgroups. If 12 bit samples are used,
samples from 2 adjacent pixels form 9 octet pgroups. If 16 bit samples from 2 adjacent pixels form 9 octet pgroups. If 16 bit
samples are used, each pixel forms a separate 6 octet pgroup. samples are used, each pixel forms a separate 6 octet pgroup.
skipping to change at page 9, line 21 skipping to change at page 9, line 36
Y04-Y05-Y14-Y15-Cb02-Cr02 Y04-Y05-Y14-Y15-Cb02-Cr02
line 2/3: line 2/3:
Y20-Y21-Y30-Y31-Cb10-Cr10 Y22-Y23-Y32-Y33-Cb11-Cr11 Y20-Y21-Y30-Y31-Cb10-Cr10 Y22-Y23-Y32-Y33-Cb11-Cr11
Y24-Y25-Y34-Y35-Cb12-Cr12 Y24-Y25-Y34-Y35-Cb12-Cr12
line 4/5: line 4/5:
Y40-Y41-Y50-Y51-Cb20-Cr20 Y42-Y43-Y52-Y53-Cb21-Cr21 Y40-Y41-Y50-Y51-Cb20-Cr20 Y42-Y43-Y52-Y53-Cb21-Cr21
Y44-Y45-Y54-Y55-Cb22-Cr22 Y44-Y45-Y54-Y55-Cb22-Cr22
Figure 3: Packetization of progressive 4:2:0 YCbCr video
For interlaced transport chrominance samples are transported with For interlaced transport chrominance samples are transported with
every other line: every other line. The first set of chrominance samples may be
transported with either the first line of the field 0, or the first
line of field 1. The example below illustrates the transport of
chrominance samples starting with the first line of field 0.
field 0: field 0:
line 0: Y00-Y01-Cb00-Cr00 Y02-Y03-Cb01-Cr01 Y04-Y05-Cb02-Cr02 line 0: Y00-Y01-Cb00-Cr00 Y02-Y03-Cb01-Cr01 Y04-Y05-Cb02-Cr02
line 2: Y20-Y21 Y22-Y23 Y24-Y25 line 2: Y20-Y21 Y22-Y23 Y24-Y25
line 4: Y40-Y41-Cb20-Cr20 Y42-Y43-Cb21-Cr21 Y44-Y45-Cb22-Cr22 line 4: Y40-Y41-Cb20-Cr20 Y42-Y43-Cb21-Cr21 Y44-Y45-Cb22-Cr22
field 1: field 1:
line 1: Y10-Y11 Y12-Y13 Y14-Y15 line 1: Y10-Y11 Y12-Y13 Y14-Y15
line 3: Y30-Y31-Cb10-Cr10 Y32-Y33-Cb11-Cr11 Y34-Y35-Cb12-Cr12 line 3: Y30-Y31-Cb10-Cr10 Y32-Y33-Cb11-Cr11 Y34-Y35-Cb12-Cr12
line 5: Y50-Y51 Y52-Y53 Y54-Y55 line 5: Y50-Y51 Y52-Y53 Y54-Y55
5. RTCP Considerations Figure 4: Packetization of interlaced 4:2:0 YCbCr video with
top-field-first.
RTCP SHOULD be used as specified in RFC1889 [RTP], which specifies Chrominance values may be sampled with different offsets relative to
two limits on the RTCP packet rate: RTCP bandwidth should be limited luminance values. For instance, in Figure 2, chrominance values are
to 5% of the data rate, and the minimum for the average of the sampled at the same distance from neighboring luminance samples. It is
randomized intervals between RTCP packets should be 5 seconds. also possible for a chrominance sample to be co-sited with a
Considering the high data rate of many uncompressed video formats, luminance sample, as in Figure 5:
the minimum interval is the governing factor in many cases.
It should be noted that the sender's octet count in SR packets and line 0: Y00-C Y01 Y02-C Y03 Y04-C Y05
the cumulative number of packets lost will wrap around quickly for
high data rate streams. This means these two fields may not line 1: Y10 Y11 Y12 Y13 Y14 Y15
accurately represent octet count and number of packets lost since the
beginning of transmission, as defined in RFC 1889. Therefore for line 2: Y20-C Y21 Y22-C Y23 Y24-C Y25
network monitoring purposes other means of keeping track of these
variables SHOULD be used. line 3: Y30 Y31 Y32 Y33 Y34 Y35
line 4: Y40-C Y41 Y42-C Y43 Y44-C Y45
line 5: Y50 Y51 Y52 Y53 Y54 Y55
Figure 5: Co-sited video sampling in 4:2:0 YCbCr video where C
designates a CbCr pair
In general chrominance values may be placed in between luminance
samples or co-sited. Positions can be designated by an integer
numbering system starting from left to right and top to bottom. The
following position matrices apply for 4:2:0, 4:2:2 and 4:1:1 video:
line N: Y[0] [1] Y[2] Y[0] [1] Y[2]
[3] [4] Y[5] [3] [4] [5]
line N+1: Y[6] [7] Y[8] Y[6] [7] Y[8]
Figure 6: Chrominance position matrix for 4:2:0 YCbCr video
line N: Y[0] [1] Y[2] [3] Y[0] [1] Y[2] [3]
line N+1: Y[0] [1] Y[2] [3] Y[0] [1] Y[2] [3]
Figure 7: Chrominance position matrix for 4:2:2 YCbCr video
line N: Y[0] [1] Y[2] [3] Y[4] [5] Y[6]
line N+1: Y[0] [1] Y[2] [3] Y[4] [5] Y[6]
Figure 8: Chrominance position matrix for 4:1:1 YCbCr video
While these positions do not affect the packetization order of
chrominance and luminance samples, the information is needed for
interpolation prior to display and therefore should be signaled to
the receiver.
5. RTCP Considerations
RTCP SHOULD be used as specified in RFC3550 [RTP]. It is to be noted
that the sender's octet count in SR packets and the cumulative number
of packets lost will wrap around quickly for high data rate streams.
This means these two fields may not accurately represent octet count
and number of packets lost since the beginning of transmission, as
defined in RFC 3550. Therefore for network monitoring purposes other
means of keeping track of these variables SHOULD be used.
6. IANA Considerations 6. IANA Considerations
The IANA is requested to register one new MIME subtype and associated
RTP Payload Format, as described in the following.
6.1. MIME type registration 6.1. MIME type registration
MIME media type name: video MIME media type name: video
MIME subtype name: raw MIME subtype name: raw
Required parameters: Required parameters:
rate: The RTP timestamp clock rate. Applications using this payload rate: The RTP timestamp clock rate. Applications using this payload
format MUST use a value of 90000. format SHOULD use a value of 90000.
color-mode: Determines the color mode of the video stream.
Currently defined values are: RGB, RGBA, and YCbCr. New values may
be registered as described in section 6.2 of RFC XXXX.
sub-sampling: Determines the type of color sub-sampling of the sampling: Determines the color (sub-)sampling mode of the video
video stream. Currently defined values are: mono, 4:1:1, 4:2:0, stream. Currently defined values are RGB, RGBA, BGR, BGRA,
4:2:2, and 4:4:4. New values may be registered as described in YCbCr-4:4:4, YCbCr-4:2:2, YCbCr-4:2:0, and YCbCr-4:1:1. New values
section 6.2 of RFC XXXX. may be registered as described in section 6.2 of RFC XXXX.
width: Determines the number of pixels per line. This is an integer width: Determines the number of pixels per line. This is an integer
between 1 and 32767. between 1 and 32767.
height: Determines the number of lines per frame. This is an height: Determines the number of lines per frame. This is an
integer between 1 and 32767. integer between 1 and 32767.
depth: Determines the number of bits per samples. This is an depth: Determines the number of bits per sample. This is an integer
integer with typical values including 8, 10, 12, and 16. with typical values including 8, 10, 12, and 16.
colorimetry: This parameter defines the set of colorimetric colorimetry: This parameter defines the set of colorimetric
specifications and other transfer characteristics for the video specifications and other transfer characteristics for the video
source, by reference to an external specification. Valid values and source, by reference to an external specification. Valid values and
their specification are: their specification are:
BT601-5 ITU Recommendation BT.601-5 [601] BT601-5 ITU Recommendation BT.601-5 [601]
BT709-2 ITU Recommendation BT.709-2 [709] BT709-2 ITU Recommendation BT.709-2 [709]
SMPTE240M SMPTE standard 240M [240] SMPTE240M SMPTE standard 240M [240]
New values may be registered as described in section 6.2 of RFC New values may be registered as described in section 6.2 of RFC
XXXX. XXXX.
chroma-position: This parameter defines the position of chrominance
samples relative to luminance samples. It is either a single
integer or a comma separated pair of integers. Integer values
range from 0 to 8, as specified in Figures 6-8 of RFC XXXX. A
single integer implies that Cb and Cr are co-sited. A comma
separated pair of integers designates the locations of Cb and Cr
samples, respectively.
Optional parameters: Optional parameters:
Interlace: If this OPTIONAL parameter is present, it indicates that Interlace: If this OPTIONAL parameter is present, it indicates that
the video stream is interlaced. If absent, progressive scan is the video stream is interlaced. If absent, progressive scan is
implied. implied.
Top-field-first: If this OPTIONAL parameter is present, it
indicates that chrominance samples are packetized starting with the
first line of field 0. Its absence implies that chrominance samples
are packetized starting with the first line of field 1.
Encoding considerations: Encoding considerations:
Uncompressed video can be transmitted with RTP as specified in RFC Uncompressed video can be transmitted with RTP as specified in RFC
XXXX. No file format is defined at this time. XXXX. No file format is defined at this time.
Security considerations: See section 9 of RFC XXXX. Security considerations: See section 9 of RFC XXXX.
Interoperability considerations: NONE. Interoperability considerations: NONE.
Published specification: RFC XXXX. Published specification: RFC XXXX.
skipping to change at page 12, line 13 skipping to change at page 14, line 18
available specification of colorimetric parameters and other available specification of colorimetric parameters and other
applicable transfer characteristics (the Specification Required applicable transfer characteristics (the Specification Required
policy of RFC 2434 [2434]). policy of RFC 2434 [2434]).
7. Mapping to SDP Parameters 7. Mapping to SDP Parameters
Parameters are mapped to SDP [SDP] as in the following example: Parameters are mapped to SDP [SDP] as in the following example:
m=video 30000 RTP/AVP 112 m=video 30000 RTP/AVP 112
a=rtpmap:112 raw/90000 a=rtpmap:112 raw/90000
a=fmtp:112 sampling=YUV-4:2:2; width=1280; height=720; depth=10; a=fmtp:112 sampling=YCbCr-4:2:2; width=1280; height=720; depth=10;
colorimetry=BT709-2 colorimetry=BT709-2; chroma-position=1
In this example, a dynamic payload type 112 is used for uncompressed In this example, a dynamic payload type 112 is used for uncompressed
video. The RTP sampling clock is 90kHz. Note that the "a=fmtp:" line video. The RTP sampling clock is 90kHz. Note that the "a=fmtp:" line
has been wrapped to fit this page, and will be a single long line in has been wrapped to fit this page, and will be a single long line in
the SDP file. the SDP file.
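A receiver might split the "a=fmtp:" parameters as in the sketch below. This is a minimal illustration, not a complete SDP parser: the function names are hypothetical, parameters without a value (such as Interlace) are assumed to appear as bare names, and the colorimetry value used is the token registered in section 6.1.

```python
def parse_fmtp(value):
    """Split the parameter part of an a=fmtp line into a dictionary.

    Parameters are assumed to be separated by ';'. A parameter with no
    '=' (assumed form of flags such as Interlace) maps to True.
    """
    params = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, sep, val = item.partition("=")
        params[name.strip()] = val.strip() if sep else True
    return params


def parse_chroma_position(text):
    """Return (cb, cr) positions; a single integer means Cb and Cr are
    co-sited, a comma separated pair gives Cb then Cr."""
    parts = [int(p) for p in text.split(",")]
    if len(parts) == 1:
        return parts[0], parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("chroma-position takes one or two integers")


fmtp = ("sampling=YCbCr-4:2:2; width=1280; height=720; depth=10; "
        "colorimetry=BT709-2; chroma-position=1")   # token from section 6.1
params = parse_fmtp(fmtp)
width, height, depth = (int(params[k]) for k in ("width", "height", "depth"))
chroma = parse_chroma_position(params["chroma-position"])
```

Values are kept as strings until the application converts them, so that unknown parameters registered later can be passed through unchanged.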
8. Security Considerations 8. Security Considerations
RTP packets using the payload format defined in this specification RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP are subject to the security considerations discussed in the RTP
skipping to change at page 12, line 39 skipping to change at page 14, line 44
the receiver side computational complexity for packet processing to the receiver side computational complexity for packet processing to
cause a potential denial-of-service threat. cause a potential denial-of-service threat.
It is important to note that uncompressed video can have immense It is important to note that uncompressed video can have immense
bandwidth requirements (up to 270 Mbps for standard definition video, bandwidth requirements (up to 270 Mbps for standard definition video,
and approximately 1 Gbps for high definition video). This is and approximately 1 Gbps for high definition video). This is
sufficient to cause potential for denial-of-service if transmitted sufficient to cause potential for denial-of-service if transmitted
onto most currently available Internet paths. onto most currently available Internet paths.
Accordingly, if best-effort service is being used, users of this Accordingly, if best-effort service is being used, users of this
payload format SHOULD monitor packet loss to ensure that the packet payload format MUST monitor packet loss to ensure that the packet
loss rate is within acceptable parameters. Packet loss is considered loss rate is within acceptable parameters. Packet loss is considered
acceptable if a TCP flow across the same network path, and acceptable if a TCP flow across the same network path, and
experiencing the same network conditions, would achieve an average experiencing the same network conditions, would achieve an average
throughput, measured on a reasonable timescale, that is not less than throughput, measured on a reasonable timescale, that is not less than
the RTP flow is achieving. This condition can be satisfied by the RTP flow is achieving. This condition can be satisfied by
implementing congestion control mechanisms to adapt the transmission implementing congestion control mechanisms to adapt the transmission
rate (or the number of layers subscribed for a layered multicast rate (or the number of layers subscribed for a layered multicast
session), or by arranging for a receiver to leave the session if the session), or by arranging for a receiver to leave the session if the
loss rate is unacceptably high. loss rate is unacceptably high.
skipping to change at page 13, line 18 skipping to change at page 15, line 22
was requested is actually being delivered. If it is not, then they was requested is actually being delivered. If it is not, then they
SHOULD assume that they are receiving best-effort service and behave SHOULD assume that they are receiving best-effort service and behave
accordingly. accordingly.
9. Relation to RFC 2431 9. Relation to RFC 2431
In comparison with RFC 2431 this memo specifies support for a wider In comparison with RFC 2431 this memo specifies support for a wider
variety of uncompressed video, in terms of frame size, color sub- variety of uncompressed video, in terms of frame size, color sub-
sampling and sample sizes. While [BT656] can transport up to 4096 sampling and sample sizes. While [BT656] can transport up to 4096
scan lines and 2048 pixels per line, our payload type can support up scan lines and 2048 pixels per line, our payload type can support up
to 64k scan lines and pixels per line. Also, RFC 2431 only addresses to 32768 scan lines and pixels per line. Also, RFC 2431 only addresses
4:2:2 YCbCr data, while this memo covers YCbCr and RGB and most 4:2:2 YCbCr data, while this memo covers YCbCr, RGB, RGBA, BGR and
common color sub-sampling schemes. Given the variety of video types BGRA and most common color sub-sampling schemes. Given the variety of
that we cover, this memo also assumes out-of-band signaling for video types that we cover, this memo also assumes out-of-band
sample size and data types (RFC 2431 uses in band signaling). signaling for sample size and data types (RFC 2431 uses in band
signaling).
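
The limits quoted in this section are all powers of two, so each one corresponds to a fixed number of bits for a line or pixel index. The following is a minimal sketch of that arithmetic, not text from either specification. The helper name index_bits is hypothetical, "64k" is assumed to mean 65536, and the bit counts are inferred from the stated limits rather than quoted from a header layout.

```python
def index_bits(count: int) -> int:
    """Return the number of bits needed to index `count` distinct values."""
    if count < 1:
        raise ValueError("count must be positive")
    return (count - 1).bit_length()

# Limits quoted in the text above, as (scan lines, pixels per line).
# The 65536 entry assumes that "64k" means 64 * 1024.
LIMITS = {
    "RFC 2431": (4096, 2048),
    "this memo, 64k limit": (65536, 65536),
    "this memo, 32768 limit": (32768, 32768),
}

for name, (lines, pixels) in LIMITS.items():
    print(f"{name}: {index_bits(lines)} bits for lines, "
          f"{index_bits(pixels)} bits for pixels")
```

Under these assumptions, moving from the 64k limit to 32768 is a reduction of one bit in each index.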
10. Relation to RFC 3497 10. Relation to RFC 3497
RFC 3497 [292RTP] specifies an RTP payload format for encapsulating RFC 3497 [292RTP] specifies an RTP payload format for encapsulating
SMPTE 292M video. The SMPTE 292M standard defines a bit-serial SMPTE 292M video. The SMPTE 292M standard defines a bit-serial
digital interface for local area High Definition Television (HDTV) digital interface for local area High Definition Television (HDTV)
transport. As a transport medium, SMPTE 292M utilizes 10 bit words transport. As a transport medium, SMPTE 292M utilizes 10 bit words
and a fixed 1.485Gbps (and 1.485/1.001Gbps) data rate. SMPTE 292M is and a fixed 1.485Gbps (and 1.485/1.001Gbps) data rate. SMPTE 292M is
typically used in the broadcast industry for the transport of other typically used in the broadcast industry for the transport of other
video formats such as SMPTE 260M, SMPTE 295M, SMPTE 274M and SMPTE video formats such as SMPTE 260M, SMPTE 295M, SMPTE 274M and SMPTE
296M. 296M.
RFC 3497 defines a circuit emulation for the transport of SMPTE 292M RFC 3497 defines a circuit emulation for the transport of SMPTE 292M
over RTP. It is very specific to SMPTE 292 and has been designed to over RTP. It is very specific to SMPTE 292 and has been designed to
be interoperable with existing broadcast equipment with a constant be interoperable with existing broadcast equipment with a constant
rate of 1.485Gbps. rate of 1.485Gbps.
RFC XXXX, defines a flexible native packetization scheme which can This memo defines a flexible native packetization scheme which can
packetize any uncompressed video, at varying data rates. In addition, packetize any uncompressed video, at varying data rates. In addition,
unlike RFC 3497, RFC XXXX only transports active video pixels (i.e. unlike RFC 3497, this memo only transports active video pixels (i.e.
horizontal and vertical blanking are not transported). horizontal and vertical blanking are not transported).
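
To illustrate the difference between a fixed-rate circuit emulation and a format that carries only active pixels, the sketch below compares the nominal SMPTE 292M rate with the raw active-video rate of one example format. The example format (1920x1080 active pixels, 30 frames per second, 4:2:2 sampling with 10-bit samples, which averages 20 bits per pixel) is an assumption chosen for illustration. The function name is hypothetical, and RTP, UDP and IP header overhead is ignored.

```python
# Nominal SMPTE 292M rates quoted in the text above, in bits per second.
SMPTE_292M_RATE_BPS = 1.485e9
SMPTE_292M_RATE_1001_BPS = 1.485e9 / 1.001


def active_video_rate_bps(width: int, height: int,
                          bits_per_pixel: int, frames_per_second: int) -> int:
    """Raw bit rate of the active picture area, excluding blanking
    and excluding any packet header overhead."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed example: 1920x1080, 4:2:2 with 10-bit samples (20 bits per
# pixel on average), 30 frames per second.
example_rate = active_video_rate_bps(1920, 1080, 20, 30)

print(f"292M nominal rate:    {SMPTE_292M_RATE_BPS / 1e9:.3f} Gbps")
print(f"292M rate / 1.001:    {SMPTE_292M_RATE_1001_BPS / 1e9:.3f} Gbps")
print(f"active video example: {example_rate / 1e9:.3f} Gbps")
```

For this assumed format the active-video rate comes out below the fixed 292M rate, because blanking intervals are not carried.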
11. Full Copyright Statement 11. Full Copyright Statement
Copyright (C) The Internet Society (2003). All Rights Reserved. Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any and distributed, in whole or in part, without restriction of any
skipping to change at page 14, line 30 skipping to change at page 16, line 36
This document and the information contained herein is provided on an This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE." MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."
12. Acknowledgments 12. Acknowledgments
The authors are grateful to Philippe Gentric and Chuck Harrison for The authors are grateful to Philippe Gentric, Chuck Harrison, Stephan
their feedback. Wenger and Dave Singer for their feedback.
13. Authors' Addresses This work is based upon work supported by the National Science
Foundation (NSF) under Grant No. 0230738. Any opinions, findings and
conclusions or recommendations expressed in this material are those
of the authors and do not necessarily reflect the views of NSF.
13. Authors' Addresses
Ladan Gharai <ladan@isi.edu> Ladan Gharai <ladan@isi.edu>
USC Information Sciences Institute USC Information Sciences Institute
3811 N. Fairfax Drive, #200 3811 N. Fairfax Drive, #200
Arlington, VA 22203 Arlington, VA 22203
USA USA
Colin Perkins <csp@csperkins.org> Colin Perkins <csp@csperkins.org>
University of Glasgow University of Glasgow
Department of Computing Science Department of Computing Science
17 Lilybank Gardens 17 Lilybank Gardens
Glasgow G12 8QQ Glasgow G12 8QQ
United Kingdom United Kingdom
Normative References Normative References
[RTP] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, [RTP] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson,
"RTP: A Transport Protocol for Real-Time Applications", "RTP: A Transport Protocol for Real-Time Applications",
Internet Engineering Task Force, RFC 1889, January 1996. Internet Engineering Task Force, RFC 3550, July 2003.
[2119] S. Bradner, "Key words for use in RFCs to Indicate [2119] S. Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119. Requirement Levels", RFC 2119.
[2434] T. Narten and H. Alvestrand, "Guidelines for Writing an IANA [2434] T. Narten and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", RFC 2434, October 1998. Considerations Section in RFCs", RFC 2434, October 1998.
Informative References Informative References
[274] Society of Motion Picture and Television Engineers, [274] Society of Motion Picture and Television Engineers,
"1920x1080 Scanning and Analog and Parallel Digital "1920x1080 Scanning and Analog and Parallel Digital
Interfaces Interfaces for Multiple Picture Rates", SMPTE 274M-1998.
for Multiple Picture Rates", SMPTE 274M-1998.
[268] Society of Motion Picture and Television Engineers, [268] Society of Motion Picture and Television Engineers,
"File Format for Digital Moving Picture Exchange (DPX)", "File Format for Digital Moving Picture Exchange (DPX)",
SMPTE 268M-1994. (Currently under revision.) SMPTE 268M-1994. (Currently under revision.)
[296] Society of Motion Picture and Television Engineers, [296] Society of Motion Picture and Television Engineers,
"1280x720 Scanning, Analog and Digital Representation and "1280x720 Scanning, Analog and Digital Representation and
Analog Interfaces", SMPTE 296M-1998. Analog Interfaces", SMPTE 296M-1998.
[372] Society of Motion Picture and Television Engineers, [372] Society of Motion Picture and Television Engineers,
 End of changes. 38 change blocks. 
82 lines changed or deleted 149 lines changed or added