Internet-Draft | RTCP Messages for Point Cloud Prioritiza | September 2024 |
Engelbart, et al. | Expires 29 March 2025 | [Page] |
This document specifies RTCP messages and RTP header extensions for exchanging parameters of real-time streamed point clouds. A sender can notify receivers of the currently selected regions and their parameters, such as the respective resolutions and included point attributes. A receiver can request updates to the same parameters using RTCP feedback messages.¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://mengelbart.github.io/draft-engelbart-avtcore-rtcp-point-cloud-roi/draft-engelbart-avtcore-rtcp-point-cloud-roi.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-engelbart-avtcore-rtcp-point-cloud-roi/.¶
Discussion of this document takes place on the Audio/Video Transport Core Maintenance Working Group mailing list (mailto:[email protected]), which is archived at https://mailarchive.ietf.org/arch/browse/avt/. Subscribe at https://www.ietf.org/mailman/listinfo/avt/.¶
Source for this draft and an issue tracker can be found at https://github.com/mengelbart/draft-engelbart-avtcore-rtcp-point-cloud-roi.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 29 March 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
A point cloud is a set of data points in a three-dimensional coordinate system where each point is defined by its three coordinates. Point clouds can represent three-dimensional environments, such as a vehicle's surroundings. Each point in a point cloud may optionally be associated with additional attributes. Attributes can, for example, be colors or reflectance of objects in the scene. Sequences of point clouds can be generated by sensors such as Lidar ("light detection and ranging"), Radar ("radio detection and ranging"), or a multi-camera setup.¶
Due to the high number of points in a scene, the bandwidth requirements of transmitting point clouds over a network are often higher than those of streaming video. In video streaming, efficient codecs are used to reduce the bandwidth requirement. Similar codecs are being developed for point clouds. However, when streaming point cloud data, consumers of the data usually aren't equally interested in each point. For further processing, focusing on some regions of a point cloud stream is often sufficient. The producer of a point cloud sequence can therefore filter out points of lower interest to further reduce the bandwidth requirements and prioritize bandwidth usage for points of higher importance.¶
When selecting the regions to prioritize and deciding which points can be excluded from transmission, producers and consumers need a mechanism to a) signal which regions of a scene are currently being prioritized and b) request updates to prioritize different regions in the future. This document describes such a mechanism for real-time transmission of point clouds building on the RTP Control Protocol (RTCP), which is part of the Real-time Transport Protocol (RTP). Additionally, this document provides RTP header extensions for senders to inform receivers about the currently applied parameters. The information might be useful for receivers in determining if an RTCP message requesting an update was received and acted upon by the sender.¶
In Section 4, this document first defines a set of shared encoding schemes to represent point cloud parameters. Section 5 and Section 6 define RTCP messages and RTP header extensions, respectively. The RTCP messages and RTP header extensions reference the encoding scheme in Section 4. Section 7 defines the SDP parameters required to negotiate the usage of the presented RTCP messages and header extensions.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
TODO: Add definitions or remove section¶
This section introduces encodings for point cloud regions and their parameters, such as the included attributes and the level of detail. It also provides an encoding for the definition of a viewport, including a dynamically adaptable precision.¶
The encodings described in the following subsections are reused in the RTCP feedback messages and RTP header extensions described in Section 5 and Section 6.¶
This section defines an encoding that is reused in RTCP message types to index regions in a point cloud. The encoding uses a recursive octree encoding to signal the presence of certain regions of a point cloud. The encoding can be used to set parameters for the present regions, e.g., receivers can signal individual priority of every region or sets of attributes to include in every region.¶
The root node in every tree is a single byte where every bit indicates whether a child node of an octant is present. Child nodes are appended in pre-order traversal order. A zero-byte encodes a leaf node. The mapping from bits to octants, given by the signs of the X-, Y-, and Z-coordinates of the points in each octant, is shown in Table 1.¶
Regions in the octree are considered present when a leaf node encoding that region is present. For example, a single zero-byte encodes a single leaf node, and the root region covering the complete space is present. An encoded value of the two bytes 0x4000 would indicate that only the octant with X<0, Y>=0, and Z>=0 is present: the root byte 0x40 sets bit 1 of Table 1, counting from the most significant bit, and the following zero-byte marks that octant as a leaf.¶
Bit | X | Y | Z |
---|---|---|---|
0 | + | + | + |
1 | - | + | + |
2 | - | - | + |
3 | + | - | + |
4 | + | + | - |
5 | - | + | - |
6 | - | - | - |
7 | + | - | - |
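The pre-order layout described above can be illustrated with a minimal decoding sketch. It assumes that bit 0 of Table 1 is the most significant bit of each node byte, which is inferred from the 0x4000 example rather than stated normatively. The names OCTANT_SIGNS and decode_octree are hypothetical and not part of this document.

```python
# Octant signs from Table 1, indexed by bit number.
# Assumption: bit 0 is the most significant bit (0x80) of a node byte.
OCTANT_SIGNS = [
    ("+", "+", "+"), ("-", "+", "+"), ("-", "-", "+"), ("+", "-", "+"),
    ("+", "+", "-"), ("-", "+", "-"), ("-", "-", "-"), ("+", "-", "-"),
]


def decode_octree(data: bytes):
    """Decode a pre-order octree encoding.

    Returns (leaves, consumed). Each leaf is a tuple of bit numbers that
    describes the path from the root to a present region. 'consumed' is
    the number of bytes read, so a caller can locate whatever follows.
    """
    leaves = []
    pos = 0

    def visit(path):
        nonlocal pos
        if pos >= len(data):
            raise ValueError("truncated octree encoding")
        node = data[pos]
        pos += 1
        if node == 0:
            # A zero-byte is a leaf: the region at 'path' is present.
            leaves.append(path)
            return
        for bit in range(8):
            if node & (0x80 >> bit):
                visit(path + (bit,))

    visit(())
    return leaves, pos


if __name__ == "__main__":
    leaves, used = decode_octree(bytes.fromhex("4000"))
    for path in leaves:
        print(path, [OCTANT_SIGNS[bit] for bit in path], used)
```

An empty path denotes the root region, so a single zero-byte decodes to one leaf that covers the complete space.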
The octree encoding can be used in absolute and relative forms. In the absolute form, the bounding box of the octree is implicitly defined by the space covered by the point cloud-generating source. In the relative form, the bounding box is explicitly defined by setting the min and max values of the X-, Y-, and Z-coordinates. The coordinates precede the octree encoding as four-byte integers each, as shown in Figure 1.¶
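As a rough illustration of the relative form, the sketch below writes six four-byte integers in front of an octree encoding. The field order (all minimum values, then all maximum values), the use of signed integers, and network byte order are assumptions made for this example only; Figure 1 is authoritative. The function names are hypothetical.

```python
import struct

# Assumed layout: min X, min Y, min Z, max X, max Y, max Z as signed
# 32-bit integers in network byte order. Figure 1 defines the real layout.
BOUNDS_FORMAT = "!6i"
BOUNDS_SIZE = struct.calcsize(BOUNDS_FORMAT)  # 24 bytes


def encode_relative_octree(minimum, maximum, octree: bytes) -> bytes:
    """Prefix an octree encoding with an explicit bounding box."""
    return struct.pack(BOUNDS_FORMAT, *minimum, *maximum) + octree


def decode_relative_octree(data: bytes):
    """Split a relative encoding into (minimum, maximum, octree bytes)."""
    if len(data) < BOUNDS_SIZE:
        raise ValueError("relative octree encoding too short")
    values = struct.unpack(BOUNDS_FORMAT, data[:BOUNDS_SIZE])
    return values[:3], values[3:], data[BOUNDS_SIZE:]
```

Under these assumptions, a relative encoding is always 24 bytes longer than the absolute encoding of the same tree.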
Attributes are additional values optionally associated with every point in a point cloud. The available attributes depend on the context of an application and thus need to be configured out-of-band. The attribute encoding described in this section requires mapping attributes to a bitmask value. Section 7 describes one option to negotiate the mapping when using SDP.¶
To signal the presence of attributes per region, senders and receivers can use the octree encoding presented in Section 4.1. For every region present in the octree encoding, N bytes are used to indicate the presence of attributes by setting the bits of the bitmask of the respective attributes to one. The size of N depends on the number of attributes and is implicitly given by the smallest number of bytes that can represent the largest bitmask negotiated during signaling.¶
Bitmask values can be shared among attributes to allow signaling of attribute sets that always occur in combination.¶
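The derivation of N and the per-region bitmask can be sketched as follows. The attribute names and bitmask values are examples only, and the big-endian byte order for masks wider than one byte is an assumption, since this document does not state one. The function names are hypothetical.

```python
def mask_width(attribute_masks: dict) -> int:
    """Smallest number of bytes that can hold the largest negotiated mask."""
    largest = max(attribute_masks.values())
    return max(1, (largest.bit_length() + 7) // 8)


def encode_attributes(per_region, attribute_masks: dict) -> bytes:
    """Encode one N-byte bitmask per region, in octree leaf order.

    'per_region' is a list with one collection of attribute names per leaf.
    Assumption: multi-byte masks are written in big-endian order.
    """
    width = mask_width(attribute_masks)
    out = bytearray()
    for names in per_region:
        value = 0
        for name in names:
            value |= attribute_masks[name]
        out += value.to_bytes(width, "big")
    return bytes(out)


if __name__ == "__main__":
    masks = {"color": 0x01, "reflectance": 0x02}  # example mapping
    print(encode_attributes([["color", "reflectance"], ["color"]], masks).hex())
```

Two attributes that share a bitmask value are simply requested together, which matches the shared-value rule above.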
This document describes RTCP message types that can be used to signal interest in the prioritization of regions and their parameters. Receivers can signal an interest in receiving a region with a higher priority. Different regions can be requested in different resolutions or with different sets of attributes included. Additionally, receivers can request a viewport precision update.¶
A sender can acknowledge a parameter update using the RTP header extensions described in Section 6. To avoid unnecessary overhead, receivers should not retransmit the same request multiple times. If the receiver can assume that a request was lost, it may retransmit the request. The request should not be retransmitted earlier than one RTT after the first request was transmitted.¶
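The retransmission guidance can be expressed as a small receiver-side check. This is only a sketch: how a receiver estimates the RTT and how it decides that a request was lost are left open by this document, and the function and parameter names are hypothetical.

```python
def may_retransmit(first_sent_at: float, now: float, rtt: float,
                   acknowledged: bool) -> bool:
    """Return True if a region request may be sent again.

    All times are in seconds. 'acknowledged' is True once a header
    extension reflecting the request has been observed.
    """
    if acknowledged:
        # The sender has already reported the applied parameters.
        return False
    # Wait for one RTT after the first transmission before assuming loss.
    return now - first_sent_at >= rtt
```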
The following sections define the available message types in detail. All RTCP feedback messages use the common header format shown in Figure 2. The first eight bytes of the header follow the format of RTCP message headers defined in [RFC3550]. The common header is followed by an additional byte consisting of flags describing the payload.¶
The payload of the RTCP messages following the header contains one or more of the parameter encodings in the following order:¶
Absolute/Relative Octree Encoding¶
(optional) Priority¶
(optional) Attribute Encoding¶
(optional) Level-of-Detail Encoding¶
Note: alternative form:
<Header><{abs,rel}-octree>[<priority>][<attributes>][<level-of-detail>]
¶
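The ordering above can be sketched as a simple concatenation. The example builds only the payload that follows the header and leaves out the flags byte, because the bit positions are defined in Figure 2. It also assumes that each optional part is a contiguous block covering all leaves rather than being interleaved per leaf. The function name build_payload is hypothetical.

```python
def build_payload(octree: bytes, leaf_count: int, priorities=None,
                  attributes: bytes = None,
                  level_of_detail: bytes = None) -> bytes:
    """Concatenate the parameter encodings in the order listed above.

    'octree' is an absolute or relative octree encoding. 'priorities' is a
    list with one value (0..255) per leaf. 'attributes' and
    'level_of_detail' are pre-encoded blocks covering every leaf.
    """
    parts = [octree]
    if priorities is not None:
        if len(priorities) != leaf_count:
            raise ValueError("need exactly one priority byte per leaf")
        parts.append(bytes(priorities))
    if attributes is not None:
        if len(attributes) % leaf_count != 0:
            raise ValueError("attribute block must cover every leaf")
        parts.append(attributes)
    if level_of_detail is not None:
        parts.append(level_of_detail)
    return b"".join(parts)


if __name__ == "__main__":
    # One leaf (octree 0x4000), priority 200, attribute bitmask 0x03.
    print(build_payload(bytes.fromhex("4000"), 1, [200], b"\x03").hex())
```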
TODO: In addition to the encodings already described here, one could come up with additional parameter requests, such as viewport precision, sender position, and possibly others.¶
The semantics of the different payload formats are explained in the following subsections.¶
The fields V, P, SSRC and length are used as defined in the RTP specification [RFC3550]. The respective meaning is summarized below:¶
This field identifies the RTP version. The current version is 2.¶
If set, the padding bit indicates that the packet contains additional padding octets at the end that are not part of the control information but are included in the length field.¶
The feedback message type is XX (TODO: Use correct value).¶
The RTCP packet type is PSFB (206).¶
This field distinguishes between absolute and relative region requests (see Section 4.1). If the bit is set, the message contains a relative octree encoding and the minimum and maximum fields described in Figure 1 are present.¶
If set, the RTCP message contains a priority request and the octree encoding is followed by one byte indicating the requested priority for each region encoded as a leaf node in the octree encoding.¶
If this bit is set, the RTCP message contains an attribute request and the octree encoding is followed by an attribute encoding for every region encoded as a leaf node in the octree encoding.¶
If this bit is set, the RTCP message contains a level-of-detail request and the octree encoding (or the attribute encoding, if it is present) is followed by a level-of-detail encoding for every region encoded as a leaf node in the octree encoding.¶
A receiver can use region of interest signaling to indicate interest in some areas of a point cloud, and a sender can react by prioritizing the regions of interest when allocating bandwidth.¶
The region of interest request uses the standard header defined in Figure 2.¶
Receivers can append a priority encoding to the octree to signal fine-grained priorities per region. A priority is a value encoded as a single byte where a higher value indicates a higher priority. A priority request contains a priority byte for every region encoded as a leaf node in the preceding octree encoding.¶
By setting the header Flag A to 1, the receiver can include an attribute encoding as described in Section 4.2 in the feedback message to request attribute sets for every region encoded as a leaf node of the octree. The attribute encoding length depends on the number of negotiated attributes and their associated bitmask values. A single byte can indicate up to eight attributes or attribute sets. For example, if the two attributes color and reflectance are negotiated with bitmask values of 0x01 and 0x02, one byte is appended to the octree for every leaf node, and the bits 0x01 and 0x02 are set to one in the byte of every region that should include these attributes.¶
TODO: Using the same mechanism as for attributes, we can signal resolution per region, but we need to define how to express resolution.¶
RTP senders use RTP header extensions to acknowledge the applied parameters to the receiver. The parameters chosen by the sender may differ from the ones requested by the receiver. The header extension uses the two-byte header form defined in [RFC8285]. The payload of the header extension element uses the same format as the RTCP messages defined in Section 5. Each extension element begins with the same one-byte header (Figure 4) followed by an octree encoding and optional priority, attribute, or level-of-detail encodings.¶
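To illustrate the framing, the sketch below wraps an already encoded payload in a single two-byte-header extension element as defined in [RFC8285]: an 8-bit ID, an 8-bit length that counts the data bytes, and zero padding to a 32-bit boundary. The extension ID, the payload bytes, and the function name are hypothetical, and the payload is treated as opaque because the layout of its one-byte header is given in Figure 4.

```python
import struct

# The 16-bit "defined by profile" value for the two-byte header form is
# 0x100 in the upper 12 bits followed by 4 appbits (zero here).
TWO_BYTE_PROFILE = 0x1000


def build_extension_block(ext_id: int, payload: bytes) -> bytes:
    """Build an RTP header extension block with one two-byte-header element."""
    if not 1 <= ext_id <= 255:
        raise ValueError("extension ID must be in the range 1..255")
    if len(payload) > 255:
        raise ValueError("element data is limited to 255 bytes")
    element = bytes([ext_id, len(payload)]) + payload
    # Pad with zero bytes so the extension data is a multiple of 32 bits.
    element += b"\x00" * (-len(element) % 4)
    # The length field counts 32-bit words after the 4-byte block header.
    return struct.pack("!HH", TWO_BYTE_PROFILE, len(element) // 4) + element


if __name__ == "__main__":
    # Hypothetical payload: an all-zero flags byte followed by octree 0x4000.
    print(build_extension_block(1, bytes.fromhex("004000")).hex())
```

The 255-byte limit on element data bounds the size of the octree and parameter encodings that fit into a single element.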
This section defines SDP parameters for negotiating usage of the RTCP messages and RTP header extension described in this document.¶
The rtcp-fb attribute is extended to indicate the capability to send or receive the RTCP feedback defined in this document. This document adds a new parameter "oerr" to the "ccm" feedback value defined in [RFC5104]. The "oerr" (Octree Encoded Region Request) parameter indicates support for the RTCP feedback message defined in Section 5.¶
rtcp-fb-ccm-param =/ SP "oerr" ; Octree Encoded Region Request¶
The URI for indicating support for the RTP Header extension described in Section 6 is "urn:ietf:params:rtp-hdrext:octree-region".¶
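The two SDP attributes could appear in a media description as generated by the following sketch. The payload type 96 and the extension ID 1 are arbitrary example values, and the function name sdp_lines is hypothetical. The line syntax follows the rtcp-fb grammar extended above and the extmap attribute of [RFC8285].

```python
def sdp_lines(payload_type: int, extmap_id: int) -> list:
    """Return the SDP attribute lines that offer both mechanisms."""
    return [
        # "oerr" as a parameter of the "ccm" feedback value.
        f"a=rtcp-fb:{payload_type} ccm oerr",
        # Maps a local extension ID to the header extension URI.
        f"a=extmap:{extmap_id} urn:ietf:params:rtp-hdrext:octree-region",
    ]


if __name__ == "__main__":
    print("\r\n".join(sdp_lines(96, 1)))
```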
The following value is requested to be registered as an FMT value in the "FMT Values for PSFB Payload Types" registry:¶
Note to RFC Editor: This section may be removed after carrying out all the instructions of this section.¶
TODO: Consider¶
TODO acknowledge.¶