空 挡 广 告 位 | 空 挡 广 告 位

Apple Patent | Block-based predictive coding for point cloud compression

Patent: Block-based predictive coding for point cloud compression

Drawings: Click to check drawins

Publication Number: 20210105493

Publication Date: 20210408

Applicant: Apple

Assignee: Apple Inc.

Abstract

An encoder is configured to compress point cloud information using a blocks of nodes determined from a prediction tree. A prediction tree is generated for a point cloud. Segments of the prediction tree are identified. The segments are divided into blocks that are predicted by predecessor blocks within the segments. The blocks of the prediction tree may then be encoded and may be provided for transmission to a decoder that can regenerate the point cloud from the blocks of the prediction tree.

Claims

  1. One or more non-transitory, computer-readable storage media, storing program instructions that when executed on or across one or more computing devices cause the one or more computing devices to: generate a prediction tree comprising a plurality of nodes that correspond to a plurality of points that make up a point cloud, wherein respective ones of the points comprise spatial information for the point; identify different segments of the prediction tree according to a graph traversal technique; divide the segments into respective blocks, wherein node values of nodes included in the respective blocks are predicted based on node values of nodes included in one or more predecessor blocks in the segment; encode the blocks of the prediction tree to encode the point cloud; and send or store the encoded blocks for the point cloud.

  2. The one or more non-transitory, computer-readable storage media of claim 1, wherein, to encode the blocks of the prediction tree, the program instructions cause the one or more computing devices to further: apply a transform to residual values of the blocks of the prediction tree; and apply quantization to coefficients generated from the transformed residual values of the blocks of the prediction tree.

  3. The one or more non-transitory, computer-readable storage media of claim 1, wherein, to identify different segments of the prediction tree according to the graph traversal technique, the program instructions cause the one or more computing devices to: traverse nodes of the prediction tree from a root node of the prediction tree to a first leaf node of the prediction tree to identify a first segment of the segments; and iteratively traverse other nodes of the prediction tree from next nodes identified according to the graph traversal technique to other leaf nodes until remaining nodes in the prediction tree are included in another one of the segments.

  4. The one or more non-transitory, computer-readable storage media of claim 1, wherein a respective prediction technique for predicting node values of one of the blocks is different than a respective prediction technique for predicting node values of another one of the blocks.

  5. The one or more non-transitory, computer-readable storage media of claim 1, wherein respective prediction techniques to be used to predict node values for the nodes of the prediction tree are signaled at a block level.

  6. The one or more non-transitory, computer-readable storage media of claim 1, wherein different prediction techniques are signaled for different nodes included in a same block.

  7. The one or more non-transitory, computer-readable storage media of claim 6, wherein the program instructions cause the one or more computing devices to: apply a transform selected for a block to residual values calculated based on predictions that use different prediction techniques for two or more nodes of the block; and signal the transform selected for the block at a block-level.

  8. The one or more non-transitory, computer-readable storage media of claim 7, wherein: a first transform is signaled at the block-level for decompressing attribute values of the nodes; and a different transform is signaled at the block-level for decompressing geometry values of the nodes.

  9. The one or more non-transitory, computer-readable storage media of claim 6, wherein a quantization parameter to be applied to coefficients resulting from the transform being applied to the residual values is signaled at a block-level.

  10. The one or more non-transitory, computer-readable storage media of claim 1, wherein the program instructions cause the one or more computing devices to: apply a transform to residual attribute values for nodes of a block, wherein the transform transforms the residual attribute values of the nodes of the block into transfer function coefficients, and wherein the transform determines the transform function coefficients based on: relationships between the block and other blocks of the prediction tree; and geometry relationships between the nodes of the block in a geometry of the point cloud.

  11. The one or more non-transitory, computer-readable storage media of claim 1, wherein the program instructions cause the one or more computing devices to: perform a rate distortion optimization (RDO) analysis to determine a number of nodes to be included in the blocks.

  12. The one or more non-transitory, computer-readable storage media of claim 1, wherein the program instructions cause the one or more computing devices to: apply a transform to residual attribute values for nodes of a block, wherein the transform is selected from a set of supported transforms, comprising: a one-dimensional discrete cosine transform; a wavelet transform; a discrete wavelet transform; a Haar transform; a Hadamard transform; a graph transform; or a lifting scheme.

  13. The one or more non-transitory, computer-readable storage media of claim 1, wherein the program instructions cause the one or more computing devices to: determine respective nodes values for nodes of a block using prediction techniques applied to one or more ancestor nodes of respective nodes being predicted, wherein the prediction techniques are selected from a set of supported prediction techniques comprising: not predicted, wherein a node value for a child node is coded without prediction; a delta prediction technique, wherein a node value for a child node is predicted as a difference from a node value of a parent node; a linear prediction technique, wherein a node value for a child node is predicted based on a relationship between a parent node and a grandparent node of the child node; a parallelogram prediction technique, wherein a node value for a child node is determined based on a relationship between a parent node, a grandparent node, and a great grandparent node of the child node.

  14. One or more non-transitory, computer-readable storage media, storing program instructions that, when executed on or across one or more computing devices, cause the one or more computing devices to: receive a plurality of encoded blocks of a prediction tree for a plurality of points of a point cloud; decode the blocks of the prediction tree, wherein, in decoding the prediction tree, the program instructions cause the one or more computing devices to further: decode individual ones of the blocks according to respectively determined prediction techniques for the blocks to decode points of the point cloud associated with the blocks; and store or render the plurality of points of the point cloud decoded from the blocks of the prediction tree.

  15. The one or more non-transitory, computer-readable storage media of claim 14, wherein to decode an individual one of the blocks, the program instructions cause the one or more computing devices to: apply an inverse transform to coefficient values signaled for the block to determine residual values for nodes of the block, wherein the inverse transform to be applied is indicated at a block-level.

  16. The one or more non-transitory, computer-readable storage media of claim 15, wherein to decode an individual one of the blocks, the program instructions further cause the one or more computing devices to: apply an inverse quantization to coefficient values signaled for the block prior to applying the inverse transform, wherein the inverse quantization to be applied is indicated at the block-level.

  17. The one or more non-transitory, computer-readable storage media of claim 15, wherein to decode an individual one of the blocks, the program instructions further cause the one or more computing devices to: predict node values for the nodes of a block based on prediction techniques indicated in the encoded blocks; and apply the residual values to the predicted node values to determine reconstructed node values for the nodes of the block.

  18. The one or more non-transitory, computer-readable storage media of claim 17, wherein the prediction techniques are signaled at a block-level in the encoded blocks.

  19. A system, comprising: a memory storing program instructions, and one or more processors configured to execute the program instructions to: generate a prediction tree comprising a plurality of nodes that correspond to a plurality of points that make up a point cloud captured from one or more sensors, wherein respective ones of the points comprise spatial information for the point; identify different segments of the prediction tree according to a graph traversal technique; divide the segments into respective blocks, wherein node values of nodes included in the respective blocks are predicted based on node values of nodes included in one or more predecessor blocks in the segment; encode residual values for the node values of the blocks of the prediction tree to encode the point cloud; and send or store the encoded blocks for the point cloud.

  20. The system of claim 19, further comprising: the one or more sensors, wherein the one or more sensors are LIDAR sensors.

Description

PRIORITY CLAIM

[0001] This application claims benefit of priority to U.S. Provisional Application Ser. No. 62/911,200, entitled “BLOCK-BASED PREDICTIVE CODING FOR POINT CLOUD COMPRESSION,” filed Oct. 4, 2019, and which is incorporated herein by reference in its entirety.

BACKGROUND

Technical Field

[0002] This disclosure relates generally to compression and decompression of point clouds comprising a plurality of points, each having associated spatial and/or attribute information.

Description of the Related Art

[0003] Various types of sensors, such as light detection and ranging (LIDAR) systems, 3-D-cameras, 3-D scanners, etc. may capture data indicating positions of points in three dimensional space, for example positions in the X, Y, and Z planes. Also, such systems may further capture attribute information in addition to spatial information for the respective points, such as color information (e.g. RGB values), intensity attributes, reflectivity attributes, motion related attributes, modality attributes, or various other attributes. In some circumstances, additional attributes may be assigned to the respective points, such as a time-stamp when the point was captured. Points captured by such sensors may make up a “point cloud” comprising a set of points each having associated spatial information and one or more associated attributes. In some circumstances, a point cloud may include thousands of points, hundreds of thousands of points, millions of points, or even more points. Also, in some circumstances, point clouds may be generated, for example in software, as opposed to being captured by one or more sensors. In either case, such point clouds may include large amounts of data and may be costly and time-consuming to store and transmit.

SUMMARY OF EMBODIMENTS

[0004] In various embodiments, block-based predictive coding techniques are implemented to compress or otherwise encode information for point clouds, such as spatial or other geometric information or other attribute values. A prediction tree is generated for a point cloud. Segments of the prediction tree are identified. The segments are divided into blocks that are predicted by predecessor blocks within the segments. The blocks of the prediction tree may then be encoded and may be provided for transmission to a decoder that can regenerate the point cloud from the blocks of the prediction tree.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] FIG. 1 illustrates a system comprising a sensor that captures information for points of a point cloud and an encoder that compresses attribute information and/or spatial information of the point cloud, where the compressed point cloud information is sent to a decoder, according to some embodiments.

[0006] FIG. 2A is a high-level flowchart illustrating various techniques for block-based predictive coding for point clouds, according to some embodiments.

[0007] FIG. 2B is an example prediction tree, according to some embodiments.

[0008] FIG. 2C is an example of identified segments of a prediction tree, according to some embodiments.

[0009] FIG. 2D is an example of a segment of a prediction tree divided into blocks, according to some embodiments.

[0010] FIG. 3 is a high-level flowchart illustrating various techniques for generating a prediction tree according to a space filling curve, according to some embodiments.

[0011] FIG. 4 is a high-level flowchart illustrating various techniques for generating a prediction tree according to a buffer of possible predictors, according to some embodiments.

[0012] FIG. 5 is high-level flowchart illustrating various techniques for encoding prediction tree blocks, according to some embodiments.

[0013] FIG. 6 is a high-level flowchart illustrating various techniques for decoding prediction tree blocks for a point cloud, according to some embodiments.

[0014] FIG. 7 is high-level flowchart illustrating various techniques for decoding prediction tree blocks, according to some embodiments.

[0015] FIG. 8A illustrates components of an encoder, according to some embodiments.

[0016] FIG. 8B illustrates components of a decoder, according to some embodiments.

[0017] FIG. 9 illustrates compressed point cloud information being used in a 3-D application, according to some embodiments.

[0018] FIG. 10 illustrates compressed point cloud information being used in a virtual reality application, according to some embodiments.

[0019] FIG. 11 illustrates an example computer system that may implement an encoder or decoder, according to some embodiments.

[0020] This specification includes references to “one embodiment” or “an embodiment.” The appearances of the phrases “in one embodiment” or “in an embodiment” do not necessarily refer to the same embodiment. Particular features, structures, or characteristics may be combined in any suitable manner consistent with this disclosure.

[0021] “Comprising.” This term is open-ended. As used in the appended claims, this term does not foreclose additional structure or steps. Consider a claim that recites: “An apparatus comprising one or more processor units … .” Such a claim does not foreclose the apparatus from including additional components (e.g., a network interface unit, graphics circuitry, etc.).

[0022] “Configured To.” Various units, circuits, or other components may be described or claimed as “configured to” perform a task or tasks. In such contexts, “configured to” is used to connote structure by indicating that the units/circuits/components include structure (e.g., circuitry) that performs those task or tasks during operation. As such, the unit/circuit/component can be said to be configured to perform the task even when the specified unit/circuit/component is not currently operational (e.g., is not on). The units/circuits/components used with the “configured to” language include hardware–for example, circuits, memory storing program instructions executable to implement the operation, etc. Reciting that a unit/circuit/component is “configured to” perform one or more tasks is expressly intended not to invoke 35 U.S.C. .sctn. 112(f), for that unit/circuit/component. Additionally, “configured to” can include generic structure (e.g., generic circuitry) that is manipulated by software and/or firmware (e.g., an FPGA or a general-purpose processor executing software) to operate in manner that is capable of performing the task(s) at issue. “Configure to” may also include adapting a manufacturing process (e.g., a semiconductor fabrication facility) to fabricate devices (e.g., integrated circuits) that are adapted to implement or perform one or more tasks.

[0023] “First,” “Second,” etc. As used herein, these terms are used as labels for nouns that they precede, and do not imply any type of ordering (e.g., spatial, temporal, logical, etc.). For example, a buffer circuit may be described herein as performing write operations for “first” and “second” values. The terms “first” and “second” do not necessarily imply that the first value must be written before the second value.

[0024] “Based On.” As used herein, this term is used to describe one or more factors that affect a determination. This term does not foreclose additional factors that may affect a determination. That is, a determination may be solely based on those factors or based, at least in part, on those factors. Consider the phrase “determine A based on B.” While in this case, B is a factor that affects the determination of A, such a phrase does not foreclose the determination of A from also being based on C. In other instances, A may be determined based solely on B.

DETAILED DESCRIPTION

[0025] As data acquisition and display technologies have become more advanced, the ability to capture point clouds comprising thousands or millions of points in 2-D or 3-D space, such as via LIDAR systems, has increased. Also, the development of advanced display technologies, such as virtual reality or augmented reality systems, has increased potential uses for point clouds. However, point cloud files are often very large and may be costly and time-consuming to store and transmit. For example, communication of point clouds over private or public networks, such as the Internet, may require considerable amounts of time and/or network resources, such that some uses of point cloud data, such as real-time uses, may be limited. Also, storage requirements of point cloud files may consume a significant amount of storage capacity of devices storing the point cloud files, which may also limit potential applications for using point cloud data.

[0026] In some embodiments, an encoder may be used to generate a compressed point cloud to reduce costs and time associated with storing and transmitting large point cloud files. In some embodiments, a system may include an encoder that compresses attribute information and/or spatial information (also referred to herein as geometry information) of a point cloud file such that the point cloud file may be stored and transmitted more quickly than non-compressed point clouds and in a manner such that the point cloud file may occupy less storage space than non-compressed point clouds. In some embodiments, compression of spatial information and/or attributes of points in a point cloud may enable a point cloud to be communicated over a network in real-time or in near real-time. For example, a system may include a sensor that captures spatial information and/or attribute information about points in an environment where the sensor is located, wherein the captured points and corresponding attributes make up a point cloud. The system may also include an encoder that compresses the captured point cloud attribute information. The compressed attribute information of the point cloud may be sent over a network in real-time or near real-time to a decoder that decompresses the compressed attribute information of the point cloud. The decompressed point cloud may be further processed, for example to make a control decision based on the surrounding environment at the location of the sensor. The control decision may then be communicated back to a device at or near the location of the sensor, wherein the device receiving the control decision implements the control decision in real-time or near real-time. In some embodiments, the decoder may be associated with an augmented reality system and the decompressed spatial and/or attribute information may be displayed or otherwise used by the augmented reality system. In some embodiments, compressed attribute information for a point cloud may be sent with compressed spatial information for points of the point cloud. In other embodiments, spatial information and attribute information may be separately encoded and/or separately transmitted to a decoder.

[0027] In some embodiments, a system may include a decoder that receives one or more point cloud files comprising compressed attribute information via a network from a remote server or other storage device that stores the one or more point cloud files. For example, a 3-D display, a holographic display, or a head-mounted display may be manipulated in real-time or near real-time to show different portions of a virtual world represented by point clouds. In order to update the 3-D display, the holographic display, or the head-mounted display, a system associated with the decoder may request point cloud files from the remote server based on user manipulations of the displays, and the point cloud files may be transmitted from the remote server to the decoder and decoded by the decoder in real-time or near real-time. The displays may then be updated with updated point cloud data responsive to the user manipulations, such as updated point attributes.

[0028] In some embodiments, a system, may include one or more LIDAR systems, 3-D cameras, 3-D scanners, etc., and such sensor devices may capture spatial information, such as X, Y, and Z coordinates for points in a view of the sensor devices. In some embodiments, the spatial information may be relative to a local coordinate system or may be relative to a global coordinate system (for example, a Cartesian coordinate system may have a fixed reference point, such as a fixed point on the earth, or may have a non-fixed local reference point, such as a sensor location).

[0029] In some embodiments, such sensors may also capture attribute information for one or more points, such as color attributes, reflectivity attributes, velocity attributes, acceleration attributes, time attributes, modalities, and/or various other attributes. In some embodiments, other sensors, in addition to LIDAR systems, 3-D cameras, 3-D scanners, etc., may capture attribute information to be included in a point cloud. For example, in some embodiments, a gyroscope or accelerometer, may capture motion information to be included in a point cloud as an attribute associated with one or more points of the point cloud. For example, a vehicle equipped with a LIDAR system, a 3-D camera, or a 3-D scanner may include the vehicle’s direction and speed in a point cloud captured by the LIDAR system, the 3-D camera, or the 3-D scanner. For example, when points in a view of the vehicle are captured they may be included in a point cloud, wherein the point cloud includes the captured points and associated motion information corresponding to a state of the vehicle when the points were captured.

[0030] In some embodiments, attribute information may comprise string values, such as different modalities. For example attribute information may include string values indicating a modality such as “walking”, “running”, “driving”, etc. In some embodiments, an encoder may comprise a “string-value” to integer index, wherein certain strings are associated with certain corresponding integer values. In some embodiments, a point cloud may indicate a string value for a point by including an integer associated with the string value as an attribute of the point. The encoder and decoder may both store a common string value to integer index, such that the decoder can determine string values for points based on looking up the integer value of the string attribute of the point in a string value to integer index of the decoder that matches or is similar to the string value to integer index of the encoder.

[0031] In some embodiments, an encoder compresses and encodes geometric or other spatial information of a point cloud in addition to compressing attribute information for attributes of the points of the point cloud.

[0032] In some embodiments, some applications may be sensitive to the latency or time that is taken to encode and decode point cloud. While some point cloud encoding techniques may implement features that provide good compression results, such as octrees utilized in Geometry-based Point Cloud Compression (G-PCC), the time to encode and decode point cloud data may limit the utilization of the compression in latency sensitive applications. For example, while octree techniques may provide excellent compression results for dense point cloud, the compression gains achieved for a sparse point cloud (e.g. a sparse Lidar point cloud) may not be as effective, as the computational complexity for building the octree and computing features of the octree, such as neighborhood occupancy information, may result in computational costs that outweigh the obtained compression gains. Furthermore, in some scenarios, some coding techniques, like octree-based coding, may incur a high latency (e.g., by using a high number of points before the compression/decompression process could start). Predictive coding techniques, in various embodiments, may provide various performance benefits, including low latency implementations, which can achieve more performant computational costs and lower latency. For example, predictive coding techniques as discussed below may be implemented for low latency or other latency sensitive applications, allow for low delay streaming, and be implemented with low complexity decoding.

[0033] FIG. 1 illustrates a system comprising a sensor that captures information for points of a point cloud and an encoder that compresses spatial and/or attribute information of the point cloud, where the compressed spatial and/or attribute information is sent to a decoder, according to some embodiments.

[0034] System 100 includes sensor 102 and encoder 104. Sensor 102 captures a point cloud 110 comprising points representing structure 106 in view 108 of sensor 102. For example, in some embodiments, structure 106 may be a mountain range, a building, a sign, an environment surrounding a street, or any other type of structure. In some embodiments, a captured point cloud, such as captured point cloud 110, may include spatial and attribute information for the points included in the point cloud. For example, point A of captured point cloud 110 comprises X, Y, Z coordinates and attributes 1, 2, and 3. In some embodiments, attributes of a point may include attributes such as R, G, B color values, a velocity at the point, an acceleration at the point, a reflectance of the structure at the point, a time stamp indicating when the point was captured, a string-value indicating a modality when the point was captured, for example “walking”, or other attributes. The captured point cloud 110 may be provided to encoder 104, wherein encoder 104 generates a compressed version of the point cloud (compressed point cloud information 112) that is transmitted via network 114 to decoder 116. In some embodiments, a compressed version of the point cloud, such as compressed point cloud information 112, may be included in a common compressed point cloud that also includes compressed spatial information for the points of the point cloud or, in some embodiments, compressed spatial information and compressed attribute information may be communicated as separate files.

[0035] In some embodiments, encoder 104 may be integrated with sensor 102. For example, encoder 104 may be implemented in hardware or software included in a sensor device, such as sensor 102. In other embodiments, encoder 104 may be implemented on a separate computing device that is proximate to sensor 102.

[0036] FIG. 2A is a high-level flowchart illustrating various techniques for block-based predictive coding for point clouds, according to some embodiments. As indicated at 210, a prediction tree may be generated that includes multiple nodes from points that make up a point cloud captured from sensor(s), in various embodiments. A prediction tree may serve as a prediction structure, where each point in the point cloud is associated with a node (sometimes referred to as a vertex) of the prediction tree, in some embodiments. In some embodiments, each node may be predicted from only the ancestors of the node in the tree.

[0037] As part of generating the prediction tree, individual points of the point cloud may be selected for inclusion in the prediction tree, as indicated at 220. As indicated at 230, predicted node values may be determined for the individual points from prediction techniques applied to ancestor nodes in the prediction tree, in some embodiments. FIGS. 3 and 4, discussed below, provide examples prediction tree generation techniques.

[0038] Various prediction techniques may be implemented to predict a node from ancestor nodes. These prediction techniques may be signaled as prediction modes or prediction indicators (e.g., mapped to prediction mode values “0”=prediction technique A, “1”=prediction technique B, and so on). In some embodiments, a node in the prediction tree (corresponding to one point in the point cloud) may not have a prediction technique as it may be the first or root node of the prediction tree. The prediction mode for such a node may be indicated as “none” or “root” in some embodiments. The actual information (e.g., spatial information and/or attribute information) for such a node may be encoded instead of the residual information encoded for other nodes in the tree that is used to derive the actual information when applied to predicted values.

[0039] As illustrated in FIG. 2B, prediction tree 260 may include various nodes that are predicted according to a prediction technique applied to one or more ancestor nodes, indicated by the arrows. For example, leaf node 264 may be predicted by ancestor nodes 266, according to various ones of the prediction techniques discussed below. Some nodes, like root node 262, may not be predicted but encoded as part of prediction tree 260 using the actual values. Other nodes, like leaf node 266, may be predicted according to a single ancestor node.

[0040] In some embodiments, delta prediction may be implemented or supported as a prediction technique. Delta prediction may use a position of a parent node of a current node as a predictor of the current node.

[0041] In some embodiments, linear prediction may be implemented or supported as a prediction technique. For example, in linear prediction, a point “p0” may be the position of a parent node and “p1” may be the position of a grandparent node. The position of a current node may be predicted as (2.times.p0-p1).

[0042] In some embodiments, parallelogram prediction may be implemented or supported as a prediction technique. For example, in parallelogram prediction “p0” may be the position of the parent node, “p1” the position of the grandparent node, and “p2” be the position of the great-grandparent node. A current node’s position may then be determined as (p0+p1-p2).

[0043] In some embodiments, rectangular prediction may be implemented or supported as a prediction technique. For example, in rectangular prediction “p0” may be the position of the parent node, “p1” the position of the grandparent node, and “p2” be the position of the great-grandparent node. A current node’s position may then be determined as (p0+p2-p1).

[0044] In some embodiments, polar prediction may be implemented or supported as a prediction technique. For example, in polar prediction (.theta..sub.0, r.sub.0, z.sub.0) may be the polar coordinates of the parent node and (.theta..sub.1, r.sub.1, z.sub.1) may be the polar coordinates of the grandparent node. The position of the current node is predicted as

( 2 .theta. 0 - .theta. 1 , r 0 + r 1 2 0 , z 0 + z 1 2 ) . ##EQU00001##

[0045] In some embodiments, modified polar prediction may be implemented or supported as a prediction technique. For example, in modified polar prediction (.theta..sub.0, r.sub.0, z.sub.0) may be the polar coordinates of the parent node and (.theta..sub.1, r.sub.1, z.sub.1) be the polar coordinates of the grandparent node. The position of the current node may be predicted as (2.theta..sub.0-.theta..sub.1, r.sub.0, z.sub.0).

[0046] In some embodiments, average prediction may be implemented or supported as a prediction technique. For example, in average prediction “p0” may be the position of the parent node and “p1” the position of the grandparent node. The position of the current node may be predicted as ((p0+p1)/2).

[0047] In some embodiments, average prediction of order 3 may be implemented or supported as a prediction technique. For example, in average prediction of order 3, “p0” may be the position of the parent node, “p1” may be the position of the grandparent node and “p2” may be the position of the great-grandparent node. The position of the current node may be predicted as ((p0+p1+p2)/3).

[0048] In some embodiments, average prediction of order k may be implemented or supported as a prediction technique. For example, in average prediction of order k, the positions of ancestor nodes of the current node may be averaged up to the order k ancestor nodes.

[0049] The choice of the prediction technique to be applied for each node of the prediction tree may be determined according to a rate-distortion optimization procedure, in some embodiments. In some embodiments, the choice may be adaptive per node or per group of nodes. In some embodiments, the choice may be signaled explicitly in the bitstream or may be implicitly derived based on the location of the node if the prediction graph and decoded positions and prediction modes of the node ancestors. In some embodiments, the choice may be signaled at a block-level for a set of nodes included in a block of a segment of the prediction tree.

[0050] The prediction tree may be encoded, including the prediction techniques applied to determine the predicted node values. For example, a node may be encoded along with a number of child nodes, and respective prediction modes to determine each child node (which may be the same for each child, different for each child, or independently determined for each child (even if determined to be the same) in some embodiments). In various embodiments, the prediction tree may be encoded by traversing the tree in a predefined order (e.g., depth first, breath first) and encoding for each node the number of its children. The positions of the nodes may be encoded by encoding first the chosen prediction mode and then the obtained residuals after prediction, in some embodiments. In various embodiments, the number of children and the prediction mode for nodes can be encoded.

[0051] In various embodiments, the prediction residuals could be encoded (e.g., arithmetically encoded) in order to further exploit statistical correlations. The residuals could be encoded by compressing the sign of each residue, the position of the most significant bit (equivalent to Floor(Log 2(Abs (residue)))) and the binary representation of the remaining bits, in some embodiments. Correlations between the X, Y, Z coordinates could be exploited by using a different entropy/arithmetic context based on the encoded values of the first encoded components, in some embodiments.

[0052] Block-based predictive coding techniques may be implemented for encoding residuals, in various embodiments. For example, as indicated at 240, different segments of the prediction tree may be identified according to a graph traversal technique, in some embodiments. For example, a traversal technique may start with a root node (e.g., root node 262 in FIG. 2) and traverse the nodes until a leaf node is reached. This first path may be a first segment. For example, as illustrated in FIG. 2C, segment 271 may include root node 262 and may traverse a path until a leaf node, like leaf node 266, is reached. A next node in a traversal order according to a traversal technique may be selected to identify another segment that ends with another leaf node. For instance, segment 273 may be identified. Some techniques may be iteratively performed until each node of a prediction tree is identified in a segment. In FIG. 2C, for example, segments 275, 277, and 279, may also be identified which together with segments 271 and 273 may divide prediction tree into different segments that together include all of the nodes of prediction tree 260. Although the example traversal technique discussed above may be used to identify segments, in some embodiments, many traversal techniques could be applied in various embodiments (e.g., depth first search, breadth first search, and so on).

……
……
……

您可能还喜欢...