Sony Patent | Information processing apparatus and information processing method

Editor: 映维 | Category: Sony | March 10, 2022

Patent: Information processing apparatus and information processing method

Publication Number: 20220076485

Publication Date: 20220310

Applicant: Sony

Assignee: Sony Group Corporation

Abstract

There is provided an information processing apparatus and an information processing method that enable use of a thumbnail for 3D object still image content. A 3D object is used as original data, and role information that is information indicating that thumbnail data generated from the original data is a thumbnail based on the original data is generated. Then, the role information and encoded data obtained by encoding one frame of the 3D object by a predetermined encoding method are stored in a file having a predetermined file structure. The present technology can be applied to, for example, a data generation apparatus that generates a file that stores encoded data of Point Cloud without time information and its thumbnail.

Claims

1. An information processing apparatus comprising: a preprocessing unit that uses a 3D object as original data and generates role information that is information indicating that thumbnail data generated from the original data is a thumbnail based on the original data; and a file generation unit that stores the role information and encoded data obtained by encoding one frame of the 3D object by a predetermined encoding method, in a file having a predetermined file structure.
2. The information processing apparatus according to claim 1, wherein the role information includes information that serves as a starting point of reproduction of the encoded data.
3. The information processing apparatus according to claim 2, wherein the information that serves as the starting point of reproduction also includes group identification information that identifies a stream of the encoded data to be reproduced.
4. The information processing apparatus according to claim 3, wherein the preprocessing unit generates the role information indicating two-dimensional still image data as the thumbnail, wherein the two-dimensional still image data is obtained by displaying the 3D object according to a specific viewpoint position, viewpoint direction, and angle of view.
5. The information processing apparatus according to claim 3, wherein the preprocessing unit generates the role information indicating as the thumbnail a video thumbnail that is moving image data including an image which is obtained by displaying the 3D object according to a plurality of viewpoint positions, viewpoint directions, and angles of view.
6. The information processing apparatus according to claim 5, wherein the file generation unit stores the role information indicating the video thumbnail in ItemReferenceBox.
7. The information processing apparatus according to claim 5, wherein the file generation unit stores the role information indicating the video thumbnail in EntityToGroupBox.
8. The information processing apparatus according to claim 2, wherein the preprocessing unit generates the role information indicating as the thumbnail a 3D object thumbnail that is the 3D object encoded at low resolution.
9. The information processing apparatus according to claim 8, wherein the file generation unit stores the role information indicating the 3D object thumbnail in ItemReferenceBox.
10. The information processing apparatus according to claim 8, wherein the file generation unit stores the role information indicating the 3D object thumbnail in EntityToGroupBox.
11. The information processing apparatus according to claim 8, wherein the preprocessing unit generates a display rule for the 3D object thumbnail.
12. The information processing apparatus according to claim 11, wherein the display rule for the 3D object thumbnail is indicated by rotation during display of the 3D object thumbnail.
13. The information processing apparatus according to claim 11, wherein the display rule for the 3D object thumbnail is indicated by a viewpoint position, a line-of-sight direction, and an angle of view during display of the 3D object thumbnail.
14. The information processing apparatus according to claim 11, wherein the file generation unit stores an initial position of display of the 3D object thumbnail in the file.
15. The information processing apparatus according to claim 11, wherein the file generation unit stores the display rule for the 3D object thumbnail in ItemProperty.
16. The information processing apparatus according to claim 11, wherein the file generation unit stores the display rule for the 3D object thumbnail in Item.
17. The information processing apparatus according to claim 11, wherein the file generation unit stores the display rule for the 3D object thumbnail in meta track.
18. The information processing apparatus according to claim 8, wherein in a case where geometry based point cloud coding (G-PCC) is used for the 3D object thumbnail, the preprocessing unit generates the role information for using data with limited Geometry decoding as the thumbnail.
19. The information processing apparatus according to claim 18, wherein the preprocessing unit generates the role information indicating that Geometry decoding is limited by ItemProperty.
20. An information processing method comprising, by an information processing apparatus: using a 3D object as original data and generating role information that is information indicating that thumbnail data generated from the original data is a thumbnail based on the original data; and storing the role information and encoded data obtained by encoding one frame of the 3D object by a predetermined encoding method, in a file having a predetermined file structure.

Description

TECHNICAL FIELD

[0001] The present disclosure relates to an information processing apparatus and an information processing method, and particularly to an information processing apparatus and an information processing method that enable use of a thumbnail for a 3D object without time information.

BACKGROUND ART

[0002] Conventionally, one method of expressing a 3D object is the Point Cloud, which represents the object as a set of points in a three-dimensional space, each having both position information and attribute information (in particular, color information). As disclosed in Non-Patent Documents 1 and 2, methods for compressing a Point Cloud have been specified.

[0003] For example, as one of the methods for compressing a Point Cloud, there is a method in which a Point Cloud is divided into a plurality of areas (hereinafter referred to as segmentation) and each area is projected onto a plane to generate a texture image and a geometry image, which are then encoded by a video codec. Here, the geometry image is an image including depth information of the points that constitute a Point Cloud. This method is called video-based point cloud coding (V-PCC), and the details thereof are described in Non-Patent Document 1.

[0004] Furthermore, as another compression method, there is a method in which a Point Cloud is separated into geometry, which indicates a three-dimensional shape, and attribute, which indicates color and reflection information as attribute information, and they are encoded. This method is called geometry based point cloud coding (G-PCC).

[0005] Then, use cases are expected in which V-PCC and G-PCC streams generated by such encoding are downloaded and reproduced or distributed over an Internet protocol (IP) network.

[0006] Therefore, as disclosed in Non-Patent Document 3, the moving picture experts group (MPEG) has started a study on distribution technologies that use the existing framework of ISO base media file format/dynamic adaptive streaming over HTTP (ISOBMFF/DASH), in order to suppress impacts on existing distribution platforms and realize services at an early stage.

CITATION LIST

Non-Patent Document

[0007] Non-Patent Document 1: m45183 second working draft for Video-based Point Cloud Coding (V-PCC).

[0008] Non-Patent Document 2: m45183 working draft for Geometry-based Point Cloud Coding (G-PCC).

[0009] Non-Patent Document 3: w17675, First idea on Systems technologies for Point Cloud Coding, April 2018, San Diego, US

SUMMARY OF THE INVENTION

Problems to be Solved by the Invention

[0010] By the way, conventionally, like a moving image, a V-PCC stream or G-PCC stream generated by encoding a Point Cloud including a plurality of frames at a predetermined time interval by V-PCC or G-PCC is used in use cases in which it is stored in a file having a file structure using ISOBMFF technology. On the other hand, use cases are also assumed in which, for example, a Point Cloud without time information (that is, a Point Cloud for one frame), such as map data, encoded by V-PCC or G-PCC is stored in a file having a file structure using ISOBMFF technology.

[0011] Furthermore, in general, a thumbnail is attached to two-dimensional still image content, and the thumbnail is used as a sample or an index for identifying the original image (in this case, a two-dimensional still image). For example, a list of thumbnails is displayed for the user to select desired two-dimensional still image content from a plurality of pieces of two-dimensional still image content. For this reason, thumbnails are required to reduce the load in decoding processing, display processing, and the like, and in the case of two-dimensional still image content, two-dimensional still image data having a low resolution is used.

[0012] Therefore, it is required to enable the use of a thumbnail for a three-dimensional 3D object that does not have time information (hereinafter referred to as 3D object still image content) such as the map data described above.

[0013] The present disclosure has been made in view of such a situation, and makes it possible to use a thumbnail for a 3D object without time information.

Solutions to Problems

[0014] The information processing apparatus of an aspect of the present disclosure includes: a preprocessing unit that uses a 3D object as original data and generates role information that is information indicating that thumbnail data generated from the original data is a thumbnail based on the original data; and a file generation unit that stores the role information and encoded data obtained by encoding one frame of the 3D object by a predetermined encoding method, in a file having a predetermined file structure.

[0015] The information processing method of an aspect of the present disclosure includes, by an information processing apparatus: using a 3D object as original data and generating role information that is information indicating that thumbnail data generated from the original data is a thumbnail based on the original data; and storing the role information and encoded data obtained by encoding one frame of the 3D object by a predetermined encoding method, in a file having a predetermined file structure.

[0016] According to an aspect of the present disclosure, a 3D object is used as original data and role information that is information indicating that thumbnail data generated from the original data is a thumbnail based on the original data is generated; and the role information and encoded data obtained by encoding one frame of the 3D object by a predetermined encoding method are stored in a file having a predetermined file structure.

BRIEF DESCRIPTION OF DRAWINGS

[0017] FIG. 1 is a diagram showing an example of signaling of a thumbnail in HEIF.

[0018] FIG. 2 is a diagram explaining octree encoding.

[0019] FIG. 3 is a diagram showing an example of extension of ItemReferenceBox that stores role information indicating a picture thumbnail.

[0020] FIG. 4 is a diagram showing a variation example of extending ItemReferenceBox that stores role information indicating a picture thumbnail.

[0021] FIG. 5 is a diagram showing an example of extension of ItemReferenceBox that stores role information indicating a video thumbnail.

[0022] FIG. 6 is a diagram showing a variation example of extension of ItemReferenceBox that stores role information indicating a video thumbnail.

[0023] FIG. 7 is a diagram showing an example of a definition of EntityToGroupBox (thmb) for signaling a video thumbnail.

[0024] FIG. 8 is a diagram showing an example of signaling that rotates a 3D object thumbnail on a specific axis.

[0025] FIG. 9 is a diagram showing coordinate axes.

[0026] FIG. 10 is a diagram showing an example of signaling in which a plurality of various rotations is combined.

[0027] FIG. 11 is a diagram showing an example of signaling of an initial position.

[0028] FIG. 12 is a diagram showing an example of signaling of a viewpoint position, a line-of-sight direction, and an angle of view of a 3D object thumbnail.

[0029] FIG. 13 is a diagram showing an example of signaling by ItemProperty.

[0030] FIG. 14 is a diagram showing an example of signaling an initial position by ItemProperty.

[0031] FIG. 15 is a diagram showing an example of signaling a 3D object thumbnail as a derivative image.

[0032] FIG. 16 is a diagram showing a first example of a metadata track of a display rule.

[0033] FIG. 17 is a diagram showing a first example of a metadata track of a display rule.

[0034] FIG. 18 is a diagram showing a second example of a metadata track of a display rule.

[0035] FIG. 19 is a diagram showing an example of associating a display rule with a thumbnail.

[0036] FIG. 20 is a diagram showing an example of signaling a 3D object thumbnail by extending GPCCConfigurationBox.

[0037] FIG. 21 is a diagram showing an example of a file structure using an extended GPCCConfigurationBox.

[0038] FIG. 22 is a diagram showing an example of 3D object thumbnail signaling by GPCCLimitedInfoProperty.

[0039] FIG. 23 is a diagram showing an example of a file structure using GPCCLimitedInfoProperty.

[0040] FIG. 24 is a diagram showing an example of decoding geometry data to a specific depth by Sequence Parameter Set.

[0041] FIG. 25 is a block diagram showing an example of a data generation apparatus.

[0042] FIG. 26 is a block diagram showing an example of a data reproduction apparatus.

[0043] FIG. 27 is a flowchart explaining file generation processing for generating a file in which a thumbnail is stored.

[0044] FIG. 28 is a flowchart explaining file generation processing for generating a file in which a thumbnail is stored.

[0045] FIG. 29 is a flowchart explaining thumbnail reproduction processing for reproducing a thumbnail.

[0046] FIG. 30 is a block diagram showing a configuration example of an embodiment of a computer to which the present technology has been applied.

MODE FOR CARRYING OUT THE INVENTION

[0047] A specific embodiment to which the present technology has been applied will be described in detail below with reference to the drawings.

[0048]

[0049] First, signaling of a thumbnail in HEIF will be described with reference to FIG. 1.

[0050] For example, a Point Cloud that constitutes 3D object still image content is assumed to be encoded using V-PCC or G-PCC as described above. Encoded data obtained by encoding a Point Cloud that constitutes 3D object still image content using V-PCC is referred to as V-PCC still image data, and encoded data obtained by encoding a Point Cloud that constitutes 3D object still image content using G-PCC is referred to as G-PCC still image data.

[0051] Furthermore, ISO/IEC 23008-12 MPEG-H Image File Format (HEIF) can be used as a standard for storing 3D object still image content in a file having a file structure using ISOBMFF technology. On the other hand, a two-dimensional image can be encoded by a moving image codec such as advanced video coding (AVC) or high efficiency video coding (HEVC), and can be stored in a file having a file structure using ISOBMFF as two-dimensional image data without time information.

[0052] Therefore, V-PCC still image data and G-PCC still image data can be stored in a file having a file structure using ISOBMFF technology by, for example, extending HEIF, treating them in the same way as two-dimensional image data that is compressed using a moving image codec and does not have time information.

[0053] Here, in HEIF, thumbnails are defined as “low resolution representation of an original image”. Then, as shown in FIG. 1, role information indicating a thumbnail is stored in ItemReferenceBox (iref). For example, in a Box whose referenceType is thmb, item_id indicating a thumbnail item is indicated by from_item_id, and item_id of the original image of the thumbnail is indicated by to_item_id. As described above, the role information is information indicating that it is a thumbnail and which original image the thumbnail is based on.

[0054] That is, in FIG. 1, the fact that item_id=2 is a thumbnail is indicated by the Box in which referenceType of ItemReferenceBox is thmb.
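
The thmb reference described above can be modeled as follows. This is a minimal sketch, not the actual box syntax: the class name ItemReference, the function find_thumbnails, and the assumption that the original image has item_id=1 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ItemReference:
    """Hypothetical in-memory model of one typed reference in ItemReferenceBox (iref)."""
    reference_type: str            # four-character code, e.g. "thmb"
    from_item_id: int              # for "thmb": item_id of the thumbnail item
    to_item_ids: Tuple[int, ...]   # for "thmb": item_id(s) of the original image


def find_thumbnails(refs: List[ItemReference], original_item_id: int) -> List[int]:
    """Return the item_ids that are signaled as thumbnails of the given original image."""
    return [
        ref.from_item_id
        for ref in refs
        if ref.reference_type == "thmb" and original_item_id in ref.to_item_ids
    ]


# item_id=2 is the thumbnail, as in FIG. 1; the original is assumed to be item_id=1.
example_refs = [ItemReference("thmb", from_item_id=2, to_item_ids=(1,))]
thumbnails_of_item_1 = find_thumbnails(example_refs, original_item_id=1)
```

A reader of the file would look up the thumbnail starting from the original image in this way, which is why the role information has to carry both the thumbnail id and the id of the original image.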

[0055] Furthermore, in a case where 3D object still image content is stored in a file having a file structure using ISOBMFF technology, 3D object thumbnail data is stored as one item of one stream or divided into a plurality of streams and stored as multi items. For example, in the case of one item, that item is used as the starting point of reproduction. On the other hand, in the case of multi items, the starting point of reproduction is indicated by item or group of EntityToGroupBox.

[0056] However, since the existing ItemReferenceBox can only signal a reference relationship between items, in a case where the group is the starting point of reproduction, there is a concern that signaling for the original image cannot be performed. Furthermore, the existing standard assumes that thumbnail data is a two-dimensional still image, and does not assume moving image data or 3D objects with reduced resolution, which are sophisticated thumbnails, and there is a concern that signaling cannot be performed.

[0057] Moreover, in a case where a 3D object is used as a thumbnail, the display method (for example, the viewpoint position at which the 3D object is viewed, the line-of-sight direction, the angle of view, the display time, and the like) depends on the client when displaying as a thumbnail. Therefore, the display may be different from the intention of a content author, or the display may be different for each client.

[0058]

[0059] Here, octree encoding as shown in FIG. 2 is used as a method for compressing geometry data of G-PCC still image data.

[0060] For example, octree encoding is a method of expressing, by an octree, the presence or absence of Points in each block of Voxel-expressed data, in which each Point of the Point Cloud data is arranged in a space divided into blocks of a certain size. In this method, as shown in FIG. 2, blocks that contain Points are represented by 1 and blocks that contain no Points are represented by 0. Furthermore, the fineness of the blocks is called level of detail (LoD), and the larger the LoD, the finer the blocks.

[0061] Then, the Geometry data compressed by octree encoding can be decoded only down to an intermediate depth of the octree to reconstruct a Point Cloud as a low-resolution Geometry. However, in this case, attribute data such as texture must be provided separately to match the depth to which the Geometry is decoded. That is, the Geometry data can be shared between the original image and the thumbnail.
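
The idea of decoding an octree only to an intermediate depth can be illustrated with a small sketch. It is not the G-PCC bitstream format: the integer coordinates, the bit order inside each 8-bit occupancy code, the sorted node order, and the function names encode_octree and decode_octree are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int, int]


def encode_octree(points: Set[Point], depth: int) -> List[List[int]]:
    """Return, for each level, the 8-bit occupancy codes of the occupied nodes.

    Coordinates are assumed to be integers in [0, 2**depth). Within a level,
    codes are listed in ascending order of node coordinates.
    """
    levels: List[List[int]] = []
    for level in range(depth):
        shift = depth - level      # number of coordinate bits below this node
        child_shift = shift - 1    # bit that selects the child block
        nodes: Dict[Point, int] = {}
        for x, y, z in points:
            node = (x >> shift, y >> shift, z >> shift)
            child = (
                (((x >> child_shift) & 1) << 2)
                | (((y >> child_shift) & 1) << 1)
                | ((z >> child_shift) & 1)
            )
            # A bit set to 1 means the child block contains at least one Point.
            nodes[node] = nodes.get(node, 0) | (1 << child)
        levels.append([nodes[n] for n in sorted(nodes)])
    return levels


def decode_octree(levels: List[List[int]], max_level: int) -> List[Point]:
    """Decode only the first max_level levels and return the occupied blocks."""
    nodes: List[Point] = [(0, 0, 0)]
    for codes in levels[:max_level]:
        children: List[Point] = []
        for (x, y, z), code in zip(nodes, codes):
            for child in range(8):
                if code & (1 << child):
                    children.append(
                        (2 * x + (child >> 2), 2 * y + ((child >> 1) & 1), 2 * z + (child & 1))
                    )
        nodes = sorted(children)
    return nodes


example_points = {(0, 0, 0), (3, 3, 3), (3, 2, 3)}
example_levels = encode_octree(example_points, depth=2)
full_resolution = decode_octree(example_levels, max_level=2)
low_resolution = decode_octree(example_levels, max_level=1)
```

Stopping the decoder one level early yields fewer, coarser blocks from the same encoded Geometry, which is the property that allows the Geometry data to be shared between the original image and the thumbnail.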

[0062] Therefore, in the following, it is proposed to signal low-resolution data including Geometry data common to the original image as a thumbnail.

[0063]

[0064] Next, a thumbnail data format of a thumbnail of a Point Cloud that constitutes 3D object still image content will be described.

[0065] In the present embodiment, three methods using picture thumbnails, video thumbnails, and 3D object thumbnails are proposed as the thumbnail data format.

[0066] A picture thumbnail is two-dimensional still image data in which a 3D object is displayed at a specific viewpoint position, viewpoint direction, and angle of view.

[0067] A video thumbnail is moving image data including an image displaying a 3D object at a plurality of viewpoint positions, viewpoint directions, and angles of view.

[0068] A 3D object thumbnail is low-resolution encoded Point Cloud data.

[0069] For example, in a case where the original image is V-PCC-encoded data, the low-resolution encoded V-PCC data can be used as a 3D object thumbnail. Note that it is not limited to data encoded by the same method for the original image and the 3D object thumbnail. That is, in a case where the original image is V-PCC-encoded data, G-PCC-encoded data, data including Mesh data and texture data, and the like may be used as a 3D object thumbnail.

[0070] Similarly, in a case where the original image is G-PCC-encoded data, V-PCC-encoded data, data including Mesh data and texture data and the like can be used for a 3D object thumbnail in addition to the low-resolution encoded G-PCC data.

[0071] By the way, when the existing thumbnail definition is applied to a Point Cloud that constitutes 3D object still image content, the 3D object thumbnail is considered to be the applicable format among the thumbnail data formats described above. Furthermore, in a case where a 3D object thumbnail is displayed on a 2D display or head mounted display (HMD), an equivalent effect can be obtained by using, as a thumbnail, a low-resolution picture thumbnail or video thumbnail rendered as a 2D image.

[0072] Note that in order to indicate the relationship between the thumbnail described above and the 3D object still image content of a Point Cloud, which is the original image, an HEIF thumbnail as shown in FIG. 1 above can be used in a case where the condition described below is met. That is, the condition is that both of the following are satisfied: the original image is the 3D object still image content of a Point Cloud and is stored as one item, and the thumbnail is either a picture thumbnail or a 3D object thumbnail stored as one item. Therefore, in the following, first to third methods for enabling signaling of a thumbnail for 3D object still image content even in a case where this condition is not met will be described.
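
The condition in the preceding paragraph can be written as a simple predicate. This is only an illustrative sketch; the function name, the string labels for the thumbnail kinds, and the use of item counts as parameters are hypothetical.

```python
def existing_heif_thumbnail_suffices(
    original_item_count: int,
    thumbnail_kind: str,
    thumbnail_item_count: int = 1,
) -> bool:
    """Return True if the unextended HEIF thumbnail signaling of FIG. 1 can be used.

    thumbnail_kind is assumed to be one of "picture", "video", or "3d_object".
    """
    # The original image must be stored as a single item.
    original_ok = original_item_count == 1
    # The thumbnail must be a picture thumbnail, or a 3D object thumbnail
    # that is itself stored as a single item.
    thumbnail_ok = thumbnail_kind == "picture" or (
        thumbnail_kind == "3d_object" and thumbnail_item_count == 1
    )
    return original_ok and thumbnail_ok
```

Any combination for which this predicate returns False, such as a video thumbnail or content stored as multi items, is a case that the first to third methods are intended to cover.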

[0073]

[0074] With reference to FIGS. 3 and 4, a method using a picture thumbnail will be described as a first method for signaling a thumbnail for 3D object still image content.

[0075] For example, in order to use a picture thumbnail as a thumbnail for 3D object still image content, extension for using a Point Cloud that constitutes the 3D object still image content as the original image is necessary.

[0076] Here, when the 3D object still image content is stored as multi items in a file having a file structure using ISOBMFF technology, there are cases where the starting point of reproduction of the original image is indicated by group of EntityToGroupBox. In this case, group_id of EntityToGroupBox is the id that indicates the starting point of reproduction, but since only item can be indicated in ItemReferenceBox, it is assumed that the original image cannot be signaled by ItemReferenceBox. This inability to signal the original image is assumed to apply not only to picture thumbnails but also to video thumbnails and 3D object thumbnails. Therefore, ItemReferenceBox is extended so that a group can be indicated.

[0077] FIG. 3 shows an example of extension of ItemReferenceBox that stores role information indicating a picture thumbnail. In FIG. 3, in the parts shown in bold, the 3D object still image content divided and stored as multi items is signaled as the original image.

[0078] That is, as shown in FIG. 3, the to_item_ID fields in SingleItemTypeReferenceBox and SingleItemTypeReferenceBoxLarge signaled by ItemReferenceBox are set to to_entity_ID, so that both item_id and group_id can be signaled by this to_entity_ID. Then, in a case where flags&1 of ItemReferenceBox is 1, it indicates that item_id and group_id are signaled by to_entity_ID. On the other hand, in a case where flags&1 of ItemReferenceBox is 0, it indicates that only item_id is signaled by to_entity_ID.
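
How a reader might interpret to_entity_ID depending on flags&1 can be sketched as follows. The function name and the lookup sets are hypothetical, and the sketch assumes that item_id and group_id values do not collide within the file, so that a single ID field can refer to either.

```python
from typing import Set, Tuple


def resolve_to_entity(
    flags: int,
    to_entity_id: int,
    item_ids: Set[int],
    group_ids: Set[int],
) -> Tuple[str, int]:
    """Resolve to_entity_ID to an item or a group according to flags&1."""
    if to_entity_id in item_ids:
        return ("item", to_entity_id)
    # Groups may be referenced only when the lowest flag bit is set.
    if (flags & 1) == 1 and to_entity_id in group_ids:
        return ("group", to_entity_id)
    raise ValueError(f"to_entity_ID {to_entity_id} cannot be resolved with flags={flags}")


# Hypothetical file: items 1 and 2, and one group with group_id=100.
resolved = resolve_to_entity(flags=1, to_entity_id=100, item_ids={1, 2}, group_ids={100})
```

With flags&1 equal to 0 the same call raises an error, which mirrors the behavior of the existing ItemReferenceBox, where only item_id can be signaled.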

[0079] FIG. 4 shows a variation example of extending ItemReferenceBox that stores role information indicating a picture thumbnail. In FIG. 4, in the parts shown in bold, the 3D object still image content divided and stored as multi items is signaled as the original image.

[0080] For example, as shown in FIG. 4, version=2 may be added to ItemReferenceBox so that in that case, item_id and group_id can be signaled by signaling indicating the original image. Then, in a case where to_ID_type of SingleReferenceBox is 0, it indicates that item_id is signaled, and in a case where to_ID_type of SingleReferenceBox is 1, it indicates that group_id is signaled.

[0081] Note that, the first method can be used for associating groups, tracks, and the like in ItemReferenceBox for purposes other than thumbnails.

[0082]

[0083] With reference to FIGS. 5 to 7, a method for realizing a video thumbnail will be described as a second method for signaling a thumbnail for 3D object still image content.

[0084] First, as a first example of the second method, signaling for associating an original image with a video thumbnail will be described.

[0085] The video thumbnail data is Video data encoded by HEVC, AVC, or the like, or an Image sequence in which a plurality of pieces of Image data is provided with time information specified in the HEIF standard. Then, the method for storing Video data or Image sequence in an ISOBMFF track has already been specified in the ISO/IEC standard.

[0086] Therefore, as signaling indicating that the video thumbnail data stored in the track is the thumbnail data of the 3D object still image content of a Point Cloud, the first example and the second example of the second method below are described.

[0087] In the first example of the second method, the existing ItemReferenceBox is extended to enable signaling of a video thumbnail.

[0088] For example, the existing ItemReferenceBox cannot signal a video thumbnail, because ItemReferenceBox cannot signal the track in which a video thumbnail is stored. To associate the video thumbnail, it is only necessary to indicate track_id of the track of the video thumbnail. Therefore, ItemReferenceBox is extended so that track_id can be indicated.

[0089] FIG. 5 shows an example of extension of ItemReferenceBox that stores role information indicating a video thumbnail. In FIG. 5, video thumbnails are signaled in the parts shown in bold.

[0090] That is, the from_item_ID fields of SingleItemTypeReferenceBox and SingleItemTypeReferenceBoxLarge, which are signaled by extending ItemReferenceBox shown in FIG. 3 above, are set to from_entity_ID, and both item_id and track_id can be signaled by this from_entity_ID. Then, in a case where flags&1 of ItemReferenceBox is 1, it indicates that one of item_id, track_id, and group_id is signaled by from_entity_ID or to_entity_ID. On the other hand, in a case where flags&1 of ItemReferenceBox is 0, it indicates that only item_id is signaled. Such extension enables signaling of a video thumbnail by ItemReferenceBox.

[0091] FIG. 6 shows a variation example of extension of ItemReferenceBox that stores role information indicating a video thumbnail. In FIG. 6, video thumbnails are signaled in the parts shown in bold.

[0092] For example, the extended SingleReferenceBox as shown in FIG. 4 above may be extended. That is, as shown in FIG. 6, an ID to be signaled is specified by from_ID_type. For example, in a case where from_ID_type is 0, it indicates that item_id is signaled, in a case where from_ID_type is 1, it indicates that group_id is signaled, and in a case where from_ID_type is 2, it indicates that track_id is signaled. Then, in the case of a video thumbnail, from_ID_type is set to 2, and track_id in which the video thumbnail is stored is specified by from_ID.
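
The ID type values described for this variation can be modeled as an enumeration. This is a sketch under assumptions: the names IdType, SingleReference, and video_thumbnail_reference are hypothetical, and the to side is assumed to use the same 0 (item_id) and 1 (group_id) values described for to_ID_type.

```python
from dataclasses import dataclass
from enum import IntEnum


class IdType(IntEnum):
    """Values of from_ID_type / to_ID_type as described in the text."""
    ITEM = 0    # item_id is signaled
    GROUP = 1   # group_id is signaled
    TRACK = 2   # track_id is signaled


@dataclass(frozen=True)
class SingleReference:
    """Hypothetical model of one entry of the extended SingleReferenceBox."""
    reference_type: str
    from_id_type: IdType
    from_id: int
    to_id_type: IdType
    to_id: int


def video_thumbnail_reference(
    track_id: int, original_id: int, original_is_group: bool
) -> SingleReference:
    """Build a thmb reference from a video thumbnail track to its original image."""
    to_type = IdType.GROUP if original_is_group else IdType.ITEM
    return SingleReference("thmb", IdType.TRACK, track_id, to_type, original_id)


# Hypothetical IDs: thumbnail track 5, original image started from group 100.
example_reference = video_thumbnail_reference(5, 100, original_is_group=True)
```

Carrying an explicit type next to each ID removes the need to infer from the value alone whether an ID refers to an item, a group, or a track.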

[0093] Note that the fact that group_id can be specified by from_ID_type is assumed to be used in 1 of the first example of the third method as described later.

[0094] Note that in the first example of the second method, it is assumed that the existing referenceType thmb is used, but in order to explicitly indicate that it is a video thumbnail, vthm may be specified in referenceType.

[0095] Furthermore, the first example of the second method can be used for associating groups, tracks, and the like in ItemReferenceBox for purposes other than thumbnails.

[0096] Next, in the second example of the second method, EntityToGroupBox (thmb) is defined so that the original image and the video thumbnail can be grouped.

[0097] That is, the second example of the second method is a method that enables signaling of a video thumbnail by EntityToGroupBox. For example, EntityToGroupBox can signal item_id or track_id in the entity_id field. However, the signaling for grouping the original image and the thumbnail has not been defined, and group_id could not be signaled. Therefore, a group that can indicate a list of the original image and the thumbnail is defined so that group_id can be signaled.

[0098] FIG. 7 shows an example of the definition of EntityToGroupBox (thmb) for signaling a video thumbnail.

[0099] As shown in FIG. 7, grouping_type of EntityToGroupBox is set to thmb, indicating that the group is a thumbnail grouping. Then, regarding entity_ids included in EntityToGroupBox, the first one indicates track_id of the video thumbnail, and the second and subsequent ones indicate item_id of the original image.

[0100] Moreover, considering the case where the starting point of reproduction of the original image is group_id, group_id can also be signaled in the entity_id field. In this case, flags&1 of EntityToGroupBox being 1 may explicitly indicate that group_id can be signaled in entity_id in addition to item_id and track_id. On the other hand, flags&1 of EntityToGroupBox being 0 indicates that one of item_id and track_id is signaled.
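
The ordering convention of entity_ids in the thmb group can be sketched as below. The class EntityToGroup and the function split_thumbnail_group are hypothetical models for illustration and do not reproduce the box syntax of FIG. 7.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EntityToGroup:
    """Hypothetical in-memory model of EntityToGroupBox."""
    grouping_type: str           # four-character code, "thmb" for this grouping
    group_id: int
    flags: int                   # flags&1 == 1: entity_ids may also hold group_id
    entity_ids: Tuple[int, ...]


def split_thumbnail_group(group: EntityToGroup) -> Tuple[int, List[int]]:
    """Return (thumbnail id, ids of the original image) from a thmb group."""
    if group.grouping_type != "thmb":
        raise ValueError("not a thumbnail grouping")
    if len(group.entity_ids) < 2:
        raise ValueError("a thumbnail grouping needs a thumbnail and an original")
    # The first entity is the thumbnail; the second and subsequent ones
    # identify the original image.
    return group.entity_ids[0], list(group.entity_ids[1:])


# Hypothetical IDs: video thumbnail in track 5, original image is item 1.
example_group = EntityToGroup("thmb", group_id=200, flags=0, entity_ids=(5, 1))
thumbnail_id, original_ids = split_thumbnail_group(example_group)
```

Because the role of each entry is determined purely by its position in the list, a writer must always place the thumbnail first.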

[0101] Note that, in order to explicitly indicate the video thumbnail, grouping_type dedicated to the video thumbnail may be set to vthm.

[0102] Furthermore, the second example of the second method can be used for associating groups in EntityToGroupBox for purposes other than thumbnails.

[0103]

[0104] With reference to FIGS. 8 to 24, a method of using a 3D object thumbnail will be described as a third method for signaling a thumbnail for 3D object still image content.

[0105] First, in the first example of the third method, a 3D object thumbnail is associated with the original image.

[0106] For example, 3D object thumbnail data is Point Cloud data encoded by V-PCC or G-PCC. In a case where it is stored in a file having a file structure using ISOBMFF technology, the 3D object thumbnail data is stored as one item of one stream or divided into a plurality of streams and stored as multi items. As described above, in the case of multi items, the starting point of reproduction is indicated by item or group of EntityToGroupBox.

[0107] Therefore, as 1 of the first example of the third method, a method of extending the existing ItemReferenceBox to enable signaling of a 3D object thumbnail will be described.

[0108] For example, there is a case where the existing ItemReferenceBox cannot signal a 3D object thumbnail. This is because signaling is not possible in a case where the starting point of reproduction of the 3D object thumbnail is indicated by group_id.

[0109] In 1 of the first example of the third method, ItemReferenceBox is extended so that it can handle the case where the starting point of reproduction of the 3D object thumbnail is indicated by group.

[0110] That is, similarly to the ItemReferenceBox (first example of the second method) described above with reference to FIG. 5, a method of storing role information indicating a 3D object thumbnail is used. Then, in 1 of the first example of the third method, it is only required to set from_entity_ID and signal item_id or group_id.

[0111] Moreover, as a variation example of 1 of the first example of the third method, as described above with reference to FIG. 6, similarly to the extension of SingleReferenceBox, from_ID_type can be set to 1, and group_id indicating the starting point of reproduction of the 3D object thumbnail can be specified by from_ID.
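The variation in paragraph [0111] can be illustrated with a minimal sketch. It is not the box syntax of FIG. 5 or FIG. 6, which is not reproduced here. The class name ThumbnailReference, the helper thumbnail_start, the to_ids field, and the reading of from_ID_type = 0 as "from_ID is an item_id" are assumptions made only for illustration. The text itself states only that from_ID_type = 1 means from_ID carries a group_id.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ThumbnailReference:
    """Hypothetical in-memory view of one extended reference entry."""
    reference_type: str   # e.g. "3dst" to mark a 3D object thumbnail explicitly
    from_id_type: int     # 1: from_id is a group_id (per the text); 0: item_id (assumed)
    from_id: int          # starting point of reproduction of the thumbnail
    to_ids: List[int]     # IDs of the original image(s); field name is an assumption


def thumbnail_start(ref: ThumbnailReference) -> Tuple[str, int]:
    """Return which kind of ID starts reproduction of the thumbnail, and its value."""
    kind = "group_id" if ref.from_id_type == 1 else "item_id"
    return kind, ref.from_id


# A multi-item thumbnail whose starting point is group 100, referring to original item 1.
ref = ThumbnailReference(reference_type="3dst", from_id_type=1, from_id=100, to_ids=[1])
kind, value = thumbnail_start(ref)
```

A reader of the file would branch on the returned kind to decide whether to resolve the starting point through an item or through a group in EntityToGroupBox.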

[0112] Note that in 1 of the first example of the third method, 3dst may be specified in referenceType in order to explicitly indicate that it is a 3D object thumbnail.

[0113] In 2 of the first example of the third method, similarly to the second example of the second method described above, EntityToGroupBox (thmb) is defined so that the original image and the 3D object thumbnail can be grouped.

[0114] For example, EntityToGroupBox can signal item_id or track_id in the entity_id field. However, the signaling for grouping the original image and the thumbnail has not been defined, and group_id could not be signaled. Therefore, a group that can indicate a list of the original image and the thumbnail is defined so that group_id can be signaled.

[0115] As an example of a specific extension, similarly to the second example of the second method described with reference to FIG. 7 above, regarding entity_ids included in EntityToGroupBox, the first one indicates track_id or group_id of the 3D object thumbnail, and the second and subsequent ones indicate item_id or group_id of the original image.

[0116] Note that in order to explicitly indicate the 3D object thumbnail, grouping_type dedicated to the 3D object thumbnail may be set to 3dst.
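The ordering convention in paragraph [0115] can be sketched as follows. This is an illustrative model only. The class EntityToGroup and the function split_thumbnail_group are hypothetical names, and accepting both thmb and 3dst as grouping types is an assumption drawn from paragraphs [0113] and [0116].

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EntityToGroup:
    """Hypothetical simplified view of an EntityToGroupBox."""
    grouping_type: str      # "thmb", or "3dst" for an explicit 3D object thumbnail
    group_id: int
    entity_ids: List[int]   # first entry: thumbnail; remaining entries: original image


def split_thumbnail_group(box: EntityToGroup) -> Tuple[int, List[int]]:
    """Split entity_ids into (thumbnail ID, list of original image IDs)."""
    if box.grouping_type not in ("thmb", "3dst"):
        raise ValueError("not a thumbnail group: " + box.grouping_type)
    if len(box.entity_ids) < 2:
        raise ValueError("need one thumbnail ID and at least one original ID")
    return box.entity_ids[0], box.entity_ids[1:]


box = EntityToGroup(grouping_type="3dst", group_id=10, entity_ids=[200, 1, 2])
thumbnail_id, original_ids = split_thumbnail_group(box)
```

Whether each ID is an item_id, track_id, or group_id is not resolved in this sketch; a real parser would have to look each value up in the file.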

[0117] Next, the second example of the third method enables signaling of a display rule of the 3D object thumbnail.

[0118] For example, in a case where a 3D object thumbnail is displayed, how to display it depends on the implementation by the client. Specifically, only a certain viewpoint position can be displayed as a 2D image, or a plurality of viewpoint positions can be displayed continuously. For this reason, the intention of the content author who wants to display the 3D object thumbnail properly may not be realized.

[0119] Therefore, the second example of the third method enables the content author to signal how to display the 3D object thumbnail. First, the method of signaling a display rule will be described, and then a method for storing the display rule in ISOBMFF will be described.

[0120] Note that the second example of the third method is a method for indicating the display rule of the 3D object thumbnail, but it can also be used when automatically displaying the original image. Moreover, it can also be used to indicate which part of the original image is displayed in the picture thumbnail or the video thumbnail.

[0121] Signaling of a display rule will be described as 1 of the second example of the third method.

[0122] In 1-1 of the second example of the third method, a display rule for rotating the 3D object thumbnail is signaled. That is, 1-1 of the second example of the third method is a method of switching the display of the 3D object thumbnail by fixing the viewpoint position, the line-of-sight direction, and the angle of view and indicating the rotation of the coordinate system of the 3D object thumbnail.

[0123] For example, FIG. 8 shows an example of signaling that rotates a 3D object thumbnail on a particular axis. Here, coordinate axes shown in FIG. 9 are used, and the directions of the white arrows with respect to each axis are the directions of forward rotation.

[0124] For example, loop indicates whether or not to loop the rotation of the 3D object thumbnail. That is, in a case where loop is 1, it indicates that rotation continues while the 3D object thumbnail is displayed. Furthermore, in a case where loop is 0, it indicates that the 3D object thumbnail rotates only once, and after the 3D object thumbnail rotates once, the initial position is displayed continuously.

[0125] Furthermore, rotation_type indicates the coordinate axis around which the 3D object thumbnail rotates. That is, in a case where rotation_type is 0, it indicates rotation around the yaw axis, in a case where rotation_type is 1, it indicates rotation around the pitch axis, and in a case where rotation_type is 2, it indicates rotation around the roll axis.

[0126] Furthermore, negative_rotation indicates whether or not the 3D object thumbnail rotates backward. That is, in a case where negative_rotation is 0, it indicates that the 3D object thumbnail does not rotate backward (that is, it rotates forward), and in a case where negative_rotation is 1, it indicates that the 3D object thumbnail rotates backward.

[0127] Moreover, timescale and duration indicate the time over which the 3D object thumbnail makes one rotation.
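How a client might evaluate the parameters of paragraphs [0124] to [0127] can be sketched as below. This is not the syntax of FIG. 8, which is not reproduced here. The names RotationRule and rotation_at are hypothetical, and two readings are assumptions: that one rotation takes duration / timescale seconds (the usual ISOBMFF convention), and that one rotation means 360 degrees with backward rotation expressed as a negative angle.

```python
from dataclasses import dataclass
from typing import Tuple

# rotation_type values as described in the text.
AXES = {0: "yaw", 1: "pitch", 2: "roll"}


@dataclass
class RotationRule:
    """Hypothetical container for the rotation display rule fields."""
    loop: int               # 1: keep rotating; 0: rotate once, then show initial position
    rotation_type: int      # 0: yaw, 1: pitch, 2: roll
    negative_rotation: int  # 0: forward rotation; 1: backward rotation
    timescale: int          # ticks per second (assumed)
    duration: int           # ticks for one full rotation (assumed)


def rotation_at(rule: RotationRule, t_seconds: float) -> Tuple[str, float]:
    """Return (axis name, rotation angle in degrees) at display time t_seconds."""
    period = rule.duration / rule.timescale
    if rule.loop == 0 and t_seconds >= period:
        angle = 0.0  # single rotation finished: back at the initial position
    else:
        angle = 360.0 * (t_seconds % period) / period
    if rule.negative_rotation == 1 and angle != 0.0:
        angle = -angle
    return AXES[rule.rotation_type], angle


# One rotation every 4 seconds around the yaw axis, looping.
rule = RotationRule(loop=1, rotation_type=0, negative_rotation=0,
                    timescale=1000, duration=4000)
axis, angle = rotation_at(rule, 1.0)
```

With these values, one second into the display corresponds to a quarter turn around the yaw axis; with loop set to 0, any time at or past four seconds yields the initial position.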

[0128] Note that as a variation example of 1-1 of the second example of the third method, a plurality of various rotations in FIG. 8 may be combined. For example, signaling is performed by the structure shown in FIG. 10.

[0129] The signaling shown in FIG. 10 differs from the signaling shown in FIG. 8 in that a plurality of rotations is written in succession. Note that in the signaling shown in FIG. 10, the parameters having the same name as in FIG. 8 have similar semantics.

[0130] Furthermore, the signaling shown in FIG. 10 newly adds an angle parameter, which enables a rotation of less than one full turn. That is, angle_duration indicates the time taken to rotate through the angle indicated by angle.

……
……
……

Article link: https://patent.nweon.com/22556
