
Sony Patent | Methods, devices, and computer program products for improved 3D mesh texturing

Patent: Methods, devices, and computer program products for improved 3D mesh texturing

Publication Number: 20210241430

Publication Date: 2021-08-05

Applicant: Sony

Assignee: Sony Corporation

Abstract

Methods, systems, and computer program products for improving the generation of a 3D mesh texture include extracting a plurality of high frequency image components and a plurality of low frequency image components from a plurality of two-dimensional (2D) images of a three-dimensional (3D) object captured at respective points of perspective of the 3D object, generating a low frequency texture atlas from the plurality of low frequency image components, generating a high frequency texture atlas from the plurality of high frequency image components by performing a texturing operation comprising seam leveling on a subset of the plurality of high frequency image components, and generating the texture atlas by merging the low frequency texture atlas with the high frequency texture atlas.

Claims

  1. A method of generating a texture atlas comprising: extracting a plurality of high frequency image components and a plurality of low frequency image components from a plurality of two-dimensional (2D) images of a three-dimensional (3D) object captured at respective points of perspective of the 3D object; generating a low frequency texture atlas from the plurality of low frequency image components; generating a high frequency texture atlas from the plurality of high frequency image components by performing a texturing operation comprising seam leveling on a subset of the plurality of high frequency image components; and generating the texture atlas by merging the low frequency texture atlas with the high frequency texture atlas.

  2. The method of claim 1, wherein extracting the plurality of low frequency image components from the plurality of 2D images of the 3D object comprises performing a blurring operation on respective ones of the plurality of 2D images.

  3. The method of claim 1, wherein extracting the plurality of high frequency image components from the plurality of 2D images comprises subtracting respective ones of the low frequency image components from respective ones of the plurality of 2D images.

  4. The method of claim 1, further comprising: extracting a plurality of high frequency intermediate image components from the plurality of 2D images; extracting a plurality of middle frequency intermediate image components from the plurality of 2D images; and extracting a plurality of low frequency intermediate image components from the plurality of 2D images, wherein extracting the plurality of high frequency image components comprises merging the plurality of high frequency intermediate image components and the plurality of middle frequency intermediate image components, and wherein generating the plurality of low frequency image components comprises merging the plurality of low frequency intermediate image components and the plurality of middle frequency intermediate image components.

  5. The method of claim 4, further comprising: generating a plurality of first blurred images by performing a blurring operation on respective ones of the plurality of 2D images, and generating a plurality of second blurred images by performing the blurring operation on respective ones of the plurality of first blurred images.

  6. The method of claim 5, wherein extracting the plurality of low frequency intermediate image components from the plurality of 2D images comprises selecting the plurality of second blurred images, wherein extracting the plurality of middle frequency intermediate image components from the plurality of 2D images comprises subtracting respective ones of the plurality of second blurred images from respective ones of the plurality of first blurred images, and wherein extracting the plurality of high frequency intermediate image components from the plurality of 2D images comprises subtracting respective ones of the plurality of first blurred images from respective ones of the plurality of 2D images.

  7. The method of claim 1, wherein a first number of the subset of the plurality of high frequency image components is less than a second number of the plurality of low frequency image components.

  8. The method of claim 1, further comprising selecting a first high frequency image component of the plurality of high frequency image components as part of the subset of the plurality of high frequency image components based on a quality of the first high frequency image component, an orientation of the first high frequency image component with respect to the 3D object, and/or a distance to the 3D object from which the first high frequency image component was captured.

  9. The method of claim 1, wherein the texturing operation comprising seam leveling comprises a Markov random field optimization operation.

  10. The method of claim 1, wherein generating the low frequency texture atlas based on the plurality of low frequency image components comprises summing, for each low frequency image component of the plurality of low frequency image components, a color value of the low frequency image component multiplied by a weight value.

  11. A computer program product for operating an imaging system, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied in the medium that when executed by a processor causes the processor to perform the method of claim 1.

  12. A system for processing images, the system comprising: a processor; and a memory coupled to the processor and storing computer readable program code that when executed by the processor causes the processor to perform operations comprising: extracting a plurality of high frequency image components and a plurality of low frequency image components from a plurality of two-dimensional (2D) images of a three-dimensional (3D) object captured at respective points of perspective of the 3D object; generating a low frequency texture atlas from the plurality of low frequency image components; generating a high frequency texture atlas from the plurality of high frequency image components by performing a texturing operation comprising seam leveling on a subset of the plurality of high frequency image components; and generating a texture atlas by merging the low frequency texture atlas with the high frequency texture atlas.

  13. The system of claim 12, wherein extracting the plurality of low frequency image components from the plurality of 2D images of the 3D object comprises performing a blurring operation on respective ones of the plurality of 2D images.

  14. The system of claim 12, wherein extracting the plurality of high frequency image components from the plurality of 2D images comprises subtracting respective ones of the low frequency image components from respective ones of the plurality of 2D images.

  15. The system of claim 12, wherein the operations further comprise: extracting a plurality of high frequency intermediate image components from the plurality of 2D images; extracting a plurality of middle frequency intermediate image components from the plurality of 2D images; and extracting a plurality of low frequency intermediate image components from the plurality of 2D images, wherein extracting the plurality of high frequency image components comprises merging the plurality of high frequency intermediate image components and the plurality of middle frequency intermediate image components, and wherein generating the plurality of low frequency image components comprises merging the plurality of low frequency intermediate image components and the plurality of middle frequency intermediate image components.

  16. The system of claim 15, wherein the operations further comprise: generating a plurality of first blurred images by performing a blurring operation on respective ones of the plurality of 2D images, and generating a plurality of second blurred images by performing the blurring operation on respective ones of the plurality of first blurred images.

  17. The system of claim 16, wherein extracting the plurality of low frequency intermediate image components from the plurality of 2D images comprises selecting the plurality of second blurred images, wherein extracting the plurality of middle frequency intermediate image components from the plurality of 2D images comprises subtracting respective ones of the plurality of second blurred images from respective ones of the plurality of first blurred images, and wherein extracting the plurality of high frequency intermediate image components from the plurality of 2D images comprises subtracting respective ones of the plurality of first blurred images from respective ones of the plurality of 2D images.

  18. The system of claim 12, wherein a first number of the subset of the plurality of high frequency image components is less than a second number of the plurality of low frequency image components.

  19. The system of claim 12, wherein the operations further comprise selecting a first high frequency image component of the plurality of high frequency image components as part of the subset of the plurality of high frequency image components based on a quality of the first high frequency image component, an orientation of the first high frequency image component with respect to the 3D object, and/or a distance to the 3D object from which the first high frequency image component was captured.

  20. (canceled)

  21. The system of claim 12, wherein generating the low frequency texture atlas based on the plurality of low frequency image components comprises summing, for each low frequency image component of the plurality of low frequency image components, a color value of the low frequency image component multiplied by a weight value.

Description

FIELD

[0001] Various embodiments described herein relate to methods and devices for image processing and, more particularly, to three-dimensional (3D) modeling.

BACKGROUND

[0002] Three-dimensional (3D) modeling may be used to create a representation of an object for use in a variety of applications, such as augmented reality, 3D printing, 3D model development, and so on. A 3D model may be defined by a collection of points in 3D space connected by various geometric entities such as triangles, lines, curved surfaces, or the like. One potential way to generate a 3D model of an object is via 3D scanning of the object. Although there are various methods to perform 3D scanning, one area of potential growth and development includes capturing a set of images by an image capture device. A collection of points in 3D space may be determined from corresponding feature points in the set of images. A mesh representation (e.g., a collection of vertices, edges, and faces representing a “net” of interconnected primitive shapes, such as triangles) that defines the shape of the object in three dimensions may be generated from the collection of points. Refinements to the mesh representation may be performed to further define details.

[0003] Creating a 3D mesh of an object only provides one component of the overall 3D model. In order for the virtual representation of the object to look realistic, color information is desirable. Known systems simply utilize a per-vertex color: for each vertex of the mesh a color triplet (RGB) may be specified. The color for each pixel in the rendered mesh may be interpolated from these colors. The color resolution for per-vertex methods may be very low, and the resultant model may have a low degree of detail and may not be realistic.

[0004] Texturing is one technique that may achieve better color quality than the per-vertex color methods. In texturing, one or several images are created in addition to the 3D mesh, and these images may be mapped onto the surface of the mesh. For each primitive shape (e.g., triangle) in the mesh there is a corresponding triangle in the texture image.

[0005] A texture image may be created from the 3D mesh and the set of images captured by an image capture device. However, it has been recognized by the inventors that many challenges are present in texturing and in the creation of the texture image.

SUMMARY

[0006] Various embodiments described herein provide methods, systems, and computer program products for generating an improved texture for a 3D model.

[0007] Various embodiments of present inventive concepts include a method of generating a texture atlas including extracting a plurality of high frequency image components and a plurality of low frequency image components from a plurality of two-dimensional (2D) images of a 3D object captured at respective points of perspective of the 3D object, generating a low frequency texture atlas from the plurality of low frequency image components, generating a high frequency texture atlas from the plurality of high frequency image components by performing a texturing operation comprising seam leveling on a subset of the plurality of high frequency image components, and generating the texture atlas by merging the low frequency texture atlas with the high frequency texture atlas.

[0008] Various embodiments of present inventive concepts include a system including a processor and a memory coupled to the processor and storing computer readable program code that when executed by the processor causes the processor to perform operations including extracting a plurality of high frequency image components and a plurality of low frequency image components from a plurality of 2D images of a 3D object captured at respective points of perspective of the 3D object, generating a low frequency texture atlas from the plurality of low frequency image components, generating a high frequency texture atlas from the plurality of high frequency image components by performing a texturing operation comprising seam leveling on a subset of the plurality of high frequency image components, and generating the texture atlas by merging the low frequency texture atlas with the high frequency texture atlas.

[0009] Various embodiments of present inventive concepts include a computer program product for operating an imaging system, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied in the medium that when executed by a processor causes the processor to perform a method including extracting a plurality of high frequency image components and a plurality of low frequency image components from a plurality of two-dimensional (2D) images of a 3D object captured at respective points of perspective of the 3D object, generating a low frequency texture atlas from the plurality of low frequency image components, generating a high frequency texture atlas from the plurality of high frequency image components by performing a texturing operation comprising seam leveling on a subset of the plurality of high frequency image components, and generating the texture atlas by merging the low frequency texture atlas with the high frequency texture atlas.
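
For readers who want a concrete picture of the flow recited above, the following Python sketch outlines one possible shape of the pipeline. It is a minimal sketch, assuming NumPy/SciPy image arrays; the helpers select_subset, make_low_atlas, make_high_atlas_with_seam_leveling, and merge_atlases are hypothetical placeholders standing in for operations described elsewhere in this document, not the patent's actual implementation.

```python
from scipy.ndimage import gaussian_filter

def split_frequencies(image, sigma=5.0):
    """Split a single-channel 2D image into low and high frequency components."""
    low = gaussian_filter(image, sigma=sigma)   # blurring operation -> low frequency component
    high = image - low                          # residual -> high frequency component
    return low, high

def generate_texture_atlas(images_2d, mesh, select_subset, make_low_atlas,
                           make_high_atlas_with_seam_leveling, merge_atlases):
    """Outline of the claimed flow; the atlas builders are supplied as callables."""
    lows, highs = zip(*(split_frequencies(img) for img in images_2d))

    # The low frequency atlas is built from every low frequency component.
    low_atlas = make_low_atlas(lows, mesh)

    # The high frequency atlas is built from only a selected subset of components,
    # using a texturing operation that includes seam leveling.
    high_atlas = make_high_atlas_with_seam_leveling(select_subset(highs), mesh)

    # The final texture atlas merges the two intermediate atlases.
    return merge_atlases(low_atlas, high_atlas)
```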

[0010] In some embodiments, extracting the plurality of low frequency image components from the plurality of 2D images of the 3D object comprises performing a blurring operation on respective ones of the plurality of 2D images.

[0011] In some embodiments, extracting the plurality of high frequency image components from the plurality of 2D images comprises subtracting respective ones of the low frequency image components from respective ones of the plurality of 2D images.
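
As an illustration of the two preceding paragraphs, the sketch below performs the blur-and-subtract separation on a single keyframe image. The summary recites only "a blurring operation"; the choice of a Gaussian blur and the sigma value are assumptions made for this example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

image = np.random.rand(480, 640)   # stand-in for one captured 2D keyframe image

low_component = gaussian_filter(image, sigma=5.0)   # blurring operation -> low frequency component
high_component = image - low_component              # subtraction -> high frequency component

# The two components sum back to the original image exactly.
assert np.allclose(low_component + high_component, image)
```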

[0012] Some embodiments may further include extracting a plurality of high frequency intermediate image components from the plurality of 2D images, extracting a plurality of middle frequency intermediate image components from the plurality of 2D images, and extracting a plurality of low frequency intermediate image components from the plurality of 2D images, where extracting the plurality of high frequency image components comprises merging the plurality of high frequency intermediate image components and the plurality of middle frequency intermediate image components, and generating the plurality of low frequency image components comprises merging the plurality of low frequency intermediate image components and the plurality of middle frequency intermediate image components.

[0013] Some embodiments may further include generating a plurality of first blurred images by performing a blurring operation on respective ones of the plurality of 2D images, and generating a plurality of second blurred images by performing the blurring operation on respective ones of the plurality of first blurred images. In some embodiments, extracting the plurality of low frequency intermediate image components from the plurality of 2D images comprises selecting the plurality of second blurred images, extracting the plurality of middle frequency intermediate image components from the plurality of 2D images comprises subtracting respective ones of the plurality of second blurred images from respective ones of the plurality of first blurred images, and extracting the plurality of high frequency intermediate image components from the plurality of 2D images comprises subtracting respective ones of the plurality of first blurred images from respective ones of the plurality of 2D images.
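
The two-stage blurring described above can be sketched as follows. The band arithmetic mirrors the text (the middle frequency band is merged into both the high and the low frequency components); the Gaussian blur and the sigma value are assumptions for illustration.

```python
from scipy.ndimage import gaussian_filter

def three_band_split(image, sigma=3.0):
    first_blur = gaussian_filter(image, sigma=sigma)         # first blurred image
    second_blur = gaussian_filter(first_blur, sigma=sigma)   # second blurred image

    high_band = image - first_blur        # high frequency intermediate component
    mid_band = first_blur - second_blur   # middle frequency intermediate component
    low_band = second_blur                # low frequency intermediate component

    # The middle frequency band contributes to both merged components.
    high_component = high_band + mid_band   # equals image - second_blur
    low_component = low_band + mid_band     # equals first_blur
    return low_component, high_component
```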

[0014] In some embodiments, a first number of the subset of the plurality of high frequency image components is less than a second number of the plurality of low frequency image components.

[0015] Some embodiments may further include selecting a first high frequency image component of the plurality of high frequency image components as part of the subset of the plurality of high frequency image components based on a quality of the first high frequency image component, an orientation of the first high frequency image component with respect to the 3D object, and/or a distance to the 3D object from which the first high frequency image component was captured.
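
A minimal sketch of such a selection step is shown below. The record fields (sharpness as a quality proxy, view angle, capture distance) and the scoring weights are illustrative assumptions; the text above names only the criteria, not how they are combined.

```python
from dataclasses import dataclass

@dataclass
class HighFreqComponent:
    sharpness: float        # proxy for image quality (higher is better)
    view_angle_deg: float   # orientation of the view relative to the imaged surface
    distance_m: float       # distance from the camera to the 3D object at capture time

def score(component: HighFreqComponent) -> float:
    # Favor sharp, frontal, close-range components; the weights are arbitrary.
    return component.sharpness - 0.01 * component.view_angle_deg - 0.5 * component.distance_m

def select_subset(components, keep):
    """Keep the highest-scoring components as the subset used for the high frequency atlas."""
    return sorted(components, key=score, reverse=True)[:keep]
```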

[0016] In some embodiments, the texturing operation comprising seam leveling comprises a Markov random field optimization operation.
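
The paragraph above states only that the texturing operation comprises a Markov random field optimization. As a purely illustrative instance of that idea (not the patent's algorithm), the sketch below assigns each mesh face a source-view label by minimizing a per-face data cost plus a Potts smoothness penalty between adjacent faces, using iterated conditional modes as a deliberately simple solver; production pipelines such as mvs-texturing use stronger solvers and follow the labeling with the actual seam leveling.

```python
import numpy as np

def mrf_view_labels(data_cost, adjacency, smooth_weight=1.0, iterations=10):
    """data_cost: (num_faces, num_views) array; adjacency: list of neighbor index lists."""
    labels = np.argmin(data_cost, axis=1)   # initialize with the best data term per face
    num_faces, num_views = data_cost.shape
    for _ in range(iterations):
        for face in range(num_faces):
            costs = data_cost[face].astype(float)
            for neighbor in adjacency[face]:
                # Potts penalty whenever a face and its neighbor use different source views.
                costs += smooth_weight * (np.arange(num_views) != labels[neighbor])
            labels[face] = int(np.argmin(costs))
    return labels
```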

[0017] In some embodiments, generating the low frequency texture atlas based on the plurality of low frequency image components comprises summing, for each low frequency image component of the plurality of low frequency image components, a color value of the low frequency image component multiplied by a weight value.
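
A per-texel version of that weighted summation might look like the sketch below. The paragraph recites only the color-times-weight summation; normalizing by the total weight (so that overlapping views average rather than accumulate) is an added assumption.

```python
import numpy as np

def low_frequency_atlas(components, weights, eps=1e-8):
    """components: per-view (H, W) arrays already mapped into atlas space;
    weights: matching per-texel weight arrays (e.g., view-quality weights)."""
    weighted_sum = sum(c * w for c, w in zip(components, weights))
    total_weight = sum(weights) + eps        # eps avoids division by zero in empty texels
    return weighted_sum / total_weight
```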

[0018] Advantageously, these embodiments may provide an efficient processing method which performs a frequency separation utilizing high and low frequency image components to generate high and low frequency texture atlases that results in an improved texture atlas with fewer artifacts. In some embodiments, only a subset of the high frequency image components may be provided to generate the high frequency texture atlas, which may require fewer processing resources, while still generating a high quality high frequency texture atlas. In some embodiments, a low frequency texture atlas may be generated from the full set of low frequency image components using operations that are efficient with respect to processor and memory resources, thus generating the low frequency texture atlas efficiently while including the full set of information from the low frequency image components. The use of the full set of information from the low frequency image components may provide a higher dynamic range in the low frequency texture atlas as compared to operations which use a single keyframe image to generate portions of the texture atlas. Thus, a high quality final texture atlas may be generated, while not requiring the full time to iterate over all of the keyframe images for the texturing operation.

[0019] It is noted that aspects of the inventive concepts described with respect to one embodiment, may be incorporated in a different embodiment although not specifically described relative thereto. That is, all embodiments and/or features of any embodiment can be combined in any way and/or combination. Other operations according to any of the embodiments described herein may also be performed. These and other aspects of the inventive concepts are described in detail in the specification set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS

[0020] The above and other objects and features will become apparent from the following description with reference to the following figures, wherein like reference numerals refer to like parts throughout the various figures unless otherwise specified.

[0021] FIG. 1 illustrates the use of a camera as part of a 3D construction of an object, according to various embodiments described herein.

[0022] FIGS. 2A, 2B, and 2C illustrate examples of keyframe images used to generate a 3D mesh.

[0023] FIG. 3A illustrates an example of formulating a 3D mesh from a point cloud.

[0024] FIG. 3B illustrates an example of a completed mesh representation.

[0025] FIG. 3C is a diagram illustrating the relationship between a texture atlas and a mesh representation.

[0026] FIG. 4 is a block diagram illustrating the creation of a texture atlas using frequency separation and a texturing operation utilizing seam leveling, according to various embodiments described herein.

[0027] FIG. 5 is a flowchart of operations for creating a texture atlas using frequency separation and a texturing operation utilizing seam leveling, according to various embodiments described herein.

[0028] FIGS. 6A-6D are block diagrams and flowcharts illustrating various aspects of a sub-operation of FIG. 5 for extracting high and low frequency image components from keyframe images, according to various embodiments described herein.

[0029] FIG. 7 is a block diagram of an electronic device capable of implementing the inventive concepts, according to various embodiments described herein.

DETAILED DESCRIPTION

[0030] Various embodiments will be described more fully hereinafter with reference to the accompanying drawings. Other embodiments may take many different forms and should not be construed as limited to the embodiments set forth herein. Like numbers refer to like elements throughout.

[0031] Applications such as 3D imaging, mapping, and navigation may use techniques such as Simultaneous Localization and Mapping (SLAM), which provides a process for constructing and/or updating a map of an unknown environment while simultaneously keeping track of an object’s location within it. 2D images of real objects may be captured with the objective of creating a representation of a 3D object that is used in real-world applications such as augmented reality, 3D printing, and/or 3D visualization with different perspectives of the real objects. As described above, the generated 3D representation of the objects may be characterized by feature points, which are specific locations on the physical object in the 2D images that are of importance for the 3D representation, such as corners, edges, center points, and other specific areas on the physical object. There are several algorithms used for solving this computational problem associated with 3D imaging, using various approximations. Popular approximate solution methods include the particle filter and the Extended Kalman Filter (EKF). The particle filter, also known as a Sequential Monte Carlo (SMC) method, represents probabilistic estimates of data points with a set of weighted samples. The Extended Kalman Filter is used for non-linear state estimation in applications including navigation systems such as Global Positioning Systems (GPS), self-driving cars, unmanned aerial vehicles, autonomous underwater vehicles, planetary rovers, newly emerging domestic robots, medical devices inside the human body, and/or imaging systems. Imaging systems may generate 3D representations of an object using SLAM techniques by performing a transformation of the object in a 2D image to produce a representation of a physical object. The 3D representation may ultimately be a mesh that defines a surface of the representation of the object.

[0032] Once generated, the 3D representation and/or mesh may be further updated to include a surface texture, which can provide colors and/or other details to make the 3D representation more realistic. The 2D images used to create the 3D representation may be used to provide a source for the texture to be applied to the 3D representations. Various embodiments described herein may arise from recognition that techniques such as those described herein may provide for a more efficient generation of a texture for a 3D representation that is of higher quality than conventional techniques.

[0033] The 2D images used in the methods, systems, and computer program products described herein may be captured with image sensors. Image sensors may be collocated with or integrated with a camera. The terms “image sensor” and “camera” will be used interchangeably herein. The camera may be implemented with integrated hardware and/or software as part of an electronic device, or as a separate device. Types of cameras may include mobile phone cameras, security cameras, wide-angle cameras, narrow-angle cameras, stereoscopic cameras, and/or monoscopic cameras.

[0034] Generating a 3D mesh of a physical object may involve the use of a physical camera to capture multiple images of the physical object. For instance, the camera may be rotated around the physical object being scanned to capture different portions and/or perspectives of the physical object. Based on the generated images, a mesh representation of the physical object may be generated. The mesh representation may be used in many different environments. For example, the model of the physical object represented by the mesh representation may be used for augmented reality environments, 3D printing, entertainment, and the like.

[0035] As part of context for the present application, FIG. 1 illustrates the use of a camera 100 as part of a 3D construction of an object 135, according to various embodiments described herein. For example, as illustrated in FIG. 1, a camera 100 may be used to take a series of images (e.g., 130a, 130b) of the object 135, such as a person’s face or other object, at location 120a. The camera 100 may be physically moved around the object 135 to various locations such as location 120b, location 120c, and/or location 120d. Though only four camera locations are illustrated in FIG. 1, it will be understood that more or fewer camera locations may be used to capture images of the object 135. In some embodiments, the object 135 may be moved in relation to the camera 100. One or more images of the object 135 may be captured at each location. For example, image 130a may be captured when the camera 100 is at location 120a and image 130b may be captured when the camera 100 is at location 120b. Each of the captured images may be 2D images. There may be a continuous flow of images from the camera 100 as the camera 100 moves around the object 135 that is being scanned to capture images at various angles. Once the images, such as images 130a and 130b are captured, the images may be processed by a processor in camera 100 and/or a processor external to the camera 100 to generate a 3D image. In some embodiments, a baseline initialization of the 3D image may occur once the first two images are captured. The quality of the baseline initialization may be evaluated to see if a satisfactory baseline initialization has occurred. Otherwise, further processing of additional images may take place.

[0036] In some embodiments, the baseline initialization may indicate the object 135 to be scanned, as well as overall rough dimensions of the object 135. An initial mesh representation may be formed to enclose the dimensions of the object 135, and further images may be repeatedly processed to refine the mesh representation of the object 135.

[0037] The images may be processed by identifying points on the object 135 that were captured in the first image 130a, the second image 130b, and/or subsequent images. The points may be various edges, corners, or other points on the object 135. The points are recognizable locations on the physical object 135 that may be tracked in various images of the physical object 135. Still referring to FIG. 1, points on the object 135 may include points 140 through 144. When the camera 100 moves to a different location 120b, another image 130b may be captured. This same process of capturing images and identifying points may occur on the order of tens, hundreds, or thousands (or more) of times in the context of creating a 3D representation. The same points 140 through 144 may be identified in the second image 130b. The spatial coordinates, for example, the X, Y, and/or Z coordinates, of the points 140 through 144 may be estimated using various statistical and/or analysis techniques.
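
One common example of such an estimation technique, offered only as an illustration and not as the patent's specific method, is linear (DLT) triangulation: given the camera projection matrices for two views and the pixel locations of the same point in both images, the point's 3D coordinates can be recovered.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) pixel observations of one point."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)       # least-squares solution of A X = 0
    X = vt[-1]
    return X[:3] / X[3]               # homogeneous -> Euclidean 3D coordinates
```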

[0038] FIGS. 2A, 2B, and 2C illustrate various images, referred to as keyframe images 130, of an object 135. From among the series of images taken as discussed with respect to FIG. 1, specific images known as keyframe images 130 may be selected. A keyframe image 130 may be an anchor frame, selected from among the many pictures taken of the object 135 based on certain criteria, such as a stable pose and/or an even light, color, and/or physical distribution around the object. The keyframe images 130 may be a subset of all of the images (e.g., images 130a, 130b of FIG. 1) that are taken of the object 135. The keyframe images 130 may be stored with additional metadata, such as, for example, the pose information of the camera that captured the image. The pose information may indicate an exact location in space where the keyframe image 130 was taken.
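
In code, a keyframe and its pose metadata might be stored as in the brief sketch below; the field names are hypothetical and chosen only to illustrate that the image travels together with the camera pose at capture time.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Keyframe:
    image: np.ndarray      # the selected 2D keyframe image
    rotation: np.ndarray   # 3x3 camera rotation at capture time (pose metadata)
    position: np.ndarray   # 3-vector camera location in space (pose metadata)
```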

[0039] Referring now to FIG. 2A, in a first keyframe image 130, the object 135 is oriented straight at the camera. Referring now to FIG. 2B, in a second keyframe image 130, the camera is offset from a perpendicular (e.g., straight-on and/or normal) view of the object 135 by about 30 degrees. Referring now to FIG. 2C, in a third keyframe image 130, the camera is offset from a perpendicular (e.g., straight-on and/or normal) view of the object 135 by about 45 degrees. Thus, keyframe images 130 of FIGS. 2A, 2B, and 2C illustrate approximately 45 degrees of the object 135.

[0040] FIG. 3A illustrates the generation of a point cloud 200 and mesh representation 400 based on a 2D image, according to various embodiments described herein. As illustrated in FIG. 3A, analysis of the orientation and position information of a set of images (e.g., images 130a and 130b of FIG. 1) may result in the identification of points (e.g., points 140 through 144 of FIG. 1), collectively referred to as a point cloud 200, which is a plurality of points identified from respective images of the object 135. From the identified plurality of points 200, characteristics of the mesh representation 400 of the object 135 may be updated. As described herein, the mesh representation 400 may be composed of a plurality of polygons 300 including edges 330 and vertices 320.

[0041] Respective vertices 320 of the mesh representation 400 may be associated with the surface of the object 135 being scanned and tracked. The point cloud 200 may represent contours and/or other features of the surface of the object 135. Operations for generating a mesh representation 400 of the object 135 may attempt to map the point cloud 200 extracted from a 2D image of the object 135 onto the polygons 300 of the mesh representation 400. It will be recognized that the mesh representation 400 is incrementally improved based on subsequent images, as the subsequent images provide additional points to the point cloud 200 which may be mapped to the plurality of polygons 300 of the mesh representation 400.

[0042] Refining the mesh representation 400 given a point cloud 200 may involve mathematically projecting the 3D location of the plurality of points 200 inferred from an image into and/or onto the mesh representation 400. For each point of the plurality of points 200, an analysis may be performed to determine whether the point lies on the mesh representation 400, or whether the point is off (e.g., above/below/beside in a 3D space) the mesh representation 400. If the point is on the mesh representation 400, the point may be associated with a polygon of the polygons 300 of the mesh representation 400 that contains the point. If the point is off the mesh representation 400, it may indicate that the mesh representation 400 needs to be adjusted. For example, the point may indicate that the arrangement of the polygons 300 of the current mesh representation 400 is inaccurate and needs to be adjusted.

[0043] In some embodiments, to adjust the mesh representation 400, a vertex 320 of one of the polygons 300 of the mesh representation 400 may be moved to a location in 3D space corresponding to the point of the point cloud 200 being analyzed. In some embodiments, to adjust the mesh representation 400, the polygons 300 of the mesh representation 400 may be reconfigured and/or new polygons 300 added so as to include a location in 3D space corresponding to the point of the point cloud 200 being analyzed in the surface of the mesh representation 400. In some embodiments, the adjustment of the mesh representation 400 may be weighted so that the mesh representation 400 moves toward, but not entirely to, the location in 3D space corresponding to the point of the point cloud 200 being analyzed. In this way, the mesh representation 400 may gradually move towards the points of a point cloud 200 as multiple images are scanned and multiple point clouds 200 are analyzed.
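
The weighted adjustment described above can be sketched as a partial step of a vertex toward the observed point, so the mesh converges gradually as more images and point clouds are processed; the step factor is an assumption for illustration.

```python
import numpy as np

def adjust_vertex(vertex_xyz, point_xyz, step=0.25):
    """Move a mesh vertex part of the way toward a point cloud point (0 < step <= 1)."""
    vertex_xyz = np.asarray(vertex_xyz, dtype=float)
    point_xyz = np.asarray(point_xyz, dtype=float)
    return vertex_xyz + step * (point_xyz - vertex_xyz)   # toward, but not onto, the point
```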

[0044] FIG. 3B illustrates an example of a completed mesh representation 400 of an object 135 that may be generated, for example, from a set of keyframe images such as keyframe images 130 of FIGS. 2A-2C. Referring to FIG. 3B, a mesh representation 400 of the object 135 may include an exterior surface 151 that includes a plurality of polygons 300. The plurality of polygons 300 may provide a representation of an exterior surface of the object 135. For example, the plurality of polygons 300 may model features (such as features at the points 140-144 of FIG. 1) on the exterior surface of the object 135. In some embodiments, the plurality of polygons 300 may include a plurality of triangles, and are referred to as such herein. Each of the plurality of polygons 300 may have one or more vertices, which may be represented by a three-dimensional coordinate (e.g., a coordinate having three data values, such as an x-value, a y-value, and a z-value). This may be referred to herein as a “3D-coordinate.”

[0045] A mesh representation, such as the mesh representation 400 of FIG. 3B, is one component of a 3D model of the 3D object. In order for the virtual representation of the object to look realistic, it is desirable to add color, detail, or other texture information. This information may be stored in a texture (also referred to herein as a “texture atlas”). FIG. 3C is a diagram illustrating the relationship between a texture 160 and a mesh representation 400’. Mesh representation 400’ of FIG. 3C and the mesh representation 400 of FIG. 3B are similar, though they differ in that one is a mesh representation of a head only and the other is a mesh representation of an entire body. In addition to a three-dimensional coordinate, each vertex 320 may have a two-dimensional texture coordinate (e.g., a coordinate having two data values, such as a u-value and a v-value) indicating which part of the texture 160 corresponds to the vertex 320. The texture coordinate may be referred to herein as a “UV coordinate.” A rendering engine may then apply, or sample, the texture atlas 160 to the vertices 320, in effect “painting” each vertex, or each triangle of the mesh representation 400’, with the corresponding part of the texture 160. As seen in FIG. 3C, texture 160 may have one or more islands 161, where color or other texture information associated with vertices may be located, separated by gaps 162, where color, detail, surface texture, or other texture information not associated with vertices may be located. In some embodiments, this may be some static color (e.g., black).
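
A minimal sketch of the vertex/texture relationship described above follows: each vertex carries both a 3D coordinate and a UV coordinate, and the UV coordinate selects the corresponding texel from the texture atlas. Nearest-texel sampling is used here for brevity; real renderers typically interpolate.

```python
import numpy as np

def sample_texture(atlas, uv):
    """atlas: (H, W, 3) texture image; uv: (u, v) coordinates in [0, 1]."""
    h, w = atlas.shape[:2]
    x = int(uv[0] * (w - 1))   # u selects the column
    y = int(uv[1] * (h - 1))   # v selects the row
    return atlas[y, x]

vertex = {
    "position": np.array([0.1, 0.2, 0.3]),   # 3D-coordinate of the vertex
    "uv": np.array([0.42, 0.77]),            # UV coordinate into the texture atlas
}
color = sample_texture(np.zeros((1024, 1024, 3)), vertex["uv"])
```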

[0046] One aspect in generating a 3D model includes recognizing that the model may be presented or displayed on a two-dimensional display device (though this is not the only possible output of generating a 3D model). Computer graphics systems include algorithms to render a 3D scene or object to a 2D screen. When rendered on a display device, the mesh may be combined with the texture by taking the 3D coordinates of the vertices and projecting them into screen space using a camera position and parameters. These values may be provided, for example, to a vertex shader. Each pixel from the texture may be sampled using the UV coordinates. This may be performed, for example, in a fragment shader.
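
The projection step performed by a vertex shader can be sketched in a few lines: a vertex's 3D coordinate is transformed by the camera extrinsics and intrinsics into 2D screen coordinates, after which the fragment stage samples the texture at the interpolated UV coordinates. The camera parameterization below is a standard pinhole model used only for illustration.

```python
import numpy as np

def project_to_screen(vertex_xyz, R, t, K):
    """R: 3x3 rotation, t: 3-vector translation, K: 3x3 intrinsic matrix."""
    cam = R @ np.asarray(vertex_xyz, dtype=float) + t   # world space -> camera space
    pix = K @ cam                                        # camera space -> image plane
    return pix[:2] / pix[2]                              # perspective divide -> screen coordinates
```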

[0047] As discussed above with respect to FIG. 1, a camera 100 may be used to capture a plurality of images (e.g., images 130a, 130b) of a physical object 135, such as a head of a person, at different locations (e.g., locations 120a, 120b, 120c, 120d). The camera 100 may be physically moved around the physical object 135 to various locations, such as from the location 120a to a different location 120b. An image of the physical object 135 may be captured at each location. For example, image 130a is captured when the camera 100 is at the location 120a, and image 130b is captured when the camera 100 moves to the different location 120b.

[0048] It has been recognized by the inventors that creation of a texture having a high degree of detail is desirable. The captured images may be used not only to generate a mesh representation of the object, but also to derive color and detail information to use in generating a texture for the 3D model of the object. Additionally, it is desirable that the creation of the texture be completed in a relatively short timeframe using as few memory resources as possible, both from a computational savings (e.g., efficiency and/or hardware cost) perspective and from a user satisfaction perspective. One known texture creation algorithm is provided in an article by Waechter et al. entitled “Let There Be Color! Large-Scale Texturing of 3D Reconstructions,” European Conference on Computer Vision, Springer, Cham, 2014 (hereinafter referred to as “Waechter”). The inventors have recognized several deficiencies with the Waechter methods. In the Waechter methods, for example, a global optimization is performed first, in which an input image (e.g., a keyframe image) is selected for each triangle in the mesh, with the result that each triangle in the mesh is painted with data from one single camera image. The inventors have recognized that this can produce sub-optimal results in common situations, for example where the mesh is of lower resolution and the resulting texture of the triangles is of higher resolution. The Waechter methods also contain a complicated color adjustment step intended to handle global illumination differences, which can introduce undesirable artifacts affecting large regions of the resulting texture atlas when applied to a set of original keyframe images.

[0049] Other conventional techniques which map a keyframe image to each triangle of a mesh representation may have deficiencies similar to those described above for Waechter. In such algorithms, each keyframe image may be analyzed together with the mesh representation, and a decision may be made to assign a single keyframe per mesh triangle as its source of texture. After a unique keyframe is assigned to each triangle of the mesh representation, the keyframe images may be respectively projected onto each respective triangle. This technique may give rise to extremely strong seams where adjacent triangles draw from different keyframe images. Some techniques deal with this phenomenon by “seam leveling.” Some techniques include both local seam leveling and global seam leveling. Local seam leveling may affect only a small region around the seams, while global seam leveling may adjust the level of brightness in the entire keyframe image to match neighboring frames with minimized seams. One texturing operation which includes techniques such as these is the open-source mvs-texturing algorithm that is described, in part, in the Waechter paper discussed herein. Other texturing operations which utilize Markov random field optimization operations may have similar issues, such as those described in Lempitsky et al., “Seamless Mosaicing of Image-Based Texture Maps,” 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, Minn., 2007, pp. 1-6, and Dou et al., “High Quality Texture Mapping for Multi-view Reconstruction,” 2017 2nd International Conference on Multimedia and Image Processing (ICMIP), Wuhan, 2017, pp. 136-140. An example of a texturing algorithm that is not strictly formulated as a Markov random field optimization is described in Wang et al., “Improved 3D-model texture mapping with region-of-interest weighting and iterative boundary-texture updating,” 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Seattle, Wash., 2016, pp. 1-6. While seam leveling can improve some textures, the seam leveling itself may introduce new artifacts, albeit milder than the original seams. The inventors have recognized that the techniques described herein may minimize the new artifacts that texturing operations such as mvs-texturing cause, while maintaining the beneficial effects of the seam leveling. While reference is made to mvs-texturing operations herein, the techniques described herein may be applied to other texturing operations, such as those incorporating seam leveling.

……
……
……
