Microsoft Patent | Depth sensing with depth-adaptive illumination

Editor: 映维 | Category: Microsoft | January 3, 2014

Patent: Depth sensing with depth-adaptive illumination

Publication Number: 20140002445

Publication Date: 2014-01-02

Assignee: Microsoft Corporation

Abstract

An adaptive depth sensing system (ADSS) illuminates a scene with a pattern that is constructed based on an analysis of at least one prior-generated depth map. In one implementation, the pattern is a composite pattern that includes two or more component patterns associated with different depth regions in the depth map. The composite pattern may also include different illumination intensities associated with the different depth regions. By using this composite pattern, the ADSS can illuminate different objects in a scene with different component patterns and different illumination intensities, where those objects are located at different depths in the scene. This process, in turn, can reduce the occurrence of defocus blur, underexposure, and overexposure in the image information.

Claims

1. An adaptive depth sensing system, implemented by computing functionality, comprising: a depth determination module configured to generate a depth map based on image information received from a camera device; and pattern generation functionality configured to: analyze the depth map to provide an analysis result; generate a composite pattern based on the analysis result; and instruct a projector device to project the composite pattern onto a scene, the depth determination module and pattern generation functionality configured to repeatedly generate depth maps and composite patterns, respectively, the depth determination module and pattern generation functionality being implemented by the computing functionality.

2. The adaptive depth sensing system of claim 1, wherein the depth determination module and the pattern generation functionality cooperatively interact to reduce occurrence of at least one of: defocus blur in the image information; underexposure in the image information; and overexposure in the image information.

3. The adaptive depth sensing system of claim 1, wherein said pattern generating functionality includes a region analysis module that is configured to: identify depth regions in the depth map, each depth region corresponding to a region of the scene with similar depths with respect to a reference point; and identify a set of masks associated with the depth regions.

4. The adaptive depth sensing system of claim 3, wherein said pattern generating functionality also includes a pattern assignment module that is configured to assign component patterns to the respective depth regions, each component pattern including features having a particular property, and different component patterns including features having different respective properties.

5. The adaptive depth sensing system of claim 3, wherein said pattern generating functionality also includes an intensity assignment module that is configured to assign an illumination intensity $\beta$ for a depth region j in the depth map, as given by: $$\beta = \lambda \frac{d_j^2}{d_0^2} + b,$$ where $d_j$ is a representative depth of the depth region j in the depth map, $d_0$ is a reference depth, and $\lambda$ and b are constants.

6. The adaptive depth sensing system of claim 1, wherein the composite pattern is given by: $$P_t = \sum_{m=1}^{M} P_{S,m} B_{t,m} \beta_{t,m},$$ where $P_t$ is the composite pattern, M is a number of parts in the composite pattern, $P_{S,m}$ is a component pattern for application to part m, $B_{t,m}$ is a mask which defines the part m, $\beta_{t,m}$ is an illumination intensity for application to part m, and t is time.

7. A method, performed by computing functionality, for generating a depth map, comprising: receiving image information from a camera device, the image information representing a scene that has been illuminated with a composite pattern; generating a depth map based on the image information, the depth map having depth regions, each depth region corresponding to a region of the scene having similar depths with respect to a reference point; generating a new composite pattern having parts that are selected based on the respective depth regions in the depth map; instructing a projector device to project the new composite pattern onto the scene; and repeating said receiving, generating a depth map, generating a new composite pattern, and instructing at least one time.

8. The method of claim 7, wherein said generating a new composite pattern includes: identifying the depth regions in the depth map; and identifying a set of masks associated with the respective depth regions.

9. The method of claim 7, wherein said generating a new composite pattern includes assigning component patterns to the respective depth regions, each component pattern including features having a particular property, and different component patterns including features having different respective properties.

10. The method of claim 9, wherein the particular property associated with each component pattern is feature size.

11. The method of claim 10, wherein the features associated with each component pattern correspond to speckle features of a particular size.

12. The method of claim 9, wherein the particular property associated with each component pattern is code-bearing design.

13. The method of claim 9, wherein said assigning involves assigning a first component pattern having a first feature size $K_1$ for a first depth region $d_1$ and a second component pattern having a second feature size $K_2$ for a second depth region $d_2$, where $K_1 > K_2$ if $d_1 < d_2$.

14. The method of claim 9, wherein the component patterns are produced using a simulation technique.

15. The method of claim 7, wherein said generating a new composite pattern includes assigning illumination intensities to the respective depth regions.

16. The method of claim 15, wherein an illumination intensity $\beta$ for a depth region j is given by: $$\beta = \lambda \frac{d_j^2}{d_0^2} + b,$$ where $d_j$ is a representative depth of the depth region j in the depth map, $d_0$ is a reference depth, and $\lambda$ and b are constants.

17. The method of claim 7, wherein the new composite pattern is given by: $$P_t = \sum_{m=1}^{M} P_{S,m} B_{t,m} \beta_{t,m},$$ where $P_t$ is the new composite pattern, M is a number of parts in the new composite pattern, the parts being associated with different respective depth regions, $P_{S,m}$ is a component pattern for application to part m, $B_{t,m}$ is a mask which defines the part m, $\beta_{t,m}$ is an illumination intensity for application to part m, and t is time.

18. The method of claim 7, wherein the new composite pattern is formed based on a consideration of n previously generated depth maps, where $n \geq 2$.

19. A computer readable storage medium for storing computer readable instructions, the computer readable instructions providing an adaptive depth sensing system when executed by one or more processing devices, the computer readable instructions comprising: logic configured to receive image information from a camera device, the image information representing a scene that has been illuminated with a composite pattern; logic configured to generate a depth map based on the image information; logic configured to identify depth regions in the depth map, each depth region corresponding to a region of the scene with similar depths with respect to a reference point; logic configured to identify a set of masks associated with the depth regions; logic configured to assign component patterns to the respective depth regions, each component pattern including features having a particular property, and different component patterns including features having different respective properties; logic configured to assign illumination intensities to the respective depth regions; logic configured to produce a new composite pattern based on the masks, component patterns, and illumination intensities; and logic configured to instruct a projector device to project the new composite pattern onto the scene.

20. The computer readable storage medium of claim 19, wherein the new composite pattern is given by: $$P_t = \sum_{m=1}^{M} P_{S,m} B_{t,m} \beta_{t,m},$$ where $P_t$ is the new composite pattern, M is a number of parts in the new composite pattern, the parts being associated with different respective depth regions, $P_{S,m}$ is a component pattern for application to part m, $B_{t,m}$ is a mask which defines the part m, $\beta_{t,m}$ is an illumination intensity for application to part m, and t is time.

Description

BACKGROUND

[0001] A conventional structured light depth sensing system operates by projecting a fixed 2D pattern onto a scene. The depth sensing system then captures image information which represents the scene, as illuminated by the pattern. The depth sensing system then measures the shift that occurs between the original pattern that is projected onto the scene and pattern content that appears in the captured image information. The depth sensing system can then use this shift, together with the triangulation principle, to determine the depth of surfaces in the scene.

[0002] A depth sensing system may produce image information having poor quality in certain circumstances. However, known depth sensing systems do not address these quality-related issues in a satisfactory manner.

SUMMARY

[0003] An adaptive depth sensing system (ADSS) is described herein which produces image information with improved quality (with respect to non-adaptive depth sensing systems). For example, the ADSS can reduce the occurrence of defocus blur, overexposure, and underexposure in the image information captured by the ADSS. The ADSS achieves this result by illuminating a scene with a pattern that is constructed based on an analysis of at least the last-generated depth map.

[0004] In one implementation, the ADSS operates by identifying different depth regions in the depth map(s). The ADSS then generates a composite pattern having different component patterns and illumination intensities assigned to the respective depth regions. Each component pattern includes features that have a particular property, and different component patterns include features having different respective properties. For example, in one non-limiting case, each component pattern includes features having a particular size and/or illumination intensity, and different component patterns include features having different respective sizes and/or illumination intensities.

[0005] The above approach can be manifested in various types of systems, components, methods, computer readable storage media, data structures, articles of manufacture, and so on.

[0006] This Summary is provided to introduce a selection of concepts in a simplified form; these concepts are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1 shows an illustrative environment that includes an adaptive depth sensing system (ADSS).

[0008] FIG. 2 graphically illustrates one manner by which the ADSS of FIG. 1 can generate a composite pattern.

[0009] FIG. 3 is a flowchart that represents an overview of one manner of operation of the ADSS of FIG. 1.

[0010] FIG. 4 is a flowchart that represents a more detailed explanation of one manner of operation of the ADSS of FIG. 1.

[0011] FIG. 5 shows functionality for generating a set of component patterns.

[0012] FIG. 6 is a flowchart that represents one way of generating the set of component patterns.

[0013] FIG. 7 is a flowchart that represents one approach for deriving a mapping table. The mapping table maps different depth ranges to different respective component patterns.

[0014] FIG. 8 shows a lens and ray diagram associated with a projector device and a camera device which interact with the ADSS. This diagram is used to derive a blur model.

[0015] FIG. 9 shows a relationship between depth and diameter of blur. This relationship is used to determine parameters of the blur model.

[0016] FIG. 10 shows relationships between depth and the matching accuracy, with respect to different component patterns. These relationships are produced by inducing blur in component patterns using the blur model, and comparing the un-blurred component patterns with the blurred component patterns.

[0017] FIG. 11 shows one representative mapping table which maps different depth ranges to different respective component patterns. This mapping table is produced, in part, based on analysis of the relationships of FIG. 10.

[0018] FIG. 12 shows illustrative computing functionality that can be used to implement any aspect of the features shown in the foregoing drawings.

[0019] The same numbers are used throughout the disclosure and figures to reference like components and features. Series 100 numbers refer to features originally found in FIG. 1, series 200 numbers refer to features originally found in FIG. 2, series 300 numbers refer to features originally found in FIG. 3, and so on.

DETAILED DESCRIPTION

[0020] This disclosure is organized as follows. Section A describes an overview of an illustrative adaptive depth sensing system (ADSS). Section B describes illustrative functionality for deriving a set of component patterns for use by the ADSS. Section C describes an illustrative approach for generating a mapping table for use by the ADSS. Section D describes an illustrative approach for setting up and initializing the ADSS. Section E describes illustrative computing functionality that can be used to implement any aspect of the features described in preceding sections.

[0021] As a preliminary matter, some of the figures describe concepts in the context of one or more structural components, variously referred to as functionality, modules, features, elements, etc. The various components shown in the figures can be implemented in any manner by any physical and tangible mechanisms, for instance, by software, hardware (e.g., chip-implemented logic functionality), firmware, etc., and/or any combination thereof. In one case, the illustrated separation of various components in the figures into distinct units may reflect the use of corresponding distinct physical and tangible components in an actual implementation. Alternatively, or in addition, any single component illustrated in the figures may be implemented by plural actual physical components. Alternatively, or in addition, the depiction of any two or more separate components in the figures may reflect different functions performed by a single actual physical component. FIG. 12, to be discussed in turn, provides additional details regarding one illustrative physical implementation of the functions shown in the figures.

[0022] Other figures describe the concepts in flowchart form. In this form, certain operations are described as constituting distinct blocks performed in a certain order. Such implementations are illustrative and non-limiting. Certain blocks described herein can be grouped together and performed in a single operation, certain blocks can be broken apart into plural component blocks, and certain blocks can be performed in an order that differs from that which is illustrated herein (including a parallel manner of performing the blocks). The blocks shown in the flowcharts can be implemented in any manner by any physical and tangible mechanisms, for instance, by software, hardware (e.g., chip-implemented logic functionality), firmware, etc., and/or any combination thereof.

[0023] As to terminology, the phrase "configured to" encompasses any way that any kind of physical and tangible functionality can be constructed to perform an identified operation. The functionality can be configured to perform an operation using, for instance, software, hardware (e.g., chip-implemented logic functionality), firmware, etc., and/or any combination thereof.

[0024] The term "logic" encompasses any physical and tangible functionality for performing a task. For instance, each operation illustrated in the flowcharts corresponds to a logic component for performing that operation. An operation can be performed using, for instance, software, hardware (e.g., chip-implemented logic functionality), firmware, etc., and/or any combination thereof. When implemented by a computing system, a logic component represents an electrical component that is a physical part of the computing system, however implemented.

[0025] The phrase "means for" in the claims, if used, is intended to invoke the provisions of 35 U.S.C. § 112, sixth paragraph. No other language, other than this specific phrase, is intended to invoke the provisions of that portion of the statute.

[0026] The following explanation may identify one or more features as "optional." This type of statement is not to be interpreted as an exhaustive indication of features that may be considered optional; that is, other features can be considered as optional, although not expressly identified in the text. Finally, the terms "exemplary" or "illustrative" refer to one implementation among potentially many implementations.

[0027] A. Overview

[0028] FIG. 1 shows an illustrative environment that includes an adaptive depth sensing system (ADSS) 102. The ADSS 102 uses a projector device 104 to project a pattern onto a scene 106. The ADSS 102 uses a camera device 108 to capture image information from the scene 106. That is, the image information represents the scene 106 as illuminated by the pattern. The ADSS 102 then analyzes the image information, in conjunction with the original pattern, to construct a depth map. The depth map represents the depths of surfaces within the scene 106.

[0029] For example, in the representative case of FIG. 1, the depth map represents the depths of the surfaces of a sphere 110 and pyramid 112 within the scene 106. The depths are measured with respect to a reference point, such as the source of the projected pattern (e.g., the projector device 104). In this representative example, the sphere 110 is located at an average distance $d_1$ with respect to the reference point, while the pyramid 112 is located at an average distance $d_2$ with respect to the reference point, where $d_1 < d_2$.

[0030] The ADSS 102 is adaptive in the sense that the manner in which it illuminates the scene 106 is dependent on its analysis of at least the last-computed depth map. More formally stated, the ADSS 102 performs analysis on at least depth map $D_{t-1}$ that has been produced at time t-1, to derive an analysis result. The ADSS 102 then generates a pattern $P_t$ based on this analysis result. In doing so, the ADSS 102 can provide appropriate illumination to different surfaces in the scene 106 which occur at different respective depths. For example, the ADSS 102 can illuminate the sphere 110 with one part of the pattern $P_t$ and illuminate the pyramid 112 with another part of the pattern $P_t$. This manner of operation, in turn, can improve the quality of the image information captured by the camera device 108. For example, the ADSS 102 can reduce the occurrence of defocus blur, underexposure, and/or overexposure in the image information.

[0031] With the above introduction, the components of FIG. 1 will now be described in detail. The ADSS 102 as a whole can be implemented by one or more computing devices of any type, such as a personal computer, a tablet computing device, a mobile telephone device, a personal digital assistant (PDA) device, a set-top box, a game console device, and so on. More generally, the functionality of the ADSS 102 can be contained at a single site or distributed over plural sites. For example, although not shown, one or more components of the ADSS 102 can rely on processing functionality that is provided at a remote location with respect to the projector device 104 and the camera device 108. The ADSS 102 can also include one or more graphical processing units that perform parallel processing of image information to speed up the operation of the ADSS 102. Section E provides further details regarding representative implementations of the ADSS 102.

[0032] The ADSS 102 is coupled to the projector device 104 and the camera device 108 through any type of communication conduits (e.g., hardwired links, wireless links, etc.). Alternatively, the projector device 104 and/or the camera device 108 may correspond to components in the same housing as the ADSS 102. The projector device 104 can be implemented as a programmable projector that projects a pattern onto a scene. In one merely representative case, for example, the projector device 104 can be implemented by the DLP® LightCommander™ produced by Texas Instruments, Incorporated, of Dallas, Tex., or any like technology. The camera device 108 can be implemented by any type of image capture technology, such as a CCD device. The projector device 104 and camera device 108 can emit and receive, respectively, any type of electromagnetic radiation, such as visible light, infrared radiation, etc.

[0033] While not illustrated in FIG. 1, in one representative setup, a lens of the projector device 104 can be vertically aligned with a lens of the camera device 108. (This is illustrated, for example, in FIG. 8, to be described in turn.) The vertical arrangement of the projector device 104 with respect to the camera device 108 ensures that pattern shift only occurs vertically. However, other implementations can adopt other orientations of the projector device 104 with respect to the camera device 108. Moreover, other implementations can include two or more projector devices (not shown) and/or two or more camera devices (not shown). For example, the use of two or more camera devices allows the ADSS 102 to provide a more comprehensive representation of objects within the scene 106.

[0034] The projector device 104 may also include a synchronization module 114. The synchronization module 114 synchronizes the operation of the projector device 104 with the camera device 108. For example, the synchronization module 114 can prompt the camera device 108 to capture image information in response to the projection of a pattern by the projector device 104, e.g., by sending a triggering signal from the projector device 104 to the camera device 108. In other implementations, the synchronization module 114 may represent standalone functionality (that is, functionality that is separate from the projector device 104), or functionality that is incorporated into some other module shown in FIG. 1.

[0035] The ADSS 102 itself can include (or can be conceptualized as including) multiple components that perform different tasks. A depth determination module 116 utilizes the principle of triangulation to generate a depth map of the scene 106, based on the original pattern $P_t$ that is projected onto the scene 106, and the pattern that appears in the image information captured by the camera device 108. More formally stated, the depth determination module 116 first determines a pattern shift $\Delta_t$ that occurs between image information $I_t$ captured by the camera device 108 at time t, and a reference image $I_t'$. (Additional explanation will be provided below regarding the manner in which the reference image $I_t'$ is computed.) For example, the depth determination module 116 can produce the pattern shift $\Delta_t$ by matching each pixel in an extended block in the captured image information $I_t$ with its most similar counterpart in the reference image $I_t'$. The depth determination module 116 can execute this matching in any manner, such as by using a normalized cross-correlation technique.

[0036] The depth determination module 116 can then determine the depth map, D.sub.t, based on the equation:

$$D_t = \frac{F H D_0}{F H + D_0 \Delta_t}, \quad t = 1, 2, \ldots$$

[0037] In this expression, F is the focal length of the camera device 108, H is the distance between the projector device's lens and the camera device's lens, $D_0$ refers to a depth that is used in connection with generating the reference image $I_t'$ (to be described below), and $\Delta_t$ is the pattern shift. In one representative implementation, the depth determination module 116 generates the depth map $D_t$ for each pixel of the captured image information $I_t$ because that is the granularity in which the pattern shift $\Delta_t$ is generated.
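The depth equation can be evaluated per pixel as follows. This is a sketch only; the function and parameter names are illustrative, and the units are an assumption (focal length and shift in pixels, lens separation and depths in metres).

```python
def depth_from_shift(shift, focal_length, baseline, reference_depth):
    """Depth D_t = F*H*D_0 / (F*H + D_0 * shift) for one pixel.

    shift           -- pattern shift (Delta_t) at this pixel
    focal_length    -- F, focal length of the camera device
    baseline        -- H, distance between projector lens and camera lens
    reference_depth -- D_0, depth associated with the reference image
    """
    fh = focal_length * baseline
    return fh * reference_depth / (fh + reference_depth * shift)
```

With a shift of zero the expression reduces to the reference depth, which is a convenient sanity check: a pixel whose pattern content has not moved relative to the reference image lies at depth D_0.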

[0038] At this stage, pattern generation functionality 118 analyzes the depth map to generate a new pattern. More specifically, the pattern generation functionality 118 generates a new pattern $P_t$ for a new time instance t based on a depth map $D_{t-1}$ that has been captured in a prior time instance, e.g., t-1. In other implementations, however, the pattern generation functionality 118 can generate a new pattern $P_t$ based on the n prior depth maps that have been generated, where $n \geq 2$. The pattern generation functionality 118 can resort to this approach in cases in which there is quick movement in the scene 106. Additional explanation will be provided at a later juncture of this description regarding the use of plural depth maps to calculate $P_t$. In the immediately following explanation, however, it will be assumed that the pattern generation functionality 118 generates $P_t$ based on only the last-generated depth map, namely $D_{t-1}$.

[0039] The pattern generation functionality 118 includes a region analysis module 120, a pattern assignment module 122, an intensity assignment module 124, and a pattern composition module 126. To begin with, the region analysis module 120 examines the depth map $D_{t-1}$ to identify depth regions in the depth map $D_{t-1}$ that have surfaces of similar depths (with respect to the reference point, such as the projector device 104). In one implementation, for example, the region analysis module 120 can use a region growing algorithm to identify M continuous and non-overlapping regions, each having a similar depth value. The region analysis module 120 can represent the depth of each depth region as the average of the depths (of surfaces) within the region. That is, the depth region $R(d_1)$ includes surfaces with an average depth $d_1$ with respect to the reference point, the depth region $R(d_2)$ includes surfaces with an average depth $d_2$ with respect to the reference point, and so on. In FIG. 1, assume that the surfaces of the sphere 110 are encompassed by a first depth region, while the surfaces of the pyramid 112 are encompassed by a second depth region.
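One simple form of region growing is a flood fill that joins neighbouring pixels whose depths are close. The sketch below is an assumption about how such an algorithm might look, not the specific algorithm used by the region analysis module 120: the 4-connectivity, the neighbour-to-neighbour tolerance `tol`, and the function name `grow_regions` are all illustrative choices.

```python
from collections import deque


def grow_regions(depth, tol):
    """Label connected regions of similar depth in a 2D depth map.

    depth -- list of rows of depth values
    tol   -- maximum depth difference between adjacent pixels of one region
    Returns (labels, means): a label per pixel and the average depth per region.
    """
    rows, cols = len(depth), len(depth[0])
    labels = [[-1] * cols for _ in range(rows)]
    means = []
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] != -1:
                continue  # pixel already belongs to a region
            label = len(means)
            labels[r0][c0] = label
            queue = deque([(r0, c0)])
            total, count = 0.0, 0
            while queue:
                r, c = queue.popleft()
                total += depth[r][c]
                count += 1
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and abs(depth[nr][nc] - depth[r][c]) <= tol):
                        labels[nr][nc] = label
                        queue.append((nr, nc))
            # Represent the region by the average of its depths.
            means.append(total / count)
    return labels, means
```

Each label then corresponds to one depth region, and `means[label]` plays the role of the region's average depth.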

[0040] The region analysis module 120 then defines a set of masks $B_t$ associated with the different depth regions, where $B_t = \{B_{t,m}\}_{m=1}^{M}$, and where $B_{t,m}$ refers to an individual mask within a set of M masks. To perform this task, the region analysis module 120 maps the depth regions that have been identified in the coordinate system of the camera device 108 to the coordinate system of the projector device 104. This mapping can be derived from the calibration of the projector device 104 and the camera device 108, described in Section D. The masks demarcate different parts of the pattern $P_t$ that is to be generated. For example, a first mask may demarcate a first part of the pattern $P_t$ that will be tailored to illuminate the sphere 110, while a second mask may demarcate a second part of the pattern $P_t$ that will be tailored to illuminate the pyramid 112.

[0041] A pattern assignment module 122 assigns a different component pattern to each depth region and its corresponding part of the pattern $P_t$. (Insofar as the pattern $P_t$ includes multiple components, it is henceforth referred to as a composite pattern to distinguish it from its constituent component patterns.) Further, each component pattern includes features having a particular property, where different component patterns include features having different respective properties. For example, in the detailed examples presented herein, each component pattern has speckle features of a particular size. In other implementations, each component pattern has code-bearing features having a particular design (e.g., permutation), such as binary-coded features having a particular design.

[0042] The pattern assignment module 122 assigns a component pattern to each depth region by consulting a mapping table. The mapping table maps different ranges of depths to different respective component patterns. That is, consider a depth region with an average depth of 1.5 m. The pattern assignment module 122 can consult the mapping table to determine the component pattern that is associated with this depth, where that component pattern has a particular feature property. The pattern assignment module 122 will then assign the identified component pattern to the part of the composite pattern $P_t$ that is devoted to the depth region in question.
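A mapping-table lookup of this kind can be sketched with a sorted list of range boundaries. The boundary values and pattern names below are invented purely for illustration; the actual table is derived as described in Section C.

```python
from bisect import bisect_right

# Hypothetical table: depth ranges are [0, 1.0), [1.0, 2.0), [2.0, inf) metres.
DEPTH_UPPER_BOUNDS = [1.0, 2.0]
# Hypothetical pattern identifiers, one per range; nearer ranges get larger
# speckle features, in line with the ordering described for FIG. 2.
PATTERN_IDS = ["speckle_large", "speckle_medium", "speckle_small"]


def pattern_for_depth(avg_depth):
    """Return the component pattern assigned to a region's average depth."""
    return PATTERN_IDS[bisect_right(DEPTH_UPPER_BOUNDS, avg_depth)]
```

Under these assumed boundaries, a region with an average depth of 1.5 m falls in the middle range and is assigned the medium-sized speckle pattern.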

[0043] An intensity assignment module 124 assigns an illumination intensity to each depth region and its corresponding part of the composite pattern P.sub.t. Generally, exposure E is related to illumination intensity q and depth d according to the equation:

$$E = \frac{q}{d^2}.$$

[0044] Further, consider the case of a candidate pattern with a dynamic range $[0, q_0]$, that is, with grayscale values ranging from 0 to $q_0$. Further assume that image information captured from the scene at reference depth $d_0$ is not underexposed (with respect to grayscale value 0) and not overexposed (with respect to grayscale value $q_0$), but would be underexposed and overexposed below and above this range, respectively. Proper exposure can be obtained for the same pattern with a scaled dynamic range [0, q] captured at depth d provided that:

$$q = \frac{d^2}{d_0^2} q_0.$$
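This scaling condition follows from the exposure relation. A short derivation, assuming that "proper exposure" means the upper grayscale value produces the same exposure at depth d as it does at the reference depth:

```latex
\frac{q}{d^2} = \frac{q_0}{d_0^2}
\quad\Longrightarrow\quad
q = \frac{d^2}{d_0^2}\, q_0
```

For instance, a surface twice as far away as the reference depth would call for an upper intensity value four times as large.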

[0045] The intensity assignment module 124 can leverage the above-described principle by changing the dynamic intensity range of each projected component pattern to achieve a desired illumination intensity, based on the average depth of a particular depth region. More specifically, the intensity assignment module 124 can assign an illumination intensity $\beta$ to each depth region j (and each corresponding part of the composite pattern $P_t$) according to the following equation:

$$\beta = \lambda \frac{d_j^2}{d_0^2} + b,$$

[0046] where $d_j$ corresponds to the average depth of the depth region j in question, $d_0$ corresponds to the reference depth, and $\lambda$ and b are empirically-determined constants. In one merely representative environment, $\lambda$ is set to approximately 0.9 and b is set to approximately 0.1.
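The intensity equation translates directly into code. The sketch below uses the representative constants quoted above as defaults; the function name is illustrative.

```python
def illumination_intensity(region_depth, reference_depth, lam=0.9, b=0.1):
    """Illumination intensity beta = lam * d_j^2 / d_0^2 + b for one region.

    region_depth    -- d_j, average depth of the depth region
    reference_depth -- d_0, the reference depth
    lam, b          -- empirically determined constants (representative
                       values of about 0.9 and 0.1 are quoted in the text)
    """
    return lam * region_depth ** 2 / reference_depth ** 2 + b
```

With these constants, a region at the reference depth receives an intensity of about 1.0, and regions nearer than the reference depth receive proportionally less.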

[0047] The pattern composition module 126 constructs the new composite pattern $P_t$ based on the masks that have been generated by the region analysis module 120, the component patterns that have been selected by the pattern assignment module 122, and the illumination intensities that have been calculated by the intensity assignment module 124. More formally stated, in one implementation, the pattern composition module 126 constructs the composite pattern $P_t$ based on the following equation:

$$P_t = \sum_{m=1}^{M} P_{S,m} B_{t,m} \beta_{t,m}, \quad t > 1,$$

[0048] where M refers to the number of parts in the composite pattern $P_t$, $P_{S,m}$ refers to the component pattern to be applied to the part m of the composite pattern $P_t$ (which, in turn, corresponds to a particular depth region in the depth map $D_{t-1}$), $B_{t,m}$ refers to the mask that demarcates the part m, and $\beta_{t,m}$ refers to the illumination intensity to be applied to the part m.
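The composition equation amounts to a masked, weighted sum of the component patterns. The following sketch assumes that patterns and masks are same-sized 2D lists in projector coordinates and that the product in the equation is taken element by element; the function name is illustrative.

```python
def compose_pattern(component_patterns, masks, intensities):
    """Build P_t as the sum over m of P_{S,m} * B_{t,m} * beta_{t,m}.

    component_patterns -- list of M 2D patterns (P_{S,m})
    masks              -- list of M 2D binary masks (B_{t,m})
    intensities        -- list of M illumination intensities (beta_{t,m})
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    composite = [[0.0] * cols for _ in range(rows)]
    for pattern, mask, beta in zip(component_patterns, masks, intensities):
        for r in range(rows):
            for c in range(cols):
                # Each mask selects the pixels of one part; since the parts
                # do not overlap, only one term contributes per pixel.
                composite[r][c] += pattern[r][c] * mask[r][c] * beta
    return composite
```

Because the masks demarcate non-overlapping parts, each pixel of the result comes from exactly one component pattern scaled by that part's intensity.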

[0049] The projector device 104 projects this composite pattern P.sub.t onto the scene 106, triggering the depth determination module 116 to capture another depth map, and the pattern generation functionality 118 to generate another composite pattern P.sub.t+1. In one implementation, this cycle of computation repeats for every captured frame of image information. However, in other implementations, the pattern generation functionality 118 can regenerate the composite pattern on a less frequent basis. For example, the pattern generation functionality 118 may only generate a new composite pattern if it detects significant movement of the objects within the scene 106.

[0050] Finally, set-up functionality 128 performs various roles in connection with setting up and initializing the ADSS 102. These tasks include: calibrating the projector device 104 and the camera device 108, capturing a reference image set I.sub.S, and capturing the first depth map. Section D provides additional details regarding these preliminary tasks.

[0051] Advancing to FIG. 2, this figure provides a high-level conceptual depiction of the operation of the pattern generation functionality 118 of the ADSS 102. Assume that the region analysis module 120 has identified at least three depth regions associated with the scene 106 and at least three corresponding masks. More specifically, the region analysis module 120 assigns a first mask B.sub.t,1 to a depth region associated with the sphere 110 and a second mask B.sub.t,2 to a depth region associated with the pyramid 112. The region analysis module 120 may also assign a background mask B.sub.t,M to all other regions besides the depth region associated with the sphere 110 and pyramid 112.

[0052] The pattern assignment module 122 assigns a first component pattern P.sub.S,1 to the part of the composite pattern demarcated by the first mask B.sub.t,1, a second component pattern P.sub.S,2 to the part of the composite pattern demarcated by the second mask B.sub.t,2, and a third component pattern P.sub.S,M to the part of the composite pattern demarcated by the third mask B.sub.t,M. In this non-limiting example, these three component patterns are speckle patterns. Further, note that the speckle features of these three component patterns have different respective sizes; for instance, the speckle features in the first component pattern P.sub.S,1 have the largest size and the speckle features in the third component pattern P.sub.S,M have the smallest size. As mentioned above, in another example, each component pattern may alternatively have a particular code-bearing design, differing from the code-bearing designs of other component patterns.

[0053] Note that the star shapes in FIG. 2 represent the speckle features in a high-level conceptual form. In an actual implementation, the speckle features may have random (or pseudo-random) shapes and random (or pseudo-random) arrangements of those shapes. Further, the actual sizes of the speckle features may be different from those depicted in FIG. 2. Section B provides additional details regarding how speckle component patterns having different sized features may be generated.

[0054] The intensity assignment module 124 assigns a first illumination intensity .beta..sub.t,1 to the part of the composite pattern demarcated by the first mask B.sub.t,1, a second illumination intensity .beta..sub.t,2 to the part of the composite pattern demarcated by the second mask B.sub.t,2, and a third illumination intensity .beta..sub.t,M to the part of the composite pattern demarcated by the third mask B.sub.t,M.

[0055] The pattern composition module 126 produces the final composite pattern P.sub.t by superimposing the above-described parts of the composite pattern P.sub.t. As shown, the part of the composite pattern P.sub.t associated with the sphere 110 includes the first component pattern P.sub.S,1 and is illuminated by the first illumination intensity .beta..sub.t,1. The part of the composite pattern P.sub.t associated with the pyramid 112 includes the second component pattern P.sub.S,2 and is illuminated by the second illumination intensity .beta..sub.t,2, and the part of the composite pattern P.sub.t associated with the remainder of the depth map includes the third component pattern P.sub.S,M and is illuminated by the third illumination intensity .beta..sub.t,M. When projected, the different parts of the composite pattern P.sub.t will effectively impinge on different parts of the scene 106.

[0056] FIG. 3 shows a procedure 300 which represents an overview of one manner of operation of the ADSS 102 of FIG. 1. FIG. 4 shows a procedure 400 that represents a more detailed explanation of the operation of the ADSS 102. Since the operation of the ADSS 102 has already been described in the context of FIG. 1, the explanation of FIGS. 3 and 4 will serve as a summary.

[0057] In block 302 of FIG. 3, the ADSS 102 receives image information from the camera device 108. The image information represents a scene that has been illuminated by a composite pattern. In block 304, the ADSS 102 generates a depth map based on the captured image information. In block 306, the ADSS 102 generates a new composite pattern having parts that are selected based on different depth regions within the depth map. In block 308, the ADSS 102 instructs the projector device 104 to project the new composite pattern onto the scene 106. This process repeats throughout the image capture session. Overall, the depth determination module 116 and the pattern generation functionality 118 cooperatively interact to reduce defocus blur, underexposure, and overexposure in the captured image information.

[0058] Advancing to FIG. 4, in block 402, the ADSS 102 performs various set-up tasks, such as calibrating the camera device 108 and the projector device 104. In block 404, the ADSS 102 generates an initial depth map. Section D (below) provides details regarding blocks 402 and 404.

[0059] In block 406, the ADSS 102 identifies depth regions in the depth map, and generates masks corresponding to the depth regions. In block 408, the ADSS 102 assigns component patterns to the depth regions. In block 410, the ADSS 102 assigns illumination intensities to the depth regions. In block 412, the ADSS 102 composes a new composite pattern P.sub.t based on the masks, component patterns, and illumination intensities that have been determined in blocks 406-410. Blocks 406-412 correspond to block 306 of FIG. 3.

[0060] In block 414 (corresponding to block 308 of FIG. 3), the ADSS 102 instructs the projector device 104 to project the new composite pattern P.sub.t onto the scene 106. In block 416 (corresponding to blocks 302 and 304 of FIG. 3), the ADSS 102 receives image information from the camera device 108 and generates a new depth map based on the image information. In the context of the subsequent generation of yet another new composite pattern P.sub.t+1, the depth map D.sub.t that is generated in block 416 can be considered the "prior" depth map.
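The feedback cycle of blocks 406-416 can be sketched as a simple loop. In the following Python example, every name (`run_adaptive_loop` and its callable parameters) is hypothetical, and the integer stand-ins in the usage lines only demonstrate the data flow; real depth maps and patterns would be images.

```python
def run_adaptive_loop(capture, estimate_depth, build_pattern, project,
                      initial_depth_map, num_frames):
    """Alternate pattern generation and depth estimation for num_frames frames.

    build_pattern(depth_map)        -> pattern   (blocks 406-412)
    project(pattern)                -> None      (block 414)
    capture()                       -> image     (block 416, image capture)
    estimate_depth(image, pattern)  -> depth_map (block 416, depth map)
    """
    depth_map = initial_depth_map
    history = [depth_map]
    for _ in range(num_frames):
        pattern = build_pattern(depth_map)  # uses the "prior" depth map
        project(pattern)
        image = capture()
        depth_map = estimate_depth(image, pattern)
        history.append(depth_map)
    return history


# Stub demo with integers standing in for patterns and depth maps.
projected = []
history = run_adaptive_loop(
    capture=lambda: "frame",
    estimate_depth=lambda image, pattern: pattern + 1,
    build_pattern=lambda depth_map: depth_map * 2,
    project=projected.append,
    initial_depth_map=1,
    num_frames=3,
)
```

Passing the stages in as callables keeps the sketch independent of any particular camera or projector interface, which the text does not specify.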

[0061] More specifically, recall that depth determination module 116 computes the pattern shift between the captured image information I.sub.t and the reference image I.sub.t', and then uses the triangulation principle to compute the depth map. The reference image I.sub.t' can be generated based on the equation:

$$I_t' = \sum_{m=1}^{M} I_{S,m}\, B_{t,m}', \quad t > 1.$$

[0062] I.sub.S,m refers to a component reference image in a reference image set I.sub.S. Section D describes one way in which the reference image set I.sub.S can be calculated. B.sub.t,m' refers to a resized version of the mask B.sub.t,m, determined by mapping between the projector device coordinates and the camera device coordinates. This mapping can be derived from the calibration of the projector device 104 and the camera device 108, which is also described in Section D.

[0063] As a final point, the ADSS 102 is described above as generating a new composite pattern P.sub.t based on only the last-generated depth map D.sub.t-1. But, as said, the ADSS 102 can alternatively generate the new composite pattern P.sub.t based on the n last depth maps, where n.gtoreq.2. In this case, the region analysis module 120 can analyze the plural depth maps to predict, for each object in the scene 106, the likely depth of that object at the time of projection of the composite pattern P.sub.t. The region analysis module 120 can perform this task by extending the path of movement of the object, where that path of movement is exhibited in the plural depth maps.
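One simple way to "extend the path of movement" is linear extrapolation of an object's depth over the stored depth maps. The following Python sketch assumes that approach; the text does not prescribe a specific predictor, and the name `predict_depth` and the sample depths are hypothetical.

```python
def predict_depth(depth_history):
    """Linearly extrapolate an object's depth one frame ahead.

    depth_history holds the object's average depth in each of the n last
    depth maps, oldest first. The average per-frame change over the
    history is added to the most recent depth.
    """
    if not depth_history:
        raise ValueError("need at least one depth observation")
    if len(depth_history) == 1:
        return depth_history[-1]  # no motion information available
    step = (depth_history[-1] - depth_history[0]) / (len(depth_history) - 1)
    return depth_history[-1] + step


# An object approaching the sensor by 0.25 m per frame.
predicted = predict_depth([2.0, 1.75, 1.5])
```

The predicted depth (rather than the last observed depth) would then drive the choice of component pattern and illumination intensity for that object.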

[0064] B. Deriving the Component Patterns

[0065] FIG. 5 shows a pattern set generation module (PSGM) 502 for generating a set of component patterns P.sub.s. As described in Section A, the component patterns include features having different respective properties. For example, as shown in FIG. 2, the component patterns may include speckle features having different respective sizes. A data store 504 stores the set of component patterns P.sub.s. The data store 504 is accessible to the pattern generation functionality 118.

[0066] FIG. 6 shows one procedure 600 for creating a set of component speckle patterns P.sub.s using a simulation technique. Generally, speckle occurs when many complex components with independent phase are superimposed. FIG. 6 simulates this phenomenon using the Discrete Fourier Transform (DFT). FIG. 6 will be explained below with respect to the generation of a single component pattern having a particular speckle feature size. But the same process can be repeated for each component speckle pattern in P.sub.s, having its associated speckle feature size.

[0067] Beginning with block 602, the PSGM 502 produces an N by N random phase matrix .THETA., defined as:

$$\Theta(x, y) = \begin{cases} \theta_{U[0,1]}, & 1 \le x, y \le N/K \\ 0, & \text{otherwise.} \end{cases}$$

[0068] In this equation, .theta..sub.U[0,1] denotes a random number taken from a uniform distribution on the interval [0,1], and K represents a factor of N that ultimately will determine the size of the speckle features in the component pattern being generated.

[0069] In block 604, the PSGM 502 produces a random complex matrix A, defined as:

$$A(x, y) = e^{2\pi i\, \Theta(x, y)}, \quad 1 \le x, y \le N.$$

[0070] In block 606, the PSGM 502 applies a 2D-DFT operation on A to yield another random complex matrix Z, defined as:

$$Z(x, y) = \sum_{u=1}^{N} \sum_{v=1}^{N} A(u, v)\, e^{-2\pi i (ux + vy)/N}, \quad 1 \le x, y \le N.$$

[0071] In block 608, the PSGM 502 generates the component pattern, represented by speckle signal S, by calculating the modulus square of each complex element in Z, to yield:

$$S(x, y) = |Z(x, y)|^2, \quad 1 \le x, y \le N.$$

[0072] In one merely representative environment, the PSGM 502 performs the procedure 600 of FIG. 6 for K=4, 8, 16, and 32, resulting in four component patterns. The component pattern for K=4 will have the smallest speckle feature size, while the component pattern for K=32 will have the largest speckle feature size.
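Blocks 602-608 can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the PSGM's actual implementation: the function name `speckle_pattern`, the seed parameter, and the small size N=16 are assumptions, and indices run from 0 rather than 1 (which only shifts the pattern circularly). The sketch follows the equations literally, so the phase of 0 outside the N/K block yields A = 1 there.

```python
import cmath
import random


def speckle_pattern(n, k, seed=0):
    """Return an n-by-n speckle signal S = |DFT2(A)|^2 with size factor k."""
    if n % k != 0:
        raise ValueError("k must be a factor of n")
    rng = random.Random(seed)
    block = n // k

    # Block 602: uniform random phase inside the (n/k)-by-(n/k) block, 0 elsewhere.
    theta = [[rng.random() if (x < block and y < block) else 0.0
              for x in range(n)] for y in range(n)]

    # Block 604: unit-modulus complex matrix A = exp(2*pi*i*Theta).
    a = [[cmath.exp(2j * cmath.pi * theta[y][x]) for x in range(n)]
         for y in range(n)]

    # Block 606: 2D DFT, computed separably (rows first, then columns).
    twiddle = [cmath.exp(-2j * cmath.pi * j / n) for j in range(n)]

    def dft_1d(values):
        return [sum(values[u] * twiddle[(u * x) % n] for u in range(n))
                for x in range(n)]

    rows = [dft_1d(row) for row in a]
    cols = [dft_1d([rows[y][x] for y in range(n)]) for x in range(n)]

    # Block 608: squared modulus of each element of Z.
    return [[abs(cols[x][y]) ** 2 for x in range(n)] for y in range(n)]


pattern = speckle_pattern(16, 4)
total_energy = sum(sum(row) for row in pattern)
```

A production version would normally use an FFT routine instead of the direct sums shown here; the direct form is used only to keep the example dependency-free.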

[0073] As mentioned above, other implementations of the ADSS 102 can use component patterns having other types of features to produce structured light when projected onto the scene 106. In those contexts, the PSGM 502 would generate other types of component patterns, such as component patterns having code-bearing features.

[0074] C. Deriving the Mapping Table

[0075] FIG. 7 represents a procedure 700 for deriving a mapping table. As described in Section A, the mapping table correlates different depth ranges in a scene with different component patterns. The component pattern that maps to a particular depth range corresponds to the appropriate component pattern to project onto surfaces that fall within the depth range, so as to reduce the effects of defocus blur, underexposure, and overexposure.

[0076] In block 702, a defocus model is derived to simulate the effects of blur at different depths. One manner of deriving the defocus model is described below with reference to the representative lens and ray diagram of FIG. 8. More specifically, FIG. 8 shows a projector lens 802 used by the projector device 104, and a camera lens 804 used by the camera device 108. In this merely representative configuration, the projector lens 802 is vertically disposed with respect to the camera lens 804. Further assume that the focal planes of both the projector device 104 and the camera device 108 correspond to focal plane 806, which occurs at distance d.sub.0 with respect to the projector lens 802 and the camera lens 804. The projector device 104 has an aperture A.sub.1, while camera device 108 has an aperture A.sub.2. A projector device emitter 808 projects light into the scene 106 via the projector lens 802, while a camera device CCD 810 receives light from the scene 106 via the camera lens 804. The camera device CCD 810 is located at a distance of l.sub.0 from the camera device lens 804. A point from the focal plane 806 (at distance d.sub.0) will converge on the camera device CCD 810.

[0077] Consider the case of an object plane at distance d. Because d is not coincident with d.sub.0, a blur circle having diameter C, on the object plane, is caused by defocus of the projector device 104. Using the principle of similar triangles, the projector device's blur circle C can be expressed as:

$$C = A_1 \frac{d_0 - d}{d_0}.$$

[0078] Moreover, a point from the object plane at distance d will converge beyond the camera device CCD 810, at a distance l to the camera device lens 804. Thus, another blur circle, of diameter c, on the camera device CCD 810, is caused by defocus of the camera device 108. Using the principle of similar triangles, the camera device's blur circle c can be expressed as:

$$c = \frac{l_0}{d}\, C + A_2 \frac{l - l_0}{l}.$$

[0079] Assume that the focal length of the camera device 108 is F. According to the lens equation:

$$\frac{1}{F} = \frac{1}{l_0} + \frac{1}{d_0} = \frac{1}{l} + \frac{1}{d}.$$

[0080] Using the above equation, c can be alternatively expressed as:

$$c = (A_1 + A_2)\, \frac{F}{d_0 - F} \cdot \frac{d_0 - d}{d}.$$
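The step from the two-term expression for c to the closed form can be checked numerically. The Python sketch below computes c both ways: first from the projector blur circle C and the image distances l.sub.0 and l given by the lens equation, and then from the closed form. The function names and the numeric values (apertures, focal length, depths, all in meters) are hypothetical.

```python
def camera_blur_two_step(a1, a2, f, d0, d):
    """c = (l0 / d) * C + a2 * (l - l0) / l, with l0 and l from the lens equation."""
    l0 = 1.0 / (1.0 / f - 1.0 / d0)  # image distance for the focal plane
    l = 1.0 / (1.0 / f - 1.0 / d)    # image distance for the object plane
    big_c = a1 * (d0 - d) / d0       # projector blur circle on the object plane
    return (l0 / d) * big_c + a2 * (l - l0) / l


def camera_blur_closed_form(a1, a2, f, d0, d):
    """c = (a1 + a2) * f / (d0 - f) * (d0 - d) / d."""
    return (a1 + a2) * f / (d0 - f) * (d0 - d) / d


# Hypothetical optics: 20 mm and 10 mm apertures, 50 mm focal length,
# focal plane at 2.0 m, object plane at 1.0 m.
args = (0.02, 0.01, 0.05, 2.0, 1.0)
two_step = camera_blur_two_step(*args)
closed = camera_blur_closed_form(*args)
```

Both routes agree because substituting l.sub.0 = F d.sub.0 / (d.sub.0 - F) and l = F d / (d - F) turns each of the two terms into its aperture multiplied by F (d.sub.0 - d) / (d (d.sub.0 - F)).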

[0081] The defocus blur can be modeled as an isotropic, two-dimensional Gaussian function. The standard deviation .sigma. of this function is proportional to the blur circle c. Since A.sub.1, A.sub.2, d.sub.0, and F are all constants, the standard deviation can be expressed as:

$$\sigma = k \left( \frac{d_0}{d} - 1 \right) + \sigma_0.$$

[0082] Returning to FIG. 7, in block 702, the parameters k and .sigma..sub.0 in the above equation are estimated. To perform this task, the projector device 104 can project a pattern with repetitive white dots as an impulse signal. The camera device 108 can then capture the resultant image information for different object planes at different depths. The diameter of blur (measured in pixels) exhibited by the image information can then be computed for different depths.

[0083] For example, FIG. 9 shows, for one illustrative environment, measurements of the diameter of blur with respect to different depths, where the focal plane is at a distance of 2.0 m. The least square method can then be used to fit a curve to the observations in FIG. 9. That curve yields the values of k and .sigma..sub.0. In one merely representative environment, k is estimated to be approximately 4.3 and .sigma..sub.0 is estimated to be approximately 1.0. Note that the diameter of blur converges to a low value on the far side of the focal plane (at distances greater than 2.0 m). Thus, it is not necessary to consider defocus blur at those far distances.
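The least-squares fit can be sketched as follows, assuming the defocus model reads .sigma. = k (d.sub.0/d - 1) + .sigma..sub.0, which is linear in the variable x = d.sub.0/d - 1. The function name `fit_defocus_model` is hypothetical, and the sample data is synthetic (generated from the representative values k = 4.3 and .sigma..sub.0 = 1.0), not the measurements of FIG. 9.

```python
def fit_defocus_model(depths, sigmas, d0):
    """Ordinary least-squares fit of sigma = k * (d0 / d - 1) + sigma0.

    Returns the pair (k, sigma0).
    """
    xs = [d0 / d - 1.0 for d in depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(sigmas) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sigmas))
    k = sxy / sxx
    sigma0 = mean_y - k * mean_x
    return k, sigma0


# Synthetic, noise-free observations for a focal plane at 2.0 m.
depths = [0.4, 0.8, 1.2, 1.6, 2.0]
synthetic = [4.3 * (2.0 / d - 1.0) + 1.0 for d in depths]
k, sigma0 = fit_defocus_model(depths, synthetic, 2.0)
```

With real, noisy blur measurements the recovered parameters would only approximate the underlying values; the noise-free data here simply confirms that the fit inverts the model.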

[0084] In block 706 of FIG. 7, the defocus model that has just been derived is used to induce blur in different component patterns, for different depths. Then, a measure is generated that reflects how closely the original (un-blurred) component pattern matches its blurred counterpart. For example, that measure can define the number of elements in the original component pattern that were successfully matched to their blurred counterparts. FIG. 10 shows the representative outcome of this operation, in one illustrative environment, for depths ranging from 0.4 m to 2.0 m, and for speckle feature sizes of K=1 (the smallest) to K=32 (the largest).

[0085] In block 708 of FIG. 7, component patterns are chosen for different depth ranges. Different considerations play a role in selecting a component pattern for a particular depth range. The considerations include at least: (a) the matching accuracy (as represented by FIG. 10); (b) local distinguishability; and (c) noise.

[0086] Matching Accuracy. Matching accuracy refers to the ability of a component pattern to effectively reduce defocus blur within a particular depth range. In this regard, FIG. 10 indicates that larger speckle features perform better than smaller speckle features as the object plane draws farther from the focal plane 806 (and closer to the projector lens 802 and camera lens 804). Consider, for example, a first component pattern having a first feature size K.sub.1, and a second component pattern having a second feature size K.sub.2, where K.sub.1>K.sub.2. By considering matching accuracy alone, the first component pattern is appropriate for a first depth range d.sub.1 and the second component pattern is appropriate for a second depth range d.sub.2, providing that d.sub.1<d.sub.2.

[0087] Local distinguishability. Local distinguishability refers to the ability of the ADSS 102 to obtain accurate depth readings around object boundaries. With respect to this consideration, smaller speckle features perform better than larger speckle features.

[0088] Noise. Noise refers to the amount of noise-like readings produced when capturing image information. With respect to this consideration, smaller speckle features induce more noise than larger speckle features.

[0089] A system designer can take all of the above-described factors into account in mapping different component patterns to respective depth ranges, e.g., by balancing the effects of each consideration against the others. Different mappings may be appropriate for different environment-specific objectives. In some cases, the system designer may also wish to choose a relatively small number of component patterns to simplify the operation of the ADSS 102 and make it more efficient.

[0090] FIG. 11 shows one mapping table which maps component patterns (associated with different speckle feature sizes) with different depth ranges. In block 710 of FIG. 7, the mapping table is stored in a data store 1102. The pattern assignment module 122 has access to the mapping table in the data store 1102.
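A mapping table of this kind can be represented as sorted range boundaries plus one feature size per range. The Python sketch below uses made-up boundaries and assignments; the actual contents of FIG. 11 are not reproduced in this text, so `RANGE_UPPER_BOUNDS`, `FEATURE_SIZES`, and `select_feature_size` are all hypothetical. The only property carried over from the text is that nearer depth ranges receive larger speckle features (larger K).

```python
import bisect

# Hypothetical table: upper bound (in meters) of each depth range, ascending.
RANGE_UPPER_BOUNDS = [0.6, 1.0, 1.5]
# One K per range; the last entry covers all depths beyond the final bound.
FEATURE_SIZES = [32, 16, 8, 4]


def select_feature_size(avg_depth):
    """Return the speckle size factor K for a region's average depth."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, avg_depth)
    return FEATURE_SIZES[index]


k_near = select_feature_size(0.5)
k_mid = select_feature_size(1.2)
k_far = select_feature_size(2.0)
```

Keeping the table small, as the text suggests, makes the per-region lookup a single binary search.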

[0091] D. Preliminary Operations

[0092] Returning to FIG. 4, in block 402, the system designer can calibrate the projector device 104 and the camera device 108. This ultimately provides a way of mapping the coordinate system of the projector device 104 to the coordinate system of the camera device 108, and vice versa. Known strategies can be used to perform calibration of a structured light system, e.g., as described in Jason Geng, "Structured-light 3D surface imaging: a tutorial," Advances in Optics and Photonics, Vol. 3, 2011, pp. 128-160. For example, block 402 may entail calibrating the camera device 108 by capturing image information of a physical calibration object (e.g., a checkerboard pattern) placed at known positions in the scene 106, and calibrating the camera device 108 based on the captured image information. Calibration of the projector device 104 (which can be treated as an inverse camera) may entail projecting a calibration pattern using the projector device 104 onto a calibration plane, capturing the image information using the calibrated camera device 108, and calibrating the projector device 104 based on the captured image information.

[0093] Block 402 also involves generating the reference image set I.sub.S, where I.sub.S={I.sub.S,m}.sub.m=1.sup.M. Each component reference image I.sub.S,m in the set is produced by projecting a particular component pattern (having a particular property) onto a reference plane at a known depth D.sub.0, oriented normally to the light path of the projector device 104. The camera device 108 then captures image information of the scene 106 to yield I.sub.S,m.

[0094] In block 404, the set-up functionality 128 determines an initial depth map. In one approach, the set-up functionality 128 can instruct the projector device 104 to successively project different component patterns in the set P.sub.S, at different illumination intensities. The depth determination module 116 can then provide depth maps for each combination of P.sub.s,m and .beta..sub.s,m. For each point in the scene, the set-up functionality 128 then selects the depth value that occurs most frequently within the various depth maps that have been collected. This yields the initial depth map when performed for all points in the scene.
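The per-point selection of the most frequent depth value can be sketched as a vote across the collected depth maps. In this Python example the function name `initial_depth_map` and the `resolution` parameter are assumptions: real depth values are continuous, so the sketch quantizes them into bins before counting, a detail the text does not specify.

```python
from collections import Counter


def initial_depth_map(depth_maps, resolution=0.01):
    """Pick, per pixel, the most frequent depth across several depth maps.

    depth_maps: list of equally sized 2D lists of depths (e.g., in meters).
    resolution: bin width used to decide when two depths count as equal.
    Ties go to the value encountered first.
    """
    height, width = len(depth_maps[0]), len(depth_maps[0][0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            bins = Counter(round(dm[y][x] / resolution) for dm in depth_maps)
            most_common_bin, _count = bins.most_common(1)[0]
            result[y][x] = most_common_bin * resolution
    return result


# Three 1x2 depth maps from different pattern/intensity combinations.
maps = [[[1.00, 2.00]], [[1.00, 1.50]], [[0.80, 1.50]]]
initial = initial_depth_map(maps)
```

Voting in this way discards outlier readings produced by pattern and intensity combinations that were poorly suited to a given point.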

[0095] E. Representative Computing Functionality

[0096] FIG. 12 sets forth illustrative computing functionality 1200 that can be used to implement any aspect of the functions described above. For example, the computing functionality 1200 can be used to implement any aspect of the ADSS 102. In one case, the computing functionality 1200 may correspond to any type of computing device that includes one or more processing devices. In all cases, the computing functionality 1200 represents one or more physical and tangible processing mechanisms.

[0097] The computing functionality 1200 can include volatile and non-volatile memory, such as RAM 1202 and ROM 1204, as well as one or more processing devices 1206 (e.g., one or more CPUs, and/or one or more GPUs, etc.). The computing functionality 1200 also optionally includes various media devices 1208, such as a hard disk module, an optical disk module, and so forth. The computing functionality 1200 can perform various operations identified above when the processing device(s) 1206 executes instructions that are maintained by memory (e.g., RAM 1202, ROM 1204, and/or elsewhere).

[0098] More generally, instructions and other information can be stored on any computer readable storage medium 1210, including, but not limited to, static memory storage devices, magnetic storage devices, optical storage devices, and so on. The term computer readable storage medium also encompasses plural storage devices. In all cases, the computer readable storage medium 1210 represents some form of physical and tangible entity.

[0099] The computing functionality 1200 also includes an input/output module 1212 for receiving various inputs (via input modules 1214), and for providing various outputs (via output modules). One particular output mechanism may include a presentation module 1216 and an associated graphical user interface (GUI) 1218. The computing functionality 1200 can also include one or more network interfaces 1220 for exchanging data with other devices via one or more communication conduits 1222. One or more communication buses 1224 communicatively couple the above-described components together.

[0100] The communication conduit(s) 1222 can be implemented in any manner, e.g., by a local area network, a wide area network (e.g., the Internet), etc., or any combination thereof. The communication conduit(s) 1222 can include any combination of hardwired links, wireless links, routers, gateway functionality, name servers, etc., governed by any protocol or combination of protocols.

[0101] Alternatively, or in addition, any of the functions described in the preceding sections can be performed, at least in part, by one or more hardware logic components. For example, without limitation, the computing functionality can be implemented using one or more of: Field-programmable Gate Arrays (FPGAs); Application-specific Integrated Circuits (ASICs); Application-specific Standard Products (ASSPs); System-on-a-chip systems (SOCs); Complex Programmable Logic Devices (CPLDs), etc.

[0102] In closing, the description may have described various concepts in the context of illustrative challenges or problems. This manner of explanation does not constitute an admission that others have appreciated and/or articulated the challenges or problems in the manner specified herein.

[0103] Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Link to this article: https://patent.nweon.com/17141
