Microsoft Patent | Real-time feedback for surface reconstruction as a service

Editor: Yingwei | Category: Microsoft | January 21, 2021

Patent: Real-time feedback for surface reconstruction as a service

Publication Number: 20210019953

Publication Date: 20210121

Applicant: Microsoft

Abstract

Techniques for improving how surface reconstruction data is prepared and passed between multiple devices are disclosed. For example, an environment is scanned to generate 3D scanning data. This 3D scanning data is then transmitted to a central processing service. The 3D scanning data is structured or otherwise configured to enable the central processing service to generate a digital 3D representation of the environment using the 3D scanning data. Reduced resolution representation data is received from the central processing service. This reduced resolution representation data was generated based on 3D scanning data generated by one or more other computer systems that were also scanning the same environment. A first visualization corresponding to the original 3D scanning data is then displayed simultaneously with one or more secondary visualization(s) corresponding to the reduced resolution representation data.

Claims

1. A computer system comprising: one or more processor(s); and one or more computer-readable hardware storage device(s) having stored thereon computer-executable instructions that are executable by the one or more processor(s) to cause the computer system to: for an environment in which the computer system is located, scan the environment to generate three-dimensional (3D) scanning data of the environment; transmit the 3D scanning data to a central processing service, the 3D scanning data being configured to enable the central processing service to generate a digital 3D representation of the environment using the 3D scanning data; receive, from the central processing service, reduced resolution representation data, wherein the reduced resolution representation data was generated based on 3D scanning data generated by one or more other computing system(s) that were also scanning the environment during a same pre-determined time period in which the computer system was scanning the environment; and on a display of the computer system, simultaneously render a first visualization corresponding to the 3D scanning data and one or more secondary visualization(s) corresponding to the reduced resolution representation data.
2. The computer system of claim 1, wherein one of the first visualization or the second visualization overlaps the other one on the display.
3. The computer system of claim 1, wherein the first visualization or the second visualization is a wire frame visualization.
4. The computer system of claim 3, wherein the wire frame visualization is rendered in a bird’s eye mini-map.
5. The computer system of claim 1, wherein the first visualization and the second visualization are both holograms that are projected onto the display.
6. The computer system of claim 5, wherein a displayed color of the hologram for the first visualization is different than a displayed color of the hologram for the second visualization.
7. The computer system of claim 1, wherein transmitting or receiving data, including the 3D scanning data or the reduced resolution representation data, with the central processing service is performed using at least one of: a wireless fidelity (Wi-Fi) network via one or more router(s), a radio network, or a different wireless or wired broadband network, and wherein execution of the computer-executable instructions further causes the computer system to: refrain from merging the 3D scanning data with the reduced resolution representation data such that the 3D scanning data remains separately distinct from the reduced resolution representation data and such that the computer system refrains from generating a single composite of data from the 3D scanning data and the reduced resolution representation data; identify a first set of anchor points from the 3D scanning data; identify a second set of anchor points included within the reduced resolution representation data; and align the 3D scanning data with the reduced resolution representation data by identifying correlations between the first set of anchor points and the second set of anchor points.
8. The computer system of claim 1, wherein the reduced resolution representation data is received from the central processing service within a pre-determined time period subsequent to the one or more other computing system(s) scanning the environment such that the reduced resolution representation data constitutes real-time feedback data.
9. The computer system of claim 1, wherein the computer system refrains from fusing the 3D scanning data with the reduced resolution representation data into a single composite of data such that the 3D scanning data remains separately distinct from the reduced resolution representation data.
10. The computer system of claim 1, wherein the reduced resolution representation data includes data differentiating between areas in the environment that have been scanned by the one or more other computing system(s) and areas in the environment that have not been scanned by the one or more other computing system(s).
11. A method for providing feedback data to multiple computing systems to enable said multiple computing systems to visually differentiate between areas in an environment that have been scanned via surface reconstruction scanners up to at least a particular scanning threshold and areas in the environment that have not been scanned via said surface reconstruction scanners up to the particular scanning threshold, the method being performed by a central processing service and comprising: receiving, from the multiple computing systems, three-dimensional (3D) scanning data describing a common environment in which the multiple computing systems are located; and subsequent to receiving the 3D scanning data, performing the following: use the 3D scanning data to start generating a 3D surface reconstruction mesh that describes the environment three-dimensionally or, alternatively, to update the 3D surface reconstruction; use the 3D scanning data to generate multiple sets of reduced resolution representation data, with a corresponding set being generated for each one of the multiple computing systems, wherein each set of reduced resolution representation data describes one or more area(s) within the environment that were scanned up to the particular scanning threshold by that set’s corresponding computing system; and for each set of reduced resolution representation data, transmit said set to each computing system included among the multiple computing systems except for that set’s corresponding computing system such that the central processing service refrains from transmitting said set to that set’s corresponding computing system and such that the central processing service sends to each computing system included among the multiple computing systems reduced resolution representation data generated from 3D scanning data acquired by a different computing system.
12. The method of claim 11, wherein each set of reduced resolution representation data includes anchor point data, which is provided to enable each computing system included among the multiple computing systems to align its corresponding 3D scanning data with any received reduced resolution representation data.
13. The method of claim 11, wherein the common environment is a building in which the multiple computing systems are located, and wherein a first computing system included among the multiple computing systems is located in a first room of the building while a second computing system included among the multiple computing systems is located in a second room of the building, the second room being different than the first room.
14. The method of claim 11, wherein the central processing service transmits the sets of reduced resolution representation data to the multiple computing systems within a pre-determined time period subsequent to the central processing service receiving the 3D scanning data such that the sets of reduced resolution representation data constitute live feedback from the central processing service.
15. The method of claim 11, wherein the sets of reduced resolution representation data are generated based on the 3D scanning data, and wherein each set of reduced resolution representation data includes a fraction of data included in the 3D scanning data such that each set of reduced resolution representation data describes the environment only to a partial extent as opposed to a full extent.
16. The method of claim 11, wherein the central processing service generates a blueprint of the common environment using the 3D scanning data.
17. A head-mounted device (HMD) comprising: a display; one or more processor(s); and one or more computer-readable hardware storage device(s) having stored thereon computer-executable instructions that are executable by the one or more processor(s) to cause the HMD to: for an environment in which the HMD is located, scan the environment to generate three-dimensional (3D) scanning data of the environment; transmit the 3D scanning data to a central processing service, the 3D scanning data being configured to enable the central processing service to generate a 3D surface mesh of the environment using the 3D scanning data; receive, from the central processing service, reduced resolution representation data, wherein the reduced resolution representation data was generated based on 3D scanning data generated by one or more other HMD(s) that were also scanning the environment during a same pre-determined time period in which the HMD was scanning the environment; subsequent to receiving the reduced resolution representation data, refrain from merging the 3D scanning data with the reduced resolution representation data such that the 3D scanning data and the reduced resolution representation data remain distinct from one another; align the 3D scanning data with the reduced resolution representation data while continuing to refrain from merging the 3D scanning data with the reduced resolution representation data; and on the display, simultaneously render a first visualization corresponding to the 3D scanning data and one or more secondary visualization(s) corresponding to the reduced resolution representation data, which was generated based on the 3D scanning data generated by the one or more other HMD(s).
18. The HMD of claim 17, wherein aligning the 3D scanning data with the reduced resolution representation data results in the 3D scanning data and the reduced resolution representation data sharing a same coordinate axis, and wherein the aligning is performed using one or more shared anchor points that are commonly shared between the 3D scanning data and the reduced resolution representation data.
19. The HMD of claim 17, wherein the first visualization and the second visualization include one of: two-dimensional (2D) wire frame, 3D point cloud visualizations, or 3D holograms.
20. The HMD of claim 17, wherein the one or more secondary visualization(s) include an indicator indicating whether a particular area within the environment, which particular area was scanned by the one or more other HMD(s), requires additional scanning by the HMD in order to provide the central processing service with additional scanning data, the additional scanning data being needed to ensure that a quality level of the 3D surface mesh satisfies a required quality level for that particular area.

Description

BACKGROUND

[0001] Mixed-reality (MR) systems/devices include virtual-reality (VR) and augmented-reality (AR) systems. Conventional VR systems create completely immersive experiences by restricting users’ views to only virtual images rendered in VR scenes/environments. Conventional AR systems create AR experiences by visually presenting virtual images that are placed in or that interact with the real world. As used herein, VR and AR systems are described and referenced interchangeably via use of the phrase “MR system.” As also used herein, the phrases “virtual image,” “virtual content,” and “hologram” refer to any type of digital image rendered by an MR system. Furthermore, it should be noted that a head-mounted device (HMD) typically provides the display used by the user to view and/or interact with holograms provided within an MR scene. As used herein, “HMD” and “MR system” can be used interchangeably with one another. HMDs and MR systems are also examples of “computer systems.”

[0002] MR systems are emerging as highly beneficial devices for many different types of organizations, events, and people, including first responders (e.g., firefighters, policemen, medics, etc.). For instance, FIG. 1 illustrates an example of a building 100 that is currently on fire. Here, building 100 has numerous different floors, with each floor having its own floor layout (e.g., floor layout 105 showing different rooms relative to one another). In the situation shown in FIG. 1, there is a baby 110 located in one of the rooms of building 100. In this case, it is highly desirable for first responders (e.g., first responder 115 and first responder 120) to be able to quickly navigate their way through the building 100 to locate and rescue the baby 110.

[0003] Some techniques have been developed to enable an HMD to acquire and display a blueprint of a building, which would help users know how best to navigate through the rooms of the building. As an example, using a blueprint, first responders 115 and 120, who may be using an HMD, can quickly navigate through the building to find and rescue the baby 110. As such, the use of HMDs truly does have great benefits and can be instrumental in saving countless lives.

[0004] While HMDs have provided substantial benefits in emergency scenarios, their use can be improved even further. For instance, in emergency scenarios, it is highly desirable to quickly and efficiently “clear” rooms by checking to see whether a person, animal, or prized possession is located within those rooms. While current techniques are in place to help guide users (e.g., first responders) in simply navigating different rooms via blueprint data, there is a substantial need (especially in emergency scenarios) to coordinate the activities of multiple users to facilitate efficient sweeping, scanning, and clearing of rooms in an environment.

[0005] The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one exemplary technology area where some embodiments described herein may be practiced.

BRIEF SUMMARY

[0006] The disclosed embodiments relate to systems, methods, and other devices (e.g., HMDs or computer-readable hardware storage devices) that improve the coordination between multiple scanning devices used to map or scan an environment. By improving this coordination, the embodiments reduce redundancy and significantly increase scanning efficiency.

[0007] In some embodiments, a computer system (e.g., an HMD/MR system) can be used to scan an environment to generate three-dimensional (3D) scanning data of the environment. This 3D scanning data is transmitted to a central processing service, which uses the 3D scanning data to generate a digital 3D representation of the environment. So-called “reduced resolution representation data” is then received by the computer system from the central processing service. The reduced resolution representation data was generated using other 3D scanning data generated by one or more other computing system(s), which were scanning the same environment during the same time period as when the computer system was performing its scan. The computer system then renders a first visualization, which corresponds to the 3D scanning data, simultaneously with one or more secondary visualization(s), which correspond to the reduced resolution representation data.

[0008] In some embodiments, a central processing service provides real-time feedback to multiple computing systems to enable those systems to visually differentiate between areas in an environment that have been scanned via surface reconstruction scanners (e.g., up to at least a particular scanning threshold) and areas in the environment that have not been scanned (e.g., up to the particular scanning threshold). To do so, the central processing service receives, from the multiple computing systems, 3D scanning data describing a common environment in which the multiple computing systems are located. Subsequent to receiving the 3D scanning data, the central processing service performs a number of operations. For instance, the service uses the 3D scanning data to generate (or update) a 3D surface reconstruction mesh describing the environment three-dimensionally. Additionally, the service uses the 3D scanning data to generate multiple sets of reduced resolution representation data, with a corresponding set being generated for each one of the computing systems. Each one of these sets describes one or more area(s) within the environment that were scanned up to the particular scanning threshold by that set’s corresponding computing system. Then, for each set of reduced resolution representation data, the service transmits that set to each one of the computing systems except for that set’s corresponding computing system. Consequently, the service refrains from transmitting the set to that set’s corresponding computing system (e.g., to avoid providing that system with redundant data because that system already has scanning data for the areas it scanned). Accordingly, the service sends to each computing system reduced resolution representation data that was generated from 3D scanning data acquired by a different computing system.
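One way to picture how dense scanning data might be reduced to a coarse description of "areas scanned up to a threshold" is the following minimal sketch. It is an illustration only, not the method of the disclosure: the grid-cell bucketing, the sample-count threshold, and the names coarse_coverage, cells_meeting_threshold, cell_size, and min_samples are all assumptions introduced here.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[float, float, float]
Cell = Tuple[int, int, int]


def coarse_coverage(points: Iterable[Point], cell_size: float) -> Dict[Cell, int]:
    """Count how many scan samples fall into each coarse grid cell.

    Assumption: the scan is a set of 3D points and "reduced resolution"
    means bucketing them into cubic cells of side `cell_size`.
    """
    counts: Counter = Counter()
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        counts[cell] += 1
    return dict(counts)


def cells_meeting_threshold(counts: Dict[Cell, int], min_samples: int) -> Set[Cell]:
    """Keep only the cells whose sample count reaches the hypothetical threshold."""
    return {cell for cell, n in counts.items() if n >= min_samples}


# Four dense samples: three land in one cell, one lands in a neighbouring cell.
scan = [(0.1, 0.1, 0.0), (0.2, 0.3, 0.1), (0.4, 0.4, 0.2), (1.5, 0.1, 0.0)]
counts = coarse_coverage(scan, cell_size=1.0)
scanned = cells_meeting_threshold(counts, min_samples=2)
print(sorted(scanned))
```

Cells that appear in counts but not in scanned correspond to areas that were observed but not up to the chosen threshold, which is the distinction the feedback is meant to convey.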

[0009] In some embodiments, an HMD is used to scan an environment to generate 3D scanning data of the environment. The HMD transmits the 3D scanning data to a central processing service to enable the service to generate a 3D surface mesh. In turn, the HMD receives reduced resolution representation data, which is generated based on 3D scanning data acquired by one or more other HMDs. Notably, these other HMDs were scanning the environment during the same time period as when the HMD was scanning the environment. The HMD actively refrains from merging its own 3D scanning data with the received reduced resolution representation data. Consequently, the 3D scanning data and the reduced resolution representation data remain distinct from one another. The HMD does, however, align the 3D scanning data with the reduced resolution representation data. The HMD also simultaneously renders a first visualization, which corresponds to the 3D scanning data, with one or more secondary visualization(s), which correspond to the reduced resolution representation data.

[0010] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

[0011] Additional features and advantages will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the teachings herein. Features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. Features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] In order to describe the manner in which the above-recited and other advantages and features can be obtained, a more particular description of the subject matter briefly described above will be rendered by reference to specific embodiments which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments and are not therefore to be considered to be limiting in scope, embodiments will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:

[0013] FIG. 1 illustrates an example scenario in which first responders are responding to an emergency in a building, where the building includes multiple floors and different floor layouts.

[0014] FIGS. 2A and 2B illustrate how it is beneficial to coordinate the activities or paths of the first responders so as to more effectively and efficiently clear, sweep, or scan rooms in a building, especially when responding to an emergency.

[0015] FIGS. 3A, 3B, 3C, 3D, and 3E illustrate how an environment can be scanned to generate a digital 3D representation of the environment. Each of these figures shows how the depth map and head pose of the first responder capture different perspectives and viewpoints.

[0016] FIG. 4A illustrates how multiple devices can generate 3D scanning data and how the 3D scanning data and/or the associated position and pose estimations can be transmitted to a central processing service (e.g., either a central cloud service or a central local service) to enable the central processing service to generate a robust 3D surface mesh of the environment.

[0017] FIG. 4B illustrates how the scanning data can include numerous different types and amounts of data describing the environment.

[0018] FIG. 5A illustrates how the central processing service is able to provide real-time feedback (e.g., “reduced resolution representation data”) to the scanning devices (e.g., HMDs or other scanning devices, such as a scanning sensor connected to a laptop or tablet that is able to perform data processing and visualization of real-time feedback). This feedback helps coordinate the scanning activities of those devices to avoid redundant scanning, or even to trigger a re-scan or additional scan of an area that was not adequately scanned during an initial scan.

[0019] FIG. 5B illustrates how the reduced resolution representation data can include numerous different types and amounts of reduced, coarse, skeleton, or limited information describing which areas of an environment have or have not been scanned. The limited information is designed to satisfy different thresholds (e.g., network bandwidth threshold, data limit thresholds, etc.) to enable quick transmission and incorporation/adoption into each HMD.

[0020] FIGS. 5C and 5D illustrate other example techniques for providing reduced resolution representation data to other HMDs.

[0021] FIG. 6 illustrates an enlarged version of a mini-map indicating where different users/HMDs have been within an environment. In particular, FIG. 6 illustrates a breadcrumb trail or footprint trail for each of the different users/HMDs. In some cases, the breadcrumb trails may overlap, indicating that multiple users/HMDs have crossed the same path.

[0022] FIG. 7 illustrates an example scenario in which a bird’s eye two-dimensional (“2D”) perspective mini-map is rendered by a user’s HMD to inform the user where his/her other companions have already been within the environment. Such a feature is particularly beneficial when scanning or clearing rooms so as to avoid redundant scanning or clearing.

[0023] FIG. 8A illustrates an example scenario in which a first responder is determining whether to enter a particular room in an environment.

[0024] FIG. 8B illustrates an example scenario in which the first responder peeks his/her HMD inside the room and is able to determine that the room has already been scanned by a different HMD because an informative hologram is projected by the first responder’s HMD to inform him/her that another first responder has already visited the room. The hologram is generated using reduced resolution representation data received from the central processing service.

[0025] FIG. 8C illustrates an example scenario in which the entirety of the room was not previously scanned (or not scanned to an adequate scanning threshold or degree); thus there are a few areas in the room that are still in need of being scanned in order to provide sufficient 3D scanning data to the central processing service to enable it to generate a robust and accurate 3D surface mesh of the room.

[0026] FIG. 8D illustrates an example scenario in which a holographic indicator is displayed on the first responder’s HMD to inform him/her that certain areas in the room have not yet been adequately scanned and that those areas should be (re)scanned to provide the central processing service with an adequate amount of 3D scanning data to enable it to generate an accurate and robust 3D surface mesh of the room.

[0027] FIG. 8E illustrates an example scenario where an HMD is being used to newly scan or, alternatively, to rescan areas in a room that were either not scanned or that were not previously scanned to an adequate degree or amount.

[0028] FIG. 9 illustrates an example scenario in which a first HMD (not shown) has already scanned a room and a second HMD is now approaching the already-scanned room. Here, the second HMD renders multiple different holograms to indicate how the room was already scanned by the first HMD and also to indicate areas where the second HMD is scanning or is being pointed at. In some cases, multiple holograms can overlap one another to indicate that multiple HMDs have scanned the same area or been present at the same area.

[0029] FIG. 10A illustrates a flowchart of an example method for receiving real-time feedback (e.g., reduced resolution representation data) from a central processing service so that a receiving HMD can visualize the different paths traveled by any number of other HMDs located within the same environment.

[0030] FIG. 10B illustrates a flowchart of an example method for aligning 3D scanning data with reduced resolution representation data to ensure the two sets of data share the same coordinate axis. This alignment process may be performed without fusing, merging, or otherwise joining the two sets of data into a single composite of data (i.e., the data is prevented from being joined or fused together).

[0031] FIG. 11 illustrates a flowchart of an example method performed by a central processing service for receiving 3D scanning data from multiple devices and for generating and transmitting reduced resolution representation data back to those devices so they can then visualize the different paths traveled by other devices in the environment.

[0032] FIG. 12 illustrates an example computer system capable of performing any of the disclosed operations.

DETAILED DESCRIPTION

[0033] The disclosed embodiments improve the coordination between multiple scanning devices (e.g., HMDs, laptops, tablets, or any scanning device capable of performing depth data processing and visualization of real-time feedback) used to map out an environment.

[0034] In some embodiments, a system/device scans an environment to generate 3D scanning data. This data is transmitted to a central processing service to generate a digital 3D representation of the environment. The system receives reduced resolution representation data from the service, where the reduced resolution representation data was generated based on 3D scanning data generated by other systems/devices that were also scanning the environment. The system renders a first visualization, which corresponds to the 3D scanning data, simultaneously with one or more secondary visualization(s), which correspond to the reduced resolution representation data.

[0035] In some embodiments, a central processing service provides real-time feedback to multiple systems/devices to enable those systems to visually differentiate between areas in an environment that have or have not been sufficiently scanned/mapped. The service first receives 3D scanning data from the systems, where the received data describes a common environment in which the systems are located. The service uses the 3D scanning data to generate a digital 3D representation of the environment. The service also uses the 3D scanning data to generate multiple sets of reduced resolution representation data, with a corresponding set being generated for each system. Each set describes an area that was scanned up to a scanning threshold by that set’s corresponding system. For each set, the service also transmits the set to each computing system except for that set’s corresponding system. That is, the service sends to each system reduced resolution representation data based on 3D scanning data generated by another system.
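The "send each set to every system except its own" rule described above can be sketched as a small routing function. This is a hedged illustration under stated assumptions: the device identifiers, the dictionary payloads, and the name route_reduced_sets are hypothetical, and an actual service would transmit over a network rather than build an in-memory outbox.

```python
from typing import Dict, List


def route_reduced_sets(reduced_by_device: Dict[str, dict]) -> Dict[str, List[dict]]:
    """Build, for each device, the list of reduced sets it should receive.

    A device never receives the set derived from its own scan, since it
    already holds the full-resolution data for the areas it scanned.
    """
    outbox: Dict[str, List[dict]] = {device_id: [] for device_id in reduced_by_device}
    for source_id, reduced in reduced_by_device.items():
        for target_id in reduced_by_device:
            if target_id != source_id:
                outbox[target_id].append({"source": source_id, "data": reduced})
    return outbox


# Hypothetical reduced sets for three devices (payload contents are placeholders).
reduced_sets = {
    "hmd_a": {"rooms": ["A", "B"]},
    "hmd_b": {"rooms": ["D"]},
    "hmd_c": {"rooms": ["F", "G"]},
}
outbox = route_reduced_sets(reduced_sets)
for device_id, messages in outbox.items():
    print(device_id, [m["source"] for m in messages])
```

With N devices, each device receives N - 1 sets, so the total number of transmissions grows as N(N - 1); keeping each set small is what makes this fan-out practical.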

[0036] In some embodiments, an HMD scans an environment to generate 3D scanning data. The HMD transmits the 3D scanning data to a central processing service to generate a 3D surface mesh. The HMD receives (e.g., from the central processing service) reduced resolution representation data generated from 3D scanning data acquired by one or more other HMDs and not by the HMD. The HMD actively refrains from merging its 3D scanning data with the reduced resolution representation data. Without merging the two data sets, the HMD aligns the two sets. The HMD also simultaneously renders a first visualization, which corresponds to the 3D scanning data, with one or more secondary visualization(s), which correspond to the reduced resolution representation data.
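Aligning two data sets without merging them can be illustrated with a simplified sketch. The example below assumes that anchor points carry matching identifiers in both data sets and that alignment can be treated as a 2D rigid transform (rotation plus translation) in the floor plane; neither assumption comes from the disclosure, and the names estimate_rigid_2d and to_local_frame are hypothetical. The closed-form least-squares angle used here is a standard technique, swapped in for illustration.

```python
import math
from typing import Dict, Iterable, List, Tuple

Vec2 = Tuple[float, float]


def estimate_rigid_2d(local_anchors: Dict[str, Vec2],
                      remote_anchors: Dict[str, Vec2]) -> Tuple[float, Vec2]:
    """Estimate rotation angle and translation mapping remote anchors onto local ones."""
    shared = sorted(set(local_anchors) & set(remote_anchors))
    if len(shared) < 2:
        raise ValueError("need at least two shared anchor points")
    n = len(shared)
    lcx = sum(local_anchors[k][0] for k in shared) / n
    lcy = sum(local_anchors[k][1] for k in shared) / n
    rcx = sum(remote_anchors[k][0] for k in shared) / n
    rcy = sum(remote_anchors[k][1] for k in shared) / n
    cross = 0.0
    dot = 0.0
    for k in shared:
        rx, ry = remote_anchors[k][0] - rcx, remote_anchors[k][1] - rcy
        lx, ly = local_anchors[k][0] - lcx, local_anchors[k][1] - lcy
        cross += rx * ly - ry * lx
        dot += rx * lx + ry * ly
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated remote centroid onto the local centroid.
    tx = lcx - (c * rcx - s * rcy)
    ty = lcy - (s * rcx + c * rcy)
    return theta, (tx, ty)


def to_local_frame(points: Iterable[Vec2], theta: float, t: Vec2) -> List[Vec2]:
    """Return transformed copies of remote points; the inputs are left untouched."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


# Remote anchors, and the same anchors as seen locally (rotated 90 degrees, shifted by (2, 3)).
remote = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0)}
local = {"a": (2.0, 3.0), "b": (2.0, 4.0), "c": (1.0, 3.0)}
theta, t = estimate_rigid_2d(local, remote)
aligned_remote = to_local_frame([(1.0, 1.0)], theta, t)
```

Note that aligned_remote is a separate list: the device's own scanning data is never modified or combined with it, so the two data sets share a coordinate frame while remaining distinct.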

Example Technical Benefits, Advantages, and Improvements

[0037] The following section outlines some example improvements and practical applications provided by the disclosed embodiments. It will be appreciated, however, that these are examples only and that the embodiments are not limited to these improvements.

[0038] The disclosed embodiments bring about numerous benefits to the technology by coordinating any number of sweeping, scanning, and/or clearing activities performed by multiple systems/devices (e.g., HMDs/MR systems) located within the same environment. By coordinating these activities, the embodiments help prevent redundantly generating scanning data for the same areas and also help to improve the efficiency by which an environment is navigated, cleared, and mapped.

[0039] In some cases, the disclosed embodiments also improve the efficiency by which a computer system operates. For example, fewer computer resources will be used because the embodiments ensure that the same environment is not redundantly scanned by multiple different sets of 3D scanning sensors. As a consequence, not only will fewer computer resources be used, but an HMD’s battery life will also be prolonged. Furthermore, by eliminating or preventing the generation of redundant data, the hardware used to generate the resulting 3D mesh will operate on less data, and thus its processes will also be made more efficient. These and other benefits/improvements will be discussed in more detail later in connection with some of the other figures provided in this disclosure.

Navigating, Clearing, and Scanning an Environment

[0040] As an initial matter, it should be noted that while many of the examples disclosed herein are related to first responders and emergency scenarios, it will be appreciated that the disclosed embodiments are not limited only to these types of scenarios. Indeed, the principles may be practiced in emergency situations as well as any type of non-emergency situations (e.g., architectural scenarios, construction scenarios, business scenarios, training scenarios, academic scenarios, etc.). Accordingly, the disclosed “emergency” examples are provided for illustrative purposes only and should not be read as limiting the scope of the disclosed principles.

[0041] Attention will now be directed to FIG. 2A, which shows an example floor layout 200 similar to that of floor layout 105 from FIG. 1. FIG. 2A shows how different rooms are located within the floor layout 200, such as rooms A, B, C, D, E, F, G, H, I, J, and K.

[0042] First responder 205 is located in room A; first responder 210 is located in room H; and first responder 215 is located in room J. Additionally, a baby 220 is located in room C. In an emergency scenario, it is highly desirable for the first responders 205, 210, and 215 to quickly and efficiently clear each of the rooms to ensure that nobody is injured or left behind and also to find the baby 220. Furthermore, in emergency scenarios, it is highly desirable that the first responders 205, 210, and 215 do not redundantly search the same rooms. Such redundancy results in wasted effort by the first responders and may consume an exorbitant amount of time.

[0043] FIG. 2B shows an efficient technique for the first responders 205, 210, and 215 to search the floor. Here, first responder 205 clears rooms A, B, and C and discovers the baby 220. The path traveled or navigated by first responder 205 is shown by augmented reality holograms of footprint 225 (or other visual cues).

[0044] First responder 210 is shown as clearing rooms D, E, H, and I, as shown by footprint 230. Furthermore, first responder 215 is shown as clearing rooms F, G, J, and K, as shown by footprint 235. Notice, none of the first responders 205, 210, or 215 clear a room that has already been cleared by another first responder. As such, the actions of these first responders show how a highly efficient and non-redundant search pattern was used to clear the floor. As will be described herein, the disclosed embodiments can be used to help coordinate the navigation paths between different users to ensure that those users do not redundantly follow a same or overlapping path when sweeping, clearing, or otherwise mapping/scanning an environment.

[0045] FIG. 3A shows an example scenario in which a room 300 is being cleared and scanned by a user 305 (e.g., perhaps a first responder) wearing an HMD 310, which is being used to perform the scan 315 of the room 300. Room 300 may be an example of any of the rooms shown in FIG. 2B.

[0046] HMD 310 includes one or more depth cameras or 3D scanning sensors. As used herein, a “depth camera” (or “3D scanning sensor” or simply “scanning sensor”) includes any type of depth camera or depth detector. Examples include, but are not limited to, time-of-flight (“TOF”) cameras, active stereo camera pairs, passive stereo camera pairs, or any other type of camera, sensor, laser, or device capable of detecting or determining depth. HMD 310’s depth cameras are used to acquire 3D scanning data of room 300. This 3D scanning data identifies depth characteristics of room 300 and is used to generate a “3D surface mesh” (or “3D surface reconstruction mesh,” “surface mesh,” or simply “mesh”) of room 300. This surface mesh is used to identify the objects within the environment as well as their depth with respect to one another and possibly with respect to HMD 310.

[0047] In an AR environment, an AR system relies on the physical features within the real world to create virtual images (e.g., holograms). As an example, the AR system can project a dinosaur crashing through the wall of the bedroom or can guide the user 305 in navigating between rooms. For example, perhaps room 300 is smoky from a fire, and visibility is very limited. In this case, the AR system can help the user 305 navigate the room by rendering navigation virtual images telling the user 305 where to go to escape the room. To make the virtual images and experience as realistic and useful as possible, the AR system uses the depth and surface characteristics of the room 300 to determine how best to create any virtual images. The surface mesh provides this information to the AR system. Consequently, it is highly desirable to obtain an accurate surface mesh for any environment, such as room 300.

[0048] In a VR environment, the surface mesh also provides many benefits because the VR system can use the surface mesh to help the user avoid crashing into real-world objects (e.g., fixed features or furniture) while interacting with the VR environment. Additionally, or alternatively, a surface mesh can be captured to help a user visualize a 3D space. Consequently, it is highly beneficial to construct a surface mesh of an environment, regardless of what type of MR system is in use.

[0049] FIGS. 3A through 3E show an example technique for acquiring the data used to construct the surface mesh. Here, room 300 is a bedroom environment that includes a table, a chair, a closet, a bookcase, and a windowpane with drapes. Currently, HMD 310 is being used to scan room 300 in order to acquire data about those objects as well as other characteristics of room 300.

[0050] During this scanning process/phase, HMD 310 uses its one or more depth camera(s) to capture multiple depth images of room 300, as shown by the scan segment 315 (corresponding to a “depth image” for that area of the room). The resulting depth images are used to generate multiple depth maps of room 300. By fusing the information from these different images together, a digital 3D representation of room 300 can be generated.

[0051] To illustrate, FIG. 3B shows a surface mesh 320 that initially has a mesh segment 325. Mesh segment 325 corresponds to scan segment 315 from FIG. 3A. In this scenario, because only a single scan segment 315 has been obtained, the surface mesh 320 of room 300 is not yet complete. As HMD 310 further scans room 300, more pieces of the surface mesh 320 will be created.

[0052] FIG. 3C shows the same environment, but now HMD 310 is capturing a different viewpoint/perspective of the environment, as shown by scan segment 330. Specifically, scan segment 330 from FIG. 3C is used to further build the surface mesh 320, as shown in FIG. 3D. More specifically, surface mesh 320 in FIG. 3D now includes mesh segment 335, which was generated based on the information included in scan segment 330, and surface mesh 320 also includes mesh segment 325, which was added earlier. In this regard, multiple different depth images are obtained, acquired, or generated and are used to progressively build surface mesh 320 for room 300. The information in the depth images is fused together to generate a complete surface mesh 320 and to determine the depths of objects within room 300, as shown by FIG. 3E.
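As a minimal sketch of how depth images might be fused into a single point set, the following Python example back-projects each depth image through a pinhole camera model and merges the results. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the translation-only camera `origin` are assumptions made for illustration; the disclosure does not specify a particular fusion algorithm, and a practical system would also apply the full camera rotation.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject(depth: List[List[float]], fx: float, fy: float,
                cx: float, cy: float,
                origin: Point = (0.0, 0.0, 0.0)) -> List[Point]:
    """Convert one depth image into 3D points using a pinhole camera model.

    Assumption: the camera is axis-aligned, so its pose is a pure
    translation given by `origin`. Pixels with depth <= 0 are treated
    as having no reading and are skipped.
    """
    ox, oy, oz = origin
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append((ox + (u - cx) * z / fx,
                           oy + (v - cy) * z / fy,
                           oz + z))
    return points


def fuse(segments: List[List[Point]]) -> List[Point]:
    """Merge per-scan point sets into one set, dropping exact duplicates."""
    seen = set()
    merged: List[Point] = []
    for segment in segments:
        for point in segment:
            if point not in seen:
                seen.add(point)
                merged.append(point)
    return merged
```

Each call to `backproject` plays the role of one scan segment, and `fuse` stands in for the step that progressively combines segments into a larger representation.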

[0053] To obtain these depth images, HMD 310 performs a “scanning phase” by capturing depth images at different locations, perspectives, or viewpoints. This so-called “scanning phase” is typically performed rather quickly (e.g., under a minute), but its duration may be dependent on the size of the room or environment (i.e. larger rooms/environments may take longer than smaller rooms/environments). In some embodiments, a low resolution surface mesh can also be built in real-time by the scanning device. As will be discussed later, however, building the high resolution, high quality surface mesh may take considerable time (e.g., minutes or perhaps even hours). As the surface mesh 320 is created, it can be stored in a repository for future use or reference. In some cases, the surface mesh 320 is stored in the cloud and can be made available for any number of devices. Consequently, some embodiments query the cloud to determine whether a surface mesh is already available for an environment prior to scanning the environment.
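The query-before-scan behavior described above can be sketched as a simple lookup. In this hypothetical Python example, `MeshRepository` is an in-memory stand-in for the cloud repository and `acquire_mesh` is an illustrative helper; neither name comes from the disclosure, and a real deployment would replace the dictionary with a network call.

```python
from typing import Callable, Dict, Optional, Tuple


class MeshRepository:
    """In-memory stand-in for a cloud store of surface meshes (hypothetical)."""

    def __init__(self) -> None:
        self._meshes: Dict[str, bytes] = {}

    def store(self, environment_id: str, mesh: bytes) -> None:
        self._meshes[environment_id] = mesh

    def lookup(self, environment_id: str) -> Optional[bytes]:
        return self._meshes.get(environment_id)


def acquire_mesh(repo: MeshRepository, environment_id: str,
                 scan_fn: Callable[[], bytes]) -> Tuple[bytes, bool]:
    """Return (mesh, scanned), scanning only when no stored mesh exists."""
    cached = repo.lookup(environment_id)
    if cached is not None:
        return cached, False
    mesh = scan_fn()  # the scanning phase runs only on a cache miss
    repo.store(environment_id, mesh)
    return mesh, True
```

The returned flag lets a caller tell whether a fresh scan was needed or an existing mesh was reused.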

[0054] Surface mesh 320 can also be used to segment or classify objects within room 300. For instance, the objects captured by surface mesh 320 can be classified, segmented, or otherwise characterized. This segmentation process is performed, at least in part, by determining the attributes of those objects. In some cases, this segmentation process can be performed via any type of machine learning (e.g., machine learning algorithm or device, convolutional neural network(s), multilayer neural network(s), recursive neural network(s), deep neural network(s), decision tree model(s) (e.g., decision trees, random forests, and gradient boosted trees), linear regression model(s), logistic regression model(s), support vector machine(s) (“SVM”), artificial intelligence device(s), or any other type of intelligent computing system).

[0055] Specifically, surface mesh 320 can be used to segment or identify the closet, the bookcase, the windowpane with drapes, the desk, the chair, the bed, and so on in room 300. Any number of objects, including their respective object types, may be identified via the surface mesh 320.

[0056] FIG. 4A more fully elaborates how 3D scanning data can be used to generate a digital 3D representation of an environment, such as room 300 from FIG. 3 or even the entire building 100 from FIG. 1. As used herein, the phrase “digital 3D representation” should be interpreted broadly to include any kind of digital representation of an environment. For example, the digital 3D representation may include a 3D surface mesh (aka 3D surface reconstruction mesh), a 3D point cloud, a depth map, or any other type of digital representation that identifies the different geometries, shapes, depths, and contours of an environment.

[0057] FIG. 4A shows three HMDs, namely HMD 400A, HMD 400B, and HMD 400C. The users wearing these HMDs may be representative of the first responders 205, 210, and 215 from FIG. 2A.

[0058] FIG. 4A shows how HMD 400A has generated or acquired 3D scanning data 405A for rooms A, B, and C using HMD 400A’s corresponding scanning sensors. With reference to FIG. 2B, it was shown how first responder 205 navigated rooms A, B, and C. During these navigations, first responder 205’s HMD acquired scanning data 405A, in the manner described in connection with FIGS. 3A-3E.

[0059] Similarly, the user wearing HMD 400B navigated rooms D, E, H, and I and HMD 400B generated or acquired scanning data 405B using its corresponding scanning sensors. The user wearing HMD 400C navigated rooms F, G, J, and K, and HMD 400C generated or acquired scanning data 405C using its scanning sensors.

[0060] Turning briefly to FIG. 4B, scanning data 405A from FIG. 4A can include numerous different types and amounts of data, as shown in FIG. 4B. Of course, scanning data 405B and 405C may include similar data as well.

[0061] As shown in FIG. 4B, scanning data 405A can include surface data 440 (i.e. depth data describing the geometries, shapes, depths, and contours of objects, surfaces, or other features of an environment). Scanning data 405A can also include anchor point data 445. Anchor point data 445 describes any number of anchor points that are identified within an environment. An anchor point is a location, feature, or set of fiducial points within an environment that is determined to have a sufficiently low likelihood of moving (i.e. it is determined to have highly static characteristics satisfying a static threshold requirement). To identify anchor points, the points or locations in the environment can be put through an initial segmentation process to determine their characteristics. Based on these characteristics, the HMD can determine whether a point, location, or object is likely to be dynamic (i.e. it probably will move) or static (i.e. it probably will not move).

[0062] Anchor points represent locations within an environment that are highly static (i.e. the characteristics of those locations satisfy a static threshold). By way of example, the corners of the dresser in FIG. 3A may be highly static/non-moving and may serve as an anchor point. Likewise, the corners of the room may serve as worthwhile anchor points because walls typically do not shift. In contrast, the chair or window drapes are probably not very static and would not serve well as anchor points.
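The static-threshold test for anchor points can be illustrated with a short Python sketch. The `static_score` field, its 0.0 to 1.0 range, the example scores, and the default threshold of 0.8 are all assumptions chosen for illustration; the disclosure states only that a static threshold exists, not how the score is computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    label: str
    # Hypothetical score: 0.0 means very likely to move, 1.0 means fixed.
    static_score: float


def select_anchor_points(candidates: List[Candidate],
                         static_threshold: float = 0.8) -> List[Candidate]:
    """Keep only candidates whose static score satisfies the threshold."""
    return [c for c in candidates if c.static_score >= static_threshold]


candidates = [
    Candidate("dresser corner", 0.90),
    Candidate("room corner", 0.99),
    Candidate("chair", 0.30),
    Candidate("window drapes", 0.10),
]
anchors = select_anchor_points(candidates)
```

With these illustrative scores, the dresser and room corners pass the threshold while the chair and drapes do not, mirroring the example in the text.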

[0063] Scanning data 405A may also include location data 450, such as GPS coordinate data, triangulation data from a telecommunications network, triangulation data from wireless routers, or perhaps signal strength data (indicating proximity) relative to a router or other sensing device.

[0064] Additionally, scanning data 405A may include any other type of HMD data 455, such as the amount of time used to scan a particular area, whether the area was fully scanned or only partially scanned, the identification information for the HMD used to scan the area, the timing or timestamp of when the area was scanned, which hardware scanners were used to perform the scan, or perhaps even a quality or accuracy metric detailing the quality of the scan. The scanning data 405A may also include coordinate axis 460 data to determine the orientation of the room relative to a known vector, such as gravity or some other determined reference point (e.g., the orientation of the vertical wall corners in the room). Additionally, the scanning data 405A may include pose estimation 465 data describing one or more poses of the HMD. As used herein, “pose” generally refers to the angle and direction in which the HMD is being aimed or pointed (i.e. a “viewing vector”). The HMD is able to determine its pose and transmit any number of poses via the pose estimation 465 data.
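One way to picture the contents of scanning data 405A is as a single record with one field per data type. The following Python dataclass is a sketch only: the field names, types, and defaults are assumptions, and the comments map each field to the reference numerals used in FIG. 4B.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ScanningData:
    # Surface data 440: depth samples describing the environment's geometry.
    surface_data: List[Vec3]
    # Anchor point data 445: locations judged to be highly static.
    anchor_points: List[Vec3] = field(default_factory=list)
    # Location data 450: here assumed to be a (latitude, longitude) pair.
    location: Optional[Tuple[float, float]] = None
    # HMD data 455: scan duration, device ID, timestamp, quality metric, etc.
    hmd_data: Dict[str, object] = field(default_factory=dict)
    # Coordinate axis 460: assumed here to be the gravity direction.
    coordinate_axis: Vec3 = (0.0, 0.0, -1.0)
    # Pose estimation 465: viewing vectors of the HMD during the scan.
    poses: List[Vec3] = field(default_factory=list)


sample = ScanningData(
    surface_data=[(0.0, 0.0, 1.0)],
    hmd_data={"hmd_id": "400A", "fully_scanned": True},
    poses=[(0.0, 1.0, 0.0)],
)
payload = asdict(sample)  # plain dict, ready to serialize for transmission
```

Converting the record with `asdict` gives a plain dictionary that could be serialized before being sent to a processing service.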

[0065] While the above examples are primarily directed towards indoor activities, it should be noted that the disclosed principles can also be practiced in outdoor environments as well. As such, the disclosed principles should be interpreted broadly to include or to be usable in any type of environment, both indoor and outdoor.

[0066] Returning to FIG. 4A, HMDs 400A, 400B, and 400C were traveling through the same environment (which included multiple rooms as shown by floor layout 200 of FIG. 2A) during the same time period, as shown by same scanning time period 410. It will be appreciated that the same scanning time period 410 can be any duration or length of time.

[0067] For instance, the same scanning time period 410 can include a range of time spanning a few seconds, minutes, hours, or perhaps even days. In some scenarios, HMDs 400A, 400B, and 400C are scanning the environment during the same overlapping time while in other scenarios HMDs 400A, 400B, and 400C are scanning the environment at different times but still within the same scanning time period 410.

[0068] As an example, suppose the same scanning time period 410 was 15 minutes. In this specific example, HMD 400A scans the environment only during the first five minutes of the fifteen-minute block. HMD 400B then scans the environment only during the second five-minute block. HMD 400C then scans the environment only during the third five-minute block. In this regard, HMDs 400A, 400B, and 400C all scanned the environment within the same scanning time period 410 even though their individual scanning durations did not overlap. Of course, that is one example scenario, and it will be appreciated that one or more scans can overlap in time/duration with one or more other scans.
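The fifteen-minute example can be checked with a few lines of Python. The function names and the representation of each scan as a `(start, end)` pair in minutes are assumptions for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in minutes from period start


def within_same_period(scans: List[Interval], period: Interval) -> bool:
    """True when every scan starts and ends inside the scanning time period."""
    start, end = period
    return all(start <= s and e <= end for s, e in scans)


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two scans share any positive-length stretch of time."""
    return a[0] < b[1] and b[0] < a[1]


period = (0, 15)
scan_400a, scan_400b, scan_400c = (0, 5), (5, 10), (10, 15)
scans = [scan_400a, scan_400b, scan_400c]

same_period = within_same_period(scans, period)
any_overlap = any(overlaps(scans[i], scans[j])
                  for i in range(len(scans))
                  for j in range(i + 1, len(scans)))
```

Here `same_period` is `True` and `any_overlap` is `False`: all three scans fall inside the shared period even though no two of them overlap.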

[0069] FIG. 4A also shows how HMDs 400A, 400B, and 400C are able to transmit their respective scanning data to a central cloud service 420, which is one example of a “central processing service” and which represents a “spatial reconstruction as a service” (SRaaS). More particularly, HMD 400A transmits scanning data 405A to central cloud service 420; HMD 400B transmits scanning data 405B to central cloud service 420; and HMD 400C transmits scanning data 405C to central cloud service 420. Here, central cloud service 420 is a computing service operating in a cloud network and is available to HMDs and users on-demand via the Internet by a cloud service provider.

[0070] The HMDs 400A-C can transmit their respective data sets to the central cloud service 420 in numerous different ways or through numerous different communication networks and protocols. As one example, the HMDs 400A-C can rely on a Wi-Fi network 415A to transmit their data. Additionally, or alternatively, the HMDs 400A-C can rely on a separate broadband network 415B to transmit their data. Examples of broadband network 415B include, but are not limited to, a telecommunications network, an inter-squad radio network (encrypted or not encrypted), and possibly a Bluetooth network.

[0071] If the HMDs 400A-C are located inside of a building (e.g., perhaps building 100 of FIG. 1), then the HMDs 400A-C can connect to a wireless hub or router of the building’s Wi-Fi network to transmit their data. Combinations of the above networks may be used as well. For instance, one HMD may use a Wi-Fi network while another HMD uses a telecommunications network.

[0072] Central cloud service 420 receives the scanning data from the multiple different HMDs. Using this scanning data, central cloud service 420 begins generating a digital 3D representation 425 of the environment, where the environment is a combination of rooms A, B, C, D, E, F, G, H, I, J, and K. In some embodiments, the digital 3D representation 425 is or includes a 3D surface reconstruction mesh 430 of those rooms. In some embodiments, the digital 3D representation 425 includes a 3D point cloud, or any type or number of depth maps of those rooms. The central cloud service 420 need not wait until all of the scanning data is received prior to commencing the build of the digital 3D representation 425. Instead, the central cloud service 420 can progressively build the digital 3D representation 425 as new scanning data is progressively received.
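The progressive build described above can be sketched as a structure that accepts scanning data as it arrives and can be queried at any point. The class and method names below are hypothetical, and accumulating points per room is a simplification that stands in for whatever reconstruction the service actually performs.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


class ProgressiveRepresentation:
    """Accumulates scanning data per room as batches arrive (illustrative)."""

    def __init__(self) -> None:
        self._points: Dict[str, List[Point]] = defaultdict(list)

    def ingest(self, room: str, points: List[Point]) -> None:
        """Add one batch of points; no need to wait for the full data set."""
        self._points[room].extend(points)

    def rooms_covered(self) -> List[str]:
        """Rooms that have at least one point so far, in sorted order."""
        return sorted(room for room, pts in self._points.items() if pts)

    def point_count(self) -> int:
        return sum(len(pts) for pts in self._points.values())


rep = ProgressiveRepresentation()
rep.ingest("A", [(0.0, 0.0, 1.0)])                    # first batch, e.g. from HMD 400A
partial = rep.rooms_covered()                         # usable before all data arrives
rep.ingest("D", [(4.0, 0.0, 1.0), (4.5, 0.0, 1.0)])   # later batch, e.g. from HMD 400B
```

The representation is usable after the first batch, which mirrors the statement that the service need not wait for all scanning data before starting the build.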

[0073] Additionally, central cloud service 420 can use the scanning data to generate a blueprint 435 of the rooms, where the blueprint 435 can be included as a part of the digital 3D representation 425 and where the blueprint 435 is generated using the 3D scanning data. Here, central cloud service 420 can generate a 2D blueprint outlining the different rooms (e.g., rooms A through K) relative to one another, as shown by floor layout 200 of FIG. 2. This blueprint 435 can be created for each floor of a building and, therefore, a blueprint can be provided for the entire building. Accordingly, the disclosed embodiments are able to dynamically generate a 2D blueprint for a building based on the scanning data. Additionally, the disclosed embodiments are able to generate a 3D representation of the building based on the scanning data.

[0074] The process of fully computing the high quality, high resolution digital 3D representation 425 often takes a prolonged period of time, sometimes spanning many minutes or even hours. The digital 3D representation 425 may take longer to compute for more complex environments (i.e. meaning there is more complex scanning data to compute) than for less complex environments.

[0075] Reduced Resolution Representation Data

[0076] As described earlier, it is highly desirable to coordinate the activities of multiple users or HMDs engaged in navigating an environment. In some cases, these navigations are performed to clear rooms of the environment to check for victims in an emergency event/condition, while in other cases these navigations are performed simply to map out the environment without facing an emergency event/condition. Regardless of the purpose for which the users and HMDs are navigating the rooms, it is highly desirable to coordinate the users’ and HMDs’ navigation paths quickly and accurately so that they can navigate the environment efficiently, without multiple users/HMDs redundantly scanning or clearing the same room. Unfortunately, it is highly expensive, both in terms of compute and bandwidth, to pass 3D scanning data quickly from HMD to HMD. What is needed, therefore, is an improved technique for informing HMDs of the locations of other HMDs in the same environment without passing full 3D scanning data among the different HMDs.

[0077] With that understanding, the disclosed embodiments can be used to provide the desired coordination between the multiple different HMDs while refraining from passing full 3D scanning data amongst themselves. For instance, with regard to FIG. 2B, the embodiments can intelligently inform first responder 205 (e.g., via an HMD) that he/she does not need to clear rooms D, E, F, G, H, I, J, and K because first responders 210 and 215 have already done so. Similarly, the embodiments can intelligently inform first responder 210 that he/she does not need to clear rooms A, B, C, F, G, J, and K because first responders 205 and 215 have already done so. To complete the example, the embodiments can also intelligently inform first responder 215 that he/she does not need to clear rooms A, B, C, D, E, H, and I because first responders 205 and 210 have already done so. In some cases, additional instructions can be provided by a central guiding person, operator, or entity tasked with informing the users where they should or should not go (e.g., perhaps via voice commands or a displayed chat thread). To perform these processes, the embodiments make use of what is referred to as “reduced resolution representation data” to make these intelligent guiding instructions.
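The guidance in this example reduces to a set computation over which rooms each responder has cleared. The following Python sketch uses the room assignments from FIG. 2B; the function names and the dictionary layout are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List


def rooms_to_skip(cleared_by: Dict[int, List[str]], responder: int) -> List[str]:
    """Rooms already cleared by someone other than `responder`."""
    skip = set()
    for other, rooms in cleared_by.items():
        if other != responder:
            skip.update(rooms)
    return sorted(skip)


def redundant_rooms(cleared_by: Dict[int, List[str]]) -> List[str]:
    """Rooms cleared by more than one responder (wasted effort)."""
    counts = Counter(room for rooms in cleared_by.values() for room in set(rooms))
    return sorted(room for room, n in counts.items() if n > 1)


# Room assignments from FIG. 2B, keyed by first responder reference numeral.
cleared_by = {
    205: ["A", "B", "C"],
    210: ["D", "E", "H", "I"],
    215: ["F", "G", "J", "K"],
}
skip_for_205 = rooms_to_skip(cleared_by, 205)
```

For first responder 205 the result is rooms D through K, matching the text, and `redundant_rooms(cleared_by)` is empty because the search pattern in FIG. 2B has no overlap.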

[0078] FIG. 5A shows an example scenario in which reduced resolution representation data (i.e. coarse, skeleton, or limited data, as will be described later) is being used to inform multiple HMDs regarding which areas of an environment have or have not already been scanned to a sufficient scanning threshold. In particular, FIG. 5A shows a central cloud service 500, which is representative of the central cloud service 420 from FIG. 4A.
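To give a rough sense of what "coarse" data could look like, the Python sketch below reduces a point set by keeping one point per cubic voxel. Voxel-grid downsampling is a common general-purpose technique chosen here purely for illustration; it is an assumption, not the method the disclosure uses to produce reduced resolution representation data, and the function name and `voxel_size` parameter are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def voxel_downsample(points: List[Point], voxel_size: float) -> List[Point]:
    """Keep the first point seen in each cubic voxel of edge `voxel_size`."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    cells: Dict[Tuple[int, int, int], Point] = {}
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        cells.setdefault(key, (x, y, z))  # later points in the same voxel are dropped
    return list(cells.values())


dense = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 0.1, 0.1)]
coarse = voxel_downsample(dense, voxel_size=1.0)
```

A larger `voxel_size` yields fewer points and therefore less data to transmit, at the cost of geometric detail.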

[0079] Central cloud service 500 is providing real-time feedback data 505 to a group of multiple HMDs to inform those HMDs of the areas that have already been scanned and/or navigated by other HMDs. The real-time feedback data 505 is provided to the HMDs within a pre-determined time period 510 subsequent to when the scanning data (e.g., scanning data 405A, 405B, and 405C from FIG. 4A) was received by the central cloud service 500.

……
……
……

Article link: https://patent.nweon.com/17530
