Apple Patent | Adaptive vehicle augmented reality display using stereographic imagery

Patent: Adaptive vehicle augmented reality display using stereographic imagery

Publication Number: 20210166490

Publication Date: 2021-06-03

Applicant: Apple

Assignee: Apple Inc.

Abstract

An AR system that leverages a pre-generated 3D model of the world to improve rendering of 3D graphics content for AR views of a scene, for example an AR view of the world in front of a moving vehicle. By leveraging the pre-generated 3D model, the AR system may use a variety of techniques to enhance the rendering capabilities of the system. The AR system may obtain pre-generated 3D data (e.g., 3D tiles) from a remote source (e.g., cloud-based storage), and may use this pre-generated 3D data (e.g., a combination of 3D mesh, textures, and other geometry information) to augment local data (e.g., a point cloud of data collected by vehicle sensors) to determine much more information about a scene, including information about occluded or distant regions of the scene, than is available from the local data.

Claims

  1. A system, comprising: a display device; and a controller comprising: one or more processors; and a memory storing instructions that, when executed on or across the one or more processors, cause the one or more processors to: obtain sensor data for an environment of a real-world scene captured by one or more sensors; render virtual content for the environment based at least on the sensor data and pre-generated three-dimensional (3D) mesh data for the environment; determine one or more colors or lighting information of a region of the environment on which the rendered virtual content is to be projected; modify one or more colors of the virtual content based at least on the determined one or more colors or lighting information of the region; and provide the virtual content to the display device.

  2. The system of claim 1, wherein the memory further comprises instructions that, when executed on or across the one or more processors, cause the one or more processors to: determine a surface normal of a surface within the region of the environment; determine direction of light from a light source in the environment; determine that the surface reflects the light in the direction of a viewer based at least on the direction of the light and the surface normal; and move a portion of the virtual content that is to be projected on or near the surface to another location.

  3. The system of claim 2, wherein the pre-generated 3D mesh data comprises information indicating the surface normal.

  4. The system of claim 2, wherein to determine that the surface reflects the light in the direction of the viewer, the memory further comprises instructions that, when executed on or across the one or more processors, cause the one or more processors to: determine a current position and direction of a vehicle comprising the display device and the controller.

  5. The system of claim 1, wherein the memory further comprises instructions that, when executed on or across the one or more processors, cause the one or more processors to: modify an intensity or a size of the virtual content based at least on the determined one or more colors or lighting information of the region.

  6. The system of claim 1, wherein the one or more colors of the region are determined according to the sensor data.

  7. The system of claim 1, wherein the display device is incorporated into a windshield of a vehicle.

  8. A method, comprising: performing, with one or more computing devices: obtaining sensor data for an environment of a real-world scene captured by one or more sensors; rendering virtual content for the environment based at least on the sensor data and pre-generated three-dimensional (3D) mesh data for the environment; determining one or more colors or lighting information of a region of the environment on which the rendered virtual content is to be projected; modifying one or more colors of the virtual content based at least on the determined one or more colors or lighting information of the region; and providing the virtual content to a display device.

  9. The method of claim 8, further comprising: determining a surface normal of a surface within the region of the environment; determining direction of light from a light source in the environment; determining that the surface reflects the light in the direction of a viewer based at least on the direction of the light and the surface normal; and moving a portion of the virtual content that is to be projected on or near the surface to another location.

  10. The method of claim 9, wherein the pre-generated 3D mesh data comprises information indicating the surface normal.

  11. The method of claim 8, wherein determining that the surface reflects the light in the direction of the viewer comprises: determining a current position and direction of a vehicle comprising the display device and the controller.

  12. The method of claim 8, further comprising: modifying an intensity or a size of the virtual content based at least on the determined one or more colors or lighting information of the region.

  13. The method of claim 8, wherein the one or more colors of the region are determined according to the sensor data.

  14. The method of claim 8, wherein the display device is incorporated into a windshield of a vehicle.

  15. One or more computer-readable storage media storing instructions that, when executed on or across one or more processors, cause the one or more processors to: obtain sensor data for an environment of a real-world scene captured by one or more sensors; render virtual content for the environment based at least on the sensor data and pre-generated three-dimensional (3D) mesh data for the environment; determine one or more colors or lighting information of a region of the environment on which the rendered virtual content is to be projected; modify one or more colors of the virtual content based at least on the determined one or more colors or lighting information of the region; and provide the virtual content to a display device.

  16. The one or more computer-readable storage media of claim 15, wherein the memory further comprises instructions that, when executed on or across the one or more processors, cause the one or more processors to: determine a surface normal of a surface within the region of the environment; determine direction of light from a light source in the environment; determine that the surface reflects the light in the direction of a viewer based at least on the direction of the light and the surface normal; and move a portion of the virtual content that is to be projected on or near the surface to another location.

  17. The one or more computer-readable storage media of claim 16, wherein the pre-generated 3D mesh data comprises information indicating the surface normal.

  18. The one or more computer-readable storage media of claim 17, wherein to determine that the surface reflects the light in the direction of the viewer, the memory further comprises instructions that, when executed on or across the one or more processors, cause the one or more processors to: determine a current position and direction of a vehicle comprising the display device and the controller.

  19. The one or more computer-readable storage media of claim 15, wherein the memory further comprises instructions that, when executed on or across the one or more processors, cause the one or more processors to: modify an intensity or a size of the virtual content based at least on the determined one or more colors or lighting information of the region.

  20. The one or more computer-readable storage media of claim 15, wherein the one or more colors of the region are determined according to the sensor data.

Description

[0001] This application is a continuation of U.S. patent application Ser. No. 15/713,274, filed Sep. 22, 2017, which claims benefit of priority to U.S. Provisional Application No. 62/398,927, filed Sep. 23, 2016, which are hereby incorporated by reference in their entirety.

BACKGROUND

[0002] Remote sensing technologies provide different systems with information about the environment external to the system. Diverse technological applications may rely upon remote sensing systems and devices to operate. Moreover, as increasing numbers of systems seek to utilize greater amounts of data to perform different tasks in dynamic environments, remote sensing provides environmental data that may be useful for decision-making. For example, control systems that direct the operation of machinery may utilize remote sensing devices to detect objects within a workspace. As another example, augmented reality (AR) systems may utilize remote sensing devices to provide depth information about objects in an environment. In some scenarios, laser-based sensing technologies, such as light detection and ranging (LiDAR), can provide high-resolution environmental data, such as depth maps, which may indicate the proximity of different objects to the LiDAR sensor.

[0003] Real-time augmented reality faces a variety of challenges when it is a primary display technology in a vehicle traveling at various speeds and angles through ever-changing environments. Weather conditions, sunlight, and vehicle kinematics are just a few of the elements that may impact the rendering and limit a system’s overall capabilities. This is especially true since on-board sensors have a fixed range and often require algorithms for optimizing queries, which impact overall quality and response time.

SUMMARY

[0004] Methods and systems are described that may, for example, be used in augmented reality (AR) displays in vehicles. Embodiments of an AR system are described that leverage a pre-generated stereographic reconstruction or 3D model of the world to aid in anchoring and to improve rendering of an AR scene. By leveraging the stereographic reconstruction of the world, embodiments of the AR system may use a variety of techniques to enhance the rendering capabilities of the system. In embodiments, an AR system may obtain pre-generated 3D data (e.g., 3D tiles) from a stereographic reconstruction of the world generated using real-world images collected from a large number of sources over time, and may use this pre-generated 3D data (e.g., a combination of 3D mesh, textures, and other geometry information) to determine much more information about a scene than is available from local sources (e.g., a point cloud of data collected by vehicle sensors), from which AR rendering can benefit.

[0005] Embodiments of an AR system are described that may use three-dimensional (3D) mesh map data (e.g., 3D tiles reconstructed from aerial/street photography) to augment or complement vehicle sensor (e.g., LiDAR or camera) information on a heads-up display. The 3D tiles can be used to fill in for limitations of the sensors (e.g., areas of the real environment that are occluded by buildings or terrain, or are out of range) to extend the AR into the full real environment in front of the vehicle (i.e., within the driver’s field of vision). For example, a route may be displayed, including parts of the route that are occluded by objects or terrain in the real environment.
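The occluded-route handling described above can be sketched as a depth comparison against the pre-generated mesh: a route point is drawn in an "occluded" style when the mesh reports a nearer surface along the same view ray. This is an illustrative sketch only; the function names and the scalar depth representation are assumptions, not the patent's actual implementation.

```python
def is_occluded(route_point_depth, mesh_depth_along_ray):
    """Return True when the pre-generated 3D mesh reports a nearer surface
    along the same view ray, i.e. the route point is hidden behind terrain
    or a building and should be rendered in an 'occluded' style."""
    if mesh_depth_along_ray is None:  # no mesh surface along this ray
        return False
    return mesh_depth_along_ray < route_point_depth

def route_styles(route_depths, mesh_depths):
    """Style each sample point of a route polyline based on occlusion."""
    return ["occluded" if is_occluded(r, m) else "visible"
            for r, m in zip(route_depths, mesh_depths)]

# A route sample 50 m away, behind a building surface 30 m away, is occluded.
styles = route_styles([10.0, 50.0, 120.0], [None, 30.0, 200.0])
```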

[0006] Embodiments may enable the projection of 3D elements onto the terrain without having to perform altitude queries or reference a point cloud of data collected by on-board sensors. Elements can be rendered in the augmented reality scene beyond the capabilities/range of the onboard sensors. In addition to being blocked by occluding objects in the environment, the sensors are limited by other factors such as distance and the speed of the vehicle. The pre-generated 3D mesh map data can make up for these limitations since the static imagery is made available without having to scan and reference the point cloud.

[0007] In some embodiments, the sensor data may be used to provide AR content for the nearby real environment, with the pre-generated 3D mesh map data used to provide AR content for farther away objects and occluded parts of the real environment.

[0008] In some embodiments, the pre-generated 3D mesh map data may be used to speed up queries of the point cloud data for nearby objects, since more limited queries can be made based on the 3D mesh map data. Thus, point cloud queries may be optimized for the local region based on the 3D mesh map data for that region.

[0009] In some embodiments, if for some reason the sensor data is not available or is poor/range limited (e.g., sensor failure, dirt on sensor, fog, heavy rain, snow, dark/night (for camera info), inside a tunnel or garage, blocked by other vehicles, etc.), the pre-generated 3D mesh map data may still be available and may be used to fill in the missing local AR content, as well as more remote content.

[0010] In some embodiments, surface normals provided in the pre-generated 3D data for visible surfaces in the scene, together with knowledge of the location of light sources (e.g., the sun), may allow the AR system to determine the orientation of the surfaces with respect to a light source. Using this information, when rendering elements into the augmented reality scene, the AR system may adjust the rendering of content in the AR scene so that the content is easier to see.

[0011] In some embodiments, animated elements (e.g., virtual representations of vehicles, pedestrians, etc.) in the 3D rendered scene may be made to respond to the terrain, as well as the type of surface the terrain is composed of, based on the pre-generated 3D data. For example, if a vehicle in the scene turns and goes behind a building, a virtual image of the vehicle may be displayed as going up a hill that is behind the building and thus out of view of the on-board sensors.

[0012] The pre-generated 3D mesh map data may be available for the entire real environment, 360° around the vehicle, behind occlusions, and beyond the horizon. Thus, in some embodiments, the 3D mesh map data may be leveraged to provide information about the environment, including objects that are not visible, to the sides and behind the vehicle.

[0013] In some embodiments, the 3D mesh map data may be used by the AR system in poor/limited visibility driving conditions, e.g. heavy fog, snow, curvy mountain roads, etc., in which the sensor range may be limited, for example to project the route in front of the vehicle onto the AR display. For example, the 3D mesh map data may be used to augment sensor data by showing upcoming curves or intersections.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] FIG. 1 is a high-level flowchart of a method for augmenting an AR display with stereographically reconstructed 3D data, according to some embodiments.

[0015] FIG. 2 illustrates an adaptive augmented reality (AR) system and display, according to some embodiments.

[0016] FIG. 3 illustrates processing 3D mesh map data and local sensor data to generate virtual content for an AR display, according to some embodiments.

[0017] FIG. 4 illustrates 3D tiles, according to some embodiments.

[0018] FIG. 5 illustrates a 3D mesh, according to some embodiments.

[0019] FIG. 6 illustrates an example adaptive AR display, according to some embodiments.

[0020] FIG. 7 illustrates another example adaptive AR display, according to some embodiments.

[0021] FIG. 8 illustrates adapting virtual content in an AR display according to the real world scene, according to some embodiments.

[0022] FIG. 9 illustrates displaying virtual content for animated elements in an AR display, according to some embodiments.

[0023] FIG. 10 illustrates leveraging 3D mesh map data and local sensor data to provide AR views of the environment to passengers in a vehicle, according to some embodiments.

[0024] FIG. 11 is a high-level flowchart of a method for adapting an AR display using 3D mesh map data, according to some embodiments.

[0025] FIG. 12 is a flowchart of a method for stabilizing virtual content on an AR display, according to some embodiments.

[0026] This specification includes references to “one embodiment” or “an embodiment.” The appearances of the phrases “in one embodiment” or “in an embodiment” do not necessarily refer to the same embodiment. Particular features, structures, or characteristics may be combined in any suitable manner consistent with this disclosure.

[0027] “Comprising.” This term is open-ended. As used in the appended claims, this term does not foreclose additional structure or steps. Consider a claim that recites: “An apparatus comprising one or more processor units … .” Such a claim does not foreclose the apparatus from including additional components (e.g., a network interface unit, graphics circuitry, etc.).

[0028] “Configured To.” Various units, circuits, or other components may be described or claimed as “configured to” perform a task or tasks. In such contexts, “configured to” is used to connote structure by indicating that the units/circuits/components include structure (e.g., circuitry) that performs the task or tasks during operation. As such, the unit/circuit/component can be said to be configured to perform the task even when the specified unit/circuit/component is not currently operational (e.g., is not on). The units/circuits/components used with the “configured to” language include hardware, for example, circuits, memory storing program instructions executable to implement the operation, etc. Reciting that a unit/circuit/component is “configured to” perform one or more tasks is expressly intended not to invoke 35 U.S.C. § 112(f) for that unit/circuit/component. Additionally, “configured to” can include generic structure (e.g., generic circuitry) that is manipulated by software and/or firmware (e.g., an FPGA or a general-purpose processor executing software) to operate in a manner that is capable of performing the task(s) at issue. “Configured to” may also include adapting a manufacturing process (e.g., a semiconductor fabrication facility) to fabricate devices (e.g., integrated circuits) that are adapted to implement or perform one or more tasks.

[0029] “First,” “Second,” etc. As used herein, these terms are used as labels for nouns that they precede, and do not imply any type of ordering (e.g., spatial, temporal, logical, etc.). For example, a buffer circuit may be described herein as performing write operations for “first” and “second” values. The terms “first” and “second” do not necessarily imply that the first value must be written before the second value.

[0030] “Based On.” As used herein, this term is used to describe one or more factors that affect a determination. This term does not foreclose additional factors that may affect a determination. That is, a determination may be solely based on those factors or based, at least in part, on those factors. Consider the phrase “determine A based on B.” While in this case, B is a factor that affects the determination of A, such a phrase does not foreclose the determination of A from also being based on C. In other instances, A may be determined based solely on B.

DETAILED DESCRIPTION

[0031] Methods and systems are described that may, for example, be used in augmented reality (AR) displays in vehicles. Embodiments of an AR system are described that leverage a pre-generated stereographic reconstruction of the world to aid in anchoring and to improve rendering of an AR scene. By leveraging the stereographic reconstruction of the world, embodiments of the AR system may use a variety of techniques to enhance the rendering capabilities of the system. In embodiments, an AR system may obtain pre-generated 3D data from a stereographic reconstruction of the world generated from real-world images collected from a large number of sources over time, and use this pre-generated 3D data (e.g., a combination of 3D mesh, textures, and other geometry information) to determine much more information about a scene than is available from local sources (e.g., a point cloud of data collected by vehicle sensors), from which AR rendering can benefit. In addition, the pre-generated 3D data may be in a manageable format that can be used by the AR system to map the local environment without having to query the point cloud of data collected by vehicle sensors at a high rate. The pre-generated 3D data may be used as a guide for queries of the point cloud so that the queries can be concentrated on and limited to regions for which data is needed. The pre-generated 3D data may unlock AR rendering capabilities and allow AR systems and displays for vehicles to exceed the capabilities of systems that use conventional on-board sensor-dedicated approaches that query large amounts of point cloud data in real-time.

[0032] In some embodiments, the pre-generated 3D data, generated from stereographic 3D imagery, provides a mesh reconstruction of the real-world scene in front of and/or around the vehicle or viewer, along with geometry information that may be used to determine surface angles and lighting information, without having to query a point cloud of data collected by local sensors. Knowing the vehicle’s location and field of view, the AR system may obtain 3D tiles with the appropriate information for the scene.

[0033] By leveraging the pre-generated 3D data, an AR system may render elements in an augmented reality scene beyond the capabilities of the available onboard sensors (e.g., LiDAR, cameras, etc.). LiDAR and other on-board sensors typically have a distance limitation, and may also be limited by other factors such as how fast the vehicle is traveling, motion of the vehicle (e.g., turning or bouncing), and occluding objects (buildings, other vehicles, terrain, etc.). The processed stereographic imagery can make up for these limitations, since the static imagery is available via the pre-generated 3D data without having to query a point cloud of data captured by on-board sensors.

[0034] A benefit of using the pre-generated 3D data in an AR system is that it allows the projection of 3D elements onto the terrain without having to perform several altitude queries or reference a large point cloud of data collected by on-board sensors. In addition, the 3D mesh allows the AR system to precisely detect parts of the scene which are occluded, for example a route going behind a building or mountain, and to render virtual content in the scene accordingly.
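The terrain projection described above can be sketched as a barycentric height lookup in a mesh triangle: given a 2D position, the element's altitude is interpolated directly from the pre-generated mesh, with no per-frame altitude query against a sensor point cloud. The triangle representation and function name are illustrative assumptions.

```python
def terrain_height(triangle, x, y):
    """Interpolate terrain height at (x, y) inside a mesh triangle using
    barycentric coordinates. `triangle` is three (x, y, z) vertices taken
    from a pre-generated 3D tile."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3

# Anchor a route marker on a triangle that slopes upward in y
# (z = 10 at y = 0, z = 20 at y = 10).
tri = [(0.0, 0.0, 10.0), (10.0, 0.0, 10.0), (0.0, 10.0, 20.0)]
h = terrain_height(tri, 2.0, 5.0)  # → 15.0, halfway up the slope
```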

[0035] In addition to the 3D geometry of the terrain, surface normals provided in the pre-generated 3D data for visible surfaces in the scene, combined with knowledge of the location of light sources (e.g., the sun), may allow the AR system to determine the orientation of the surfaces with respect to a light source. Using this information, when rendering elements into the augmented reality scene, the AR system may perform color correction of virtual content based on the lighting angle, move virtual content away from surfaces with glare, and/or mitigate sunlight or glare that could make parts of the rendered AR scene challenging to see.
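The glare test described above can be sketched with basic vector math: reflect the sun direction about the surface normal from the mesh, and flag glare when the reflection nearly aligns with the direction toward the viewer. All names, and the 0.95 alignment threshold, are illustrative assumptions; vectors are assumed to be unit length.

```python
def dot(a, b):
    """Dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def reflect(light_dir, normal):
    """Mirror-reflect an incoming light direction about a unit surface
    normal: r = d - 2 (d · n) n."""
    d = dot(light_dir, normal)
    return tuple(l - 2.0 * d * n for l, n in zip(light_dir, normal))

def causes_glare(light_dir, surface_normal, view_dir, threshold=0.95):
    """True when the mirror reflection of the light source is nearly
    aligned with the direction from the surface toward the viewer, in
    which case virtual content could be moved off that surface."""
    r = reflect(light_dir, surface_normal)
    return dot(r, view_dir) > threshold

# Sun directly overhead shining down on a flat road surface: a viewer
# looking straight down at it sees glare; a viewer off to the side does not.
overhead = causes_glare((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
```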

[0036] In some embodiments, animated elements (e.g., virtual representations of vehicles, pedestrians, etc.) in the 3D rendered scene may be made to respond to the terrain, as well as the type of surface the terrain is composed of, based on the pre-generated 3D data. For example, if a vehicle in front of the vehicle with the AR system turns and goes behind a building, a virtual image of the vehicle may be displayed as going up a hill that is behind the building and thus out of view of the vehicle’s on-board sensors.

[0037] Vehicles, as used herein, may include any type of surface vehicle, for example, automobiles, trucks, motorcycles, ATVs, buses, trains, etc. However, note that the AR systems and methods as described herein may also be adapted for use in airplanes, helicopters, boats, ships, etc. In addition, the AR systems and methods as described herein may be adapted for use in mobile devices such as smartphones, tablet or pad devices, notebook or laptop computers, and AR/VR head-mounted displays (HMDs) such as glasses, goggles, or helmets that may be carried or worn by pedestrians or passengers in a vehicle.

[0038] FIG. 1 is a high-level flowchart of a method for augmenting an AR display with stereographically reconstructed 3D data, according to some embodiments. Elements 1010 and 1020 may, for example, be performed by a network-based 3D data system including one or more computing systems that collect images (e.g., aerial and street photography from previous collections, images captured by vehicles equipped with instances of the AR system and/or video or still cameras, images captured by personal devices such as smartphones or tablets, etc.), stereographically reconstruct and otherwise process the images to generate data including 3D mesh maps of surfaces and objects, surface normals, textures, and other geometry information, location information (e.g., GPS coordinates), elevation information, time stamps, and so on. The 3D data may be stored, for example as 3D tiles each representing a 3D portion of the real world and tagged with appropriate information, to a backend storage system, for example cloud-based storage. Frontend server(s) and APIs may be provided for retrieving 3D data from the backend storage.

[0039] As indicated at 1010, the 3D data system may obtain images of the real world. As indicated at 1020, the 3D data system may stereographically reconstruct images to generate 3D data (e.g., 3D tiles), and may store the 3D data (e.g., tagged with location information, elevation information, time stamps, etc.) to the storage system (e.g., cloud-based storage). As indicated by the arrow returning from 1020 to 1010, collecting image data and generating 3D data from the collected images may be a continuous process, with new images collected and processed to add to or update the existing 3D data in storage as the images become available.

[0040] As indicated at 1030, an AR system in a vehicle may obtain 3D data (e.g., as 3D tiles) according to a current location and direction of travel of the vehicle. For example, the AR system may obtain location information (e.g., GPS coordinates) from a localization system of the vehicle, and may also obtain directional information from the vehicle. The AR system may then query a frontend to the cloud-based storage of 3D data according to the location and directional information, for example via a wireless communications link used to connect the vehicle to the Internet/cloud, to obtain 3D tiles for an area in front of or around the vehicle. 3D tiles may be obtained for the local environment within the range of the vehicle’s sensors, as well as for the environment in an extended range outside the range of the sensors. In some embodiments, the AR system may locally store or cache previously fetched 3D tiles, for example for frequently or recently traveled routes, and may thus first check the cache to see if 3D data corresponding to the current location is available in the cache before querying the cloud. In some embodiments, the AR system may fetch 3D tiles in advance based on the vehicle’s current location, direction of travel, and velocity. Thus, the AR system may proactively fetch 3D tiles from the cloud-based storage that may be needed in the future while processing previously fetched 3D tiles for the current location and/or upcoming locations.
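The cache-then-cloud tile flow at 1030 can be sketched as follows. The tile keying scheme, tile size, and the `fetch_from_cloud` callable are illustrative assumptions standing in for the frontend API the patent describes, not a real interface.

```python
import math

TILE_SIZE_M = 100.0  # assumed tile edge length in meters (illustrative)

def tile_key(x_m, y_m):
    """Quantize a position in a local metric frame to a tile-grid key."""
    return (int(x_m // TILE_SIZE_M), int(y_m // TILE_SIZE_M))

class TileStore:
    """Cache-first access to pre-generated 3D tiles, with prefetch ahead of
    the vehicle based on heading and speed, as described at 1030."""

    def __init__(self, fetch_from_cloud):
        self._fetch = fetch_from_cloud  # callable: key -> tile data
        self._cache = {}

    def get(self, key):
        if key not in self._cache:  # only query the cloud on a cache miss
            self._cache[key] = self._fetch(key)
        return self._cache[key]

    def prefetch(self, x_m, y_m, heading_rad, speed_mps, horizon_s=5.0):
        """Fetch the tile the vehicle is predicted to occupy horizon_s
        seconds ahead, given its current heading and speed."""
        ahead_x = x_m + math.cos(heading_rad) * speed_mps * horizon_s
        ahead_y = y_m + math.sin(heading_rad) * speed_mps * horizon_s
        return self.get(tile_key(ahead_x, ahead_y))
```

For example, a vehicle at the origin heading along +x at 30 m/s would prefetch the tile 150 m ahead, and a second `get` for the current tile would be served from the cache without a cloud round trip.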

[0041] As indicated at 1040, the AR system may query sensor data according to the fetched, pre-generated 3D data to obtain 3D information for the local environment within the range of the sensors. The vehicle may include sensors (e.g., LiDAR, cameras, etc.) that may generate a large number of data points for the local environment of the vehicle, referred to as a point cloud. The AR system may query this point cloud to obtain and/or generate local 3D information (e.g., locations and geometry (shapes and sizes) of fixed and moving objects, range of the objects, movement/direction of moving objects, etc.). In some embodiments, the pre-generated 3D data may be used to reduce or speed up queries of the point cloud for nearby objects. For example, queries for a region of the local environment may be reduced or not performed at all if pre-generated 3D data is available for the region. In addition, if sensor data is insufficient or missing for a region of the local environment due to occlusion, limitations of the sensors, or environmental conditions such as rain or fog, then if pre-generated 3D data is available for those regions, the obtained 3D data may be used to fill in those regions.
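The query culling at 1040 can be sketched as a coverage test: regions already covered by pre-generated mesh data are served from the (cheap, static) mesh, and only the remaining regions trigger an expensive point-cloud query. Region keys and the two lookup callables are illustrative assumptions.

```python
def regions_to_query(candidate_regions, mesh_covered_regions):
    """Return only the regions that lack pre-generated 3D data and so
    still require a point-cloud query against on-board sensor data."""
    return [r for r in candidate_regions if r not in mesh_covered_regions]

def scene_info(candidate_regions, mesh_covered_regions,
               query_point_cloud, mesh_lookup):
    """Merge cheap pre-generated-mesh lookups with point-cloud queries
    for the regions the mesh does not cover."""
    info = {}
    for region in candidate_regions:
        if region in mesh_covered_regions:
            info[region] = mesh_lookup(region)        # static 3D tile data
        else:
            info[region] = query_point_cloud(region)  # live sensor data
    return info
```

This also captures the fallback behavior of [0009] in reverse: if the sensor-side callable reports no data for a region (occlusion, fog, sensor failure), available mesh data can fill the gap instead.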

……
……
……
