Intel Patent | Multi-Plane Display Image Synthesis Mechanism
Publication Number: 20200137380
Publication Date: 20200430
An apparatus to facilitate generating a multi-focal/multi-plane (MF/MP) display is disclosed. The apparatus comprises one or more processors to generate a plurality of full resolution views for each frame of a three-dimensional (3D) scene, and perform deep neural network (DNN) inferencing using the plurality of full resolution views to select two or more presentation planes from among a plurality of available planes for display.
 Embodiments relate generally to presentation of imagery on a multi-plane display.
BACKGROUND OF THE DESCRIPTION
 A multi-focal/multi-plane (MF/MP) display is a device capable of displaying multiple images located at different distances from a viewer. In MF/MP displays, images are displayed in planes either at the same time (e.g., stack of transparent two dimensional (2D) displays+fixed optics) or one by one in a time multiplexed manner at a speed beyond a human vision flicker threshold. Generally there is a small number of such image planes (e.g., 2, 4, or 8 planes), which makes a MF/MP display significantly more technologically feasible than a volumetric display.
 For example, a two-plane display may be implemented as a combination of a two-state switchable lens and a single high refresh rate display (e.g., a 120 Hz display); and a four-plane display could be implemented as a stack of two switchable lenses (each having two states) plus a high refresh rate display (e.g., 240 Hz). For a 4-plane display, the lens or lens stack may be arranged in a way that shows a virtual image of the display at a distance of 3.5, 2.5, 1.5 or 0.5 diopters, depending on the lens state.
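 The arithmetic behind these examples can be sketched as follows. This is a minimal illustration, not part of the disclosure: the function names are hypothetical, and it assumes that time multiplexing gives each plane an equal share of the display refresh rate. The conversion from diopters to meters is the standard reciprocal relationship.

```python
# Illustrative arithmetic only; all function names are hypothetical.

def plane_count(num_two_state_lenses: int) -> int:
    """Each two-state switchable lens doubles the number of focal states."""
    return 2 ** num_two_state_lenses


def per_plane_rate_hz(display_hz: float, planes: int) -> float:
    """Assumes the refresh rate is split evenly across the planes."""
    return display_hz / planes


def diopters_to_meters(diopters: float) -> float:
    """Optical distance in meters is the reciprocal of the value in diopters."""
    return 1.0 / diopters


# Two-plane example: one switchable lens and a 120 Hz display.
two_plane_rate = per_plane_rate_hz(120.0, plane_count(1))

# Four-plane example: two switchable lenses and a 240 Hz display.
four_plane_rate = per_plane_rate_hz(240.0, plane_count(2))

# Virtual image distances for the 4-plane example, rounded to millimeters.
plane_distances_m = [round(diopters_to_meters(d), 3) for d in (3.5, 2.5, 1.5, 0.5)]
```

 Under the equal-share assumption, both configurations refresh each plane at the same rate, and the four planes span roughly 0.29 m to 2 m from the viewer.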
 An implementation of a MF/MP display shows only a limited number of planes distributed within a large range, which may result in disturbing visual artifacts when naive approaches are used to display content located between the image planes. Including additional planes quickly becomes technically infeasible. It is possible to synthesize images for a small number of planes that produce practically artifact-free visual effects, but the computation costs are prohibitively high for real-time applications. Thus MF/MP display implementations typically involve a tradeoff between computation costs and visual quality.
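 One example of a naive approach is to snap each pixel to the closest image plane. The sketch below is an assumption made for illustration (the text does not name a specific naive method); the function names are hypothetical and the plane positions are taken from the 4-plane example above. It shows how content lying between planes ends up displayed at the wrong focal distance.

```python
# Plane positions in diopters, from the 4-plane example in the text.
PLANES_D = (3.5, 2.5, 1.5, 0.5)


def nearest_plane(depth_d: float, planes=PLANES_D) -> int:
    """Return the index of the plane closest to depth_d (in diopters)."""
    return min(range(len(planes)), key=lambda i: abs(planes[i] - depth_d))


def focus_error(depth_d: float, planes=PLANES_D) -> float:
    """Focal mismatch, in diopters, introduced by snapping to the nearest plane."""
    return abs(planes[nearest_plane(depth_d, planes)] - depth_d)


# Content at 2.0 diopters lies exactly halfway between two planes,
# which is the worst case for this plane spacing.
worst_case_error = focus_error(2.0)
```

 With planes spaced 1 diopter apart, the mismatch can reach half a diopter, which is the kind of discontinuity that more expensive synthesis methods try to hide.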
BRIEF DESCRIPTION OF THE DRAWINGS
 So that the manner in which the above recited features of the present embodiments can be understood in detail, a more particular description of the embodiments, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments and are therefore not to be considered limiting of its scope.
 FIG. 1 is a block diagram of a processing system, according to an embodiment.
 FIG. 2 is a block diagram of an embodiment of a processor having one or more processor cores, an integrated memory controller, and an integrated graphics processor.
 FIG. 3 is a block diagram of a graphics processor, which may be a discrete graphics processing unit, or may be a graphics processor integrated with a plurality of processing cores.
 FIG. 4 is a block diagram of a graphics processing engine of a graphics processor in accordance with some embodiments.
 FIG. 5 is a block diagram of hardware logic of a graphics processor core according to some embodiments.
 FIGS. 6A-6B illustrate thread execution logic including an array of processing elements employed in a graphics processor core according to some embodiments.
 FIG. 7 is a block diagram illustrating graphics processor instruction formats according to some embodiments.
 FIG. 8 is a block diagram of another embodiment of a graphics processor.
 FIG. 9A is a block diagram illustrating a graphics processor command format according to an embodiment.
 FIG. 9B is a block diagram illustrating a graphics processor command sequence according to an embodiment.
 FIG. 10 illustrates exemplary graphics software architecture for a data processing system according to some embodiments.
 FIG. 11A is a block diagram illustrating an IP core development system that may be used to manufacture an integrated circuit to perform operations according to an embodiment.
 FIG. 11B illustrates a cross-section side view of an integrated circuit package assembly according to some embodiments.
 FIG. 12 is a block diagram illustrating an exemplary system on a chip integrated circuit that may be fabricated using one or more IP cores, according to an embodiment.
 FIGS. 13A-13B are block diagrams illustrating exemplary graphics processors for use within a System on Chip (SoC), according to embodiments described herein.
 FIGS. 14A-14B illustrate additional exemplary graphics processor logic according to embodiments described herein.
 FIG. 15 illustrates a machine learning software stack, according to an embodiment.
 FIGS. 16A-16B illustrate layers of exemplary deep neural networks.
 FIG. 17 illustrates an exemplary recurrent neural network.
 FIG. 18 illustrates training and deployment of a deep neural network.
 FIG. 19 is a block diagram illustrating distributed learning.
 FIG. 20 illustrates a computing device employing an image synthesis mechanism, according to an embodiment.
 FIG. 21A illustrates images displayed on planes 1, 2, 3 and 4 that are fused by a human eye into a re-focusable 3D image.
 FIG. 21B illustrates one embodiment of a MF/MP display and a human eye viewing it.
 FIGS. 22A & 22B illustrate conventional computational pipelines to implement a MF/MP display.
 FIGS. 23A & 23B illustrate embodiments of a deep neural network (DNN) learning process.
 FIG. 24 is a flow diagram illustrating one embodiment of a process for implementing a MF/MP display.
 FIG. 25 illustrates another embodiment of a MF/MP display flow.
 In embodiments, an image synthesis mechanism implements a DNN to provide real-time generation of artifact-free images for MF/MP displays with focus cues. In such embodiments, the synthesis mechanism implements the DNN to perform inferencing from a set of rendered red, green, blue and depth (RGBD) views. The DNN subsequently selects a number of significant planes from among a number of available planes based on a depth variation of a central RGBD map. The view origins are distributed across an eye box plane providing sufficient parallax to preserve information about reflective/refractive properties of the displayed objects.
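 To make the inputs and outputs of the plane selection step concrete, the sketch below uses a hand-written heuristic in place of the DNN. It is not the disclosed method: the voting rule, the function name `select_planes`, the choice of meters for the depth map, and the available plane positions are all assumptions made for illustration. It takes the depth channel of a central view and returns the planes that cover the most pixels.

```python
from collections import Counter

# Hypothetical set of available planes, in diopters.
AVAILABLE_PLANES_D = (3.5, 2.5, 1.5, 0.5)


def select_planes(central_depth_m, available=AVAILABLE_PLANES_D, k=2):
    """Pick up to k available planes (in diopters) that cover the most pixels.

    central_depth_m is a 2D list of positive depths in meters. This is a
    heuristic stand-in for the DNN described in the text, not the DNN itself.
    """
    votes = Counter()
    for row in central_depth_m:
        for z in row:
            d = 1.0 / z  # meters to diopters
            idx = min(range(len(available)), key=lambda i: abs(available[i] - d))
            votes[idx] += 1
    chosen = [idx for idx, _ in votes.most_common(k)]
    return sorted(available[i] for i in chosen)


# Toy 3x3 depth map: a near object at 0.3 m in front of a background at 2 m,
# with a single pixel at 0.7 m.
depth = [
    [0.3, 0.3, 2.0],
    [0.3, 2.0, 2.0],
    [0.3, 0.7, 2.0],
]
selected = select_planes(depth, k=2)
```

 For this toy scene the heuristic keeps the nearest and farthest planes and drops the two in between, since almost no content falls near them.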
 In the following description, numerous specific details are set forth to provide a more thorough understanding. However, it will be apparent to one of skill in the art that the embodiments described herein may be practiced without one or more of these specific details. In other instances, well-known features have not been described to avoid obscuring the details of the present embodiments.
 FIG. 1 is a block diagram of a processing system 100, according to an embodiment. In various embodiments, the system 100 includes one or more processors 102 and one or more graphics processors 108, and may be a single processor desktop system, a multiprocessor workstation system, or a server system having a large number of processors 102 or processor cores 107. In one embodiment, the system 100 is a processing platform incorporated within a system-on-a-chip (SoC) integrated circuit for use in mobile, handheld, or embedded devices.
 In one embodiment, the system 100 can include, or be incorporated within a server-based gaming platform, a game console, including a game and media console, a mobile gaming console, a handheld game console, or an online game console. In some embodiments, the system 100 is a mobile phone, smart phone, tablet computing device or mobile Internet device. The processing system 100 can also include, couple with, or be integrated within a wearable device, such as a smart watch wearable device, smart eyewear device, augmented reality device, or virtual reality device. In some embodiments, the processing system 100 is a television or set top box device having one or more processors 102 and a graphical interface generated by one or more graphics processors 108.
 In some embodiments, the one or more processors 102 each include one or more processor cores 107 to process instructions which, when executed, perform operations for system and user software. In some embodiments, each of the one or more processor cores 107 is configured to process a specific instruction set 109. In some embodiments, instruction set 109 may facilitate Complex Instruction Set Computing (CISC), Reduced Instruction Set Computing (RISC), or computing via a Very Long Instruction Word (VLIW). Multiple processor cores 107 may each process a different instruction set 109, which may include instructions to facilitate the emulation of other instruction sets. Processor core 107 may also include other processing devices, such as a Digital Signal Processor (DSP).
 In some embodiments, the processor 102 includes cache memory 104. Depending on the architecture, the processor 102 can have a single internal cache or multiple levels of internal cache. In some embodiments, the cache memory is shared among various components of the processor 102. In some embodiments, the processor 102 also uses an external cache (e.g., a Level-3 (L3) cache or Last Level Cache (LLC)) (not shown), which may be shared among processor cores 107 using known cache coherency techniques. A register file 106 is additionally included in processor 102 which may include different types of registers for storing different types of data (e.g., integer registers, floating point registers, status registers, and an instruction pointer register). Some registers may be general-purpose registers, while other registers may be specific to the design of the processor 102.
 In some embodiments, one or more processor(s) 102 are coupled with one or more interface bus(es) 110 to transmit communication signals such as address, data, or control signals between processor 102 and other components in the system 100. The interface bus 110, in one embodiment, can be a processor bus, such as a version of the Direct Media Interface (DMI) bus. However, processor busses are not limited to the DMI bus, and may include one or more Peripheral Component Interconnect buses (e.g., PCI, PCI Express), memory busses, or other types of interface busses. In one embodiment the processor(s) 102 include an integrated memory controller 116 and a platform controller hub 130. The memory controller 116 facilitates communication between a memory device and other components of the system 100, while the platform controller hub (PCH) 130 provides connections to I/O devices via a local I/O bus.
 The memory device 120 can be a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, flash memory device, phase-change memory device, or some other memory device having suitable performance to serve as process memory. In one embodiment the memory device 120 can operate as system memory for the system 100, to store data 122 and instructions 121 for use when the one or more processors 102 executes an application or process. Memory controller 116 also couples with an optional external graphics processor 112, which may communicate with the one or more graphics processors 108 in processors 102 to perform graphics and media operations. In some embodiments a display device 111 can connect to the processor(s) 102. The display device 111 can be one or more of an internal display device, as in a mobile electronic device or a laptop device or an external display device attached via a display interface (e.g., DisplayPort, etc.). In one embodiment the display device 111 can be a head mounted display (HMD) such as a stereoscopic display device for use in virtual reality (VR) applications or augmented reality (AR) applications.
 In some embodiments the platform controller hub 130 enables peripherals to connect to memory device 120 and processor 102 via a high-speed I/O bus. The I/O peripherals include, but are not limited to, an audio controller 146, a network controller 134, a firmware interface 128, a wireless transceiver 126, touch sensors 125, and a data storage device 124 (e.g., hard disk drive, flash memory, etc.). The data storage device 124 can connect via a storage interface (e.g., SATA) or via a peripheral bus, such as a Peripheral Component Interconnect bus (e.g., PCI, PCI Express). The touch sensors 125 can include touch screen sensors, pressure sensors, or fingerprint sensors. The wireless transceiver 126 can be a Wi-Fi transceiver, a Bluetooth transceiver, or a mobile network transceiver such as a 3G, 4G, or Long Term Evolution (LTE) transceiver. The firmware interface 128 enables communication with system firmware, and can be, for example, a unified extensible firmware interface (UEFI). The network controller 134 can enable a network connection to a wired network. In some embodiments, a high-performance network controller (not shown) couples with the interface bus 110. The audio controller 146, in one embodiment, is a multi-channel high definition audio controller. In one embodiment the system 100 includes an optional legacy I/O controller 140 for coupling legacy (e.g., Personal System 2 (PS/2)) devices to the system. The platform controller hub 130 can also connect to one or more Universal Serial Bus (USB) controllers 142 to connect input devices, such as keyboard and mouse 143 combinations, a camera 144, or other USB input devices.
 It will be appreciated that the system 100 shown is exemplary and not limiting, as other types of data processing systems that are differently configured may also be used. For example, an instance of the memory controller 116 and platform controller hub 130 may be integrated into a discrete external graphics processor, such as the external graphics processor 112. In one embodiment the platform controller hub 130 and/or memory controller 116 may be external to the one or more processor(s) 102. For example, the system 100 can include an external memory controller 116 and platform controller hub 130, which may be configured as a memory controller hub and peripheral controller hub within a system chipset that is in communication with the processor(s) 102.
 FIG. 2 is a block diagram of an embodiment of a processor 200 having one or more processor cores 202A-202N, an integrated memory controller 214, and an integrated graphics processor 208. Those elements of FIG. 2 having the same reference numbers (or names) as the elements of any other figure herein can operate or function in any manner similar to that described elsewhere herein, but are not limited to such. Processor 200 can include additional cores up to and including additional core 202N represented by the dashed lined boxes. Each of processor cores 202A-202N includes one or more internal cache units 204A-204N. In some embodiments each processor core also has access to one or more shared cache units 206.
 The internal cache units 204A-204N and shared cache units 206 represent a cache memory hierarchy within the processor 200. The cache memory hierarchy may include at least one level of instruction and data cache within each processor core and one or more levels of shared mid-level cache, such as a Level 2 (L2), Level 3 (L3), Level 4 (L4), or other levels of cache, where the highest level of cache before external memory is classified as the LLC. In some embodiments, cache coherency logic maintains coherency between the various cache units 206 and 204A-204N.
 In some embodiments, processor 200 may also include a set of one or more bus controller units 216 and a system agent core 210. The one or more bus controller units 216 manage a set of peripheral buses, such as one or more PCI or PCI express busses. System agent core 210 provides management functionality for the various processor components. In some embodiments, system agent core 210 includes one or more integrated memory controllers 214 to manage access to various external memory devices (not shown).
 In some embodiments, one or more of the processor cores 202A-202N include support for simultaneous multi-threading. In such an embodiment, the system agent core 210 includes components for coordinating and operating cores 202A-202N during multi-threaded processing. System agent core 210 may additionally include a power control unit (PCU), which includes logic and components to regulate the power state of processor cores 202A-202N and graphics processor 208.
 In some embodiments, processor 200 additionally includes graphics processor 208 to execute graphics processing operations. In some embodiments, the graphics processor 208 couples with the set of shared cache units 206, and the system agent core 210, including the one or more integrated memory controllers 214. In some embodiments, the system agent core 210 also includes a display controller 211 to drive graphics processor output to one or more coupled displays. In some embodiments, display controller 211 may also be a separate module coupled with the graphics processor via at least one interconnect, or may be integrated within the graphics processor 208.
 In some embodiments, a ring based interconnect unit 212 is used to couple the internal components of the processor 200. However, an alternative interconnect unit may be used, such as a point-to-point interconnect, a switched interconnect, or other techniques, including techniques well known in the art. In some embodiments, graphics processor 208 couples with the ring interconnect 212 via an I/O link 213.
 The exemplary I/O link 213 represents at least one of multiple varieties of I/O interconnects, including an on package I/O interconnect which facilitates communication between various processor components and a high-performance embedded memory module 218, such as an eDRAM module. In some embodiments, each of the processor cores 202A-202N and graphics processor 208 use embedded memory modules 218 as a shared Last Level Cache.
 In some embodiments, processor cores 202A-202N are homogeneous cores executing the same instruction set architecture. In another embodiment, processor cores 202A-202N are heterogeneous in terms of instruction set architecture (ISA), where one or more of processor cores 202A-202N execute a first instruction set, while at least one of the other cores executes a subset of the first instruction set or a different instruction set. In one embodiment processor cores 202A-202N are heterogeneous in terms of microarchitecture, where one or more cores having a relatively higher power consumption couple with one or more power cores having a lower power consumption. Additionally, processor 200 can be implemented on one or more chips or as an SoC integrated circuit having the illustrated components, in addition to other components.
 FIG. 3 is a block diagram of a graphics processor 300, which may be a discrete graphics processing unit, or may be a graphics processor integrated with a plurality of processing cores. In some embodiments, the graphics processor communicates via a memory mapped I/O interface to registers on the graphics processor and with commands placed into the processor memory. In some embodiments, graphics processor 300 includes a memory interface 314 to access memory. Memory interface 314 can be an interface to local memory, one or more internal caches, one or more shared external caches, and/or to system memory.