Facebook Patent | Systems and methods for mask-based spatio-temporal dithering
Patent: Systems and methods for mask-based spatio-temporal dithering
Publication Number: 20210027725
Publication Date: 20210128
Applicant: Facebook
Abstract
In one embodiment, a computing system may receive a target image with a first number of bits per color. The system may access masks that each includes dots associated with a grayscale range. A subset of the dots associated with each of the masks may be associated with a subrange of the grayscale range. The dots within the subsets of dots associated with the masks may have different positions. The system may generate a number of images based on the target image and the masks. Each of the images may have a second number of bits per color smaller than the first number of bits per color. The system may display the images sequentially on a display for representing the target image.
Claims
1.
A method comprising, by a computing system: receiving a target image with a first number of bits per color; accessing masks that each comprises dots associated with a grayscale range, wherein a subset of the dots associated with each of the masks is associated with a subrange of the grayscale range, wherein dots within the subsets of dots associated with the masks have different positions; generating a plurality of images based on the target image and the masks, wherein each of the plurality of images has a second number of bits per color smaller than the first number of bits per color; and displaying the plurality of images sequentially on a display for representing the target image.
2.
The method of claim 1, wherein the dots of each mask are associated with a dot pattern, wherein the dot pattern comprises a plurality of stacked dot patterns, and wherein each of the plurality of stacked dot patterns satisfies a spatio stacking constraint by comprising all dot patterns corresponding to all lower grayscale levels.
3.
The method of claim 2, wherein each dot of the dot pattern is associated with a threshold value, and wherein the threshold value corresponds to a lowest grayscale level which has a corresponding dot pattern that comprises that dot.
4.
The method of claim 2, wherein each mask has threshold values corresponding to all grayscale levels of a quantization grayscale range corresponding to the second number of bits per color.
5.
The method of claim 2, wherein the plurality of stacked dot patterns corresponds to all grayscale levels of the quantization grayscale range.
6.
The method of claim 2, wherein the dots in the dot pattern of each mask have a blue-noise property.
7.
The method of claim 2, wherein a sum of the dot patterns of the masks has a blue-noise property.
8.
The method of claim 1, wherein the plurality of images are generated by satisfying a temporal stacking constraint, and wherein the temporal stacking constraint allows the plurality of images to have the luminosity within a threshold range.
9.
The method of claim 1, wherein the display has the second number of bits per color.
10.
The method of claim 1, wherein the masks are available at a same time for a process of generating the plurality of images, further comprising: determining one or more quantization errors based on one or more color values of the target image and one or more threshold values associated with one of the masks; and dithering the one or more quantization errors temporally to one or more images without using an error buffer.
11.
The method of claim 1, further comprising: generating a seed mask comprising threshold values covering a quantization grayscale range; storing the seed mask in a storage media; and accessing the seed mask from the storage media, wherein the plurality of masks are generated from the seed mask based on a cyclical relationship.
12.
The method of claim 11, wherein the quantization grayscale range has a plurality of evenly-placed grayscale levels.
13.
The method of claim 11, wherein the quantization grayscale range has a plurality of unevenly-placed grayscale levels.
14.
The method of claim 1, further comprising: determining a grayscale limit based on a maximum grayscale level and a number of images for representing the target image.
15.
The method of claim 14, wherein, when a target grayscale value associated with the target image is smaller than the grayscale limit, corresponding regions of the plurality of images comprise non-overlapping sets of pixels from each other.
16.
The method of claim 14, wherein, when a target grayscale value associated with the target image is greater than the grayscale limit, corresponding regions of the plurality of images comprise overlapping sets of pixels, and wherein the overlapping sets of pixels are determined by incrementally selecting dots from at least another mask of the masks.
17.
The method of claim 1, wherein an average grayscale value of a target region of the target image is used as a target grayscale value, and wherein each of the plurality of masks has a same size to the target region of the target image.
18.
The method of claim 17, wherein the plurality of images are generated by repeatedly applying a corresponding mask to the target image.
19.
One or more computer-readable non-transitory storage media embodying software that is operable when executed to: receive a target image with a first number of bits per color; access masks that each comprises dots associated with a grayscale range, wherein a subset of the dots associated with each of the masks is associated with a subrange of the grayscale range, wherein dots within the subsets of dots associated with the masks have different positions; generate a plurality of images based on the target image and the masks, wherein each of the plurality of images has a second number of bits per color smaller than the first number of bits per color; and display the plurality of images sequentially on a display for representing the target image.
20.
A system comprising: one or more non-transitory computer-readable storage media embodying instructions; and one or more processors coupled to the storage media and operable to execute the instructions to: receive a target image with a first number of bits per color; access masks that each comprises dots associated with a grayscale range, wherein a subset of the dots associated with each of the masks is associated with a subrange of the grayscale range, wherein dots within the subsets of dots associated with the masks have different positions; generate a plurality of images based on the target image and the masks, wherein each of the plurality of images has a second number of bits per color smaller than the first number of bits per color; and display the plurality of images sequentially on a display for representing the target image.
Description
TECHNICAL FIELD
[0001] This disclosure generally relates to artificial reality, such as virtual reality and augmented reality.
BACKGROUND
[0002] Artificial reality is a form of reality that has been adjusted in some manner before presentation to a user, which may include, e.g., a virtual reality (VR), an augmented reality (AR), a mixed reality (MR), a hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content or generated content combined with captured content (e.g., real-world photographs). The artificial reality content may include video, audio, haptic feedback, or some combination thereof, and any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer). Artificial reality may be associated with applications, products, accessories, services, or some combination thereof, that are, e.g., used to create content in an artificial reality and/or used in (e.g., perform activities in) an artificial reality. The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, or any other hardware platform capable of providing artificial reality content to one or more viewers.
SUMMARY OF PARTICULAR EMBODIMENTS
[0003] Particular embodiments described herein relate to a method of using multiple dithering masks to generate spatio-temporal subframe images with fewer gray level bits (or lower color depth) to represent a target image with more gray level bits, without using an error buffer. The temporal subframe images may have smooth dithering pattern transitions between grayscale levels and minimal temporal change among the subframes. For a target region (e.g., a tile region) of the target image, the system may generate a dithering mask for each subframe image. Each dithering mask may include a dot pattern with blue-noise distribution and satisfy spatial-stacking constraints. The dot pattern may include a number of stacked dot patterns with each dot pattern having a dot density corresponding to a grayscale level within the quantization range (e.g., 0-255 grayscale levels for 8-bit display). All dot patterns may be chosen to have blue-noise properties and may have a spatial stacking property according to which the dot pattern for grayscale level N+1 may include the dot patterns for all lower grayscale levels of 0 to N. Each dot in the dithering mask may correspond to a threshold value equal to the lowest grayscale level for turning on that dot (i.e., the lowest grayscale level whose corresponding dot pattern includes that dot).
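The relationship between threshold values and stacked dot patterns can be illustrated with a minimal Python sketch. The 4x4 mask values, the 0-16 level range, and the name `dot_pattern` are assumptions made for illustration only; the mask is a simple ordered layout, not a blue-noise mask from this disclosure.

```python
# Hypothetical 4x4 threshold mask with levels 1..16 (illustrative values only,
# not tuned for blue-noise). Each entry is the lowest grayscale level at which
# that dot turns on.
MASK = [
    [1, 9, 3, 11],
    [13, 5, 15, 7],
    [4, 12, 2, 10],
    [16, 8, 14, 6],
]


def dot_pattern(mask, level):
    """Return the set of (row, col) dots that are on at the given level.

    A dot is on when the level has reached its threshold, so the pattern for
    a level always contains the patterns of every lower level.
    """
    return {
        (r, c)
        for r, row in enumerate(mask)
        for c, threshold in enumerate(row)
        if threshold <= level
    }


# Spatial stacking: the pattern at level 3 contains the pattern at level 2.
assert dot_pattern(MASK, 2) <= dot_pattern(MASK, 3)
```

Because every dot carries a single threshold, one mask encodes the dot patterns for all levels at once, and the dot density grows by one dot per level in this sketch.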
[0004] In particular embodiments, for representing a target grayscale value g (e.g., an average grayscale value of a target tile region), the dot patterns corresponding to all lower grayscale levels may be spatially stacked to represent the target grayscale level up to a distribution limit g_L (e.g., the dot pattern of the target grayscale level may include all dots of lower grayscale levels). The distribution limit g_L (e.g., 0.25) may be determined by dividing the maximum grayscale level (e.g., 1) by the number of subframes (e.g., 4 subframes). Under the condition of g<g_L, the dot pattern of each dithering mask of each subframe may include a subset of dots which has no overlapping dots with any other subframe. For presenting a grayscale higher than the distribution limit g_L (e.g., g>0.25), additional dots could be incrementally added and turned on. To ensure temporal consistency, the dots that are incrementally added may be selected from the dots that are included in one or more dithering masks of the other subframes. For example, for quantizing a grayscale between the distribution limit g_L and two times the distribution limit 2g_L (e.g., 0.25<g<0.5), the dots added to the first subframe (which is at grayscale 0.25) may be incrementally selected from the dots included in the dithering mask of the second subframe. As another example, for quantizing a grayscale in the range of two times the distribution limit 2g_L to three times the distribution limit 3g_L (e.g., 0.5<g<0.75), the dots to be turned on may include the dots of the first and second subframe dithering masks (e.g., which are both at grayscale 0.25). The dots that are incrementally added may be selected from the dots included in the dithering mask of the third subframe. The dithering masks for generating the subframe images may be pre-determined and may be available for use when needed by the process for generating the subframe images. Therefore, all subframe images may be generated at essentially the same time (or generated in parallel) and the quantization errors may be dithered in the temporal domain to other subframes during the subframe image generating process. Therefore, the system may not need to store the quantization error for the temporal dithering process to other subframes. As a result, using the dithering masks generated following these principles, the temporal subframe images may be generated without using an error buffer, and therefore reduce the memory usage related to the subframe image generating processes. The subframe images may have smooth dither pattern transitions between grayscales and minimal temporal change among the subframes.
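The distribution limit and the incremental borrowing of dots from other subframes can be sketched as follows. This is a hedged illustration, not the disclosed implementation: it assumes the dots of a tile are ranked 0..D-1, that each subframe owns one contiguous block of ranks at grayscale g_L, and that every subframe (not only the first) borrows from the next block cyclically. The names `distribution_limit` and `subframe_dots` are hypothetical.

```python
def distribution_limit(g_max, num_subframes):
    """g_L: the maximum grayscale level divided by the number of subframes."""
    return g_max / num_subframes


def subframe_dots(rank_count, num_subframes, g, n):
    """Ranks of the dots lit in subframe n (0-based) for a target g in [0, 1].

    Assumption: subframe n owns the block of ranks starting at
    n * rank_count / num_subframes and, above g_L, keeps adding dots from the
    following blocks, wrapping around at the end.
    """
    block = rank_count // num_subframes   # dots owned by one subframe at g_L
    lit = round(g * rank_count)           # dots needed to show grayscale g
    start = n * block
    return {(start + i) % rank_count for i in range(lit)}


g_limit = distribution_limit(1.0, 4)      # 0.25 for 4 subframes

# Below the limit (g = 0.125): the four subframes use disjoint dots.
low = [subframe_dots(16, 4, 0.125, n) for n in range(4)]

# Above the limit (g = 0.5): subframe 0 lights its own block plus the block
# owned by subframe 1, so neighbouring subframes overlap.
high = [subframe_dots(16, 4, 0.5, n) for n in range(4)]
```

In this sketch every dot is lit in the same number of subframes when g is a multiple of g_L, so the temporal average matches the target without any stored error.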
[0005] In particular embodiments, the multiple masks used to generate the subframes may be generated from a single seed mask stored in the computer storage. The system may store the single seed mask instead of the multiple dithering masks to reduce the storage memory usage related to the subframe generating process. For an arbitrary number of subframes N, the mask for the n-th subframe may be generated by cyclically permuting the seed mask. For a target grayscale level g, the system may determine an offsetting coefficient k_n based on a remainder of (n-1)g divided by g_max, which is the maximum grayscale level. Then, the system may determine the threshold values of a subsequent subframe mask based on a remainder of (t_1-k_n) divided by g_max, where t_1 is a threshold value for a dot in the first mask. As an example, for a target grayscale of 0.25 in a grayscale range of [0, 1] and 4 subframes, the first, second, third and fourth subframe masks may include the dots that have the threshold values within the ranges of [0, 0.25], [0.25, 0.5], [0.5, 0.75] and [0.75, 1], respectively. The threshold values of the first, second, third and fourth subframe dithering masks may be determined by mod(t_1-0, 1), mod(t_1-0.25, 1), mod(t_1-0.5, 1) and mod(t_1-0.75, 1), respectively. As another example, for a target grayscale of 0.6 within a grayscale range of [0, 1] and 4 subframes, the first, second, third and fourth subframe masks may include the dots that have the threshold values within the ranges of [0, 0.6], [0.2, 0.8], [0.4, 1] and [0.2, 0.8], respectively. The threshold values of the first, second, third and fourth subframe masks may be determined by mod(t_1-0, 1), mod(t_1-0.2, 1), mod(t_1-0.3, 1) and mod(t_1-0.2, 1), respectively.
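The cyclic relationship stated above, k_n = mod((n-1)g, g_max) and a per-dot threshold of mod(t_1-k_n, g_max), can be checked against the first worked example (g = 0.25, 4 subframes) with a short Python sketch. The eight-dot seed mask and the function names `offset` and `subframe_mask` are assumptions for illustration; exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction


def offset(n, g, g_max=Fraction(1)):
    """Offsetting coefficient k_n = mod((n - 1) * g, g_max); n is 1-based."""
    return ((n - 1) * g) % g_max


def subframe_mask(seed, n, g, g_max=Fraction(1)):
    """Thresholds of the n-th mask: mod(t_1 - k_n, g_max) per seed threshold."""
    k = offset(n, g, g_max)
    return [(t - k) % g_max for t in seed]


# Hypothetical seed mask: eight dots with thresholds 0, 1/8, ..., 7/8.
seed = [Fraction(i, 8) for i in range(8)]
g = Fraction(1, 4)

# For each subframe, list the dots whose permuted threshold is below g.
lit = [
    [i for i, t in enumerate(subframe_mask(seed, n, g)) if t < g]
    for n in range(1, 5)
]
```

With these inputs, the four subframes light the dots whose seed thresholds fall in successive quarters of the range, so only the seed mask needs to be stored.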
[0006] The embodiments disclosed herein are only examples, and the scope of this disclosure is not limited to them. Particular embodiments may include all, some, or none of the components, elements, features, functions, operations, or steps of the embodiments disclosed above. Embodiments according to the invention are in particular disclosed in the attached claims directed to a method, a storage medium, a system and a computer program product, wherein any feature mentioned in one claim category, e.g. method, can be claimed in another claim category, e.g. system, as well. The dependencies or references back in the attached claims are chosen for formal reasons only. However, any subject matter resulting from a deliberate reference back to any previous claims (in particular multiple dependencies) can be claimed as well, so that any combination of claims and the features thereof are disclosed and can be claimed regardless of the dependencies chosen in the attached claims. The subject-matter which can be claimed comprises not only the combinations of features as set out in the attached claims but also any other combination of features in the claims, wherein each feature mentioned in the claims can be combined with any other feature or combination of other features in the claims. Furthermore, any of the embodiments and features described or depicted herein can be claimed in a separate claim and/or in any combination with any embodiment or feature described or depicted herein or with any of the features of the attached claims.
[0007] In an embodiment, a method may comprise, by a computing system:
[0008] receiving a target image with a first number of bits per color;
[0009] accessing masks that each comprises dots associated with a grayscale range, wherein a subset of the dots associated with each of the masks is associated with a subrange of the grayscale range, wherein dots within the subsets of dots associated with the masks have different positions;
[0010] generating a plurality of images based on the target image and the masks, wherein each of the plurality of images has a second number of bits per color smaller than the first number of bits per color; and
[0011] displaying the plurality of images sequentially on a display for representing the target image.
[0012] The dots of each mask may be associated with a dot pattern, the dot pattern may comprise a plurality of stacked dot patterns, and each of the plurality of stacked dot patterns may satisfy a spatio stacking constraint by comprising all dot patterns corresponding to all lower grayscale levels.
[0013] Each dot of the dot pattern may be associated with a threshold value, and the threshold value may correspond to a lowest grayscale level which has a corresponding dot pattern that comprises that dot.
[0014] Each mask may have threshold values corresponding to all grayscale levels of a quantization grayscale range corresponding to the second number of bits per color.
[0015] The plurality of stacked dot patterns may correspond to all grayscale levels of the quantization grayscale range.
[0016] The dots in the dot pattern of each mask may have a blue-noise property.
[0017] A sum of the dot patterns of the masks may have a blue-noise property.
[0018] The plurality of images may be generated by satisfying a temporal stacking constraint, and the temporal stacking constraint may allow the plurality of images to have the luminosity within a threshold range.
[0019] The display may have the second number of bits per color.
[0020] In an embodiment, the masks may be available at a same time for a process of generating the plurality of images, and a method may comprise:
[0021] determining one or more quantization errors based on one or more color values of the target image and one or more threshold values associated with one of the masks; and
[0022] dithering the one or more quantization errors temporally to one or more images without using an error buffer.
[0023] In an embodiment, a method may comprise:
[0024] generating a seed mask comprising threshold values covering a quantization grayscale range;
[0025] storing the seed mask in a storage media; and
[0026] accessing the seed mask from the storage media, wherein the plurality of masks are generated from the seed mask based on a cyclical relationship.
[0027] The quantization grayscale range may have a plurality of evenly-placed grayscale levels.
[0028] The quantization grayscale range may have a plurality of unevenly-placed grayscale levels.
[0029] In an embodiment, a method may comprise:
[0030] determining a grayscale limit based on a maximum grayscale level and a number of images for representing the target image.
[0031] When a target grayscale value associated with the target image is smaller than the grayscale limit, corresponding regions of the plurality of images may comprise non-overlapping sets of pixels from each other.
[0032] When a target grayscale value associated with the target image is greater than the grayscale limit, corresponding regions of the plurality of images may comprise overlapping sets of pixels, and wherein the overlapping sets of pixels are determined by incrementally selecting dots from at least another mask of the masks.
[0033] An average grayscale value of a target region of the target image may be used as a target grayscale value, and each of the plurality of masks may have a same size to the target region of the target image.
[0034] The plurality of images may be generated by repeatedly applying a corresponding mask to the target image.
[0035] In an embodiment, one or more computer-readable non-transitory storage media may embody software that is operable when executed to:
[0036] receive a target image with a first number of bits per color;
[0037] access masks that each comprises dots associated with a grayscale range, wherein a subset of the dots associated with each of the masks is associated with a subrange of the grayscale range, wherein dots within the subsets of dots associated with the masks have different positions;
[0038] generate a plurality of images based on the target image and the masks, wherein each of the plurality of images has a second number of bits per color smaller than the first number of bits per color; and
[0039] display the plurality of images sequentially on a display for representing the target image.
[0040] In an embodiment, a system may comprise: one or more non-transitory computer-readable storage media embodying instructions; and one or more processors coupled to the storage media and operable to execute the instructions to:
[0041] receive a target image with a first number of bits per color;
[0042] access masks that each comprises dots associated with a grayscale range, wherein a subset of the dots associated with each of the masks is associated with a subrange of the grayscale range, wherein dots within the subsets of dots associated with the masks have different positions;
[0043] generate a plurality of images based on the target image and the masks, wherein each of the plurality of images has a second number of bits per color smaller than the first number of bits per color; and
[0044] display the plurality of images sequentially on a display for representing the target image.
BRIEF DESCRIPTION OF THE DRAWINGS
[0045] FIG. 1A illustrates an example artificial reality system.
[0046] FIG. 1B illustrates an example augmented reality system.
[0047] FIG. 1C illustrates an example architecture of a display engine.
[0048] FIG. 1D illustrates an example graphic pipeline of the display engine for generating display image data.
[0049] FIG. 2A illustrates an example scanning waveguide display.
[0050] FIG. 2B illustrates an example scanning operation of the scanning waveguide display.
[0051] FIG. 3A illustrates an example 2D micro-LED waveguide display.
[0052] FIG. 3B illustrates an example waveguide configuration for the 2D micro-LED waveguide display.
[0053] FIG. 4A illustrates an example target image to be represented by a series of subframe images with less color depth.
[0054] FIGS. 4B-D illustrate example subframe images generated using segmented quantization and spatio dithering method to represent the target image of FIG. 4A.
[0055] FIG. 5A illustrates an example dithering mask based on dot patterns with blue-noise properties and satisfying spatio stacking constraints.
[0056] FIGS. 5B-D illustrate example dot patterns for grayscale level 1, 8, and 32 in a grayscale level range of [0, 255].
[0057] FIGS. 6A-D illustrate example dot patterns of four dithering masks for generating temporal subframe images satisfying spatio and temporal stacking constraints.
[0058] FIG. 6E illustrates a dot pattern generated by stacking the dot patterns of the four dithering masks as shown in FIGS. 6A-D.
[0059] FIGS. 7A-D illustrate four example dithering masks satisfying both spatio and temporal stacking constraints.
[0060] FIG. 8A illustrates an example target image to be represented by a series of subframe images with less gray level.
[0061] FIGS. 8B-E illustrate four example subframe images generated using the mask-based spatio-temporal dithering method.
[0062] FIG. 9 illustrates an example method for using a mask-based dithering method to generate a series of subframe images to represent a target image.
[0063] FIG. 10 illustrates an example computer system.
DESCRIPTION OF EXAMPLE EMBODIMENTS
[0064] The number of available bits in a display may limit the display’s color depth or gray scale level. Displays with limited color depth or gray scale level may use spatio dithering to generate the illusion of increased color depth or gray scale level, for example, by spreading quantization errors to neighboring pixels. To further increase the color depth or gray scale level, displays may generate a series of temporal subframe images with fewer gray level bits to give the illusion of a target image which has more gray level bits. Each subframe image may be generated using dithering techniques (e.g., spatio-temporal dithering methods). However, these dithering techniques may need an error buffer to provide temporal feedback, and therefore use more memory space.
[0065] To reduce the memory usage related to processes of generating subframe images, particular embodiments of the system may use a number of dithering masks to generate a series of subframe images with even luminance distribution across all subframe images to represent a target image. For generating N subframe images, the system may generate a dithering mask for each subframe image. Each dithering mask may include a number of dot patterns with each dot pattern having a dot density corresponding to a grayscale level within the quantization range (e.g., 0-255 grayscale levels for 8-bit display). The dot patterns may be generated based on blue-noise distribution and satisfy the spatial stacking property. For example, the dot pattern for a given grayscale level may include the dot patterns for all lower grayscale levels. The dithering mask may include the dot patterns corresponding to all grayscale levels of the quantization range. Each dot in the dithering mask may correspond to a threshold value equal to the lowest grayscale level allowing that dot to be included in a dot pattern. The system may generate the subframe images based on the dithering masks without using an error buffer.
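As a rough illustration of generating subframes from masks without an error buffer, the Python sketch below thresholds an 8-bit target tile against one mask per subframe. It assumes 1-bit subframes, a hypothetical 2x2 seed mask, and masks obtained by offsetting the seed thresholds cyclically by 64; none of these values come from this disclosure.

```python
def render_subframes(target, masks):
    """Threshold an 8-bit target tile against each mask to get 1-bit subframes.

    Each output pixel depends only on the target pixel and the mask threshold
    at the same position, so no quantization error is stored or carried over.
    """
    return [
        [
            [1 if pixel > threshold else 0 for pixel, threshold in zip(t_row, m_row)]
            for t_row, m_row in zip(target, mask)
        ]
        for mask in masks
    ]


# Hypothetical 2x2 seed mask and four masks offset cyclically by 64 levels.
SEED = [[0, 128], [192, 64]]
MASKS = [[[(t - 64 * n) % 256 for t in row] for row in SEED] for n in range(4)]

# Flat target tile at grayscale 100 out of 255.
TARGET = [[100, 100], [100, 100]]

subframes = render_subframes(TARGET, MASKS)
lit_per_subframe = [sum(map(sum, frame)) for frame in subframes]
```

In this example every subframe lights the same number of pixels, which is the even luminance distribution the paragraph describes; the coarse 2x2 mask limits how closely the temporal average can match the target.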
[0066] Particular embodiments of the system improve the efficiency of AR/VR displays by generating the temporal subframe images without using an error buffer, which reduces the related memory usage. Particular embodiments of the system provide better image quality and improve user experience for AR/VR displays by using multiple subframe images with less color depth to represent an image with greater color depth. Particular embodiments of the system generate subframe images with more even luminance distribution across the subframe images for representing the target image and eliminate temporal artifacts such as flashes or uneven luminance over time in AR/VR displays when the user’s eye and head positions change between the subframe images. Particular embodiments of the system allow AR/VR display systems to reduce the space and complexity of pixel circuits by having fewer gray level bits, and therefore to miniaturize the display system. Particular embodiments of the system make it possible for AR/VR displays to operate in monochrome mode with digital pixel circuits, eliminating analog pixel circuits for full RGB operations.
[0067] FIG. 1A illustrates an example artificial reality system 100A. In particular embodiments, the artificial reality system 100A may comprise a headset 104, a controller 106, and a computing system 108. A user 102 may wear the headset 104 that may display visual artificial reality content to the user 102. The headset 104 may include an audio device that may provide audio artificial reality content to the user 102. The headset 104 may include one or more cameras which can capture images and videos of environments. The headset 104 may include an eye tracking system to determine the vergence distance of the user 102. The headset 104 may be referred to as a head-mounted display (HMD). The controller 106 may comprise a trackpad and one or more buttons. The controller 106 may receive inputs from the user 102 and relay the inputs to the computing system 108. The controller 106 may also provide haptic feedback to the user 102. The computing system 108 may be connected to the headset 104 and the controller 106 through cables or wireless connections. The computing system 108 may control the headset 104 and the controller 106 to provide the artificial reality content to and receive inputs from the user 102. The computing system 108 may be a standalone host computer system, an on-board computer system integrated with the headset 104, a mobile device, or any other hardware platform capable of providing artificial reality content to and receiving inputs from the user 102.
[0068] FIG. 1B illustrates an example augmented reality system 100B. The augmented reality system 100B may include a head-mounted display (HMD) 110 (e.g., glasses) comprising a frame 112, one or more displays 114, and a computing system 120. The displays 114 may be transparent or translucent allowing a user wearing the HMD 110 to look through the displays 114 to see the real world and displaying visual artificial reality content to the user at the same time. The HMD 110 may include an audio device that may provide audio artificial reality content to users. The HMD 110 may include one or more cameras which can capture images and videos of environments. The HMD 110 may include an eye tracking system to track the vergence movement of the user wearing the HMD 110. The augmented reality system 100B may further include a controller comprising a trackpad and one or more buttons. The controller may receive inputs from users and relay the inputs to the computing system 120. The controller may also provide haptic feedback to users. The computing system 120 may be connected to the HMD 110 and the controller through cables or wireless connections. The computing system 120 may control the HMD 110 and the controller to provide the augmented reality content to and receive inputs from users. The computing system 120 may be a standalone host computer system, an on-board computer system integrated with the HMD 110, a mobile device, or any other hardware platform capable of providing artificial reality content to and receiving inputs from users.
[0069] FIG. 1C illustrates an example architecture 100C of a display engine 130. In particular embodiments, the processes and methods as described in this disclosure may be embodied or implemented within a display engine 130 (e.g., in the display block 135). The display engine 130 may include, for example, but is not limited to, a texture memory 132, a transform block 133, a pixel block 134, a display block 135, input data bus 131, output data bus 142, etc. In particular embodiments, the display engine 130 may include one or more graphic pipelines for generating images to be rendered on the display. For example, the display engine may use the graphic pipeline(s) to generate a series of subframe images based on a mainframe image and a viewpoint or view angle of the user as measured by one or more eye tracking sensors. The mainframe image may be generated and/or loaded into the system at a mainframe rate of 30-90 Hz and the subframe images may be generated at a subframe rate of 1-2 kHz. In particular embodiments, the display engine 130 may include two graphic pipelines for the user’s left and right eyes. One of the graphic pipelines may include or may be implemented on the texture memory 132, the transform block 133, the pixel block 134, the display block 135, etc. The display engine 130 may include another set of transform block, pixel block, and display block for the other graphic pipeline. The graphic pipeline(s) may be controlled by a controller or control block (not shown) of the display engine 130. In particular embodiments, the texture memory 132 may be included within the control block or may be a memory unit external to the control block but local to the display engine 130. One or more of the components of the display engine 130 may be configured to communicate via a high-speed bus, shared memory, or any other suitable methods. This communication may include transmission of data as well as control signals, interrupts and/or other instructions. For example, the texture memory 132 may be configured to receive image data through the input data bus 131. As another example, the display block 135 may send the pixel values to the display system 140 through the output data bus 142. In particular embodiments, the display system 140 may include three color channels (e.g., 114A, 114B, 114C) with respective display driver ICs (DDIs) of 142A, 142B, and 143B. In particular embodiments, the display system 140 may include, for example, but is not limited to, light-emitting diode (LED) displays, organic light-emitting diode (OLED) displays, active matrix organic light-emitting diode (AMOLED) displays, liquid crystal display (LCD), micro light-emitting diode (μLED) display, electroluminescent displays (ELDs), or any suitable displays.
[0070] In particular embodiments, the display engine 130 may include a control block (not shown). The control block may receive data and control packets, such as position data and surface information, from controllers external to the display engine 130 through one or more data buses. For example, the control block may receive input stream data from a body wearable computing system. The input data stream may include a series of mainframe images generated at a mainframe rate of 30-90 Hz. The input stream data including the mainframe images may be converted to the required format and stored into the texture memory 132. In particular embodiments, the control block may receive input from the body wearable computing system and initialize the graphic pipelines in the display engine to prepare and finalize the image data for rendering on the display. The data and control packets may include information related to, for example, one or more surfaces including texel data, position data, and additional rendering instructions. The control block may distribute data as needed to one or more other blocks of the display engine 130. The control block may initiate the graphic pipelines for processing one or more frames to be displayed. In particular embodiments, the graphic pipelines for the two eye display systems may each include a control block or share the same control block.
[0071] In particular embodiments, the transform block 133 may determine initial visibility information for surfaces to be displayed in the artificial reality scene. In general, the transform block 133 may cast rays from pixel locations on the screen and produce filter commands (e.g., filtering based on bilinear or other types of interpolation techniques) to send to the pixel block 134. The transform block 133 may perform ray casting from the current viewpoint of the user (e.g., determined using the headset’s inertial measurement units, eye tracking sensors, and/or any suitable tracking/localization algorithms, such as simultaneous localization and mapping (SLAM)) into the artificial scene where surfaces are positioned and may produce tile/surface pairs 144 to send to the pixel block 134. In particular embodiments, the transform block 133 may include a four-stage pipeline as follows. A ray caster may issue ray bundles corresponding to arrays of one or more aligned pixels, referred to as tiles (e.g., each tile may include 16×16 aligned pixels). The ray bundles may be warped, before entering the artificial reality scene, according to one or more distortion meshes. The distortion meshes may be configured to correct geometric distortion effects stemming from, at least, the eye display systems of the headset system. The transform block 133 may determine whether each ray bundle intersects with surfaces in the scene by comparing a bounding box of each tile to bounding boxes for the surfaces. If a ray bundle does not intersect with an object, it may be discarded. After the tile-surface intersections are detected, the corresponding tile/surface pairs may be passed to the pixel block 134.
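The bounding-box comparison described above can be sketched as follows. This is a minimal illustration, not the transform block's actual logic: it assumes the surfaces have already been projected to axis-aligned screen-space boxes, it omits the distortion-mesh warping, and the names `Box` and `tile_surface_pairs` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

TILE_SIZE = 16  # pixels per tile side, matching the 16x16 example in the text


@dataclass(frozen=True)
class Box:
    """Axis-aligned screen-space box: inclusive minimum, exclusive maximum."""
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap only if they overlap on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


def tile_surface_pairs(width: int, height: int,
                       surfaces: List[Box]) -> List[Tuple[Tuple[int, int], int]]:
    """Return ((tile_x, tile_y), surface_index) for every overlapping pair.

    Tiles whose bounding box touches no surface produce no pair, which
    corresponds to discarding the ray bundle.
    """
    pairs = []
    for ty in range(0, height, TILE_SIZE):
        for tx in range(0, width, TILE_SIZE):
            tile = Box(tx, ty, tx + TILE_SIZE, ty + TILE_SIZE)
            for index, surface in enumerate(surfaces):
                if tile.overlaps(surface):
                    pairs.append(((tx // TILE_SIZE, ty // TILE_SIZE), index))
    return pairs
```

For a 32×32 screen, a surface box spanning (10, 10) to (20, 20) straddles all four tiles, while a box entirely off screen yields no pairs.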
[0072] In particular embodiments, the pixel block 134 may determine color values or grayscale values for the pixels based on the tile-surface pairs. The color values for each pixel may be sampled from the texel data of surfaces received and stored in texture memory 132. The pixel block 134 may receive tile-surface pairs from the transform block 133 and may schedule bilinear filtering using one or more filter blocks. For each tile-surface pair, the pixel block 134 may sample color information for the pixels within the tile using color values corresponding to where the projected tile intersects the surface. The pixel block 134 may determine pixel values based on the retrieved texels (e.g., using bilinear interpolation). In particular embodiments, the pixel block 134 may process the red, green, and blue color components separately for each pixel. In particular embodiments, the display may include two pixel blocks for the two eye display systems. The two pixel blocks of the two eye display systems may work independently and in parallel with each other. The pixel block 134 may then output its color determinations (e.g., pixels 138) to the display block 135. In particular embodiments, the pixel block 134 may composite two or more surfaces into one surface when the two or more surfaces have overlapping areas. A composed surface may need fewer computational resources (e.g., computational units, memory, power, etc.) for the resampling process.
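The bilinear interpolation mentioned above can be illustrated with a short sketch for a single color channel. The function name, the clamping behavior at the texture border, and the row-major texel layout are assumptions made for this example; the disclosure does not specify them.

```python
import math
from typing import Sequence


def bilinear_sample(texels: Sequence[Sequence[float]], u: float, v: float) -> float:
    """Sample one color channel at continuous texel coordinates (u, v).

    u indexes columns and v indexes rows. Coordinates outside the grid are
    clamped to the nearest edge (an assumption for this sketch).
    """
    height = len(texels)
    width = len(texels[0])
    u = min(max(u, 0.0), width - 1)
    v = min(max(v, 0.0), height - 1)

    x0 = int(math.floor(u))
    y0 = int(math.floor(v))
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)
    fx = u - x0  # horizontal weight of the right-hand texels
    fy = v - y0  # vertical weight of the lower texels

    top = texels[y0][x0] * (1 - fx) + texels[y0][x1] * fx
    bottom = texels[y1][x0] * (1 - fx) + texels[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

Processing the red, green, and blue components separately, as described above, would amount to calling such a function once per channel.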
[0073] In particular embodiments, the display block 135 may receive pixel color values from the pixel block 134, convert the format of the data to be more suitable for the scanline output of the display, apply one or more brightness corrections to the pixel color values, and prepare the pixel color values for output to the display. In particular embodiments, the display block 135 may include a row buffer and may process and store the pixel data received from the pixel block 134. The pixel data may be organized in quads (e.g., 2×2 pixels per quad) and tiles (e.g., 16×16 pixels per tile). The display block 135 may convert tile-order pixel color values generated by the pixel block 134 into scanline or row-order data, which may be required by the physical displays. The brightness corrections may include any required brightness correction, gamma mapping, and dithering. The display block 135 may output the corrected pixel color values directly to the driver of the physical display (e.g., pupil display) or may output the pixel values to a block external to the display engine 130 in a variety of formats. For example, the eye display systems of the headset system may include additional hardware or software to further customize backend color processing, to support a wider interface to the display, or to optimize display speed or fidelity.
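The tile-order to row-order conversion described above can be sketched as below. This is an illustrative assumption about the data layout, not the display block's actual design: tiles are taken to arrive in raster order (left to right, then top to bottom), each tile is a list of pixel rows, and the function name is hypothetical.

```python
from typing import List

Tile = List[List[int]]  # tile_size rows, each holding tile_size pixel values


def tiles_to_scanlines(tiles: List[Tile], tiles_per_row: int,
                       tile_size: int = 16) -> List[List[int]]:
    """Reorder tile-order pixel data into full-width scanlines.

    One complete row of tiles has to be available before its scanlines can
    be emitted, which is the role a row buffer would play.
    """
    if tiles_per_row <= 0 or len(tiles) % tiles_per_row != 0:
        raise ValueError("tile count must be a multiple of tiles_per_row")

    scanlines = []
    for first in range(0, len(tiles), tiles_per_row):
        tile_row = tiles[first:first + tiles_per_row]
        for r in range(tile_size):
            line: List[int] = []
            for tile in tile_row:
                line.extend(tile[r])  # append row r of each tile, left to right
            scanlines.append(line)
    return scanlines
```

With 2×2 tiles arranged two per row, four tiles become four scanlines of four pixels each.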
[0074] In particular embodiments, the dithering methods and processes (e.g., spatial dithering method, temporal dithering methods, and spatio-temporal methods) as described in this disclosure may be embodied or implemented in the display block 135 of the display engine 130. In particular embodiments, the display block 135 may include a model-based dithering algorithm or a dithering model for each color channel and send the dithered results of the respective color channels to the respective display driver ICs (e.g., 142A, 142B, 142C) of display system 140. In particular embodiments, before sending the pixel values to the respective display driver ICs (e.g., 142A, 142B, 142C), the display block 135 may further include one or more algorithms for correcting, for example, pixel non-uniformity, LED non-ideality, waveguide non-uniformity, display defects (e.g., dead pixels), etc.
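To make the idea of spatio-temporal dithering concrete, the sketch below reduces an 8-bit grayscale image to several 4-bit subframes using a different threshold mask per subframe. It is a generic ordered-dither illustration under stated assumptions and does not reproduce the mask design of this disclosure: the 2×2 masks, their threshold values, the bit depths, and all function names are hypothetical.

```python
from typing import List

Image = List[List[int]]
Mask = List[List[float]]


def make_shifted_masks() -> List[Mask]:
    """Build four hypothetical 2x2 threshold masks.

    Each mask holds the same four thresholds in a different arrangement, so
    every pixel position meets every threshold exactly once over four subframes.
    """
    thresholds = [0.125, 0.625, 0.875, 0.375]
    masks = []
    for shift in range(4):
        t = thresholds[shift:] + thresholds[:shift]
        masks.append([[t[0], t[1]], [t[2], t[3]]])
    return masks


def dither_subframes(image: Image, masks: List[Mask],
                     in_bits: int = 8, out_bits: int = 4) -> List[Image]:
    """Return one reduced-bit-depth subframe per mask."""
    max_in = (1 << in_bits) - 1
    max_out = (1 << out_bits) - 1
    frames = []
    for mask in masks:
        mask_h, mask_w = len(mask), len(mask[0])
        frame = []
        for y, row in enumerate(image):
            out_row = []
            for x, value in enumerate(row):
                ideal = value * max_out / max_in  # fractional output level
                base = int(ideal)                 # lower quantization level
                residual = ideal - base           # quantization error in [0, 1)
                # Round up only where the residual exceeds the tiled threshold.
                bump = 1 if residual > mask[y % mask_h][x % mask_w] else 0
                out_row.append(min(base + bump, max_out))
            frame.append(out_row)
        frames.append(frame)
    return frames


def temporal_average(frames: List[Image], x: int, y: int) -> float:
    """Average of one pixel across all subframes."""
    return sum(frame[y][x] for frame in frames) / len(frames)
```

For a uniform input of 128, the ideal 4-bit level is about 7.53. Each subframe contains only the levels 7 and 8, and each pixel averages 7.5 over the four subframes, so the quantization error is spread across both space and time.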
[0075] In particular embodiments, graphics applications (e.g., games, maps, content-providing apps, etc.) may build a scene graph, which is used together with a given view position and point in time to generate primitives to render on a GPU or display engine. The scene graph may define the logical and/or spatial relationship between objects in the scene. In particular embodiments, the display engine 130 may also generate and store a scene graph that is a simplified form of the full application scene graph. The simplified scene graph may be used to specify the logical and/or spatial relationships between surfaces (e.g., the primitives rendered by the display engine 130, such as quadrilaterals or contours, defined in 3D space, that have corresponding textures generated based on the mainframe rendered by the application). Storing a scene graph allows the display engine 130 to render the scene to multiple display frames and to adjust each element in the scene graph for the current viewpoint (e.g., head position), the current object positions (e.g., they could be moving relative to each other) and other factors that change per display frame. In addition, based on the scene graph, the display engine 130 may also adjust for the geometric and color distortion introduced by the display subsystem and then composite the objects together to generate a frame. Storing a scene graph allows the display engine 130 to approximate the result of doing a full render at the desired high frame rate, while actually running the GPU or display engine 130 at a significantly lower rate.
[0076] FIG. 1D illustrates an example graphic pipeline 100D of the display engine 130 for generating display image data. In particular embodiments, the graphic pipeline 100D may include a visibility step 152, where the display engine 130 may determine the visibility of one or more surfaces received from the body wearable computing system. The visibility step 152 may be performed by the transform block (e.g., 133 in FIG. 1C) of the display engine 130. The display engine 130 may receive (e.g., by a control block or a controller) input data 151 from the body wearable computing system. The input data 151 may include one or more surfaces, texel data, position data, RGB data, and rendering instructions from the body wearable computing system. The input data 151 may include mainframe images at 30-90 frames per second (FPS). The mainframe image may have a color depth of, for example, 24 bits per pixel. The display engine 130 may process and save the received input data 151 in the texture memory 132. The received data may be passed to the transform block 133, which may determine the visibility information for surfaces to be displayed. The transform block 133 may cast rays for pixel locations on the screen and produce filter commands (e.g., filtering based on bilinear or other types of interpolation techniques) to send to the pixel block 134. The transform block 133 may perform ray casting from the current viewpoint of the user (e.g., determined using the headset’s inertial measurement units, eye trackers, and/or any suitable tracking/localization algorithms, such as simultaneous localization and mapping (SLAM)) into the artificial scene where surfaces are positioned and produce surface-tile pairs to send to the pixel block 134.
[0077] In particular embodiments, the graphic pipeline 100D may include a resampling step 153, where the display engine 130 may determine the color values from the tile-surface pairs to produce pixel color values. The resampling step 153 may be performed by the pixel block (e.g., 134 in FIG. 1C) of the display engine 130. The pixel block 134 may receive tile-surface pairs from the transform block 133 and may schedule bilinear filtering. For each tile-surface pair, the pixel block 134 may sample color information for the pixels within the tile using color values corresponding to where the projected tile intersects the surface. The pixel block 134 may determine pixel values based on the retrieved texels (e.g., using bilinear interpolation) and output the determined pixel values to the respective display block 135.
[0078] In particular embodiments, the graphic pipeline 100D may include a blend step 154, a correction and dithering step 155, a serialization step 156, etc. In particular embodiments, the blend step 154, the correction and dithering step 155, and the serialization step 156 may be performed by the display block (e.g., 135 in FIG. 1C) of the display engine 130. The display engine 130 may blend the display content for display content rendering, apply one or more brightness corrections to the pixel color values, perform one or more dithering algorithms for dithering the quantization errors both spatially and temporally, serialize the pixel values for scanline output for the physical display, and generate the display data 159 suitable for the display system 140. The display engine 130 may send the display data 159 to the display system 140. In particular embodiments, the display system 140 may include three display driver ICs (e.g., 142A, 142B, 142C) for the pixels of the three color channels of RGB (e.g., 144A, 144B, 144C).
[0079] FIG. 2A illustrates an example scanning waveguide display 200A. In particular embodiments, the head-mounted display (HMD) of the AR/VR system may include a near eye display (NED) which may be a scanning waveguide display 200A. The scanning waveguide display 200A may include a light source assembly 210, an output waveguide 204, a controller 216, etc. The scanning waveguide display 200A may provide images for both eyes or for a single eye. For purposes of illustration, FIG. 2A shows the scanning waveguide display 200A associated with a single eye 202. Another scanning waveguide display (not shown) may provide image light to the other eye of the user and the two scanning waveguide displays may share one or more components or may be separated. The light source assembly 210 may include a light source 212 and an optics system 214. The light source 212 may include an optical component that could generate image light using an array of light emitters. The light source 212 may generate image light including, for example, but not limited to, red image light, blue image light, green image light, infra-red image light, etc. The optics system 214 may perform a number of optical processes or operations on the image light generated by the light source 212. The optical processes or operations performed by the optics system 214 may include, for example, but are not limited to, light focusing, light combining, light conditioning, scanning, etc.
[0080] In particular embodiments, the optics system 214 may include a light combining assembly, a light conditioning assembly, a scanning mirror assembly, etc. The light source assembly 210 may generate and output an image light 219 to a coupling element 218 of the output waveguide 204. The output waveguide 204 may be an optical waveguide that could output image light to the user eye 202. The output waveguide 204 may receive the image light 219 at one or more coupling elements 218 and guide the received image light to one or more decoupling elements 206. The coupling element 218 may be, for example, but is not limited to, a diffraction grating, a holographic grating, any other suitable elements that can couple the image light 219 into the output waveguide 204, or a combination thereof. As an example and not by way of limitation, if the coupling element 218 is a diffraction grating, the pitch of the diffraction grating may be chosen to allow the total internal reflection to occur and the image light 219 to propagate internally toward the decoupling element 206. The pitch of the diffraction grating may be in the range of 300 nm to 600 nm. The decoupling element 206 may decouple the total internally reflected image light from the output waveguide 204. The decoupling element 206 may be, for example, but is not limited to, a diffraction grating, a holographic grating, any other suitable element that can decouple image light out of the output waveguide 204, or a combination thereof. As an example and not by way of limitation, if the decoupling element 206 is a diffraction grating, the pitch of the diffraction grating may be chosen to cause incident image light to exit the output waveguide 204. The pitch of the diffraction grating may be in the range of 300 nm to 600 nm. The orientation and position of the image light exiting from the output waveguide 204 may be controlled by changing the orientation and position of the image light 219 entering the coupling element 218.
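As a rough check on how a grating pitch relates to total internal reflection, the sketch below applies the standard grating equation for light entering a waveguide from air, n·sin(θ_m) = sin(θ_i) + m·λ/Λ, and compares the diffracted angle with the critical angle. The refractive index of 1.5, the 530 nm wavelength, normal incidence, and the function names are assumptions chosen for illustration; none of them come from this disclosure.

```python
import math
from typing import Optional


def diffracted_angle_deg(wavelength_nm: float, pitch_nm: float, n_waveguide: float,
                         incidence_deg: float = 0.0, order: int = 1) -> Optional[float]:
    """Angle of the diffracted order inside the waveguide, in degrees.

    Uses n * sin(theta_m) = sin(theta_i) + m * wavelength / pitch for light
    arriving from air. Returns None when the order does not propagate.
    """
    s = (math.sin(math.radians(incidence_deg))
         + order * wavelength_nm / pitch_nm) / n_waveguide
    if abs(s) > 1.0:
        return None  # evanescent order: no propagating beam inside the waveguide
    return math.degrees(math.asin(s))


def critical_angle_deg(n_waveguide: float, n_outside: float = 1.0) -> float:
    """Smallest internal angle that is totally internally reflected."""
    return math.degrees(math.asin(n_outside / n_waveguide))


def is_guided(wavelength_nm: float, pitch_nm: float, n_waveguide: float,
              incidence_deg: float = 0.0, order: int = 1) -> bool:
    """True if the diffracted order propagates and exceeds the critical angle."""
    theta = diffracted_angle_deg(wavelength_nm, pitch_nm, n_waveguide,
                                 incidence_deg, order)
    return theta is not None and abs(theta) > critical_angle_deg(n_waveguide)
```

Under these assumed values, a 400 nm pitch diffracts green light to about 62 degrees, beyond the roughly 41.8 degree critical angle, so the light is guided. A 600 nm pitch gives about 36 degrees (not guided), and a 300 nm pitch yields no propagating first order. The usable pitch therefore depends on the wavelength, incidence angle, and waveguide material.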
[0081] In particular embodiments, the output waveguide 204 may be composed of one or more materials that can facilitate total internal reflection of the image light 219. The output waveguide 204 may be composed of one or more materials including, for example, but not limited to, silicon, plastic, glass, polymers, or some combination thereof. The output waveguide 204 may have a relatively small form factor. As an example and not by way of limitation, the output waveguide 204 may be approximately 50 mm wide along X-dimension, 30 mm long along Y-dimension and 0.5-1 mm thick along Z-dimension. The controller 216 may control the scanning operations of the light source assembly 210. The controller 216 may determine scanning instructions for the light source assembly 210 based at least on the one or more display instructions for rendering one or more images. The display instructions may include an image file (e.g., bitmap) and may be received from, for example, a console or computer of the AR/VR system. Scanning instructions may be used by the light source assembly 210 to generate image light 219. The scanning instructions may include, for example, but are not limited to, an image light source type (e.g., monochromatic source, polychromatic source), a scanning rate, a scanning apparatus orientation, one or more illumination parameters, or some combination thereof. The controller 216 may include a combination of hardware, software, firmware, or any suitable components supporting the functionality of the controller 216.
[0082] FIG. 2B illustrates an example scanning operation of a scanning waveguide display 200B. The light source 220 may include an array of light emitters 222 (as represented by the dots in the inset) with multiple rows and columns. The light 223 emitted by the light source 220 may include a set of collimated beams of light emitted by each column of light emitters 222. Before reaching the mirror 224, the light 223 may be conditioned by different optical devices such as the conditioning assembly (not shown). The mirror 224 may reflect and project the light 223 from the light source 220 to the image field 227 by rotating about an axis 225 during scanning operations. The mirror 224 may be a microelectromechanical system (MEMS) mirror or any other suitable mirror. As the mirror 224 rotates about the axis 225, the light 223 may be projected to a different part of the image field 227, as illustrated by the reflected part of the light 226A in solid lines and the reflected part of the light 226B in dashed lines.
[0083] In particular embodiments, the image field 227 may receive the light 226A-B as the mirror 224 rotates about the axis 225 to project the light 226A-B in different directions. For example, the image field 227 may correspond to a portion of the coupling element 218 or a portion of the decoupling element 206 in FIG. 2A. In particular embodiments, the image field 227 may include a surface of the coupling element 206. The image formed on the image field 227 may be magnified as light travels through the output waveguide 204. In particular embodiments, the image field 227 may not include an actual physical structure but include an area to which the image light is projected to form the images. The image field 227 may also be referred to as a scan field. When the light 223 is projected to an area of the image field 227, the area of the image field 227 may be illuminated by the light 223. The image field 227 may include a matrix of pixel locations 229 (represented by the blocks in inset 228) with multiple rows and columns. Each pixel location 229 may be spatially defined in the area of the image field 227, with each pixel location corresponding to a single pixel. In particular embodiments, the pixel locations 229 (or the pixels) in the image field 227 may not include individual physical pixel elements. Instead, the pixel locations 229 may be spatial areas that are defined within the image field 227 and divide the image field 227 into pixels. The sizes and locations of the pixel locations 229 may depend on the projection of the light 223 from the light source 220. For example, at a given rotation angle of the mirror 224, light beams emitted from the light source 220 may fall on an area of the image field 227. As such, the sizes and locations of pixel locations 229 of the image field 227 may be defined based on the location of each projected light beam. In particular embodiments, a pixel location 229 may be subdivided spatially into subpixels (not shown). 
For example, a pixel location 229 may include a red subpixel, a green subpixel, and a blue subpixel. The red, green and blue subpixels may correspond to respective locations at which one or more red, green and blue light beams are projected. In this case, the color of a pixel may be based on the temporal and/or spatial average of the pixel’s subpixels.
[0084] In particular embodiments, the light emitters 222 may illuminate a portion of the image field 227 (e.g., a particular subset of multiple pixel locations 229 on the image field 227) with a particular rotation angle of the mirror 224. In particular embodiments, the light emitters 222 may be arranged and spaced such that a light beam from each of the light emitters 222 is projected on a corresponding pixel location 229. In particular embodiments, the light emitters 222 may include a number of light-emitting elements (e.g., micro-LEDs) to allow the light beams from a subset of the light emitters 222 to be projected to a same pixel location 229. In other words, a subset of multiple light emitters 222 may collectively illuminate a single pixel location 229 at a time. As an example and not by way of limitation, a group of light emitters including eight light-emitting elements may be arranged in a line to illuminate a single pixel location 229 with the mirror 224 at a given orientation angle.
[0085] In particular embodiments, the number of rows and columns of light emitters 222 of the light source 220 may or may not be the same as the number of rows and columns of the pixel locations 229 in the image field 227. In particular embodiments, the number of light emitters 222 in a row may be equal to the number of pixel locations 229 in a row of the image field 227 while the light emitters 222 may have fewer columns than the number of pixel locations 229 of the image field 227. In particular embodiments, the light source 220 may have the same number of columns of light emitters 222 as the number of columns of pixel locations 229 in the image field 227 but fewer rows. As an example and not by way of limitation, the light source 220 may have about 1280 columns of light emitters 222, which may be the same as the number of columns of pixel locations 229 of the image field 227, but only a handful of rows of light emitters 222. The light source 220 may have a first length L1 measured from the first row to the last row of light emitters 222. The image field 227 may have a second length L2, measured from the first row (e.g., Row 1) to the last row (e.g., Row P) of the image field 227. L2 may be greater than L1 (e.g., L2 is 50 to 10,000 times greater than L1).
……
……
……