Apple Patent | Adaptive Transfer Functions

Editor: 映维 | Category: Apple | April 2, 2020

Publication Number: 20200105226

Publication Date: 20200402

Applicants: Apple

Abstract

The disclosed techniques use a display device, optionally including optical and/or non-optical sensors providing information about the ambient environment of the display device–along with knowledge of the content that is being displayed–to predict a viewer of the display device’s current visual adaptation. Using the viewer’s predicted adaptation, the content displayed on the display device may be more optimally encoded. This encoding may be accomplished at encode time and may be performed in a display pipeline or, preferably, in the transfer function of the display itself–thereby reducing the precision required in the display pipeline. For example, in well-controlled scenarios where the viewer’s adaptation may be fully characterized, e.g., a viewer wearing a head-mounted display (HMD) device, the full dynamic range of the viewer’s perception may be encoded in 8 or 9 bits that are intelligently mapped to only the relevant display codes, given the viewer’s current predicted adaptation.

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This Application is related to U.S. application Ser. No. 12/968,541, entitled, “Dynamic Display Adjustment Based on Ambient Conditions,” filed Dec. 15, 2010, and issued Apr. 22, 2014 as U.S. Pat. No. 8,704,859, and which is hereby incorporated by reference in its entirety.

BACKGROUND

[0002] Today, consumer electronic devices incorporating display screens are used in a multitude of different environments with different lighting conditions, e.g., the office, the home, home theaters, inside head-mounted displays (HMD), and outdoors. Such devices typically need to be designed to incorporate sufficient display “codes” (i.e., the discrete numerical values used to quantize an encoded intensity of light for a given display pixel) to represent the dimmest representation of content all the way up to the most bright representation–with sufficient codes in between, such that, regardless of the user’s visual adaptation at any given moment in time, minimal (or, ideally, no) color banding would be perceivable to the viewer of the displayed content (including gradient content).

[0003] The full dynamic range of human vision covers many orders of magnitude in brightness. For example, the human eye can adapt to perceive dramatically different ranges of light intensity, ranging from being able to discern a landscape illuminated by star-light on a moonless night (e.g., 0.0001 lux ambient level) to the same landscape illuminated by full daylight (e.g., >10,000 lux ambient level), representing a range of illumination levels on the order of 100,000,000, or 10^8. Linearly encoding this range of human visual perception in binary fashion would require at least log2(100,000,000) bits, which rounds up to 27 bits (i.e., using 27 binary bits would allow for the encoding of 134,217,728, or 2^27, discrete brightness levels) to encode the magnitudes of the brightness levels–and potentially even additional bits to store codes to differentiate values within a doubling.

[0004] At any given visual adaptation to ambient illumination levels, approximately 2^9 brightness levels (i.e., shades of grey) can be distinguished when placed next to each other, such as in a gradient, to appear continuous. This indicates that roughly 27+9, or 36, bits would be needed to linearly code the full range of human vision.
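The bit counts in paragraphs [0003] and [0004] follow from simple arithmetic. A minimal Python sketch of that arithmetic, with variable names that are illustrative and not taken from the application:

```python
import math

# Ratio between full daylight and starlight illumination quoted in the text.
dynamic_range = 10 ** 8

# Bits needed to cover that many levels linearly, rounded up to a whole bit.
magnitude_bits = math.ceil(math.log2(dynamic_range))

# Roughly 2^9 distinguishable grey levels at any single adaptation state.
adaptation_bits = 9

total_linear_bits = magnitude_bits + adaptation_bits
print(magnitude_bits, 2 ** magnitude_bits, total_linear_bits)
```

The 36-bit figure is therefore an upper-bound estimate for purely linear coding, not a property of any particular display.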

[0005] It should be mentioned that, while linear coding of images is required mathematically by some operations, it provides a relatively poor representation for human vision, which, similarly to most other human senses, seems to differentiate most strongly between relative percentage differences in intensity (rather than absolute differences). Linear encoding tends to allocate too few codes (representing brightness values) to the dark ranges, where human visual acuity is best able to differentiate small differences leading to banding, and too many codes to brighter ranges, where visual acuity is less able to differentiate small differences leading to a wasteful allocation of codes in that range.

[0006] Hence, it is common to apply a gamma encoding (e.g., raising linear pixel values to an exponential power, such as 2.2, which is a simple approximation of visual brightness acuity) as a form of perceptual compressive encoding for content intended for visual consumption (but not intended for further editing). For example, camera RAW image data is intended for further processing and is correspondingly often stored linearly at a high bit-depth, but JPEG image data, intended for distribution, display and visual consumption, is gamma encoded using fewer bits.
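To illustrate why linear coding starves the dark range, the sketch below counts how many 8-bit codes decode to the darkest 1% of linear intensity under linear coding versus a 2.2 power-law decode. The function name `codes_below`, the 1% threshold, and the convention that a code `c` decodes to `(c / max_code) ** gamma` are assumptions made for this example only.

```python
def codes_below(threshold, bits=8, gamma=1.0):
    """Count codes whose decoded linear intensity (0..1) is <= threshold.

    Assumed convention: code c decodes to (c / max_code) ** gamma,
    so gamma=1.0 corresponds to plain linear coding.
    """
    max_code = (1 << bits) - 1
    return sum(
        1 for c in range(max_code + 1)
        if (c / max_code) ** gamma <= threshold
    )

linear_dark = codes_below(0.01, gamma=1.0)  # codes spent on the darkest 1%
gamma_dark = codes_below(0.01, gamma=2.2)
print(linear_dark, gamma_dark)
```

Under these assumptions the gamma-coded signal devotes roughly ten times as many codes to the darkest 1% of the range, which is the redistribution that paragraphs [0005] and [0006] describe.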

[0007] Generally speaking, content must be viewed in the environment it was authored in to appear correct, which is why motion picture/film content, which is intended to be viewed in a dark theater, is often edited in dark edit suites. When viewed in a different environment, say, content intended for viewing in a dark environment that is viewed in a bright environment instead, the user’s vision will be adapted to that bright environment, causing the content to appear to have too high of a contrast as compared to that content viewed in the intended environment, with the codes encoding the lowest brightness levels (i.e., those containing “shadow detail”) “crushed” to black, due to the user’s vision being adapted to a level where the dimmer codes are undifferentiable from black.

[0008] Classically, a gamma encoding optimized for a given environment, dynamic range of content, and dynamic range of display, was developed, such that the encoding and display codes were well spaced across the intended range, so that the content appears as intended (e.g., not banded, without crushed highlights, or blacks, and with the intended contrast–sometimes called tonality, etc.). The sRGB format’s 8-bit 2.2 gamma is an example of a representation deemed optimal for encoding SDR (standard dynamic range) content to be displayed on a 1/2.45 gamma rec.709 CRT and viewed in a bright office environment.

[0009] However, the example sRGB content, when displayed on its intended rec.709 display, will not have the correct appearance if/when viewed in another environment, either brighter or dimmer, causing the user’s adaptation–and thus their perception of the content–to be different from what was intended.

[0010] As the user’s adaptation changes from the adaptation implied by the suggested viewing environment, for instance to a brighter adaptation, low brightness details may become indistinguishable. At any given moment in time, however, based on current brightness adaptation, a human can only distinguish between roughly 2^8 to 2^9 different brightness levels of light (i.e., between 256-512 so-called “perceptual bins”). In other words, there is always a light intensity value, under which, a user cannot discern changes in light level, i.e., cannot distinguish between low light levels and “true black.” Moreover, the combination of current ambient conditions and the current visual adaptation of the user can also result in a scenario wherein the user loses the ability to view the darker values encoded in an image that the source content author both perceived and intended that the consumer of the content be able to perceive (e.g., a critical plot element in a film might involve a knife not quite hidden in the shadows).

[0011] As the user’s adaptation changes from the adaptation implied by the suggested viewing environment, the general perceived tonality of the image will change, as is described by the adaptive process known as “simultaneous contrast.”

[0012] As the user’s adaptation changes, perhaps because of a change in the ambient light, from the adaptation implied by the suggested viewing environment, the image may also appear to have an unintended color cast, as the image’s white point no longer matches the user’s adapted white point (for instance, content intended for viewing in D65 light (i.e., bluish cloudy noon-day) will appear too blue when viewed in orange-ish 2700K light (i.e., common tungsten light bulbs)).

[0013] As the user’s adaptation changes from the adaptation implied by the suggested viewing environment, such as when the user adapts to environmental lighting brighter than the display, e.g., a bright sky, the display’s fixed brightness, and environmental light is reflected off the display, the display is able to modulate a smaller range of the user’s adapted perception than, say, viewing that same display and content in the dark, where the user’s vision will be adapted to the display and potentially able to modulate the user’s entire perceptual range. Increasing a display’s brightness may cause additional light to “leak” from the display, too, adding to the reflected light, and thus further limiting the darkest level the display can achieve in this environment.

[0014] For these factors and more, it is desirable to map the content to the display and, in turn, map the result into the user’s adapted vision. This mapping is a modification to the original signal and correspondingly requires additional precision to carry. For instance, consider an 8-bit monotonically incrementing gradient. Even a simple attenuation operation, such as multiplying each value by 0.9, moves most values from 8-bit representable values, thus requiring additional precision to represent the signal with fidelity.
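The attenuation example in paragraph [0014] can be checked directly. This sketch uses exact rational arithmetic to avoid floating-point noise; the variable names are illustrative.

```python
from fractions import Fraction

gradient = range(256)  # 8-bit monotonically incrementing gradient
scaled = [Fraction(9, 10) * v for v in gradient]  # exact multiply by 0.9

# Inputs whose scaled value still lands exactly on an integer code.
still_exact = [v for v, s in zip(gradient, scaled) if s.denominator == 1]

# Rounding back to 8 bits merges some neighbouring inputs into one code.
distinct_after_rounding = len({round(s) for s in scaled})

print(len(still_exact), distinct_after_rounding)
```

Only the multiples of 10 survive the multiplication as exact 8-bit values, and rounding the rest back to 8 bits collapses the 256 input steps into fewer output steps, which is where the need for extra precision comes from.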

[0015] A general approach might be to have the display pipeline that would apply this mapping to adapted vision space implement the full 36-bit linearly-encoded human visual range. However, implementing an image display processing pipeline to handle 36 bits of precision would require significantly more processing resources, more transistors, more wires, and/or more power than would typically be available in a consumer electronic device, especially a portable, light-weight consumer electronic device. Additionally, distributing such a signal via physical media (or via Internet streaming) would be burdensome, especially if streamed wirelessly.

[0016] Any emissive, digitally-controlled display technology has minimum and maximum illumination levels that it is capable of producing (and which might be affected by environmental factors, such as reflection), and some number of discrete brightness levels coded in between those limits. These codes have typically been coded to match the transfer function requirements of the media that the display is intended to be used with. As will be described below, some displays also incorporate Look-up Tables (LUTs), e.g., to fine-tune the native response curve of the display to the desired response curve.

[0017] Thus, there is a need for techniques to implement a perceptually-aware and/or content-aware system that is capable of utilizing a perceptual model to dynamically adjust a display, e.g., to model what light intensity levels a user’s roughly 2^8 to 2^9 discernable perceptual bins are mapping to at any given moment in time, which also correspond to the range of brightness values that the display can modulate, and potentially also intersect with the range of brightness values that the system or media require. Successfully modeling the user’s perception of the content data displayed would allow the user’s experience of the displayed content to remain relatively independent of the ambient conditions in which the display is being viewed and/or the content that is being displayed, while providing optimal encoding. For instance, if viewing a relatively dim display in a very bright environment, the user’s adapted vision is likely much higher than the brightness of the display, and the display’s lowest brightness region might even be indistinguishable. Thus, in this situation, it would be expected that fewer than the conventional 8 to 9 bit precision at 2.2 gamma coding would be required to render an image without banding for such a user in such an environment. Moreover, rather than following the conventional approach of adding an extra bit of data to the pipeline for each doubling of the dynamic range to be displayed, such techniques would put a rather modest limit on the number of bits necessary to encode the entire dynamic range of human perception (e.g., the aforementioned 8 or 9 bits), even in the optimal environment, thereby further enhancing the performance and power efficiency of the display device, while not negatively impacting the user’s experience or limiting the dynamic range of content that can be displayed.

SUMMARY

[0018] As mentioned above, human perception is not absolute; rather, it is relative and adaptive. In other words, a human user’s perception of a displayed image changes based on what surrounds the image, the image itself, what content the user has seen in a preceding time interval (which, as will be described herein, can contribute to the user’s current visual adaptation), and what range of light levels the viewer’s eyes presently differentiate. A display may commonly be positioned in front of a wall. In this case, the ambient lighting in the room (e.g., brightness and color) will illuminate the wall or whatever lies beyond the monitor, influence the user’s visual adaptation, and thus change the viewer’s perception of the image on the display. In other cases, the user may be viewing the content in a dark theater (or while wearing an HMD). In the theater/HMD cases, the viewer’s perception will be adapted almost entirely to the recently-viewed content itself, i.e., not affected by the ambient environment, which they are isolated from. Potential changes in a viewer’s perception include a change to adaptation (both physical dilation of the pupil, as well as neural accommodations changing the brightest light that the user can see without discomfort), which further re-maps the perceptual bins of brightness values the user can (and cannot) differentiate (and which may be modeled using a scaled and/or offset gamma function), as well as changes to white point and black point. Thus, while some devices may attempt to maintain an overall identity 1.0 gamma on the eventual display device (i.e., the gamma map from the content’s encoded transfer function to the display’s electro-optical transfer function, or EOTF), those changes do not take into account the effect on a human viewer’s perception of gamma due to differences in ambient light conditions and/or the dynamic range of the recently-viewed content itself.

[0019] The techniques disclosed herein use a display device, in conjunction with various optical sensors, e.g., potentially multi-spectral ambient light sensor(s), image sensor(s), or video camera(s), and/or various non-optical sensors, e.g., user presence/angle/distance sensors, time of flight (ToF) cameras, structured light sensors, or gaze sensors, to collect information about the ambient conditions in the environment of a viewer of the display device, as well as the brightness levels on the face of the display itself. Use of these various sensors can provide more detailed information about the ambient lighting conditions in the viewer’s environment, which a processor in communication with the display device may utilize to evaluate a perceptual model, based at least in part, on the received environmental information, the viewer’s predicted adaptation levels, and information about the display, as well as the content itself that is being, has been, or will be displayed to the viewer. The output from the perceptual model may be used to adapt the content so that, when sent to a given display and viewed by the user in a given ambient environment, it will appear as the source author intended (even if authored in a different environment resulting in a different adaptation). The perceptual model may also be used to directly adapt the display’s transfer function, such that the viewer’s perception of the content displayed on the display device is relatively independent of the ambient conditions in which the display is being viewed. The output of the perceptual model may comprise modifications to the display’s transfer function that are a function of gamma, black point, white point, or a combination thereof. 
Since adapting content from the nominal intended transfer function of the image/display pipeline requires additional precision (e.g., bits in both representation and processing limits), it may be more efficient to dynamically adapt the transfer function of the entire pipeline and display. In particular, by determining the viewer’s adaptation and mapping the display’s transfer function and coding to map into the viewer’s adaptation, and then directly adapting the content to the display, the adaptation of the content may be performed in a single step operation, e.g., not requiring the display pipeline to perform further adaptation (and thus not requiring extra precision). According to some embodiments, this adaptation of displayed content may be performed as a part of a common color management process (traditionally providing gamma and gamut mapping between source and display space, e.g., as defined by ICC profiles) that is performed on a graphics processing unit (or other processing unit) in high-precision space, and thus does not require an extra processing step. Then, if the modulation range of the display in the user’s adapted vision only requires, e.g., 8 bits to optimally describe, the pipeline itself will also only need to run in 8-bit mode (i.e., not requiring extra adaptation, which would require extra bits for extra precision).

[0020] The perceptual models disclosed herein may solve, or at least aid in solving, various problems with current display technology, wherein, e.g., content tends to have a fixed transfer function for historic reasons. For instance, in the case of digital movies intended to be viewed in the dark, the user is adapted almost wholly to the content. Thus, the content can have a reasonable dynamic range, wherein sufficient dark codes are still present, e.g., to allow for coverage of brighter scenes. Moreover, when watching a dark scene, the user’s adaptation will become more sensitive to darker tones, and there will be too few codes to avoid banding. Similarly, there may be too few bits available to represent subtle colors for very dim objects (which may sometimes lead to dark complexions in dark scenes being unnaturally rendered in blotches of cyan, magenta, or green–instead of actual skin tones). In another case, e.g., when viewing standard dynamic range (SDR) content when the user is adapted to a much brighter environment, an SDR display simply cannot modulate a perceptually large enough dynamic range (i.e., as compared to that of the original encoding) to maintain the SDR appearance, because the display’s modulated brightness range intersects relatively few of the user’s perceptual bins (e.g., an 8-bit 2.2 gamma conventional display may have 64 or fewer of the 256 available grey levels be perceptually distinguishable). Techniques such as local tone mapping (LTM), which may introduce the notion of over- and under-shooting pixel values, may be applied to sharpen the distinction between elements in certain regions of the display that are deemed to be distinguishable in the content when the content is being viewed on the reference display and in the reference viewing environment.

[0021] The static and/or inefficient allocation of display codes leads to either poor performance across a wide array of ambient viewing conditions (such as the exemplary viewing conditions described above) or an increased burden on an image processing pipeline leading to performance, thermal, and reduced battery life issues, due to the additional bits that need to be carried to encode the wider range of output display levels needed to produce a satisfactory high dynamic range viewing experience in a wide array of ambient viewing conditions.

[0022] In other embodiments, the content itself may alternatively or additionally be encoded on a per-frame (or per-group-of-frames) basis, thereby taking advantage of the knowledge of the brightness levels and content of, e.g.: previously displayed content, the currently displayed content, and even upcoming content (i.e., content that will be displayed in the near future), in order to more optimally encode the content for the user’s predicted current adaptation level. The encoding used (e.g., a parametric description of the transfer function) could be provided for each frame (or group-of-frames), or perhaps only if the transfer function changed beyond a threshold amount, or based on the delta between the current and previous frames’ transfer functions. In still other embodiments, the encoding used could potentially be stored in the form of metadata information accompanying one or more frames, or even in a separate data structure describing each frame or group-of-frames. In yet other embodiments, such encoded content may be decoded using the conventional static transfer function (e.g., to be backward compatible), and then use the dynamic transfer function for optimal encoding.
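One of the options in paragraph [0022], sending a parametric transfer-function description only when it has changed beyond a threshold, might be sketched as follows. The `TransferParams` fields, the `max_delta` distance measure, and the threshold value are hypothetical choices for illustration; the application does not specify a metadata format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransferParams:
    """Hypothetical parametric description of a per-frame transfer function."""
    gamma: float
    black_point: float
    white_point: float

def max_delta(a, b):
    """Largest absolute change across the three parameters (assumed metric)."""
    return max(
        abs(a.gamma - b.gamma),
        abs(a.black_point - b.black_point),
        abs(a.white_point - b.white_point),
    )

def metadata_stream(per_frame_params, threshold):
    """Yield (frame_index, params) only when params moved past the threshold
    relative to the last set that was actually emitted."""
    last = None
    for index, params in enumerate(per_frame_params):
        if last is None or max_delta(params, last) > threshold:
            last = params
            yield index, params

frames = [
    TransferParams(2.2, 0.0, 1.0),
    TransferParams(2.21, 0.0, 1.0),   # small drift: not re-sent
    TransferParams(2.4, 0.01, 1.0),   # large change: re-sent
    TransferParams(2.4, 0.01, 1.0),
]
emitted = list(metadata_stream(frames, threshold=0.05))
```

Comparing against the last emitted parameters, rather than the previous frame, keeps slow drift from accumulating unnoticed; that is a design choice of this sketch, not something the text prescribes.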

[0023] Estimating the user’s current visual adaptation allows the system to determine where, within the dynamic range of human vision (and within the capabilities of the display device), to allocate the output codes to best reproduce the output video signal, e.g., in a way that will give the user the benefit of the full content viewing experience, while not wasting codes in parts of the dynamic range of human vision that the user would not even be able to distinguish between at their given predicted adaptation level. For example, as alluded to above, video content that is intended for dim or dark viewing environments could advantageously have a known and/or per-frame transfer function that is used to optimally encode the content based on what the adapted viewer may actually perceive. For instance, in a dark viewing environment where the user is known to be adapted to the content, on a dim scene that was preceded by a very bright scene, it might be predicted based on the perceptual model that the dimmest of captured content codes will be indifferentiable from true black, and, consequently, that known perceptually-crushed region will not even be encoded (potentially via the offset in the encoding and decoding transfer functions). This would allow for fewer overall bits to be used to encode the next frame or number of frames (it is to be understood that the user’s adaptation will eventually adapt to the dimmer scene over time or a sufficient number of successive frames with similar average brightness levels) or for the same number of bits to more granularly encode the range of content brightness values that the viewer will actually be able to distinguish. In cases where the viewing conditions are static across the entire duration of the rendering of media content (e.g., a movie or a still image), a single optimized transfer function may be determined and used over the duration of the rendering.
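The idea of not spending codes on a perceptually crushed region can be sketched with a scaled and offset power-law transfer function. Everything here is an assumption for illustration: the function names, the use of a simple power law, and the example black and white levels (in arbitrary luminance units) are not taken from the application, which leaves the perceptual model that would supply these values unspecified.

```python
def encode(luminance, black, white, gamma, bits=8):
    """Map linear luminance to an integer code, with no codes below `black`.

    `black` is the level predicted to be indistinguishable from true black
    and `white` the brightest level to be encoded (both hypothetical inputs).
    """
    max_code = (1 << bits) - 1
    t = (luminance - black) / (white - black)
    t = min(max(t, 0.0), 1.0)          # clamp outside the encoded range
    return round(max_code * t ** (1.0 / gamma))

def decode(code, black, white, gamma, bits=8):
    """Inverse of encode(): map an integer code back to linear luminance."""
    max_code = (1 << bits) - 1
    return black + (white - black) * (code / max_code) ** gamma

# Example: everything at or below 0.5 units is treated as crushed to black.
black, white, gamma = 0.5, 100.0, 2.2
codes = [encode(lum, black, white, gamma) for lum in (0.2, 0.5, 10.0, 100.0)]
```

Raising `black` concentrates the same 256 codes on the range the adapted viewer can actually differentiate, which corresponds to the "same number of bits, more granular encoding" option described in paragraph [0023].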

[0024] Thus, according to some embodiments, a non-transitory program storage device comprising instructions stored thereon is disclosed. When executed, the instructions are configured to cause one or more processors to: receive data indicative of one or more characteristics of a display device; receive data from one or more optical sensors indicative of ambient light conditions surrounding the display device; receive data indicative of one or more characteristics of a content; evaluate a perceptual model based, at least in part, on: the received data indicative of the one or more characteristics of the display device, the received data indicative of ambient light conditions surrounding the display device, the received data indicative of the one or more characteristics of the content, and a predicted adaptation level of a user of the display device, wherein the instructions to evaluate the perceptual model comprise instructions to determine one or more adjustments to a gamma, black point, white point, or a combination thereof, of the display device; adjust a transfer function for the display device based, at least in part, on the determined one or more adjustments; and cause the content to be displayed on the display device utilizing the adjusted transfer function.

[0025] In some embodiments, the determined one or more adjustments to the transfer function for the display device are implemented over a determined time interval, e.g., over a determined number of displayed frames of content and/or via a determined number of discrete step changes. In some such embodiments, the determined adjustments are only made when the adjustments exceed a minimum adjustment threshold. In other embodiments, the determined adjustments are made at a rate that is based, at least in part, on a predicted adaptation rate of the user of the display device.
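The gradual, thresholded adjustment described in paragraph [0025] could look roughly like the sketch below. The function name `ramp_adjustment`, the linear ramp, and the example values are assumptions; the application only states that changes may be spread over frames or discrete steps and may be skipped below a minimum threshold.

```python
def ramp_adjustment(current, target, num_steps, min_threshold):
    """Return per-step parameter values moving from `current` to `target`.

    Returns an empty list when the requested change does not exceed
    `min_threshold`, i.e. the adjustment is skipped entirely.
    A linear ramp is assumed; a real system might instead follow the
    predicted adaptation rate of the viewer.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    delta = target - current
    if abs(delta) <= min_threshold:
        return []
    return [current + delta * k / num_steps for k in range(1, num_steps + 1)]

# Move display gamma from 2.2 to 2.4 over four frames.
steps = ramp_adjustment(2.2, 2.4, num_steps=4, min_threshold=0.05)
# A change of 0.01 is below the threshold and is ignored.
skipped = ramp_adjustment(2.2, 2.21, num_steps=4, min_threshold=0.05)
```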

[0026] In other embodiments, the aforementioned techniques embodied in instructions stored in non-transitory program storage devices may also be practiced as methods and/or implemented on electronic devices having a display, e.g., a mobile phone, PDA, HMD, monitor, television, or a laptop, desktop, or tablet computer.

[0027] In still other embodiments, the same principles may be applied to audio data. In other words, by knowing the dynamic range of the audio content and the recent dynamic audio range of the user’s environment, the audio content may be transmitted and/or encoded using a non-linear transfer function that is optimized for the user’s current audio adaptation level and aural environment.

[0028] Advantageously, the perceptually-aware dynamic display adjustment techniques that are described herein may be removed from a device’s software image processing pipeline and instead implemented directly by the device’s hardware and/or firmware, with little or no additional computational costs (and, in fact, placing an absolute cap on computational costs, in some cases), thus making the techniques readily applicable to any number of electronic devices, such as mobile phones, personal data assistants (PDAs), portable music players, monitors, televisions, as well as laptop, desktop, and tablet computer screens.

BRIEF DESCRIPTION OF THE DRAWINGS

[0029] FIG. 1 illustrates a system for performing gamma adjustment utilizing a look up table.

[0030] FIG. 2 illustrates a Framebuffer Gamma Function and an exemplary Native Display Response.

[0031] FIG. 3 illustrates graphs representative of a LUT transformation and a Resultant Gamma Function.

[0032] FIG. 4A illustrates the properties of ambient lighting and diffuse reflection off a display device.

[0033] FIG. 4B illustrates the additive effects of unintended light on a display device.

[0034] FIG. 5 illustrates a resultant gamma function and a graph indicative of a perceptual transformation.

[0035] FIG. 6 illustrates a system for performing perceptually-aware dynamic display adjustment, in accordance with one or more embodiments.

[0036] FIG. 7 illustrates a simplified functional block diagram of a perceptual model, in accordance with one or more embodiments.

[0037] FIG. 8 illustrates a graph representative of an adaptive display transfer function, in accordance with one or more embodiments.

[0038] FIG. 9 illustrates, in flowchart form, a process for performing perceptually-aware dynamic display adjustment, in accordance with one or more embodiments.

[0039] FIG. 10 illustrates a simplified functional block diagram of a device possessing a display, in accordance with one embodiment.

DETAILED DESCRIPTION

[0040] The disclosed techniques use a display device, in conjunction with various optical and/or non-optical sensors, e.g., ambient light sensors or structured light sensors, to collect information about the ambient conditions in the environment of a viewer of the display device. Use of this information–and information regarding the display device and the content being displayed–can provide a more accurate prediction of the viewer’s current adaptation levels. A processor in communication with the display device may evaluate a perceptual model based, at least in part, on the predicted effects of the ambient conditions (and/or the content itself) on the viewer’s experience. The output of the perceptual model may be suggested modifications that are used to adjust the scale, offset, gamma, black point, and/or white point of the display device’s transfer function, such that, with a limited number of bits of precision (e.g., a fixed number of bits, such as 8 or 9 bits), the viewer’s viewing experience is high quality and/or high dynamic range, while remaining relatively independent of the current ambient conditions and/or the content that has been recently viewed.

[0041] The techniques disclosed herein are applicable to any number of electronic devices: such as digital cameras, digital video cameras, mobile phones, personal data assistants (PDAs), head-mounted display (HMD) devices, monitors, televisions, digital projectors (including cinema projectors), as well as desktop, laptop, and tablet computer displays.

[0042] In the interest of clarity, not all features of an actual implementation are described in this specification. It will of course be appreciated that in the development of any such actual implementation (as in any development project), numerous decisions must be made to achieve the developers’ specific goals (e.g., compliance with system- and business-related constraints), and that these goals will vary from one implementation to another. It will be appreciated that such development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill having the benefit of this disclosure. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, with resort to the claims being necessary to determine such inventive subject matter. Reference in the specification to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one embodiment of the invention, and multiple references to “one embodiment” or “an embodiment” should not be understood as necessarily all referring to the same embodiment.

[0043] Referring now to FIG. 1, a typical system 112 for performing system gamma adjustment utilizing a Look Up Table (LUT) 110 is shown. Element 100 represents the source content, created by, e.g., a source content author, that viewer 116 wishes to view. Source content 100 may comprise an image, video, or other displayable content type. Element 102 represents the source profile, that is, information describing the color profile and display characteristics of the device on which source content 100 was authored by the source content author. Source profile 102 may comprise, e.g., an ICC profile of the author’s device or color space (which will be described in further detail below), or other related information.

[0044] Information relating to the source content 100 and source profile 102 may be sent to viewer 116’s device containing the system 112 for performing gamma adjustment utilizing a LUT 110. Viewer 116’s device may comprise, for example, a mobile phone, PDA, HMD, monitor, television, or a laptop, desktop, or tablet computer. Upon receiving the source content 100 and source profile 102, system 112 may perform a color adaptation process 106 on the received data, e.g., for performing gamut mapping, i.e., color matching across various color spaces. For instance, gamut matching tries to preserve (as closely as possible) the relative relationships between colors (e.g., as authored/approved by the content author on the display described by the source ICC profile), even if all the colors must be systematically mapped from their source to the display’s color space in order to get them to appear correctly on the destination device.

[0045] Once the source pixels have been color mapped (often a combination of gamma mapping, gamut mapping and chromatic adaptation based on the source and destination color profiles), image values may enter the so-called “framebuffer” 108. In some embodiments, image values, e.g., pixel component brightness values, enter the framebuffer having come from an application or applications that have already processed the image values to be encoded with a specific implicit gamma. A framebuffer may be defined as a video output device that drives a video display from a memory buffer containing a complete frame of, in this case, image data. The implicit gamma of the values entering the framebuffer can be visualized by looking at the “Framebuffer Gamma Function,” as will be explained further below in relation to FIG. 2. Ideally, this Framebuffer Gamma Function is the exact inverse of the display device’s “Native Display Response” function, which characterizes the luminance response of the display to input.

[0046] Because the Framebuffer Gamma Function isn’t always exactly the inverse of the Native Display Response, a LUT, sometimes stored on a video card or in other memory, may be used to account for the imperfections in the relationship between the encoding gamma and decoding gamma values, as well as the display’s particular luminance response characteristics. Thus, if necessary, system 112 may then utilize LUT 110 to perform a so-called system “gamma adjustment process.” LUT 110 may comprise a two-column table of positive, real values spanning a particular range, e.g., from zero to one. The first column values may correspond to an input image value, whereas the second column value in the corresponding row of the LUT 110 may correspond to an output image value that the input image value will be “transformed” into before ultimately being displayed on display 114. LUT 110 may be used to account for the imperfections in the display 114’s luminance response curve, also known as the “display transfer function.” In other embodiments, a LUT may have separate channels for each primary color in a color space, e.g., a LUT may have Red, Green, and Blue channels in the sRGB color space.
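The two-column LUT described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `build_gamma_lut` and `apply_lut`, the 256-row table size, linear interpolation between rows, and the example display gamma of 2.4 are all hypothetical. The example corrects framebuffer data that assumes a 2.2 display response so that it looks the same on a display whose native response is closer to 2.4.

```python
from bisect import bisect_right


def build_gamma_lut(gamma, size=256):
    """Build a two-column LUT: rows of (input value, output value) on [0, 1]."""
    return [(i / (size - 1), (i / (size - 1)) ** gamma) for i in range(size)]


def apply_lut(lut, value):
    """Transform an input image value via the LUT, interpolating between rows."""
    inputs = [row[0] for row in lut]
    # Clamp to the range covered by the first column.
    value = min(max(value, inputs[0]), inputs[-1])
    hi = bisect_right(inputs, value)
    if hi >= len(lut):
        return lut[-1][1]
    lo = hi - 1
    (x0, y0), (x1, y1) = lut[lo], lut[hi]
    t = (value - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Assumed scenario: content expects a 2.2 display, the panel behaves like 2.4.
# Raising values to 2.2 / 2.4 first makes the combined response 2.2 again.
ASSUMED_DISPLAY_GAMMA = 2.4
lut = build_gamma_lut(2.2 / ASSUMED_DISPLAY_GAMMA)

framebuffer_value = 0.5
corrected = apply_lut(lut, framebuffer_value)
displayed = corrected ** ASSUMED_DISPLAY_GAMMA  # close to 0.5 ** 2.2
```

A per-channel variant would simply hold one such table for each of the Red, Green, and Blue channels.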

[0047] The transformation applied by the LUT to the incoming framebuffer data before the data is output to the display device may be used to ensure that a desired 1.0 gamma boost is applied to the eventual display device when considered as a system from encoded source through to display. The system shown in FIG. 1 is generally a good system, although it does not compensate for the user’s adaptation (to the display and environment) nor take into account the display’s range of brightness compared to the user’s adaptation or the effect on the viewer of changes in the dynamic range of the content recently viewed by the viewer. In other words, an experience that is identical to the one the content was authored to be viewed in is only achieved/appropriate in one ambient lighting environment. For example, content captured in a bright environment won’t require a gamma boost, e.g., due to the “simultaneous contrast” phenomenon, if viewed in the identical (i.e., bright) environment. Some content is specifically authored for the bright surround (i.e., a nearly 1.0 ratio of reference white display brightness and surround brightness). Most displays with auto-brightness, or user-controlled brightness, are bright surround when used in a greater than dim surround (e.g., greater than 16 lux), provided the environment is not so bright that the display cannot match it (e.g., it may be difficult for an approximately 500 nit display to have the same apparent brightness as daylit surroundings). Some content, e.g., Rec. 709 video, is captured bright surround and then has an implicit over unity system gamma applied to add appropriate boost to compensate for the simultaneous contrast effect when viewed in the intended 16 lux (i.e., approximately 5 nit) dim surround.

[0048] As mentioned above, in some embodiments, the goal of this gamma adjustment system 112 is to have an overall 1.0 system gamma applied to the content that is being displayed on the display device 114. An overall 1.0 system gamma corresponds to a linear relationship between the input encoded luma values and the output luminance on the display device 114. Ideally, an overall 1.0 system gamma will correspond to the source author’s intended look of the displayed content. However, as will be described later, this overall 1.0 gamma may only be properly perceived in one particular set of ambient lighting conditions, thus necessitating an ambient- and perceptually-aware dynamic display adjustment system. As may also now be understood, systems such as that described above with reference to FIG. 1 are actually acting to change the pixel values of the source content received at the system, so as to adapt the content to the display the user is viewing the content on. As will be described in further detail below, a different (and more efficient) approach to display adaptation may be to adapt the transfer function of the display itself (e.g., in a single step operation)–rather than attempt to rewrite the values of all incoming display content.

[0049] Referring now to FIG. 2, a Framebuffer Gamma Function 200 and an exemplary Native Display Response 202 are shown. Gamma adjustment, or, as it is often simply referred to, “gamma,” is the name given to the nonlinear operation commonly used to encode luma values and decode luminance values in video or still image systems. Gamma, $\gamma$, may be defined by the following simple power-law expression: $L_{out} = L_{in}^{\gamma}$, where the input and output values, $L_{in}$ and $L_{out}$, respectively, are non-negative real values, typically in a predetermined range, e.g., zero to one. A gamma value less than one is sometimes called an “encoding gamma,” and the process of encoding with this compressive power-law nonlinearity is called “gamma compression;” conversely, a gamma value greater than one is sometimes called a “decoding gamma,” and the application of the expansive power-law nonlinearity is called “gamma expansion.” Gamma encoding of content helps to map the content data into a more perceptually-uniform domain.
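The power-law expression above is easy to exercise directly. The following is a minimal sketch, assuming values normalized to the range zero to one; the function name `apply_gamma` and the sample value 0.18 are illustrative choices, and the exponents 1/2.2 and 2.2 are the compressive and expansive values discussed in relation to FIG. 2.

```python
def apply_gamma(value, gamma):
    """Power-law transfer: L_out = L_in ** gamma, for L_in in [0, 1]."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be in the range [0, 1]")
    return value ** gamma


COMPRESSIVE_GAMMA = 1 / 2.2  # exponent below one: gamma compression
EXPANSIVE_GAMMA = 2.2        # exponent above one: gamma expansion

linear = 0.18  # an illustrative mid-gray linear value
encoded = apply_gamma(linear, COMPRESSIVE_GAMMA)   # larger than the input
decoded = apply_gamma(encoded, EXPANSIVE_GAMMA)    # recovers the input
```

Compression lifts dark values (the encoded value is larger than the linear one), which spends more of the available codes on the dark tones where the eye is more sensitive; expansion by the reciprocal exponent undoes it.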

[0050] Another way to think about the gamma characteristic of a system is as a power-law relationship that approximates the relationship between the encoded luma in the system and the actual desired image luminance on whatever the eventual user display device is. In existing systems, a computer processor or other suitable programmable control device may perform gamma adjustment computations for a particular display device it is in communication with based on the native luminance response of the display device, the color gamut of the device, and the device’s white point (which information may be stored in an ICC profile), as well as the ICC color profile the source content’s author attached to the content to specify the content’s “rendering intent.”

[0051] The ICC profile is a set of data that characterizes a color input or output device, or a color space, according to standards promulgated by the International Color Consortium (ICC). ICC profiles may describe the color attributes of a particular device or viewing requirement by defining a mapping between the device source or target color space and a profile connection space (PCS), usually the CIE XYZ color space. ICC profiles may be used to define a color space generically in terms of three main pieces: 1) the color primaries that define the gamut; 2) the transfer function (sometimes referred to as the gamma function); and 3) the white point. ICC profiles may also contain additional information to provide mapping between a display’s actual response and its “advertised” response, i.e., its tone response curve (TRC), for instance, to correct or calibrate a given display to a perfect 2.2 gamma response.

[0052] In some implementations, the ultimate goal of the gamma adjustment process is to have an eventual overall 1.0 gamma boost, i.e., so-called “unity” or “no boost,” applied to the content as it is displayed on the display device. An overall 1.0 system gamma corresponds to a linear relationship between the input encoded luma values and the output luminance on the display device, meaning there is actually no amount of gamma “boosting” being applied.
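To make the “unity” condition concrete, the end-to-end response can be written as a composition of two power laws. The symbols $\gamma_{enc}$, $\gamma_{disp}$, and $\gamma_{sys}$ are notation introduced here for illustration and do not appear in the specification:

$$L_{display} = \left(L_{in}^{\gamma_{enc}}\right)^{\gamma_{disp}} = L_{in}^{\gamma_{enc}\,\gamma_{disp}}, \qquad \gamma_{sys} = \gamma_{enc}\,\gamma_{disp}$$

For example, with an encoding gamma of $\gamma_{enc} = 1/2.2$ and a display response of $\gamma_{disp} = 2.2$, the product is $\gamma_{sys} = 1.0$, so $L_{display} = L_{in}$ and no boost is applied.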

[0053] Returning now to FIG. 2, the x-axis of Framebuffer Gamma Function 200 represents input image values spanning a particular range, e.g., from zero to one. The y-axis of Framebuffer Gamma Function 200 represents output image values spanning a particular range, e.g., from zero to one. As mentioned above, in some embodiments, image values may enter the framebuffer 108 already having been processed and have a specific implicit gamma. As shown in graph 200 in FIG. 2, the encoding gamma is roughly 1/2.2, or 0.45. That is, the line in graph 200 roughly looks like the function $L_{out} = L_{in}^{0.45}$. Gamma values around 1/2.2, or 0.45, are typically used as encoding gammas because the native display response of many display devices has a gamma of roughly 2.2, that is, the inverse of an encoding gamma of 1/2.2. In other cases, a gamma of, e.g., 1/2.45, may be applied to 1.96 gamma encoded (e.g., bright surround) content when displayed on a conventional 1/2.45 gamma CRT display, in order to provide the 1.25 gamma “boost” (i.e., 2.45 divided by 1.96), required to compensate for the simultaneous contrast effect causing bright content to appear low-contrast when viewed in a dim surround (i.e., an environment where the area seen beyond the display, also known as the surround, is dimmer than the reference white of the display, for example, a wall illuminated by a 16 lux environment likely reflects approximately 16/pi, or approximately 5 nits, and the display itself might have a reference white as high as 100 nits; 100/5 is a display to surround ratio of 20:1 and is generally considered to be a dim surround).
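The numbers in the paragraph above can be checked with a short calculation. This is a sketch under stated assumptions: the wall is treated as an ideal diffuse (Lambertian) reflector with a reflectance of 1.0, which is what the "16/pi" figure implies, and the function names are hypothetical.

```python
import math


def surround_luminance_nits(illuminance_lux, reflectance=1.0):
    """Luminance of an ideal diffuse surface: illuminance * reflectance / pi."""
    return illuminance_lux * reflectance / math.pi


def system_gamma(encoding_gamma, display_gamma):
    """Overall exponent for content encoded with one gamma and shown with another."""
    return encoding_gamma * display_gamma


# Content encoded at 1/1.96 shown on a 2.45 response: 2.45 / 1.96 = 1.25 boost.
boost = system_gamma(1 / 1.96, 2.45)

# A wall lit at 16 lux, assumed to be a perfect diffuse reflector.
surround = surround_luminance_nits(16.0)

# Display reference white of 100 nits relative to that surround.
display_to_surround = 100.0 / surround
```

The unrounded surround luminance is slightly above 5 nits, so the computed ratio comes out a little under the 20:1 quoted in the text, which uses the rounded 5 nit figure.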

Article link: https://patent.nweon.com/9755
