Patent: Generating A Modified Audio Experience For An Audio System
Publication Number: 10638248
Publication Date: 20200428
Applicants: Facebook
Abstract
An audio system is configured to present a modified audio experience that reduces the degradation of a target audio experience presented to a user by the audio system. The audio system includes an acoustic sensor array, a controller, and a playback device array. To generate the modified audio experience, the acoustic sensor array receives the sound waves from one or more non-target audio source(s) causing the degradation, identifies the audio source(s), determines the spatial location of the audio source(s), determines the type of the audio source(s) and generates audio instructions that, when executed by the playback device array, present the modified audio experience to the user. The modified audio experience may perform active noise cancelling, ambient sound masking, and/or neutral sound masking to compensate for the sound waves received from non-target audio sources. The audio system may be part of a headset that can produce an artificial reality environment.
BACKGROUND
The present disclosure generally relates to generating an audio experience, and specifically relates to generating an audio experience that compensates for sound waves generated by obtrusive audio sources.
Conventional audio systems may use headphones to present a target audio experience that includes a plurality of audio content. Because conventional systems use headphones, the target audio experience is relatively unaffected by other audio sources in the local area of the audio system. However, audio systems that include headphones occlude the ear canal and are undesirable for some artificial reality environments (e.g., augmented reality). Generating a target audio experience over air for a user within a local area, while minimizing the exposure of others in the local area to that audio content, is difficult due to a lack of control over far-field radiated sound. Conventional systems are not able to dynamically present audio content that compensates for sound waves that the user may perceive as degrading the target audio experience.
SUMMARY
A method is disclosed for generating a modified audio experience that reduces the degradation of a target audio experience presented to a user by an audio system. The degradation, or impact, may be caused by a user perceiving sound waves generated by non-target audio sources in the local area of the audio system. The method reduces the degradation, or impact, by presenting modified audio content that compensates for the sound waves generated by the non-target audio sources. In some embodiments, the modified audio experience is similar to the target audio experience despite the presence of sound waves generated by the non-target audio sources.
The method determines, via an acoustic sensor array of a headset, sound waves from one or more audio sources in a local area of the headset. A controller of the headset determines array transfer functions (ATFs) associated with the sound waves, and determines the spatial location and/or type of the audio sources. The controller generates audio instructions that, when executed by a playback device array, present the modified audio experience to the user. The modified audio experience may perform active noise cancelling, ambient sound masking, and/or neutral sound masking to compensate for the sound waves received from non-target audio sources.
The method may be performed by an audio system, for example, an audio system that is part of a headset (e.g., a near-eye display or head-mounted display). The audio system includes an acoustic sensor array, a controller, and a playback device array. The audio system may present the modified audio experience automatically after detecting an audio source or in response to an input from a user.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a diagram of a headset including an audio system, in accordance with one or more embodiments.
FIG. 2 illustrates a local area of a headset worn by a user perceiving non-target audio sources in their auditory field, in accordance with one or more embodiments.
FIG. 3 is a block diagram of an example audio system, according to one or more embodiments.
FIG. 4 illustrates a process for generating a modified audio experience that compensates for the degradation of a target audio experience, according to one or more embodiments.
FIG. 5 is a block diagram of an example artificial reality system, according to one or more embodiments.
The figures and the following description relate to various embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.
DETAILED DESCRIPTION
Introduction
An audio system generates an audio experience that reduces the perception of an audio source (e.g., distraction) in an auditory field of a user. The audio system may be part of a headset (e.g., near-eye display or a head-mounted display). The audio system includes an acoustic sensor array, a controller, and a playback device array. The acoustic sensor array detects sounds from one or more audio sources in a local area of the headset. The playback device array generates an audio experience for the user by presenting audio content in an auditory field of the user. An auditory field of a user includes the spatial locations from which a user of the headset may perceive audio sources.
The controller generates audio instructions that are executable by the playback device array. The audio instructions, when executed by the playback device array, may present a target audio experience for a user. A target audio experience includes audio content presented to a user that is targeted for the user to perceive in their auditory field during operation of the headset. For example, the audio content elements of a target audio experience presented to a user operating a headset may include a soundtrack to a movie, sound effects in a game, a music playlist, etc.
In some embodiments, the playback device array does not include playback devices that obstruct the ear canal (e.g., earbuds or headphones). This allows a user to perceive sound waves from audio sources in the local area concurrently with audio content presented by the playback device array. Therefore, in some cases, one or more audio sources in a local area may degrade a target audio experience (“non-target audio source”) presented to the user by the audio system. Non-target audio sources degrade a target audio experience by generating sound waves that can be perceived as disruptions to a target audio experience presented by the audio system. To illustrate, a non-target audio source may degrade a target audio experience by generating sound waves that interrupt a user’s immersion in a target audio experience, provide a distraction in the auditory field of the user, interfere with audio content presented by the audio system, mask audio content presented by the audio system, etc. More generally, a non-target audio source impacts a target audio experience presented to the user in a negative manner.
The controller can generate audio instructions that, when executed by the playback device array, reduce the degradation of the target audio experience (“experience degradation”). To do so, the controller determines transfer functions for the sound waves received from the non-target audio sources, the spatial location(s) of the non-target audio source(s), and the type of non-target audio source(s). The controller then generates audio instructions that, when executed, compensate (i.e., cancel, mask, etc.) for the sound waves degrading the target audio experience. More generally, the controller generates audio instructions that, when executed by the playback device array, reduce the impact of unintended sound waves on the audio experience.
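A minimal Python sketch of this control flow is shown below; it is purely illustrative and not the patent's implementation. The (num_sensors, num_samples) frame layout, the relative-transfer-function estimate, the phase-slope localization stand-in, and the placeholder classify and build_compensation helpers are all assumptions made for structure; the classification and compensation steps are sketched in more detail later in this section.

```python
import numpy as np

def estimate_atfs(frames: np.ndarray) -> np.ndarray:
    # Relative transfer functions with respect to a reference sensor
    # (sensor 0), computed per frequency bin.
    spectra = np.fft.rfft(frames, axis=1)
    return spectra / (spectra[0:1, :] + 1e-12)

def localize(atfs: np.ndarray) -> float:
    # Crude stand-in for a direction-of-arrival estimate: slope of the
    # unwrapped phase of sensor 1 relative to the reference sensor.
    phase = np.unwrap(np.angle(atfs[1]))
    return float(np.polyfit(np.arange(phase.size), phase, 1)[0])

def classify(frames: np.ndarray) -> str:
    # Trivial placeholder; a characteristic-based classifier is sketched later.
    return "obtrusive" if np.max(np.abs(frames)) > 0.5 else "unobtrusive"

def build_compensation(frames: np.ndarray, location: float, source_type: str) -> np.ndarray:
    # Trivial placeholder; the three compensation modes are sketched later.
    return -frames[0] if source_type == "obtrusive" else np.zeros(frames.shape[1])

def controller_step(frames: np.ndarray, target_audio: np.ndarray) -> np.ndarray:
    """frames: (num_sensors, num_samples) block from the acoustic sensor array.
    target_audio: (num_samples,) block of the target audio experience.
    Returns one block of the modified audio experience."""
    atfs = estimate_atfs(frames)
    location = localize(atfs)
    source_type = classify(frames)
    return target_audio + build_compensation(frames, location, source_type)
```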
The controller determines transfer functions based on the sound waves received from audio sources. A transfer function is a function that maps sound waves received from multiple acoustic sensors (e.g., an acoustic sensor array) to audio signals that can be analyzed by the controller. The controller may determine the spatial location (e.g., a coordinate) of a non-target audio source based on audio characteristics of the received sound waves and/or the determined transfer functions. The controller may also classify a type of the non-target audio sources based on the audio characteristics of the received sound waves and/or the determined transfer functions. An audio characteristic is any property describing a sound wave, for example, amplitude, direction, frequency, speed, some other sound wave property, or some combination thereof. For example, the controller may classify a non-target audio source as an unobtrusive source (e.g., a fan, a rainstorm, traffic, an air-conditioning unit, etc.) or an obtrusive source (e.g., a person talking, sirens, bird calls, a door slamming, etc.) based on the audio characteristics (e.g., frequency and amplitude) of the sound waves generated by the sources.
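As a concrete, purely hypothetical illustration of such a rule, the sketch below computes two simple audio characteristics (spectral flatness and amplitude burstiness) and labels broadband, steady sources such as a fan or rainstorm as unobtrusive, and tonal or bursty sources such as speech or a slamming door as obtrusive. The features and thresholds are assumptions chosen for illustration, not values from the patent.

```python
import numpy as np

def classify_source(block: np.ndarray, sample_rate: int) -> str:
    """Label a mono block of received sound (a few hundred milliseconds or
    more) as 'obtrusive' or 'unobtrusive'. Thresholds are illustrative only."""
    spectrum = np.abs(np.fft.rfft(block)) + 1e-12
    # Spectral flatness: close to 1 for broadband noise (fan, rainstorm),
    # much lower for tonal or speech-like sources (talking, sirens).
    flatness = np.exp(np.mean(np.log(spectrum))) / np.mean(spectrum)
    # Amplitude burstiness: steady for background sources, spiky for a
    # slamming door or speech.
    frame_len = max(1, sample_rate // 100)  # roughly 10 ms frames
    n_frames = len(block) // frame_len
    envelope = np.abs(block[: n_frames * frame_len]).reshape(n_frames, frame_len).mean(axis=1)
    burstiness = np.std(envelope) / (np.mean(envelope) + 1e-12)
    return "unobtrusive" if flatness > 0.3 and burstiness < 0.5 else "obtrusive"

# Example: half a second of broadband noise at 48 kHz is typically labeled
# "unobtrusive" by this rule.
print(classify_source(np.random.randn(24000), 48000))
```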
The controller generates audio instructions that reduce the experience degradation based on the audio characteristics of the received sound waves, the determined spatial location of a non-target audio source, and/or the determined type of a non-target audio source. In one example, the controller generates the audio instructions by applying head-related transfer functions (HRTFs).
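A minimal sketch of that step, assuming the head-related impulse responses (the time-domain form of the HRTFs) for the estimated source direction are already available from a measured or generic database:

```python
import numpy as np

def spatialize(signal: np.ndarray, hrir_left: np.ndarray, hrir_right: np.ndarray) -> np.ndarray:
    """Render a mono compensation signal at a given apparent direction by
    convolving it with that direction's left/right head-related impulse
    responses. Returns a (2, num_samples) binaural block."""
    left = np.convolve(signal, hrir_left, mode="full")[: len(signal)]
    right = np.convolve(signal, hrir_right, mode="full")[: len(signal)]
    return np.stack([left, right])
```

In this view, a cancellation or masking signal aimed at a non-target source's estimated location would be convolved with the HRIR pair closest to that direction before being handed to the playback device array.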
The audio instructions generated by the controller, when executed by the playback device array, present a modified audio experience to the user. The modified audio experience includes the audio content of the target audio experience, but also includes audio content that compensates for the sound waves received from non-target audio sources. In other words, the modified audio experience includes audio content that reduces the experience degradation caused by non-target audio sources. As such, the modified audio experience may be highly similar to the target audio experience despite the presence of sound waves generated by non-target audio sources. To illustrate, the modified audio experience may include audio content that performs active noise cancellation, ambient sound masking, and/or neutral sound masking of non-target audio sources. Because of this compensating audio content, a user may not perceive, or may have reduced perception of, the sound waves generated by audio sources in the local area.
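The three compensation modes named above could be realized, in highly simplified form, as follows. The mode names mirror the text, but the signal processing (phase inversion for cancellation, spectrally shaped noise for ambient masking, a flat low-level noise bed for neutral masking) is a simplifying assumption rather than the patent's method.

```python
import numpy as np

def build_compensation_signal(mode: str, received: np.ndarray) -> np.ndarray:
    """received: mono block of the non-target source as it would arrive at
    the ear. Returns a compensation block of the same length."""
    n = len(received)
    if mode == "active_noise_cancellation":
        # Anti-phase copy so the two waves destructively interfere.
        return -received
    if mode == "ambient_sound_masking":
        # Noise shaped to the intruding source's spectrum so it blends in.
        magnitude = np.abs(np.fft.rfft(received))
        random_phase = np.exp(1j * 2 * np.pi * np.random.rand(len(magnitude)))
        return np.fft.irfft(magnitude * random_phase, n=n)
    if mode == "neutral_sound_masking":
        # Low-level broadband bed that raises the listener's masking threshold.
        return 0.05 * np.random.randn(n)
    return np.zeros(n)
```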
Various embodiments may include or be implemented in conjunction with an artificial reality system. Artificial reality is a form of reality that has been adjusted in some manner before presentation to a user, which may include, e.g., a virtual reality (VR), an augmented reality (AR), a mixed reality (MR), a hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content or generated content combined with captured (e.g., real-world) content. The artificial reality content may include video, audio, haptic feedback, or some combination thereof, and any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer). Additionally, in some embodiments, artificial reality may also be associated with applications, products, accessories, services, or some combination thereof, that are used to, e.g., create content in an artificial reality and/or are otherwise used in (e.g., perform activities in) an artificial reality. The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a headset (e.g., a head-mounted device or near-eye display) connected to a host computer system, a standalone headset, a mobile device or computing system, or any other hardware platform capable of providing artificial reality content to one or more viewers.
Head Wearable Device
FIG. 1 is a diagram of a headset 100 including an audio system, according to one or more embodiments. The headset 100 presents media to a user. In one embodiment, the headset 100 may be a near-eye display (NED). In another embodiment, the headset 100 may be a head-mounted display (HMD). In general, the headset may be worn on the face of a user such that visual content (e.g., visual media) is presented using one or both lenses 110 of the headset. However, the headset 100 may also be used such that media content is presented to a user in a different manner. Examples of media content presented by the headset 100 include one or more images, video, audio, or some combination thereof. The media may also include the audio content of an audio experience that may be presented to a user.
The headset 100 includes the audio system, and may include, among other components, a frame 112, a lens 110, a sensor device 114, and a controller 116. While FIG. 1 illustrates the components of the headset 100 in example locations on the headset 100, the components may be located elsewhere on the headset 100, on a peripheral device paired with the headset 100, or some combination thereof. Similarly, any or all of the components may be embedded, or partially embedded, within the headset and not visible to a user.
The headset 100 may correct or enhance the vision of a user, protect the eye of a user, or provide images to a user. The headset 100 may be eyeglasses which correct for defects in a user’s eyesight. The headset 100 may be sunglasses which protect a user’s eye from the sun. The headset 100 may be safety glasses which protect a user’s eye from impact. The headset 100 may be a night vision device or infrared goggles to enhance a user’s vision at night. The headset 100 may be a near-eye display that produces artificial reality content for the user. Alternatively, the headset 100 may not include a lens 110 and may be a frame 112 with an audio system that provides audio content (e.g., music, radio, podcasts) to a user.
The lens 110 provides or transmits light to a user wearing the headset 100. The lens 110 may be a prescription lens (e.g., single vision, bifocal and trifocal, or progressive) to help correct for defects in a user’s eyesight. The prescription lens transmits ambient light to the user wearing the headset 100. The transmitted ambient light may be altered by the prescription lens to correct for defects in the user’s eyesight. The lens 110 may be a polarized lens or a tinted lens to protect the user’s eyes from the sun. The lens 110 may be one or more waveguides as part of a waveguide display in which image light is coupled through an end or edge of the waveguide to the eye of the user. The lens 110 may include an electronic display for providing image light and may also include an optics block for magnifying image light from the electronic display. Additional detail regarding the lens 110 is discussed with regards to FIG. 5.
In some embodiments, the headset 100 may include a depth camera assembly (DCA) (not shown) that captures data describing depth information for a local area surrounding the headset 100. In some embodiments, the DCA may include a light projector (e.g., structured light and/or flash illumination for time-of-flight), an imaging device, and a controller. The captured data may be images captured by the imaging device of light projected onto the local area by the light projector. In one embodiment, the DCA may include two or more cameras that are oriented to capture portions of the local area in stereo and a controller. The captured data may be images captured by the two or more cameras of the local area in stereo. The controller computes the depth information of the local area using the captured data and depth determination techniques (e.g., structured light, time-of-flight, stereo imaging, etc.). Based on the depth information, the controller determines absolute positional information of the headset 100 within the local area. The DCA may be integrated with the headset 100 or may be positioned within the local area external to the headset 100. In the latter embodiment, the controller of the DCA may transmit the depth information to the controller 116 of the headset 100. In addition, the sensor device 114 generates one or more measurement signals in response to motion of the headset 100. The sensor device 114 may be located on a portion of the frame 112 of the headset 100. Additional detail regarding the depth camera assembly is discussed with regards to FIG. 5.
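As a worked example of one of the listed techniques, stereo imaging recovers depth from the pixel disparity between the two cameras using the pinhole relation depth = focal_length * baseline / disparity. The sketch below applies that relation; the intrinsics are made-up values, and the disparity map itself would come from a stereo-matching step not shown here.

```python
import numpy as np

def disparity_to_depth(disparity_px: np.ndarray,
                       focal_length_px: float = 500.0,  # assumed intrinsics
                       baseline_m: float = 0.06) -> np.ndarray:  # assumed camera spacing
    """Per-pixel depth in meters from a stereo disparity map in pixels."""
    disparity_px = np.maximum(disparity_px, 1e-6)  # avoid division by zero
    return focal_length_px * baseline_m / disparity_px
```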
The sensor device 114 may include a position sensor, an inertial measurement unit (IMU), or both. Some embodiments of the headset 100 may or may not include the sensor device 114 or may include more than one sensor device 114. In embodiments in which the sensor device 114 includes an IMU, the IMU generates IMU data based on measurement signals from the sensor device 114. Examples of sensor devices 114 include: one or more accelerometers, one or more gyroscopes, one or more magnetometers, another suitable type of sensor that detects motion, a type of sensor used for error correction of the IMU, or some combination thereof. The sensor device 114 may be located external to the IMU, internal to the IMU, or some combination thereof.
Based on the one or more measurement signals, the sensor device 114 estimates a current position of the headset 100 relative to an initial position of the headset 100. The initial position may be the position of the headset 100 when the headset 100 is initialized in a local area. The estimated position may include a location of the headset 100 and/or an orientation of the headset 100 or the user’s head wearing the headset 100, or some combination thereof. The orientation may correspond to a position of each ear relative to the reference point. In some embodiments, the sensor device 114 uses the depth information and/or the absolute positional information from a DCA to estimate the current position of the headset 100. The sensor device 114 may include multiple accelerometers to measure translational motion (e.g., forward/back, up/down, left/right) and multiple gyroscopes to measure rotational motion (e.g., pitch, yaw, roll). In some embodiments, an IMU rapidly samples the measurement signals and calculates the estimated position of the headset 100 from the sampled data. For example, the IMU integrates the measurement signals received from the accelerometers over time to estimate a velocity vector and integrates the velocity vector over time to determine an estimated position of a reference point on the headset 100. The reference point is a point that may be used to describe the position of the headset 100. While the reference point may generally be defined as a point in space, in practice the reference point is defined as a point within the headset 100.
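The double integration described in the example above can be sketched as follows, assuming the accelerations have already been rotated into the world frame with gravity removed, and ignoring the drift correction a real IMU pipeline would apply.

```python
import numpy as np

def integrate_imu(accel_world: np.ndarray, dt: float,
                  v0: np.ndarray, p0: np.ndarray):
    """accel_world: (num_samples, 3) accelerations in the world frame.
    Returns per-sample velocity and position estimates of the reference
    point via cumulative sums (a discrete double integration)."""
    velocity = v0 + np.cumsum(accel_world * dt, axis=0)
    position = p0 + np.cumsum(velocity * dt, axis=0)
    return velocity, position
```

Because such estimates drift with time, the depth information and absolute positional information from the DCA, mentioned above, can be used to correct them.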
As previously described, the audio system generates a modified audio experience that reduces the degradation of a target audio experience by compensating for sound waves received from non-target audio sources. In the illustrated example, the audio system comprises an acoustic sensor array, a controller 116, and a playback device array. However, in other embodiments, the audio system may include different and/or additional components. Similarly, in some cases, functionality described with reference to the components of the audio system can be distributed among the components in a different manner than is described here. For example, some or all of the functions of the controller 116 may be performed by a remote server.