Patent: Spatial audio for interactive audio environments
Publication Number: 20250203313
Publication Date: 2025-06-19
Assignee: Magic Leap
Abstract
Systems and methods of presenting an output audio signal to a listener located at a first location in a virtual environment are disclosed. According to embodiments of a method, an input audio signal is received. For each sound source of a plurality of sound sources in the virtual environment, a respective first intermediate audio signal corresponding to the input audio signal is determined, based on a location of the respective sound source in the virtual environment, and the respective first intermediate audio signal is associated with a first bus. For each of the sound sources of the plurality of sound sources in the virtual environment, a respective second intermediate audio signal is determined. The respective second intermediate audio signal corresponds to a reflection of the input audio signal in a surface of the virtual environment. The respective second intermediate audio signal is determined based on a location of the respective sound source, and further based on an acoustic property of the virtual environment. The respective second intermediate audio signal is associated with a second bus. The output audio signal is presented to the listener via the first bus and the second bus.
Claims
What is claimed is:
1. A method comprising:
determining an intermediate audio signal, the intermediate audio signal corresponding to a reflection of an input audio signal by a surface of a virtual environment, wherein:
said determining the intermediate audio signal comprises encoding the input audio signal based on a location of a listener,
the location of the listener is determined via one or more sensors of a wearable head device, and
the intermediate audio signal is associated with a first plurality of channels;
associating the intermediate audio signal with a bus, wherein the bus is associated with the first plurality of channels; and
presenting, via the bus, an output audio signal to the listener, wherein:
the output audio signal is determined via decoding the intermediate audio signal to produce a decoded intermediate audio signal, and
the decoded intermediate audio signal is associated with a second plurality of channels, different from the first plurality of channels.
2. The method of claim 1, wherein the one or more sensors comprise one or more microphones.
3. The method of claim 1, wherein the one or more sensors comprise one or more cameras.
4. The method of claim 1, wherein the output audio signal is presented to the listener via one or more speakers associated with the wearable head device.
5. The method of claim 1, further comprising displaying to the listener, concurrently with the presentation of the output audio signal, a view of the virtual environment.
6. The method of claim 1, wherein the intermediate audio signal is determined based on an acoustic property of the virtual environment, the acoustic property determined via the one or more sensors.
7. The method of claim 6, further comprising retrieving the acoustic property from a database.
8. The method of claim 7, wherein said retrieving the acoustic property comprises: identifying the acoustic property based on the location of the listener.
9. The method of claim 6, wherein the acoustic property is determined via a first device, and the intermediate audio signal is determined via a second device different from the first device.
10. The method of claim 1, wherein the intermediate audio signal is decoded via an ambisonics decoder.
11. The method of claim 1, wherein: the virtual environment is part of a mixed reality environment, the surface of the virtual environment comprises a surface of the mixed reality environment, the intermediate audio signal is associated with a sound source, and a location of the sound source comprises a location of the mixed reality environment.
12. A system, comprising:
one or more speakers;
one or more sensors; and
one or more processors configured to perform a method comprising:
determining an intermediate audio signal, the intermediate audio signal corresponding to a reflection of an input audio signal by a surface of a virtual environment, wherein:
said determining the intermediate audio signal comprises encoding the input audio signal based on a location of a listener,
the location of the listener is determined via the one or more sensors, and
the intermediate audio signal is associated with a first plurality of channels;
associating the intermediate audio signal with a bus, wherein the bus is associated with the first plurality of channels; and
presenting, via the bus and the one or more speakers, an output audio signal to the listener, wherein:
the output audio signal is determined via decoding the intermediate audio signal to produce a decoded intermediate audio signal, and
the decoded intermediate audio signal is associated with a second plurality of channels, different from the first plurality of channels.
13. The system of claim 12, wherein the one or more sensors comprise one or more microphones.
14. The system of claim 12, wherein the one or more sensors comprise one or more cameras.
15. The system of claim 12, wherein the one or more sensors and the one or more speakers are associated with a wearable head device configured to be worn by the listener.
16. The system of claim 12, further comprising a display, wherein the method further comprises displaying to the listener, concurrently with the presentation of the output audio signal, a view of the virtual environment.
17. The system of claim 12, wherein the intermediate audio signal is determined based on an acoustic property of the virtual environment, the acoustic property determined via the one or more sensors.
18. The system of claim 17, wherein the acoustic property is determined via a first device, and the intermediate audio signal is determined via a second device different from the first device.
19. The system of claim 12, wherein: the virtual environment is part of a mixed reality environment, the surface of the virtual environment comprises a surface of the mixed reality environment, the intermediate audio signal is associated with a sound source, and a location of the sound source comprises a location of the mixed reality environment.
20. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform a method comprising:
determining an intermediate audio signal, the intermediate audio signal corresponding to a reflection of an input audio signal by a surface of a virtual environment, wherein:
said determining the intermediate audio signal comprises encoding the input audio signal based on a location of a listener,
the location of the listener is determined via one or more sensors of a wearable head device, and
the intermediate audio signal is associated with a first plurality of channels;
associating the intermediate audio signal with a bus, wherein the bus is associated with the first plurality of channels; and
presenting, via the bus, an output audio signal to the listener, wherein:
the output audio signal is determined via decoding the intermediate audio signal to produce a decoded intermediate audio signal, and
the decoded intermediate audio signal is associated with a second plurality of channels, different from the first plurality of channels.
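Claims 1, 10, 12, and 20 describe encoding a reflection of the input signal into a first plurality of channels based on the listener's location, carrying those channels on a bus, and decoding them into a different, second plurality of channels (claim 10 names an ambisonics decoder). The following is a minimal sketch of that encode/decode flow, assuming first-order ambisonics (four channels) decoded to two output channels with a simple virtual-microphone decoder; the function names, coefficients, and channel counts are illustrative and are not taken from the patent.

```python
import numpy as np

def encode_foa(mono: np.ndarray, azimuth: float, elevation: float) -> np.ndarray:
    """Encode a mono signal into first-order ambisonics (W, X, Y, Z), i.e. a
    'first plurality of channels', given the direction of the reflection
    relative to the listener's head pose."""
    w = mono / np.sqrt(2.0)                          # omnidirectional component
    x = mono * np.cos(azimuth) * np.cos(elevation)
    y = mono * np.sin(azimuth) * np.cos(elevation)
    z = mono * np.sin(elevation)
    return np.stack([w, x, y, z])                    # carried on the bus

def decode_to_stereo(foa: np.ndarray) -> np.ndarray:
    """Decode the 4-channel bus to 2 output channels (a 'second plurality of
    channels') using virtual cardioid microphones aimed at +/-90 degrees."""
    w, x, y, _ = foa
    left = 0.5 * (np.sqrt(2.0) * w + y)
    right = 0.5 * (np.sqrt(2.0) * w - y)
    return np.stack([left, right])

# Example: a short 1 kHz reflection arriving from 45 degrees to the listener's left.
fs = 48000
t = np.arange(fs // 10) / fs
reflection = 0.3 * np.sin(2 * np.pi * 1000.0 * t)
bus_signal = encode_foa(reflection, azimuth=np.deg2rad(45.0), elevation=0.0)
output = decode_to_stereo(bus_signal)                # shape (2, len(reflection))
```

Note that the number of bus channels stays fixed regardless of the output format: the same four-channel signal could instead be decoded to a loudspeaker array or to binaural headphone feeds, which is the practical motivation for decoupling the first and second pluralities of channels.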
Description
CROSS-REFERENCE TO RELATED APPLICATION
This application is a continuation of U.S. Non-Provisional application Ser. No. 18/461,289, filed Sep. 5, 2023, which is a continuation of U.S. Non-Provisional application Ser. No. 17/092,060, filed Nov. 6, 2020, now U.S. Pat. No. 11,925,598, which is a continuation of U.S. Non-Provisional application Ser. No. 16/445,171, filed on Jun. 18, 2019, now U.S. Pat. No. 10,863,300, which claims priority to U.S. Provisional Application No. 62/686,655, filed on Jun. 18, 2018, the contents of which are incorporated by reference herein in their entirety. This application additionally claims priority to U.S. Provisional Application No. 62/686,665, filed on Jun. 18, 2018, the contents of which are incorporated by reference herein in their entirety.
FIELD
This disclosure generally relates to spatial audio rendering, and specifically relates to spatial audio rendering for virtual sound sources in a virtual acoustic environment.
BACKGROUND
Virtual environments are ubiquitous in computing environments, finding use in video games (in which a virtual environment may represent a game world); maps (in which a virtual environment may represent terrain to be navigated); simulations (in which a virtual environment may simulate a real environment); digital storytelling (in which virtual characters may interact with each other in a virtual environment); and many other applications. Modern computer users are generally comfortable perceiving, and interacting with, virtual environments. However, users' experiences with virtual environments can be limited by the technology for presenting virtual environments. For example, conventional displays (e.g., 2D display screens) and audio systems (e.g., fixed speakers) may be unable to realize a virtual environment in ways that create a compelling, realistic, and immersive experience.
Virtual reality (“VR”), augmented reality (“AR”), mixed reality (“MR”), and related technologies (collectively, “XR”) share an ability to present, to a user of an XR system, sensory information corresponding to a virtual environment represented by data in a computer system. Such systems can offer a uniquely heightened sense of immersion and realism by combining virtual visual and audio cues with real sights and sounds. Accordingly, it can be desirable to present digital sounds to a user of an XR system in such a way that the sounds seem to be occurring—naturally, and consistently with the user's expectations of the sound—in the user's real environment. Generally speaking, users expect that virtual sounds will take on the acoustic properties of the real environment in which they are heard. For instance, a user of an XR system in a large concert hall will expect the virtual sounds of the XR system to have large, cavernous sonic qualities; conversely, a user in a small apartment will expect the sounds to be more dampened, close, and immediate.
Digital, or artificial, reverberators may be used in audio and music signal processing to simulate perceived effects of diffuse acoustic reverberation in rooms. In XR environments, it is desirable to use digital reverberators to realistically simulate the acoustic properties of rooms in the XR environment. Convincing simulations of such acoustic properties can lend feelings of authenticity and immersion to the XR environment.
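By way of illustration only, the sketch below is a classic Schroeder reverberator (parallel feedback comb filters followed by series allpass filters), one of the simplest digital reverberator structures of the kind the preceding paragraph refers to. It is a generic textbook design, not the algorithm disclosed in this application, and the delay lengths and gains are arbitrary illustrative values.

```python
import numpy as np

def feedback_comb(x: np.ndarray, delay: int, gain: float) -> np.ndarray:
    """y[n] = x[n] + gain * y[n - delay]; the recirculating delay builds the echo tail."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] + (gain * y[n - delay] if n >= delay else 0.0)
    return y

def allpass(x: np.ndarray, delay: int, gain: float) -> np.ndarray:
    """Schroeder allpass section: increases echo density without coloring the spectrum."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        delayed_x = x[n - delay] if n >= delay else 0.0
        delayed_y = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + delayed_x + gain * delayed_y
    return y

def schroeder_reverb(x: np.ndarray) -> np.ndarray:
    """Four parallel combs (diffuse decay) feeding two series allpasses (density)."""
    combs = [(1557, 0.84), (1617, 0.83), (1491, 0.82), (1422, 0.81)]
    wet = sum(feedback_comb(x, d, g) for d, g in combs) / len(combs)
    for d, g in [(225, 0.7), (556, 0.7)]:
        wet = allpass(wet, d, g)
    return wet
```

In practice the comb feedback gains would be derived from the room's measured or estimated reverberation time rather than fixed constants, which is how a reverberator of this kind can be matched to the acoustic properties of a real or virtual room.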
BRIEF SUMMARY
Systems and methods of presenting an output audio signal to a listener located at a first location in a virtual environment are disclosed. According to embodiments of a method, an input audio signal is received. For each sound source of a plurality of sound sources in the virtual environment, a respective first intermediate audio signal corresponding to the input audio signal is determined, based on a location of the respective sound source in the virtual environment, and the respective first intermediate audio signal is associated with a first bus. For each of the sound sources of the plurality of sound sources in the virtual environment, a respective second intermediate audio signal is determined. The respective second intermediate audio signal corresponds to a reflection of the input audio signal in a surface of the virtual environment. The respective second intermediate audio signal is determined based on a location of the respective sound source, and further based on an acoustic property of the virtual environment. The respective second intermediate audio signal is associated with a second bus. The output audio signal is presented to the listener via the first bus and the second bus.
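To make the two-bus routing in the summary concrete, the sketch below accumulates each source's panned dry contribution onto a direct bus and each source's reflection send, scaled by an acoustic property of the virtual room, onto a second bus, then mixes both buses for the listener. This is a simplified sketch under assumed conventions: the function name, the distance and pan models, and the single reflection-gain parameter are invented for illustration and do not represent the claimed rendering pipeline.

```python
import numpy as np

def render_scene(sources, listener_pos, room_reflection_gain):
    """sources: list of (mono_signal, position) pairs; positions are 3-D numpy arrays.
    Returns a stereo mix formed from a direct bus and a reflections bus."""
    length = max(len(sig) for sig, _ in sources)
    direct_bus = np.zeros((2, length))       # first bus: per-source panned dry signal
    reflections_bus = np.zeros((2, length))  # second bus: per-source reflection sends

    for signal, position in sources:
        distance = np.linalg.norm(position - listener_pos)
        direct_gain = 1.0 / max(distance, 1.0)            # simple distance attenuation
        # Placeholder equal-power pan from the source's left/right offset.
        pan = np.clip((position[0] - listener_pos[0]) / max(distance, 1e-6), -1.0, 1.0)
        angle = (pan + 1.0) * np.pi / 4.0
        gains = np.array([np.cos(angle), np.sin(angle)])  # [left, right]

        for ch in range(2):
            direct_bus[ch, :len(signal)] += direct_gain * gains[ch] * signal
            reflections_bus[ch, :len(signal)] += room_reflection_gain * gains[ch] * signal

    # A full renderer would run the reflections bus through a reflections/reverb
    # processor (such as the reverberator sketched in the Background) before summing.
    return direct_bus + reflections_bus
```

Keeping the reflection sends on their own bus means the room processing runs once for the whole mix rather than once per source, which is the main efficiency argument for this kind of architecture.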
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates an example wearable system, according to some embodiments.
FIG. 2 illustrates an example handheld controller that can be used in conjunction with an example wearable system, according to some embodiments.
FIG. 3 illustrates an example auxiliary unit that can be used in conjunction with an example wearable system, according to some embodiments.
FIG. 4 illustrates an example functional block diagram for an example wearable system, according to some embodiments.
FIG. 5 illustrates an example geometrical room representation, according to some embodiments.
FIG. 6 illustrates an example model of a room response measured from a source to a listener in a room, according to some embodiments.
FIG. 7 illustrates example factors affecting a user's perception of direct sounds, reflections, and reverberations, according to some embodiments.
FIG. 8 illustrates an example audio mixing architecture for rendering multiple virtual sound sources in a virtual room, according to some embodiments.
FIG. 9 illustrates an example audio mixing architecture for rendering multiple virtual sound sources in a virtual room, according to some embodiments.
FIG. 10 illustrates an example per-source processing module, according to some embodiments.
FIG. 11 illustrates an example per-source reflections pan module, according to some embodiments.
FIG. 12 illustrates an example room processing algorithm, according to some embodiments.
FIG. 13 illustrates an example reflections module, according to some embodiments.
FIG. 14 illustrates an example spatial distribution of apparent directions of arrival of reflections, according to some embodiments.
FIG. 15 illustrates examples of direct gain, reflections gain, and reverberation gain as functions of distance, according to some embodiments.
FIG. 16 illustrates example relationships between distance and spatial focus, according to some embodiments.
FIG. 17 illustrates example relationships between time and signal amplitude, according to some embodiments.
FIG. 18 illustrates an example system for processing spatial audio, according to some embodiments.