Patent: Techniques for avoiding negative audio performance variations in extended-reality devices, and mixed-reality systems and methods of using these techniques

Publication Number: 20250223976

Publication Date: 2025-07-10

Assignee: Meta Platforms Technologies

Abstract

An example MR headset includes a housing with electronic components for presenting MR content, where the housing is configured to attach to a strap, thereby forming a cavity, and defines an aperture including a first opening on a first side of the MR headset and a second opening on a second side of the MR headset. The MR headset includes a speaker within the cavity positioned to minimize intermodulation of the speaker within the cavity. The housing includes a fan to cool the electronic components while minimizing noise interference. Exhaust from the fan forms a fluidic channel across the aperture, and an inner surface of the fluidic channel defines a set of perforations. The MR headset includes an expansion chamber surrounding the inner surface of the fluidic channel, the expansion chamber having a back volume that causes a bias flow across the perforations. The MR headset includes microphones distributed along an outer surface of the housing for detecting audio content from a user, and each respective microphone is separated from the first and second openings by at least an audial-interference threshold distance.

Claims

What is claimed is:

1. A mixed-reality (MR) headset, comprising:
a housing comprising one or more electronic components, the one or more electronic components configured to be used in presentation of MR content including audio content, wherein (i) the housing is configured to attach to a strap thereby forming a cavity, and (ii) the housing defines an aperture, including a first opening on a first side of the MR headset and a second opening on a second side of the MR headset;
a speaker housed within the cavity, the speaker being positioned adjacent to a foam insert for minimizing intermodulation of the speaker within the cavity during presentation of the audio content;
a fan housed in the housing, the fan configured to cool the one or more electronic components, and minimize noise interference from operation of the fan with the presentation of the audio content, wherein:
the first opening of the aperture is adjacent to an exhausting side of the fan, such that exhaust from the fan causes a grazing flow to enter the first opening and exit the second opening, thereby forming a fluidic channel across the aperture, and
an inner surface of the fluidic channel defines a set of perforations, wherein the set of perforations is configured to receive acoustic waves associated with resonant frequencies of the fan;
an expansion chamber that surrounds the inner surface of the fluidic channel, the expansion chamber having a back volume that causes a bias flow across the set of perforations of the inner surface of the fluidic channel; and
a set of microphones distributed along an outer surface of the housing, the set of microphones configured to detect audio content from a user of the MR headset as part of presenting the MR content, wherein each respective microphone of the set of microphones is separated from the first opening and the second opening of the aperture defined in the housing by at least an audial-interference threshold distance.

2. The MR headset of claim 1, wherein:
each respective perforation of the set of perforations is configured with a predetermined diameter corresponding to a resonant frequency of the acoustic waves produced by operations of the fan.

3. The MR headset of claim 2, wherein:
the set of perforations is a first set of perforations,
the predetermined diameter is a first predetermined diameter, and
the inner surface of the fluidic channel defines a second set of perforations having a second predetermined diameter corresponding to another resonant frequency of the acoustic waves produced by operations of the fan.

4. The MR headset of claim 3, wherein:
the first set of perforations has a first pitch, such that the first predetermined diameter and the first pitch are tuned to remove resonant acoustic waves having a first frequency, and
the second set of perforations has a second pitch, such that the second predetermined diameter and the second pitch are tuned to remove resonant acoustic waves having a second frequency.

5. The MR headset of claim 1, wherein:
the fluidic channel has a rectangular profile having a first dimension spanning a direction parallel to a length of the housing, and a second dimension corresponding to a depth of the housing, and
the first dimension is at least double the length of the second dimension.

6. The MR headset of claim 5, wherein:
the second dimension is configured based on a calculated Stokes layer of the grazing flow determined based in part on a respective size and respective pitch of each respective perforation of the set of perforations.

7. The MR headset of claim 5, wherein the fluidic channel has a flared profile such that an outlet of the fluidic channel has a greater surface area than an inlet of the fluidic channel.

8. The MR headset of claim 7, wherein the outlet comprises two spaced openings comprising the greater surface area of the outlet of the fluidic channel.

9. The MR headset of claim 8, wherein a portion of the expansion chamber is positioned between the two spaced openings.

10. The MR headset of claim 1, wherein the set of microphones is configured in an end-fire array configuration configured to cancel external noises in front of the user from being detected by the microphones.

11. The MR headset of claim 1, wherein the set of microphones is configured in a broadside array configuration such that two respective microphones on each side of the MR headset are symmetrical along a plane defined by the outer surface of the MR headset.

12. The MR headset of claim 1, wherein the set of perforations defined by the fluidic channel are configured to reduce resonant noise caused by the fan below a value of 0 decibels of sound pressure level (SPL) for at least one range of frequencies.

13. The MR headset of claim 1, wherein the inner surface of the fluidic channel comprises at least five micro-perforated panels (MPPs) comprising the respective perforations of the set of perforations.

14. The MR headset of claim 1, wherein the inner surface of the fluidic channel is comprised of an aluminum sheet having a thickness of between 0.1 and 0.5 millimeters.

15. The MR headset of claim 1, wherein the inner surface of the fluidic channel further comprises a mesh having an acoustic impedance of 10 pascal-seconds per meter.

16. The MR headset of claim 1, wherein the back volume of the expansion chamber is between 2000 and 4000 cubic millimeters.

17. The MR headset of claim 1, wherein the expansion chamber has a length of at least 20 millimeters.

18. A system, comprising:
an MR headset, including:
a housing comprising one or more electronic components, the one or more electronic components configured to be used in presentation of MR content including audio content, wherein (i) the housing is configured to attach to a strap thereby forming a cavity, and (ii) the housing defines an aperture, including a first opening on a first side of the MR headset and a second opening on a second side of the MR headset;
a speaker housed within the cavity, the speaker being positioned adjacent to a foam insert for minimizing intermodulation of the speaker within the cavity during presentation of the audio content;
a fan housed in the housing, the fan configured to cool the one or more electronic components, and minimize noise interference from operation of the fan with the presentation of the audio content, wherein:
the first opening of the aperture is adjacent to an exhausting side of the fan, such that exhaust from the fan causes a grazing flow to enter the first opening and exit the second opening, thereby forming a fluidic channel across the aperture, and
an inner surface of the fluidic channel defines a set of perforations, wherein the set of perforations is configured to receive acoustic waves associated with resonant frequencies of the fan;
an expansion chamber that surrounds an inner surface of the fluidic channel, the expansion chamber having a back volume that causes a bias flow across the set of perforations of the inner surface of the fluidic channel; and
a set of microphones distributed along an outer surface of the housing, the set of microphones configured to detect audio content from a user of the MR headset as part of presenting the MR content, wherein each respective microphone of the set of microphones is separated from the first opening and the second opening of the aperture defined in the housing by at least an audial-interference threshold distance.

19. A housing of an MR headset, comprising:
one or more electronic components, the one or more electronic components configured to be used in presentation of MR content including audio content, wherein (i) the housing is configured to attach to a strap thereby forming a cavity, and (ii) the housing defines an aperture, including a first opening on a first side of the MR headset and a second opening on a second side of the MR headset;
a speaker housed within the cavity, the speaker being positioned adjacent to a foam insert for minimizing intermodulation of the speaker within the cavity during presentation of the audio content;
a fan housed in the housing, the fan configured to cool the one or more electronic components, and minimize noise interference from operation of the fan with the presentation of the audio content, wherein:
the first opening of the aperture is adjacent to an exhausting side of the fan, such that exhaust from the fan causes a grazing flow to enter the first opening and exit the second opening, thereby forming a fluidic channel across the aperture, and
an inner surface of the fluidic channel defines a set of perforations, wherein the set of perforations is configured to receive acoustic waves associated with resonant frequencies of the fan;
an expansion chamber that surrounds the inner surface of the fluidic channel, the expansion chamber having a back volume that causes a bias flow across the set of perforations of the inner surface of the fluidic channel; and
a set of microphones distributed along an outer surface of the housing, the set of microphones configured to detect audio content from a user of the MR headset as part of presenting the MR content, wherein each respective microphone of the set of microphones is separated from the first opening and the second opening of the aperture defined in the housing by at least an audial-interference threshold distance.

20. The housing of claim 19, wherein:
each respective perforation of the set of perforations is configured with a predetermined diameter corresponding to a resonant frequency of the acoustic waves produced by operations of the fan.

Description

RELATED APPLICATIONS

This application claims priority to U.S. Prov. App. No. 63/618,853, filed on Jan. 8, 2024, and titled “Arrangements of Imaging and Illumination Sensors for Extended-reality Headset, Physical Button for Passthrough Mode, Drop Protection and Audio Improvements for the Headset, and Systems and Methods of Use Thereof,” which is hereby incorporated by reference in its entirety.

This application also relates to U.S. application Ser. No. 18/774,858, filed on Jul. 16, 2024, and titled “Techniques for Using Floodlight LEDs When Imaging Sensors Have an Insufficient Level of Detail for Identifying Hand Gestures, and Mixed-Reality Systems and Methods of Using These Techniques,” and U.S. application Ser. No. 18/782,385, filed on Jul. 24, 2024, and titled “Techniques for Guiding Perspiration to Desired Channels to Avoid Negative Impacts to Electrical and Mechanical Functions of Extended-Reality Devices, and Systems and Methods of use thereof,” which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD

This relates generally to a mixed-reality (MR) headset and components thereof, including but not limited to techniques for optimizing audial interactions with MR environments presented by the MR headset.

BACKGROUND

MR headsets can be capable of presenting immersive and engagingly interactive MR content to users. Such presentation techniques provide new opportunities as well as new challenges, particularly since such content lends itself to different interaction types than are used for more conventional digital content (e.g., desktop computer graphics or smartphone content). Further, such interactions may require that particular conditions (e.g., lighting conditions) be present in order to be detected with a sufficient level of accuracy for engaging with the MR content. Additionally, such interactions may be deleteriously impacted by audial performance (e.g., input interactions at a microphone, and/or output interactions at a speaker) that disrupts and/or otherwise impacts the immersion provided by the presentation of the MR content.

Such effects can be exacerbated by the need for forced cooling to support proper operation of the electronic components within the MR headset (e.g., to dissipate the byproduct of the power consumption needed to deliver the experience). Further, functional requirements, such as a strap-fastening opening on the MR headset, can cause negative effects to the audio (e.g., to audio output by a speaker located near the strap-fastening opening).

As such, there is a need to address one or more of the above-identified challenges. A brief summary of solutions to the issues noted above is found below.

SUMMARY

The embodiments described herein include floodlight-emitting diodes (flood LEDs) which may be mounted or otherwise integrated with an MR headset, and which are able to illuminate a volume of physical space where a user is likely to perform interactions for performing operations within an MR environment being presented by the MR headset.

In an example embodiment, an MR headset is provided. The MR headset includes a housing comprising one or more electronic components, the one or more electronic components configured to be used in presentation of MR content including audio content, wherein (i) the housing is configured to attach to a strap thereby forming a cavity, and (ii) the housing defines an aperture, including a first opening on a first side of the MR headset and a second opening on a second side of the MR headset. The MR headset includes a speaker housed within the cavity, the speaker being positioned adjacent to a foam insert for minimizing (e.g., attenuating, reducing) intermodulation of the speaker within the cavity during presentation of the audio content. The MR headset includes a fan housed in the housing, the fan configured to cool the one or more electronic components, and minimize noise interference from operation of the fan with the presentation of the audio content, where (i) the first opening of the aperture is adjacent to an exhausting side of the fan, such that exhaust from the fan causes a grazing flow to enter the first opening and exit the second opening, thereby forming a fluidic channel across the aperture, and (ii) an inner surface of the fluidic channel defines a set of perforations, wherein the set of perforations is configured to receive acoustic waves associated with resonant frequencies of the fan. The MR headset includes an expansion chamber that surrounds the inner surface of the fluidic channel, the expansion chamber having a back volume that causes a bias flow across the set of perforations of the inner surface of the fluidic channel. And the MR headset includes a set of microphones distributed along an outer surface of the housing, the set of microphones configured to detect audio content from a user of the MR headset as part of presenting the MR content, wherein each respective microphone of the set of microphones is separated from the first opening and the second opening of the aperture defined in the housing by at least an audial-interference threshold distance.

The devices and/or systems described herein can be configured to include instructions that cause the performance of methods and operations associated with the presentation and/or interaction with an extended reality. These methods and operations can be stored on a non-transitory, computer-readable storage medium of a device or a system. It is also noted that the devices and systems described herein can be part of an overarching system that includes multiple devices. A non-exhaustive list of electronic devices that, either alone or in combination (e.g., a system), can include instructions that cause performance of methods and operations associated with the presentation and/or interaction with an extended reality include: an extended-reality headset (e.g., an MR headset or an augmented-reality (AR) headset as two examples), a wrist-wearable device, an intermediary processing device, a smart textile-based garment, etc. For example, when an XR headset is described, it is understood that the XR headset can be in communication with one or more other devices (e.g., a wrist-wearable device, a server, intermediary processing device, etc.) which together can include instructions for performing methods and operations associated with the presentation and/or interaction with an extended-reality headset (i.e., the XR headset would be part of a system that includes one or more additional devices). Multiple combinations with different related devices are envisioned, but for the sake of brevity they are not recited herein.

The features and advantages described in the specification are not necessarily all-inclusive and, in particular, certain additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes.

Having summarized the above example aspects, a brief description of the drawings will now be presented.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the various described embodiments, reference should be made to the Detailed Description below in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.

FIGS. 1A to 1D illustrate an example head-wearable device, in accordance with some embodiments.

FIGS. 2A to 2E illustrate example aspects of microphone configurations of a head-wearable device, in accordance with some embodiments.

FIG. 3 illustrates an audio module of a head-wearable device, in accordance with some embodiments.

FIGS. 4A to 4D illustrate an exhaust system of a head-wearable device, in accordance with some embodiments.

FIGS. 5A, 5B, 5C-1, and 5C-2 illustrate example MR and AR systems, in accordance with some embodiments.

In accordance with customary practice, the various features illustrated in the drawings may not be drawn to scale. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may not depict all of the components of a given system, method, or device. Finally, like reference numerals may be used to denote like features throughout the specification and figures.

DETAILED DESCRIPTION

Numerous details are described herein to provide a thorough understanding of the example embodiments illustrated in the accompanying drawings. However, some embodiments may be practiced without many of the specific details, and the scope of the claims is only limited by those features and aspects specifically recited in the claims. Furthermore, well-known processes, components, and materials have not necessarily been described in exhaustive detail so as to avoid obscuring pertinent aspects of the embodiments described herein.

Embodiments of this disclosure can include or be implemented in conjunction with distinct types of extended realities (XR), such as MR and AR systems. Mixed realities and augmented realities, as described herein, are any superimposed functionality and/or sensory-detectable presentation provided by an MR or AR system within a user's physical surroundings. Such mixed realities can include and/or represent virtual realities, including virtual realities in which at least some aspects of the surrounding environment are reconstructed within the virtual environment (e.g., displaying virtual reconstructions of physical objects in a physical environment to avoid the user colliding with the physical objects in a surrounding physical environment). In the case of mixed realities, the surrounding environment that is presented via a display is captured via one or more sensors configured to capture the surrounding environment (e.g., a camera sensor, time-of-flight (ToF) sensor). While a wearer of an MR headset can see the surrounding environment in full detail, in some embodiments the wearer is seeing a reconstruction of the environment reproduced using data from the one or more sensors (i.e., the physical objects are not directly viewed by the user).

An MR headset can also forgo displaying reconstructions of objects in the physical environment, thereby providing a user with an entirely virtual-reality (VR) experience. An AR system, on the other hand, provides an experience in which information is provided, e.g., through the use of a waveguide, in conjunction with the direct viewing of at least some of the surrounding environment through the transparent or semi-transparent waveguide(s) and/or lens(es) of the AR headset. Throughout this application, the term “XR” is used to cover both augmented realities and mixed realities. In addition, this application also uses, at times, “head-wearable device” or “headset device” to describe headsets such as AR headsets and MR headsets.

As alluded to above, an MR environment, as described herein, can include, but is not limited to, non-immersive, semi-immersive, and fully immersive VR environments. As also alluded to above, AR environments can include marker-based augmented-reality environments, markerless augmented-reality environments, location-based augmented-reality environments, and projection-based augmented-reality environments. The above descriptions are not exhaustive; any other environment that allows for intentional environmental lighting to pass through to the user would fall within the scope of augmented reality, and any other environment that does not allow for intentional environmental lighting to pass through to the user would fall within the scope of mixed reality.

The AR and MR content can include video, audio, haptic events, or some combination thereof, any of which can be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to a viewer). Additionally, AR and MR can be associated with applications, products, accessories, services, or some combination thereof which are used, for example, to create content in an AR or MR environment and/or are otherwise used in (e.g., to perform activities in) AR and MR environments.

Interacting with the AR and MR environments described herein can occur using multiple different modalities and the resulting outputs can also occur across multiple modalities. In one example AR or MR system, a user can perform a swiping in-air hand gesture to cause a song to be skipped by a song-providing API providing playback at, for example, a home speaker.

A hand gesture, as described herein, can include an in-air gesture, a surface-contact gesture, and/or other gestures that can be detected and determined based on movements of a single hand (e.g., a one-handed gesture performed with a user's hand that is detected by one or more sensors of a wearable device (e.g., electromyography (EMG) and/or inertial measurement units (IMUs) of a wrist-wearable device, and/or one or more sensors included in a smart textile wearable device) and/or detected via image data captured by an imaging device of a wearable device (e.g., a camera of a head-wearable device, an external tracking camera setup in the surrounding environment, etc.)). "In-air" means that the user's hand does not contact a surface, object, or portion of an electronic device (e.g., a head-wearable device or other communicatively coupled device, such as the wrist-wearable device); in other words, the gesture is performed in open air in 3D space and without contacting a surface, an object, or an electronic device. Surface-contact gestures (contacts at a surface, object, body part of the user, or electronic device) more generally are also contemplated, in which a contact (or an intention to contact) is detected at a surface (e.g., a single or double finger tap on a table, on a user's hand or another finger, on the user's leg, a couch, a steering wheel, etc.). The different hand gestures disclosed herein can be detected using image data and/or sensor data (e.g., neuromuscular signals sensed by one or more biopotential sensors (e.g., EMG sensors) or other types of data from other sensors, such as proximity sensors, time-of-flight (ToF) sensors, sensors of an inertial measurement unit (IMU), capacitive sensors, strain sensors, etc.) detected by a wearable device worn by the user and/or other electronic devices in the user's possession (e.g., smartphones, laptops, imaging devices, intermediary devices, and/or other devices described herein).

The input modalities as alluded to above can be varied and dependent on user experience. For example, in an interaction in which a wrist-wearable device is used, a user can provide inputs using in-air or surface-contact gestures that are detected using neuromuscular-signal sensors of the wrist-wearable device. In the event that a wrist-wearable device is not used, alternative and entirely interchangeable input modalities can be used instead, such as camera(s) located on the headset or elsewhere to detect in-air or surface-contact gestures, or inputs at an intermediary processing device (e.g., through physical input components such as buttons and trackpads). These different input modalities can be interchanged based on desired user experiences, portability, and/or a feature set of the product (e.g., a low-cost product may not include hand-tracking cameras).

While the inputs are varied, the resulting outputs stemming from the inputs are also varied. For example, an in-air gesture input detected by a camera of a head-wearable device can cause an output to occur at a head-wearable device or control another electronic device different from the head-wearable device. In another example, an input detected using data from a neuromuscular signal sensor can also cause an output to occur at a head-wearable device or control another electronic device different from the head-wearable device. While only a couple examples are described above, one skilled in the art would understand that different input modalities are interchangeable along with different output modalities in response to the inputs.

Specific operations described above may occur as a result of specific hardware. The devices described are not limiting and features on these devices can be removed or additional features can be added. The different devices can include one or more analogous hardware components. For brevity, only analogous devices and components are described herein. Any differences in the devices and components are described below in their respective sections.

As described herein, a processor (e.g., a central processing unit (CPU) or microcontroller unit (MCU)) is an electronic component that is responsible for executing instructions and controlling the operation of an electronic device (e.g., a wrist-wearable device, a head-wearable device, an HIPD, a smart textile-based garment, or other computer system). There are distinct types of processors that may be used interchangeably or specifically required by embodiments described herein. For example, a processor may be (i) a general processor designed to perform a wide range of tasks, such as running software applications, managing operating systems, and performing arithmetic and logical operations; (ii) a microcontroller designed for specific tasks such as controlling electronic devices, sensors, and motors; (iii) a graphics processing unit (GPU) designed to accelerate the creation and rendering of images, video, and animation (e.g., virtual-reality animations, such as three-dimensional modeling); (iv) a field-programmable gate array (FPGA) that can be programmed and reconfigured after manufacturing and/or customized to perform specific tasks, such as signal processing, cryptography, and machine learning; or (v) a digital signal processor (DSP) designed to perform mathematical operations on signals such as audio, video, and radio waves. One of skill in the art will understand that one or more processors of one or more electronic devices may be used in various embodiments described herein.

As described herein, controllers are electronic components that manage and coordinate the operation of other components within an electronic device (e.g., controlling inputs, processing data, and/or generating outputs). Examples of controllers can include (i) microcontrollers, including small, low-power controllers that are commonly used in embedded systems and Internet of Things (IoT) devices; (ii) programmable logic controllers (PLCs) that may be configured to be used in industrial automation systems to control and monitor manufacturing processes; (iii) system-on-a-chip (SoC) controllers that integrate multiple components such as processors, memory, I/O interfaces, and other peripherals into a single chip; and/or (iv) DSPs. As described herein, a graphics module is a component or software module that is designed to handle graphical operations and/or processes and can include a hardware module and/or a software module.

As described herein, memory refers to electronic components in a computer or electronic device that store data and instructions for the processor to access and manipulate. The devices described herein can include volatile and non-volatile memory. Examples of memory can include (i) random access memory (RAM) such as DRAM, SRAM, DDR RAM, or other random access solid-state memory devices configured to store data and instructions temporarily; (ii) read-only memory (ROM) configured to store data and instructions permanently (e.g., one or more portions of system firmware and/or boot loaders); (iii) flash memory, magnetic disk storage devices, optical disk storage devices, and other non-volatile solid-state storage devices which can be configured to store data in electronic devices (e.g., universal serial bus (USB) drives, memory cards, and/or solid-state drives (SSDs)); and (iv) cache memory configured to temporarily store frequently accessed data and instructions. Memory as described herein can include structured data (e.g., SQL databases, MongoDB databases, GraphQL data, or JSON data). Other examples of memory can include (i) profile data, including user account data, user settings, and/or other user data stored by the user; (ii) sensor data detected and/or otherwise obtained by one or more sensors; (iii) media content data, including stored image data, audio data, documents, and the like; (iv) application data, which can include data collected and/or otherwise obtained and stored during use of an application; and/or any other types of data described herein.

As described herein, a power system of an electronic device is configured to convert incoming electrical power into a form that can be used to operate the device. A power system can include various components, including (i) a power source, which can be an alternating current (AC) adapter or a direct current (DC) adapter power supply; (ii) a charger input that can be configured to use a wired and/or wireless connection (which may be part of a peripheral interface such as a USB, micro-USB interface, near-field magnetic coupling, magnetic inductive and magnetic resonance charging, and/or radio frequency (RF) charging); (iii) a power-management integrated circuit configured to distribute power to various components of the device and ensure that the device operates within safe limits (e.g., regulating voltage, controlling current flow, and/or managing heat dissipation); and/or (iv) a battery configured to store power to provide usable power to components of one or more electronic devices.

As described herein, peripheral interfaces are electronic components (e.g., of electronic devices) that allow electronic devices to communicate with other devices or peripherals and can provide a means for input and output of data and signals. Examples of peripheral interfaces can include (i) USB and/or micro-USB interfaces configured for connecting devices to an electronic device; (ii) Bluetooth interfaces configured to allow devices to communicate with each other, including Bluetooth low-energy (BLE); (iii) near-field communication (NFC) interfaces configured to be short-range wireless interfaces for operations such as access control; (iv) POGO pins, which may be small spring-loaded pins configured to provide a charging interface; (v) wireless charging interfaces; (vi) global-positioning system (GPS) interfaces; (vii) Wi-Fi interfaces for providing a connection between a device and a wireless network; and (viii) sensor interfaces.

As described herein, sensors are electronic components (e.g., inside of and/or otherwise in electronic communication with electronic devices such as wearable devices) configured to detect physical and environmental changes and generate electrical signals. Examples of sensors can include (i) imaging sensors for collecting imaging data (e.g., including one or more cameras disposed on a respective electronic device, such as SLAM cameras); (ii) biopotential-signal sensors; (iii) inertial measurement units (IMUs) for detecting, for example, angular rate, force, magnetic field, and/or changes in acceleration; (iv) heart rate sensors for measuring a user's heart rate; (v) SpO2 sensors for measuring blood oxygen saturation and/or other biometric data of a user; (vi) capacitive sensors for detecting changes in capacitance at a portion of a user's body (e.g., a sensor-skin interface) and/or the proximity of other devices or objects; (vii) sensors for detecting certain inputs (e.g., capacitive and force sensors); and (viii) light sensors (e.g., ToF sensors, infrared light sensors, or visible light sensors) and/or sensors for sensing data from the user or the user's environment. As described herein, biopotential-signal-sensing components are devices used to measure electrical activity within the body (e.g., biopotential-signal sensors). Biopotential-signal sensors include, for example, (i) electroencephalography (EEG) sensors configured to measure electrical activity in the brain to diagnose neurological disorders; (ii) electrocardiography (ECG) sensors configured to measure electrical activity of the heart to diagnose heart problems; (iii) electromyography (EMG) sensors configured to measure the electrical activity of muscles and diagnose neuromuscular disorders; and (iv) electrooculography (EOG) sensors configured to measure the electrical activity of eye muscles to detect eye movement and diagnose eye disorders.

As described herein, an application stored in the memory of an electronic device (e.g., software) includes instructions stored in the memory. Examples of such applications include (i) games, (ii) word processors, (iii) messaging applications, (iv) media-streaming applications, (v) financial applications, (vi) calendars, (vii) clocks, (viii) web browsers, (ix) social media applications, (x) camera applications, (xi) web-based applications, (xii) health applications, (xiii) AR and MR applications, and/or any other applications that can be stored in memory. The applications can operate in conjunction with data and/or one or more components of a device or communicatively coupled devices to perform one or more operations and/or functions.

As described herein, communication interface modules can include hardware and/or software capable of data communications using any of a variety of custom or standard wireless protocols (e.g., IEEE 802.15.4, Wi-Fi, ZigBee, 6LoWPAN, Thread, Z-Wave, Bluetooth Smart, ISA100.11a, WirelessHART, or MiWi), custom or standard wired protocols (e.g., Ethernet or HomePlug), and/or any other suitable communication protocol, including communication protocols not yet developed as of the filing date of this document. A communication interface is a mechanism that enables different systems or devices to exchange information and data with each other, including hardware, software, or a combination of hardware and software. For example, a communication interface can refer to a physical connector and/or port on a device that enables communication with other devices (e.g., USB, Ethernet, HDMI, or Bluetooth). A communication interface can refer to a software layer that enables different software programs to communicate with each other (e.g., application programming interfaces (APIs) and protocols such as HTTP and TCP/IP).

As described herein, non-transitory computer-readable storage media are physical devices or storage media that can be used to store electronic data in a non-transitory form (e.g., such that the data is stored permanently until it is intentionally deleted or modified).

FIGS. 1A-1D illustrate an example MR headset in accordance with some embodiments. The MR headset 100 includes a housing 110, one or more displays, one or more object-tracking assemblies 120, and one or more processors. Additional components of the MR headset 100 are described below in reference to FIGS. 5A to 5C-2.

The housing 110 includes an interior surface and an exterior surface opposite the interior surface. The housing 110 occludes the field of view of the user while the user wears the MR headset 100 (as depicted in FIG. 5C-1, where the user 502 is wearing an MR headset 532 that may include some or all of the components of the MR headset 100). In particular, the housing 110 covers a user's eyes to allow for generation of an immersive environment. The one or more displays are disposed within the interior surface of the housing such that the head-wearable device, when worn by the user, causes presentation of an extended-reality environment. In some embodiments, the housing 110 is a two-part housing having a first part and a second part that couple together to form a housing for electronic and mechanical components for presenting MR content.

The one or more object-tracking assemblies 120 are disposed on the exterior surface of the housing 110. Each object-tracking assembly 120 includes a plurality of imaging devices 122 (e.g., a first imaging device 122a and a second imaging device 122b) and/or one or more illumination devices 124. In some embodiments, the plurality of imaging devices 122 consists of distinct types of imaging devices. For example, the first imaging device 122a can be a red-green-and-blue (RGB) camera and the second imaging device 122b can be a simultaneous localization and mapping (SLAM) camera. The one or more illumination devices 124 can be one or more light-emitting diodes (LEDs) such as flood LEDs, infrared (IR) light sources, lamps, etc. In some embodiments, as will be described in greater detail below, the one or more illumination devices 124 include flood LEDs that are configured and arranged to illuminate a volume of physical space where a user will perform hand gestures for interacting with MR content.

For each object-tracking assembly 120, the plurality of imaging devices 122 is aligned on a first axis (e.g., the y-axis) and at least one illumination device 124 is aligned on a second axis, perpendicular to the first axis. The illumination device 124 is disposed at a predetermined intermediate distance between at least two imaging devices of the plurality of imaging devices (e.g., in the middle between the first and second imaging devices 122a and 122b). For example, as shown on the MR headset 100, the object-tracking assembly 120 forms a triangular arrangement on the exterior surface of the housing 110. In some embodiments, the first imaging device 122a (e.g., the RGB camera) is disposed above the second imaging device 122b (e.g., the SLAM camera) such that the first imaging device 122a is as close to a user's actual field of view as possible. In some embodiments, the second imaging device 122b is angled downward such that the field of view of the second imaging device 122b is focused on tracking a user's hands. Similarly, the illumination device 124 is slightly angled downward, in accordance with some embodiments, in order to illuminate the user's hands in order to allow for tracking of the user's hands, via the second imaging device 122b, during low-light conditions, low-contrast background conditions, and/or other ambient lighting conditions that negatively impact the detection of objects in image data. In some embodiments, the MR headset 100 includes at least two object-tracking assemblies 120 (e.g., first and second object-tracking assemblies 120a and 120b, where the second object-tracking assembly 120b mirrors the first object-tracking assembly 120a).

The one or more processors can be configured to execute one or more programs stored in memory communicatively coupled with the one or more processors. The one or more programs include instructions for causing the MR headset 100 to, via the illumination device 124, generate ambient lighting conditions and receive, via the plurality of imaging devices 122, image data. The one or more programs further include instructions for causing the MR headset 100 to, in accordance with a determination that the image data satisfies an object-tracking threshold, present a tracked object via the one or more displays 115. Alternatively, or in addition, the one or more programs include instructions for causing the MR headset 100 to, in accordance with a determination that the image data satisfies an object-tracking threshold, detect the performance of a hand gesture. The above examples are non-limiting; the captured image data can be used for object detection, facial-recognition detection, gesture detection, etc. In some embodiments, the one or more programs include instructions for causing the MR headset 100 to, in response to detection of an object and/or gesture, perform an operation or action associated with the detected object or gesture. As the skilled artisan will appreciate upon reading the descriptions provided herein, the above-example operations can be performed at the MR headset 100 and/or a device communicatively coupled with the MR headset 100 (e.g., a wrist-wearable device 526, an HIPD 542, a server 530, a computer 540, and/or any other device described below in reference to FIGS. 5A to 5C-2).
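For illustration only, the following Python sketch shows one way the illuminate-capture-check flow described above could be organized. The patent does not specify any code or API, so the function names, the quality-based threshold check, and the retry behavior are all assumptions rather than the actual implementation.

```python
"""Minimal, hypothetical sketch of the capture-and-track flow described above."""

import random


def capture_image(illumination_on: bool) -> dict:
    """Stand-in for the imaging devices 122: returns fake image data whose
    'quality' improves when the illumination device 124 is lighting the scene."""
    quality = random.uniform(0.6, 1.0) if illumination_on else random.uniform(0.0, 0.5)
    return {"pixels": [], "quality": quality}


def satisfies_object_tracking_threshold(image: dict, threshold: float = 0.5) -> bool:
    """Stand-in for the object-tracking threshold (a real check would consider
    contrast, feature count, etc., not a single scalar)."""
    return image["quality"] >= threshold


def tracking_step() -> str:
    """One pass of the loop: illuminate, capture, and act on the result."""
    image = capture_image(illumination_on=True)  # illumination device 124 active
    if satisfies_object_tracking_threshold(image):
        return "present tracked object via display 115 / detect hand gesture"
    return "skip frame; adjust illumination and retry"


if __name__ == "__main__":
    print(tracking_step())
```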

FIG. 1B shows a perspective view of the MR headset 100, in accordance with some embodiments. The perspective view of the MR headset 100 shows an additional imaging device 132a of the MR headset 100. The additional imaging device 132a can be an instance of the second imaging device 122b described above in reference to FIG. 1A. Alternatively, or additionally, the additional imaging device 132a is an instance of or includes the first imaging device 122a. The additional imaging device 132a can be used in conjunction with the object-tracking assembly 120 to provide full field-of-view coverage (e.g., increasing the field of view covered by one or more of the imaging devices 122). Another additional imaging device 132b is disposed on an opposite side of the MR headset 100, as shown in FIG. 1D, in accordance with some embodiments.

FIG. 1C shows a bottom view of the MR headset 100, in accordance with some embodiments. The bottom view of the MR headset 100 shows an input device 145 disposed on a portion of the housing. In some embodiments, the input device 145 is a physical (depressible) button. The input device 145, in response to receiving a user input (e.g., depression of the button), causes the MR headset 100 to initiate the passthrough mode. The passthrough mode, when active, causes the MR headset 100 to present, via the display 115, image data of a real-world environment. The image data of the real-world environment is captured by one or more imaging devices of the MR headset 100 (e.g., imaging devices 122 and/or 132). In some embodiments, the image data of the real-world environment replaces an extended-reality environment presented by the display 115 (e.g., removing a user from an immersive AR environment such that the user can focus on the real-world environment).
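As a minimal, hypothetical sketch of the behavior described above (the patent does not define this interface, and treating the button as a toggle rather than a one-way switch is an assumption), the selection between passthrough imagery and the rendered environment could be organized as follows:

```python
class DisplayController:
    """Illustrative stand-in for the logic that chooses what display 115 presents."""

    def __init__(self) -> None:
        self.passthrough_active = False

    def on_button_press(self) -> None:
        """Input device 145 pressed: toggle the passthrough mode (assumed behavior)."""
        self.passthrough_active = not self.passthrough_active

    def frame_source(self) -> str:
        """Select the content to present for the current frame."""
        if self.passthrough_active:
            return "camera image data of the real-world environment"
        return "rendered extended-reality environment"


controller = DisplayController()
controller.on_button_press()      # user depresses the button
print(controller.frame_source())  # passthrough imagery replaces the XR scene
```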

FIG. 1D shows another perspective view of the MR headset 100, in accordance with some embodiments. The other perspective view of the MR headset 100 shows the other additional imaging device 132b of the MR headset 100. As described above, the other additional imaging device 132b is disposed on the opposite side of the MR headset 100. The other additional imaging device 132b can be used in conjunction with the object-tracking assembly 120 to provide full field-of-view coverage. The other perspective view 150 of the MR headset 100 further shows an interior surface of the housing 110 and the one or more displays 115 disposed on the interior surface of the housing 110.

FIGS. 2A to 2E illustrate example aspects of microphone configurations of a head-wearable device, in accordance with some embodiments. The different microphone configurations improve a user's voice capture and allow for rejection of the noise generated in front of the user. Additionally, the different microphone configurations enhance user speech detection and playback for scenarios of colocation and use in noisy environments.

FIG. 2A illustrates three distinct microphone configurations of a head-wearable device. A first example microphone configuration 210 (a broadside array) includes at least two microphones, each microphone at opposite sides of a bottom portion of the head-wearable device. A second example microphone configuration 220 (an end fire array (near)) includes at least two microphones, each microphone on the same side of a bottom portion of the head-wearable device and extending diagonally from a middle portion of the head-wearable device to a side of the head-wearable device. A third example microphone configuration 230 (an end fire array (far)) includes at least two microphones, each microphone on the same side of a bottom portion of the head-wearable device and extending diagonally from a middle portion of a side of the head-wearable device to an edge of the head-wearable device (adjacent to a frontal cover of the head-wearable device).

FIG. 2B shows a fourth example microphone configuration 240 and a fifth example microphone configuration 250.

In accordance with some embodiments, the fourth example microphone configuration 240 includes an MR headset that includes at least two microphones, hides one of the two microphone holes in a vent gap, and creates visual logic and symmetry on the headset.

In accordance with some embodiments, the fifth example microphone configuration 250 includes at least two microphones analogous with those described above in reference to the second example microphone configuration 220.

FIG. 2C shows example performance values of different microphone configurations, in accordance with some embodiments. In particular, FIG. 2C shows values for enhancing voice (SNRi) and nullforming performance in isolating noise (NSRi). The broadside symmetric values 216 correspond to the first example microphone configuration 210, the asymmetric end fire (near) values 226 correspond to the second example microphone configuration 220, and the asymmetric end fire (far) values 236 correspond to the third example microphone configuration 230. In some embodiments, the broadside symmetric values corresponding to the broadside symmetric configuration have an SNRi value of between −10 and 0 (e.g., −4.4) and an NSRi value of between 55 and 65 (e.g., 59.9). In some embodiments, the asymmetric end fire (near) values, which correspond to a first asymmetric end fire configuration, have an SNRi value of between 10 and 20 (e.g., 14.9) and an NSRi value of between 45 and 55 (e.g., 51.3). In some embodiments, the asymmetric end fire (far) values, which correspond to a second asymmetric end fire configuration, have an SNRi value of between 0 and 10 (e.g., 5) and an NSRi value of between 55 and 65 (e.g., 60).

FIGS. 2D-1 to 2D-4 show example beamformer responses of different microphone configurations, in accordance with some embodiments. The broadside array beamformer responses 214 correspond to the first example microphone configuration 210 (FIG. 2A), and the end fire array beamformer responses 224 correspond to the second and third example microphone configurations 220 and 230 (FIG. 2A).

FIG. 2E shows an overview of an end fire configuration, in accordance with some embodiments. Advantages of the end fire array include that noise sources in front of and behind the user are positioned such that the noise can be filtered out effectively. The end fire array allows noise sources such as typing, voices directly in front of the user, doors shutting, etc. to be more effectively canceled by the head-wearable device, which brings higher-quality experiences to certain work or co-located scenarios. The visual impact of this, however, results in asymmetric holes in the bottom of the device and may feel less considered than the integration of the broadside array.
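To illustrate why an end fire orientation helps reject sound arriving from a particular direction, the short sketch below evaluates a generic first-order differential (delay-and-subtract) response for a two-microphone pair. This is a textbook model, not the patent's actual processing, and the 2 cm spacing and 1 kHz analysis frequency are assumed illustrative values.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
SPACING = 0.02          # distance between the two microphones in meters (assumed)
FREQ = 1000.0           # analysis frequency in Hz (assumed)


def differential_response(theta_deg: np.ndarray) -> np.ndarray:
    """Normalized magnitude response of a delay-and-subtract end fire pair.

    The rear microphone's signal is delayed by the acoustic travel time across
    the pair and subtracted from the front microphone's signal, which places a
    null at theta = 180 degrees (sound arriving from directly behind the pair).
    """
    k = 2 * np.pi * FREQ / SPEED_OF_SOUND
    theta = np.radians(theta_deg)
    response = np.abs(1 - np.exp(-1j * k * SPACING * (1 + np.cos(theta))))
    return response / response.max()  # normalize to the on-axis maximum


angles = np.arange(0, 181, 30)
for angle, value in zip(angles, differential_response(angles)):
    print(f"{angle:3d} deg  relative response = {value:.2f}")
# 0 deg (along the array axis) is near 1.0; 180 deg drops to 0.0 (the null).
```

In a headset, the pair can be oriented so that this null points toward unwanted sources (e.g., directly in front of the user, consistent with claim 10), while the user's mouth lies closer to the main lobe.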

FIG. 3 illustrates an audio module of a head-wearable device, in accordance with some embodiments. As described above in reference to FIGS. 2A to 2E, the head-wearable device 100 can use an end fire arrangement of microphones to distinguish sounds coming from a user's mouth from all other sounds (which can be rejected/noise-canceled). To further improve sound quality, foam padding 310 can be included in the front cavity near a microphone next to the side strap holder. In particular, the foam padding 310 is disposed within a housing of a side strap holder and is configured to fill hollow portions of the housing (e.g., a foam-like material ensures that hollow structures no longer negatively impact audio quality). The foam padding 310 removes resonance issues present in the microphone module (it damps airflow and removes distortion). In some embodiments, the foam padding 310 has a predetermined thickness (e.g., 6.1 mm+/−0.2 mm).

FIGS. 4A to 4D illustrate an exhaust system of a head-wearable device, in accordance with some embodiments. In particular, the head-wearable device uses a "muffler" installed after the fan and before the exhaust from the head-wearable device. The added chambers act as resonators that convert acoustic energy into heat through a process of visco-thermal loss, which reduces objectionable noise and enables a higher air flow. In this way, the added chambers enable an increase in performance for the same noise and/or a smaller package.

FIG. 4A shows an example muffler, in accordance with some embodiments. The muffler is configured to receive a flow (e.g., grazing flow) and direct the flow through a chamber. The chamber of the muffler includes a plurality of perforations to bias the flow. Specifically, the plurality of perforations allows for the flow to enter the cavity (e.g., acting as resonators as described above).

FIG. 4B illustrates an example exhaust system of the head-wearable device. The exhaust system includes an air inlet disposed on a first exterior surface of the housing; an air outlet disposed on a second exterior surface of the housing; a fan disposed within a compartment of the housing; and a channel fluidically coupling the compartment and the air outlet. The fan is configured to pull air in through the air inlet, circulate air through a portion of the housing (e.g., cooling at least one of the one or more processors and the one or more displays), and push air out through the air outlet. The channel includes a chamber including one or more perforations and configured to operate as a resonator (e.g., converting acoustic energy into heat through the process of visco-thermal loss). In FIG. 4B, air is pushed out through the air outlet; as it travels toward the air outlet, the air passes through a channel including one or more perforations.

FIG. 4C illustrates another example of air traveling through the exhaust system of the head-wearable device. As shown in FIG. 4C, a channel of the exhaust system includes micro-perforated panels 430. The micro-perforated panels 430 are fluidically coupled with a cavity such that acoustic energy is converted into heat through the process of visco-thermal loss. In accordance with some embodiments, there can be more than one cavity in distinct locations of the exhaust system (e.g., cavity portions 432-a, 432-b, and 432-c). Air that does not travel through the micro-perforated panels exits via the air outlet. In some embodiments, the muffler includes three sections of micro-perforated panels (MPPs) that are laterally configured (one in each lateral) and are located near the center of the headset (two panels). In some embodiments, the MPPs are further configured with perforations having a diameter of 0.5 mm and a pitch of 4.4 mm. The three sections of the muffler are installed within the back cavity of the headset.
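The relationship between the perforation geometry recited here and the fan resonance it targets can be roughed out with a standard perforated-panel (Helmholtz-type) resonator formula. The sketch below is a back-of-the-envelope estimate rather than the patent's design method: the 0.5 mm diameter and 4.4 mm pitch come from this paragraph, the 0.3 mm panel thickness falls within the 0.1-0.5 mm aluminum sheet of claim 14, but the 3 mm back-cavity depth is an assumed value, and grazing/bias flow and visco-thermal losses would shift the real tuning. The Stokes-layer calculation at the end corresponds to the length scale referenced in claim 6.

```python
import math

# Values stated in the surrounding text and claims:
HOLE_DIAMETER = 0.5e-3    # m, perforation diameter from this paragraph
PITCH = 4.4e-3            # m, perforation pitch from this paragraph
PANEL_THICKNESS = 0.3e-3  # m, within the 0.1-0.5 mm aluminum sheet of claim 14
# Assumed for illustration only (not stated in the text):
CAVITY_DEPTH = 3.0e-3     # m, effective depth of the back cavity behind the panel

SPEED_OF_SOUND = 343.0            # m/s
AIR_KINEMATIC_VISCOSITY = 1.5e-5  # m^2/s

radius = HOLE_DIAMETER / 2
porosity = math.pi * radius**2 / PITCH**2     # open-area ratio of a square lattice
neck_length = PANEL_THICKNESS + 1.7 * radius  # hole length plus an end correction

# Classic Helmholtz-type resonance of a perforated panel over a back cavity.
resonance_hz = (SPEED_OF_SOUND / (2 * math.pi)) * math.sqrt(
    porosity / (neck_length * CAVITY_DEPTH)
)

# Viscous (Stokes) boundary-layer thickness at that frequency.
omega = 2 * math.pi * resonance_hz
stokes_layer_m = math.sqrt(2 * AIR_KINEMATIC_VISCOSITY / omega)

print(f"porosity            ~ {porosity * 100:.1f} %")
print(f"estimated resonance ~ {resonance_hz:.0f} Hz")
print(f"Stokes layer        ~ {stokes_layer_m * 1e6:.0f} micrometers")
```

With these numbers the open-area ratio is roughly 1%, the estimated resonance lands in the kilohertz range typical of fan tonal noise, and the Stokes layer is a few tens of micrometers, i.e., much smaller than the hole radius.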

FIG. 4D illustrates a portion of a channel of the exhaust system, in accordance with some embodiments. The highlighted portions of the channel indicate portions of the exhaust system that can include micro-perforated panels.

Example Extended Reality Systems

FIGS. 5A, 5B, 5C-1, and 5C-2 illustrate example XR systems that include AR and MR systems, in accordance with some embodiments. FIG. 5A shows a first XR system 500a and first example user interactions using a wrist-wearable device 526, a head-wearable device (e.g., AR device 528), and/or a handheld intermediary processing device (HIPD) 542. FIG. 5B shows a second XR system 500b and second example user interactions using a wrist-wearable device 526, AR device 528, and/or an HIPD 542. FIGS. 5C-1 and 5C-2 show a third MR system 500c and third example user interactions using a wrist-wearable device 526, a head-wearable device (e.g., a mixed-reality device such as a virtual-reality (VR) device), and/or an HIPD 542. As the skilled artisan will appreciate upon reading the descriptions provided herein, the above-example AR and MR systems (described in detail below) can perform various functions and/or operations.

The wrist-wearable device 526, the head-wearable devices, and/or the HIPD 542 can communicatively couple via a network 525 (e.g., cellular, near field, Wi-Fi, personal area network, wireless LAN, etc.). Additionally, the wrist-wearable device 526, the head-wearable devices, and/or the HIPD 542 can also communicatively couple with one or more servers 530, computers 540 (e.g., laptops, computers, etc.), mobile devices 550 (e.g., smartphones, tablets, etc.), and/or other electronic devices via the network 525 (e.g., cellular, near field, Wi-Fi, personal area network, wireless LAN, etc.). Similarly, a smart textile-based garment 538, when used, can also communicatively couple with the wrist-wearable device 526, the head-wearable device(s), the HIPD 542, the one or more servers 530, the computers 540, the mobile devices 550, and/or other electronic devices via the network 525 to provide inputs.

Turning to FIG. 5A, a user 502 is shown wearing the wrist-wearable device 526 and the AR device 528 and having the HIPD 542 on their desk. The wrist-wearable device 526, the AR device 528, and the HIPD 542 facilitate user interaction with an AR environment. In particular, as shown by the first AR system 500a, the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 cause presentation of one or more avatars 504, digital representations of contacts 506, and virtual objects 508. As discussed below, the user 502 can interact with the one or more avatars 504, digital representations of the contacts 506, and virtual objects 508 via the wrist-wearable device 526, the AR device 528, and/or the HIPD 542. In addition, the user 502 is able to directly view physical objects in the environment, such as a physical table 529, through transparent lens(es) and waveguide(s) of the AR device 528. Alternatively, an MR device could be used in place of the AR device 528 and a similar user experience can take place, but the user would not be directly viewing physical objects in the environment, such as a table 529, and would instead be presented with a virtual reconstruction of the table 529 produced from one or more sensors of the MR device (e.g., an outward-facing camera capable of recording the surrounding environment).

The user 502 can provide user inputs using any of the wrist-wearable device 526, the AR device 528 (e.g., through physical inputs at the AR device and/or built-in motion tracking of a user's extremities), a smart-textile garment, an externally mounted extremity-tracking device, and/or the HIPD 542. For example, the user 502 can perform one or more hand gestures that are detected by the wrist-wearable device 526 (e.g., using one or more EMG sensors and/or IMUs built into the wrist-wearable device) and/or the AR device 528 (e.g., using one or more image sensors or cameras) to provide a user input. Alternatively, or additionally, the user 502 can provide a user input via one or more touch surfaces of the wrist-wearable device 526, the AR device 528, and/or the HIPD 542, and/or voice commands captured by a microphone of the wrist-wearable device 526, the AR device 528, and/or the HIPD 542. The wrist-wearable device 526, the AR device 528, and/or the HIPD 542 include an artificially intelligent (AI) digital assistant to help the user in providing a user input (e.g., completing a sequence of operations, suggesting different operations or commands, providing reminders, confirming a command). For example, the digital assistant can be invoked through an input occurring at the AR device 528 (e.g., via an input at a temple arm of the AR device 528). In some embodiments, the user 502 can provide a user input via one or more facial gestures and/or facial expressions. For example, cameras of the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 can track the user 502's eyes for navigating a user interface.

The wrist-wearable device 526, the AR device 528, and/or the HIPD 542 can operate alone or in conjunction to allow the user 502 to interact with the AR environment. In some embodiments, the HIPD 542 is configured to operate as a central hub or control center for the wrist-wearable device 526, the AR device 528, and/or another communicatively coupled device. For example, the user 502 can provide an input to interact with the AR environment at any of the wrist-wearable device 526, the AR device 528, and/or the HIPD 542, and the HIPD 542 can identify one or more back-end and front-end tasks to cause the performance of the requested interaction and distribute instructions to cause the performance of the one or more back-end and front-end tasks at the wrist-wearable device 526, the AR device 528, and/or the HIPD 542. In some embodiments, a back-end task is a background-processing task that is not perceptible by the user (e.g., rendering content, decompression, compression, etc.), and a front-end task is a user-facing task that is perceptible to the user (e.g., presenting information to the user, providing feedback to the user, etc.). The HIPD 542 can perform the back-end tasks and provide the wrist-wearable device 526 and/or the AR device 528 operational data corresponding to the performed back-end tasks such that the wrist-wearable device 526 and/or the AR device 528 can perform the front-end tasks. In this way, the HIPD 542, which has more computational resources and greater thermal headroom than the wrist-wearable device 526 and/or the AR device 528, performs computationally intensive tasks and reduces the computer resource utilization and/or power usage of the wrist-wearable device 526 and/or the AR device 528.
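As a rough illustration of this hub-and-spoke split, the sketch below routes compute-heavy back-end tasks to a central hub and forwards only user-facing front-end work to the coupled wearables. All class and function names (Hub, Device, Task, handle_request, and so on) are hypothetical stand-ins rather than anything defined in the disclosure, and the sketch omits networking, scheduling, and power management.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    name: str
    is_back_end: bool             # background processing vs. user-perceptible work
    run: Callable[[dict], dict]   # consumes input data, returns operational data


@dataclass
class Device:
    name: str

    def perform_front_end(self, task: Task, operational_data: dict) -> None:
        print(f"{self.name}: presenting '{task.name}' using {operational_data}")


@dataclass
class Hub:
    """Plays the role of a central hub (an HIPD-like device) for coupled wearables."""
    devices: list = field(default_factory=list)

    def handle_request(self, tasks: list, request: dict) -> None:
        operational_data = dict(request)
        for task in tasks:
            if task.is_back_end:
                # Compute-heavy work stays on the hub, which has more resources.
                operational_data = task.run(operational_data)
            else:
                # Only user-facing work is forwarded to the wearables.
                for device in self.devices:
                    device.perform_front_end(task, operational_data)


if __name__ == "__main__":
    hub = Hub(devices=[Device("AR glasses"), Device("wristband")])
    tasks = [
        Task("render avatar frames", True, lambda d: {**d, "frames": "<rendered>"}),
        Task("display video call", False, lambda d: d),
    ]
    hub.handle_request(tasks, {"call_id": 42})
```

The design point mirrored here is simply that operational data produced by back-end tasks flows from the hub to the wearables, which then perform only the presentation work.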

In the example shown by the first AR system 500a, the HIPD 542 identifies one or more back-end tasks and front-end tasks associated with a user request to initiate an AR video call with one or more other users (represented by the avatar 504 and the digital representation of the contact 506) and distributes instructions to cause the performance of the one or more back-end tasks and front-end tasks. In particular, the HIPD 542 performs back-end tasks for processing and/or rendering image data (and other data) associated with the AR video call and provides operational data associated with the performed back-end tasks to the AR device 528 such that the AR device 528 performs front-end tasks for presenting the AR video call (e.g., presenting the avatar 504 and the digital representation of the contact 506).

In some embodiments, the HIPD 542 can operate as a focal or anchor point for causing the presentation of information. This allows the user 502 to be generally aware of where information is presented. For example, as shown in the first AR system 500a, the avatar 504 and the digital representation of the contact 506 are presented above the HIPD 542. In particular, the HIPD 542 and the AR device 528 operate in conjunction to determine a location for presenting the avatar 504 and the digital representation of the contact 506. In some embodiments, information can be presented within a predetermined distance from the HIPD 542 (e.g., within five meters). For example, as shown in the first AR system 500a, virtual object 508 is presented on the desk some distance from the HIPD 542. Similar to the above example, the HIPD 542 and the AR device 528 can operate in conjunction to determine a location for presenting the virtual object 508. Alternatively, in some embodiments, presentation of information is not bound by the HIPD 542. More specifically, the avatar 504, the digital representation of the contact 506, and the virtual object 508 do not have to be presented within a predetermined distance of the HIPD 542. While an AR device 528 is described as working with an HIPD, a MR headset can be interacted with in the same way as the AR device 528.

User inputs provided at the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 are coordinated such that the user can use any device to initiate, continue, and/or complete an operation. For example, the user 502 can provide a user input to the AR device 528 to cause the AR device 528 to present the virtual object 508 and, while the virtual object 508 is presented by the AR device 528, the user 502 can provide one or more hand gestures via the wrist-wearable device 526 to interact and/or manipulate the virtual object 508. While an AR device 528 is described as working with a wrist-wearable device 526, a MR headset can be interacted with in the same way as the AR device 528.

FIG. 5A illustrates an interaction in which an artificially intelligent (AI) virtual assistant can assist in requests made by a user 502. The AI virtual assistant can be used to complete open-ended requests made through natural language inputs by a user 502. For example, in FIG. 5A the user 502 makes an audible request 544 to summarize the conversation and then share the summarized conversation with others in the meeting. In addition, the AI virtual assistant is configured to use sensors of the extended-reality system (e.g., cameras of an extended-reality headset, microphones, and various other sensors of any of the devices in the system) to provide contextual prompts to the user for initiating tasks.

FIG. 5A also illustrates an example neural network 552 that is used to train an Artificial Intelligence. Uses of Artificial Intelligences are varied and encompass many distinct aspects of the devices and systems described herein. AI capabilities cover a diverse range of applications and deepen interactions between the user 502 and user devices (e.g., the AR device 528, a MR device 532, the HIPD 542, the wrist-wearable device 526, etc.). The AI discussed herein can be derived using many different training models, including but not limited to artificial neural networks (ANNs), deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), large language models (LLMs), long short-term memory networks, transformer models, decision trees, random forests, support vector machines, k-nearest neighbors, genetic algorithms, Markov models, Bayesian networks, fuzzy logic systems, and deep reinforcement learning. For devices and systems herein that employ multiple AIs, different models can be used, depending on the task. For example, an LLM can be used for a natural-language AI virtual assistant, and a DNN can be used for object detection of a physical environment.

In another example, an AI virtual assistant can include many different AI models and based on the user's request, multiple AI models may be employed (concurrently, sequentially or a combination thereof). For example, an LLM-based AI can provide instructions for helping a user follow a recipe and the instructions can be based in part on another AI that is derived from an ANN, a DNN, an RNN, etc., that is capable of discerning what part of the recipe the user is on (e.g., object and scene detection).
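The recipe example above can be pictured as a simple two-model pipeline: a vision-style model infers which step the user is on, and a language-style model phrases the guidance. The classes in the sketch below are placeholders with canned outputs, assumed purely for illustration; they are not real model APIs or part of the disclosure.

```python
# Illustrative routing of sub-tasks of one assistant request to different
# model types: a scene/object detector infers the current recipe step, and
# a language model generates the instruction text.

class SceneStepDetector:
    """Stand-in for a CNN/DNN that recognizes objects and cooking steps."""
    def detect_step(self, image_frame: bytes) -> int:
        return 3  # pretend the camera frame shows the user on step index 3


class RecipeLanguageModel:
    """Stand-in for an LLM that generates natural-language instructions."""
    def generate(self, prompt: str) -> str:
        return f"(generated) {prompt}"


def assist_with_recipe(image_frame: bytes, recipe_steps: list) -> str:
    # Vision model first (object and scene detection), then language model.
    step_index = SceneStepDetector().detect_step(image_frame)
    prompt = f"Explain step {step_index + 1} simply: {recipe_steps[step_index]}"
    return RecipeLanguageModel().generate(prompt)


if __name__ == "__main__":
    steps = ["boil water", "add pasta", "stir", "drain", "serve"]
    print(assist_with_recipe(b"<camera frame>", steps))
```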

As artificial intelligence training models evolve, the operations and experiences described herein could potentially be performed with different models other than those listed above, and a person skilled in the art would understand that the above list is non-limiting.

A user 502 can interact with an artificial intelligence through natural language inputs captured by a voice sensor, text inputs, or any other input modality that accepts natural language and/or a corresponding voice sensor module. In another instance, an input can be provided by tracking the eye gaze of the user 502 via a gaze-tracker module. Additionally, the AI can also receive inputs beyond those supplied by a user 502. For example, the AI can generate its response further based on environmental inputs (e.g., temperature data, image data, video data, ambient light data, audio data, GPS location data, inertial measurement (i.e., user motion) data, pattern recognition data, magnetometer data, depth data, pressure data, force data, neuromuscular data, heart rate data, sleep data, etc.) captured in response to a user request by various types of sensors and/or their corresponding sensor modules. The sensors' data can be retrieved entirely from a single device (e.g., AR device 528) or from multiple devices that are in communication with each other (e.g., a system that includes at least two of: an AR device 528, a MR device 532, the HIPD 542, the wrist-wearable device 526, etc.). The AI can also access additional information (e.g., from one or more servers 530, computers 540, mobile devices 550, and/or other electronic devices) via a network 525.

A non-limiting list of AI-enhanced functions includes image recognition, speech recognition (e.g., automatic speech recognition), text recognition (e.g., scene text recognition), pattern recognition, natural language processing and understanding, classification, regression, clustering, anomaly detection, sequence generation, content generation, and optimization. In some embodiments, AI-enhanced functions are fully or partially executed on cloud computing platforms communicatively coupled to the user devices (e.g., the AR device 528, a MR device 532, the HIPD 542, the wrist-wearable device 526, etc.) via the one or more networks. The cloud computing platforms provide scalable computing resources, distributed computing, managed AI services, inference acceleration, pre-trained models, application programming interfaces (APIs), and/or other resources to support comprehensive computations required by the AI-enhanced functions.

Example outputs stemming from the use of AI can include natural language responses, mathematical calculations, charts displaying information, audio, images, videos, texts, summaries of meetings, predictive operations based on environmental factors, classifications, pattern recognitions, recommendations, assessments, or other operations. In some embodiments, the generated outputs are stored on local memories of the user devices (e.g., the AR device 528, a MR device 532, the HIPD 542, the wrist-wearable device 526, etc.), storages of the external devices (servers, computers, mobile devices, etc.), and/or storages of the cloud computing platforms.

The AI-based outputs can be presented across different modalities (e.g., audio-based, visual-based, haptic-based, and any combination thereof) and across different devices of the XR system described herein. Some visual-based outputs can include the displaying of information on XR augments of an XR headset, user interfaces displayed at a wrist-wearable device, laptop device, mobile device, etc. On devices with or without displays (e.g., HIPD 542), haptic feedback can provide information to the user 502. An artificial intelligence can also use the inputs described above to determine the appropriate modality and device(s) to present content to the user (e.g., a user walking on a busy road can be presented with an audio output instead of a visual output to avoid distracting the user 502).
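A minimal sketch of that modality-selection idea follows, assuming a handful of hypothetical context flags (user_is_walking, near_traffic, headset_has_display) and simple rules; a deployed system would weigh many more of the sensor signals listed above.

```python
# Hedged sketch: pick how an AI output is presented based on context signals.
# Field names and rules are illustrative assumptions only.

def choose_output_modality(context: dict) -> str:
    """Return 'audio', 'visual', or 'haptic' given simple context signals."""
    if context.get("user_is_walking") and context.get("near_traffic"):
        # Avoid visually distracting the user on a busy road.
        return "audio"
    if not context.get("headset_has_display", True):
        # Devices without displays (e.g., a handheld hub) can fall back to haptics.
        return "haptic"
    return "visual"


if __name__ == "__main__":
    print(choose_output_modality({"user_is_walking": True, "near_traffic": True}))  # audio
    print(choose_output_modality({"headset_has_display": False}))                   # haptic
    print(choose_output_modality({}))                                               # visual
```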

Example Augmented-Reality Interaction

FIG. 5B shows the user 502 wearing the wrist-wearable device 526 and the AR device 528 and holding the HIPD 542. In the second AR system 500b, the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 are used to receive and/or provide one or more messages to a contact of the user 502. In particular, the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 detect and coordinate one or more user inputs to initiate a messaging application and prepare a response to a received message via the messaging application.

In some embodiments, the user 502 initiates, via a user input, an application on the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 that causes the application to initiate on at least one device. For example, in the second AR system 500b the user 502 performs a hand gesture associated with a command for initiating a messaging application (represented by messaging user interface 512); the wrist-wearable device 526 detects the hand gesture and, based on a determination that the user 502 is wearing the AR device 528, causes the AR device 528 to present a messaging user interface 512 of the messaging application. The AR device 528 can present the messaging user interface 512 to the user 502 via its display (e.g., as shown by user 502's field of view 510). In some embodiments, the application is initiated and can be run on the device (e.g., the wrist-wearable device 526, the AR device 528, and/or the HIPD 542) that detects the user input to initiate the application, and that device provides operational data to another device to cause the presentation of the messaging application. For example, the wrist-wearable device 526 can detect the user input to initiate a messaging application, initiate and run the messaging application, and provide operational data to the AR device 528 and/or the HIPD 542 to cause presentation of the messaging application. Alternatively, the application can be initiated and run at a device other than the device that detected the user input. For example, the wrist-wearable device 526 can detect the hand gesture associated with initiating the messaging application and cause the HIPD 542 to run the messaging application and coordinate the presentation of the messaging application.
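One way to picture this coordination is the short sketch below: the device that detects the gesture decides where the application runs and where its user interface is presented, based on which devices the user is currently wearing. The device names and the handle_gesture helper are hypothetical, introduced only for illustration.

```python
# Hypothetical sketch: route an app launch across devices based on what is worn.

def handle_gesture(gesture: str, worn_devices: set) -> tuple:
    """Return (device_running_app, device_presenting_ui) for a detected gesture."""
    if gesture != "open_messaging":
        raise ValueError("unrecognized gesture")
    run_on = "wristband"  # the detecting device can run the app itself...
    # ...while the UI is presented on a worn display if one is available.
    present_on = "ar_glasses" if "ar_glasses" in worn_devices else "wristband"
    return run_on, present_on


if __name__ == "__main__":
    print(handle_gesture("open_messaging", {"ar_glasses", "wristband"}))
    # ('wristband', 'ar_glasses')
```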

Further, the user 502 can provide a user input at the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 to continue and/or complete an operation initiated at another device. For example, after initiating the messaging application via the wrist-wearable device 526 and while the AR device 528 presents the messaging user interface 512, the user 502 can provide an input at the HIPD 542 to prepare a response (e.g., shown by the swipe gesture performed on the HIPD 542). The user 502's gestures performed on the HIPD 542 can be provided and/or displayed on another device. For example, the user 502's swipe gestures performed on the HIPD 542 are displayed on a virtual keyboard of the messaging user interface 512 displayed by the AR device 528.

In some embodiments, the wrist-wearable device 526, the AR device 528, the HIPD 542, and/or other communicatively coupled devices can present one or more notifications to the user 502. The notification can be an indication of a new message, an incoming call, an application update, a status update, etc. The user 502 can select the notification via the wrist-wearable device 526, the AR device 528, or the HIPD 542 and cause presentation of an application or operation associated with the notification on at least one device. For example, the user 502 can receive a notification that a message was received at the wrist-wearable device 526, the AR device 528, the HIPD 542, and/or other communicatively coupled device and provide a user input at the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 to review the notification, and the device detecting the user input can cause an application associated with the notification to be initiated and/or presented at the wrist-wearable device 526, the AR device 528, and/or the HIPD 542.

While the above example describes coordinated inputs used to interact with a messaging application, the skilled artisan will appreciate upon reading the descriptions that user inputs can be coordinated to interact with any number of applications, including but not limited to gaming applications, social media applications, camera applications, web-based applications, financial applications, etc. For example, the AR device 528 can present game application data to the user 502, and the HIPD 542 can be used as a controller to provide inputs to the game. Similarly, the user 502 can use the wrist-wearable device 526 to initiate a camera of the AR device 528, and the user can use the wrist-wearable device 526, the AR device 528, and/or the HIPD 542 to manipulate the image capture (e.g., zoom in or out, apply filters, etc.) and capture image data.

While an AR device 528 is shown being capable of certain functions, it is understood that AR devices can have varying functionalities based on cost and market demand. For example, an AR device may include a single output modality such as an audio output modality. In another example, the AR device may include a low-fidelity display as one of the output modalities, where simple information (e.g., text and/or low-fidelity images/video) is capable of being presented to the user. In yet another example, the AR device can be configured with front-facing LED(s) configured to provide a user with information, e.g., an LED around the right-side lens can illuminate to notify the wearer to turn right while directions are being provided, or an LED on the left side can illuminate to notify the wearer to turn left while directions are being provided. In another embodiment, the AR device can include an outward-facing projector such that information (e.g., text information, media, etc.) may be displayed on the palm of a user's hand or other suitable surface (e.g., a table, whiteboard, etc.). In yet another embodiment, information may also be provided by locally dimming portions of a lens to emphasize portions of the environment to which the user's attention should be directed. These examples are non-exhaustive, and features of one AR device described above can be combined with features of another AR device described above. While features and experiences of an AR device have been described in the preceding sections, it is understood that the described functionalities and experiences can be applied in a comparable manner to a MR headset, which is described in the following sections.

Example Mixed-Reality Interaction

Turning to FIGS. 5C-1 and 5C-2, the user 502 is shown wearing the wrist-wearable device 526 and a MR device 532 (e.g., a device capable of providing either an entirely virtual-reality (VR) experience or a mixed-reality experience that displays object(s) from a physical environment at a display of the device) and holding the HIPD 542. In the third MR system 500c, the wrist-wearable device 526, the MR device 532, and/or the HIPD 542 are used to interact within an MR environment, such as a VR game or other MR/VR application. While the MR device 532 presents a representation of a VR game (e.g., first MR game environment 520) to the user 502, the wrist-wearable device 526, the MR device 532, and/or the HIPD 542 detect and coordinate one or more user inputs to allow the user 502 to interact with the VR game.

In some embodiments, the user 502 can provide a user input via the wrist-wearable device 526, the MR device 532, and/or the HIPD 542 that causes an action in a corresponding MR environment. For example, the user 502 in the third MR system 500c (shown in FIG. 5C-1) raises the HIPD 542 to prepare for a swing in the first MR game environment 520. The MR device 532, responsive to the user 502 raising the HIPD 542, causes the MR representation of the user 502 to perform a similar action (e.g., raise a virtual object, such as a virtual sword 524). In some embodiments, each device uses respective sensor data and/or image data to detect the user input and provide an accurate representation of the user 502's motion. For example, image sensors (e.g., SLAM cameras or other cameras) of the HIPD 542 can be used to detect a position of the HIPD 542 relative to the user 502's body such that the virtual object can be positioned appropriately within the first MR game environment 520; sensor data from the wrist-wearable device 526 can be used to detect a velocity at which the user 502 raises the HIPD 542 such that the MR representation of the user 502 and the virtual sword 524 are synchronized with the user 502's movements; and image sensors of the MR device 532 can be used to represent the user 502's body, boundary conditions, or real-world objects within the first MR game environment 520.

In FIG. 5C-2, the user 502 performs a downward swing while holding the HIPD 542. The user 502's downward swing is detected by the wrist-wearable device 526, the MR device 532, and/or the HIPD 542 and a corresponding action is performed in the first MR game environment 520. In some embodiments, the data captured by each device is used to improve the user's experience within the MR environment. For example, sensor data of the wrist-wearable device 526 can be used to determine a speed and/or force of the downward swing and image sensors of the HIPD 542 and/or the MR device 532 can be used to determine a location of the swing and how it should be represented in the first MR game environment 520, which in turn can be used as inputs for the MR environment (e.g., game mechanics, which can be used to detect speed, force, locations, and/or aspects of the user 502's actions to classify a user's inputs (e.g., user performs a light strike, hard strike, critical strike, glancing strike, miss) or calculate an output (e.g., amount of damage)).
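As a concrete, hedged illustration of how such fused sensor data might feed game mechanics, the sketch below maps an estimated swing speed and a hit/miss flag to the strike classes mentioned above and to a damage output. The speed thresholds and damage values are arbitrary assumptions, not parameters from the disclosure.

```python
# Illustrative classification of a swing from fused sensor estimates:
# wrist IMU data supplies swing speed, and camera-based tracking supplies
# whether the swing intersected the target.

def classify_strike(swing_speed_mps: float, hit_target: bool) -> str:
    if not hit_target:
        return "miss"
    if swing_speed_mps < 1.0:
        return "glancing strike"
    if swing_speed_mps < 3.0:
        return "light strike"
    if swing_speed_mps < 6.0:
        return "hard strike"
    return "critical strike"


def damage_for(strike: str) -> int:
    # Assumed damage table for the example game mechanics.
    return {"miss": 0, "glancing strike": 5, "light strike": 10,
            "hard strike": 25, "critical strike": 50}[strike]


if __name__ == "__main__":
    strike = classify_strike(swing_speed_mps=4.2, hit_target=True)
    print(strike, damage_for(strike))  # hard strike 25
```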

FIG. 5C-2 further illustrates that a portion of the physical environment is reconstructed and displayed at a display of the MR device 532 while the MR game environment 520 is being displayed. In this instance, a reconstruction of the physical environment 546 is displayed in place of a portion of the MR game environment 520 when object(s) in the physical environment are potentially in the path of the user (e.g., a collision with the user and an object in the physical environment is likely). Thus, this example MR game environment 520 includes (i) an immersive virtual-reality portion 548 (e.g., an environment that does not have a corollary counterpart in a nearby physical environment) and (ii) a reconstruction of the physical environment 546 (e.g., a table and a cup resting on the table). While the example shown here is a MR environment that shows a reconstruction of the physical environment to avoid collisions, other uses of reconstructions of the physical environment can be used, such as defining features of the virtual environment based on the surrounding physical environment (e.g., a virtual column can be placed based on an object in the surrounding physical environment (e.g., a tree)).

While the wrist-wearable device 526, the MR device 532, and/or the HIPD 542 are described as detecting user inputs, in some embodiments, user inputs are detected at a single device (with the single device being responsible for distributing signals to the other devices for performing the user input). For example, the HIPD 542 can operate an application for generating the first MR game environment 520 and provide the MR device 532 with corresponding data for causing the presentation of the first MR game environment 520, as well as detect the user 502's movements (while holding the HIPD 542) to cause the performance of corresponding actions within the first MR game environment 520. Additionally, or alternatively, in some embodiments, operational data (e.g., sensor data, image data, application data, device data, and/or other data) of one or more devices is provided to a single device (e.g., the HIPD 542) to process the operational data and cause respective devices to perform an action associated with processed operational data.

In some embodiments, the user 502 can wear a wrist-wearable device 526, wear a MR device 532, wear a smart textile-based garment 538 (e.g., wearable haptic gloves), and/or hold an HIPD 542. In this embodiment, the wrist-wearable device 526, the MR device 532, and/or the smart textile-based garments 538 are used to interact within an MR environment (e.g., any AR or MR system described above in reference to FIGS. 5A-5B). While the MR device 532 presents a representation of a MR game (e.g., second MR game environment 530) to the user 502, the wrist-wearable device 526, the MR device 532, and/or the smart textile-based garments 538 detect and coordinate one or more user inputs to allow the user 502 to interact with the MR environment.

In some embodiments, the user 502 can provide a user input via the wrist-wearable device 526, an HIPD 542, the MR device 532, and/or the smart textile-based garments 538 that causes an action in a corresponding MR environment. In some embodiments, each device uses its respective sensor data and/or image data to detect the user input and provide an accurate representation of the user 502's motion. While four input devices are shown (e.g., a wrist-wearable device 526, a MR device 532, an HIPD 542, and a smart textile-based garment 538), each one of these input devices can, entirely on its own, provide inputs for fully interacting with the MR environment. For example, the wrist-wearable device can provide sufficient inputs on its own for interacting with the MR environment. In some embodiments, if multiple input devices are used (e.g., a wrist-wearable device and the smart textile-based garment 538), sensor fusion can be utilized to ensure that inputs are correct. While multiple input devices are described, it is understood that other input devices can be used in conjunction or on their own instead, such as but not limited to external motion-tracking cameras, other wearable devices fitted to different parts of a user, apparatuses that allow a user to experience walking in an MR environment while remaining substantially stationary in the physical environment, etc.

As described above, the data captured by each device is used to improve the user's experience within the MR environment. Although not shown, the smart textile-based garments 538 can be used in conjunction with an MR device and/or an HIPD 542.

Example Embodiments

(A1) In accordance with some embodiments of this disclosure, an example MR headset is provided. The example MR headset includes a housing comprising one or more electronic components, the one or more electronic components configured to be used in presentation of MR content including audio content. The housing is configured to attach to a strap thereby forming a cavity (e.g., for placing a connector of the strap into the housing). And the housing defines an aperture, including a first opening on a first side of the MR headset, and a second opening on a second side of the MR headset. The MR headset includes a speaker housed within the cavity. The speaker is positioned adjacent to a foam insert for minimizing (e.g., reducing, attenuating) intermodulation of the speaker within the cavity during presentation of the audio content.

The MR headset further includes a fan housed in the housing, the fan configured to cool the one or more electronic components, and to minimize noise interference from operation of the fan with the presentation of the audio content. The first opening of the aperture is adjacent to an exhausting side of the fan, such that exhaust from the fan causes a grazing flow to enter the first opening and exit the second opening, thereby forming a fluidic channel across the aperture. And an inner surface of the fluidic channel defines a set of perforations. The set of perforations is configured to receive acoustic waves associated with resonant frequencies of the fan.

The MR headset further includes an expansion chamber that surrounds an inner surface of the fluidic channel, the expansion chamber having a back volume that causes a bias flow across the set of perforations of the inner surface of the fluidic channel. In some embodiments, the bias flow is substantially perpendicular to the grazing flow.

The MR headset further includes a set of microphones distributed along an outer surface of the housing, the set of microphones configured to detect audio content from a user of the MR headset as part of presenting the MR content, where each respective microphone of the set of microphones is separated from the first opening and the second opening of the aperture defined in the housing by at least an audial-interference threshold distance.

(A2) In some embodiments of A1, each respective perforation of the set of perforations is configured with a predetermined diameter corresponding to a resonant frequency of acoustic waves produced by operations of the fan.

(A3) In some embodiments of A2, the set of perforations is a first set of perforations, the predetermined diameter is a first predetermined diameter, and the inner surface of the fluidic channel defines a second set of perforations having a second predetermined diameter corresponding to another resonant frequency of acoustic waves produced by operations of the fan.

(A4) In some embodiments of A3, the first set of perforations has a first pitch, such that the first predetermined diameter and the first pitch are tuned to remove resonant acoustic waves having a first frequency. And the second set of perforations has a second pitch, such that the second predetermined diameter and the second pitch are tuned to remove resonant acoustic waves having a second frequency.

(A5) In some embodiments of any one of A1 to A4, the fluidic channel has a rectangular profile having a first dimension spanning a direction parallel to a length of the housing, and a second dimension corresponding to a depth of the housing. And the first dimension is at least double the length of the second dimension.

(A6) In some embodiments of any one of A1 to A5, the second dimension is configured based on a calculated Stokes layer of the grazing flow determined based in part on a respective size and respective pitch of each respective perforation of the set of perforations (a minimal Stokes-layer sketch is provided after these example embodiments).

(A7) In some embodiments of A1 to A6, the fluidic channel has a flared profile such that the outlet of the fluidic channel has a greater surface area than the inlet of the fluidic channel.

(A8) In some embodiments of A7, the outlet comprises two spaced openings comprising the greater surface area of the outlet of the fluidic channel.

(A9) In some embodiments of A8, a portion of the expansion chamber is positioned between the two spaced openings.

(A10) In some embodiments of A1 to A9, the set of microphones is configured in an end-fire array configuration configured to cancel external noises in front of the user from being detected by the microphones.

(A11) In some embodiments of A1 to A10, the set of microphones is configured in a broadside array configuration such that two respective microphones on each side of the MR headset are symmetrical along a plane defined by the outer surface of the MR headset.

(A12) In some embodiments of A1 to A11, the set of perforations defined by the fluidic channel are configured to reduce resonant noise caused by the fan below a value of 0 decibels of sound pressure level (SPL) for at least one range of frequencies.

(A13) In some embodiments of A1 to A12, the inner surface of the fluidic channel comprises at least five micro-perforated panels (MPPs) comprising the respective perforations of the set of perforations.

(A14) In some embodiments of A1 to A13, the inner surface of the fluidic channel is comprised of an aluminum sheet having a thickness of between 0.1 and 0.5 millimeters.

(A15) In some embodiments of A1 to A14, the inner surface of the fluidic channel further comprises a mesh having an acoustic impedance of 10 pascal-seconds per meter.

(A16) In some embodiments of A1 to A15, the back volume of the expansion chamber is between 2000 and 4000 cubic millimeters.

(A17) In some embodiments of A1 to A16, the expansion chamber has a length of at least 20 millimeters.

(A18) In some embodiments of A1 to A17, the functional opening is configured to receive a strap for attaching the MR headset to the user's head.
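Referring back to example A6, the channel depth is described as being configured based on a calculated Stokes layer of the grazing flow. The sketch below computes the classic Stokes (acoustic boundary) layer thickness, delta = sqrt(2*nu/omega), for a few frequencies; the kinematic viscosity of air and the example frequencies are illustrative assumptions, since the disclosure does not specify numerical values.

```python
import math

# Minimal Stokes-layer estimate referenced in example A6.
# delta = sqrt(2 * nu / omega), where nu is the kinematic viscosity of air
# and omega is the angular frequency of the acoustic oscillation.

AIR_KINEMATIC_VISCOSITY = 1.5e-5  # m^2/s, air at roughly 20 C (assumed)


def stokes_layer_thickness(frequency_hz: float) -> float:
    """Thickness of the oscillatory viscous boundary layer at a given frequency."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 * AIR_KINEMATIC_VISCOSITY / omega)


if __name__ == "__main__":
    for f in (500.0, 1000.0, 2000.0, 4000.0):
        print(f"{f:6.0f} Hz -> Stokes layer ~{stokes_layer_thickness(f) * 1e6:.0f} um")
```

At tonal fan-noise frequencies in the low-kilohertz range this layer is on the order of tens of micrometers, which is why the channel depth and the perforation size and pitch can be tuned together relative to it.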

While some experiences are described as occurring on an AR device and other experiences described as occurring on a MR device, one skilled in the art would appreciate that experiences can be ported over from a MR device to an AR device, and vice versa.

Some definitions of devices and components that can be included in some or all of the example devices discussed are defined here for ease of reference. A skilled artisan will appreciate that certain types of the components described may be more suitable for a particular set of devices and less suitable for a distinct set of devices. But subsequent reference to the components defined here should be considered to be encompassed by the definitions provided.

Example devices and systems, including electronic devices and systems, are discussed below in accordance with some embodiments. Such example devices and systems are not intended to be limiting, and one of skill in the art will understand that alternative devices and systems to those described herein may be used to perform the operations and construct the systems and devices that are described herein.

As described herein, an electronic device is a device that uses electrical energy to perform a specific function. It can be any physical object that contains electronic components such as transistors, resistors, capacitors, diodes, and integrated circuits. Examples of electronic devices include smartphones, laptops, digital cameras, televisions, gaming consoles, and music players, as well as the example electronic devices discussed herein. As described herein, an intermediary electronic device is a device that sits between two other electronic devices and/or a subset of components of one or more electronic devices and facilitates communication and/or data processing and/or data transfer between the respective electronic devices and/or electronic components.

The foregoing descriptions of FIGS. 4A to 5C-2 provided above are intended to augment the description provided in reference to FIGS. 1A to 3E. While terms in the following description may not be identical to terms used in the foregoing description, a person having ordinary skill in the art would understand these terms to have the same meaning.

Any data collection performed by the devices described herein and/or any devices configured to perform or cause the performance of the different embodiments described above in reference to any of the Figures, hereinafter the “devices,” is done with user consent and in a manner that is consistent with all applicable privacy laws. Users are given options to allow the devices to collect data, as well as the option to limit or deny collection of data by the devices. A user is able to opt in or opt out of any data collection at any time. Further, users are given the option to request the removal of any collected data.

It will be understood that, although the terms “first,” “second,” etc., may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the claims. As used in the description of the embodiments and the appended claims, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all combinations of one or more of the associated listed items. It will be further understood that “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

As used herein, “if” can be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting,” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” can be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” [that the stated condition precedent is true], depending on the context.

The foregoing description, for purposes of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain principles of operation and practical applications, to thereby enable others skilled in the art.
