Sony Patent | Head gesture-based control with a hearable device
Publication Number: 20250306688
Publication Date: 2025-10-02
Assignee: Sony Group Corporation
Abstract
A head gesture control system is provided that enables user control of features associated with a hearable device by using head gestures. The system determines that a movement by a user is a head control gesture designated for a particular adjustment. Various gesture factors are employed in this determination. The head control gesture may be used in combination with other types of device controls, such as tap and voice. A feedback indicator is provided back to the user describing the feature adjustment and enabling the user to ensure proper control is carried out. The user can then make additional or different adjustments or cancel the adjustment, if desired.
Claims
We claim:
1. A method for using a head gesture to control a feature associated with a hearable device, the method comprising:
detecting a plurality of first user movements of a user of the hearable device;
identifying the plurality of first user movements as a head control gesture corresponding to a particular adjustment of the feature associated with the hearable device, by applying one or more gesture factors;
based, at least in part, on identifying the head control gesture, adjusting the feature according to the particular adjustment; and
outputting to the user, a feedback indicator to describe the adjusting of the feature.
2. The method of claim 1, wherein the feature is selected from the group of: setting, mode, audio content player, audio beam focus, audio source tracking, calling interaction, and smart assistant operation, and combinations thereof.
3. The method of claim 1, further comprising:
assessing the plurality of first user movements to determine a target sound source in an environment of the user;
receiving by one or more microphones of the hearable device, sound signals for a sound from the target sound source;
based, at least in part, on determining the target sound source, locking onto the target sound source by adjusting one or more audio elements of the hearable device to enhance hearing of the sound;
tracking a change in direction of the target sound source; and
based on the change in direction, readjusting the feature to maintain enhanced hearing of the sound.
4. The method of claim 1, wherein the feature includes one or more audio elements, the method further comprising:
assessing the plurality of first user movements to determine a direction of a target sound source in an environment of the user;
receiving by one or more microphones of the hearable device, sound signals for a sound from the target sound source;
based, at least in part, on determining the target sound source, locking onto the target sound source by adjusting the one or more audio elements of the hearable device to enhance hearing of the sound;
tracking a change in direction of the target sound source as the target sound source moves location relative to the user; and
based on the change in direction, readjusting the feature to maintain enhanced hearing of the sound.
5. The method of claim 1, wherein the feature includes audio beam focusing and wherein the feedback indicator includes a notification of a section of a sound field to which the audio beam focusing is directed.
6. The method of claim 1, further comprising:
receiving a plurality of second user movements;
gathering context information associated with the plurality of second user movements;
applying one or more non-gesture factors to identify the plurality of second user movements as non-gesture movements; and
rejecting the non-gesture movements for control of the feature.
7. The method of claim 1, wherein identifying the head control gesture comprises:
detecting a base head position prior to the plurality of first user movements; and
assessing the plurality of first user movements relative to the base head position.
8. The method of claim 1, further comprising:
outputting an inquiry for user control;
detecting the plurality of first user movements; and
determining the plurality of first user movements is responsive to the inquiry.
9. A head gesture control system to adjust a feature associated with a hearable device, the head gesture control system comprising:
at least one sensor to detect a plurality of user movements of a user using the hearable device;
a hearable device of a user comprising:
one or more processors; and
logic encoded in one or more non-transitory media for execution by the one or more processors and when executed, operable to perform operations comprising:
detecting a plurality of first user movements of a user of the hearable device;
identifying the plurality of first user movements as a head control gesture corresponding to a particular adjustment of the feature associated with the hearable device, by applying one or more gesture factors;
based, at least in part, on identifying the head control gesture, adjusting the feature of the hearable device according to the particular adjustment; and
outputting to the user, a feedback indicator to describe the adjusting of the feature.
10. The head gesture control system of claim 9, wherein the feature is selected from the group of: setting, mode, audio content player, audio beam focus, audio source tracking, calling interaction, and smart assistant operation, and combinations thereof.
11. The head gesture control system of claim 9, wherein the operations further comprise:
assessing the plurality of first user movements to determine a target sound source in an environment of the user;
receiving by one or more microphones of the hearable device, sound signals for a sound from the target sound source;
based, at least in part, on determining the target sound source, locking onto the target sound source by adjusting one or more audio elements of the hearable device to enhance hearing of the sound;
tracking a change in direction of the target sound source; and
based on the change in direction, readjusting the one or more audio elements to maintain enhanced hearing of the sound.
12. The head gesture control system of claim 9, wherein the operations further comprise:
detecting at least one of a tactile input, voice input, and visual input; and
identifying the at least one of the tactile input, voice input, and visual input as a control input for the feature, wherein adjusting of the feature of the hearable device is further based on identifying the control input.
13. The head gesture control system of claim 9, further comprising a wearable configured to be worn by the user and holding the sensor positioned to detect the plurality of first user movements.
14. The head gesture control system of claim 13, wherein the plurality of user movements includes eye movement and wherein the wearable includes at least one reverse camera configured to detect the eye movement.
15. The head gesture control system of claim 9, wherein the operations further comprise:
receiving a plurality of second user movements;
gathering context information associated with the plurality of second user movements;
applying one or more non-gesture factors to identify the plurality of second user movements as non-gesture movements; and
rejecting the non-gesture movements for control of the feature.
16. A non-transitory computer-readable storage medium carrying program instructions thereon for using a head gesture to control a feature associated with a hearable device, the instructions when executed by one or more processors cause the one or more processors to perform operations comprising:
detecting a plurality of first user movements of a user of the hearable device;
identifying the plurality of first user movements as a head control gesture corresponding to a particular adjustment of the feature associated with the hearable device, by applying one or more gesture factors;
based, at least in part, on identifying the head control gesture, adjusting the feature of the hearable device according to the particular adjustment; and
outputting to the user, a feedback indicator to describe the adjusting of the feature.
17. The non-transitory computer-readable storage medium of claim 16, wherein the feature is selected from the group of: setting, mode, audio content player, audio beam focus, audio source tracking, calling interaction, and smart assistant operation, and combinations thereof.
18. The non-transitory computer-readable storage medium of claim 16, wherein the operations further comprise:
assessing the plurality of first user movements to determine a target sound source in an environment of the user;
receiving by one or more microphones of the hearable device, sound signals for a sound from the target sound source;
based, at least in part, on determining the target sound source, locking onto the target sound source by adjusting one or more audio elements of the hearable device to enhance hearing of the sound;
tracking a change in direction of the target sound source; and
based on the change in direction, readjusting the one or more audio elements to maintain enhanced hearing of the sound.
19. The non-transitory computer-readable storage medium of claim 16, wherein the feature includes one or more audio elements, and the operations further comprise:
assessing the plurality of first user movements to determine a direction of a target sound source in an environment of the user;
receiving by one or more microphones of the hearable device, sound signals for a sound from the target sound source;
based, at least in part, on determining the target sound source, locking onto the target sound source by adjusting the one or more audio elements of the hearable device to enhance hearing of the sound;
tracking a change in direction of the target sound source as the target sound source moves location relative to the user; and
based on the change in direction, readjusting the feature to maintain enhanced hearing of the sound.
20. The non-transitory computer-readable storage medium of claim 16, wherein the operations further comprise:
receiving a plurality of second user movements;
gathering context information associated with the plurality of second user movements;
applying one or more non-gesture factors to identify the plurality of second user movements as non-gesture movements; and
rejecting the non-gesture movements for control of the feature.
Description
CROSS REFERENCES TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 63/571,967, entitled HEAD GESTURE-BASED CONTROL WITH A HEARABLE DEVICE, filed on Mar. 29, 2024 (020699-124700US/SYP352697US01), which is hereby incorporated by reference as if set forth in full in this application for all purposes. This application is also related to the following application, U.S. patent application Ser. No. 18/622,606, entitled NON-SPEECH SOUND CONTROL WITH A HEARABLE DEVICE, filed on Mar. 29, 2024 (020699-124600US/SYP352670US01), which is hereby incorporated by reference as if set forth in full in this application for all purposes.
BACKGROUND
Non-verbal behaviors can be used to communicate in subtle ways. A head gesture, such as a gaze, a shrug, or a nod can communicate different intents according to culture, context, or definition. Devices that allow for gesture interactions by users can allow for greater use of the device. Head gesture device controls can allow users to multitask by freeing hands and voice. Typically, users can control devices by pressing buttons, tapping or otherwise touching a portion of the device, opening an application on another device (e.g., a smart phone), or using voice assistance.
Hearable devices (interchangeably called “hearables”) include a variety of ear-worn devices configured to alter the hearing abilities of the user, such as playing audio close to or into the ear (e.g., headphones, earbuds), blocking environmental audio (e.g., headphones covering the ears and noise-canceling devices), enhancing hearing of environmental audio (e.g., hearing aids), etc. Hearable devices have become common accessories, worn and connected with other devices, such as smart phones, that have become constant fixtures for people. Simple, hands-free control using hearable devices can be a significant convenience.
SUMMARY
A head gesture control system (also called “control system”, “gesture control system” or “system”) is provided that enables user control of features associated with a hearable device by using head gestures. The system determines that a movement by a user is a head control gesture designated for a particular adjustment. Feedback is provided to the user describing the feature adjustment, e.g., boosting voice frequencies, and enabling the user to ensure proper control is carried out. The user can then make additional or different adjustments or cancel the adjustment, if desired.
A method is provided for using head gestures to control one or more features associated with a hearable device. The hearable device detects at least one user movement, and typically a plurality of user movements, of a user using the hearable device. The user movement(s) are identified as a head control gesture by applying one or more gesture factors that correlate with particular adjustments of a feature. The head control gesture corresponds to a particular adjustment of a feature associated with the hearable device. Based, at least in part, on identifying the head control gesture, the feature is adjusted according to the particular adjustment. The feature adjusted in this manner may be selected from the group of: setting, mode, audio content player, audio beam focus, sound tracking, calling interaction, and smart assistant operation. Other features may also be adjusted in this manner. A feedback indicator may be output to the user, providing a description of the feature adjustment.
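The patent does not specify an implementation of the detect-identify-adjust-feedback pipeline; the following is an illustrative Python sketch under assumed names and thresholds. The gesture factors here (a minimum pitch swing for a nod, a maximum gesture duration) and the mapping of two rapid nods to a volume adjustment are hypothetical examples, not taken from the disclosure.

```python
from dataclasses import dataclass

# Hypothetical gesture factors; the patent leaves concrete values unspecified.
NOD_PITCH_DEG = 15.0       # minimum pitch swing to count as a nod
MAX_GESTURE_SECONDS = 1.5  # window in which the movements must complete

@dataclass
class HeadMovement:
    pitch_deg: float   # up/down rotation relative to a base position
    yaw_deg: float     # left/right rotation
    duration_s: float

def identify_head_control_gesture(movements):
    """Apply gesture factors to a sequence of detected movements.

    Returns the name of the particular adjustment, or None if the
    movements do not satisfy the gesture factors.
    """
    total_time = sum(m.duration_s for m in movements)
    if total_time > MAX_GESTURE_SECONDS:
        return None  # too slow: likely ordinary head movement
    nods = [m for m in movements if abs(m.pitch_deg) >= NOD_PITCH_DEG]
    if len(nods) >= 2:
        return "volume_up"  # assumed mapping: two rapid nods raise volume
    return None

def adjust_feature(adjustment):
    """Apply the adjustment and return a feedback indicator string."""
    if adjustment is None:
        return "no adjustment"
    return f"feedback: applied {adjustment}"
```

A caller would feed sensor-derived `HeadMovement` samples into `identify_head_control_gesture` and speak or display the string returned by `adjust_feature` as the feedback indicator.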
Some implementations may include a locking functionality in which the user movement is assessed to determine a target sound source in the environment of the user to which the feature adjustment is to be directed. The feature may include one or more audio elements. One or more microphones of the hearable device receive sound signals for a sound from the target sound source. Based, at least in part, on determining the target sound source, the control system locks the feature onto the target sound source, for example, by adjusting one or more audio elements of the hearable device to enhance hearing of the sound. A change in direction of the target sound source can be tracked as it moves location relative to the user. Based on the change in direction, the feature may be adjusted to maintain enhanced hearing of the sound.
A change in direction of the target sound source is tracked, such as via sensors of the hearable device or otherwise in communication with the hearable device. Based on the change in direction, the feature may be readjusted to maintain enhanced hearing of the target sound source.
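The locking and tracking behavior above can be sketched as a small steering calculation. This is an illustrative assumption, not the disclosed implementation: angles are degrees of azimuth in the horizontal plane, with 0 straight ahead of the user's base position, and the beam direction is expressed relative to the current head orientation so the lock holds while the head rotates or the source moves.

```python
class SourceLock:
    """Keep an audio beam aimed at a locked sound source as either the
    source or the listener's head moves (illustrative sketch)."""

    def __init__(self, source_azimuth_deg):
        self.source_azimuth = source_azimuth_deg

    def on_source_moved(self, new_azimuth_deg):
        # Tracking update: the target sound source changed direction.
        self.source_azimuth = new_azimuth_deg

    def beam_direction(self, head_yaw_deg):
        # Steer relative to the current head orientation, wrapped to
        # (-180, 180], so the focus stays on the source during rotation.
        return (self.source_azimuth - head_yaw_deg + 180) % 360 - 180
```

For example, a source locked straight ahead stays locked when the head turns 30 degrees right: the beam simply steers 30 degrees left of the new head orientation.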
In some aspects, the head control gesture may include at least one eye gaze event for a predefined period of time in a direction of the target sound source. The feature may also include audio beam focusing. In some cases, the feedback may include a notification of a section of a sound field that the audio beam focusing is directed.
Output of the feedback indicator may also include steps such as receiving, by one or more microphones of the hearable device, sound signals from a target sound source. The sound may be matched with a stored sound print of one or more stored sound prints of candidate sound sources. The target sound source may be identified as a recognized source among the candidate sound sources. The feedback indicator may be an audio identification of the recognized source. This may cause the recognized source to be tracked and focused on regardless of head control gestures or other controls.
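The sound-print matching step might look like the following sketch. The representation of a sound print as a fixed-length feature vector, the use of cosine similarity, and the threshold value are all assumptions for illustration; the patent does not specify a matching technique.

```python
def match_sound_print(observed, stored_prints, threshold=0.8):
    """Match an observed signature against stored sound prints.

    `observed` and each stored print are equal-length feature vectors.
    Returns the name of the best-matching candidate source, or None if
    no stored print clears the similarity threshold.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    best_name, best_score = None, threshold
    for name, print_vec in stored_prints.items():
        score = cosine(observed, print_vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

The recognized name returned here is what the feedback indicator would announce to the user (e.g., an audio identification of the source).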
In still other implementations, a user movement may be detected, context information associated with the user movement gathered, one or more non-gesture factors applied to identify the user movement as a non-gesture movement, and the non-gesture movement rejected for control of the feature. It should be noted that head control gestures can be used in combination with other controls, such as tapping the device and voice control.
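A rejection filter of this kind can be sketched briefly. The particular non-gesture factors shown, such as the user walking or in conversation and a movement that is too slow to be deliberate, are hypothetical examples of the context information the patent describes.

```python
def reject_non_gesture(movement, context):
    """Apply assumed non-gesture factors to decide whether a detected
    movement should be rejected as ordinary head motion."""
    # Context-based factors: everyday activities produce head motion
    # that should not trigger feature adjustments.
    if context.get("walking") or context.get("in_conversation"):
        return True
    # Movement-based factor: deliberate control gestures are quick.
    if movement.get("duration_s", 0.0) > 2.0:
        return True
    return False
```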
At times, the user movement may be assessed from a starting point of a base head position. The base head position may be detected prior to the user movement. For example, the base head position may be used to positionally focus a lock on a sound source that is directly in front of the user using a head gesture, such as a couple of rapid nods. Assessment of the user movement may be relative to the base head position. For example, when the user rotates the head, the focus on the sound source can be kept locked. As discussed in more detail later, the base head position is useful to more easily determine other gestures, such as left and right head tilts and side tilts.
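Assessing movements relative to a base head position can be sketched as follows. The orientation representation (pitch/yaw/roll in degrees), the gesture labels, and the tilt threshold are illustrative assumptions; the point is only that gestures are classified from the user's own neutral posture rather than an absolute frame.

```python
class BaseHeadPosition:
    """Track a base (resting) head orientation and express subsequent
    movements relative to it (illustrative sketch)."""

    def __init__(self, pitch_deg=0.0, yaw_deg=0.0, roll_deg=0.0):
        self.base = (pitch_deg, yaw_deg, roll_deg)

    def recalibrate(self, pitch_deg, yaw_deg, roll_deg):
        # Called when the user settles into a new resting posture.
        self.base = (pitch_deg, yaw_deg, roll_deg)

    def relative(self, pitch_deg, yaw_deg, roll_deg):
        bp, by, br = self.base
        return (pitch_deg - bp, yaw_deg - by, roll_deg - br)

    def classify(self, pitch_deg, yaw_deg, roll_deg, tilt_deg=20.0):
        # Assumed threshold; the patent does not specify values.
        _, dyaw, droll = self.relative(pitch_deg, yaw_deg, roll_deg)
        if droll <= -tilt_deg:
            return "left side tilt"
        if droll >= tilt_deg:
            return "right side tilt"
        if abs(dyaw) >= tilt_deg:
            return "head turn"
        return "neutral"
```

Recalibrating the base position when the posture changes (for example, after the user leans back) keeps the same physical tilt producing the same classified gesture.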
In some implementations, an inquiry may be outputted to the user regarding user control of a feature. User movement is detected and determined whether the user movement is responsive to the inquiry. The feature adjustment may take place or be halted, accordingly.
In some implementations, a head gesture control system (also referred to as an apparatus) is provided, which is configured to adjust a feature associated with a hearable device. The head gesture control system has at least one sensor to detect a plurality of user movements of a user using the hearable device. The system also includes a hearable device including one or more processors and logic encoded in one or more non-transitory media for execution by the one or more processors and, when executed, operable to perform various operations as described above in terms of the method. Additional operations may be performed, for example, to combine the head control gesture with other input controls. At least one of a tactile input, voice input, and visual input may be detected and identified as a control input for the feature associated with the detected head control gesture. The feature of the hearable device may be adjusted based on identifying the control input as well as the head control gesture.
In some implementations, the control system may include a wearable configured to be worn by the user and holding the sensor positioned to detect the plurality of first user movements.
In some implementations, a non-transitory computer-readable storage medium is provided which carries program instructions for adjusting features based on detected user head control gestures. These instructions, when executed by one or more processors, cause the one or more processors to perform operations as described above for the method.
A further understanding of the nature and the advantages of particular embodiments disclosed herein may be realized by reference to the remaining portions of the specification and the attached drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure is illustrated by way of example, and not by way of limitation in the figures in which like reference numerals are used to refer to similar elements.
FIG. 1 is a conceptual diagram illustrating an example of the head gesture control system in which head level is detected, in accordance with some implementations.
FIGS. 2A and 2B are conceptual diagrams illustrating examples of the head gesture control system in which FIG. 2A shows use of a head control gesture to direct a size of a focus area and FIG. 2B shows use of head control gesture to focus on an object, in accordance with some implementations.
FIG. 3 is a conceptual diagram illustrating an example of the head gesture control system that includes eye movement control of a feature by identification of a section of a field of view, in accordance with some implementations.
FIG. 4 is a conceptual diagram illustrating an example of the head gesture control system that includes eye movement control of a feature by identification of an object in the field of view, in accordance with some implementations.
FIG. 5 is a conceptual diagram illustrating an example of the head gesture control system that includes eye gaze control of a feature using vertical spatial separation, in accordance with some implementations.
FIG. 6 is a flow diagram of an example method for controlling a feature associated with a hearable using head control gestures, in accordance with some implementations.
FIG. 7 is a flow diagram of example methods for controlling a feature associated with a hearable by locking onto a sound source, in accordance with some implementations.
FIG. 8 is a block diagram of components of the head gesture control system usable to implement the processes of FIGS. 6 and 7, in accordance with some implementations.