Patent: 3D Captions with Face Tracking
Publication Number: 20250078427
Publication Date: 2025-03-06
Assignee: Snap Inc
Abstract
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing at least one program and method for performing operations comprising: receiving, by one or more processors that implement a messaging application, a video feed from a camera of a user device; detecting, by the messaging application, a face in the video feed; in response to detecting the face in the video feed, retrieving a three-dimensional (3D) caption; modifying the video feed to include the 3D caption at a position in 3D space of the video feed proximate to the face; and displaying a modified video feed that includes the face and the 3D caption.
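For orientation, the following is a minimal, illustrative Python sketch of the pipeline the abstract describes (detect a face in the camera feed, retrieve a 3D caption, anchor it in 3D space proximate to the face). It is not code from the patent; all names (Face, Caption3D, detect_face, place_caption) are hypothetical stand-ins for real camera, face-tracking, and rendering APIs.

```python
# Illustrative sketch only, not the patented implementation.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Face:
    x: float
    y: float
    z: float
    height: float

@dataclass
class Caption3D:
    text: str
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)

def detect_face(frame: dict) -> Optional[Face]:
    # Stand-in for a real face tracker; here the frame dict may
    # already carry a precomputed Face.
    return frame.get("face")

def place_caption(frame: dict, caption: Caption3D) -> dict:
    """Modify the frame to include the 3D caption proximate to the face."""
    face = detect_face(frame)
    if face is None:
        return frame  # no face detected: leave the feed unmodified
    # Anchor the caption just above the top of the head.
    caption.position = (face.x, face.y - 0.6 * face.height, face.z)
    frame["overlays"] = frame.get("overlays", []) + [caption]
    return frame

frame = {"face": Face(x=0.0, y=0.5, z=-1.0, height=0.3)}
frame = place_caption(frame, Caption3D(text="Hello!"))
```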
Claims
What is claimed is:
1. A system comprising:
at least one hardware processor;
a memory storing instructions which, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising:
selectively displaying one or more related graphical elements together with a virtual element based on a determination of which of a first camera and a second camera, directed towards a second direction different from a first direction of the first camera, is being used to capture content, the selectively displaying of the one or more related graphical elements comprising:
displaying the one or more related graphical elements together with the virtual element in response to determining that the first camera is being used to capture the content; and
removing the one or more related graphical elements in response to determining that the second camera is being used to capture the content based on receiving input comprising a request to activate the second camera.
2. The system of claim 1, the operations comprising:
in response to determining that the first camera is a front-facing camera being used, modifying the content to include a 3D caption as the virtual element with the one or more related graphical elements; and
in response to receiving the request to activate the second camera, modifying a display position of the 3D caption.
3. The system of claim 1, wherein the operations further comprise:
detecting a face in the content, wherein the virtual element is retrieved in response to detecting the face.
4. The system of claim 1, wherein the operations further comprise:
determining that the first camera being used to capture the content is a front-facing camera, the virtual element with one or more related graphical elements being displayed at a position in 3D space of the content proximate to a face in response to determining that the front-facing camera is being used.
5. The system of claim 1, wherein the operations further comprise:
modifying the content captured by the second camera to transition the virtual element to be displayed on a surface depicted in the content from being displayed proximate to a face;
in response to receiving a request to access a virtual element feature after less than a threshold amount of time has elapsed since a content segment comprising the modified content was stored, restoring at least one of a style, text, color, or graphical element of the virtual element, for display; and
in response to determining that the request to access the virtual element feature is received after more than the threshold amount of time has elapsed since the content segment was stored, presenting a virtual element entry interface with default parameters.
6. The system of claim 1, wherein the operations further comprise:
receiving a request to access a virtual element manipulation feature;
determining that the request is a request to access the virtual element manipulation feature for a first time; and
presenting, in the content, a 3D hint in front of the virtual element that repeatedly animates instructions for modifying placement of the virtual element.
7. The system of claim 1, wherein the operations further comprise:
detecting contact between a screen in which the content is displayed and two fingers of a user, the contact being at a location in the screen in which the virtual element is displayed;
after detecting the contact, determining that one of the two fingers has been released from contacting the screen; and
in response to determining that one of the two fingers has been released from contacting the screen, providing an option to translate a position of the virtual element up and down along a y-axis.
8. The system of claim 1, wherein the operations further comprise:
determining that resources of a user device satisfy a resource threshold; and
in response to determining that the resources of the user device satisfy the resource threshold, presenting a virtual element modification option to modify at least one of a text style or color of the virtual element.
9. The system of claim 1, wherein the operations further comprise curving the virtual element around a top of a face.
10. The system of claim 1, wherein the operations further comprise:
determining that a face is no longer detected in the content; and
in response to determining that the face is no longer detected, disabling a feature that enables addition of virtual elements.
11. The system of claim 1, wherein the operations further comprise:
detecting first and second faces in the content;
determining that the second face includes a greater number of pixels than the first face; and
in response to determining that the second face includes the greater number of pixels than the first face, modifying the content to include the virtual element at a position in three-dimensional space of the content proximate to the second face instead of the first face.
12. The system of claim 1, wherein the operations further comprise:
determining context associated with the content; and
automatically populating text of the virtual element based on the context.
13. The system of claim 1, wherein the operations further comprise:
detecting input indicating that a user tapped on a screen at a position of the virtual element that is displayed in the content; and
in response to detecting the input, presenting text of the virtual element in 2D to enable the user to modify the text.
14. The system of claim 13, wherein the operations further comprise dimming the screen in which the text is presented to focus the user on the text.
15. The system of claim 13, wherein the operations further comprise:
determining that the user tapped on the screen at a location between two characters of the text; and
positioning a cursor to modify the text starting from the location between the two characters of the text in response to determining that the user tapped on the screen at the location between the two characters of the text.
16. The system of claim 13, wherein the operations further comprise enabling adjustment of a size and layout of the text using a pinch gesture based on a width of the text.
17. A method comprising:
selectively displaying one or more related graphical elements together with a virtual element based on a determination of which of a first camera and a second camera, directed towards a second direction different from a first direction of the first camera, is being used to capture content, the selectively displaying of the one or more related graphical elements comprising:
displaying the one or more related graphical elements together with the virtual element in response to determining that the first camera is being used to capture the content; and
removing the one or more related graphical elements in response to determining that the second camera is being used to capture the content based on receiving input comprising a request to activate the second camera.
18. The method of claim 17, further comprising:
in response to determining that the first camera is a front-facing camera being used to capture the content, modifying the content to include a virtual element comprising a 3D caption with the one or more related graphical elements;
receiving a request to activate the second camera to capture content; and
in response to receiving the request to activate the second camera:
removing the one or more related graphical elements from the content; and
modifying a display position of the virtual element comprising the 3D caption.
19. The method of claim 17, further comprising:
detecting a face in the content, wherein the virtual element is retrieved in response to detecting the face.
20. A non-transitory machine-readable medium storing instructions which, when executed by one or more processors of a machine, cause the machine to perform operations comprising:
selectively displaying one or more related graphical elements together with a virtual element based on a determination of which of a first camera and a second camera, directed towards a second direction different from a first direction of the first camera, is being used to capture content, the selectively displaying of the one or more related graphical elements comprising:
displaying the one or more related graphical elements together with the virtual element in response to determining that the first camera is being used to capture the content; and
removing the one or more related graphical elements in response to determining that the second camera is being used to capture the content based on receiving input comprising a request to activate the second camera.
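The following is a minimal sketch of the selective-display logic recited in independent claims 1, 17, and 20, assuming hypothetical names throughout (Camera, CaptionScene): the related graphical elements accompany the virtual element only while the first camera is active, and are removed when input requests activation of the second camera.

```python
# Illustrative sketch of claims 1/17/20; not the claimed implementation.
from enum import Enum

class Camera(Enum):
    FIRST = "front-facing"   # first camera
    SECOND = "rear-facing"   # second camera, facing a different direction

class CaptionScene:
    def __init__(self, virtual_element: dict, related_elements: list):
        self.virtual_element = virtual_element
        self.related_elements = related_elements
        self.active_camera = Camera.FIRST

    def visible_items(self) -> list:
        items = [self.virtual_element]
        if self.active_camera is Camera.FIRST:
            # Related elements are displayed only with the first camera.
            items.extend(self.related_elements)
        return items

    def request_camera(self, camera: Camera) -> None:
        self.active_camera = camera
        if camera is Camera.SECOND:
            # Per claims 2 and 18: on switching, the related elements
            # drop out of visible_items() and the 3D caption's display
            # position is modified (e.g., re-anchored to a surface).
            self.virtual_element["anchor"] = "surface"
```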
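Claim 5's time-window behavior can be sketched the same way. The threshold value and all names below are assumptions for illustration: reopening the caption feature soon after a content segment was stored restores its prior style, text, and color; past the threshold, a fresh entry interface with defaults is presented.

```python
# Illustrative sketch of claim 5's threshold window; names and the
# 60-second value are hypothetical.
import time
from typing import Optional

RESTORE_WINDOW_SECONDS = 60.0  # hypothetical threshold
DEFAULT_PARAMS = {"style": "plain", "text": "", "color": "white"}

def open_caption_feature(segment: dict, now: Optional[float] = None) -> dict:
    now = time.time() if now is None else now
    if now - segment["stored_at"] < RESTORE_WINDOW_SECONDS:
        # Within the threshold: restore the prior caption parameters.
        return {"mode": "restore", "params": segment["caption_params"]}
    # Threshold elapsed: present the entry interface with defaults.
    return {"mode": "new", "params": DEFAULT_PARAMS}
```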
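Finally, claim 11's face selection reduces to a comparison of pixel counts; a one-function sketch with a hypothetical DetectedFace type:

```python
# Illustrative sketch of claim 11: when two faces are detected, the
# virtual element is anchored to whichever face covers more pixels.
from dataclasses import dataclass

@dataclass
class DetectedFace:
    width_px: int
    height_px: int

def select_anchor_face(faces: list) -> DetectedFace:
    # The face spanning the greater number of pixels wins the anchor.
    return max(faces, key=lambda f: f.width_px * f.height_px)
```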
Description
CLAIM OF PRIORITY
This application is a continuation of U.S. patent application Ser. No. 18/375,693, filed on Oct. 2, 2023, which is a continuation of U.S. patent application Ser. No. 17/581,093, filed on Jan. 21, 2022, which is a continuation of U.S. patent application Ser. No. 16/721,418, filed on Dec. 19, 2019, which are incorporated herein by reference in their entireties.
TECHNICAL FIELD
The present disclosure relates generally to visual presentations and more particularly to rendering virtual objects within a real-world environment captured in a camera feed of a computing device.
BACKGROUND
Augmented reality (AR) refers to supplementing the view of real-world objects and environments with computer-generated graphics content. Virtual rendering systems can be used to create, view, and interact with engaging and entertaining AR experiences, in which 3D virtual object graphics content appears to be present in the real world. Virtual rendering systems are frequently implemented within mobile devices such as smartphones and tablets.
BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the figure number in which that element is first introduced. Some embodiments are illustrated by way of example, and not limitation, in the figures of the accompanying drawings in which:
FIG. 1 is a block diagram showing a messaging system for exchanging data (e.g., messages and associated content) over a network, according to example embodiments.
FIG. 2 is a block diagram illustrating further details regarding a messaging system, according to example embodiments.
FIG. 3 is a schematic diagram illustrating data which may be stored in the database of the messaging system, according to example embodiments.
FIG. 4 is a schematic diagram illustrating a structure of a message generated by a messaging client application for communication, according to example embodiments.
FIG. 5 is a block diagram illustrating various components of a three-dimensional (3D) caption system, which may be provided as part of the messaging system, according to example embodiments.
FIGS. 6 and 7 are flowcharts illustrating example operations of the 3D caption system in performing a method for generating a message that includes a 3D caption, according to example embodiments.
FIGS. 8-11 are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIGS. 12A-12C are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIGS. 13A-13D are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIGS. 14A and 14B are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIG. 15 is an interface diagram that illustrates an interface provided by the messaging system, according to some example embodiments.
FIGS. 16A and 16B are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIGS. 17A and 17B are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIGS. 18A-18J are interface diagrams that illustrate various interfaces provided by the messaging system, according to some example embodiments.
FIG. 19 is a block diagram illustrating a representative software architecture, which may be used in conjunction with various hardware architectures herein described, according to example embodiments.
FIG. 20 is a block diagram illustrating components of a machine able to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein, according to example embodiments.