Facebook Patent | Joint infrared and visible light visual-inertial object tracking
Publication Number: 20210208673
Publication Date: 2021-07-08
Applicant: Facebook
Abstract
In one embodiment, a method for tracking includes receiving motion data captured by a motion sensor of a wearable device, generating a pose of the wearable device based on the motion data, capturing a first frame of the wearable device by a camera using a first exposure time, identifying, in the first frame, a pattern of lights disposed on the wearable device, capturing a second frame of the wearable device by the camera using a second exposure time, identifying, in the second frame, predetermined features of the wearable device, and adjusting the pose of the wearable device in the environment based on the identified pattern of lights in the first frame or the identified predetermined features in the second frame. The method utilizes the predetermined features for tracking the wearable device in a visible-light frame under specific light conditions to improve the accuracy of the pose of the controller.
Claims
-
A method comprising, by a computing system: receiving motion data captured by one or more motion sensors of a wearable device; generating a pose of the wearable device based on the motion data; capturing a first frame of the wearable device by a camera using a first exposure time; identifying, in the first frame, a pattern of lights disposed on the wearable device; capturing a second frame of the wearable device by the camera using a second exposure time; identifying, in the second frame, predetermined features of the wearable device; and adjusting the pose of the wearable device in an environment based on at least one of (1) the identified pattern of lights in the first frame or (2) the identified predetermined features in the second frame.
-
The method of claim 1, wherein: capturing the first frame of the wearable device using the first exposure time when the environment has a first light condition; and capturing the second frame of the wearable device using the second exposure time when the environment has a second light condition.
-
The method of claim 2, wherein the second light condition comprises one or more of: an environment having bright light; an environment having a light source that interferes with the pattern of lights of the wearable device; and the camera not being able to capture the pattern of lights.
-
The method of claim 1, wherein: the wearable device is equipped with one or more inertial measurement units (IMUs) and one or more infrared (IR) light emitting diodes (LEDs); the first frame is an IR image; and the second frame is a visible-light image.
-
The method of claim 1, wherein the second exposure time is longer than the first exposure time.
-
The method of claim 1, wherein the pose of the wearable device is generated at a faster frequency than a frequency at which the first frame and the second frame are captured.
-
The method of claim 1, further comprising: capturing a third frame of the wearable device by the camera using the second exposure time; identifying, in the third frame, one or more features corresponding to the predetermined features of the wearable device; determining correspondence data between the predetermined features and the one or more features; and tracking the wearable device in the environment based on the correspondence data.
-
The method of claim 1, wherein the computing system comprises: the camera configured to capture the first frame and the second frame of the wearable device; an identifying unit configured to identify the pattern of lights and the predetermined features of the wearable device; and a filter unit configured to adjust the pose of the wearable device.
-
The method of claim 1, wherein the camera is located within a head-mounted device; and wherein the wearable device is a controller separated from the head-mounted device.
-
The method of claim 9, wherein the head-mounted device comprises one or more processors, wherein the one or more processors are configured to implement the camera, the identifying unit, and the filter unit.
-
One or more computer-readable non-transitory storage media embodying software that is operable when executed to: receive motion data captured by one or more motion sensors of a wearable device; generate a pose of the wearable device based on the motion data; capture a first frame of the wearable device by a camera using a first exposure time; identify, in the first frame, a pattern of lights disposed on the wearable device; capture a second frame of the wearable device by the camera using a second exposure time; identify, in the second frame, predetermined features of the wearable device; and adjust the pose of the wearable device in an environment based on at least one of (1) the identified pattern of lights in the first frame or (2) the identified predetermined features in the second frame.
-
The media of claim 11, wherein the software is further operable when executed to: capture the first frame of the wearable device using the first exposure time when the environment has a first light condition; and capture the second frame of the wearable device using the second exposure time when the environment has a second light condition.
-
The media of claim 12, wherein the second light condition comprises one or more of: an environment having bright light; an environment having a light source that interferes with the pattern of lights of the wearable device; and the camera not being able to capture the pattern of lights.
-
The media of claim 11, wherein: the wearable device is equipped with one or more inertial measurement units (IMUs) and one or more infrared (IR) light emitting diodes (LEDs); the first frame is an IR image; and the second frame is a visible-light image.
-
The media of claim 11, wherein the second exposure time is longer than the first exposure time.
-
The media of claim 11, wherein the pose of the wearable device is generated at a faster frequency than a frequency at which the first frame and the second frame are captured.
-
The media of claim 11, wherein the software is further operable when executed to: capture a third frame of the wearable device by the camera using the second exposure time; identify, in the third frame, one or more features corresponding to the predetermined features of the wearable device; determine correspondence data between the predetermined features and the one or more features; and track the wearable device in the environment based on the adjusted pose and the correspondence data.
-
The media of claim 11, wherein the camera is located within a head-mounted device; and wherein the wearable device is a remote controller separated from the head-mounted device.
-
A system comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by the one or more of the processors to cause the system to: receive motion data captured by one or more motion sensors of a wearable device; generate a pose of the wearable device based on the motion data; capture a first frame of the wearable device by a camera using a first exposure time; identify, in the first frame, a pattern of lights disposed on the wearable device; capture a second frame of the wearable device by the camera using a second exposure time; identify, in the second frame, predetermined features of the wearable device; and adjust the pose of the wearable device in an environment based on at least one of (1) the identified pattern of lights in the first frame or (2) the identified predetermined features in the second frame.
-
The system according to claim 19, wherein the instructions further cause the system to: capture the first frame of the wearable device using the first exposure time when the environment has a first light condition; and capture the second frame of the wearable device using the second exposure time when the environment has a second light condition.
Description
TECHNICAL FIELD
[0001] This disclosure generally relates to infrared-based object tracking, and more specifically methods, apparatus, and system for inertial-aided infrared and visible light tracking.
BACKGROUND
[0002] Current AR/VR controllers are tracked using known patterns formed by infrared (IR) light emitting diodes (LEDs) on the controllers. Although each controller has an IMU and the IMU data could be used to determine the pose of the controller, the estimated pose will inevitably drift over time. Thus, the IMU-based pose estimations of the controller periodically need to be realigned with the patterns observed by the camera. In addition, tracking based on the IR LEDs has several shortcomings. For example, bright sunlight or other infrared light sources can cause tracking to fail. Furthermore, when the controller is close to the user’s head, the IR LEDs may not be visible to allow for proper tracking.
SUMMARY OF PARTICULAR EMBODIMENTS
[0003] To address the foregoing problems, disclosed are methods, apparatuses, and a system to track a controller by alternately capturing a short-exposure frame and a long-exposure frame of an object, such as alternately performing infrared (IR)-based tracking and visual inertial odometry (VIO) tracking with a camera. The present disclosure provides a method to realign a location of the controller by alternately taking an IR image of the controller with a shorter exposure time and a visible-light image with a longer exposure time. The method disclosed in the present application may consider the condition of the environment to track the controller based on the IR-based observations or the visible-light observations. Furthermore, the method disclosed in the present application may re-initiate the tracking of the controller periodically or when the controller is visible in the field of view of the camera, so that the accuracy of the estimated pose of the controller can be improved over time.
[0004] The embodiments disclosed herein are only examples, and the scope of this disclosure is not limited to them. Particular embodiments may include all, some, or none of the components, elements, features, functions, operations, or steps of the embodiments disclosed herein. According to one embodiment of a method, the method comprises, by a computing system, receiving motion data captured by one or more motion sensors of a wearable device. The method further comprises generating a pose of the wearable device based on the motion data. The method yet further comprises capturing a first frame of the wearable device by a camera using a first exposure time. The method additionally comprises identifying, in the first frame, a pattern of lights disposed on the wearable device. The method further comprises capturing a second frame of the wearable device by the camera using a second exposure time. The method further comprises identifying, in the second frame, predetermined features of the wearable device. In particular embodiments, the predetermined features may be features identified in a previous frame. The method yet further comprises adjusting the pose of the wearable device in an environment based on at least one of (1) the identified pattern of lights in the first frame or (2) the identified predetermined features in the second frame.
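The adjustment step described above can be sketched in code. This is a minimal illustration only, not the patent's implementation; the function names, frame representations, and the preference order when both observations are available are assumptions.

```python
# Illustrative sketch of the claimed adjustment step. All names and data
# shapes are assumptions; real detectors would process camera images.

def identify_led_pattern(ir_frame):
    # Stand-in for constellation detection in the short-exposure IR frame.
    return ir_frame.get("led_pattern")

def identify_features(visible_frame):
    # Stand-in for predetermined-feature detection in the long-exposure frame.
    return visible_frame.get("features", [])

def adjust_pose(imu_pose, ir_frame, visible_frame):
    """Adjust the IMU-propagated pose using the LED pattern when the IR
    frame yields one, else the predetermined features, else keep the pose."""
    pattern = identify_led_pattern(ir_frame)
    if pattern is not None:
        return {"pose": pattern["pose"], "source": "ir"}
    features = identify_features(visible_frame)
    if features:
        return {"pose": features[0]["pose"], "source": "visible"}
    return {"pose": imu_pose, "source": "imu"}
```

The fallback to the raw IMU pose reflects the claim structure: the visual observations correct drift when available, and the motion-data pose stands otherwise.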
[0005] Embodiments according to the invention are in particular disclosed in the attached claims directed to a method, a storage medium, a system and a computer program product, wherein any feature mentioned in one claim category, e.g. method, can be claimed in another claim category, e.g. system, as well. The dependencies or references back in the attached claims are chosen for formal reasons only. However, any subject matter resulting from a deliberate reference back to any previous claims (in particular multiple dependencies) can be claimed as well, so that any combination of claims and the features thereof are disclosed and can be claimed regardless of the dependencies chosen in the attached claims. The subject-matter which can be claimed comprises not only the combinations of features as set out in the attached claims but also any other combination of features in the claims, wherein each feature mentioned in the claims can be combined with any other feature or combination of other features in the claims. Furthermore, any of the embodiments and features described or depicted herein can be claimed in a separate claim and/or in any combination with any embodiment or feature described or depicted herein or with any of the features of the attached claims.
[0006] Certain aspects of the present disclosure and their embodiments may provide solutions to these or other challenges. Proposed herein are various embodiments which address one or more of the issues disclosed herein. The methods disclosed in the present disclosure may provide a tracking method for a controller, which adjusts the pose of the controller estimated from IMU data collected by the IMU(s) disposed on the controller based on an IR image and/or a visible-light image captured by a camera of the head-mounted device. The methods disclosed in the present disclosure may improve the accuracy of the pose of the controller even if the user is in an environment with various light conditions or light interference. Furthermore, particular embodiments disclosed in the present application may generate the pose of the controller based on the IMU data and the visible-light images, so that IR-based tracking may be limited to certain light conditions to save power and potentially lower the cost of manufacturing the controller. Therefore, the alternating tracking system disclosed in the present disclosure may perform the tracking task efficiently in various environmental conditions.
[0007] Particular embodiments of the present disclosure may include or be implemented in conjunction with an artificial reality system. Artificial reality is a form of reality that has been adjusted in some manner before presentation to a user, which may include, e.g., a virtual reality (VR), an augmented reality (AR), a mixed reality (MR), a hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content or generated content combined with captured content (e.g., real-world photographs). The artificial reality content may include video, audio, haptic feedback, or some combination thereof, and any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer). Additionally, in some embodiments, artificial reality may be associated with applications, products, accessories, services, or some combination thereof, that are, e.g., used to create content in an artificial reality and/or used in (e.g., perform activities in) an artificial reality. The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, or any other hardware platform capable of providing artificial reality content to one or more viewers.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The patent or application file contains drawings executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0010] The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the disclosure, and together with the description serve to explain the principles of the disclosure.
[0011] FIG. 1 illustrates an example diagram of a tracking system architecture.
[0012] FIG. 2 illustrates an example embodiment of tracking a controller based on an IR image and/or a visible-light image.
[0013] FIG. 3 illustrates an example embodiment of tracking the controller based on the identified pattern of lights and/or the identified features.
[0014] FIG. 4 illustrates an example diagram of adjusting a pose of the controller.
[0015] FIG. 5 illustrates an example diagram of locating the controller in a local or global map based on the adjusted pose of the controller.
[0016] FIGS. 6A-6B illustrate an embodiment of a method for adjusting a pose of the wearable device by capturing an IR image and a visible-light image alternately based on a first light condition in an environment.
[0017] FIG. 7 illustrates an embodiment of a method for adjusting a pose of the wearable device by capturing a visible-light image based on a second light condition in an environment.
[0018] FIG. 8 illustrates an example computer system.
DESCRIPTION OF EXAMPLE EMBODIMENTS
[0019] For the extensive services and functions provided by current AR/VR devices, a controller is commonly paired with the AR/VR device to provide the user with an easy, intuitive way to input instructions. The controller is usually equipped with at least one inertial measurement unit (IMU) and infrared (IR) light emitting diodes (LEDs) for the AR/VR device to estimate a pose of the controller and/or to track a location of the controller, such that the user may perform certain functions via the controller. For example, the user may use the controller to display a visual object in a corner of the room or generate a visual tag in an environment. The estimated pose of the controller will inevitably drift over time and require realignment by IR-based tracking. However, the IR-based tracking may suffer interference from other LED light sources and/or from an environment having bright light. Furthermore, the IR-based tracking may fail when the IR LEDs of the controller are not visible enough to allow for proper tracking. Particular embodiments disclosed in the present disclosure provide a method to alternately take an IR image and a visible-light image for adjusting the pose of the controller based on different light levels, environmental conditions, and/or a location of the controller.
[0020] Particular embodiments disclosed in the present disclosure provide a method to realign the pose of the controller utilizing an IR tracking or a feature tracking, depending on whichever happens first. During an initialization of a controller, particular embodiments of the present application may predetermine certain features, e.g., reliable features to track the controller, by registering these set or painted-on features in a central module, so that the central module can identify these features in a visible-light image to adjust a pose of the controller when the pose of the controller drifts during operation.
[0021] FIG. 1 illustrates an example VIO-based SLAM tracking system architecture, in accordance with certain embodiments. The tracking system 100 comprises a central module 110 and at least one controller module 120. The central module 110 comprises a camera 112 configured to capture a frame of the controller module 120 in an environment, an identifying unit 114 configured to identify patches and features from the frame captured by the camera 112, and at least one processor 116 configured to estimate geometry of the central module 110 and the controller module 120. For example, the geometry comprises 3D points in a local map, a pose/motion of the controller module 120 and/or the central module 110, a calibration of the central module 110, and/or a calibration of the controller module 120. The controller module 120 comprises at least one IMU 122 configured to collect raw IMU data 128 of the controller module 120 upon receiving an instruction 124 from the central module 110, and to send the raw IMU data 128 to the processor 116 to generate a pose of the controller module 120, such that the central module 110 may learn and track a pose of the controller module 120 in the environment. The controller module 120 can also provide raw IMU data 126 to the identifying unit 114 for computing a prediction, such as correspondence data, for a corresponding module. Furthermore, the controller module 120 may comprise trackable markers selectively distributed on the controller module 120 to be tracked by the central module 110. For example, the trackable markers may be a plurality of lights (e.g., light emitting diodes) or other trackable markers that can be tracked by the camera 112.
[0022] In particular embodiments, the identifying unit 114 of the central module 110 receives an instruction 130 to initiate the controller module 120. The identifying unit 114 instructs the camera 112 to capture a first frame of the controller module 120 for the initialization upon the receipt of the instruction 130. The first frame 140 may comprise one or more predetermined features 142 which are set or painted on in the central module 110. For example, the predetermined features 142 may be features identified in previous frames to track the controller module 120, and these identified features which are repeatedly recognized in the previous frames are considered reliable features for tracking the controller module 120. The camera 112 of the central module 110 may then start to capture a second frame 144 after the initialization of the controller module 120. For example, the processor 116 of the central module 110 may start to track the controller module 120 by capturing the second frame 144. In one embodiment, the second frame 144 may be a visible-light image which comprises the predetermined feature 142 of the controller module 120, so that the central module 110 may adjust the pose of the controller module 120 based on the predetermined feature 142 captured in the second frame 144. In another embodiment, the second frame may be an IR image which captures the plurality of lights disposed on the controller module 120, such that the central module 110 may realign the pose of the controller module 120 based on a pattern 146 of lights formed by the plurality of lights on the controller module 120. Also, the IR image can be used to track the controller module 120 based on the pattern 146 of lights, e.g., constellation of LEDs, disposed on the controller module 120, and furthermore, to update the processor 116 of the central module 110. 
In particular embodiments, the central module 110 may be set to take an IR image and a visible-light image alternately for realignment of the controller module 120. In particular embodiments, the central module 110 may determine to take either an IR image or a visible-light image for realignment of the controller module 120 based on a light condition of the environment. Detailed operations and actions performed at the central module 110 may be further described in FIG. 4.
[0023] In certain embodiments, the identifying unit 114 may further capture a third frame following the second frame 144 and identify, in the third frame, one or more patches corresponding to the predetermined feature 142. In this particular embodiment, the second frame 144 and the third frame, and potentially one or more subsequent frames, are visible-light frames, e.g., frames taken with a long exposure time, such that the central module 110 can track the controller module 120 based on the repeatedly-identified features over frames. The identifying unit 114 may then determine correspondence data 132 of a predetermined feature 142 between corresponding patches identified in different frames, e.g., the second frame 144 and the third frame, and send the correspondence data 132 to the processor 116 for further analysis and service, such as adjusting the pose of the controller module 120 and generating state information of the controller module 120. In particular embodiments, the state information may comprise a pose, velocity, acceleration, spatial position, and motion of the controller module 120, and potentially a previous route of the controller module 120, relative to an environment built from the series of frames captured by the camera 112 of the central module 110.
[0024] FIG. 2 illustrates an example tracking system for a controller based on an IR image and/or a visible-light image, in accordance with certain embodiments. The tracking system 200 comprises a central module (not shown) and a controller module 210. The central module comprises a camera and at least one processor to track the controller module 210 in an environment. In particular embodiments, the camera of the central module may capture a first frame 220 to determine or set up predetermined features 222 of the controller module 210 for tracking during the initialization stage. For example, during the initialization/startup phase of the controller module 210, a user would place the controller module 210 within the field of view (FOV) of the camera of the central module to initiate the controller module 210. The camera of the central module may capture the first frame 220 of the controller module 210 in this startup phase to determine one or more predetermined features 222 to track the controller module 210, such as an area where the purlicue of the hand overlaps with the controller module 210 and the ulnar border of the hand, which represents a user’s hand holding the controller module 210. In particular embodiments, the predetermined features 222 can also be painted on (e.g., via small QR codes). In particular embodiments, the predetermined feature 222 may be a corner of a table or any other trackable feature identified in a visible-light frame. In particular embodiments, the predetermined feature 222 may be IR pattern “blobs” in an IR image, e.g., the constellations of LEDs captured in the IR image.
[0025] In particular embodiments, the controller module 210 comprises at least one IMU and a plurality of IR LEDs, such that the controller module 210 can be realigned during operation based on either a second frame 230 capturing a pattern 240 of the IR LEDs or a second frame 230 capturing the predetermined features 222. For example, the central module may generate a pose of the controller module 210 based on raw IMU data sent from the controller module 210. The generated pose of the controller module 210 may drift over time and require realignment. The central module may determine to capture a second frame 230 of the controller module 210 for adjusting the generated pose based on a light condition in the environment. In one embodiment, the second frame 230 may be an IR image comprising a pattern 240 of the IR LEDs. When the IR pattern is known a priori, the second frame, which is an IR image, can be used to realign or track the controller module 210 without multiple frames. In another embodiment, the second frame 230 may be a visible-light image which is identified to comprise at least one predetermined feature 222. The visible-light image may be an RGB image, a CMYK image, or a greyscale image.
[0026] In particular embodiments, the central module may capture an IR image and a visible-light image alternately by a default setting, such that the central module may readjust the generated pose of the controller module 210 based on whichever image is captured first for readjustment. In particular embodiments, the central module may capture the IR image when the environment comprises a first light condition. The first light condition may comprise one or more of an indoor environment, an environment not having bright light in the background, and an environment not having a light source that interferes with the pattern 240 of IR LEDs of the controller module 210. For example, the environment may not comprise other LEDs that interfere with the pattern 240 formed by the IR LEDs of the controller module 210, which the central module uses to determine a location of the controller module 210.
[0027] In particular embodiments, the central module may capture the visible-light image when the environment comprises a second light condition. The second light condition may comprise one or more of an environment having bright light, an environment having a light source that interferes with the pattern 240 of IR LEDs of the controller module 210, and the camera of the central module not being able to capture the pattern of lights. For example, when a user is holding a controller implemented with the controller module 210 too close to a head-mounted device implemented with the central module, the camera of the central module cannot capture a complete pattern 240 formed by the IR LEDs of the controller module 210 to determine a location of the controller module 210 in the environment. Detailed operations and actions performed at the central module may be further described in FIGS. 3 to 7.
[0028] FIG. 3 illustrates an example controller 300 implemented with a controller module, in accordance with certain embodiments. The controller 300 comprises a surrounding ring portion 310 and a handle portion 320. The controller 300 is implemented with the controller module described in the present disclosure and includes a plurality of tracking features positioned in a corresponding tracking pattern. In particular embodiments, the tracking features can include, for example, fiducial markers or light emitting diodes (LEDs). In particular embodiments described herein, the tracking features are LED lights, although other lights, reflectors, signal generators, or other passive or active markers can be used in other embodiments. For example, the controller 300 may comprise a contrast feature on the ring portion 310 or the handle portion 320, e.g., a strip with contrasting color around the surface of the ring portion 310, and/or a plurality of IR LEDs 330 embedded in the ring portion 310. The tracking features in the tracking patterns are configured to be accurately tracked by a tracking camera of a central module to determine a motion, orientation, and/or spatial position of the controller 300 for reproduction in a virtual/augmented environment. In particular embodiments, the controller 300 includes a constellation or pattern of lights 332 disposed on the ring portion 310.
[0029] In particular embodiments, the controller 300 comprises at least one predetermined feature 334 for the central module to readjust a pose of the controller 300. The pose of the controller 300 may be adjusted by a spatial movement (X-Y-Z positioning movement) determined based on the predetermined features 334 between frames. For example, the central module may determine an updated spatial position of the controller 300 in frame k+1, e.g., a frame captured during operation, and compare it with a previous spatial position of the controller 300 in frame k, e.g., a frame captured during the initialization of the controller 300, to readjust the pose of the controller 300.
[0030] FIG. 4 illustrates an example diagram of a tracking system 400 comprising a central module 410 and a controller module 430, in accordance with certain embodiments. The central module 410 comprises a camera 412, an identifying unit 414, a tracking unit 416, and a filter unit 418 to perform tracking/adjustment for the controller module 430 in an environment. The controller module 430 comprises a plurality of LEDs 432 and at least one IMU 434. In particular embodiments, the identifying unit 414 of the central module 410 may send instructions 426 to initiate the controller module 430. In particular embodiments, the initialization of the controller module 430 may comprise capturing a first frame of the controller module 430 and predetermining one or more features in the first frame for tracking/identifying the controller module 430. The instructions 426 may instruct the controller module 430 to provide raw IMU data 436 for the central module 410 to track the controller module 430. Upon receipt of the instructions 426, the controller module 430 sends the raw IMU data 436 collected by the IMU 434 to the filter unit 418 of the central module 410, in order to generate/estimate a pose of the controller module 430 during operation. Furthermore, the controller module 430 sends the raw IMU data 436 to the identifying unit 414 for computing predictions of a corresponding module, e.g., correspondence data of the controller module 430. In particular embodiments, the central module 410 measures the pose of the controller module 430 at a frequency from 500 Hz to 1 kHz.
[0031] After initialization of the controller module 430, the camera 412 of the central module 410 may capture a second frame when the controller module 430 is within the FOV range of the camera, for a realignment of the generated pose of the controller module 430. In particular embodiments, the camera 412 may, by default, capture the second frame of the controller module 430 for realignment as an IR image or a visible-light image alternately. For example, the camera 412 may capture an IR image and a visible-light image alternately at a slower frequency than the frequency of generating the pose of the controller module 430, e.g., 30 Hz, and utilize whichever image is captured first or is usable for realignment, such as an image capturing a trackable pattern of the LEDs 432 of the controller module 430 or an image capturing predetermined features for tracking the controller module 430.
[0032] In particular embodiments, the identifying unit 414 may determine a light condition in the environment to instruct the camera 412 to take a specific type of frame. For example, the camera 412 may provide the identifying unit 414 with a frame 420 based on a determination of the light condition 422. In one embodiment, the camera 412 may capture an IR image comprising a pattern of LEDs 432 disposed on the controller module 430 when the environment does not have bright light in the background. In another embodiment, the camera 412 may capture a visible-light image of the controller module 430 when the environment has a similar light source that interferes with the pattern of LEDs 432 of the controller module 430. In particular embodiments, the camera 412 captures an IR image using a first exposure time and captures a visible-light image using a second exposure time. The second exposure time may be longer than the first exposure time, considering the movement of the user and/or the light condition of the environment.
[0033] In particular embodiments where no LEDs 432 of the controller module 430 are used, the central module 410 may track the controller module 430 based on visible-light images. A neural network may be used to find the controller module 430 in the visible-light images. The identifying unit 414 of the central module 410 may then identify features that are consistently observed over several frames, e.g., the predetermined features and/or reliable features for tracking the controller module 430, in the frames captured by the camera 412. The central module 410 may utilize these features to compute/adjust the pose of the controller module 430. In particular embodiments, the features may comprise patches of images corresponding to the controller module 430, such as the edges of the controller module 430.
[0034] In particular embodiments, the identifying unit 414 may further send the identified frames 424 to the filter unit 418 for adjusting the generated pose of the controller module 430. When the filter unit 418 receives an identified frame 424, which can be either an IR image capturing the pattern of lights or a visible-light image comprising patches for tracking the controller module 430, the filter unit 418 may determine a location of the controller module 430 in the environment based on the pattern of lights of the controller module 430 or the predetermined feature identified in the patches from the visible-light image. In particular embodiments, a patch may be a small image signature of a feature (e.g., a corner or edge of the controller) that is distinct and easily identifiable in an image/frame, regardless of the angle at which the image was taken by the camera 412.
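One standard way to locate such a patch in a frame, consistent with but not mandated by the description above, is normalized cross-correlation: the patch acts as an image signature that is slid over the frame and scored at each position. The function names and the toy 8x8 frame below are assumptions made for the sketch.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-shaped windows (1.0 = identical
    up to brightness/contrast; 0.0 returned for flat, featureless windows)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def find_patch(frame, patch, threshold=0.9):
    """Return (row, col) of the best match for `patch` in `frame` above
    `threshold`, or None if no window scores that high."""
    ph, pw = patch.shape
    best, best_pos = threshold, None
    for r in range(frame.shape[0] - ph + 1):
        for c in range(frame.shape[1] - pw + 1):
            score = ncc(frame[r:r + ph, c:c + pw], patch)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# Toy example: plant a 2x2 patch signature in an otherwise empty frame.
patch = np.array([[1.0, 0.0], [0.0, 1.0]])
frame = np.zeros((8, 8))
frame[3:5, 2:4] = patch
location = find_patch(frame, patch)
```

Because the score is normalized, the same patch remains identifiable under uniform brightness changes, which is one reason patch signatures can be matched across viewing conditions; a production tracker would typically use an optimized routine rather than this brute-force scan.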
[0035] Furthermore, the filter unit 418 may also utilize these identified frames 424 to conduct extensive services and functions, such as generating a state of a user/device, locating the user/device locally or globally, and/or rendering a virtual tag/object in the environment. In particular embodiments, the filter unit 418 of the central module 410 may also use the raw IMU data 436 to assist in generating the state of a user. In particular embodiments, the filter unit 418 may use the state information of the user relative to the controller module 430 in the environment, based on the identified frames 424, to project a virtual object in the environment or set a virtual tag in a map via the controller module 430.
[0036] In particular embodiments, the identifying unit 414 may also send the identified frames 424 to the tracking unit 416 for tracking the controller module 430. The tracking unit 416 may determine correspondence data 428 based on the predetermined features in different identified frames 424, and track the controller module 430 based on the determined correspondence data 428.
[0037] In particular embodiments, the central module 410 captures at least the following frames to track/realign the controller module 430: (1) an IR image; (2) a visible-light image; (3) an IR image; and (4) a visible-light image. In a particular embodiment, the identifying unit 414 of the central module 410 may identify IR patterns in captured IR images. When the IR patterns in the IR images are matched against an a priori pattern, such as the constellation of LED positions on the controller module 430 identified in the first frame, a single IR image can be sufficient for the filter unit 418 to use for state estimation and/or other computations. In another embodiment using feature-based tracking, the identifying unit 414 of the central module 410 may identify a feature to track in a first visible-light image, and may then try to identify, in a second visible-light frame, the same feature corresponding to the feature identified in the first visible-light image. When the identifying unit 414 repeatedly observes the same feature over at least two visible-light frames, these observations, e.g., identified features, in these frames can be used by the filter unit 418 for state estimation and/or other computations. Furthermore, in particular embodiments, the central module 410 can also use a single visible-light frame to update the state estimation based on a three-dimensional model of the controller module 430, such as a computer-aided design (CAD) model of the controller module 430.
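The alternating update rule above can be sketched as a small state machine: a single IR frame whose blobs match the known constellation triggers an immediate state update, while a visible-light feature must be re-observed across two frames unless a CAD model enables a single-frame update. Everything below (the dict-based frame format, the toy constellation matcher, the counter standing in for the filter state) is an assumption made for illustration, not the patent's implementation.

```python
def match_constellation(blobs, constellation, tol=0.01):
    """Toy a-priori matcher: same blob count and near-identical 2D positions."""
    if len(blobs) != len(constellation):
        return False
    return all(abs(b[0] - c[0]) < tol and abs(b[1] - c[1]) < tol
               for b, c in zip(sorted(blobs), sorted(constellation)))

def process_frame(frame, state, constellation, pending, has_cad_model=False):
    """Return (updated_state, pending_features); `state` counts realignments."""
    if frame["type"] == "ir":
        if match_constellation(frame["blobs"], constellation):
            return state + 1, pending            # one matched IR frame suffices
        return state, pending
    feats = set(frame["features"])
    if feats & pending or has_cad_model:         # feature re-observed, or CAD model
        return state + 1, feats
    return state, feats                          # hold features for the next frame

# Sequence mirroring the text: visible (no update yet), IR (updates),
# visible re-observing "edge_a" (updates).
leds = [(0.1, 0.2), (0.3, 0.2), (0.5, 0.4)]
state, pending = 0, set()
state, pending = process_frame({"type": "visible", "features": ["edge_a"]}, state, leds, pending)
state, pending = process_frame({"type": "ir", "blobs": leds}, state, leds, pending)
state, pending = process_frame({"type": "visible", "features": ["edge_a", "edge_b"]}, state, leds, pending)
```

The first visible-light frame only banks its features; the IR frame and the second visible-light frame each produce a realignment, matching the two-frame requirement for feature-based tracking.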
[0038] In particular embodiments, the tracking system 400 may be implemented in any suitable computing device, such as, for example, a personal computer, a laptop computer, a cellular telephone, a smartphone, a tablet computer, an augmented/virtual reality device, a head-mounted device, a portable smart device, a wearable smart device, or any suitable device which is compatible with the tracking system 400. In the present disclosure, a user that is being tracked and localized by the tracking device may refer to a device mounted on a movable object, such as a vehicle, or a device attached to a person. In the present disclosure, a user may be an individual (human user), an entity (e.g., an enterprise, business, or third-party application), or a group (e.g., of individuals or entities) that interacts or communicates with the tracking system 400. In particular embodiments, the central module 410 may be implemented in a head-mounted device, and the controller module 430 may be implemented in a remote controller separated from the head-mounted device. The head-mounted device comprises one or more processors configured to implement the camera 412, the identifying unit 414, the tracking unit 416, and the filter unit 418 of the central module 410. In one embodiment, each of the processors is configured to implement the camera 412, the identifying unit 414, the tracking unit 416, and the filter unit 418 separately. The remote controller comprises one or more processors configured to implement the LEDs 432 and the IMU 434 of the controller module 430. In one embodiment, each of the processors is configured to implement the LEDs 432 and the IMU 434 separately.
[0039] This disclosure contemplates any suitable network to connect each element in the tracking system 400 or to connect the tracking system 400 with other systems. As an example and not by way of limitation, one or more portions of network may include an ad hoc network, an intranet, an extranet, a virtual private network (VPN), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), a wireless WAN (WWAN), a metropolitan area network (MAN), a portion of the Internet, a portion of the Public Switched Telephone Network (PSTN), a cellular telephone network, or a combination of two or more of these. Network may include one or more networks.
[0040] FIG. 5 illustrates an example diagram of a tracking system 500 with a mapping service, in accordance with certain embodiments. The tracking system 500 comprises a controller module 510, a central module 520, and a cloud 530. The controller module 510 comprises an IMU unit 512, a light unit 514, and a processor 516. The controller module 510 receives one or more instructions 542 from the central module 520 to perform specific functions. For example, the instructions 542 comprise, but are not limited to, an instruction to initiate the controller module 510, an instruction to switch off the light unit 514, and an instruction to tag a virtual object in the environment. The controller module 510 is configured to send raw IMU data 540 to the central module 520 for a pose estimation during operation, so that the processor 516 of the controller module 510 may perform the instructions 542 accurately in a map or in the environment.
[0041] The central module 520 comprises a camera 522, an identifying unit 524, a tracking unit 526, and a filter unit 528. The central module 520 may be configured to track the controller module 510 based on various methods, e.g., an estimated pose of the controller module 510 determined from the raw IMU data 540. Furthermore, the central module 520 may be configured to adjust the estimated pose of the controller module 510 during operation based on a frame of the controller module 510 captured by the camera 522. In particular embodiments, the identifying unit 524 of the central module 520 may determine a program to capture a frame of the controller module 510 based on a light condition of the environment. The program comprises, but is not limited to, capturing an IR image and a visible-light image alternately, or capturing a visible-light image only. The IR image is captured using a first exposure time, and the visible-light image is captured using a second exposure time. In particular embodiments, the second exposure time may be longer than the first exposure time. The identifying unit 524 may then instruct the camera 522 to take a frame/image of the controller module 510 based on the determination, and the camera 522 would provide the identifying unit 524 with a specific frame according to the determination. In particular embodiments, the identifying unit 524 may also instruct the controller module 510 to switch off the light unit 514 under a certain light condition, e.g., another LED source nearby, to save power.
[0042] The identifying unit 524 identifies the frame upon receipt from the camera 522. In particular, the identifying unit 524 may receive whichever frame is captured first when the controller module 510 requires a readjustment of its pose. For example, the camera 522 captures an IR image and a visible-light image alternately at a slow rate, e.g., a frequency of 30 Hz, and then sends a frame to the identifying unit 524 when the controller module 510 is within the FOV of the camera 522. Therefore, the frame being captured could be either the IR image or the visible-light image. In particular embodiments, the identifying unit 524 may identify a pattern formed by the light unit 514 of the controller module 510 in the captured frame. The pattern formed by the light unit 514 may indicate a position of the controller module 510 relative to the user/the central module 520 and/or the environment. For example, in response to a movement/rotation of the controller module 510, the pattern of the light unit 514 changes. In particular embodiments, the identifying unit 524 may identify predetermined features for tracking the controller module 510 in the captured frame. For example, the predetermined features of the controller module 510 may comprise a user's hand gesture when holding the controller module 510, so that the predetermined features may indicate a position of the controller module 510 relative to the user/the central module 520. The identifying unit 524 may then send the identified frames to the filter unit 528 for an adjustment of the pose of the controller module 510. In particular embodiments, the identifying unit 524 may also send the identified frames to the tracking unit 526 for tracking the controller module 510.
[0043] The filter unit 528 generates a pose of the controller module 510 based on the received raw IMU data 540. In particular embodiments, the filter unit 528 generates the pose of the controller module 510 at a faster rate than the rate of capturing a frame of the controller module 510. For example, the filter unit 528 may estimate and update the pose of the controller module 510 at a rate of 500 Hz. The filter unit 528 then realigns/readjusts the pose of the controller module 510 based on the identified frames. In particular embodiments, the filter unit 528 may adjust the pose of the controller module 510 based on the pattern of the light unit 514 of the controller module 510 in the identified frame. In particular embodiments, the filter unit 528 may adjust the pose of the controller module 510 based on the predetermined features identified in the frame.
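The two-rate structure described above (fast IMU propagation, slower camera-based realignment) can be sketched with a deliberately simplified 1D filter. A real system would use something like an extended Kalman filter over full 6-DoF poses; the class name, the fixed correction gain, and the 500 Hz/one-correction cadence below are all assumptions made for the sketch.

```python
class PoseFilter:
    """1D stand-in for the filter unit: high-rate prediction from IMU
    acceleration, low-rate correction from an identified camera frame."""
    CORRECTION_GAIN = 0.5  # hypothetical blend weight for camera observations

    def __init__(self, position=0.0, velocity=0.0):
        self.position = position
        self.velocity = velocity

    def propagate(self, accel, dt=1.0 / 500):
        # High-rate step (e.g., 500 Hz) from raw IMU data.
        self.velocity += accel * dt
        self.position += self.velocity * dt

    def realign(self, observed_position):
        # Low-rate step from a pattern-of-lights or predetermined-feature frame:
        # pull the propagated estimate toward the camera observation.
        self.position += self.CORRECTION_GAIN * (observed_position - self.position)

f = PoseFilter()
for _ in range(500):                # one second of IMU-only propagation
    f.propagate(accel=0.0)
f.realign(observed_position=0.1)    # camera frame reports accumulated drift
```

The point of the structure is that pose estimates stay available at IMU rate between camera frames, with each identified frame bounding the accumulated drift.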
[0044] In particular embodiments, the tracking unit 526 may determine correspondence data based on the predetermined features identified in different frames. The correspondence data may comprise observations and measurements of the predetermined feature, such as a location of the predetermined feature of the controller module 510 in the environment. Furthermore, the tracking unit 526 may also perform a stereo computation on image data collected near the predetermined feature to provide additional information for the central module 520 to track the controller module 510. In addition, the tracking unit 526 of the central module 520 may request a live map from the cloud 530 corresponding to the correspondence data. In particular embodiments, the live map may comprise map data 544. The tracking unit 526 of the central module 520 may also request a remote relocalization service 544 for the controller module 510 to be located in the live map locally or globally.
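A minimal data shape for such correspondence data is shown below: for each predetermined feature observed in two identified frames, its two observations are paired so the tracker can follow the controller between frames. The field names and coordinates are invented for illustration; the patent does not define a concrete format.

```python
def build_correspondences(frame_a, frame_b):
    """Pair observations of features present in both frames.
    frame_a/frame_b map a feature id to its observed (x, y) location."""
    return {fid: (frame_a[fid], frame_b[fid])
            for fid in frame_a.keys() & frame_b.keys()}

# "ring_edge" is seen in both frames, so it yields a correspondence;
# "handle_tip" appears only in the first frame and is dropped.
pairs = build_correspondences(
    {"ring_edge": (0.40, 0.31), "handle_tip": (0.55, 0.70)},
    {"ring_edge": (0.42, 0.30)})
```

Each such pair is one unit of the correspondence data the tracking unit can feed into stereo computation or send alongside a live-map request.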
[0045] Furthermore, the filter unit 528 may estimate a state of the controller module 510 based on the correspondence data and the raw IMU data 540. In particular embodiments, the state of the controller module 510 may comprise a pose of the controller module 510 relative to an environment which is built based on the frames captured by the camera 522, e.g., a map built locally. In addition, the filter unit 528 may also send the state information of the controller module 510 to the cloud 530 for a global localization or an update of the map stored in the cloud 530 (e.g., with the environment built locally).