Sony Patent | Information Processing Apparatus, Information Processing Method, And Recording Medium

Patent: Information Processing Apparatus, Information Processing Method, And Recording Medium

Publication Number: 20200341284

Publication Date: 20201029

Applicants: Sony

Abstract

[Problem] To make it possible to guide the line of sight of a user in a more preferable manner. [Solution] An information processing apparatus includes: an acquisition unit that acquires first information regarding guidance of a line of sight of a user; and a control unit that controls second information to be presented to the user so as to guide the line of sight of the user, wherein the control unit controls, based on the first information, the second information so as to be localized in a route having a lower following load on the user with regard to the second information out of a first route and a second route connecting a start position and an end position regarding the guidance of the line of sight in accordance with a second coordinate that is independent of a first coordinate associated with an output unit.

FIELD

[0001] The present disclosure relates to an information processing apparatus, an information processing method, and a recording medium.

BACKGROUND

[0002] The technology called virtual reality (VR: Virtual Reality) that causes a user to perceive a virtual space as if it is the reality and the technology called augmented reality (AR: Augmented Reality) that superimposes additional information on the real world and presents it to the user have received attention in recent years. In this background, various considerations have been made on the user interface that assumes the use of the VR technology or the AR technology. For example, Patent Literature 1 discloses an example of the technique for executing the interaction between users using the AR technology.

CITATION LIST

Patent Literature

[0003] Patent Literature 1: WO 2014/162825 A1

SUMMARY

Technical Problem

[0004] With the use of the VR technology or the AR technology, the target information to be presented to the user may be localized not only within the range of the user's field of view but also in the expanded space that is wider than the field of view (for example, presented at a desired position (coordinates) in the real space or in the virtual space). That is, in such a case, for example, the user may look around, changing the position and the direction of the viewpoint, so as to see the information localized in the space around him/her (for example, the information positioned outside his/her field of view).

[0005] Furthermore, when the information that needs to be noticed by the user is presented outside the field of view of the user (in other words, outside the range displayed on a display device such as a display), the user may be unaware of the presented information. In this background, there has been consideration on, for example, the introduction of the mechanism that, in a situation where the information to be presented is localized outside the field of view of the user, guides the line of sight of the user to the desired position (e.g., the position where the information is localized).

[0006] Thus, the present disclosure suggests the technique with which it is possible to guide the line of sight of the user in a more preferable manner.

Solution to Problem

[0007] According to the present disclosure, an information processing apparatus is provided that includes: an acquisition unit that acquires first information regarding guidance of a line of sight of a user; and a control unit that controls second information to be presented to the user so as to guide the line of sight of the user, wherein the control unit controls, based on the first information, the second information so as to be localized in a route having a lower following load on the user with regard to the second information out of a first route and a second route connecting a start position and an end position regarding the guidance of the line of sight in accordance with a second coordinate that is independent of a first coordinate associated with an output unit.

[0008] Moreover, according to the present disclosure, an information processing method is provided that causes a computer to execute: acquiring first information regarding guidance of a line of sight of a user; controlling second information to be presented to the user so as to guide the line of sight of the user; and controlling, based on the first information, the second information so as to be localized in a route having a lower following load on the user with regard to the second information out of a first route and a second route connecting a start position and an end position regarding the guidance of the line of sight in accordance with a second coordinate that is independent of a first coordinate associated with an output unit.

[0009] Moreover, according to the present disclosure, a recording medium storing a program is provided that causes a computer to execute: acquiring first information regarding guidance of a line of sight of a user; controlling second information to be presented to the user so as to guide the line of sight of the user; and controlling, based on the first information, the second information so as to be localized in a route having a lower following load on the user with regard to the second information out of a first route and a second route connecting a start position and an end position regarding the guidance of the line of sight in accordance with a second coordinate that is independent of a first coordinate associated with an output unit.

Advantageous Effects of Invention

[0010] As described above, the present disclosure provides the technique with which it is possible to guide the line of sight of the user in a more preferable manner.

[0011] Furthermore, the above-described advantage is not necessarily restrictive, and any advantage mentioned in this description or other advantages that may be derived from this description may be achieved together with the above-described advantage or instead of the above-described advantage.

BRIEF DESCRIPTION OF DRAWINGS

[0012] FIG. 1 is an explanatory diagram illustrating an example of the schematic configuration of an information processing system according to an embodiment of the present disclosure.

[0013] FIG. 2 is an explanatory diagram illustrating an example of the schematic configuration of an input/output device according to the embodiment.

[0014] FIG. 3 is an explanatory diagram illustrating the overview of the information processing system according to a first embodiment of the present disclosure.

[0015] FIG. 4 is an explanatory diagram illustrating the overview of the information processing system according to the embodiment.

[0016] FIG. 5 is a block diagram illustrating an example of the functional configuration of the information processing system according to the embodiment.

[0017] FIG. 6 is a flowchart illustrating an example of the flow of the series of processes in the information processing system according to the embodiment.

[0018] FIG. 7 is an explanatory diagram illustrating the overview of the information processing system according to a modification 1-1 of the embodiment.

[0019] FIG. 8 is an explanatory diagram illustrating the overview of the information processing system according to the modification 1-1 of the embodiment.

[0020] FIG. 9 is an explanatory diagram illustrating the overview of the information processing system according to the modification 1-1 of the embodiment.

[0021] FIG. 10 is an explanatory diagram illustrating the overview of the information processing system according to the modification 1-1 of the embodiment.

[0022] FIG. 11 is an explanatory diagram illustrating the overview of the information processing system according to a modification 1-2 of the embodiment.

[0023] FIG. 12 is an explanatory diagram illustrating the overview of the information processing system according to a modification 1-3 of the embodiment.

[0024] FIG. 13 is an explanatory diagram illustrating the overview of the information processing system according to the modification 1-3 of the embodiment.

[0025] FIG. 14 is an explanatory diagram illustrating the overview of the information processing system according to the modification 1-3 of the embodiment.

[0026] FIG. 15 is an explanatory diagram illustrating the overview of the information processing system according to a second embodiment of the present disclosure.

[0027] FIG. 16 is an explanatory diagram illustrating the overview of the information processing system according to the embodiment.

[0028] FIG. 17 is an explanatory diagram illustrating the overview of the information processing system according to the embodiment.

[0029] FIG. 18 is an explanatory diagram illustrating the overview of the information processing system according to the embodiment.

[0030] FIG. 19 is a flowchart illustrating an example of the flow of the series of processes in the information processing system according to the embodiment.

[0031] FIG. 20 is an explanatory diagram illustrating an example of the control regarding the guidance of the line of sight of a user by the information processing system according to the embodiment.

[0032] FIG. 21 is an explanatory diagram illustrating an example of the control regarding the guidance of the line of sight of a user by the information processing system according to the embodiment.

[0033] FIG. 22 is a functional block diagram illustrating an example of the hardware configuration of an information processing apparatus included in an information processing system according to an embodiment of the present disclosure.

DESCRIPTION OF EMBODIMENTS

[0034] Preferred embodiments of the present disclosure are described below in detail with reference to the accompanying drawings. Furthermore, in the description and the drawings, the components having substantially the same function and configuration are denoted with the same reference numeral, and duplicated descriptions are omitted.

[0035] Furthermore, the descriptions are given in the following order.

[0036] 1. Overview

[0037] 1. 1. Schematic configuration

[0038] 1. 2. Configuration of an input/output device

[0039] 1. 3. Principle of self-location estimation

[0040] 2. Consideration on guidance of the line of sight

[0041] 3. First Embodiment

[0042] 3. 1. Overview

[0043] 3. 2. Functional configuration

[0044] 3. 3. Process

[0045] 3. 4. Modification

[0046] 3. 5. Evaluation

[0047] 4. Second Embodiment

[0048] 4. 1. Overview

[0049] 4. 2. Process

[0050] 4. 3. Modification

[0051] 4. 4. Evaluation

[0052] 5. Hardware configuration

[0053] 6. Conclusion

1. OVERVIEW

[0054] <1. 1. Schematic Configuration>

[0055] First, an example of the schematic configuration of an information processing system according to an embodiment of the present disclosure is described with reference to FIG. 1. FIG. 1 is an explanatory diagram illustrating an example of the schematic configuration of the information processing system according to an embodiment of the present disclosure. In FIG. 1, a reference numeral M11 schematically represents an object (i.e., a real object) located in the real space. Furthermore, reference numerals V11 and V13 schematically represent virtual contents (i.e., virtual objects) that are presented so as to be superimposed on the real space. In the example illustrated in FIG. 1, based on what is called the AR technology, for example, an information processing system 1 superimposes a virtual object on an object, such as the real object M11, in the real space and presents it to the user. Further, in FIG. 1, for easy understanding of the characteristics of the information processing system according to the present embodiment, both the real object and the virtual object are presented together.

[0056] As illustrated in FIG. 1, the information processing system 1 according to the present embodiment includes an information processing apparatus 10 and an input/output device 20. The information processing apparatus 10 and the input/output device 20 are configured so as to transmit/receive information to/from each other via a predetermined network. Furthermore, the type of network connecting the information processing apparatus 10 and the input/output device 20 is not particularly limited. In a specific example, the network may be configured by using what is called a wireless network, such as a network based on a standard such as Bluetooth (registered trademark) or Wi-Fi (registered trademark). Further, in another example, the network may be configured by using the Internet, a dedicated line, a LAN (Local Area Network), a WAN (Wide Area Network), or the like. Moreover, the network may include a plurality of networks, and a part of them may be configured as a wired network.

[0057] The input/output device 20 is configured to acquire various types of input information and to present various types of output information to a user who holds the input/output device 20. Further, the information processing apparatus 10 controls the presentation of the output information by the input/output device 20 based on the input information acquired by the input/output device 20. For example, the input/output device 20 acquires, as the input information, the information (e.g., the image of the captured real space) for recognizing the real object M11 and outputs the acquired information to the information processing apparatus 10. The information processing apparatus 10 recognizes the position of the real object M11 in the real space based on the information acquired from the input/output device 20 and causes the input/output device 20 to present the virtual objects V11 and V13 in accordance with the recognition result. Under such a control, the input/output device 20 may present the virtual objects V11 and V13 to the user such that the virtual objects V11 and V13 are superimposed with respect to the real object M11 based on what is called the AR technology.

[0058] Furthermore, in the example illustrated in FIG. 1, the input/output device 20 is configured as, for example, what is called a head mounted device that is attached to at least part of the user’s head while in use and is configured so as to detect the line of sight of the user. With this configuration, for example, when the information processing apparatus 10 recognizes that the user is watching the desired target (e.g., the real object M11 or the virtual objects V11 and V13) in accordance with the detection result of the line of sight of the user by the input/output device 20, it may determine that the target is an operation target. Furthermore, the information processing apparatus 10 may determine that the object to which the line of sight of the user is directed is an operation target by using a predetermined operation on the input/output device 20 as a trigger. As described above, the information processing apparatus 10 may determine the operation target and execute the process associated with the operation target so as to provide the user with various services via the input/output device 20.

[0059] Furthermore, the information processing apparatus 10 may be configured as, for example, a wireless communication terminal such as a smartphone. Further, the information processing apparatus 10 may be configured as a device such as a server. Moreover, although the input/output device 20 and the information processing apparatus 10 are illustrated as different devices from each other in FIG. 1, the input/output device 20 and the information processing apparatus 10 may be integrally configured. Moreover, the configuration and the process of the input/output device 20 and the information processing apparatus 10 are described later in detail.

[0060] An example of the schematic configuration of the information processing system according to an embodiment of the present disclosure has been described above with reference to FIG. 1.

[0061] <1. 2. Configuration of the Input/Output Device>

[0062] Next, an example of the schematic configuration of the input/output device 20 according to the present embodiment illustrated in FIG. 1 is described with reference to FIG. 2. FIG. 2 is an explanatory diagram illustrating an example of the schematic configuration of the input/output device according to the present embodiment.

[0063] As described above, the input/output device 20 according to the present embodiment is configured as what is called a head mounted device that is attached to at least part of the user’s head while in use. For example, in the example illustrated in FIG. 2, the input/output device 20 is configured as what is called an eyewear type (glasses-type) device, and at least any one of lenses 293a and 293b is configured as a transmissive display (a display unit 211). Furthermore, the input/output device 20 includes first imaging units 201a and 201b, second imaging units 203a and 203b, an operating unit 207, and a holding unit 291 corresponding to the frame of the glasses. When the input/output device 20 is attached to the user’s head, the holding unit 291 holds the display unit 211, the first imaging units 201a and 201b, the second imaging units 203a and 203b, and the operating unit 207 so as to have a predetermined positional relationship with respect to the user’s head. Although not illustrated in FIG. 2, the input/output device 20 may include a voice collecting unit that collects the user’s voice.

[0064] Here, a more detailed configuration of the input/output device 20 is described. For example, in the example illustrated in FIG. 2, the lens 293a corresponds to the lens on the side of the right eye, and the lens 293b corresponds to the lens on the side of the left eye. Specifically, when the input/output device 20 is mounted, the holding unit 291 holds the display unit 211 such that the display unit 211 (in other words, the lenses 293a and 293b) is positioned in front of the user's eyes.

[0065] The first imaging units 201a and 201b are configured as what is called a stereo camera and, when the input/output device 20 is attached to the user’s head, are held by the holding unit 291 so as to face in the direction (i.e., the front side of the user) in which the user’s head faces. Here, the first imaging unit 201a is held near the user’s right eye, and the first imaging unit 201b is held near the user’s left eye. With this configuration, the first imaging units 201a and 201b capture the subject located in front of the input/output device (i.e., the real object located in the real space) from different positions. Thus, the input/output device 20 may acquire the image of the subject located in front of the user and may also calculate the distance from the input/output device 20 to the subject on the basis of the disparity between the images captured by the first imaging units 201a and 201b, respectively.

[0066] There is no particular limitation on the configuration and the method as long as the distance between the input/output device 20 and the subject may be measured. In a specific example, the distance between the input/output device 20 and the subject may be measured based on a method such as multi-camera stereo, movement disparity, TOF (Time Of Flight), or Structured Light. Here, the TOF is the method in which light such as infrared rays is projected onto the subject and the time that elapses before the projected light is reflected by the subject and is returned is measured for each pixel so that the image (what is called a distance image) including the distance (depth) to the subject is obtained based on the measurement result. Furthermore, Structured Light is the method in which the subject is irradiated with a pattern of light such as infrared rays, and the pattern is captured, so that a distance image including the distance (depth) to the subject is obtained based on a change in the pattern obtained from the imaging result. Further, the movement disparity is the method for measuring the distance to the subject based on the disparity even with what is called a monocular camera. Specifically, by moving the camera, the subject is captured from different viewpoints, and the distance to the subject is measured based on the disparity between the captured images. Moreover, by recognizing the moving distance and the moving direction of the camera with various sensors, the distance to the subject may be measured more accurately. Here, the configuration of the imaging unit (e.g., a monocular camera or a stereo camera) may be changed depending on the method for measuring the distance.
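
As a rough illustration of the TOF relation described above, the following sketch (not part of the patent; function and parameter names are illustrative) converts a measured round-trip time of the projected light into a depth value. A real TOF sensor evaluates this relation for each pixel to build the distance image.

```python
# Minimal sketch of the TOF relation: the projected light travels to the
# subject and back, so the depth is half the distance covered in the
# measured round-trip time. Names and values are illustrative.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def tof_depth_m(round_trip_time_s: float) -> float:
    """Depth (m) for one pixel from the round-trip time of the projected light."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0


print(tof_depth_m(13.3e-9))  # ~2.0 m for a round trip of roughly 13 ns
```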

[0067] Furthermore, the second imaging units 203a and 203b are held by the holding unit 291 such that the eyeballs of the user are positioned within the respective imaging ranges when the input/output device 20 is attached to the user’s head. In a specific example, the second imaging unit 203a is held such that the user’s right eye is positioned within the imaging range. With this configuration, it is possible to recognize the direction in which the line of sight of the right eye is directed based on the image of the right eyeball captured by the second imaging unit 203a and the positional relationship between the second imaging unit 203a and the right eye. Similarly, the second imaging unit 203b is held such that the user’s left eye is positioned within the imaging range. That is, it is possible to recognize the direction in which the line of sight of the left eye is directed based on the image of the left eyeball captured by the second imaging unit 203b and the positional relationship between the second imaging unit 203b and the left eye. Furthermore, in the example illustrated in FIG. 2, the input/output device 20 includes both the second imaging units 203a and 203b; however, only either one of the second imaging units 203a and 203b may be included.
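
The following is a hedged sketch of one simple way such an eyeball image could be turned into a gaze direction: the detected pupil offset from a calibrated center is mapped linearly to horizontal and vertical gaze angles. The linear calibration model and every name here are assumptions made for illustration, not the patent's detection method.

```python
# Hypothetical pupil-offset model for gaze detection (illustrative only).
import math


def gaze_angles_from_pupil(pupil_px: tuple[float, float],
                           center_px: tuple[float, float],
                           deg_per_px: float = 0.12) -> tuple[float, float]:
    """Map the pupil offset from its calibrated center (pixels) to yaw/pitch angles (degrees)."""
    dx = pupil_px[0] - center_px[0]
    dy = pupil_px[1] - center_px[1]
    return dx * deg_per_px, dy * deg_per_px


def gaze_vector(yaw_deg: float, pitch_deg: float) -> tuple[float, float, float]:
    """Unit gaze vector in the eye-camera frame (x right, y down, z forward)."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.sin(yaw) * math.cos(pitch),
            math.sin(pitch),
            math.cos(yaw) * math.cos(pitch))


yaw, pitch = gaze_angles_from_pupil(pupil_px=(332.0, 240.0), center_px=(320.0, 240.0))
print(gaze_vector(yaw, pitch))  # gaze rotated about 1.4 degrees to the right
```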

[0068] The operating unit 207 is configured to receive an operation from the user with regard to the input/output device 20. The operating unit 207 may be configured by using an input device such as a touch panel or a button. The operating unit 207 is held at a predetermined position of the input/output device 20 by the holding unit 291. For example, in the example illustrated in FIG. 2, the operating unit 207 is held at the position corresponding to a temple of the glasses.

[0069] Furthermore, the input/output device 20 according to the present embodiment may be configured to include, for example, an acceleration sensor or an angular velocity sensor (gyro sensor) so as to detect the movement of the user’s head wearing the input/output device 20 (in other words, the movement of the input/output device 20 itself). In a specific example, the input/output device 20 may detect each component in the yaw direction, the pitch direction, and the roll direction as the movement of the user’s head to recognize a change in at least any of the position and the posture of the user’s head.
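
As a minimal illustration of how the yaw, pitch, and roll components detected by an angular velocity sensor can be accumulated into a change in head posture, the sketch below integrates gyro samples over time. It is an assumption for illustration, not the patent's implementation, and it ignores the drift correction a real device would need.

```python
# Accumulate angular-velocity samples into yaw/pitch/roll angles (illustrative).
from dataclasses import dataclass


@dataclass
class HeadPosture:
    yaw_deg: float = 0.0
    pitch_deg: float = 0.0
    roll_deg: float = 0.0

    def integrate(self, gyro_deg_per_s: tuple[float, float, float], dt_s: float) -> None:
        """Add one gyro sample (yaw, pitch, roll rates in deg/s) over dt_s seconds."""
        self.yaw_deg += gyro_deg_per_s[0] * dt_s
        self.pitch_deg += gyro_deg_per_s[1] * dt_s
        self.roll_deg += gyro_deg_per_s[2] * dt_s


# A steady 30 deg/s head turn sampled at 100 Hz for one second:
posture = HeadPosture()
for _ in range(100):
    posture.integrate((30.0, 0.0, 0.0), dt_s=0.01)
print(posture.yaw_deg)  # ~30.0 degrees of yaw
```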

[0070] With the above-described configuration, the input/output device 20 according to the present embodiment may recognize a change in its position or posture in the real space in accordance with the movement of the user’s head. Furthermore, here, the input/output device 20 may cause the display unit 211 to present a virtual content (i.e., a virtual object) such that it is superimposed with respect to the real object located in the real space based on what is called the AR technology. Further, an example of the method (i.e., self-location estimation) with which the input/output device 20 estimates its position and posture in the real space is described in detail later.

[0071] Furthermore, examples of a head mounted display device (HMD: Head Mounted Display) that is applicable as the input/output device 20 include a see-through type HMD, a video see-through type HMD, and a retinal projection type HMD when the application of the AR technology is assumed.

[0072] The see-through type HMD uses, for example, a half mirror or a transparent light guide plate to hold a virtual-image optical system including a transparent light guide unit, or the like, in front of the user’s eyes and displays an image on the inner side of the virtual-image optical system. Therefore, the user wearing the see-through type HMD may see the external scenery while watching the image displayed on the inner side of the virtual-image optical system. With this configuration, the see-through type HMD may superimpose the image of a virtual object on the optical image of the real object located in the real space in accordance with the recognition result of at least any one of the position and the posture of the see-through type HMD on the basis of, for example, the AR technology. Moreover, specific examples of the see-through type HMD include what is called a glasses-type wearable device in which the part corresponding to a lens of the glasses is configured as a virtual-image optical system. For example, the input/output device 20 illustrated in FIG. 2 corresponds to an example of the see-through type HMD.

[0073] When the video see-through type HMD is attached to the user’s head or face, it is attached so as to cover the user’s eyes, and a display unit such as a display is held in front of the user’s eyes. Further, the video see-through type HMD includes an imaging unit that captures the surrounding scenery so as to cause the display unit to display the image of the scenery, captured by the imaging unit, in front of the user. With this configuration, although it is difficult for the user wearing the video see-through type HMD to directly see the external scenery, it is possible to check the external scenery by using the image displayed on the display unit. Moreover, here, the video see-through type HMD may superimpose a virtual object on the image of the external scenery in accordance with the recognition result of at least any one of the position and the posture of the video see-through type HMD based on, for example, the AR technology.

[0074] In the retinal projection type HMD, a projection unit is held in front of the user’s eyes so that an image is projected by the projection unit to the user’s eyes such that the image is superimposed on the external scenery. More specifically, in the retinal projection type HMD, the image is directly projected by the projection unit on the retina of the user’s eye, and the image is focused on the retina. With this configuration, even a user having myopia or hypermetropia may view clearer images. Furthermore, the user wearing the retinal projection type HMD may see the external scenery while watching the image projected by the projection unit. With this configuration, the retinal projection type HMD may superimpose the image of a virtual object on the optical image of the real object located in the real space in accordance with the recognition result of at least any one of the position and the posture of the retinal projection type HMD based on, for example, the AR technology.

[0075] Furthermore, in addition to the example described above, it is also possible to apply an HMD called an immersive HMD when the application of the VR technology is assumed. As is the case with the video see-through type HMD, the immersive HMD is attached so as to cover the user’s eyes, and a display unit such as a display is held in front of the user’s eyes. Thus, the user wearing the immersive HMD has difficulty in directly seeing the external scenery (i.e., the scenery in the real world) and therefore sees only the video displayed on the display unit. With this configuration, the immersive HMD may give a sense of immersion to the user who is viewing an image.

[0076] An example of the schematic configuration of the input/output device according to an embodiment of the present disclosure has been described above with reference to FIG. 2.

[0077] <1. 3. Principle of Self-Location Estimation>

[0078] Next, an example of the principle of the method (i.e., self-location estimation) for estimating the position and the posture of the input/output device 20 in the real space when the virtual object is superimposed with respect to the real object is described.

[0079] According to a specific example of the self-location estimation, the input/output device 20 uses an imaging unit, such as a camera, provided therein to capture a marker, or the like, having a known size and presented on the real object in the real space. Then, the input/output device 20 analyzes the captured image to estimate at least one of its own position and posture relative to the marker (and furthermore the real object on which the marker is presented). Furthermore, the following description focuses on the case in which the input/output device 20 estimates both its position and posture; however, the input/output device 20 may estimate only one of them.

[0080] Specifically, it is possible to estimate the direction of the imaging unit (and furthermore the input/output device 20 including the imaging unit) relative to the marker in accordance with the direction of the marker captured in the image (e.g., the direction of the pattern of the marker). Furthermore, in the case where the size of the marker is known, the distance between the marker and the imaging unit (i.e., the input/output device 20 including the imaging unit) may be estimated in accordance with the size of the marker in the image. More specifically, when the marker is captured at a longer distance, it appears smaller in the image. Further, here, the range of the real space captured in the image may be estimated based on the angle of view of the imaging unit. With the use of the above features, it is possible to inversely calculate the distance between the marker and the imaging unit in accordance with the size of the marker captured in the image (in other words, the percentage occupied by the marker within the angle of view). With the above-described configuration, the input/output device 20 may estimate its position and posture relative to the marker.
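
The inverse calculation described above can be sketched as follows under an assumed pinhole-camera model: the marker of known physical size occupies a certain fraction of the angle of view, which fixes its distance from the imaging unit. All names and numbers are illustrative.

```python
# Distance to a marker of known size from its apparent size in the image
# and the angle of view of the imaging unit (pinhole-camera assumption).
import math


def distance_to_marker(marker_width_m: float,
                       marker_width_px: float,
                       image_width_px: float,
                       horizontal_fov_deg: float) -> float:
    # Focal length in pixels implied by the horizontal angle of view.
    focal_px = (image_width_px / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    # Pinhole projection: marker_width_px = focal_px * marker_width_m / distance.
    return marker_width_m * focal_px / marker_width_px


# A 10 cm marker spanning 80 px in a 1280 px wide image with a 90 degree FOV:
print(distance_to_marker(0.10, 80.0, 1280.0, 90.0))  # ~0.8 m
```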

[0081] Furthermore, the technique called SLAM (simultaneous localization and mapping) may be used for the self-location estimation of the input/output device 20. The SLAM is a technique using an imaging unit, such as a camera, various sensors, an encoder, or the like, to perform the self-location estimation and the generation of an environment map in parallel. In a more specific example, in the SLAM (in particular, Visual SLAM), the three-dimensional shape of the captured scene (or the subject) is sequentially restored based on the moving image captured by the imaging unit. Then, the restoration result of the captured scene is associated with the detection result of the position and the posture of the imaging unit so that the map of the surrounding environment is generated and the position and the posture of the imaging unit (and furthermore the input/output device 20) in the environment are estimated. For example, the input/output device 20 may include various sensors such as an acceleration sensor and an angular velocity sensor to estimate the position and the posture of the imaging unit as the information indicating a relative change based on detection results of the sensors. It is obvious that, as long as the position and the posture of the imaging unit may be estimated, the method is not necessarily limited to the method based on detection results of various sensors such as an acceleration sensor and an angular velocity sensor.

[0082] With the above configuration, for example, the estimation results of the position and the posture of the input/output device 20 relative to the marker based on the capturing result of the known marker by the imaging unit may be used for the initialization processing and the position correction for the above-described SLAM. With this configuration, even in a situation where the marker is not included in the angle of view of the imaging unit, the input/output device 20 may estimate its position and posture relative to the marker (and furthermore the real object for which the marker is presented) due to the self-location estimation based on the SLAM in accordance with the result of the previously executed initialization or position correction.

[0083] An example of the principle of the method (i.e., the self-location estimation) for the input/output device 20 to estimate its position and posture in the real space when the virtual object is superimposed with respect to the real object has been described above. Furthermore, in the following description, for example, the position and the posture of the input/output device 20 relative to an object (the real object) in the real space may be estimated based on the above-described principle.

2. CONSIDERATION ON GUIDANCE OF THE LINE OF SIGHT

[0084] Next, the overview of the guidance for the line of sight of the user in the assumed case of the use of the AR technology or the VR technology is described. Furthermore, the term “localize” in this description refers to being located or being presented to be located at a position (in other words, coordinates) in a desired space, such as the real space or a virtual space. Furthermore, the term “localize” may include not only a stationary state in a desired space but also, for example, a moving state along a route in a desired space. For example, the description of a desired object being localized along a predetermined route may include the object being presented so as to move along the route.

[0085] When information is presented to a user via a display device such as a display, for example, the position where the target information to be presented is localized is sometimes limited to the range defined in the display screen of the display device. On the other hand, when the VR technology or the AR technology is used, the target information to be presented to the user is not limited to the range within the field of view of the user but may be also localized within the expanded space (for example, around the user) that is wider than the field of view.

[0086] In a specific example, when the use of the AR technology is assumed, an object that is virtual (hereinafter, also referred to as “virtual object”) may be presented to the user such that the object is localized not only within the range of the field of view of the user but also in the real space that spreads around the user. In this case, for example, the user may see the virtual object localized in the space around him/her (e.g., the virtual object located outside of his/her own field of view) while changing the position or the direction of the viewpoint so as to look around. Furthermore, even in the case of the use of the VR technology, the information may be presented to the user in substantially the same manner as is the case with the use of the AR technology except that the virtual space is used instead of the real space.

[0087] However, when the information that needs to be noticed by the user is presented outside the field of view of the user (in other words, outside the range displayed on the display device such as a display), the user is sometimes unaware of the presented information. In this background, for a situation where the target information to be presented is localized outside the field of view of the user, as in the case of the use of the AR technology or the VR technology, for example, there is sometimes demand for the introduction of the mechanism that guides the user’s line of sight to a desired position (for example, the position where the information is localized).

[0088] Therefore, the present disclosure suggests an example of the technique for guiding the line of sight of the user in a more preferable manner in a situation where the user sees the information presented so as to be localized around him/her while flexibly changing the position and the direction of the viewpoint. In particular, the present disclosure suggests an example of the mechanism that may reduce a load (e.g., a mental load or a physical load) on the user in accordance with the following when the user causes the line of sight to follow the guidance.

3. FIRST EMBODIMENT

[0089] Hereinafter, an information processing system according to a first embodiment of the present disclosure is described.

[0090] <3. 1. Overview>

[0091] First, the overview of the information processing system according to the present embodiment is described. For example, FIG. 3 and FIG. 4 are explanatory diagrams illustrating the overview of the information processing system according to the present embodiment and illustrating an example of the method for guiding the line of sight of the user. In FIG. 3 and FIG. 4, the reference numeral R101 schematically indicates the field of view of a user U111. Specifically, the reference numeral P110 indicates the position of the viewpoint (hereinafter, simply referred to as “viewpoint”) of the user U111. Furthermore, in the following description, the position of the viewpoint of the user is also referred to as “viewpoint position” as in the position P110. Further, the reference numeral V111 schematically indicates the display information (hereinafter, also referred to as “guidance object”) presented to guide the line of sight of the user. The guidance object V111 is presented to the user U111 via the display unit (e.g., the display unit 211 illustrated in FIG. 2), such as a display.

[0092] Specifically, FIG. 3 illustrates an example of the case in which the line of sight of the user U111 paying attention to the position indicated by the reference numeral P111 within the field of view R101 is guided to a position P113 outside the field of view R101 due to the presentation of the guidance object V111. Here, the position P111 is a position substantially corresponding to the point of gaze of the user U111 before the guidance of the line of sight is started. That is, in the example illustrated in FIG. 3, the guidance object V111 is first presented at the position P111, and the presentation of the guidance object V111 to the user U111 is controlled such that the guidance object V111 is moved from the point of gaze P111 to the position P113. Furthermore, in the following description, the position at which the guidance of the line of sight is started, such as the position P111, is also referred to as “start position”, and the position to which the line of sight is guided, i.e., the position at which the guidance of the line of sight is ended, such as the position P113, is also referred to as “end position”.

[0093] For example, in the example illustrated in FIG. 3, the presentation of the guidance object V111 is controlled such that the guidance object V111 is localized along a route R111 connecting the start position P111 and the end position P113 with a straight line (that is, the guidance object V111 is moved along the route R111). However, in the example illustrated in FIG. 3, the guidance object V111 may be presented to the user U111 in such an expression that it passes through the user U111 or passes near the user U111 in accordance with the positional relationship among the viewpoint position P110, the start position P111 (i.e., the point of gaze), and the end position P113 (i.e., the guidance destination). That is, the guidance object V111 is presented to the user U111 in such an expression that the guidance object V111 moves toward him/her and passes through his/her body. In such a situation, the user U111 may feel uncomfortable.

[0094] Therefore, the information processing system according to the present embodiment controls the presentation of the guidance object V111 such that the guidance object V111 is localized along the route for which the user U111 has a lower following load with regard to the guidance object V111.

[0095] For example, FIG. 4 illustrates an example of the control regarding the guidance of the line of sight of the user U111 in the information processing system according to the present embodiment. Furthermore, in FIG. 4, the target denoted by the same reference numeral as that in FIG. 3 indicates the same target as that in FIG. 3. For example, in the example illustrated in FIG. 4, when the route R111 connecting the start position P111 and the end position P113 with a straight line is set in the same manner as in the example illustrated in FIG. 3, the guidance object V111 passes through an area near the user U111.

[0096] Therefore, in the situation illustrated in FIG. 4, the information processing system according to the present embodiment controls the presentation of the guidance object V111 such that the guidance object V111 is localized along the route for which the user U111 has a lower following load (e.g., a physical load or a mental load) with regard to the guidance object V111. Specifically, in the example illustrated in FIG. 4, the information processing system sets, as the route along which the guidance object V111 is moved, a route R113 farther away from the user U111 (i.e., the route farther away from the viewpoint position P110) among a plurality of routes connecting the start position P111 and the end position P113.

[0097] Under such a control, the guidance object V111 is controlled so as to move from the start position P111 to the end position P113 while the guidance object V111 remains separated from the user U111. That is, it is possible to prevent the occurrence of a situation in which the guidance object V111 is presented to the user U111 in such an expression that the guidance object V111 moves toward the user U111 and eventually passes through the user U111. Thus, it is possible to reduce the following load on the user with regard to the guidance object V111, as compared with the case where the guidance object V111 is controlled so as to be localized along the route R111.
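
One possible way to realize this control is sketched below: two candidate routes connecting the start position and the end position are sampled, and the guidance object is localized along the route that keeps the larger minimum distance from the viewpoint, which is used here as a simple proxy for a lower following load. The sampling scheme and all names are assumptions for illustration, not the patent's algorithm.

```python
# Choose between a straight route and a route bowed away from the viewpoint
# (illustrative sketch of the route selection described above).
import numpy as np


def sample_route(start, end, viewpoint, bulge=0.0, steps=32):
    """Points from start to end; bulge > 0 pushes the midpoint away from the viewpoint."""
    start, end, viewpoint = (np.asarray(p, dtype=float) for p in (start, end, viewpoint))
    t = np.linspace(0.0, 1.0, steps)[:, None]
    straight = (1.0 - t) * start + t * end
    away = 0.5 * (start + end) - viewpoint
    away = away / (np.linalg.norm(away) + 1e-9)
    # Quadratic bump: zero at both ends, maximal at the midpoint of the route.
    return straight + bulge * (4.0 * t * (1.0 - t)) * away


def pick_route(start, end, viewpoint, bulge=1.0):
    candidates = [sample_route(start, end, viewpoint, b) for b in (0.0, bulge)]
    clearance = [np.min(np.linalg.norm(r - np.asarray(viewpoint, dtype=float), axis=1))
                 for r in candidates]
    return candidates[int(np.argmax(clearance))]  # route with the larger clearance


# The straight route would pass right by the user's head, so the bowed route wins.
route = pick_route(start=(1.0, 0.0, 1.5), end=(-1.0, 0.0, 1.5), viewpoint=(0.0, 0.0, 1.4))
```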

[0098] The overview of the information processing system according to the present embodiment has been described above with reference to FIG. 3 and FIG. 4.

[0099] <3. 2. Functional Configuration>

[0100] Next, an example of the functional configuration of the information processing system according to the present embodiment is described with reference to FIG. 5. FIG. 5 is a block diagram illustrating an example of the functional configuration of the information processing system according to the present embodiment. Furthermore, in this description, an example of the configuration of the information processing system is described with a focus on the case where the information processing system uses the AR technology to present information to the user.

[0101] As illustrated in FIG. 5, the information processing system 1 includes the input/output device 20, the information processing apparatus 10, and a storage unit 190. Here, the input/output device 20 and the information processing apparatus 10 correspond to, for example, the input/output device 20 and the information processing apparatus 10 in the example illustrated in FIG. 1.

[0102] The input/output device 20 includes, for example, imaging units 201 and 203, a sensing unit 221, and an output unit 210. Furthermore, the imaging unit 201 corresponds to the first imaging units 201a and 201b in the example illustrated in FIG. 2. Further, the imaging unit 203 corresponds to the second imaging units 203a and 203b in the example illustrated in FIG. 2. That is, the imaging unit 201 and the imaging unit 203 are substantially the same as those in the example described with reference to FIG. 2, and therefore detailed descriptions thereof are omitted.

[0103] The sensing unit 221 is configured to sense various states. For example, the sensing unit 221 may include an acceleration sensor, an angular velocity sensor, an orientation sensor, or the like, and use the sensor to sense the movement of the site (e.g., the head) of the user wearing the input/output device 20 (and furthermore the direction in which the site of the user faces). Furthermore, the sensing unit 221 may include a receiver corresponding to the GNSS (Global Navigation Satellite System), or the like, to sense the position of the input/output device 20 (and furthermore the position of the user). Further, the sensing unit 221 may include a biological sensor, or the like, and use the biological sensor to sense various states of the user. Moreover, the sensing unit 221 notifies the information processing apparatus 10 of the information corresponding to a sensing result of various states.

[0104] The output unit 210 corresponds to an output interface to present various types of information to the user. For example, the output unit 210 may include the display unit 211. Furthermore, the display unit 211 corresponds to the display unit 211 in the example illustrated in FIG. 2 and therefore detailed descriptions are omitted. Further, the output unit 210 may include a sound output unit 213. The sound output unit 213 is configured by using a sound output device such as a speaker to output voice or sound so as to present desired information to the user.

[0105] The information processing apparatus 10 includes a recognition processing unit 101, a detection unit 103, a processing execution unit 109, and an output control unit 111.

[0106] The recognition processing unit 101 acquires the image captured by the imaging unit 201 and applies an analysis process to the acquired image so as to recognize an object (subject) present in the real space and captured in the image. In a specific example, the recognition processing unit 101 acquires images (hereinafter, also referred to as “stereo images”) captured at different viewpoints from the imaging unit 201 configured as a stereo camera and measures the distance to the object captured in the image for each pixel of the image based on the disparity between the acquired images. Thus, the recognition processing unit 101 may estimate or recognize the relative positional relationship (in particular, the positional relationship in the depth direction) between the imaging unit 201 (and furthermore the input/output device 20) and each object present in the real space and captured in the image at the timing in which the image is captured.
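
As an illustration of the per-pixel depth measurement described above, the sketch below evaluates depth = focal length x baseline / disparity over a disparity map obtained from a rectified stereo pair. The parameter values are assumptions chosen for illustration.

```python
# Per-pixel depth map from a stereo disparity map (illustrative values).
import numpy as np


def depth_map_from_disparity(disparity_px: np.ndarray,
                             focal_length_px: float,
                             baseline_m: float) -> np.ndarray:
    """Per-pixel depth in metres; pixels with no valid disparity become inf."""
    depth = np.full(disparity_px.shape, np.inf, dtype=float)
    valid = disparity_px > 0.0
    depth[valid] = focal_length_px * baseline_m / disparity_px[valid]
    return depth


# Toy 2x2 disparity map (left image matched against the right image).
disparity = np.array([[21.0, 42.0],
                      [ 0.0, 14.0]])
print(depth_map_from_disparity(disparity, focal_length_px=700.0, baseline_m=0.06))
# [[ 2.  1.]
#  [inf  3.]]
```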

[0107] Furthermore, the recognition processing unit 101 may perform the self-location estimation and the generation of the environment map based on the SLAM to recognize the positional relationship between the input/output device 20 and the object present in the real space and captured in the image. In this case, for example, the recognition processing unit 101 may acquire the information corresponding to the detection result of a change in the position and the posture of the input/output device 20 from the detection unit 103 (e.g., the posture detection unit 107), which is described later, and use the information for the self-location estimation based on the SLAM.

[0108] As described above, the recognition processing unit 101 recognizes the position of each object present in the real space and captured in the image and outputs the information indicating the recognition result to the output control unit 111.

[0109] Furthermore, as described above, the method for measuring the distance to the subject is not limited to the above-described measurement method based on a stereo image. Therefore, the configuration corresponding to the imaging unit 201 may be changed as appropriate in accordance with the method for measuring the distance. In a specific example, in a case where the distance to the subject is measured based on the TOF, it is possible to provide, instead of the imaging unit 201, a light source that projects infrared rays and a light receiving element that detects infrared rays projected from the light source and reflected by the subject. Furthermore, a plurality of measurement methods may be used to measure the distance to the object. In this case, the input/output device 20 or the information processing apparatus 10 may include the configuration that acquires the information to be used for the measurement in accordance with the measurement method to be used. It is obvious that the content of the information (e.g., depth map) indicating the recognition result of the real-space position of each object captured in the image may be changed as appropriate in accordance with the measurement method to be applied.

[0110] The detection unit 103 detects various states of the user holding the input/output device 20 (for example, the user wearing the input/output device 20) based on various types of information acquired by the input/output device 20. For example, the detection unit 103 includes a line-of-sight detection unit 105 and the posture detection unit 107.

[0111] The line-of-sight detection unit 105 detects the direction in which the line of sight of the user is directed based on the image of the user’s eyeball captured by the imaging unit 203. Here, as an example of the method for detecting the line of sight of the user has been described above with reference to FIG. 2, a detailed description is omitted. Furthermore, the line-of-sight detection unit 105 outputs the information corresponding to the detection result of the line of sight of the user to the output control unit 111.

[0112] The posture detection unit 107 detects the position and the posture (hereinafter, also simply referred to as “posture”) of the site of the user holding the input/output device 20 in accordance with a detection result of various states by the sensing unit 221. For example, the posture detection unit 107 may detect a change in the posture of the input/output device 20 in accordance with the detection result of the acceleration sensor or the angular velocity sensor held by the input/output device 20 configured as a head mounted device so as to detect the posture of the user’s head wearing the input/output device 20. Similarly, the posture detection unit 107 may detect a change in the posture of the site, such as the user’s hand, arm, or leg, in accordance with the detection result of the acceleration sensor or the angular velocity sensor held by the site. It is obvious that the above is merely an example and, as long as the posture of the site of the user may be detected, there is no particular limitation on the target site for the posture detection and the method for detecting the posture of the site. That is, the configuration for detecting the posture of the site and the detection method may be changed as appropriate depending on the target site for the posture detection. Furthermore, the posture detection unit 107 outputs the information corresponding to the detection result of the posture of the site of the user to the output control unit 111.

[0113] The processing execution unit 109 is configured to execute various functions (e.g., applications) provided by the information processing apparatus 10 (and furthermore the information processing system 1). For example, the processing execution unit 109 may extract the target application to be executed from a predetermined storage unit (for example, the storage unit 190 described later) in accordance with a predetermined trigger such as a user input and execute the extracted application. Furthermore, the processing execution unit 109 may give a command to the output control unit 111 so as to output information in accordance with the execution result of the application. In a specific example, the processing execution unit 109 may give a command to the output control unit 111 so as to present display information such as a virtual object such that the virtual object is localized at a desired position in the real space in accordance with the execution result of the application.

[0114] The output control unit 111 controls the output unit 210 so as to output various types of information, which is the target to be output, and thus present the information to the user. For example, the output control unit 111 may cause the output unit 210 to output the information corresponding to the execution result of the application by the processing execution unit 109 so as to present the information corresponding to the execution result to the user. In a more specific example, the output control unit 111 may control the display unit 211 so as to present the display information, which is the target to be output, and thus present the display information to the user. Moreover, the output control unit 111 may control the sound output unit 213 so as to output the sound corresponding to the information to be output and thus present the information to the user.

[0115] Furthermore, the output control unit 111 may cause the display unit 211 to display the display information, such as a virtual object, such that the display information is localized in the real space in accordance with the recognition result of the object in the real space by the recognition processing unit 101 (in other words, the result of the self-location estimation of the input/output device 20 in the real space). In this case, for example, the output control unit 111 estimates the area in the real space in which the information displayed on the display unit 211 is superimposed (in other words, the area included in the field of view of the user) in accordance with the result of the self-location estimation of the input/output device 20 to recognize the positional relationship between the area and the display information localized in the real space. Then, the output control unit 111 performs control such that the display information localized in the area is displayed at the corresponding position in the display area of the display unit 211 in accordance with the recognition result of the positional relationship between the area and the display information.

[0116] Specifically, the output control unit 111 calculates, for example, the position and the range of the area displayed on the display unit 211 in the real space in accordance with the position and the posture of the input/output device 20 (the display unit 211) in the real space. This makes it possible to recognize the positional relationship in the real space between the area displayed on the display unit 211 and the position in which the display information to be presented is localized. Then, the output control unit 111 may calculate the position in the screen on which the display information is displayed in accordance with the positional relationship between the area displayed on the display unit 211 and the position where the display information to be presented is localized. Furthermore, the coordinate associated with a position in the screen of the display unit 211 (i.e., the coordinate associated with the display unit 211) corresponds to an example of a “first coordinate”, and the coordinate associated with a position in the real space corresponds to an example of a “second coordinate”.
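
The mapping from the second coordinate (a position localized in the real space) into the first coordinate (a pixel position on the display unit) can be sketched as follows, using the estimated rotation and position of the display unit from the self-location estimation and an assumed pinhole projection; the names, focal length, and screen size are illustrative assumptions.

```python
# Project a world-space (second coordinate) position into screen pixels
# (first coordinate) given the estimated pose of the display unit.
import numpy as np


def world_to_screen(point_world, device_rotation, device_position,
                    focal_px=700.0, screen_size_px=(1280, 720)):
    """Return (u, v) pixel coordinates, or None if the point is behind or off screen."""
    # World -> display coordinate using the self-location estimate (R, t).
    p_cam = np.asarray(device_rotation).T @ (np.asarray(point_world) - np.asarray(device_position))
    if p_cam[2] <= 0.0:
        return None  # behind the display unit
    u = focal_px * p_cam[0] / p_cam[2] + screen_size_px[0] / 2.0
    v = focal_px * p_cam[1] / p_cam[2] + screen_size_px[1] / 2.0
    if not (0.0 <= u < screen_size_px[0] and 0.0 <= v < screen_size_px[1]):
        return None  # localized outside the area shown on the display
    return u, v


# Display at the origin looking along +z; a virtual object 2 m ahead, 0.2 m to the right.
print(world_to_screen((0.2, 0.0, 2.0), np.eye(3), (0.0, 0.0, 0.0)))  # ~(710.0, 360.0)
```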

[0117] Furthermore, the output control unit 111 includes a guidance control unit 113. The guidance control unit 113 presents information to the user through the output unit 210 to control the guidance of the line of sight of the user. Here, the guidance control unit 113 may use the detection result by the line-of-sight detection unit 105 with regard to the direction in which the line of sight of the user is directed and the detection result by the posture detection unit 107 with regard to the posture of the site of the user for the guidance of the line of sight of the user. In a specific example, the guidance control unit 113 may recognize the positional relationship among the position of the viewpoint P110, the start position P111 (e.g., the point of gaze), and the end position P113 (i.e., the guidance destination of the line of sight) in the example illustrated in FIG. 4 in accordance with the result of the self-location estimation of the input/output device 20. Thus, for example, the guidance control unit 113 may set the route R113 for guiding the line of sight in accordance with the recognition result of the positional relationship and control the presentation of the guidance object V111 to the user such that the guidance object V111 is localized along the route R113.

[0118] Furthermore, the operation regarding the guidance of the line of sight of the user is not limited to the example described with reference to FIG. 4. Therefore, another example of the operation regarding the guidance of the line of sight of the user is separately described later as a modification. Further, the user’s self-location (i.e., the position of the viewpoint P110), the detection result of the user’s line of sight, the posture of the site of the user, and the like, correspond to an example of “first information” regarding the guidance of the user’s line of sight, and the guidance object V111 corresponds to an example of “second information”. That is, for example, the part of the output control unit 111 regarding the acquisition of the above-described first information corresponds to an example of an “acquisition unit”, and the guidance control unit 113 corresponds to an example of a “control unit”. Furthermore, the route, such as the route R111 illustrated in FIG. 4, connecting the start position and the end position with a straight line corresponds to an example of a “first route”, and the route, such as the route R113, that is farther away from the user than the first route corresponds to an example of a “second route”.

[0119] The storage unit 190 is a storage area for temporarily or permanently storing various types of data. For example, the storage unit 190 may store the data for the information processing apparatus 10 to execute various functions. In a more specific example, the storage unit 190 may store the data (for example, libraries) for executing various applications, management data for managing various settings, and the like.

[0120] Furthermore, the above-described functional configuration of the information processing system 1 according to the present embodiment is merely an example, and the functional configuration of the information processing system 1 is not necessarily limited to the example illustrated in FIG. 5 as long as each function of the input/output device 20 and the information processing apparatus 10 described above may be performed. In a specific example, the input/output device 20 and the information processing apparatus 10 may be integrally configured.

[0121] Furthermore, a part of the components of the information processing apparatus 10 may be provided in a different device. In a specific example, the recognition processing unit 101 and the detection unit 103 may be provided in a different device (for example, the input/output device 20 or a device different from the information processing apparatus 10 or the input/output device 20). Similarly, a part of the components of the input/output device 20 may be provided in a different device.

[0122] Furthermore, each function of the information processing apparatus 10 may be performed by a plurality of apparatuses operating in cooperation. In a specific example, the function provided by each component of the information processing apparatus 10 may be provided by a virtual service (e.g., a cloud service) performed by a plurality of apparatuses operating in cooperation. In this case, the service corresponds to the above-described information processing apparatus 10. Similarly, each function of the input/output device 20 may be performed by a plurality of devices operating in cooperation.

[0123] An example of the functional configuration of the information processing system according to the present embodiment has been described above with reference to FIG. 5.

[0124] <3. 3. Process>

[0125] Next, with reference to FIG. 6, an example of the flow of the series of processes in the information processing system according to the present embodiment is described with a focus on, particularly, the operation regarding the guidance of the line of sight of the user by the information processing apparatus 10. FIG. 6 is a flowchart illustrating an example of the flow of the series of processes in the information processing system according to the present embodiment.

[0126] First, the information processing apparatus 10 (the guidance control unit 113) sets the guidance destination for the line of sight (S101). For example, the information processing apparatus 10 may set, as the guidance destination of the line of sight of the user, the position where the display information (e.g., a virtual object), which needs to be noticed by the user, is localized. In a more specific example, the information processing apparatus 10 may set, as the guidance destination of the line of sight of the user, the position where the virtual object is localized based on the execution result of a desired application.

[0127] Subsequently, the information processing apparatus 10 (the line-of-sight detection unit 105) detects the line of sight of the user based on various types of information acquired by the input/output device 20 and recognizes the positions of the viewpoint and the point of gaze based on the detection result (S103). In a specific example, the information processing apparatus 10 may recognize the direction of the line of sight of the user in accordance with the imaging result by the imaging unit 203 held by the input/output device 20 with regard to the eyeball of the user. Further, the information processing apparatus 10 (the recognition processing unit 101) estimates the self-location of the input/output device 20 (and furthermore the self-location of the user) in the real space in accordance with the imaging result by the imaging unit 201 held by the input/output device 20 with regard to the environment around the user. Furthermore, the information processing apparatus 10 (the guidance control unit 113) recognizes the position (i.e., the point of gaze) in the real space to which the user directs the line of sight based on the self-location of the user and the direction of the line of sight of the user. Moreover, the self-location of the user may correspond to the position of the viewpoint of the user.
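
A hedged sketch of this step is given below: a gaze ray cast from the user's self-location (the viewpoint) along the detected line-of-sight direction is compared with the recognized real-space positions, and the position lying closest to the ray is taken as the point of gaze. This is an assumption made for illustration rather than the patent's procedure, and the threshold value is arbitrary.

```python
# Estimate the point of gaze as the recognized position closest to the gaze ray.
import numpy as np


def point_of_gaze(viewpoint, gaze_dir, candidate_positions, max_offset_m=0.3):
    """Return the candidate position best explained by the gaze ray, or None."""
    viewpoint = np.asarray(viewpoint, dtype=float)
    d = np.asarray(gaze_dir, dtype=float)
    d = d / np.linalg.norm(d)
    best, best_offset = None, max_offset_m
    for p in candidate_positions:
        v = np.asarray(p, dtype=float) - viewpoint
        along = float(v @ d)
        if along <= 0.0:
            continue  # behind the viewpoint
        offset = float(np.linalg.norm(v - along * d))  # perpendicular distance from the ray
        if offset < best_offset:
            best, best_offset = np.asarray(p, dtype=float), offset
    return best


# Viewpoint at the origin looking along +z; two recognized real-space positions.
print(point_of_gaze((0, 0, 0), (0, 0, 1), [(0.05, 0.0, 2.0), (1.0, 0.0, 2.0)]))  # ~[0.05 0. 2.]
```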

……
……
……
