Sony Patent | Information Processing Apparatus, Information Processing Method, And Recording Medium
Patent: Information Processing Apparatus, Information Processing Method, And Recording Medium
Publication Number: 20200042105
Publication Date: 20200206
Applicants: Sony
Abstract
To recognize a user input in a more favorable form without using an input device provided in a housing of an apparatus. An information processing apparatus includes a determination unit configured to determine whether or not an imaging unit is in a predetermined shielding state, and a recognition unit configured to recognize an operation input of a user according to the predetermined shielding state.
TECHNICAL FIELD
[0001] The present disclosure relates to an information processing apparatus, an information processing method, and a recording medium.
BACKGROUND ART
[0002] In recent years, the types of devices referred to as information processing apparatuses have diversified with the advancement of communication technologies and the miniaturization of various devices. Not only personal computers (PCs) but also information processing apparatuses that users can carry, such as smartphones and tablet terminals, have come into widespread use. In particular, so-called wearable devices, which a user can use while carrying them by wearing them on a part of the body, have also been proposed in recent years. Specific examples of such wearable devices include devices mounted on the head such as a head mounted display (HMD) and a glasses-type wearable device (hereinafter referred to as “head-mounted devices”).
CITATION LIST
Patent Document
[0003] Patent Document 1: Japanese Patent Application Laid-Open No. 2014-186361
SUMMARY OF THE INVENTION
Problems to be Solved by the Invention
[0004] As an example of an input interface for the user to input various types of information to an information processing apparatus, input devices such as a button, a switch, and a touch sensor are generally known. Meanwhile, because a head-mounted device is used by being worn on the head, there are some cases where the user has difficulty directly viewing an input device provided in a part of the housing, and such cases are less convenient than a case where the user can directly view the input interface.
[0005] To cope with this inconvenience, there are some cases where gesture input is adopted as an input interface for inputting various types of information to the information processing apparatus without using input devices such as a button and a switch. However, since gesture input requires relatively high-load processing such as image recognition, power consumption tends to be larger.
[0006] Therefore, the present disclosure proposes an information processing apparatus, an information processing method, and a recording medium capable of recognizing an operation input of a user in a more favorable form without using an input device provided in a housing of the apparatus.
Solutions to Problems
[0007] According to the present disclosure, provided is an information processing apparatus including a determination unit configured to determine whether or not an imaging unit is in a predetermined shielding state, and a recognition unit configured to recognize an operation input of a user according to the predetermined shielding state.
[0008] Furthermore, according to the present disclosure, provided is an information processing method for causing a computer to perform determining whether or not an imaging unit is in a predetermined shielding state, and recognizing an operation input of a user according to the predetermined shielding state.
[0009] Furthermore, according to the present disclosure, provided is a recording medium storing a program for causing a computer to execute determining whether or not an imaging unit is in a predetermined shielding state, and recognizing an operation input of a user according to the predetermined shielding state.
Effects of the Invention
[0010] As described above, according to the present disclosure, provided are an information processing apparatus, an information processing method, and a recording medium capable of recognizing an operation input of a user in a more favorable form without using an input device provided in a housing of the apparatus.
[0011] Note that the above-described effect is not necessarily limited, and any of effects described in the present specification or other effects that can be grasped from the present specification may be exerted in addition to or in place of the above-described effect.
BRIEF DESCRIPTION OF DRAWINGS
[0012] FIG. 1 is an explanatory view for describing an example of a schematic configuration of an information processing system according to an embodiment of the present disclosure.
[0013] FIG. 2 is an explanatory view for describing an example of a schematic configuration of an input/output device according to the embodiment.
[0014] FIG. 3 is an explanatory view for describing an outline of an input interface according to the embodiment.
[0015] FIG. 4 is an explanatory view for describing the outline of the input interface according to the embodiment.
[0016] FIG. 5 is a block diagram illustrating an example of a functional configuration of the information processing system according to the embodiment.
[0017] FIG. 6 is an explanatory diagram for describing an example of the input interface according to the embodiment.
[0018] FIG. 7 is a flowchart illustrating an example of a flow of a series of processing of the information processing system according to the present embodiment.
[0019] FIG. 8 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0020] FIG. 9 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0021] FIG. 10 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0022] FIG. 11 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0023] FIG. 12 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0024] FIG. 13 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0025] FIG. 14 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0026] FIG. 15 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0027] FIG. 16 is an explanatory diagram for describing an example of the information processing system according to the embodiment.
[0028] FIG. 17 is an explanatory diagram for describing an example of a user interface according to a first modification.
[0029] FIG. 18 is an explanatory diagram for describing an example of a user interface according to a second modification.
[0030] FIG. 19 is a functional block diagram illustrating a configuration example of a hardware configuration of an information processing apparatus configuring an information processing system according to an embodiment of the present disclosure.
MODE FOR CARRYING OUT THE INVENTION
[0031] Favorable embodiments of the present disclosure will be described in detail with reference to the appended drawings. Note that, in the present specification and drawings, overlapping description of configuration elements having substantially the same functional configuration is omitted by providing the same sign.
[0032] Note that the description will be given in the following order.
[0033] 1. Schematic Configuration
[0034] 1.1. System Configuration
[0035] 1.2. Configuration of Input/output Device
[0036] 2. Study Regarding User Interface
[0037] 3. Technical Characteristics
[0038] 3.1. Outline of Input Interface
[0039] 3.2. Functional Configuration
[0040] 3.3. Processing
[0041] 3.4. Example
[0042] 3.5. Modification
[0043] 4. Hardware Configuration
[0044] 5. Conclusion
[0045] <<1. Schematic Configuration>>
[0046] <1.1. System Configuration>
[0047] First, an example of a schematic configuration of an information processing system according to an embodiment of the present disclosure will be described with reference to FIG. 1. FIG. 1 illustrates an example of a case of presenting various types of content to a user by applying a so-called augmented reality (AR) technology.
[0048] In FIG. 1, the reference sign m111 schematically represents an object (for example, a real object) located in a real space. Furthermore, the reference signs v131 and v133 schematically represent virtual content (for example, virtual objects) presented to be superimposed in the real space. In other words, an information processing system 1 according to the present embodiment superimposes the virtual objects on the object in the real space such as the real object m111 on the basis of the AR technology, for example, and presents the superimposed objects to the user. Note that, in FIG. 1, both the real object and the virtual objects are presented for easy understanding of the characteristics of the information processing system according to the present embodiment.
[0049] As illustrated in FIG. 1, an information processing system 1 according to the present embodiment includes an information processing apparatus 10 and an input/output device 20. The information processing apparatus 10 and the input/output device 20 are configured to be able to transmit and receive information to and from each other via a predetermined network. Note that the type of network connecting the information processing apparatus 10 and the input/output device 20 is not particularly limited. As a specific example, the network may be configured by a so-called wireless network such as a network based on a Wi-Fi (registered trademark) standard. Furthermore, as another example, the network may be configured by the Internet, a dedicated line, a local area network (LAN), a wide area network (WAN), or the like. Furthermore, the network may include a plurality of networks, and at least a part of the networks may be configured as a wired network.
[0050] The input/output device 20 is configured to obtain various types of input information and present various types of output information to the user who holds the input/output device 20. Furthermore, the presentation of the output information by the input/output device 20 is controlled by the information processing apparatus 10 on the basis of the input information acquired by the input/output device 20. For example, the input/output device 20 acquires, as the input information, information for recognizing the real object m111 (for example, a captured image of the real space), and outputs the acquired information to the information processing apparatus 10. The information processing apparatus 10 recognizes the position of the real object m111 in the real space on the basis of the information acquired from the input/output device 20, and causes the input/output device 20 to present the virtual objects v131 and v133 on the basis of the recognition result. With such control, the input/output device 20 can present, to the user, the virtual objects v131 and v133 such that the virtual objects v131 and v133 are superimposed on the real object m111 on the basis of the so-called AR technology.
[0051] Furthermore, the input/output device 20 is configured as, for example, a so-called head-mounted device that the user wears on at least a part of the head and uses, and may be configured to be able to detect a line of sight of the user. In a case where the information processing apparatus 10 recognizes that the user is gazing at a desired target (for example, the real object m111 or the virtual objects v131 and v133, or the like) on the basis of the detection result of the line of sight of the user by the input/output device 20, the information processing apparatus 10 may specify the target as an operation target, on the basis of such a configuration. Furthermore, the information processing apparatus 10 may specify the target to which the line of sight of the user is directed as the operation target in response to a predetermined operation to the input/output device 20 as a trigger. As described above, the information processing apparatus 10 may provide various services to the user via the input/output device 20 by specifying the operation target and executing processing associated with the operation target.
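As an illustration of the gaze-based specification of the operation target described above, the following is a minimal sketch in Python. It assumes that the line of sight is available as an eye position and a direction vector and that each candidate target is represented by a single 3D point. The function name `pick_gaze_target`, the 5-degree tolerance, and the coordinates are hypothetical and do not come from the disclosure.

```python
import math


def pick_gaze_target(eye_pos, gaze_dir, targets, max_angle_deg=5.0):
    """Return the name of the target nearest to the gaze direction, or None.

    eye_pos, gaze_dir: 3-element sequences (assumed representation).
    targets: dict mapping a target name to its 3D position (assumed).
    max_angle_deg: hypothetical angular tolerance for "gazing at" a target.
    """
    gaze_norm = math.sqrt(sum(c * c for c in gaze_dir))
    if gaze_norm == 0.0:
        raise ValueError("gaze_dir must be a non-zero vector")

    best_name = None
    best_angle = math.radians(max_angle_deg)
    for name, pos in targets.items():
        to_target = [p - e for p, e in zip(pos, eye_pos)]
        dist = math.sqrt(sum(c * c for c in to_target))
        if dist == 0.0:
            continue  # target coincides with the eye position; skip it
        cos_angle = sum(g * t for g, t in zip(gaze_dir, to_target)) / (gaze_norm * dist)
        # Clamp to [-1, 1] to guard against rounding before acos.
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name


# Example with made-up positions for the objects named in FIG. 1.
targets = {"m111": (0.0, 0.0, 2.0), "v131": (1.0, 0.0, 2.0)}
print(pick_gaze_target((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), targets))
```

Selecting the smallest angle within a tolerance is only one possible criterion; an actual apparatus might instead intersect the gaze ray with the shape of each object.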
[0052] Furthermore, the information processing apparatus 10 may recognize a motion of at least a part of the body of the user (for example, change in position or orientation, a gesture, or the like) as an operation input of the user on the basis of the input information acquired by the input/output device 20, and execute various types of processing according to the recognition result of the operation input. As a specific example, the input/output device 20 acquires, as the input information, information for recognizing a hand of the user (for example, a captured image of the hand), and outputs the acquired information to the information processing apparatus 10. The information processing apparatus 10 recognizes the motion of the hand (for example, a gesture) on the basis of the information acquired from the input/output device 20, and recognizes an instruction from the user (in other words, the operation input of the user) according to the recognition result of the motion. Then, the information processing apparatus 10 may control display of a virtual object to be presented to the user (for example, the display position and posture of the virtual object) according to the recognition result of the operation input of the user, for example. Note that, in the present disclosure, the “operation input of the user” may be regarded as an input corresponding to the instruction from the user, that is, an input reflecting the user’s intention, as described above. Hereinafter, the “operation input of the user” may be simply referred to as “user input”.
[0053] Note that, in FIG. 1, the input/output device 20 and the information processing apparatus 10 are illustrated as devices different from each other. However, the input/output device 20 and the information processing apparatus 10 may be integrally configured. Furthermore, details of the configurations and processing of the input/output device 20 and the information processing apparatus 10 will be separately described below.
[0054] An example of a schematic configuration of the information processing system according to the embodiment of the present disclosure has been described with reference to FIG. 1.
[0055] <1.2. Configuration of Input/output Device>
[0056] Next, an example of a schematic configuration of the input/output device 20 according to the present embodiment illustrated in FIG. 1 will be described with reference to FIG. 2. FIG. 2 is an explanatory diagram for describing an example of a schematic configuration of the input/output device according to the present embodiment.
[0057] As described above, the input/output device 20 according to the present embodiment is configured as a so-called head-mounted device that the user wears on at least a part of the head and uses. For example, in the example illustrated in FIG. 2, the input/output device 20 is configured as a so-called eyewear type (glasses type) device, and at least one of a lens 293a or 293b is configured as a transmission type display (display unit 211). Furthermore, the input/output device 20 includes imaging units 201a and 201b, an operation unit 207, and a holding unit 291 corresponding to a frame of glasses. Furthermore, the input/output device 20 may include imaging units 203a and 203b. Note that, hereinafter, various descriptions will be given on the assumption that the input/output device 20 includes the imaging units 203a and 203b. The holding unit 291 holds the display unit 211, the imaging units 201a and 201b, the imaging units 203a and 203b, and the operation unit 207 to have a predetermined positional relationship with respect to the head of the user when the input/output device 20 is mounted on the head of the user. Furthermore, although not illustrated in FIG. 2, the input/output device 20 may be provided with a sound collection unit for collecting a voice of the user.
[0058] Here, a more specific configuration of the input/output device 20 will be described. For example, in the example illustrated in FIG. 2, the lens 293a corresponds to a lens on a right eye side, and the lens 293b corresponds to a lens on a left eye side. In other words, the holding unit 291 holds the display unit 211 such that the display unit 211 (in other words, the lenses 293a and 293b) is located in front of the eyes of the user in a case where the input/output device 20 is mounted.
[0059] The imaging units 201a and 201b are configured as so-called stereo cameras and are held by the holding unit 291 to face the direction in which the head of the user faces (in other words, the front of the user) when the input/output device 20 is mounted on the head of the user. At this time, the imaging unit 201a is held near the user’s right eye, and the imaging unit 201b is held near the user’s left eye. On the basis of such a configuration, the imaging units 201a and 201b capture an object located in front of the input/output device 20 (in other words, a real object located in the real space) from different positions. Thereby, the input/output device 20 acquires images of the object located in front of the user and can calculate the distance from the input/output device 20 (and accordingly from the position of the viewpoint of the user) to the object on the basis of the parallax between the images respectively captured by the imaging units 201a and 201b.
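The parallax-based distance calculation mentioned above can be sketched with the standard relation for a rectified stereo pair, Z = f × B / d, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the disparity in pixels. The following Python sketch assumes rectified pinhole cameras; the function name and the numeric values are illustrative assumptions, not values from the disclosure.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Distance (in meters) to a point seen by a rectified stereo pair.

    focal_length_px: focal length in pixels (assumed known from calibration).
    baseline_m: distance between the two camera centers in meters.
    disparity_px: horizontal shift of the point between the two images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite distance")
    return focal_length_px * baseline_m / disparity_px


# Illustrative numbers only: 700 px focal length, 6 cm baseline, 21 px disparity.
print(depth_from_disparity(700.0, 0.06, 21.0))
```

Because the distance is inversely proportional to the disparity, a fixed disparity error causes a larger distance error for far objects than for near ones.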
[0060] Note that the configuration and method are not particularly limited as long as the distance between the input/output device 20 and the object can be measured. As a specific example, the distance between the input/output device 20 and the object may be measured on the basis of a method such as multi-camera stereo, moving parallax, time of flight (TOF), or structured light. Here, TOF is a method of obtaining an image including a distance (depth) to an object (a so-called distance image) by projecting light such as infrared light onto the object and measuring, for each pixel, the time required for the projected light to be reflected by the object and return. Furthermore, structured light is a method of obtaining a distance image including a distance (depth) to an object by irradiating the object with a pattern of light such as infrared light, capturing the pattern, and using the change in the pattern obtained from the capture result. Furthermore, moving parallax is a method of measuring a distance to an object on the basis of a parallax even with a so-called monocular camera. Specifically, the object is captured from mutually different viewpoints by moving the camera, and the distance to the object is measured on the basis of the parallax between the captured images. Note that, at this time, the distance to the object can be measured with more accuracy by recognizing the moving distance and the moving direction of the camera with various sensors. Note that the configuration of the imaging unit (for example, the monocular camera, the stereo camera, or the like) may be changed according to the distance measuring method.
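For the TOF method described above, the distance follows from the round-trip time of the projected light: since the light travels to the object and back, the distance is c × t / 2, where c is the speed of light. The following Python sketch applies this per pixel to build a distance image. The function names and the list-of-lists image representation are assumptions made for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance(round_trip_time_s):
    """Distance in meters for one measured round-trip time in seconds."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must not be negative")
    # The light covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


def distance_image(round_trip_times):
    """Convert a 2D list of per-pixel round-trip times into a distance image."""
    return [[tof_distance(t) for t in row] for row in round_trip_times]


# A round trip of about 6.67 nanoseconds corresponds to roughly 1 meter.
print(tof_distance(6.67e-9))
```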
[0061] Furthermore, the imaging units 203a and 203b are held by the holding unit 291 such that eyeballs of the user are located within respective imaging ranges when the input/output device 20 is mounted on the head of the user. As a specific example, the imaging unit 203a is held such that the user’s right eye is located within the imaging range. The direction in which the line of sight of the right eye is directed can be recognized on the basis of an image of the eyeball of the right eye captured by the imaging unit 203a and a positional relationship between the imaging unit 203a and the right eye, on the basis of such a configuration. Similarly, the imaging unit 203b is held such that the user’s left eye is located within the imaging range. In other words, the direction in which the line of sight of the left eye is directed can be recognized on the basis of an image of the eyeball of the left eye captured by the imaging unit 203b and a positional relationship between the imaging unit 203b and the left eye. Note that the example in FIG. 2 illustrates the configuration in which the input/output device 20 includes both the imaging units 203a and 203b. However, only one of the imaging units 203a and 203b may be provided.
[0062] The operation unit 207 is configured to receive an operation on the input/output device 20 from the user. The operation unit 207 may be configured by, for example, an input device such as a touch panel or a button. The operation unit 207 is held at a predetermined position of the input/output device 20 by the holding unit 291. For example, in the example illustrated in FIG. 2, the operation unit 207 is held at a position corresponding to a temple of the glasses.
[0063] Furthermore, the input/output device 20 according to the present embodiment may be provided with, for example, an acceleration sensor and an angular velocity sensor (gyro sensor) and configured to be able to detect a motion of the head (in other words, a posture of the input/output device 20 itself) of the user wearing the input/output device 20. As a specific example, the input/output device 20 may detect components in a yaw direction, a pitch direction, and a roll direction as the motion of the head of the user, thereby recognizing change in at least one of the position or posture of the head of the user.
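One simple way to obtain the change in head posture from an angular velocity sensor is to integrate the measured angular rates over time. The Python sketch below sums the yaw, pitch, and roll rates independently, which is an assumption that only holds approximately for small rotations and ignores sensor drift and coupling between the axes. The function name and the sample format are hypothetical.

```python
def integrate_gyro(samples, dt):
    """Accumulate yaw, pitch, and roll changes from angular-rate samples.

    samples: iterable of (yaw_rate, pitch_rate, roll_rate) in degrees per second.
    dt: sampling interval in seconds (assumed constant).
    Returns the accumulated (yaw, pitch, roll) change in degrees.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    yaw = pitch = roll = 0.0
    for yaw_rate, pitch_rate, roll_rate in samples:
        # Rectangular (Euler) integration of each axis independently.
        yaw += yaw_rate * dt
        pitch += pitch_rate * dt
        roll += roll_rate * dt
    return yaw, pitch, roll


# One second of samples at 100 Hz while the head turns at 10 degrees per second.
print(integrate_gyro([(10.0, 0.0, -5.0)] * 100, 0.01))
```

In practice, such an integration is usually combined with other measurements (for example, the acceleration sensor or the camera images) to limit accumulated error.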
[0064] The input/output device 20 according to the present embodiment can recognize changes in its own position and posture in the real space according to the motion of the head of the user on the basis of the above configuration. Furthermore, at this time, the input/output device 20 can present the virtual content (in other words, the virtual object) on the display unit 211 to superimpose the virtual content on the real object located in the real space on the basis of the so-called AR technology. Furthermore, at this time, the input/output device 20 may estimate the position and posture (in other words, self-position) of the input/output device 20 itself in the real space and use an estimation result for the presentation of the virtual object on the basis of a technology called simultaneous localization and mapping (SLAM) or the like, for example.
[0065] Here, as a reference, an outline of the SLAM will be described. The SLAM is a technology for performing self-position estimation and creation of an environmental map in parallel by using an imaging unit such as a camera, various sensors, an encoder, and the like. As a more specific example, in the SLAM (in particular, Visual SLAM), a three-dimensional shape of a captured scene (or object) is sequentially restored on the basis of a moving image captured by the imaging unit. Then, by associating the restoration result of the captured scene with the detection result of the position and posture of the imaging unit, the creation of a map of a surrounding environment, and the estimation of the position and posture of the imaging unit (the input/output device 20, accordingly) in the environment are performed. Note that the position and posture of the imaging unit can be estimated as information indicating relative change on the basis of the detection result of the sensor by providing various sensors such as an acceleration sensor and an angular velocity sensor to the input/output device 20, for example. Of course, the estimation method is not necessarily limited to the method based on detection results of the various sensors such as an acceleration sensor and an angular velocity sensor as long as the position and posture of the imaging unit can be estimated.
[0066] Furthermore, examples of a head mounted display (HMD) device applicable to the input/output device 20 include a see-through HMD, a video see-through HMD, and a retinal projection HMD.
[0067] The see-through HMD uses, for example, a half mirror or a transparent light guide plate to hold a virtual image optical system including a transparent light guide or the like in front of the eyes of the user, and displays an image inside the virtual image optical system. Therefore, the user wearing the see-through HMD can take the external scenery into view while viewing the image displayed inside the virtual image optical system. With such a configuration, the see-through HMD can superimpose an image of the virtual object on an optical image of the real object located in the real space according to the recognition result of at least one of the position or posture of the see-through HMD on the basis of the AR technology, for example. Note that a specific example of the see-through HMD includes a so-called glasses-type wearable device in which a portion corresponding to a lens of glasses is configured as a virtual image optical system. For example, the input/output device 20 illustrated in FIG. 2 corresponds to an example of the see-through HMD.
[0068] In a case where the video see-through HMD is mounted on the head or face of the user, the video see-through HMD is mounted to cover the eyes of the user, and a display unit such as a display is held in front of the eyes of the user. Furthermore, the video see-through HMD includes an imaging unit for capturing surrounding scenery, and causes the display unit to display an image of the scenery in front of the user captured by the imaging unit. With such a configuration, the user wearing the video see-through HMD has a difficulty in directly taking the external scenery into view but the user can confirm the external scenery with the image displayed on the display unit. Furthermore, at this time, the video see-through HMD may superimpose the virtual object on an image of the external scenery according to the recognition result of at least one of the position or posture of the video see-through HMD on the basis of the AR technology, for example.
[0069] The retinal projection HMD has a projection unit held in front of the eyes of the user, and an image is projected from the projection unit toward the eyes of the user such that the image is superimposed on the external scenery. More specifically, in the retinal projection HMD, an image is directly projected from the projection unit onto the retinas of the eyes of the user, and the image is imaged on the retinas. With such a configuration, the user can view a clearer image even in a case where the user has myopia or hyperopia. Furthermore, the user wearing the retinal projection HMD can take the external scenery into view even while viewing the image projected from the projection unit. With such a configuration, the retinal projection HMD can superimpose an image of the virtual object on an optical image of the real object located in the real space according to the recognition result of at least one of the position or posture of the retinal projection HMD on the basis of the AR technology, for example.
[0070] Furthermore, in the above description, an example of the configuration of the input/output device 20 according to the present embodiment has been described on the assumption that the AR technology is applied. However, the above description does not necessarily limit the configuration of the input/output device 20. For example, in a case of assuming application of a VR technology, the input/output device 20 according to the present embodiment may be configured as an HMD called immersive HMD. The immersive HMD is mounted to cover the eyes of the user, and a display unit such as a display is held in front of the eyes of the user, similarly to the video see-through HMD. Therefore, the user wearing the immersive HMD has a difficulty in directly taking an external scenery (in other words, scenery of a real world) into view, and only an image displayed on the display unit comes into view. With such a configuration, the immersive HMD can provide an immersive feeling to the user who is viewing the image.
[0071] An example of the schematic configuration of the input/output device according to the embodiment of the present disclosure has been described with reference to FIG. 2.
[0072] <<2. Study Regarding User Interface>>
[0073] Next, issues of the information processing apparatus according to the present embodiment will be addressed after a user interface assuming a situation where the head-mounted device is used is discussed.
[0074] Examples of the input interface for the user to input various types of information to the information processing apparatus include input devices such as a button, a switch, and a touch sensor. In the head-mounted device such as the input/output device 20 described with reference to FIG. 2, there are some cases where the input devices such as a button and a touch sensor (for example, the operation unit 207 illustrated in FIG. 2 or the like) are provided in a part (for example, a part of the holding unit that holds the display unit, the imaging unit, and the like) of a housing, for example.
[0075] Meanwhile, because the head-mounted device is used by being worn on the head, there are some cases where the user has difficulty directly viewing the input device provided in a part of the housing, and such cases are less convenient than a case where the user can directly view the input interface.
[0076] Furthermore, under circumstances where the input interface provided in the housing that holds the display unit and the imaging unit is operated, the housing is vibrated due to the operation of the input interface, and there are some cases where the vibration is transmitted to the display unit and the imaging unit held by the housing. Under such circumstances, for example, the relative positional relationship between the user’s eyes and the display unit and the imaging unit changes, and there are some cases where the real object and the virtual object presented to be superimposed on the real object are not visually recognized by the user in a correct positional relationship.
[0077] To cope with such cases, gesture input is sometimes adopted as the input interface for inputting various types of information to the information processing apparatus without using input devices such as a button and a switch. In gesture input, for example, a gesture using a part of the body such as a hand is recognized by analyzing an image captured by the imaging unit or the like, and a user input is recognized according to the recognition result of the gesture. Thereby, the user can input information to the information processing apparatus by a more intuitive operation such as a gesture without operating the input device provided in the housing (in other words, an input device that is difficult to visually recognize).
[0078] However, since gesture input requires relatively high-load processing such as image recognition, power consumption tends to be larger. Meanwhile, many head-mounted devices such as the one described with reference to FIG. 2 are worn on the head and are carried and used like smartphones, and are therefore driven by batteries. In such devices, a configuration that further reduces the power consumption is more desirable.
[0079] In view of the above situation, the present disclosure proposes an example of a technology capable of recognizing a user input without using an input device provided in a housing of an apparatus while further reducing the processing load related to the recognition.
[0080] <<3. Technical Characteristics>>
[0081] Hereinafter, technical characteristics of the information processing apparatus according to an embodiment of the present disclosure will be described.
[0082] <3.1. Outline of Input Interface>
[0083] First, an outline of an example of the input interface for the information processing apparatus according to an embodiment of the present disclosure to recognize the user input will be described with reference to FIGS. 3 and 4. FIGS. 3 and 4 are explanatory views for describing the outline of the input interface according to the present embodiment.
[0084] The information processing apparatus 10 according to the present embodiment uses, for example, an imaging unit that captures an image of an external environment (for example, an imaging unit used for the recognition of the real object, the self-position estimation, and the like), such as a stereo camera provided in the head-mounted device, for the recognition of the user input. Therefore, in the present description, the outline of the input interface according to the present embodiment will be described by taking a case where the imaging units 201a and 201b are used to recognize the user input in the input/output device 20 described with reference to FIG. 2, as an example.
[0085] In the information processing system according to the present embodiment, the user can issue various instructions to the information processing apparatus 10 by covering at least a part of the imaging units 201a and 201b with a part such as a hand. In other words, the information processing apparatus 10 recognizes the user input according to whether or not at least one of the imaging units 201a and 201b is in a predetermined shielding state. Note that the predetermined shielding state includes, for example, a state in which substantially an entire angle of view of a desired imaging unit is shielded. In the following description, it is assumed that the predetermined shielding state indicates the state in which substantially the entire angle of view of a desired imaging unit is shielded. However, the present embodiment is not necessarily limited to this state.
[0086] For example, FIG. 3 illustrates a situation in which the angle of view of the imaging unit 201a is shielded by a hand U11 of the user. In this case, the information processing apparatus 10 determines whether or not substantially an entire angle of view of the imaging unit 201a is shielded on the basis of a predetermined method, and recognizes that a predetermined input has been performed by the user (in other words, recognizes the user input) in a case of determining that substantially the entire angle of view is shielded. Note that the imaging unit 201a corresponds to an example of a “first imaging unit”. In other words, the above determination regarding the shielding state of the imaging unit 201a (for example, determination as to whether or not substantially the entire angle of view of the imaging unit 201a is shielded) corresponds to an example of “first determination”.
[0087] Further, FIG. 4 illustrates a situation in which the angle of view of the imaging unit 201b is shielded by a hand U13 of the user. In this case, the information processing apparatus 10 determines whether or not substantially an entire angle of view of the imaging unit 201b is shielded, and recognizes the user input according to the determination result, similarly to the example described with reference to FIG. 3. Note that the imaging unit 201b corresponds to an example of a “second imaging unit”. In other words, the above determination regarding the shielding state of the imaging unit 201b corresponds to an example of “second determination”.
[0088] Note that the determination method is not particularly limited as long as whether or not substantially the entire angles of view of the imaging units 201a and 201b are shielded can be determined. As a specific example, the information processing apparatus 10 may determine whether or not substantially the entire angles of view of the imaging units 201a and 201b are shielded on the basis of brightness of images respectively captured by the imaging units 201a and 201b. Note that a method of determining whether or not substantially an entire angle of view of a predetermined imaging unit is shielded according to brightness of an image captured by the imaging unit will be described below in detail as an example. Furthermore, as another example, whether or not substantially the entire angles of view of the imaging units 201a and 201b are shielded may be determined using various sensors such as a proximity sensor and a distance measuring sensor. In this case, in a case where each of the imaging units 201a and 201b and a shielding object are located close enough to shield substantially the entire angle of view of the imaging unit (in other words, the detection result of the distance between the imaging unit and the shielding object is equal to or smaller than a threshold value), it may be determined that substantially the entire angle of view is shielded.
[0089] With the above configuration, the information processing apparatus 10 can recognize the user input according to which of the imaging units 201a and 201b has substantially its entire angle of view shielded, for example.
[0090] Furthermore, as another example, the information processing apparatus 10 may recognize the user input according to a combination of the imaging units, of the imaging units 201a and 201b, of which substantially the entire angles of view are shielded. In other words, in a case where substantially the entire angles of view of both the imaging units 201a and 201b are shielded, the information processing apparatus 10 can recognize that an input different from the input in the case where substantially the entire angle of view of only one of the imaging units 201a and 201b is shielded has been performed.
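As a non-authoritative illustration of the combination-based recognition described above, the following Python sketch maps the two shielding determinations to distinct inputs. The names `UserInput` and `recognize_by_combination`, as well as the particular labels assigned to each combination, are assumptions introduced only for this sketch and do not appear in the present disclosure.

```python
from enum import Enum
from typing import Optional


class UserInput(Enum):
    """Hypothetical input labels; the disclosure does not name them."""
    ONLY_201A = "only imaging unit 201a shielded"
    ONLY_201B = "only imaging unit 201b shielded"
    BOTH = "both imaging units shielded"


def recognize_by_combination(shielded_a: bool, shielded_b: bool) -> Optional[UserInput]:
    """Map the first and second determination results to a user input.

    Returns None when neither angle of view is shielded (no input).
    """
    if shielded_a and shielded_b:
        # Shielding both units is treated as an input distinct from
        # shielding only one of them.
        return UserInput.BOTH
    if shielded_a:
        return UserInput.ONLY_201A
    if shielded_b:
        return UserInput.ONLY_201B
    return None
```

With two imaging units, this kind of mapping can distinguish up to three inputs without any image recognition beyond the shielding determination itself.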
[0091] An outline of an example of the input interface for the information processing apparatus according to the embodiment of the present disclosure to recognize the user input has been described with reference to FIGS. 3 and 4.
[0092] <3.2. Functional Configuration>
[0093] Next, an example of a functional configuration of the information processing system 1 according to the present embodiment will be described with reference to FIG. 5. FIG. 5 is a block diagram illustrating an example of a functional configuration of the information processing system 1 according to the present embodiment. Hereinafter, the respective configurations of the information processing apparatus 10 and the input/output device 20 will be described in more detail on the assumption that the information processing system 1 includes the information processing apparatus 10 and the input/output device 20, as described with reference to FIG. 1. Note that, as illustrated in FIG. 5, the information processing system 1 may include a storage unit 190.
[0094] First, the configuration of the input/output device 20 will be described. As illustrated in FIG. 5, the input/output device 20 includes imaging units 201a and 201b and an output unit 210. The output unit 210 includes a display unit 211. Furthermore, the output unit 210 may include an audio output unit 213. The imaging units 201a and 201b correspond to the imaging units 201a and 201b described with reference to FIG. 2. Note that, in a case where the imaging units 201a and 201b are not particularly distinguished, the imaging units 201a and 201b may be simply referred to as “imaging unit 201”. Furthermore, the display unit 211 corresponds to the display unit 211 described with reference to FIG. 2. Furthermore, the audio output unit 213 includes an audio device such as a speaker and outputs sound and audio according to information to be output.
[0095] Next, the configuration of the information processing apparatus 10 will be described. As illustrated in FIG. 5, the information processing apparatus 10 includes a determination unit 101, a recognition unit 103, a processing execution unit 105, and an output control unit 107.
[0096] The determination unit 101 acquires information according to a capture result of an image from the imaging unit 201 and determines whether or not substantially the entire angle of view of the imaging unit 201 is shielded by some sort of real object (for example, a hand of the user or the like) according to the acquired information.
[0097] For example, the determination unit 101 may acquire the image captured by the imaging unit 201 from the imaging unit 201, and determine whether or not substantially the entire angle of view of the imaging unit 201 is shielded according to the brightness (for example, distribution of luminance for each pixel) of the acquired image. As a more specific example, the determination unit 101 may calculate an average value of the luminance of pixels of the acquired image, and determine that substantially the entire angle of view of the imaging unit 201 that has captured the image is shielded in a case where the calculated average value is equal to or smaller than a threshold value.
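The average-luminance determination described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the image is assumed to be a grayscale array of 8-bit luminance values given as nested sequences, and the function names `mean_luminance` and `is_angle_of_view_shielded` are hypothetical. The threshold is left as a parameter because its value depends on the imaging unit.

```python
from typing import Sequence


def mean_luminance(image: Sequence[Sequence[int]]) -> float:
    """Average the per-pixel luminance (0 to 255) of a grayscale image."""
    count = sum(len(row) for row in image)
    if count == 0:
        raise ValueError("image contains no pixels")
    return sum(sum(row) for row in image) / count


def is_angle_of_view_shielded(image: Sequence[Sequence[int]], threshold: float) -> bool:
    """Return True when the mean luminance is equal to or smaller than the threshold."""
    return mean_luminance(image) <= threshold


# Usage with tiny illustrative images and an arbitrary threshold of 50.
dark_image = [[10, 12], [8, 14]]
bright_image = [[200, 180], [220, 190]]
print(is_angle_of_view_shielded(dark_image, threshold=50))
print(is_angle_of_view_shielded(bright_image, threshold=50))
```

Because the determination reduces to one pass over the pixels and a single comparison, it is far lighter than gesture recognition, which is consistent with the stated aim of reducing the processing load.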
[0098] Furthermore, as another example, the determination unit 101 may acquire the captured image from the imaging unit 201, and determine that substantially the entire angle of view of the imaging unit 201 is shielded in a case of determining that recognition of the object (in other words, the real object) in the real space is difficult on the basis of the acquired image. As a more specific example, the determination unit 101 may determine that substantially the entire angle of view of the imaging unit 201 that has captured the image is shielded in a case where extraction of characteristic points for recognizing the real object from the acquired image is difficult (for example, the number of extracted characteristic points is equal to or smaller than a threshold value).
[0099] Of course, the above-described examples are mere examples, and the determination method is not particularly limited as long as the determination unit 101 can determine whether or not substantially the entire angle of view of the imaging unit 201 is shielded. As a specific example, in a case where the determination unit 101 detects proximity of the real object to the imaging unit 201 with a distance measuring sensor, a proximity sensor, or the like, the determination unit 101 may determine that substantially the entire angle of view of the imaging unit 201 is shielded.
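As a hedged illustration of the sensor-based alternative, the following Python sketch treats an imaging unit as shielded when the measured distance to the nearest object is equal to or smaller than a threshold value, and applies the same determination to each candidate imaging unit. The function names, the unit labels used as dictionary keys, and the default threshold of 10 mm are assumptions made for this sketch only; the present disclosure does not specify them.

```python
from typing import Dict


def is_shielded_by_proximity(distance_mm: float, threshold_mm: float = 10.0) -> bool:
    """Shielded when the object is at or within the (hypothetical) threshold distance."""
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    return distance_mm <= threshold_mm


def determine_each(distances_mm: Dict[str, float], threshold_mm: float = 10.0) -> Dict[str, bool]:
    """Produce one determination result per imaging unit."""
    return {
        unit: is_shielded_by_proximity(distance, threshold_mm)
        for unit, distance in distances_mm.items()
    }


# Usage: a hand close to imaging unit 201a, nothing near imaging unit 201b.
print(determine_each({"201a": 3.0, "201b": 150.0}))
```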
[0100] Note that the number of imaging units 201 to be determined by the determination unit 101 is not particularly limited. As a specific example, the determination unit 101 may determine only one of the imaging units 201a and 201b or may determine both of the imaging units 201a and 201b. Furthermore, the determination unit 101 may determine another imaging unit other than the imaging units 201a and 201b. In other words, the determination unit 101 may determine three or more imaging units.
[0101] Furthermore, timing when the determination unit 101 performs the above-described determination is not particularly limited. As a specific example, the determination unit 101 may perform the above-described determination periodically at each predetermined timing. Furthermore, as another example, the determination unit 101 may perform the above-described determination in response to a predetermined trigger. As a specific example, the determination unit 101 may perform the above-described determination in a case where predetermined display information such as an operation menu for prompting the user input is displayed on the display unit 211. In this case, the determination unit 101 may recognize whether or not the predetermined display information is displayed on the display unit 211 on the basis of, for example, a notification from the output control unit 107 described below.
[0102] Then, the determination unit 101 notifies the recognition unit 103 of information indicating the determination result as to whether or not substantially the entire angle of view of the imaging unit 201 is shielded. At this time, for example, the determination unit 101 may notify the recognition unit 103 of the information indicating the determination result in a case of determining that substantially the entire angle of view of a predetermined imaging unit 201 is shielded. Furthermore, the determination unit 101 may notify the recognition unit 103 of the information indicating the determination result for each imaging unit 201 in a case where a plurality of candidates of the imaging unit 201 to be determined exists.
[0103] The recognition unit 103 acquires the information indicating the determination result as to whether or not substantially the entire angle of view of the imaging unit 201 is shielded from the determination unit 101, and recognizes the user input on the basis of the acquired information. At this time, the recognition unit 103 may recognize the user input according to information related to the recognition of the user input displayed on the display unit 211 and the information indicating the determination result.
[0104] For example, FIG. 6 is an explanatory diagram for describing an example of an input interface according to the present embodiment, and illustrates an example of an operation menu presented via the display unit 211 of the input/output device 20. In FIG. 6, the reference sign V101 schematically represents an optical image in the real space visually recognized by the user. Furthermore, the reference sign V103 represents a region (in other words, a drawing region) where display information (for example, the virtual object) is presented via the display unit 211. Furthermore, the reference signs V105 and V107 represent examples of the display information presented as operation menus. Specifically, the display information V105 is associated with an operation menu meaning permission of execution of predetermined processing, and the display information V107 is associated with an operation menu meaning cancellation of the execution of the processing.
[0105] Under the circumstances illustrated in FIG. 6, the recognition unit 103 recognizes that the operation menu corresponding to the display information V105 has been selected in a case where substantially the entire angle of view of the imaging unit 201b (in other words, the imaging unit 201b illustrated in FIG. 2) located on a relatively left side with respect to the user wearing the input/output device 20 is shielded, for example. In this case, the recognition unit 103 recognizes that the user has issued an instruction to affirm execution of the predetermined processing. In other words, the recognition unit 103 recognizes the above-described operation by the user as a user input meaning affirmation.
[0106] Furthermore, the recognition unit 103 recognizes that the operation menu corresponding to the display information V107 has been selected in a case where substantially the entire angle of view of the imaging unit 201a (in other words, the imaging unit 201a illustrated in FIG. 2) located on a relatively right side with respect to the user wearing the input/output device 20 is shielded. In this case, the recognition unit 103 recognizes that the user has issued an instruction to cancel the execution of the predetermined processing. In other words, the recognition unit 103 recognizes the above operation by the user as a user input meaning cancellation.
[0107] Note that the recognition unit 103 may execute the above-described processing regarding recognition of the user input in response to a predetermined trigger. As a specific example, the recognition unit 103 may execute the processing regarding recognition of the user input in a case where the predetermined display information such as an operation menu for prompting the user input is displayed on the display unit 211. In this case, the recognition unit 103 may recognize whether or not the predetermined display information is displayed on the display unit 211 on the basis of, for example, a notification from the output control unit 107.
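The menu selection of FIG. 6, together with the trigger condition described above, can be sketched in Python as follows. This is only an illustrative sketch: the function name `recognize_menu_selection`, the use of the reference signs as return values, and the handling of the case where both imaging units (or neither) are shielded are assumptions that the present disclosure does not specify.

```python
from typing import Mapping, Optional

# Reference signs of the display information in FIG. 6, reused here as labels.
AFFIRM = "V105"  # operation menu meaning permission of execution
CANCEL = "V107"  # operation menu meaning cancellation of execution


def recognize_menu_selection(shielded: Mapping[str, bool], menu_displayed: bool) -> Optional[str]:
    """Recognize which operation menu was selected, if any.

    `shielded` holds one determination result per imaging unit, keyed by
    the hypothetical labels "201a" (right side) and "201b" (left side).
    """
    if not menu_displayed:
        # Trigger condition: recognize only while the operation menu is displayed.
        return None
    left = shielded.get("201b", False)
    right = shielded.get("201a", False)
    if left and not right:
        return AFFIRM
    if right and not left:
        return CANCEL
    # Assumption of this sketch: neither or both shielded selects nothing.
    return None
```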
[0108] Then, the recognition unit 103 outputs information indicating the recognition result of the user input to the processing execution unit 105.
[0109] The processing execution unit 105 is a configuration for executing various functions (for example, applications) provided by the information processing apparatus 10 (in other words, the information processing system 1). For example, the processing execution unit 105 may extract a corresponding application from a predetermined storage unit (for example, the storage unit 190 described below) according to the recognition result of the user input by the recognition unit 103 and execute the extracted application. Furthermore, the processing execution unit 105 may control the operation of the application being executed according to the recognition result of the user input by the recognition unit 103. For example, the processing execution unit 105 may switch a subsequent operation of the application being executed according to the operation menu selected by the user. Furthermore, the processing execution unit 105 may output information indicating execution results of the various applications to the output control unit 107.
[0110] The output control unit 107 causes the output unit 210 to output various types of information to be output, thereby presenting the information to the user. For example, the output control unit 107 may present the display information to be output to the user by causing the display unit 211 to display the display information. Furthermore, the output control unit 107 may present the information to be output to the user by causing the audio output unit 213 to output audio corresponding to the information.
[0111] For example, the output control unit 107 may acquire the information indicating execution results of the various applications from the processing execution unit 105, and present output information corresponding to the acquired information to the user via the output unit 210. As a specific example, the output control unit 107 may cause the display unit 211 to display display information corresponding to an operation menu of a desired application, such as the display information V105 and V107 illustrated in FIG. 6, according to the execution result of the desired application. Furthermore, the output control unit 107 may cause the display unit 211 to display display information indicating the execution result of the desired application. Furthermore, the output control unit 107 may cause the audio output unit 213 to output output information according to the execution result of the desired application as sound or audio.
[0112] Furthermore, the output control unit 107 may notify the determination unit 101 and the recognition unit 103 of information indicating an output situation of various types of output information via the output unit 210. As a specific example, in a case where the output control unit 107 causes the display unit 211 to display the information regarding the operation of the user such as the display information V105 and V107 illustrated in FIG. 6, the output control unit 107 may notify the determination unit 101 and the recognition unit 103 that the information is being displayed.
[0113] The storage unit 190 is a storage region for temporarily or constantly storing various data. For example, the storage unit 190 may store data for the information processing apparatus 10 to execute various functions. As a more specific example, the storage unit 190 may store data (for example, a library) for executing various applications, management data for managing various settings, and the like.
[0114] Note that the functional configurations of the information processing system 1 illustrated in FIG. 5 are mere examples, and the functional configurations of the information processing system 1 are not necessarily limited to the example illustrated in FIG. 5 only as long as the processing of the above-described configurations can be implemented. As a specific example, the input/output device 20 and the information processing apparatus 10 may be integrally configured. Furthermore, as another example, the storage unit 190 may be included in the information processing apparatus 10 or may be configured as a recording medium outside the information processing apparatus 10 (for example, a recording medium externally attached to the information processing apparatus 10). Furthermore, as another example, a part of the configurations of the information processing apparatus 10 may be provided outside the information processing apparatus 10 (for example, a server or the like).
[0115] An example of the functional configurations of the information processing system 1 according to the present embodiment has been described with reference to FIG. 5.
[0116] <3.3. Processing>
[0117] Next, an example of a flow of a series of processing of the information processing system 1 according to the present embodiment will be described, focusing especially on the operation of the information processing apparatus 10, with reference to FIG. 7. FIG. 7 is a flowchart illustrating an example of a flow of a series of processing of the information processing system 1 according to the present embodiment.
[0118] First, the information processing apparatus 10 (the determination unit 101) acquires the information according to a capture result of an image from a predetermined imaging unit 201 held by the input/output device 20, and determines whether or not substantially the entire angle of view of the imaging unit 201 is shielded by some sort of real object (for example, a hand of the user or the like) according to the acquired information (S101).
[0119] In a case where the determination unit 101 determines that substantially the entire angle of view of the predetermined imaging unit 201 is shielded (S103, YES), the information processing apparatus 10 (recognition unit 103) recognizes the user input according to the imaging unit with the angle of view determined to be shielded (S105). Then, the information processing apparatus 10 executes processing according to the recognition result of the user input (S107). As a specific example, the information processing apparatus 10 (the processing execution unit 105) may execute a corresponding application according to the recognition result of the user input. Furthermore, the information processing apparatus 10 (output control unit 107) may present the output information according to the execution result of the application to the user via the output unit 210.
[0120] Furthermore, in a case where the determination unit 101 determines that substantially the entire angle of view of the predetermined imaging unit 201 is not shielded (S103, NO), the information processing apparatus 10 may transition to subsequent processing without executing the processing according to the reference signs S105 and S107.
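The flow of FIG. 7 can be outlined in Python as follows. This is a sketch under stated assumptions rather than the actual implementation: the function `run_once` and its four callable parameters are hypothetical stand-ins for the imaging unit 201, the determination unit 101, the recognition unit 103, and the processing execution unit 105, respectively.

```python
from typing import Callable, Optional, Sequence

Image = Sequence[Sequence[int]]


def run_once(
    capture: Callable[[], Image],
    is_shielded: Callable[[Image], bool],
    recognize: Callable[[], str],
    execute: Callable[[str], None],
) -> Optional[str]:
    """Run the series of processing S101 to S107 once."""
    image = capture()
    shielded = is_shielded(image)  # S101: acquire the capture result and determine
    if not shielded:               # S103, NO
        return None                # transition onward without S105 and S107
    user_input = recognize()       # S105: recognize the user input
    execute(user_input)            # S107: execute processing for that input
    return user_input
```

A caller might invoke `run_once` periodically or only in response to a trigger, such as while an operation menu is displayed.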
[0121] Note that the timing when the information processing apparatus 10 executes the series of processing represented by the reference signs S101 to S107 is not particularly limited. For example, the information processing apparatus 10 may execute the series of processes in response to a predetermined trigger. As a more specific example, in a case where the information processing apparatus 10 prompts the user to input information via the input/output device 20, the information processing apparatus 10 may execute the above-described series of processing.
[0122] An example of a flow of the series of processing of the information processing system 1 according to the present embodiment has been described, focusing especially on the operation of the information processing apparatus 10, with reference to FIG. 7.
[0123] <3.4. Examples>
[0124] Next, an example of the method of determining whether or not substantially the entire angle of view of a predetermined imaging unit is shielded on the basis of the brightness of an image captured by the imaging unit will be described, citing specific examples, with reference to FIGS. 8 to 16. FIGS. 8 to 16 are explanatory diagrams for describing an example of the information processing system according to the present embodiment.
[0125] First, an example illustrated in FIGS. 8 and 9 will be described. FIG. 8 illustrates an example of an image captured by a predetermined imaging unit in a case where the angle of view of the imaging unit is shielded by a hand, and illustrates a case in which the distance between the imaging unit and the hand is about 20 cm. In the example illustrated in FIG. 8, only a part of the angle of view of the imaging unit is shielded by the hand, and a background not shielded by the hand is in an identifiable situation. Furthermore, FIG. 9 is a graph illustrating distribution of the luminance of pixels of the image illustrated in FIG. 8. In FIG. 9, the horizontal axis represents the luminance of pixels and the vertical axis represents the frequency. Furthermore, in the example illustrated in FIG. 9, the luminance of each pixel indicates a value of 0 to 255, and the higher the value, the higher the luminance. As illustrated in FIG. 9, it can be seen that a large number of pixels with relatively high luminance are distributed in the case of the example illustrated in FIG. 8. This is presumed to be because, in the case of the example illustrated in FIG. 8, only a part of the angle of view of the imaging unit is shielded by the hand, and light of the external environment leaks in through the region not shielded by the hand.
[0126] Next, an example illustrated in FIGS. 10 and 11 will be described. FIG. 10 illustrates an example of an image captured by a predetermined imaging unit in a case where the angle of view of the imaging unit is shielded by a hand, and illustrates a case in which the distance between the imaging unit and the hand is about 10 cm. In the example illustrated in FIG. 10, a region shielded by the hand in the angle of view of the imaging unit is wider and the brightness of the entire image is also darker than the example illustrated in FIG. 8. Furthermore, FIG. 11 is a graph illustrating distribution of the luminance of pixels of the image illustrated in FIG. 10. Note that the horizontal axis and the vertical axis in FIG. 11 are similar to the graph illustrated in FIG. 9. As can be seen by comparing FIG. 11 with FIG. 9, more pixels with lower luminance are distributed in the image illustrated in FIG. 10 than the image illustrated in FIG. 8. In other words, it can be seen that the brightness of the entire image illustrated in FIG. 10 is darker than the brightness of the entire image illustrated in FIG. 8.
[0127] Next, an example illustrated in FIGS. 12 and 13 will be described. FIG. 12 illustrates an example of an image captured by a predetermined imaging unit in a case where the angle of view of the imaging unit is shielded by a hand, and illustrates a case in which the distance between the imaging unit and the hand is about 1 cm. In the example illustrated in FIG. 12, since almost the entire angle of view of the imaging unit is shielded, it is difficult to identify the background. Furthermore, FIG. 13 is a graph illustrating distribution of the luminance of pixels of the image illustrated in FIG. 12. Note that the horizontal axis and the vertical axis in FIG. 13 are similar to the graph illustrated in FIG. 9. As can be seen by comparing FIG. 13 with FIG. 11, more pixels with lower luminance are distributed in the image illustrated in FIG. 12 than the image illustrated in FIG. 10. Note that as illustrated in FIG. 13, each pixel exhibiting slightly brighter luminance than black is presumed to be caused by leakage of light of the external environment through a gap between the imaging unit and the hand.
[0128] Next, an example illustrated in FIGS. 14 and 15 will be described. FIG. 14 illustrates an example of an image captured by the predetermined imaging unit in a case where the angle of view of the imaging unit is shielded by a hand, and illustrates a case in which the distance between the imaging unit and the hand is about 1 mm. In the example illustrated in FIG. 14, since almost the entire angle of view of the imaging unit is shielded, it is difficult to identify the background, similarly to the example illustrated in FIG. 12. Furthermore, FIG. 15 is a graph illustrating distribution of the luminance of pixels of the image illustrated in FIG. 14. Note that the horizontal axis and the vertical axis in FIG. 15 are similar to the graph illustrated in FIG. 9. As can be seen by comparing FIG. 15 with FIG. 13, more pixels with lower luminance are distributed in the image illustrated in FIG. 14 than the image illustrated in FIG. 12. This is presumably because the gap between the imaging unit and the hand is narrower than in the example illustrated in FIGS. 12 and 13, and the amount of light leaking in from the external environment decreases accordingly.
[0129] According to the above description, in the case of the imaging unit used in the present examples, the case where the distribution of the luminance of the pixels of the captured image becomes the distribution as illustrated in FIG. 16 can be regarded as a boundary (threshold value) for determining whether or not substantially the entire angle of view of the imaging unit is shielded. In other words, in the imaging unit used in the present examples, in the case where the average value of the luminance of the pixels of the captured image shows a value equal to or smaller than 77, it can be regarded that substantially the entire angle of view of the imaging unit is shielded.
[0130] Note that the examples described in the present examples are mere examples, and it goes without saying that the threshold value for determining whether or not substantially the entire angle of view of the imaging unit is shielded can be changed as appropriate according to various conditions such as the configuration of the imaging unit, an installation position, and an installation method.
[0131] An example of the method of determining whether or not substantially the entire angle of view of a predetermined imaging unit is shielded on the basis of the brightness of an image captured by the imaging unit has been described, citing specific examples, with reference to FIGS. 8 to 16.
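As a hedged illustration, the luminance distributions discussed in the present examples and the boundary value of 77 can be related as in the following Python sketch. The function names are hypothetical, the image is assumed to be given as nested sequences of 8-bit luminance values, and the value 77 applies only to the imaging unit used in the present examples.

```python
from typing import List, Sequence

# Boundary reported for the imaging unit of the present examples; other
# imaging units or mounting conditions would need a different value.
THRESHOLD_MEAN_LUMINANCE = 77


def luminance_histogram(image: Sequence[Sequence[int]]) -> List[int]:
    """Count pixels per luminance value (horizontal axis 0 to 255, vertical axis frequency)."""
    histogram = [0] * 256
    for row in image:
        for value in row:
            if not 0 <= value <= 255:
                raise ValueError("luminance must be in the range 0 to 255")
            histogram[value] += 1
    return histogram


def mean_from_histogram(histogram: Sequence[int]) -> float:
    """Recover the average luminance from the distribution."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram is empty")
    return sum(value * frequency for value, frequency in enumerate(histogram)) / total


def is_shielded_in_examples(image: Sequence[Sequence[int]]) -> bool:
    """Shielded when the average luminance is equal to or smaller than 77."""
    return mean_from_histogram(luminance_histogram(image)) <= THRESHOLD_MEAN_LUMINANCE
```

Computing the mean via the histogram is a choice made for this sketch so that the same data can also be inspected as a distribution; averaging the pixels directly gives the same result.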
[0132] <3.5. Modification>
[0133] Next, modifications of the information processing system 1 according to the present embodiment will be described.
[0134] (First Modification: Notification of Information Regarding a Shielding Situation of the Angle of View of the Imaging Unit)
[0135] First, as a first modification, an example of a user interface of a case of notifying the user of a situation where the angle of view of the imaging unit is shielded will be described.
[0136] Due to the characteristics of a head-mounted device such as the input/output device 20, which is used by being worn on the head, the user has difficulty in directly viewing, in the mounted state, any part of the input/output device 20 other than the part located in front of the eyes. Therefore, for example, in the case where the imaging units 201a and 201b illustrated in FIG. 2 are used to determine the user input, there are cases where the user has difficulty in directly viewing the imaging units 201a and 201b in a state of wearing the input/output device 20.