

Patent: Information processing apparatus, information processing method, and program


Publication Number: 20210294414

Publication Date: 20210923

Applicant: Sony

Abstract

[Object] To provide an information processing apparatus, an information processing method, and a program that make it possible to properly determine whether a user is gazing at a mobile object even if an intermediate object exists between the user and the mobile object. [Solving Means] An information processing apparatus includes a mobile object control unit. When an intermediate object is determined to be situated between a user and a mobile object, the mobile object control unit controls the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space, the mobile object control unit controlling the mobile object on the basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

Claims

  1. An information processing apparatus comprising: a mobile object control unit that controls, when an intermediate object is determined to be situated between a user and a mobile object, the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space, the mobile object control unit controlling the mobile object on a basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

  2. The information processing apparatus according to claim 1, wherein the mobile object control unit controls the mobile object so that the mobile object after the change in the relative positional relationship is located at a position other than a position in a direction of the visual line of the user at a time immediately before the relative positional relationship is changed.

  3. The information processing apparatus according to claim 2, wherein the mobile object control unit controls the mobile object so that the mobile object after the change in the relative positional relationship is located in an imaginary line orthogonal to the direction of the visual line.

  4. The information processing apparatus according to claim 2, wherein there exists a plurality of the users, and the mobile object control unit controls the mobile object so that the mobile object after the change in the relative positional relationship is located at a position other than positions in visual-line directions of the respective visual lines of the plurality of the users at the time immediately before the relative positional relationship is changed.

  5. The information processing apparatus according to claim 2, wherein the mobile object control unit controls the mobile object so that action of the mobile object acting so that the relative positional relationship is changed, is not similar to action of the intermediate object, as viewed from the user.

  6. The information processing apparatus according to claim 5, wherein the mobile object control unit controls the mobile object so that the mobile object moves in a direction different from a movement direction of the intermediate object, as viewed from the user.

  7. The information processing apparatus according to claim 6, wherein the mobile object control unit controls the mobile object using a predicted position of the intermediate object that is predicted on a basis of a temporal change in information regarding a past position of the moving intermediate object.

  8. The information processing apparatus according to claim 5, wherein the mobile object control unit controls the mobile object so that the mobile object takes action different from action of the mobile object that is taken immediately before the relative positional relationship is changed and so that the relative positional relationship is changed.

  9. The information processing apparatus according to claim 8, wherein the mobile object control unit controls the mobile object so that the mobile object moves in a movement direction different from a direction of movement of the mobile object that is performed immediately before the relative positional relationship is changed and so that the relative positional relationship is changed.

  10. The information processing apparatus according to claim 9, wherein the mobile object control unit controls the mobile object so that the mobile object moves in a direction different by 180 degrees from the direction of the movement of the mobile object that is performed immediately before the relative positional relationship is changed and so that the relative positional relationship is changed.

  11. The information processing apparatus according to claim 8, wherein the mobile object control unit controls the mobile object so that the mobile object moves at a speed that enables the user to follow the mobile object with an eye and so that the relative positional relationship is changed.

  12. The information processing apparatus according to claim 11, wherein the mobile object is a mobile body having the movement mechanism.

  13. The information processing apparatus according to claim 12, wherein the mobile body is capable of moving on the ground.

  14. The information processing apparatus according to claim 12, wherein the mobile body is capable of flying.

  15. The information processing apparatus according to claim 13, wherein the information processing apparatus is the mobile object including the movement mechanism and the mobile object control unit.

  16. The information processing apparatus according to claim 15, wherein the mobile object includes an indicator indicating that the mobile object is on standby for receiving an instruction from the user.

  17. The information processing apparatus according to claim 16, wherein the mobile object includes an image acquisition unit that acquires information regarding an image of a surrounding environment, and the mobile object control unit controls the mobile object on a basis of the information regarding the visual line of the user, the information regarding the visual line of the user being acquired using the information regarding the image.

  18. The information processing apparatus according to claim 17, wherein the image acquisition unit includes a depth sensor.

  19. An information processing method comprising: controlling, when an intermediate object is determined to be situated between a user and a mobile object, the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space; and controlling the mobile object on a basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

  20. A program that causes an information processing apparatus to perform processing comprising: controlling, when an intermediate object is determined to be situated between a user and a mobile object, the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space; and controlling the mobile object on a basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

Description

TECHNICAL FIELD

[0001] The present technology relates to an information processing apparatus, an information processing method, and a program for a mobile object such as an autonomous action robot.

BACKGROUND ART

[0002] Patent Literature 1 describes superimposing an image of an imaginary object on an optical image of a real object positioned in real space on the basis of an AR technology using an HMD (Head Mounted Display). It also describes moving an imaginary object so as to actively guide the visual line of a user wearing the HMD, whereby whichever of a plurality of imaginary objects the user turns his/her eyes to is specified as an operation target, making the interaction with the user suitable.

CITATION LIST

Patent Literature

[0003] Patent Literature 1: WO 2017/187708

DISCLOSURE OF INVENTION

Technical Problem

[0004] Nowadays, the realization of natural communication between mobile objects such as autonomous action robots and users has been desired. For such natural communication to be realized, it is sometimes necessary to determine whether a user is gazing at a mobile object.

[0005] Note that there are cases in which an intermediate object exists between a mobile object and a user while the mobile object is being controlled. In such a case, it may be falsely determined that the user is gazing at the mobile object even though the user is actually gazing at the intermediate object. Such a false determination may cause the mobile object to take action not intended by the user.

[0006] In view of this problem, the present disclosure provides an information processing apparatus, an information processing method, and a program that make it possible to properly determine whether a user is gazing at a mobile object even if an intermediate object exists between the user and the mobile object.

Solution to Problem

[0007] An information processing apparatus according to an embodiment of the present technology includes a mobile object control unit.

[0008] When an intermediate object is determined to be situated between a user and a mobile object, the mobile object control unit controls the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space, the mobile object control unit controlling the mobile object on the basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

[0009] According to such a configuration, a mobile object is controlled so that the relative positional relationship between the mobile object and an intermediate object as viewed from a user is changed. Therefore, a change in the visual line of the user is more easily detected when the relative positional relationship is changed. As a result, accuracy in determining whether the user is gazing at the mobile object is improved. Accordingly, the action that the mobile object takes on the basis of the determination result becomes suited to the situation, and the user can naturally communicate with the mobile object.

[0010] The mobile object control unit may control the mobile object so that the mobile object after the change in the relative positional relationship is located at a position other than a position in a direction of the visual line of the user at a time immediately before the relative positional relationship is changed.

[0011] According to such a configuration, a change in the visual line of a user is easily detected, and accuracy in determining whether the user is gazing at a mobile object is improved.

[0012] For example, when the mobile object moves in the direction of the visual line as the first action, the movement of the mobile object as viewed from the user is movement toward the front or toward the back. In this case, it is difficult to detect a change in the visual line of the user who follows the mobile object with his/her eyes. Accordingly, the mobile object is caused to move in a direction other than the direction of the visual line, whereby the change in the visual line of the user who follows the mobile object with his/her eyes is easily detected.

[0013] The mobile object control unit may control the mobile object so that the mobile object after the change in the relative positional relationship is located in an imaginary line orthogonal to the direction of the visual line.

[0014] According to such a configuration, a change in the visual line of a user is more easily detected, and accuracy in determining whether the user is gazing at the mobile object is improved.

[0015] There may exist a plurality of the users, and the mobile object control unit may control the mobile object so that the mobile object after the change in the relative positional relationship is located at a position other than positions in visual-line directions of the respective visual lines of the plurality of the users at the time immediately before the relative positional relationship is changed.

[0016] The mobile object control unit may control the mobile object so that the action taken by the mobile object to change the relative positional relationship is not similar, as viewed from the user, to the action of the intermediate object.

[0017] According to such a configuration, a change in the visual line of a user is easily detected, and accuracy in determining whether the user is gazing at the mobile object is improved.

[0018] The mobile object control unit may control the mobile object so that the mobile object moves in a direction different from a movement direction of the intermediate object, as viewed from the user.

[0019] As described above, as action not similar to the action of the intermediate object, the mobile object may be caused to move in a direction different from the movement direction of the intermediate object.

[0020] The mobile object control unit may control the mobile object using a predicted position of the intermediate object that is predicted on the basis of a temporal change in information regarding a past position of the moving intermediate object.

[0021] According to such a configuration, even if the intermediate object is itself a moving object, predicting the action of the intermediate object makes it possible to cause the mobile object to take action by which a change in the visual line of the user is easily detected.

[0022] The mobile object control unit may control the mobile object so that the mobile object takes action different from action of the mobile object that is taken immediately before the relative positional relationship is changed and so that the relative positional relationship is changed.

[0023] According to such a configuration, a change in the visual line of a user is easily detected, and accuracy in determining whether the user is gazing at a mobile object is improved.

[0024] The mobile object control unit may control the mobile object so that the mobile object moves in a movement direction different from a direction of movement of the mobile object that is performed immediately before the relative positional relationship is changed and so that the relative positional relationship is changed.

[0025] The mobile object control unit may control the mobile object so that the mobile object moves in a direction different by 180 degrees from the direction of the movement of the mobile object that is performed immediately before the relative positional relationship is changed and so that the relative positional relationship is changed.

[0026] According to such a configuration, it is possible for the user to easily follow the mobile object with his/her eyes, and for a change in the visual line of the user to be easily detected, in cases such as the following: the visual line of the user is not directed to the mobile object but the user is possibly looking at it; imaginary lines are difficult to estimate from the visual-line directions of a plurality of users; or a multiplicity of mobile objects exists and action different from the action of those objects is difficult to estimate.

[0027] The mobile object control unit may control the mobile object so that the mobile object moves at a speed that enables the user to follow the mobile object with an eye and so that the relative positional relationship is changed.

[0028] According to such a configuration, it is possible for a user to follow a mobile object with his/her eyes by natural eye motion.

[0029] The mobile object may be a mobile body having the movement mechanism.

[0030] The mobile body may be capable of moving on the ground.

[0031] The mobile body may be capable of flying.

[0032] The information processing apparatus may be the mobile object including the movement mechanism and the mobile object control unit.

[0033] The mobile object may include an indicator indicating that the mobile object is on standby for receiving an instruction from the user.

[0034] According to such a configuration, it is possible to urge a user to gaze at the mobile object by means of the indicator.

[0035] The mobile object may include an image acquisition unit that acquires information regarding an image of a surrounding environment, and the mobile object control unit may control the mobile object on the basis of the information regarding the visual line of the user, the information regarding the visual line of the user being acquired using the information regarding the image.

[0036] The image acquisition unit may include a depth sensor.

[0037] Thus, it is possible to obtain information regarding the distance between a mobile object and each object as information regarding a surrounding environment and perform more accurate person detection, object detection, face detection, visual line detection, or the like from information regarding an image.

[0038] An information processing method according to an embodiment of the present technology includes controlling, when an intermediate object is determined to be situated between a user and a mobile object, the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space; and controlling the mobile object on the basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

[0039] A program according to an embodiment of the present technology causes an information processing apparatus to perform processing including controlling, when an intermediate object is determined to be situated between a user and a mobile object, the mobile object so that a relative positional relationship between the mobile object and the intermediate object as viewed from the user is changed, the mobile object having a movement mechanism, the user and the mobile object being in real space; and controlling the mobile object on the basis of information regarding a visual line of the user that is acquired after the relative positional relationship is changed.

Advantageous Effects of Invention

[0040] As described above, the present technology makes it possible to realize natural communication between a user and a mobile object that exist in real space. Note that an effect achieved by the present technology is not necessarily limited to the effect described here but may include any effect described in the present disclosure.

BRIEF DESCRIPTION OF DRAWINGS

[0041] FIG. 1 is a view for describing a state in which an autonomous action robot acting as an information processing apparatus according to a first embodiment of the present technology is used.

[0042] FIG. 2 is a block diagram showing the configuration of the autonomous action robot.

[0043] FIG. 3 is a flowchart for describing an example of a gaze determination in the autonomous action robot.

[0044] FIG. 4 is a view (part 1) for describing first action for the gaze determination that is taken by the autonomous action robot.

[0045] FIG. 5 is a view (part 2) for describing the first action for the gaze determination that is taken by the autonomous action robot.

[0046] FIG. 6 is a view (part 3) for describing the first action for the gaze determination that is taken by the autonomous action robot.

[0047] FIG. 7 is a view (part 4) for describing the first action for the gaze determination that is taken by the autonomous action robot.

[0048] FIG. 8 is a view (part 5) for describing the first action for the gaze determination that is taken by the autonomous action robot.

[0049] FIG. 9 is a view for describing a method for calculating a second target position where the autonomous action robot reaches according to the first action at the time of the gaze determination.

[0050] FIG. 10 is a view for describing a method for calculating a time at which the autonomous action robot reaches the second target position according to the first action at the time of the gaze determination.

[0051] FIG. 11 is a view (part 1) for describing the first action for the gaze determination that is taken by the autonomous action robot when a mobile object exists around the autonomous action robot.

[0052] FIG. 12 is a view (part 2) for describing the first action for the gaze determination that is taken by the autonomous action robot when the mobile object exists around the autonomous action robot.

[0053] FIG. 13 is a view (part 3) for describing the first action for the gaze determination that is taken by the autonomous action robot when the mobile object exists around the autonomous action robot.

[0054] FIG. 14 is a view for describing the determination of the movement of the surrounding mobile object.

[0055] FIG. 15 is a view showing an example of the hardware configuration of the autonomous action robot.

[0056] FIG. 16 is a view for describing first action for a gaze determination that is taken by an autonomous action robot according to a second embodiment.

[0057] FIG. 17 is a view (part 1) for describing first action for a gaze determination that is taken by an autonomous action robot according to a third embodiment.

[0058] FIG. 18 is a view (part 2) for describing the first action for the gaze determination that is taken by the autonomous action robot according to the third embodiment.

[0059] FIG. 19 is a view (part 3) for describing the first action for the gaze determination that is taken by the autonomous action robot according to the third embodiment.

[0060] FIG. 20 is a view (part 4) for describing the first action for the gaze determination that is taken by the autonomous action robot according to the third embodiment.

MODE(S) FOR CARRYING OUT THE INVENTION

First Embodiment

[0061] (Schematic Configuration)

[0062] First, an information processing apparatus according to an embodiment of the present technology will be described with reference to FIG. 1. FIG. 1 is a view for describing a state in which an autonomous action robot acting as an information processing apparatus is used.

[0063] As shown in FIG. 1, users U1 to U3 and an autonomous action robot 1 acting as an information processing apparatus exist in, for example, a living room 60 that is real space. In the living room 60, a sofa 61, a table 62, and a TV set 63 that are stationary objects not often moved by users are arranged.

[0064] The autonomous action robot 1 (hereinafter simply called a robot), which is a mobile body acting as a mobile object, is, for example, a robot that supports life as a human partner and places importance on communication with humans, and it exists in real space.

[0065] The robot 1 is configured to determine whether a user is gazing at the robot 1 and to take proper action for the user on the basis of the gaze determination result.

[0066] For example, when it is determined that a user is gazing at the robot 1, the robot 1 can take some action for the user who is gazing at it and have natural communication with the user.

[0067] On the other hand, when it is determined that the user is not gazing at the robot 1, the robot 1 continues, for example, the action it had been taking immediately before the gaze determination. Thus, the robot 1 is prevented from taking unnatural action toward a user who is not gazing at it. Further, when the robot 1 does take some action for the user although it is determined that the user is not gazing at the robot 1, the robot 1 can take that action after first getting the user's attention and thereby have natural communication with the user.

[0068] Here, a spherical autonomous action type robot configured to be capable of autonomously traveling on the ground is exemplified as a mobile object, but the mobile object is not limited to such a robot. The mobile object may be any mobile body including a movement mechanism capable of moving on the ground or in the air. Alternatively, it may be a pet type robot imitating an animal such as a dog or a cat, or a human type robot. Alternatively, it may be a flying type robot configured to be capable of flying in the air.

[0069] (Configuration of Robot)

[0070] FIG. 2 is a block diagram showing the configuration of the robot 1 of the present embodiment.

[0071] As shown in FIG. 2, the robot 1 acting as an information processing apparatus has a control unit 10, an input/output unit 40, an environment map database 30, an action database 31, a data storage unit 32, a storage unit 33, and a movement mechanism 34.

[0072] The movement mechanism 34 is movement means for causing the robot 1 acting as a mobile object to move in real space under the control of a robot control unit 15 of the control unit 10 that will be described later.

[0073] Examples of the movement mechanism 34 include a leg type movement mechanism, a wheeled movement mechanism, a crawler type movement mechanism, and a propeller movement mechanism. A robot including the leg type movement mechanism, the wheeled movement mechanism, or the crawler type movement mechanism is capable of moving on the ground. A robot including the propeller movement mechanism is capable of flying and moving in the air.

[0074] The movement mechanism has a movement unit that causes the robot to move and driving means for driving the movement unit.

[0075] The leg type movement mechanism is used in, for example, a human type robot, a pet type robot, or the like. The leg type movement mechanism has leg units acting as movement units and actuators acting as driving means for driving the leg units.

[0076] For example, a pet type robot imitating a dog typically includes a head unit, a body unit, leg units (four legs), and a tail unit. Actuators are provided in the joints of the leg units (four legs), the connection parts between the respective leg units and the body unit, the connection parts between the head unit and the body unit, and the connection part between the tail unit and the body unit. The movement of the robot is controlled by the drive and the control of the respective actuators.

[0077] The wheeled movement mechanism has wheels acting as movement units and a motor acting as drive means for driving the wheels. By the rotation and the drive of wheels attached to the body of the robot 1, the body of the robot 1 is moved on a ground plane. In the present embodiment, the robot 1 including the wheeled movement mechanism will be exemplified.

[0078] The crawler type movement mechanism has crawlers acting as movement units and drive means for driving the crawlers. By the rotation and the drive of the crawlers attached to the body of the robot 1, the body of the robot 1 is moved on a ground plane.

[0079] The propeller movement mechanism has a propeller acting as a movement unit that moves a robot and an engine or a battery that acts as drive means for driving the propeller.

[0080] In the robot 1, the first action of the robot 1 is controlled on the basis of information regarding a surrounding environment that is acquired by various sensors constituting the input/output unit 40. The first action is action for gaze determination processing. The first action is controlled so that the relative positional relationship between the robot 1 and an intermediate object as viewed from a user is changed. The intermediate object is an object existing between the user and the robot 1 that exist in real space.

[0081] Then, the robot 1 is controlled to take second action on the basis of information regarding the visual line of the user that is acquired after the relative positional relationship is changed. Specifically, the gaze determination processing as to whether the user is gazing at the robot 1 is performed according to a change in the visual line of the user during the execution of the first action, the change being calculated on the basis of the information regarding the visual line of the user. The robot 1 is then controlled to take the second action on the basis of the gaze determination result.

[0082] The input/output unit 40 has a camera (imaging device) 41, a depth sensor 42, a microphone 43, and a touch sensor 44 that constitute an input unit, and a speaker 45 and an LED (Light Emitting Diode) indicator 46 that constitute an output unit. Note that the camera 41 may include various imaging devices such as an RGB camera, a monochrome camera, an infrared camera, a polarization camera, and an event-based camera. Here, the event-based camera is a camera that outputs an image only when a change in brightness or the like occurs between images.

[0083] The camera 41 acting as an image acquisition unit is a camera that shoots surrounding real space. The camera 41 acquires RGB images, monochrome images, infrared images, polarization images, differential images, or the like depending on the type of an image sensor. Image information acquired as a shooting result is supplied to the control unit 10. The camera 41 may be in a singular or plural form. The camera 41 is installed at, for example, the top of the robot 1.

[0084] The depth sensor 42 acting as an image acquisition unit acquires information regarding the distance between the depth sensor 42 and an object. The acquired three-dimensional distance image information is supplied to the control unit 10. The depth sensor 42 is installed at, for example, the top of the robot 1.

[0085] The objects include not only persons such as users but also objects other than the persons. The persons are mobile objects capable of moving.

[0086] The objects other than the persons include not only stationary objects fixed in real space and not often moved by users, such as the sofa 61, the table 62, and the TV set 63, but also mobile objects other than the persons.

[0087] Examples of the mobile objects other than the persons include animals such as dogs and cats, cleaning robots, and typically small objects such as plastic bottles and cups easily moved by users.

[0088] As the depth sensor 42, a publicly-known sensor may be used. For example, a method in which infrared rays or the like are applied and a distance to an object is measured according to a time until the reflected light of the applied infrared rays is returned, a method in which a pattern is applied by infrared rays or the like and a distance to an object is measured according to the distortion of the pattern reflected on the object, a method in which images captured by a stereo camera are matched to each other and a distance to an object is measured according to the parallax between the images, or the like may be used.
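
As one illustration of the last of these approaches (stereo matching), the following is a minimal sketch, not taken from the patent, that converts a block-matching disparity map into metric depth; the focal length and baseline values are hypothetical placeholders.

    import cv2
    import numpy as np

    # Hypothetical calibration values, for illustration only.
    FOCAL_LENGTH_PX = 525.0   # focal length in pixels
    BASELINE_M = 0.06         # distance between the two cameras in meters

    def depth_from_stereo(left_gray: np.ndarray, right_gray: np.ndarray) -> np.ndarray:
        """Estimate a depth map from a rectified 8-bit grayscale stereo pair."""
        matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
        # StereoBM returns fixed-point disparities scaled by 16.
        disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
        depth = np.full(disparity.shape, np.inf, dtype=np.float32)
        valid = disparity > 0
        # Triangulation: depth = focal_length * baseline / disparity.
        depth[valid] = FOCAL_LENGTH_PX * BASELINE_M / disparity[valid]
        return depth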

[0089] The microphone 43 collects surrounding sound. Information on the collected sound is supplied to the control unit 10.

[0090] The touch sensor 44 detects contact with the robot 1 by a user. Information on the detected contact is supplied to the control unit 10.

[0091] The speaker 45 outputs a sound signal that is a robot control signal received from the control unit 10.

[0092] In the LED indicator 46 acting as an indicator, the lighting of an LED is controlled on the basis of a light emission signal that is a robot control signal received from the control unit 10. The LED indicator 46 expresses emotions or the like of the robot through combinations of blinking patterns and lighting timings of the LED and notifies a user U of the condition of the robot 1.

[0093] For example, when taking some action for the user U, the robot 1 can let the user know that it is turning its eye to the user by lighting up a part of the LED indicator 46.

[0094] Further, the robot 1 is capable of presenting emotions such as delight, anger, sorrow, and pleasure to the user by changing a lit color.

[0095] Further, while on standby for receiving instructions from the user, the robot 1 can let the user know that it is on standby for receiving the instructions by blinking the LED indicator 46.

[0096] The LED indicator 46 may be provided in the whole area or a part of the spherical robot 1. In the present embodiment, the robot 1 includes two LED indicators 46 in a part of its surface as shown in FIG. 1.

[0097] The robot 1 may also include a display such as an LCD (Liquid Crystal Display) that displays an image as an output unit. Thus, the robot 1 is capable of presenting information such as the emotions and the situations of the robot to the user through desired image display.

[0098] The control unit 10 controls a series of processing relating to a gaze determination.

[0099] The control unit 10 may include a three-dimensional sensing unit 11, a sound information acquisition unit 12, a self-position estimation unit 13, an environment map generation unit 14, a robot control unit 15 acting as a mobile object control unit, an object detection unit 16, a face-and-visual-line detection unit 17, a visual line determination unit 18, a presence/absence-of-intermediate-object determination unit 19, a gaze determination unit 20, and an object position prediction unit 21.

[0100] The three-dimensional sensing unit 11 integrates image information acquired by the camera 41 and three-dimensional distance image information acquired by the depth sensor 42 with each other to construct a surrounding environment three-dimensional shape.

[0101] The surrounding environment three-dimensional shape may be constructed in such a manner that three-dimensional coordinates are output as a point group by a three-dimensional scanner using a laser.

[0102] The sound information acquisition unit 12 acquires sound information collected by the microphone 43.

[0103] The self-position estimation unit 13 estimates the self-position of the robot 1, in which the camera 41 and the depth sensor 42 are installed, by matching the feature points of the appearance of an object that are stored in the data storage unit 32, which will be described later, against the three-dimensional coordinates of the feature points of that object extracted from a three-dimensional shape constructed by the three-dimensional sensing unit 11. That is, the self-position estimation unit 13 estimates the position in the living room 60 at which the robot 1 exists.
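
One common way to realize this kind of matching-based localization is to re-detect the stored landmarks in the current camera image and solve a perspective-n-point problem; the sketch below is only an illustration under that assumption, not necessarily the patent's method, and the function and variable names are invented for the example.

    import cv2
    import numpy as np

    def estimate_robot_pose(landmark_xyz: np.ndarray,
                            landmark_uv: np.ndarray,
                            camera_matrix: np.ndarray):
        """Estimate the camera (robot) pose from matched landmarks.

        landmark_xyz: Nx3 map coordinates of stored feature points (N >= 4).
        landmark_uv:  Nx2 pixel coordinates where those points were re-detected.
        """
        dist_coeffs = np.zeros(5)  # assume an undistorted image for simplicity
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            landmark_xyz.astype(np.float32),
            landmark_uv.astype(np.float32),
            camera_matrix.astype(np.float32),
            dist_coeffs)
        if not ok:
            return None
        rotation, _ = cv2.Rodrigues(rvec)
        # Camera position in map coordinates: C = -R^T * t.
        position = -rotation.T @ tvec
        return position.ravel(), rotation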

[0104] Further, sensors other than cameras, such as an IMU (Inertial Measurement Unit) or a GPS (Global Positioning System) receiver, may be provided in addition to the camera and the depth sensor, and their sensor information may be integrated to perform more accurate self-position estimation.

[0105] The object detection unit 16 detects the regions of persons and the regions of objects other than the persons from a three-dimensional shape constructed by the three-dimensional sensing unit 11.

[0106] As a method for detecting the regions of the persons, a recognition method based on HOG (Histograms of Oriented Gradients) feature amounts and an SVM (Support Vector Machine) using image information, a recognition method using a convolutional neural network, or the like may be used.

[0107] Further, more accurate detection may be performed by detecting the persons using three-dimensional distance image information in addition to the image information.
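
The HOG-plus-SVM method named in paragraph [0106] can be tried with OpenCV's stock pedestrian detector; the sketch below illustrates that general technique and is not the patent's implementation.

    import cv2

    def detect_person_regions(frame_bgr):
        """Detect person regions with OpenCV's pretrained HOG + linear SVM people detector."""
        hog = cv2.HOGDescriptor()
        hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
        rects, weights = hog.detectMultiScale(
            frame_bgr, winStride=(8, 8), padding=(8, 8), scale=1.05)
        # Each rect is (x, y, w, h) in image coordinates.
        return list(zip(rects, weights))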

[0108] A method for detecting the regions of the objects other than the persons may include detecting unknown objects from a three-dimensional shape and classifying the detected objects using shape information to recognize the objects.

[0109] Further, an object recognition technology based on shape feature amounts may be used to perform classification with the assumption that object shapes which are detection targets are known shapes. In this manner, more accurate object detection may be performed.

[0110] The environment map generation unit 14 generates an environment map expressing the position of the robot 1, the positions of respective objects, or the like that exist in real space on the basis of a three-dimensional shape constructed by the three-dimensional sensing unit 11, the estimated position of the robot 1 estimated by the self-position estimation unit 13, and the regions of persons and the regions of objects other than the persons that are detected by the object detection unit 16.

[0111] The environment map generation unit 14 generates an environment map expressing the positions of the robot 1, persons, and objects other than the persons every certain time. The generated environment map is accumulated in the environment map database 30 in a time series.

[0112] The estimation of the self-position of the robot 1 and the generation of an environment map may be performed at the same time using a technology called visual SLAM (Simultaneous Localization and Mapping).

[0113] Further, in the environment map database 30, an environment map acting as answer data may be registered in advance.

[0114] By referring to the environment map of answer data and a three-dimensional shape constructed by the three-dimensional sensing unit 11, the self-position of the robot 1 may be estimated.

[0115] Further, for example, an environment map expressing real space in which mobile objects do not exist but only stationary objects exist may be registered in advance in the environment map database 30 as answer data. On the basis of a difference in the background between the environment map of the answer data and a three-dimensional shape constructed by the three-dimensional sensing unit 11, the regions of persons and the regions of mobile objects other than the persons may be detected.
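
A minimal sketch of this background-difference idea, assuming (beyond what the patent states) that a depth map of the static answer-data scene has been pre-rendered for the robot's current pose so it can be compared pixel by pixel with the current depth frame:

    import numpy as np

    def moving_object_mask(static_depth: np.ndarray,
                           current_depth: np.ndarray,
                           threshold_m: float = 0.15) -> np.ndarray:
        """Mark pixels whose depth deviates from the pre-registered static scene."""
        valid = np.isfinite(static_depth) & np.isfinite(current_depth)
        diff = np.abs(current_depth - static_depth)
        return valid & (diff > threshold_m)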

[0116] The face-and-visual-line detection unit 17 detects the regions of faces from the regions of persons that are output from the object detection unit 16 and detects the directions of the faces and the visual lines of the persons.

[0117] For the detection of the regions of faces, a publicly known method using the image feature amounts (Facial Appearance) of image information may be used. Further, three-dimensional distance image information may be used together with the image information to perform more accurate detection.

[0118] For the detection of the directions of faces, a publicly known method may be used. For example, the directions of noses or the directions of jaws may be detected to detect the directions of the faces.

[0119] For the detection of visual lines, a publicly known method may be used. For example, the visual lines of persons may be detected from the distances between the inner corners of eyes and the centers of black eyes using image information.

[0120] Further, a corneal reflex method may be used in which visual lines are detected using infrared light. According to the corneal reflex method, images of the eye portions of a user are shot while infrared light is applied, and the direction of the visual line is detectable from the positions at which the infrared light is reflected on the corneas of the eyes of the user who is the detection target, that is, from the positional relationship between the luminous spots of the infrared light and the pupils of the user in the captured images.
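
The inner-corner/iris-center approach in paragraph [0119] can be illustrated with the rough sketch below; the landmark inputs are assumed to come from some face-landmark detector, and the mapping to an actual gaze angle (and the compensation for head pose) is deliberately left out.

    import numpy as np

    def horizontal_gaze_ratio(inner_corner: np.ndarray,
                              outer_corner: np.ndarray,
                              iris_center: np.ndarray) -> float:
        """Return 0.0 when the iris sits at the inner corner and 1.0 at the outer corner.

        All landmarks are 2D pixel coordinates of one eye; values near 0.5 suggest the
        eye is looking roughly straight ahead relative to the head.
        """
        eye_axis = outer_corner - inner_corner
        eye_width = np.linalg.norm(eye_axis)
        if eye_width < 1e-6:
            return 0.5
        # Project the iris center onto the inner-to-outer corner axis.
        t = np.dot(iris_center - inner_corner, eye_axis) / (eye_width ** 2)
        return float(np.clip(t, 0.0, 1.0))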

[0121] The visual line determination unit 18 determines whether any person detected from a three-dimensional shape is turning his/her face to the robot 1 or the visual line of the person is directed to the robot 1, on the basis of a result of detection performed by the face-and-visual-line detection unit 17.

[0122] The presence/absence-of-intermediate-object determination unit 19 determines, using an environment map, whether any object exists between the robot 1 and a person (user) who is the target of the gaze determination, that is, the determination as to whether the person is gazing at the robot 1.

[0123] An intermediate object represents an object positioned between a person who becomes a gaze determination target and the robot 1. The present embodiment exemplifies a case in which the intermediate object is a real object existing in real space. The intermediate object may be a person or an object other than the person. Further, the intermediate object may be a stationary object or a mobile object.

[0124] The gaze determination unit 20 determines whether a person (hereinafter called a target person in some cases) who becomes a gaze determination target is gazing at the robot 1.

[0125] The robot 1 takes first action for a gaze determination under the control of the robot control unit 15 that will be described later. The gaze determination unit 20 determines whether the target person is gazing at the robot 1 on the basis of a change in the visual line of the target person during the first action of the robot 1, that is, during a period from the start to the end of the first action.

[0126] When the target person is gazing at the robot 1, the eye motion of the target person follows the movement of the robot 1.

[0127] In the present embodiment, the gaze determination unit 20 acquires information regarding the visual line of the target person during a period from the start to the end of the first action of the robot 1 from the face-and-visual-line detection unit 17 in a time series, and then calculates a change in the visual line of the target person, that is, the track of the visual line on the basis of the acquired information regarding the visual line.

[0128] The gaze determination unit 20 determines the presence or absence of a correlation between the movement track of the robot 1 and the track of the visual line of the target person on the basis of the movement track of the robot 1 during the period from the start to the end of the first action, the calculated track of the visual line of the target person, and the information regarding the distance between the target person and the robot 1 that is acquired from the depth sensor 42, and thereby determines whether the target person is gazing at the robot 1.

[0129] When determining that the correlation is present, the gaze determination unit 20 determines that the target person is gazing at the robot 1. On the other hand, when determining that the correlation is absent, the gaze determination unit 20 determines that the target person is not gazing at the robot 1.
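
The patent only states that the presence or absence of correlation between the two tracks is determined; as one hedged realization, the sketch below compares the horizontal bearing of the robot as seen from the target person with the horizontal bearing of the person's visual line over the first-action window and applies a hypothetical Pearson-correlation threshold.

    import numpy as np

    def is_gazing(robot_bearings_deg: np.ndarray,
                  gaze_bearings_deg: np.ndarray,
                  min_correlation: float = 0.8) -> bool:
        """Decide whether the gaze track follows the robot track during the first action.

        Both inputs are horizontal angles in degrees, sampled at the same instants from
        the start to the end of the first action.
        """
        if len(robot_bearings_deg) < 2 or np.std(robot_bearings_deg) < 1e-6:
            return False  # the robot did not move enough to test for correlation
        if np.std(gaze_bearings_deg) < 1e-6:
            return False  # the visual line did not move at all
        r = np.corrcoef(robot_bearings_deg, gaze_bearings_deg)[0, 1]
        return bool(r >= min_correlation)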

[0130] The object position prediction unit 21 predicts the positions of the respective objects after the current time from temporal changes in the information regarding the past positions of mobile objects, including persons, that is obtained on the basis of the environment maps accumulated in a time series.
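
A minimal sketch of such prediction, assuming a constant-velocity model fitted to the time-series positions (the patent does not prescribe a particular model):

    import numpy as np

    def predict_position(timestamps: np.ndarray,
                         positions_xy: np.ndarray,
                         future_time: float) -> np.ndarray:
        """Extrapolate a mobile object's 2D position from its past environment-map entries."""
        # Fit x(t) and y(t) as straight lines (constant velocity) over the past samples.
        x_fit = np.polyfit(timestamps, positions_xy[:, 0], 1)
        y_fit = np.polyfit(timestamps, positions_xy[:, 1], 1)
        return np.array([np.polyval(x_fit, future_time), np.polyval(y_fit, future_time)])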

[0131] The robot control unit 15 controls first action taken by the robot 1 for a gaze determination and second action taken by the robot 1 on the basis of a gaze determination result. A control signal that is generated by the robot control unit 15 and relates to the action of the robot 1 will be called a robot control signal.

[0132] The first action is action taken by the robot 1 to determine whether the target person is gazing at the robot 1.

[0133] The second action is taken by the robot 1 on the basis of the change in the visual line of the target person during the first action, which is calculated using the information regarding the visual line of the target person acquired after the first action is taken; this change in the visual line is detected by the gaze determination unit 20 described above. In other words, the second action is taken by the robot 1 on the basis of the information regarding the visual line of the target person that is acquired after the first action is taken.

[0134] The robot control signal includes the control signal of the movement mechanism 34 that relates to the action of the robot 1, the light emission signal of the LED indicator 46 that relates to the notification of the emotions, the situations, or the like of the robot 1, a sound signal that relates to sound produced from the robot 1 and is supplied to the speaker 45, or the like.

[0135] When it is determined by the presence/absence-of-intermediate-object determination unit 19 that an intermediate object is absent, the robot control unit 15 generates a robot control signal relating to first action for a gaze determination according to prescribed processing that will be described later.

[0136] When it is determined by the presence/absence-of-intermediate-object determination unit 19 that the intermediate object is present, the robot control unit 15 generates the robot control signal relating to the first action so that the relative positional relationship between the robot 1 acting as a mobile object and the intermediate object as viewed from a target user is changed.

[0137] In addition, when the intermediate object is a mobile object, the robot control unit 15 generates the robot control signal relating to the first action with consideration given to the position of the intermediate object that is predicted by the object position prediction unit 21.

[0138] In the example shown in FIG. 1, the user U3 exists between the user U1 and the robot 1 as an intermediate object.

[0139] In such a situation, it is difficult to determine, on the basis of information regarding the visual line of the user U1, whether the user U1 is gazing at the robot 1, at the user U3, or at neither of them, even if the face of the user U1 is directed to the robot 1.

[0140] Therefore, in order to determine whether the user U1 is gazing at the robot 1, the robot 1 takes first action so that the relative positional relationship between the robot 1 and the user U3 as viewed from the user U1 is changed.

[0141] Then, by detecting how the visual line of the user U1 changes in response to the first action taken by the robot 1, the robot 1 determines whether the user U1 is gazing at the robot 1.

[0142] In the present embodiment, the first action of the robot 1 is basically action different from the action of the robot 1 at the time immediately before the first action is taken. A second target position, which is the position that the robot 1 reaches through the first action, is a position other than a position in the direction of the visual line of the user (target person). The first action is controlled at a speed at which the angular change in the visual line of the user (target person) who follows the robot 1 with his/her eyes stays within 30 degrees per second.

[0143] Note that the time immediately before the first action is taken here represents a time immediately before the relative positional relationship between the robot and an intermediate object as viewed from the user is changed. Further, a time at which the robot reaches the second target position according to the first action is a time immediately after the first action is taken and represents a time after the relative positional relationship is changed.

[0144] In addition, the first action of the robot 1 is controlled within a range in which the robot 1 does not collide with surrounding objects, including persons and walls, and is controlled so that the robot 1 moves as much as possible into space in which such objects do not exist.

[0145] The details of a method for generating basic first action will be described as a method for generating prescribed first action below.

[0146] Here, as a position other than a position in the direction of the visual line of a user (target person), a position in an imaginary line orthogonal to the direction of the visual line of the user is exemplified. The imaginary line (for example, a line denoted by symbol L in FIGS. 7 to 10) passes through the robot 1, is substantially parallel to the ground plane of the robot 1, and is set at a height position near the installation position of the camera 41 or the depth sensor 42 that is installed in the robot 1. The camera 41 or the depth sensor 42 plays a role as an “eye” in the robot 1, and the imaginary line L is set at a height position corresponding to the eye line of the robot 1.

[0147] Since the first action is controlled to be different from the previous action of the robot 1, it is easy to detect a change in the visual line of the target person with respect to the movement path of the robot. Further, since the robot 1 is controlled to take action different from its previous action, the target person may be prompted to pay attention to the robot 1 and turn his/her visual line toward it. Thus, accuracy in the gaze determination processing may be further improved.

[0148] In the present embodiment, as the action different from the action previous to the first action, the robot 1 moves in a direction different from its traveling direction in the previous action.

[0149] The action different from the previous action is desirably action that easily draws the user's attention to the robot 1. Thus, it is easy to tell whether the robot 1 is just moving randomly or is moving in order to receive instructions from the user.

[0150] As another example of the action different from the action previous to the first action, the LED indicator 46 that has not been lit in the previous action may be lit in the first action.

[0151] In addition, as another example, the robot may move while rotating as the action different from the previous action.

[0152] Further, if the mobile object is a pet type robot imitating a dog, the robot may move while swinging its tail as the action different from the previous action.

[0153] Further, since the first action is controlled so that the robot moves in a direction orthogonal to the direction of the visual line of the target person at the time immediately before the first action is taken, the change in the visual line of the target person increases.

[0154] For example, when the robot 1 moves in the direction of the visual line of the target person, i.e., when the robot 1 moves along the vector of the visual line of the target person or along the extension of that vector, the movement of the robot 1 as viewed from the target person is movement toward the front or toward the back. Since the visual line of the target person who follows the robot 1 with his/her eyes does not change much in this case, it is difficult to make a gaze determination. Here, the vector of the visual line represents a directed segment from the eyes of the target person to the robot 1 and constitutes information regarding the visual line of the target person.

[0155] Accordingly, the second target position that the robot 1 reaches through the first action is set in an imaginary line orthogonal to the vector of the visual line of the target person, and the robot 1 is moved to the second target position. In this manner, when the user is gazing at the robot 1, the change in the visual line of the target person increases, and the change in the visual line is easy to detect. Thus, accuracy in the gaze determination processing may be further improved.
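
Putting paragraphs [0142] and [0155] together, the sketch below computes, in ground-plane coordinates, a second target position on the imaginary line orthogonal to the visual-line vector and a movement speed that keeps the required eye rotation of the user within roughly 30 degrees per second; the travel distance, the choice of side, and the small-angle treatment are assumptions made for illustration only.

    import numpy as np

    MAX_GAZE_RATE_DEG_PER_S = 30.0  # upper bound on the angular change of the visual line

    def plan_first_action(user_eye_xy: np.ndarray,
                          robot_xy: np.ndarray,
                          travel_m: float = 0.5):
        """Pick a second target position on the line orthogonal to the visual-line vector
        and a speed the user can follow with natural eye motion."""
        gaze_vec = robot_xy - user_eye_xy            # visual-line vector (user -> robot)
        distance = np.linalg.norm(gaze_vec)
        gaze_dir = gaze_vec / distance
        # Unit vector orthogonal to the visual line, in the ground plane (one side chosen
        # arbitrarily; a real planner would also avoid obstacles as in paragraph [0144]).
        ortho_dir = np.array([-gaze_dir[1], gaze_dir[0]])
        target_xy = robot_xy + travel_m * ortho_dir  # second target position
        # Moving sideways by travel_m sweeps roughly atan(travel_m / distance) of gaze
        # angle; cap the speed so the sweep stays within 30 degrees per second.
        swept_deg = np.degrees(np.arctan2(travel_m, distance))
        min_duration_s = swept_deg / MAX_GAZE_RATE_DEG_PER_S
        speed_m_per_s = travel_m / max(min_duration_s, 1e-6)
        return target_xy, speed_m_per_s

The returned speed is an upper bound under a constant-angular-rate approximation; moving more slowly only makes it easier for the user to follow the robot with natural eye motion.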

……
……
……
