Sony Patent | Information processing apparatus, information processing method, and program

Patent: Information processing apparatus, information processing method, and program

Publication Number: 20210368152

Publication Date: 2021-11-25

Applicant: Sony

Abstract

An information processing apparatus includes an acquisition unit (101) configured to acquire first information regarding a recognition result of at least one of a position or an orientation of a viewpoint, and a control unit (117) configured to project a target object on a display region on the basis of the first information and cause display information to be presented to the display region according to a result of the projection, in which the control unit projects the object on a first partial region and a second partial region included in the display region on the basis of the first information according to the recognition result at timings different from each other.

Claims

  1. An information processing apparatus comprising: an acquisition unit configured to acquire first information regarding a recognition result of at least one of a position or an orientation of a viewpoint; and a control unit configured to project a target object on a display region on a basis of the first information and control display information to be presented to the display region according to a result of the projection, wherein the control unit projects the object on a first partial region and a second partial region included in the display region on a basis of the first information according to the recognition result at timings different from each other.

  2. The information processing apparatus according to claim 1, wherein the control unit controls presentation of first display information to the first partial region according to a projection result of the object on the first partial region, and presentation of second display information to the second partial region according to a projection result of the object on the second partial region at timings different from each other.

  3. The information processing apparatus according to claim 2, wherein the control unit controls presentation of the display information to a partial region of at least one of the first partial region or the second partial region according to a timing when a result of projection of the object on the partial region is acquired.

  4. The information processing apparatus according to claim 3, wherein, when a frame rate related to the presentation of the display information to the partial region is reduced to a first frame rate, the control unit sets a second frame rate related to projection of the object on a partial region of at least one of the first partial region or the second partial region to be larger than the first frame rate.

  5. The information processing apparatus according to claim 3, wherein the control unit executes first processing of sequentially executing projection of the object on the first partial region based on the first information according to the recognition result at a first timing and presentation of the first display information according to a result of the projection to the first partial region, and second processing of sequentially executing projection of the object on the second partial region based on the first information according to the recognition result at a second timing and presentation of the second display information according to a result of the projection to the second partial region at timings different from each other.

  6. The information processing apparatus according to claim 5, wherein the control unit executes the first processing and the second processing in a predetermined order for each predetermined unit period.

  7. The information processing apparatus according to claim 6, wherein the second timing is a timing later than the first timing, and the control unit executes the second processing after executing the first processing.

  8. The information processing apparatus according to claim 5, wherein the control unit estimates a processing time for processing of at least one of the first processing or the second processing, and starts processing regarding presentation of the display information to a corresponding partial region before the first information corresponding to the processing of at least one of the first processing or the second processing is acquired according to an estimation result of the processing time.

  9. The information processing apparatus according to claim 5, wherein the control unit estimates a processing time for processing of at least one of the first processing or the second processing, and skips execution of the processing according to an estimation result of the processing time.

  10. The information processing apparatus according to claim 9, wherein, in the case where the control unit has skipped execution of the processing of at least one of the first processing or the second processing, the control unit causes another display information to be presented in a corresponding partial region instead of the display information presented by the processing.

  11. The information processing apparatus according to claim 2, wherein the first partial region and the second partial region are regions adjacent to each other, and the control unit corrects at least one of the first display information or the second display information such that the first display information and the second display information are presented as a series of continuous display information.

  12. The information processing apparatus according to claim 1, wherein the control unit acquires a projection result of the object on the display region by projecting the object on a projection surface associated with the display region.

  13. The information processing apparatus according to claim 12, wherein the control unit projects the object on the projection surface according to a relationship of at least one of positions or orientations among the viewpoint, the projection surface, and the object based on the first information.

  14. The information processing apparatus according to claim 1, wherein the first partial region and the second partial region include one or more unit regions different from one another among a plurality of unit regions configuring the display region.

  15. The information processing apparatus according to claim 14, wherein the unit region is either a scan line or a tile.

  16. The information processing apparatus according to claim 1, wherein the object targeted for the projection is a virtual object, the acquisition unit acquires second information regarding a recognition result of a real object in a real space, and the control unit associates the virtual object with a position in the real space according to the second information, and projects the virtual object on the display region on a basis of the first information.

  17. The information processing apparatus according to claim 16, wherein the control unit associates the virtual object with the position in the real space so that the virtual object is visually recognized to be superimposed on the real object.

  18. The information processing apparatus according to claim 1, further comprising: a recognition processing unit configured to recognize at least one of the position or the orientation of the viewpoint according to a detection result of a detection unit, wherein the acquisition unit acquires the first information according to a result of the recognition.

  19. An information processing method comprising: by a computer, acquiring first information regarding a recognition result of at least one of a position or an orientation of a viewpoint; and projecting a target object on a display region on a basis of the first information and causing display information to be presented to the display region according to a result of the projection, wherein the object is projected on a first partial region and a second partial region included in the display region on a basis of the first information according to the recognition result at timings different from each other.

  20. A program for causing a computer to: acquire first information regarding a recognition result of at least one of a position or an orientation of a viewpoint; and project a target object on a display region on a basis of the first information and cause display information to be presented to the display region according to a result of the projection, wherein the object is projected on a first partial region and a second partial region included in the display region on a basis of the first information according to the recognition result at timings different from each other.

Description

TECHNICAL FIELD

[0001] The present disclosure relates to an information processing apparatus, an information processing method, and a program.

BACKGROUND ART

[0002] In recent years, advances in image recognition technology have enabled recognition of the position and orientation of a real object (that is, an object in a real space) included in an image captured by an imaging device. One application of such object recognition is a technology called augmented reality (AR). With AR technology, virtual content in various modes such as text, icons, and animations (hereinafter referred to as a “virtual object”) can be superimposed on an object in the real space (hereinafter referred to as a “real object”), and the superimposed image can be presented to a user. For example, Patent Document 1 discloses an example of a technology for presenting virtual content to a user using AR technology.

CITATION LIST

Patent Document

[0003] Patent Document 1: International Publication No. 2017/183346

SUMMARY OF THE INVENTION

Problems to be Solved by the Invention

[0004] Incidentally, the processing load of drawing a virtual object is relatively high depending on the display information to be presented as the virtual object, and in some cases a delay occurs from the start of drawing of the virtual object until its output as display information. Therefore, when the position or orientation of the user's viewpoint changes during this delay, that is, before the drawn virtual object is presented to the user, a gap may arise in the relative position or orientation relationship between the viewpoint and the position where the drawn virtual object is superimposed. Such a gap may be recognized by the user, for example, as a displacement of the position in the space where the virtual object is superimposed. This applies not only to AR but also to so-called virtual reality (VR), which presents a virtual object in an artificially constructed virtual space.

[0005] Therefore, the present disclosure proposes a technology that enables presentation of information according to the position or orientation of a viewpoint in a more favorable mode.

Solutions to Problems

[0006] According to the present disclosure, there is provided an information processing apparatus including: an acquisition unit configured to acquire first information regarding a recognition result of at least one of a position or an orientation of a viewpoint; and a control unit configured to project a target object on a display region on the basis of the first information and cause display information to be presented to the display region according to a result of the projection, in which the control unit projects the object on a first partial region and a second partial region included in the display region on the basis of the first information according to the recognition result at timings different from each other.

[0007] Furthermore, according to the present disclosure, there is provided an information processing method including: by a computer, acquiring first information regarding a recognition result of at least one of a position or an orientation of a viewpoint; and projecting a target object on a display region on the basis of the first information and causing display information to be presented to the display region according to a result of the projection, in which the object is projected on a first partial region and a second partial region included in the display region on the basis of the first information according to the recognition result at timings different from each other.

[0008] Furthermore, according to the present disclosure, there is provided a program for causing a computer to: acquire first information regarding a recognition result of at least one of a position or an orientation of a viewpoint; and project a target object on a display region on the basis of the first information and cause display information to be presented to the display region according to a result of the projection, in which the object is projected on a first partial region and a second partial region included in the display region on the basis of the first information according to the recognition result at timings different from each other.

Effects of the Invention

[0009] As described above, according to the present disclosure, there is provided a technology for enabling presentation of information according to the position or orientation of a viewpoint in a more favorable mode.

[0010] Note that the above-described effect is not necessarily limitative, and any of the effects described in the present specification, or another effect that can be grasped from the present specification, may be exerted in addition to or in place of the above-described effect.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 is an explanatory view for describing an example of a schematic configuration of an information processing system according to an embodiment of the present disclosure.

[0012] FIG. 2 is an explanatory view for describing an example of a schematic configuration of an input/output device according to the present embodiment.

[0013] FIG. 3 is an explanatory diagram for describing an outline of an example of an influence of a delay between movement of a viewpoint and presentation of information.

[0014] FIG. 4 is an explanatory diagram for describing an example of a method of reducing an influence of a delay between movement of a viewpoint and presentation of information.

[0015] FIG. 5 is an explanatory diagram for describing an outline of the example of an influence of a delay between movement of a viewpoint and presentation of information.

[0016] FIG. 6 is an explanatory diagram for describing an outline of an example of processing of drawing an object having three-dimensional shape information as two-dimensional display information.

[0017] FIG. 7 is an explanatory diagram for describing an outline of an example of processing of drawing an object having three-dimensional shape information as two-dimensional display information.

[0018] FIG. 8 is an explanatory diagram for describing a basic principle of processing regarding drawing and presentation of display information in the information processing system according to the embodiment.

[0019] FIG. 9 is an explanatory diagram for describing an example of processing regarding correction of a presentation position of display information.

[0020] FIG. 10 is a block diagram illustrating an example of a functional configuration of the information processing system according to the embodiment.

[0021] FIG. 11 is a flowchart illustrating an example of a flow of a series of processing of the information processing system according to the embodiment.

[0022] FIG. 12 is an explanatory diagram for describing an outline of an example of the information processing system according to the embodiment.

[0023] FIG. 13 is an explanatory diagram for describing an outline of an example of the information processing system according to the embodiment.

[0024] FIG. 14 is a functional block diagram illustrating an example of a hardware configuration of an information processing apparatus configuring an information processing system according to an embodiment of the present disclosure.

[0025] FIG. 15 is a functional block diagram illustrating an example of a hardware configuration in a case where an information processing apparatus configuring an information processing system according to an embodiment of the present disclosure is implemented as a chip.

MODE FOR CARRYING OUT THE INVENTION

[0026] Favorable embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings. Note that, in the present specification and the drawings, redundant description of constituent elements having substantially the same functional configurations is omitted by giving the same reference numerals.

[0027] Note that the description will be given in the following order.

[0028] 1. Outline

[0029] 1.1. Schematic Configuration

[0030] 1.2. Configuration of Input/Output Device

[0031] 1.3. Principle of Self-position Estimation

[0032] 2. Study on Delay Between Movement of Viewpoint and Presentation of Information

[0033] 3. Technical Characteristics

[0034] 3.1. Outline of Processing of Drawing Object as Display Information

[0035] 3.2. Basic Principle of Processing Regarding Drawing and Presentation of Display Information

[0036] 3.3. Functional Configuration

[0037] 3.4. Processing

[0038] 3.5. Modification

[0039] 3.6. Example

[0040] 4. Hardware Configuration

[0041] 5. Conclusion

  1. Outline

1.1. Schematic Configuration

[0042] First, an example of a schematic configuration of an information processing system according to an embodiment of the present disclosure will be described with reference to FIG. 1. FIG. 1 is an explanatory view for describing an example of a schematic configuration of an information processing system according to an embodiment of the present disclosure. In FIG. 1, reference numeral M11 schematically represents an object (that is, a real object) located in a real space. Furthermore, reference numerals V13 and V15 schematically represent virtual content (that is, virtual objects) presented to be superimposed in the real space. In other words, an information processing system 1 according to the present embodiment superimposes the virtual objects on an object in the real space such as the real object M11 on the basis of a so-called augmented reality (AR) technology, for example, and presents the superimposed objects to a user. Note that, in FIG. 1, both the real object and the virtual objects are presented for easy understanding of the characteristics of the information processing system according to the present embodiment.

[0043] As illustrated in FIG. 1, an information processing system 1 according to the present embodiment includes an information processing apparatus 10 and an input/output device 20. The information processing apparatus 10 and the input/output device 20 are able to transmit and receive information to and from each other via a predetermined network. Note that the type of network connecting the information processing apparatus 10 and the input/output device 20 is not particularly limited. As a specific example, the network may be configured by a so-called wireless network such as a network based on a Wi-Fi (registered trademark) standard. Furthermore, as another example, the network may be configured by the Internet, a dedicated line, a local area network (LAN), a wide area network (WAN), or the like. Furthermore, the network may include a plurality of networks, and some of the networks may be configured as a wired network.

[0044] The input/output device 20 is a configuration for obtaining various types of input information and presenting various types of output information to the user who holds the input/output device 20. Furthermore, the presentation of the output information by the input/output device 20 is controlled by the information processing apparatus 10 on the basis of the input information acquired by the input/output device 20. For example, the input/output device 20 acquires, as the input information, information for recognizing the real object M11, and outputs the acquired information to the information processing apparatus 10. The information processing apparatus 10 recognizes the position of the real object M11 in the real space on the basis of the information acquired from the input/output device 20, and causes the input/output device 20 to present the virtual objects V13 and V15 on the basis of the recognition result. With such control, the input/output device 20 can present, to the user, the virtual objects V13 and V15 such that the virtual objects V13 and V15 are superimposed on the real object M11 on the basis of the so-called AR technology. Note that, in FIG. 1, the input/output device 20 and the information processing apparatus 10 are illustrated as devices different from each other. However, the input/output device 20 and the information processing apparatus 10 may be integrally configured. Furthermore, details of the configurations and processing of the input/output device 20 and the information processing apparatus 10 will be separately described below.

[0045] An example of the schematic configuration of the information processing system according to the embodiment of the present disclosure has been described with reference to FIG. 1.

1.2. Configuration of Input/Output Device

[0046] Next, an example of a schematic configuration of the input/output device 20 according to the present embodiment illustrated in FIG. 1 will be described with reference to FIG. 2. FIG. 2 is an explanatory view for describing an example of a schematic configuration of the input/output device according to the present embodiment.

[0047] The input/output device 20 according to the present embodiment is configured as a so-called head-mounted device worn on at least part of the head of the user and used by the user. For example, in the example illustrated in FIG. 2, the input/output device 20 is configured as a so-called eyewear-type (glasses-type) device, and at least either a lens 293a or a lens 293b is configured as a transmission-type display (output unit 211). Furthermore, the input/output device 20 includes first imaging units 201a and 201b, second imaging units 203a and 203b, an operation unit 207, and a holding unit 291 corresponding to a frame of the glasses. The holding unit 291 holds the output unit 211, the first imaging units 201a and 201b, the second imaging units 203a and 203b, and the operation unit 207 to have a predetermined positional relationship with respect to the head of the user when the input/output device 20 is mounted on the head of the user. Furthermore, although not illustrated in FIG. 2, the input/output device 20 may be provided with a sound collection unit for collecting a voice of the user.

[0048] Here, a more specific configuration of the input/output device 20 will be described. For example, in the example illustrated in FIG. 2, the lens 293a corresponds to a lens on a right eye side, and the lens 293b corresponds to a lens on a left eye side. In other words, the holding unit 291 holds the output unit 211 such that the output unit 211 (in other words, the lenses 293a and 293b) is located in front of the eyes of the user in the case where the input/output device 20 is mounted.

[0049] The first imaging units 201a and 201b are configured as so-called stereo cameras and are held by the holding unit 291 to face a direction in which the head of the user faces (in other words, the front of the user) when the input/output device 20 is mounted on the head of the user. At this time, the first imaging unit 201a is held near the user’s right eye, and the first imaging unit 201b is held near the user’s left eye. The first imaging units 201a and 201b capture a subject located in front of the input/output device 20 (in other words, the real object located in the real space) from different positions from each other on the basis of such a configuration. Thereby, the input/output device 20 acquires images of the subject located in front of the user and can calculate a distance to the subject from the input/output device 20 on the basis of a parallax between the images respectively captured by the first imaging units 201a and 201b.
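The depth-from-parallax computation described above can be sketched with the standard rectified-stereo relation Z = f·B/d. This is an illustrative sketch only, not part of the disclosure; the function name and parameters are assumptions for the example:

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth (in meters) of a point seen by a rectified stereo pair.

    focal_px    : focal length in pixels (identical cameras assumed)
    baseline_m  : distance between the two camera centers, in meters
    disparity_px: horizontal pixel offset of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    # Similar triangles: Z / baseline = focal / disparity
    return focal_px * baseline_m / disparity_px

# e.g. a 700 px focal length, 6.5 cm baseline, and 35 px disparity give 1.3 m
print(stereo_depth(700.0, 0.065, 35.0))
```

Larger disparities correspond to nearer subjects, which is why the two first imaging units are held at a fixed, known spacing.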

[0050] Note that the configuration and method are not particularly limited as long as the distance between the input/output device 20 and the subject can be measured. As a specific example, the distance between the input/output device 20 and the subject may be measured on the basis of a method such as multi-camera stereo, moving parallax, time of flight (TOF), or structured light. Here, TOF is a method of obtaining an image (a so-called distance image) including the distance (depth) to a subject by projecting light such as infrared light onto the subject and measuring, for each pixel, the time required for the projected light to be reflected by the subject and return. Furthermore, structured light is a method of obtaining a distance image including the distance (depth) to a subject by irradiating the subject with a pattern of light such as infrared light, capturing the pattern, and measuring the distance on the basis of the change in the pattern obtained from the capture result. Furthermore, moving parallax is a method of measuring the distance to a subject on the basis of parallax even with a so-called monocular camera. Specifically, the subject is captured from mutually different viewpoints by moving the camera, and the distance to the subject is measured on the basis of the parallax between the captured images. Note that, at this time, the distance to the subject can be measured more accurately by recognizing the moving distance and moving direction of the camera using various sensors. Note that the configuration of the imaging unit (for example, a monocular camera, a stereo camera, or the like) may be changed according to the distance measuring method.
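The TOF principle mentioned above reduces to a one-line relation per pixel: the light travels out and back, so depth is half the round-trip distance. A minimal sketch (illustrative only; the function name is an assumption for the example):

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def tof_depth(round_trip_s: float) -> float:
    """Depth implied by a time-of-flight measurement for one pixel.

    round_trip_s: time for the projected light to reach the subject
                  and return to the sensor, in seconds.
    The factor of 2 accounts for the out-and-back path.
    """
    return C * round_trip_s / 2.0

# a round trip of about 6.67 ns corresponds to a subject roughly 1 m away
print(tof_depth(6.671e-9))
```

The very short times involved (nanoseconds per meter) are why TOF sensors need specialized timing hardware rather than an ordinary camera shutter.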

[0051] Furthermore, the second imaging units 203a and 203b are held by the holding unit 291 such that eyeballs of the user are located within respective imaging ranges when the input/output device 20 is mounted on the head of the user. As a specific example, the second imaging unit 203a is held such that the user’s right eye is located within the imaging range. The direction in which the line-of-sight of the right eye is directed can be recognized on the basis of an image of the eyeball of the right eye captured by the second imaging unit 203a and a positional relationship between the second imaging unit 203a and the right eye, on the basis of such a configuration. Similarly, the second imaging unit 203b is held such that the user’s left eye is located within the imaging range. In other words, the direction in which the line-of-sight of the left eye is directed can be recognized on the basis of an image of the eyeball of the left eye captured by the second imaging unit 203b and a positional relationship between the second imaging unit 203b and the left eye. Note that the example in FIG. 2 illustrates the configuration in which the input/output device 20 includes both the second imaging units 203a and 203b. However, only one of the second imaging units 203a and 203b may be provided.

[0052] The operation unit 207 is a configuration for receiving an operation on the input/output device 20 from the user. The operation unit 207 may be configured by, for example, an input device such as a touch panel or a button. The operation unit 207 is held at a predetermined position of the input/output device 20 by the holding unit 291. For example, in the example illustrated in FIG. 2, the operation unit 207 is held at a position corresponding to a temple of the glasses.

[0053] Furthermore, the input/output device 20 according to the present embodiment may be provided with, for example, an acceleration sensor and an angular velocity sensor (gyro sensor) and may be able to detect movement of the head of the user wearing the input/output device 20 (in other words, movement of the input/output device 20 itself). As a specific example, the input/output device 20 may recognize a change in at least either the position or orientation of the head of the user by detecting components in a yaw direction, a pitch direction, and a roll direction as the movement of the head of the user.
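As a rough illustration of how gyro output yields a change in head orientation, the angular-velocity components in the yaw, pitch, and roll directions can be integrated over time. The sketch below uses naive per-axis Euler integration, which is only an approximation (real head-tracking systems typically fuse accelerometer and gyro data using quaternions); all names here are assumptions for the example:

```python
import math

def integrate_gyro(orientation, gyro_rad_s, dt):
    """Naive Euler integration of (yaw, pitch, roll) over one sample.

    orientation: current (yaw, pitch, roll) angles in radians
    gyro_rad_s : angular velocity about the same axes, in rad/s
    dt         : sample interval in seconds
    """
    return tuple(a + w * dt for a, w in zip(orientation, gyro_rad_s))

# a steady 90 deg/s yaw rotation sampled at 1 kHz for 0.5 s
pose = (0.0, 0.0, 0.0)
for _ in range(500):
    pose = integrate_gyro(pose, (math.pi / 2, 0.0, 0.0), 0.001)
print(math.degrees(pose[0]))  # ≈ 45 degrees of accumulated yaw
```

Because pure integration accumulates drift, such gyro-based tracking is usually combined with the vision-based self-position estimation described later.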

[0054] The input/output device 20 according to the present embodiment can recognize changes in its own position and orientation in the real space according to the movement of the head of the user on the basis of the above configuration. Furthermore, at this time, the input/output device 20 can present the virtual content (in other words, the virtual object) to the output unit 211 to superimpose the virtual content on the real object located in the real space on the basis of the so-called AR technology. Note that an example of a method for the input/output device 20 to estimate its own position and orientation in the real space (that is, self-position estimation) will be described below in detail.

[0055] Note that examples of a head-mounted display (HMD) device applicable as the input/output device 20 include a see-through HMD, a video see-through HMD, and a retinal projection HMD.

[0056] The see-through HMD uses, for example, a half mirror or a transparent light guide plate to hold a virtual image optical system including a transparent light guide or the like in front of the eyes of the user, and displays an image inside the virtual image optical system. Therefore, the user wearing the see-through HMD can take the external scenery into view while viewing the image displayed inside the virtual image optical system. With such a configuration, the see-through HMD can superimpose an image of the virtual object on an optical image of the real object located in the real space according to the recognition result of at least one of the position or orientation of the see-through HMD on the basis of the AR technology, for example. Note that a specific example of the see-through HMD includes a so-called glasses-type wearable device in which a portion corresponding to a lens of glasses is configured as a virtual image optical system. For example, the input/output device 20 illustrated in FIG. 2 corresponds to an example of the see-through HMD.

[0057] In a case where the video see-through HMD is mounted on the head or face of the user, the video see-through HMD is mounted to cover the eyes of the user, and a display unit such as a display is held in front of the eyes of the user. Furthermore, the video see-through HMD includes an imaging unit for capturing surrounding scenery, and causes the display unit to display an image of the scenery in front of the user captured by the imaging unit. With such a configuration, the user wearing the video see-through HMD has a difficulty in directly taking the external scenery into view but the user can confirm the external scenery with the image displayed on the display unit. Furthermore, at this time, the video see-through HMD may superimpose the virtual object on an image of the external scenery according to the recognition result of at least one of the position or orientation of the video see-through HMD on the basis of the AR technology, for example.

[0058] The retinal projection HMD has a projection unit held in front of the eyes of the user, and an image is projected from the projection unit toward the eyes of the user such that the image is superimposed on the external scenery. More specifically, in the retinal projection HMD, an image is directly projected from the projection unit onto the retinas of the eyes of the user, and the image is imaged on the retinas. With such a configuration, the user can view a clearer video even in a case where the user has myopia or hyperopia. Furthermore, the user wearing the retinal projection HMD can take the external scenery into view even while viewing the image projected from the projection unit. With such a configuration, the retinal projection HMD can superimpose an image of the virtual object on an optical image of the real object located in the real space according to the recognition result of at least one of the position or orientation of the retinal projection HMD on the basis of the AR technology, for example.

[0059] Furthermore, an HMD called immersive HMD can also be mentioned in addition to the above-described examples. The immersive HMD is mounted to cover the eyes of the user, and a display unit such as a display is held in front of the eyes of the user, similarly to the video see-through HMD. Therefore, the user wearing the immersive HMD has a difficulty in directly taking an external scenery (in other words, scenery of a real world) into view, and only an image displayed on the display unit comes into view. With such a configuration, the immersive HMD can provide an immersive feeling to the user who is viewing the image. Therefore, the immersive HMD can be applied in a case of presenting information mainly based on a virtual reality (VR) technology, for example.

[0060] An example of the schematic configuration of the input/output device according to the embodiment of the present disclosure has been described with reference to FIG. 2.

1.3. Principle of Self-Position Estimation

[0061] Next, an example of a principle of a technique for the input/output device 20 to estimate its own position and orientation in the real space (that is, self-position estimation) when superimposing the virtual object on the real object will be described.

[0062] As a specific example of the self-position estimation, the input/output device 20 captures an image of a marker or the like having a known size presented on the real object in the real space, using an imaging unit such as a camera provided in the input/output device 20. Then, the input/output device 20 estimates at least one of its own relative position or orientation with respect to the marker (and thus the real object on which the marker is presented) by analyzing the captured image. Note that the following description will be given focusing on the case where the input/output device 20 estimates its own position and orientation. However, the input/output device 20 may estimate only one of its own position or orientation.

[0063] Specifically, a relative direction of the imaging unit with respect to the marker (and thus the input/output device 20 provided with the imaging unit) can be estimated according to the direction of the marker (for example, the direction of a pattern and the like of the marker) captured in the image. Furthermore, in the case where the size of the marker is known, the distance between the marker and the imaging unit (that is, the input/output device 20 provided with the imaging unit) can be estimated according to the size of the marker in the image. More specifically, when the marker is captured from a farther distance, the marker is captured smaller. Furthermore, a range in the real space captured in the image at this time can be estimated on the basis of an angle of view of the imaging unit. By using the above characteristics, the distance between the marker and the imaging unit can be calculated backward according to the size of the marker captured in the image (in other words, a ratio occupied by the marker in the angle of view). With the above configuration, the input/output device 20 can estimate its own relative position and orientation with respect to the marker.
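The back-calculation described in the preceding paragraph follows the pinhole-camera relation between the real size of the marker, its apparent size in the image, and its distance. A minimal sketch of this idea is given below; the function names, focal length, and marker dimensions are illustrative assumptions, not values from this disclosure:

```python
import math

def estimate_marker_distance(marker_size_m, marker_size_px, focal_length_px):
    """Back-calculate the camera-to-marker distance from the marker's apparent
    size in the image, using the pinhole relation:
    size_px = focal_length_px * size_m / distance."""
    return focal_length_px * marker_size_m / marker_size_px

def estimate_marker_bearing(marker_center_px, image_center_px, focal_length_px):
    """Estimate the relative direction (radians, one axis) of the marker with
    respect to the optical axis, from where the marker appears in the image."""
    return math.atan2(marker_center_px - image_center_px, focal_length_px)

# A 0.25 m marker that appears 100 px wide through a lens with an
# 800 px focal length is 2.0 m from the imaging unit.
print(estimate_marker_distance(0.25, 100.0, 800.0))  # 2.0
```

Combining the distance with the bearing (and the orientation of the marker pattern) yields the relative position and orientation described above.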

[0064] Furthermore, a technology so-called simultaneous localization and mapping (SLAM) may be used for the self-position estimation of the input/output device 20. SLAM is a technology for performing self-position estimation and creation of an environmental map in parallel by using an imaging unit such as a camera, various sensors, an encoder, and the like. As a more specific example, in SLAM (in particular, Visual SLAM), a three-dimensional shape of a captured scene (or subject) is sequentially restored on the basis of a moving image captured by the imaging unit. Then, by associating a restoration result of the captured scene with a detection result of the position and orientation of the imaging unit, creation of a map of a surrounding environment and estimation of the position and orientation of the imaging unit (and thus the input/output device 20) in the environment are performed. Note that the position and orientation of the imaging unit can be estimated as information indicating relative change on the basis of detection results of various sensors by providing the various sensors such as an acceleration sensor and an angular velocity sensor to the input/output device 20, for example. Of course, the estimation method is not necessarily limited to the method based on the detection results of the various sensors such as an acceleration sensor and an angular velocity sensor as long as the position and orientation of the imaging unit can be estimated.
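As a minimal illustration of how detection results of sensors such as an angular velocity sensor yield information indicating relative change, the sketch below integrates one axis of angular velocity into a relative yaw change. A real SLAM pipeline is far more involved (feature tracking, map building, loop closure); the names and values here are assumptions for illustration only:

```python
def integrate_relative_yaw(gyro_samples_rad_s, dt_s):
    """Integrate angular-velocity samples (rad/s, one axis, fixed sample
    period) into a relative orientation change, as a stand-in for how
    inertial sensors provide relative pose between visual updates."""
    yaw = 0.0
    for w in gyro_samples_rad_s:
        yaw += w * dt_s
    return yaw

# Ten samples of 0.5 rad/s at 100 Hz give about 0.05 rad of relative rotation.
relative_yaw = integrate_relative_yaw([0.5] * 10, 0.01)
```

Such integrated relative changes drift over time, which is one reason the absolute references (markers, restored scene geometry) described above are needed for initialization and correction.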

[0065] Under the above configuration, the estimation result of the relative position and orientation of the input/output device 20 with respect to the known marker, which is based on the imaging result of the marker by the imaging unit, may be used for initialization processing or position correction in SLAM described above, for example. With the configuration, the input/output device 20 can estimate its own position and orientation with respect to the marker (and thus the real object on which the marker is presented) by the self-position estimation based on SLAM reflecting results of the initialization and position correction executed before even in a situation where the marker is not included in the angle of view of the imaging unit.

[0066] Furthermore, the above description has been made focusing on the example of the case of performing the self-position estimation mainly on the basis of the imaging result of the marker. However, a detection result of another target other than the marker may be used for the self-position estimation as long as the detection result can be used as a reference for the self-position estimation. As a specific example, a detection result of a characteristic portion of an object (real object) in the real space, such as a shape or pattern of the object, may be used for the initialization processing or position correction in SLAM.

[0067] An example of the principle of the technique for the input/output device 20 to estimate its own position and orientation in the real space (that is, self-position estimation) when superimposing the virtual object on the real object has been described. Note that the following description will be given on the assumption that the position and orientation of the input/output device 20 with respect to an object (real object) in the real space can be estimated on the basis of the above-described principle, for example.

  2. Study on Delay Between Movement of Viewpoint and Presentation of Information

[0068] Next, an outline of a delay between movement of a viewpoint (for example, the head of the user) and presentation of information in the case of presenting the information to the user according to a change in the position or orientation of the viewpoint, such as AR or VR, will be described.

[0069] In the case of presenting information to the user according to a change in the position or orientation of a viewpoint, a delay from when the movement of the viewpoint is detected to when the information is presented (so-called motion-to-photon latency) may affect an experience of the user. As an example, in the case of presenting a virtual object as if the virtual object exists in front of the user according to the orientation of the head of the user, a series of processing of recognizing the orientation of the head from a detection result of the movement of the head of the user and presenting the information according to the recognition result may require time. In such a case, a gap according to the processing delay may occur between the movement of the head of the user and a change in a field of view according to the movement of the head (that is, a change in the information presented to the user), for example.
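The magnitude of such a gap can be estimated with simple arithmetic: the angular error is roughly the angular velocity of the head multiplied by the motion-to-photon latency. The numbers below (60 deg/s head turn, 50 ms latency, 1000 px focal length) are illustrative assumptions:

```python
import math

def angular_gap_deg(head_speed_deg_s, latency_ms):
    """Angle by which presented content lags the actual head pose."""
    return head_speed_deg_s * latency_ms / 1000.0

def gap_on_screen_px(gap_deg, focal_length_px):
    """Approximate on-screen displacement corresponding to that angular gap."""
    return focal_length_px * math.tan(math.radians(gap_deg))

# A moderate 60 deg/s head turn with 50 ms motion-to-photon latency leaves
# a 3-degree gap between the real object and the superimposed image.
print(angular_gap_deg(60.0, 50.0))  # 3.0
```

Even a few degrees of gap corresponds to tens of pixels on a typical display, which is why the delay readily affects the experience of the user.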

[0070] In particular, the delay becomes apparent as a gap between a real world and the virtual object in a situation where the virtual object is superimposed on the real world such as AR. Therefore, even if the gap that becomes apparent as an influence of the delay is a slight amount that is hardly perceived by the user in the case of VR, the gap may be easily perceived by the user in the case of AR.

[0071] As an example of a method of reducing the influence of the delay, there is a method of reducing the delay by increasing a processing speed (frames per second: FPS). However, in this case, a higher-performance processor such as a CPU or GPU is required in proportion to the improvement of the processing speed. Furthermore, a situation where power consumption increases and a situation where heat is generated can be assumed with the improvement of the processing speed. In particular, a device for implementing AR or VR, such as the input/output device 20 described with reference to FIG. 2, is assumed to operate on power supplied from a battery, and the increase in power consumption has a more significant influence. Furthermore, in a device such as the input/output device 20 worn on a part of the body of the user, the influence of the heat generation (for example, the influence on the user who wears the device) tends to become more apparent due to the characteristic of the use method, as compared with other devices. Furthermore, in a device such as the input/output device 20, the space where a device such as a processor is provided is limited, as compared with a stationary device, and application of a high-performance processor may be difficult.

[0072] Furthermore, as another example of the method of reducing the influence of the delay, there is a method of two-dimensionally correcting a presentation position of information within a display region according to the position or orientation of the viewpoint at a presentation timing when presenting the information to the user. As an example of the technology of two-dimensionally correcting the presentation position of information, there is a technology called “timewarp”.
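A two-dimensional correction of this kind can be sketched as a uniform pixel shift derived from the change in yaw between rendering and display. This is only an illustration of the idea behind such a correction, with assumed names, a yaw-only small-rotation model, and an assumed sign convention (a rightward head turn shifts content left in the field of view); it is not any product's actual implementation:

```python
import math

def timewarp_offset_px(yaw_at_render_rad, yaw_at_display_rad, focal_length_px):
    """Horizontal pixel shift that re-aligns an already-rendered frame with
    the latest yaw reading: shift = f * tan(delta_yaw)."""
    delta = yaw_at_display_rad - yaw_at_render_rad
    return focal_length_px * math.tan(delta)

def corrected_position(x_px, offset_px):
    """Apply the uniform shift to a presentation position in the display
    region (all display information moves by the same amount)."""
    return x_px - offset_px
```

Because the same offset is applied to the whole frame, this correction is cheap but cannot account for depth, which is exactly the limitation discussed next.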

[0073] Meanwhile, the information cannot necessarily be presented in an originally expected mode only by two-dimensionally correcting the presentation position of the information. For example, FIG. 3 is an explanatory diagram for describing an outline of an example of the influence of the delay between movement of the viewpoint and presentation of information, and illustrates an example of an output in the case of two-dimensionally correcting the presentation position of the information according to the position and orientation of the viewpoint.

[0074] In FIG. 3, reference numerals P181a and P181b each schematically represent the position and orientation of the viewpoint. Specifically, reference numeral P181a represents the position and orientation of the viewpoint before movement. Furthermore, reference numeral P181b represents the position and orientation of the viewpoint after movement. Note that, in the following description, the viewpoints P181a and P181b may be simply referred to as “viewpoint(s) P181” unless otherwise particularly distinguished. Furthermore, reference numerals M181 and M183 each schematically represent an object (for example, a virtual object) to be displayed in the display region. That is, FIG. 3 illustrates an example of the case of presenting images of the objects M181 and M183 in the case of viewing the objects M181 and M183 from the viewpoint P181 according to a three-dimensional positional relationship between the viewpoint P181 and the objects M181 and M183.

[0075] For example, reference numeral V181 schematically represents a video corresponding to the field of view in the case of viewing the objects M181 and M183 from the viewpoint P181a before movement. The images of the objects M181 and M183 presented as the video V181 are drawn as two-dimensional images according to the positional and orientation relationship between the viewpoint P181a and the objects M181 and M183, which depends on the position and orientation of the viewpoint P181a.

[0076] Furthermore, reference numeral V183 schematically represents an originally expected video as the field of view in the case of viewing the objects M181 and M183 from the viewpoint P181b after movement. In contrast, reference numeral V185 schematically illustrates a video corresponding to the field of view from the viewpoint P181, which is presented by two-dimensionally correcting the presentation positions of the images of the objects M181 and M183 presented as the video V181 according to the movement of the viewpoint P181. As can be seen by comparing the video V183 with the video V185, the objects M181 and M183 are located at different positions in a depth direction with respect to the viewpoint P181, and thus moving amounts are originally different in the field of view with the movement of the viewpoint P181. Meanwhile, the moving amounts in the field of view of the objects M181 and M183 become equal when the images of the objects M181 and M183 in the video V181 are regarded as a series of images, and the presentation position of the series of images is simply two-dimensionally corrected according to the movement of the viewpoint P181, as in the video V185. Therefore, in the case of two-dimensionally correcting the presentation position of the images of the objects M181 and M183, there are some cases where a logically broken video is visually recognized as the field of view from the viewpoint P181b after movement.
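The depth dependence described above follows directly from the pinhole model: a lateral viewpoint shift d moves a point at depth z by roughly f*d/z pixels on screen, so nearer objects move more. A sketch with assumed values (the function name and numbers are illustrative, not from the disclosure):

```python
def image_shift_px(viewpoint_shift_m, depth_m, focal_length_px):
    """On-screen motion of a point at a given depth when the viewpoint
    translates sideways: shift = f * d / z (pinhole model). A single
    uniform 2D shift cannot match objects at two different depths at once."""
    return focal_length_px * viewpoint_shift_m / depth_m

# With a 5 cm sideways viewpoint move and a 1000 px focal length, an object
# 1 m away moves about 50 px while an object 5 m away moves about 10 px.
near_px = image_shift_px(0.05, 1.0, 1000.0)
far_px = image_shift_px(0.05, 5.0, 1000.0)
```

The 40 px discrepancy between the two objects in this sketch is the kind of logical break visible in the video V185.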

[0077] As an example of a method for solving such a problem, there is a method of dividing a region corresponding to the field of view based on the viewpoint into a plurality of regions along the depth direction, and correcting the presentation position of the image of the object (for example, the virtual object) for each region. For example, FIG. 4 is an explanatory diagram for describing an example of the method of reducing an influence of a delay between movement of a viewpoint and presentation of information, and illustrates an outline of an example of the method of correcting the presentation position of an object image for each region divided along the depth direction.

[0078] In the example illustrated in FIG. 4, a region corresponding to the field of view based on a viewpoint P191 is divided into a plurality of regions R191, R193, and R195 along the depth direction, and buffers different from each other are associated with the respective divided regions. Note that the buffers associated with the regions R191, R193, and R195 are referred to as buffers B191, B193, and B195, respectively. Further, in a case of assuming use in AR, a buffer B190 for holding a depth map according to a measurement result of the distance between a real object in the real space and the viewpoint may be provided separately from the buffers B191, B193, and B195.

[0079] With the above configuration, an image of each virtual object is drawn in the buffer corresponding to the region in which the virtual object is presented. That is, the image of the virtual object V191 located in the region R191 is drawn in the buffer B191. Similarly, the image of the virtual object V193 located in the region R193 is drawn in the buffer B193. Furthermore, the images of the virtual objects V195 and V197 located in the region R195 are drawn in the buffer B195. Furthermore, the depth map according to the measurement result of the distance between a real object M199 and the viewpoint P191 is held in the buffer B190.

[0080] Then, in a case where the viewpoint P191 has moved, the presentation position of the image of the virtual object drawn in each of the buffers B191 to B195 is individually corrected for each buffer according to the change in the position or orientation of the viewpoint P191. With such a configuration, it becomes possible to individually correct the presentation position of the image of each virtual object in consideration of the moving amount according to the distance between the viewpoint P191 and each of the virtual objects V191 to V197. Furthermore, even in a situation where some virtual object is shielded by the real object or another object with the movement of the viewpoint P191, presentation of the image of each virtual object can be controlled in consideration of the shielding. That is, according to the example illustrated in FIG. 4, information can be presented to the user in a less logically broken mode than the case described with reference to FIG. 3.
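The per-buffer correction and the final combination can be sketched as follows. The data layout (a layer represented as a representative depth plus a sparse map of pixel positions) is an illustrative simplification of the buffers B191 to B195, not the structure disclosed here:

```python
def correct_and_composite(layers, viewpoint_shift_m, focal_length_px):
    """Each layer is (depth_m, {x_px: pixel_value}). Every layer receives its
    own parallax shift f*d/z, then the layers are composited from farthest
    to nearest so that nearer pixels occlude farther ones."""
    frame = {}
    for depth_m, pixels in sorted(layers, key=lambda layer: -layer[0]):
        shift = round(focal_length_px * viewpoint_shift_m / depth_m)
        for x, value in pixels.items():
            frame[x - shift] = value  # near layers overwrite far layers
    return frame

# The layer at 5 m shifts by 10 px and the layer at 1 m by 50 px; here both
# land on the same pixel, and the nearer layer wins (occlusion).
out = correct_and_composite(
    [(1.0, {100: "near"}), (5.0, {60: "far"})],
    viewpoint_shift_m=0.05, focal_length_px=1000.0)
print(out)  # {50: 'near'}
```

The compositing pass at the end is the combining cost mentioned in the following paragraph: it must run every frame regardless of how many layers actually changed.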

[0081] Meanwhile, in the example described with reference to FIG. 4, the accuracy related to the correction of the presentation position of the image according to the position in the depth direction may change according to the number of buffers. That is, there is a possibility that an error of the presentation position of display information becomes larger as the number of buffers is smaller. Furthermore, since the image of the object is drawn in each buffer regardless of whether or not information is eventually presented, there is a possibility that the processing cost increases in proportion to the number of buffers. Furthermore, since the images drawn in the plurality of buffers are eventually combined and presented, the cost of the combining processing is required.

[0082] Furthermore, a situation in which the position or orientation of the viewpoint changes during execution of processing of drawing an image or processing of displaying a drawing result can be assumed. For example, FIG. 5 is an explanatory diagram for describing an outline of the example of an influence of a delay between movement of a viewpoint and presentation of information. Specifically, FIG. 5 illustrates an example of a flow of processing regarding presentation of display information V103 in the case of presenting an image of a virtual object as the display information V103 to be superimposed on a real object M101 in the real space according to the position and orientation of a viewpoint P101 on the basis of the AR technology. More specifically, FIG. 5 illustrates an example of a case where the position and orientation of the viewpoint P101 further change in a series of flow where the position and orientation of the viewpoint P101 are recognized according to an imaging result of the image of the real space, the display information V103 is rendered according to a result of the recognition, and then the display information V103 is displayed in the display region. In FIG. 5, reference numeral P101a represents the position and orientation of the viewpoint P101 before movement. Furthermore, reference numeral P101b represents the position and orientation of the viewpoint P101 after movement.

[0083] Reference numeral V101a schematically represents a video within the field of view from the viewpoint P101 (in other words, a video image visually recognized by the user) at the time of completion of capturing the image of the real object M101. The position and orientation of the viewpoint P101 at the time of capturing the image of the real object M101 (that is, the position and orientation of the viewpoint P101a before movement) are recognized by using the technology of self-position estimation such as SLAM, for example, on the basis of the image of the real object M101.

[0084] Reference numeral V101b schematically represents a video within the field of view from the viewpoint P101 at the start of rendering the display information V103. In the video V101b, the position of the real object M101 in the video (that is, in the field of view) has changed from the state illustrated as the video V101a with the change in the position and orientation of the viewpoint P101 from the completion of capturing the image of the real object M101. Note that, in the example illustrated in FIG. 5, the presentation position of the display information V103 is corrected according to the change in the position and orientation of the viewpoint P101 at the start of rendering. Reference numeral V105 schematically represents the presentation position of the display information V103 before the presentation position is corrected. That is, the presentation position V105 corresponds to the position of the real object M101 in the video V101a at the time of completion of capturing the image. At this point of time, the position of the real object M101 and the position at which the display information V103 is presented are substantially coincident in the video V101b with the above correction.

[0085] Reference numeral V101c schematically represents a video within the field of view from the viewpoint P101 at the start of processing regarding display of the display information V103 (for example, drawing the display information V103 or the like). Furthermore, reference numeral V101d schematically represents a video within the field of view from the viewpoint P101 at the end of the processing regarding display of the display information V103. In each of the videos V101c and V101d, the position of the real object M101 in the video (that is, in the field of view) has changed from the state illustrated as the video V101b with the change in the position and orientation of the viewpoint P101 from the start of rendering the display information V103.

[0086] As can be seen by comparing the videos V101b to V101d, since the position and orientation of the viewpoint P101 change from the start of rendering to the end of display of the display information V103 according to the result of the rendering, the position of the real object M101 in the field of view from the viewpoint P101 changes. Meanwhile, the position at which the display information V103 is presented in the field of view does not change from the start of rendering. Therefore, in the example illustrated in FIG. 5, a delay time from the start of rendering of the display information V103 to the end of display of the display information V103 becomes apparent as a gap between the position of the real object M101 and the presentation position of the display information V103 in the field of view from the viewpoint P101.

[0087] In view of the foregoing, the present disclosure proposes a technology for enabling presentation of information in a more favorable mode (for example, in a less logically broken mode) even in the case where the position or orientation of the viewpoint changes in the situation where the information is presented according to the position or orientation of the viewpoint.

  3. Technical Characteristics

[0088] Hereinafter, technical characteristics of the information processing system according to the present disclosure will be described.

3.1. Outline of Processing of Drawing Object as Display Information

[0089] First, an outline of an example of processing of drawing an object having three-dimensional shape information (for example, a virtual object) as two-dimensional display information in a situation where the object is presented to an output unit such as a display as the display information will be described.

[0090] For example, FIG. 6 is an explanatory diagram for describing an outline of an example of processing of drawing an object having three-dimensional shape information as two-dimensional display information. Note that the processing illustrated in FIG. 6 can be implemented by processing called “vertex shader”, for example.

[0091] As illustrated in FIG. 6, in the case of presenting an object having three-dimensional shape information as two-dimensional display information, a target object is projected on a screen surface defined according to the field of view (angle of view) from an observation point with respect to the observation point (that is, viewpoint) in a three-dimensional space. That is, the screen surface can correspond to a projection surface. At this time, a color of the object when the object is drawn as the two-dimensional display information may be calculated according to a positional relationship between a light source defined in the three-dimensional space and the object. The two-dimensional shape of the object, the color of the object, the two-dimensional position at which the object is presented (that is, the position on the screen surface), and the like are calculated according to the relative positional relationship among the observation point, the object, and the screen surface according to the position and orientation of the observation point.
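The projection step described here corresponds to the standard perspective projection of a vertex in observation-point (camera) coordinates onto the screen surface, i.e., the core of a vertex shader. A minimal sketch, with an assumed camera model (looking along +z, focal length and image center in pixels, all values illustrative):

```python
def project_vertex(v_cam, focal_length_px, cx_px, cy_px):
    """Perspective-project a vertex given in observation-point (camera)
    coordinates onto the screen surface. The two-dimensional position
    follows from the relative positional relationship among the
    observation point, the object, and the screen surface."""
    x, y, z = v_cam
    if z <= 0.0:
        raise ValueError("vertex is behind the observation point")
    u = cx_px + focal_length_px * x / z
    v = cy_px + focal_length_px * y / z
    return (u, v)

# A vertex 2 m ahead and 0.25 m to the right of the observation point lands
# 125 px right of the image center with a 1000 px focal length.
print(project_vertex((0.25, 0.0, 2.0), 1000.0, 640.0, 360.0))  # (765.0, 360.0)
```

Projecting all vertices of the object this way yields its two-dimensional shape and presentation position; shading (the color calculation against the light source) is a separate step not sketched here.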

[0092] Note that the example in FIG. 6 illustrates an example of a projection method in a case where a clip surface is located on a back side of the target object, that is, in a case where the object is projected on the screen surface located on the back side. In contrast, as another example of the projection method, there is a method in which the clip surface is located on a front side of the target object, and the object is projected on the screen surface located on the front side. In the following description, the case where the clip surface is located on the back side of the target object will be described as an example, as illustrated in FIG. 6. However, the object projection method is not limited thereto, and the case where the clip surface is located on the front side of the object can be handled similarly unless a technical contradiction arises.

[0093] Next, an example of processing in which the information processing system according to the embodiment of the present disclosure draws a projection result of the object as two-dimensional display information will be described with reference to FIG. 7. FIG. 7 is an explanatory diagram for describing an outline of an example of processing of drawing an object having three-dimensional shape information as two-dimensional display information. Note that the processing illustrated in FIG. 7 can correspond to, for example, processing called “pixel shader”.

[0094] In FIG. 7, reference numeral V111 represents a drawing region in which the projection result of the object is drawn as two-dimensional display information. That is, the drawing region V111 is associated with at least a part of the screen surface illustrated in FIG. 6, and the projection result of the object onto the region is drawn as display information. Furthermore, the drawing region V111 is associated with a display region of an output unit such as a display, and a drawing result of the display information to the drawing region V111 can be presented in the display region. The drawing region V111 may be defined as at least a partial region of a predetermined buffer (for example, a frame buffer or the like) that temporarily or permanently holds data such as the drawing result. Furthermore, the display region itself may be used as the drawing region V111. In this case, the display information is directly drawn in the display region corresponding to the drawing region V111, so that the display information is presented in the display region. Furthermore, in this case, the display region itself can correspond to the screen surface (projection surface).

[0095] Furthermore, reference numeral V113 represents the projection result of the target object, that is, reference numeral V113 corresponds to the display information to be drawn. The drawing of the display information V113 is performed such that the drawing region V111 (in other words, the display region) is divided into a plurality of partial regions V115, and the drawing is performed for each partial region V115, for example. An example of the unit in which the partial region V115 is defined includes a unit region that is obtained by dividing the drawing region V111 to have a predetermined size, such as a scan line or a tile. For example, in the example illustrated in FIG. 7, the partial region V115 including one or more scan lines is defined, where a scan line is defined as a unit region.

[0096] Here, an outline of a flow of processing regarding drawing of the display information V113 for each partial region V115 will be described below.

[0097] In the information processing system according to the embodiment of the present disclosure, first, vertices of a portion of the display information V113, the portion corresponding to the target partial region V115, are extracted. For example, reference numerals V117a to V117d represent the vertices of the portion of the display information V113, the vertices corresponding to the partial region V115. In other words, in a case where the portion corresponding to the partial region V115 is cut out from the display information V113, the vertices of the cutout portion are extracted.

[0098] Next, the recognition result of the position and orientation of the viewpoint at an immediately preceding timing (for example, the latest recognition result) is acquired, and the positions of the vertices V117a to V117d are corrected according to the position and orientation of the viewpoint. More specifically, the target object is reprojected onto the partial region V115 according to the position and orientation of the viewpoint at the immediately preceding timing, and the positions of the vertices V117a to V117d (in other words, the shape of the portion of the display information V113, the portion corresponding to the partial region V115) are corrected according to a result of the reprojection. At this time, color information drawn as the portion corresponding to the partial region V115, of the display information V113, may be updated according to the result of the reprojection. Furthermore, the recognition timing of the position and orientation of the viewpoint used for the reprojection is more desirably a past timing closer to an execution timing of the processing regarding correction. As described above, the information processing system according to the present disclosure performs, for each partial region V115, reprojection of the target object onto the partial region V115 according to the recognition results of the position and orientation of the viewpoint at different timings, and performs the correction according to the result of the reprojection. Note that, hereinafter, the processing regarding correction based on reprojection is also referred to as “reprojection shader”. Furthermore, the processing regarding reprojection of the object corresponds to an example of processing regarding projection of an object to a partial region.
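The per-partial-region flow above can be sketched as follows: each band of scan lines polls the latest viewpoint reading just before it is corrected, so successive bands reflect successively newer poses. The yaw-only correction model, the vertex format, and all names are illustrative assumptions, not the disclosed implementation:

```python
import math

def reproject_partial_region(vertices_px, yaw_at_render_rad, latest_yaw_rad,
                             focal_length_px):
    """Correct the vertices of the portion of display information that falls
    in one partial region, using the most recent viewpoint reading."""
    offset = focal_length_px * math.tan(latest_yaw_rad - yaw_at_render_rad)
    return [(x - offset, y) for x, y in vertices_px]

def draw_frame(bands, yaw_at_render_rad, read_latest_yaw, focal_length_px):
    """bands: one vertex list per partial region, top to bottom.
    read_latest_yaw is polled once per band, immediately before that band is
    corrected and drawn, so later bands use newer recognition results."""
    corrected = []
    for band in bands:
        corrected.append(reproject_partial_region(
            band, yaw_at_render_rad, read_latest_yaw(), focal_length_px))
    return corrected
```

In this sketch, if the viewpoint keeps moving during scan-out, the band drawn last is corrected with a fresher pose than the band drawn first, which is the essence of applying recognition results at timings different from each other.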

[0099] Then, when the processing regarding correction based on reprojection is completed, the processing regarding drawing of the display information V113 according to the correction result in the target partial region V115 is executed. In the example illustrated in FIG. 7, the drawing of the display information V113 in the partial region V115 is sequentially executed for each scan line configuring the partial region V115, for example.

[0100] An outline of an example of the processing of drawing an object having three-dimensional shape information as two-dimensional display information in a situation where the object is presented to an output unit has been described with reference to FIGS. 6 and 7.

3.2. Basic Principle of Processing Regarding Drawing and Presentation of Display Information

[0101] Next, a basic principle of processing of drawing an object as display information and presenting a result of the drawing in the information processing system according to the embodiment of the present disclosure will be described, particularly focusing on a timing when the recognition result of the position and orientation of the viewpoint is reflected. For example, FIG. 8 is an explanatory diagram for describing a basic principle of processing regarding drawing and presentation of display information in the information processing system according to an embodiment of the present disclosure. In FIG. 8, a timing chart illustrated as “Example” corresponds to an example of a timing chart of the processing of drawing an object as display information and presenting a result of the drawing in the information processing system according to the present embodiment. Note that FIG. 8 illustrates examples of timing charts of processing regarding drawing of an object applied in a conventional system as Comparative Examples 1 and 2 so that the characteristics of the information processing system according to the present embodiment can be more easily understood. Therefore, hereinafter, the characteristics of the information processing system according to the present embodiment will be described after an outline of each of Comparative Examples 1 and 2 is described. Note that the timing charts in FIG. 8 illustrate processing executed for each predetermined unit period regarding presentation of information such as a frame. Therefore, hereinafter, description will be given on the assumption that the processing regarding presentation of information is executed using one frame as the unit period, for convenience. However, the unit period regarding execution of the processing is not necessarily limited.

Comparative Example 1

[0102] First, processing according to Comparative Example 1 will be described. The processing according to Comparative Example 1 corresponds to processing regarding drawing and presentation of display information in the case of two-dimensionally correcting the presentation position of the display information according to the position and orientation of the viewpoint, as in the example described with reference to FIG. 5.

[0103] For example, reference numeral t101 represents a start timing of processing regarding presentation of information for each frame in the processing according to Comparative Example 1. Specifically, at timing t101, first, information (for example, a scene graph) regarding a positional relationship between the viewpoint and an object to be drawn (for example, a virtual object) is updated (Scene Update), and the object is projected on the screen surface as two-dimensional display information according to a result of the update (Vertex Shader).

[0104] Then, when the processing regarding projection is completed, drawing of the display information according to a projection result of the object is executed in the frame buffer (Pixel Shader), and the system waits for synchronization of the processing regarding display of the display information in the display region of the output unit (Wait vsync). For example, reference numeral t103 represents a timing when the processing regarding display of the display information in the display region of the output unit is started. As a more specific example, a timing of a vertical synchronization (Vsync) corresponds to an example of the timing t103.

[0105] In the processing according to Comparative Example 1, at the timing t103 when display of the display information in the display region is started, the presentation position of the display information in the display region is corrected on the basis of information regarding a recognition result of the position and orientation of the viewpoint obtained at an immediately preceding timing. The correction amount at this time is calculated such that a reference position among the positions in the depth direction (z direction) (for example, a position of interest) remains consistent with the visually recognized position even as the position or orientation of the viewpoint changes. For example, a timing illustrated by reference numeral IMU schematically represents a timing when the information according to the recognition result of the position or orientation of the viewpoint is acquired. Note that the information acquired at the timing may be favorably information according to the recognition result of the position and orientation of the viewpoint at a timing immediately before that timing, for example.

[0106] Then, when the correction of the presentation position of the display information in the display region is completed, drawing results of the display information held in the frame buffer are sequentially transferred to the output unit, and the display information is displayed in the display region of the output unit according to a result of the correction (Transfer FB to Display).
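As a rough illustration of the flow of Comparative Example 1 described above, the following sketch shows how a single 2D shift computed at the vsync timing t103 is exact only at one reference depth. All function names, the pinhole projection model, and the numbers are illustrative assumptions, not from the disclosure:

```python
def project(obj_x_m, obj_z_m, eye_x_m, focal_px=1000.0):
    """Toy pinhole projection (in pixels) of a point at lateral position
    obj_x_m and depth obj_z_m, seen from an eye shifted laterally by
    eye_x_m (a stand-in for the Vertex Shader projection)."""
    return focal_px * (obj_x_m - eye_x_m) / obj_z_m

def correction_shift_px(ref_z_m, eye_old_m, eye_new_m, focal_px=1000.0):
    """The single 2D shift applied at t103: it moves a point at the
    reference depth ref_z_m to its correct position for the new pose."""
    return (project(0.0, ref_z_m, eye_new_m, focal_px)
            - project(0.0, ref_z_m, eye_old_m, focal_px))
```

For example, with a reference depth of 2 m and a 10 mm lateral movement of the viewpoint, the shift is -5 px, while a point at 1 m depth would need -10 px; the 5 px difference is the depth-dependent gap described in paragraph [0107] below.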

[0107] From the above characteristics, in the processing according to Comparative Example 1, the display information can be presented at a logically correct position, in a period T11 from the timing t101 to the timing t103, regarding the position in the depth direction that is used as the reference for calculating the correction amount of the presentation position of the display information. Meanwhile, even in the period T11, for positions other than the reference position used for calculating the correction amount, there are some cases where a gap occurs between the position at which the display information should originally be presented according to the position and orientation of the viewpoint and the position at which the display information is actually and visually recognized.

[0108] Furthermore, in a case where the viewpoint continues to move at and after the timing t103, the presentation position of the display information in the display region associated with the field of view does not change even though the field of view changes with the movement of the viewpoint. Therefore, in this case, a gap occurs between the position at which the display information should originally be presented according to the position and orientation of the viewpoint after movement and the position at which the display information is actually and visually recognized, regardless of the position in the depth direction, and this gap may become larger in proportion to the elapsed time from the timing t103 at which the correction has been performed.

[0109] For example, reference numeral t105 schematically represents an arbitrary timing during the period in which the drawing results of the display information are sequentially displayed in the display region of the output unit. As a specific example, the timing t105 may correspond to a timing when the display information is displayed on some scan lines. That is, in a period T13 between the timing t103 and the timing t105, the above-described gap of the presentation position of the display information becomes larger in proportion to the length of the period T13.
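The growth of this gap with the length of the period T13 can be illustrated with a toy calculation; the angular velocity and pixels-per-degree figures below are illustrative assumptions, not from the disclosure:

```python
def gap_px(elapsed_s, angular_velocity_deg_s=30.0, px_per_deg=20.0):
    """Gap of the presentation position accumulated since the correction
    at t103; it grows in proportion to the elapsed period (T13)."""
    return angular_velocity_deg_s * elapsed_s * px_per_deg
```

With these numbers, scan lines displayed about 8 ms after t103 would be off by roughly 4.8 px, and twice that after 16 ms.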

Comparative Example 2

[0110] Next, processing according to Comparative Example 2 will be described. The processing according to Comparative Example 2 is different from the processing according to Comparative Example 1 in sequentially monitoring the position and orientation of the viewpoint, and correcting the presentation position of the display information to be sequentially displayed each time according to a monitoring result, even during execution of the processing of sequentially displaying the drawing results of the display information in the display region of the output unit (Transfer FB to Display). That is, as illustrated in FIG. 8, at each of the timings illustrated with reference numeral IMU, the presentation position of the display information to be displayed in the display region at that timing is corrected according to the recognition result of the position and orientation of the viewpoint acquired at that timing. An example of the technology regarding the correction applied in the processing of Comparative Example 2 includes a technology called "raster correction".

[0111] Here, an outline of processing according to Comparative Example 2 will be described giving a specific example with reference to FIG. 9. FIG. 9 is an explanatory diagram for describing an example of the processing regarding correction of the presentation position of the display information, and illustrates an example of the case of sequentially monitoring a change in the position and orientation of the viewpoint, and correcting the presentation position of display information V133 in the display region according to the monitoring result. Note that, in the example illustrated in FIG. 9, the case of presenting an image of a virtual object as the display information V133 to be superimposed on the real object M101 in the real space according to the position and orientation of the viewpoint P101 on the basis of the AR technology is assumed, similarly to the example described with reference to FIG. 5. Furthermore, in the example illustrated in FIG. 9, the position and orientation of the viewpoint P101 are assumed to further change in a series of flow where the display information V133 is rendered according to a recognition result of the position and orientation of the viewpoint P101, and then the display information V133 is displayed in the display region, similarly to the example described with reference to FIG. 5.

[0112] In FIG. 9, reference numeral V121b schematically represents a video within the field of view from the viewpoint P101 at the start of rendering the display information V133. Furthermore, reference numeral B131 schematically represents a frame buffer. That is, the display information V133 is drawn in the frame buffer B131 according to the previously acquired recognition result of the position and orientation of the viewpoint P101.

[0113] Each of reference numerals V133a to V133c schematically represents display information corresponding to a part of the display information V133. As a specific example, each of the display information V133a to V133c corresponds to a portion of the display information V133, the portion corresponding to each partial region, in the case of sequentially displaying the display information V133 for each partial region including one or more scan lines in the display region. Furthermore, each of reference numerals V131a, V131b, V131c, and V131h schematically represents a video in the field of view from the viewpoint P101 in each process from when the processing regarding display of the display information V133 in the display region is started to when the processing regarding display is completed. That is, the video in the field of view is sequentially updated as illustrated as the videos V131a, V131b, and V131c, by sequentially displaying the display information V133a, V133b, and V133c for each partial region in the display region. Furthermore, a video V131h corresponds to a video in the field of view at the timing when the display of the display information V133 is completed.

[0114] Furthermore, in the example illustrated in FIG. 9, a video in the field of view from the viewpoint P101 at the timing when display of the display information V133 is completed, in a case where correction is performed by a method similar to the case described with reference to FIG. 5, is illustrated as a video V121d, for comparison.

[0115] As illustrated in FIG. 9, in the processing according to Comparative Example 2, when a part of the display information V133 corresponding to a partial region is displayed for each partial region in the display region, the presentation position of each part is corrected according to the recognition result of the position and orientation of the viewpoint P101 acquired at an immediately preceding timing. That is, at a timing when the display information V133a is displayed (in other words, a timing when the video V131a is displayed), the presentation position of the display information V133a in the display region is corrected according to the position and orientation of the viewpoint P101 acquired at a timing immediately preceding that timing.

[0116] Specifically, in FIG. 8, reference numeral t111 represents a start timing of processing regarding presentation of information for each frame in the processing according to Comparative Example 2. Furthermore, reference numeral t113 schematically represents an arbitrary timing during the period in which the drawing results of the display information are sequentially displayed in the display region of the output unit, and corresponds to the timing t105 in the processing according to Comparative Example 1. According to the processing of Comparative Example 2, the display information can be presented at a logically correct position, in a period T15 from the timing t111 to the timing t113, regarding the position in the depth direction that is used as the reference for calculating the correction amount of the presentation position of the display information. That is, according to the processing of Comparative Example 2, a gap of the presentation position of the display information V133 with the movement of the viewpoint P101 can be suppressed, as compared with Comparative Example 1. Meanwhile, even in the case where the processing according to Comparative Example 2 is applied, for positions in the depth direction other than the reference position used for calculating the correction amount, a gap still occurs between the position at which the display information should originally be presented according to the position and orientation of the viewpoint and the position at which the display information is actually and visually recognized.
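The per-partial-region correction of Comparative Example 2 can be sketched as follows, under the same toy pinhole model and hypothetical function names as before (assumptions for illustration, not from the disclosure). Each block of scan lines is shifted with a viewpoint pose sampled immediately before that block is scanned out, but the shift is still a 2D translation computed for a single reference depth:

```python
def raster_corrected_px(drawn_px, ref_z_m, eye_at_draw_m, eye_samples_m,
                        focal_px=1000.0):
    """For each partial region, shift the drawn position by a 2D
    correction computed from the viewpoint pose sampled at that
    region's display timing (the IMU timings in FIG. 8).
    eye_samples_m holds one lateral eye position per partial region."""
    return [drawn_px + focal_px * (eye_at_draw_m - eye_now) / ref_z_m
            for eye_now in eye_samples_m]
```

Because every partial region uses a freshly sampled pose, the gap no longer grows over the frame as in Comparative Example 1; however, the depth-dependent residual error of using a single reference depth remains.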

Example

[0117] Next, as processing according to Example, processing of drawing an object as display information and presenting a result of the drawing by the information processing system according to the embodiment of the present disclosure will be described. In FIG. 8, reference numeral t121 represents a start timing of processing regarding presentation of information for each frame in the processing according to Example. Furthermore, reference numeral t123 schematically represents an arbitrary timing during the period in which the drawing results of the display information are sequentially displayed in the display region of the output unit, and corresponds to the timing t105 in the processing according to Comparative Example 1 and the timing t113 in the processing according to Comparative Example 2. Furthermore, in FIG. 8, processing illustrated with reference numeral PS corresponds to the processing regarding drawing of the display information according to the projection result of the object (Pixel Shader) in the processing according to Comparative Example 1 and Comparative Example 2, and corresponds to the processing described with reference to FIG. 7 in the present example.

[0118] As described above with reference to FIG. 7, the information processing system according to the present embodiment divides the display region into a plurality of partial regions, and draws the display information for each partial region. At this time, the information processing system reprojects the target object in the partial region on the basis of the recognition result of the position and orientation of the viewpoint acquired at an immediately preceding timing (reprojection shader), and the result of the reprojection is reflected in the drawing of the display information in the partial region. That is, when the processing regarding drawing and presentation of the display information is executed for each partial region, the reprojection of the target object is performed each time according to the recognition result of the position and orientation of the viewpoint. For example, the timing illustrated by reference numeral IMU schematically represents the timing when the information according to the recognition result of the position and orientation of the viewpoint is acquired.

[0119] Specifically, at timing t121, first, information (for example, a scene graph) regarding the positional relationship between the viewpoint and the object to be drawn (for example, a virtual object) is updated (Scene Update), and the object is projected on the screen surface as two-dimensional display information according to a result of the update (Vertex Shader). Next, when the processing regarding projection is completed, synchronization regarding display of the display information in the display region of the output unit is performed (Wait vsync). Furthermore, the information processing system according to the present embodiment executes the processing regarding acquisition of the information regarding the recognition result of the position and orientation of the viewpoint, reprojection of the object according to the recognition result for each partial region, and drawing of the display information based on a result of the reprojection in the partial region (Pixel Shader) in parallel to the above processing.
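A minimal sketch of this per-partial-region flow follows; the function names and the toy projection model are assumptions for illustration, not the actual reprojection shader of the disclosure. Instead of shifting an already-drawn buffer, each partial region re-projects the object with the newest viewpoint pose before the display information is drawn, so the result is consistent at every depth rather than only at one reference depth:

```python
def project(obj_x_m, obj_z_m, eye_x_m, focal_px=1000.0):
    """Toy pinhole projection in pixels (stand-in for Vertex Shader)."""
    return focal_px * (obj_x_m - eye_x_m) / obj_z_m

def draw_frame(objects, partial_region_poses_m, focal_px=1000.0):
    """objects: list of (x_m, z_m) points to present.
    partial_region_poses_m: the viewpoint pose sampled just before each
    partial region is drawn (the IMU timings).  Each partial region
    re-projects the objects with its own fresh pose (reprojection
    shader) and then draws the result (Pixel Shader)."""
    return [[project(x, z, eye_now, focal_px) for (x, z) in objects]
            for eye_now in partial_region_poses_m]
```

Note that, in this sketch, both a near point (1 m) and a far point (2 m) land at their correct positions for each partial region's pose, which the single-shift corrections of the comparative examples cannot guarantee.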

……
……
……
