Sony Patent | Signal processing device and method, and program

Editor: 映维 | Category: Sony | May 19, 2022

Patent: Signal processing device and method, and program

Publication Number: 20220159402

Publication Date: 20220519

Applicant: Sony

Abstract

The present technology relates to a signal processing device, a signal processing method, and a program that enable more efficient sound reproduction. A signal processing device includes an order determination unit that determines an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener, a rotation operation unit that rotates a head-related transfer function of a spherical harmonic domain by the operation in which the rotation matrix is limited by the order, and a synthesis unit that generates a headphone drive signal by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain. The present technology can be applied to an audio processing device.

Claims

1. A signal processing device comprising: an order determination unit that determines an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener; a rotation operation unit that rotates a head-related transfer function of a spherical harmonic domain by the operation in which the rotation matrix is limited by the order; and a synthesis unit that generates a headphone drive signal by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain.
2. The signal processing device according to claim 1, wherein the order determination unit determines the order by setting an allowable value of an error of the operation related to the rotation matrix or setting an upper limit of the operation amount.
3. The signal processing device according to claim 2, wherein the order determination unit obtains a degree of Taylor expansion in which a truncation error when the rotation matrix is Taylor expanded is equal to or less than an allowable error corresponding to the allowable value, and determines the order on a basis of the degree of the Taylor expansion.
4. The signal processing device according to claim 2, wherein the order determination unit determines the order for each time frequency.
5. The signal processing device according to claim 2, wherein the order determination unit determines the order for each degree of the spherical harmonic domain.
6. The signal processing device according to claim 2, wherein the rotation operation unit performs an operation of rotating the head-related transfer function by the rotation matrix only for an element of the order in a predetermined range as the operation in which the rotation matrix is limited by the order.
7. The signal processing device according to claim 2, wherein for a rotation operation of the head-related transfer function with respect to at least one rotation direction, the rotation operation unit obtains the head-related transfer function after the rotation at a predetermined time by performing the rotation operation at the predetermined time using an operation result of the rotation operation in the rotation direction at another time before the predetermined time.
8. The signal processing device according to claim 7, wherein the rotation operation unit performs the rotation operation in the rotation direction at the predetermined time on a basis of a rotation matrix according to a difference between a rotation angle in the rotation direction of the head of the listener at the predetermined time and a rotation angle in the rotation direction of the head of the listener at the another time, and an operation result of the rotation operation in the rotation direction at the another time.
9. The signal processing device according to claim 8, wherein the rotation operation unit performs the rotation operation only for an element of the order in a predetermined range as the operation in which the rotation matrix is limited by the order.
10. The signal processing device according to claim 8, wherein the rotation operation unit performs the rotation operation in the rotation direction at the predetermined time using an operation result of the rotation operation in the rotation direction at the another time for an elevation angle direction as the rotation direction.
11. The signal processing device according to claim 8, wherein the rotation operation unit performs, in a case where reset of the rotation matrix is not performed, the rotation operation in the rotation direction at the predetermined time by using an operation result of the rotation operation in the rotation direction at the another time, and performs, in a case where the reset of the rotation matrix is performed, the rotation operation in the rotation direction at the predetermined time on a basis of a rotation matrix according to a rotation angle in the rotation direction of the head of the listener at the predetermined time and the head-related transfer function.
12. The signal processing device according to claim 11, wherein the order determination unit performs the reset on a basis of an upper limit of accumulation of the allowable value at each time.
13. The signal processing device according to claim 11, wherein the reset is performed for each degree, for each order, or for each time frequency.
14. The signal processing device according to claim 11, wherein in a case where the headphone drive signal is generated for each of a plurality of the listeners, the reset is performed for each of the listeners.
15. The signal processing device according to claim 1, wherein in a case where a rotation matrix for performing rotation in a predetermined rotation direction constituting the rotation matrix corresponding to the head rotation is represented by a sum of a plurality of matrices, the rotation operation unit performs an operation of rotating the head-related transfer function by using a sum of several matrices among the plurality of the matrices as a rotation matrix for performing rotation in the predetermined rotation direction as the operation in which the rotation matrix is limited by the order.
16. A signal processing method comprising: by a signal processing device, determining an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener; rotating a head-related transfer function of a spherical harmonic domain by the operation in which the rotation matrix is limited by the order; and generating a headphone drive signal by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain.
17. A program for causing a computer to execute processing comprising steps of: determining an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener; rotating a head-related transfer function of a spherical harmonic domain by the operation in which the rotation matrix is limited by the order; and generating a headphone drive signal by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain.

Description

TECHNICAL FIELD

[0001] The present technology relates to a signal processing device and method, and a program, and more particularly relates to a signal processing device and method, and a program that enable more efficient sound reproduction.

BACKGROUND ART

[0002] In recent years, systems for recording, transmitting, and reproducing spatial information from the entire periphery have been developed and have come into widespread use in the field of audio. For example, in Super Hi-Vision, broadcasting with three-dimensional 22.2 multi-channel audio is planned.

[0003] Furthermore, in the field of virtual reality, devices that reproduce audio surrounding the entire periphery, in addition to video surrounding the entire periphery, are becoming widespread.

[0004] Among these, a method of expressing three-dimensional audio information called Ambisonics, which can flexibly support any recording and reproduction system, is attracting attention. In particular, Ambisonics whose degree is second or higher is called higher order Ambisonics (HOA) (see, for example, Non-Patent Document 1).

[0005] In a three-dimensional multichannel sound, sound information spreads on a spatial axis in addition to a time axis, and in the Ambisonics, frequency conversion, that is, spherical harmonic transform is performed with respect to an angular direction of three-dimensional polar coordinates to retain information. The spherical harmonic transform can be considered to correspond to time-frequency transform with respect to the time axis of the audio signal.

[0006] An advantage of this method is that information can be encoded and decoded from any microphone array to any speaker array without limiting the number of microphones or speakers.

[0007] On the other hand, factors that hinder the spread of the Ambisonics include the need for a speaker array including a large number of speakers in a reproduction environment, and a narrow range (sweet spot) in which a sound space can be reproduced.

[0008] For example, in order to increase the spatial resolution of sound, a speaker array including more speakers is required, but it is unrealistic to make such a system at home or the like. Furthermore, in a space such as a movie theater, an area where a sound space can be reproduced is narrow, and it is difficult to give a desired effect to all spectators.

CITATION LIST

Non-Patent Document

[0009] Non-Patent Document 1: Jerome Daniel, Rozenn Nicol, Sebastien Moreau, “Further Investigations of High Order Ambisonics and Wavefield Synthesis for Holophonic Sound Imaging,” AES 114th Convention, Amsterdam, Netherlands, 2003.

SUMMARY OF THE INVENTION

Problems to be Solved by the Invention

[0010] Accordingly, it is conceivable to combine the Ambisonics and binaural reproduction technology. The binaural reproduction technology is generally called a virtual auditory display (VAD), and is achieved by using a head-related transfer function (HRTF).

[0011] Here, the head-related transfer function represents information regarding how sound is transmitted from all directions surrounding the human head to the eardrums of both ears as a function of a frequency and an arrival direction.

[0012] When a sound obtained by synthesizing (convolving) the head-related transfer function for a certain direction with a target sound is presented over headphones, the listener perceives the sound as coming not from the headphones but from the direction of the head-related transfer function used. The VAD is a system based on this principle.

[0013] By reproducing a plurality of virtual speakers using the VAD, headphone presentation can achieve the same effect as Ambisonics reproduced by a speaker array system including a large number of speakers, which is difficult to build in reality.

[0014] However, such a system has not been able to reproduce sound sufficiently efficiently. For example, in a case where Ambisonics and the binaural reproduction technology are combined, not only does the operation amount, such as that of the convolution operation of the head-related transfer function, increase, but the amount of memory used for the operation also increases.

[0015] The present technology has been made in view of such a situation, and is intended to enable sound to be reproduced more efficiently.

Solutions to Problems

[0016] A signal processing device according to one aspect of the present technology includes an order determination unit that determines an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener, a rotation operation unit that rotates a head-related transfer function of a spherical harmonic domain by the operation in which the rotation matrix is limited by the order, and a synthesis unit that generates a headphone drive signal by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain.

[0017] A signal processing method or a program according to one aspect of the present technology includes determining an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener, rotating a head-related transfer function of a spherical harmonic domain by the operation in which the rotation matrix is limited by the order, and generating a headphone drive signal by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain.

[0018] In one aspect of the present technology, an order for limiting an operation amount of an operation related to a rotation matrix corresponding to head rotation of a listener is determined, a head-related transfer function of a spherical harmonic domain is rotated by the operation in which the rotation matrix is limited by the order, and a headphone drive signal is generated by synthesizing the head-related transfer function after rotation obtained by the operation with a sound signal in the spherical harmonic domain.

BRIEF DESCRIPTION OF DRAWINGS

[0019] FIG. 1 is a diagram describing simulation of stereophonic sound using a head-related transfer function.

[0020] FIG. 2 is a diagram describing calculation of a drive signal in a first method.

[0021] FIG. 3 is a diagram describing calculation of a drive signal in a case of performing head tracking.

[0022] FIG. 4 is a diagram describing calculation of a drive signal in a second method.

[0023] FIG. 5 is a diagram describing calculation of a drive signal in a third method.

[0024] FIG. 6 is a diagram describing an operation amount and a necessary memory amount.

[0025] FIG. 7 is a diagram describing calculation of a drive signal in a fourth method.

[0026] FIG. 8 is a diagram describing a rotation matrix.

[0027] FIG. 9 is a diagram describing the rotation matrix.

[0028] FIG. 10 is a diagram describing the rotation matrix.

[0029] FIG. 11 is a diagram illustrating a configuration example of an audio processing device.

[0030] FIG. 12 is a diagram describing a difference in an elevation angle direction.

[0031] FIG. 13 is a flowchart describing drive signal generation processing.

[0032] FIG. 14 is a diagram illustrating a configuration example of an audio processing device.

[0033] FIG. 15 is a flowchart describing drive signal generation processing.

[0034] FIG. 16 is a diagram illustrating a configuration example of a control system.

[0035] FIG. 17 is a diagram describing a reset and an operation amount.

[0036] FIG. 18 is a diagram describing resetting for each degree.

[0037] FIG. 19 is a diagram describing resetting for each time frequency.

[0038] FIG. 20 is a diagram illustrating a configuration example of a control system.

[0039] FIG. 21 is a diagram illustrating a configuration example of an audio processing device.

[0040] FIG. 22 is a flowchart describing drive signal generation processing.

[0041] FIG. 23 is a diagram describing a reset timing.

[0042] FIG. 24 is a diagram describing a reset timing.

[0043] FIG. 25 is a diagram illustrating a configuration example of an audio processing device.

[0044] FIG. 26 is a flowchart describing drive signal generation processing.

[0045] FIG. 27 is a diagram illustrating a configuration example of an audio processing device.

[0046] FIG. 28 is a flowchart describing drive signal generation processing.

[0047] FIG. 29 is a diagram describing setting of an allowable error for each time frequency.

[0048] FIG. 30 is a diagram describing setting of an allowable error according to a degree.

[0049] FIG. 31 is a diagram illustrating a configuration example of a computer.

MODE FOR CARRYING OUT THE INVENTION

[0050] Hereinafter, embodiments to which the present technology is applied will be described with reference to the drawings.

First Embodiment

[0051]

[0052] The present technology obtains a head-related transfer function in a spherical harmonic domain according to the rotation of the head using stacking of minute rotations, and synthesizes the head-related transfer function with an input signal of sound to be reproduced in the spherical harmonic domain, thereby achieving a more efficient reproduction system in terms of the operation amount and the memory usage.

[0053] For example, the spherical harmonic transform of a function $f(\theta, \phi)$ on spherical coordinates is expressed by the following Equation (1).

[Equation 1]

$$F_n^m = \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi)\, \overline{Y_n^m(\theta, \phi)}\, \sin\theta \, d\theta \, d\phi \qquad (1)$$

[0054] In Equation (1), $\theta$ and $\phi$ represent the elevation angle and the horizontal angle in spherical coordinates, respectively, and $Y_n^m(\theta, \phi)$ represents the spherical harmonics. Furthermore, the overline above the spherical harmonics $Y_n^m(\theta, \phi)$ denotes the complex conjugate of $Y_n^m(\theta, \phi)$.

[0055] Here, the spherical harmonics $Y_n^m(\theta, \phi)$ are expressed by the following Equation (2).

[Equation 2]

$$Y_n^m(\theta, \phi) = (-1)^m \sqrt{\frac{(2n+1)(n-m)!}{4\pi (n+m)!}}\, P_n^m(\cos\theta)\, e^{im\phi} \qquad (2)$$

[0056] In Equation (2), $n$ and $m$ represent the degree and the order of the spherical harmonics $Y_n^m(\theta, \phi)$, where $-n \le m \le n$. The order $m$ is also referred to as a period or the like, and hereinafter, when it is not necessary to particularly distinguish $n$ and $m$, the degree $n$ and the order $m$ will also be collectively referred to as a degree.

[0057] Furthermore, in Equation (2), $i$ represents the imaginary unit, and $P_n^m(x)$ is an associated Legendre function.

[0058] When $n \ge 0$ and $0 \le m \le n$, the associated Legendre function $P_n^m(x)$ is expressed by the following Equation (3) or (4). Note that Equation (3) is the case where $m = 0$.

[Equation 3]

$$P_n^0(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n \qquad (3)$$

[Equation 4]

$$P_n^m(x) = (1 - x^2)^{m/2} \frac{d^m}{dx^m} P_n^0(x) \qquad (4)$$

[0059] Furthermore, in a case where $-n \le m \le 0$, the associated Legendre function $P_n^m(x)$ is expressed by the following Equation (5).

[Equation 5]

$$P_n^m(x) = (-1)^{-m} \frac{(n+m)!}{(n-m)!} P_n^{-m}(x) \qquad (5)$$

[0060] Moreover, the inverse transform from the function $F_n^m$ obtained by the spherical harmonic transform to the function $f(\theta, \phi)$ on the spherical coordinates is represented by the following Equation (6).

[Equation 6]

$$f(\theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F_n^m\, Y_n^m(\theta, \phi) \qquad (6)$$
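As a rough illustration of Equations (1) to (5), the following Python sketch evaluates the spherical harmonics from the associated Legendre function and approximates the transform of Equation (1) numerically. The function names (`assoc_legendre`, `sph_harm`, `sh_transform`), the recurrence used in place of the derivative formulas, and the midpoint quadrature are assumptions made for this example and are not taken from the present technology. The angle θ is treated as measured from the z axis, which the sin θ weight in Equation (1) suggests.

```python
import cmath
import math


def assoc_legendre(n: int, m: int, x: float) -> float:
    """Associated Legendre function P_n^m(x) in the sense of Equations (3)-(5).

    The (-1)^m factor is not included here because Equation (2) carries it.
    """
    if m < 0:
        # Equation (5): express a negative order through the positive one.
        k = -m
        scale = (-1) ** k * math.factorial(n - k) / math.factorial(n + k)
        return scale * assoc_legendre(n, k, x)
    if m > n:
        return 0.0
    # Closed form for n == m: P_m^m(x) = (2m - 1)!! (1 - x^2)^(m/2).
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= (2 * k - 1) * math.sqrt(max(0.0, 1.0 - x * x))
    if n == m:
        return pmm
    # Upward recurrence in the degree for a fixed order m.
    prev, cur = pmm, x * (2 * m + 1) * pmm
    for k in range(m + 2, n + 1):
        prev, cur = cur, ((2 * k - 1) * x * cur - (k + m - 1) * prev) / (k - m)
    return cur


def sph_harm(n: int, m: int, theta: float, phi: float) -> complex:
    """Spherical harmonic Y_n^m(theta, phi) following Equation (2)."""
    norm = math.sqrt((2 * n + 1) * math.factorial(n - m)
                     / (4 * math.pi * math.factorial(n + m)))
    legendre = assoc_legendre(n, m, math.cos(theta))
    return (-1) ** m * norm * legendre * cmath.exp(1j * m * phi)


def sh_transform(f, n: int, m: int, steps: int = 64) -> complex:
    """Approximate F_n^m of Equation (1) with a midpoint rule (an assumption)."""
    d_theta = math.pi / steps
    d_phi = math.pi / steps  # 2*steps samples cover the range 0 to 2*pi
    total = 0j
    for a in range(steps):
        theta = (a + 0.5) * d_theta
        for b in range(2 * steps):
            phi = (b + 0.5) * d_phi
            y_conj = sph_harm(n, m, theta, phi).conjugate()
            total += f(theta, phi) * y_conj * math.sin(theta)
    return total * d_theta * d_phi
```

Applying `sh_transform` to a function that is itself a single spherical harmonic should give a coefficient close to 1 for the matching degree and order and close to 0 for the others, which is the orthonormality that makes the inverse transform of Equation (6) work.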

[0061] From the above, the conversion from the sound input signal $D_n^{\prime m}(\omega)$ after radial correction, which is retained in the spherical harmonic domain, into the speaker drive signals $S(x_i, \omega)$ of $L$ speakers arranged on a spherical surface with a radius $R$ is represented by the following Equation (7).

[Equation 7]

$$S(x_i, \omega) = \sum_{n=0}^{N} \sum_{m=-n}^{n} D_n^{\prime m}(\omega)\, Y_n^m(\theta_i, \phi_i) \qquad (7)$$

[0062] Note that in Equation (7), $x_i$ represents the position of the speaker, and $\omega$ represents the time frequency of the sound signal. The input signal $D_n^{\prime m}(\omega)$ is a sound signal corresponding to each degree $n$ and order $m$ of the spherical harmonics for a given time frequency $\omega$.

[0063] Furthermore, $x_i = (R \sin\theta_i \cos\phi_i,\ R \sin\theta_i \sin\phi_i,\ R \cos\theta_i)$, where $i$ is a speaker index for specifying a speaker. Here, $i = 1, 2, \ldots, L$, and $\theta_i$ and $\phi_i$ respectively represent the elevation angle and the horizontal angle indicating the position of the $i$-th speaker.

[0064] The transform expressed by Equation (7) is a spherical harmonic inverse transform corresponding to Equation (6). Furthermore, in a case where the speaker drive signal $S(x_i, \omega)$ is obtained by Equation (7), the number $L$ of reproduction speakers and the degree $N$ of the spherical harmonics, that is, the maximum value $N$ of the degree $n$, need to satisfy the relationship represented in the following Equation (8).

[Equation 8]

$$L > (N+1)^2 \qquad (8)$$
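The relationship of Equation (8) can be checked with a few lines of Python. This is a minimal sketch; the helper names `num_sh_coefficients` and `enough_speakers` are hypothetical and introduced only for illustration. It counts the degree and order pairs up to the maximum degree N, which gives (N+1)², and compares that count with the number of speakers L.

```python
def num_sh_coefficients(max_degree: int) -> int:
    """Count the pairs (n, m) with 0 <= n <= N and -n <= m <= n."""
    return sum(2 * n + 1 for n in range(max_degree + 1))  # equals (N + 1) ** 2


def enough_speakers(num_speakers: int, max_degree: int) -> bool:
    """Check the relationship L > (N + 1)^2 stated in Equation (8)."""
    return num_speakers > num_sh_coefficients(max_degree)
```

For example, a maximum degree of N = 3 gives 16 coefficients, so by Equation (8) at least 17 speakers would be needed, which illustrates why higher spatial resolution quickly demands large speaker arrays.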

[0065] Meanwhile, a general method for simulating stereophonic sound at the ears by headphone presentation is, for example, a method using the head-related transfer function as illustrated in FIG. 1.

[0066] In the example illustrated in FIG. 1, the input Ambisonics signal is decoded, and respective speaker drive signals of virtual speakers SP11-1 to SP11-8, which are a plurality of virtual speakers, are generated. The signal decoded at this time corresponds to, for example, the above-described input signal $D_n^{\prime m}(\omega)$.

[0067] Here, the virtual speakers SP11-1 to SP11-8 are virtually arranged in a ring, and the speaker drive signal of each virtual speaker is obtained by the above-described calculation of Equation (7). Note that, hereinafter, the virtual speakers SP11-1 to SP11-8 will also be simply referred to as the virtual speakers SP11 in a case where it is not particularly necessary to distinguish them.

[0068] When the speaker drive signals of the respective virtual speakers SP11 are obtained in this manner, left and right drive signals (binaural signals) of headphones HD11 that actually reproduce sound are generated by the convolution operation using the head-related transfer function for each of the virtual speakers SP11. Then, the sum of the respective drive signals of the headphones HD11 obtained for the respective virtual speakers SP11 is set as the final drive signal.

[0069] Note that such a method is described in detail in, for example, “ADVANCED SYSTEM OPTIONS FOR BINAURAL RENDERING OF AMBISONIC FORMAT (Gerald Enzner et al., ICASSP 2013)” or the like.

[0070] The head-related transfer function $H(x, \omega)$ used to generate the left and right drive signals of the headphones HD11 is obtained by normalizing a transfer characteristic $H_1(x, \omega)$ from a sound source position $x$ to an eardrum position of the user, in a state where the head of the user who is a listener exists in free space, by a transfer characteristic $H_0(x, \omega)$ from the sound source position $x$ to a head center $O$ in a state where the head does not exist. That is, the head-related transfer function $H(x, \omega)$ for the sound source position $x$ is obtained by the following Equation (9).

[Equation 9]

$$H(x, \omega) = \frac{H_1(x, \omega)}{H_0(x, \omega)} \qquad (9)$$

[0071] Here, by convolving the head-related transfer function $H(x, \omega)$ into any audio signal and presenting the audio signal with headphones or the like, it is possible to give the listener a perceptual illusion as if the sound is heard from the direction of the convolved head-related transfer function $H(x, \omega)$, that is, the direction of the sound source position $x$.

[0072] In the example illustrated in FIG. 1, the left and right drive signals of the headphones HD11 are generated using such a principle.

[0073] Specifically, the position of each virtual speaker SP11 is set as a position $x_i$, and the speaker drive signal of each virtual speaker SP11 is set as $S(x_i, \omega)$.

[0074] Furthermore, the number of virtual speakers SP11 is $L$ (here, $L = 8$), and the final left and right drive signals of the headphones HD11 are $P_l$ and $P_r$, respectively.

[0075] In this case, when the speaker drive signal $S(x_i, \omega)$ is simulated by the presentation of the headphones HD11, the left and right drive signals $P_l$ and $P_r$ of the headphones HD11 can be obtained by calculating the following Equation (10).

[Equation 10]

$$P_l = \sum_{i=1}^{L} S(x_i, \omega)\, H_l(x_i, \omega), \qquad P_r = \sum_{i=1}^{L} S(x_i, \omega)\, H_r(x_i, \omega) \qquad (10)$$

[0076] Note that, in Equation (10), $H_l(x_i, \omega)$ and $H_r(x_i, \omega)$ represent normalized head-related transfer functions from the position $x_i$ of the virtual speaker SP11 to the left and right eardrum positions of the listener, respectively.

[0077] By such operation, an input signal $D_n^{\prime m}(\omega)$ in the spherical harmonic domain can be finally reproduced by headphone presentation. That is, it is possible to achieve the same effect as Ambisonics by headphone presentation.

[0078] Note that in the following description, the drive signal $P_l$ and the drive signal $P_r$ for the time frequency $\omega$ will also be simply referred to as drive signals $P(\omega)$ in a case where it is not particularly necessary to distinguish the drive signal $P_l$ and the drive signal $P_r$. Furthermore, hereinafter, in a case where it is not particularly necessary to distinguish the head-related transfer function $H_l(x_i, \omega)$ and the head-related transfer function $H_r(x_i, \omega)$ from each other, they will also be simply referred to as head-related transfer functions $H(x_i, \omega)$.

[0079] Moreover, in the following description, a method of combining Ambisonics and binaural reproduction technology described above will also be referred to as a first method.

[0080] In the first method, for example, the operation illustrated in FIG. 2 is performed to obtain the drive signal $P(\omega)$ of $1 \times 1$, that is, one row and one column.

[0081] In FIG. 2, $H(\omega)$ represents a $1 \times L$ vector (matrix) including $L$ head-related transfer functions $H(x_i, \omega)$. Furthermore, $D'(\omega)$ represents a vector including the input signals $D_n^{\prime m}(\omega)$, and when the number of input signals $D_n^{\prime m}(\omega)$ in a bin of the same time frequency $\omega$ is $K$, the vector $D'(\omega)$ is $K \times 1$. Moreover, $Y(x)$ represents a matrix including the spherical harmonics $Y_n^m(\theta_i, \phi_i)$ of each degree, and the matrix $Y(x)$ is an $L \times K$ matrix.

[0082] Therefore, in the first method, a matrix (vector) $S$ is obtained from a matrix operation of the $L \times K$ matrix $Y(x)$ and the $K \times 1$ vector $D'(\omega)$, and a matrix operation of the matrix $S$ and the $1 \times L$ vector (matrix) $H(\omega)$ is further performed to obtain one drive signal $P(\omega)$.
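The two matrix operations of the first method can be sketched in plain Python as follows. The function names and the toy numbers are assumptions for illustration only; real use would supply measured head-related transfer functions and spherical harmonics evaluated at the virtual speaker positions.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def drive_signal_first_method(h_row, y_matrix, d_vec):
    """Compute P(omega) = H(omega) (Y(x) D'(omega)) for one ear and one bin.

    h_row    : 1 x L head-related transfer functions H(x_i, omega)
    y_matrix : L x K spherical harmonics Y_n^m(theta_i, phi_i)
    d_vec    : K x 1 input signals D'_n^m(omega)
    """
    s = matvec(y_matrix, d_vec)  # L virtual-speaker drive signals, Equation (7)
    return sum(h * s_i for h, s_i in zip(h_row, s))  # Equation (10)


# Toy values only (L = 2 speakers, K = 3 coefficients); not real HRTF data.
Y = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
D = [2.0, 3.0, 4.0]
H = [0.5, 1j]
P = drive_signal_first_method(H, Y, D)
```

The intermediate vector `s` has one entry per virtual speaker, so both the work and the storage of this method grow with the number of virtual speakers L as well as with the number of coefficients K.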

[0083] Furthermore, in a case where the head of the listener wearing the headphones HD11 rotates in a predetermined direction represented by a rotation matrix $g_j$ (hereinafter, also referred to as a direction $g_j$), for example, a drive signal $P_l(g_j, \omega)$ of the left headphone of the headphones HD11 is represented by the following Equation (11).

[Equation 11]

$$P_l(g_j, \omega) = \sum_{i=1}^{L} S(x_i, \omega)\, H_l(g_j^{-1} x_i, \omega) \qquad (11)$$

[0084] Note that the rotation matrix $g_j$ is a three-dimensional, that is, $3 \times 3$ rotation matrix represented by $\alpha$, $\beta$, and $\gamma$, which are the rotation angles of Euler angles. Furthermore, in Equation (11), the drive signal $P_l(g_j, \omega)$ indicates the drive signal $P_l$ described above, and here it is written as $P_l(g_j, \omega)$ in order to clarify the position, that is, the direction $g_j$ and the time frequency $\omega$.

[0085] In this case, it is only required to acquire the rotation direction of the head of the listener, that is, the direction $g_j$ of the head of the listener, by some sensor, and to calculate the left and right drive signals of the headphones HD11 by using, among the plurality of head-related transfer functions, the head-related transfer function of the relative direction $g_j^{-1} x_i$ of each virtual speaker SP11 viewed from the head of the listener. Thus, as in a case of using a real speaker, even in a case where sound is reproduced by the headphones HD11, a sound image position viewed from the listener can be fixed in the space.
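The head-tracking calculation of Equation (11) can be sketched as below. This is a simplified illustration under stated assumptions: the head rotation is restricted to a rotation about the vertical axis (the text does not fix an Euler angle convention), and `hrtf_left` is a hypothetical callable that returns the left-ear transfer function value for a given direction vector at the current time frequency.

```python
import math


def yaw_rotation(alpha):
    """3 x 3 rotation about the z axis by alpha radians (yaw only, an assumption)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def relative_direction(g, x):
    """Return g^{-1} x; for a rotation matrix the inverse equals the transpose."""
    return [sum(g[r][c] * x[r] for r in range(3)) for c in range(3)]


def left_drive_signal(g, speaker_positions, speaker_signals, hrtf_left):
    """Equation (11): sum over speakers of S(x_i) * H_l(g^{-1} x_i).

    hrtf_left is a hypothetical lookup mapping a direction vector to a
    (possibly complex) gain for the current frequency bin.
    """
    return sum(s * hrtf_left(relative_direction(g, x))
               for x, s in zip(speaker_positions, speaker_signals))
```

With this convention, when the head turns by 90 degrees about the vertical axis, a virtual speaker at (1, 0, 0) is seen from the head at (0, -1, 0), so the transfer function for that relative direction is the one used. The virtual speaker therefore stays fixed in space while the head moves.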

[0086]

[0087] Furthermore, the convolution of the head-related transfer function performed in the time-frequency domain in the first method may be performed in the spherical harmonic domain. In this manner, the operation amount and the memory amount required can be reduced as compared with the first method, and sound can be reproduced more efficiently. A method for performing the convolution of the head-related transfer function in such a spherical harmonic domain is also referred to as a second method, and this second method will be described below.

[0088] For example, focusing on the left headphone, a vector $P_l(\omega)$ including each drive signal $P_l(g_j, \omega)$ of the left headphone with respect to all rotation directions of the head of the listener is expressed by the following Equation (12).

[Equation 12]

$$P_l(\omega) = H(\omega)\, S(\omega) = H(\omega)\, Y(x)\, D'(\omega) \qquad (12)$$

[0089] Note that in Equation (12), $S(\omega)$ is a vector including the speaker drive signals $S(x_i, \omega)$, and $S(\omega) = Y(x) D'(\omega)$. Furthermore, in Equation (12), $Y(x)$ represents a matrix including the spherical harmonics $Y_n^m(x_i)$ of each degree and the position $x_i$ of each virtual speaker, which is expressed by the following Equation (13). Here, $i = 1, 2, \ldots, L$, and the maximum value (maximum degree) of the degree $n$ is $N$.

[0090] $D'(\omega)$ represents a vector (matrix) including the input signals $D_n^{\prime m}(\omega)$ of the sound corresponding to each degree, which is expressed by the following Equation (14). Each input signal $D_n^{\prime m}(\omega)$ is a sound signal in the spherical harmonic domain.

[0091] Moreover, in Equation (12), $H(\omega)$ represents a matrix including the head-related transfer function $H(g_j^{-1} x_i, \omega)$ of the relative direction $g_j^{-1} x_i$ of each virtual speaker viewed from the head of the listener in a case where the direction of the head of the listener is the direction $g_j$, which is expressed by the following Equation (15). In this example, the head-related transfer functions $H(g_j^{-1} x_i, \omega)$ of the respective virtual speakers are prepared for a total of $M$ directions from the direction $g_1$ to the direction $g_M$.

[Equation 13]

$$Y(x) = \begin{pmatrix} Y_0^0(x_1) & \cdots & Y_N^N(x_1) \\ \vdots & \ddots & \vdots \\ Y_0^0(x_L) & \cdots & Y_N^N(x_L) \end{pmatrix} \qquad (13)$$

[Equation 14]

$$D'(\omega) = \begin{pmatrix} D_0^{\prime 0}(\omega) \\ \vdots \\ D_N^{\prime N}(\omega) \end{pmatrix} \qquad (14)$$

[Equation 15]

$$H(\omega) = \begin{pmatrix} H(g_1^{-1} x_1, \omega) & \cdots & H(g_1^{-1} x_L, \omega) \\ \vdots & \ddots & \vdots \\ H(g_M^{-1} x_1, \omega) & \cdots & H(g_M^{-1} x_L, \omega) \end{pmatrix} \qquad (15)$$

[0092] Upon calculating the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone when the head of the listener is facing the direction g.sub.j, it is only required to perform the calculation of Equation (12) by selecting a row corresponding to the direction g.sub.j that is the direction of the head of the listener, that is, a row including the head-related transfer function H(g.sub.j.sup.-1x.sub.i, .omega.) for the direction g.sub.j from the matrix H(.omega.) of the head-related transfer functions.

[0093] In this case, for example, only necessary rows are calculated as illustrated in FIG. 3.

[0094] In this example, since the head-related transfer function is prepared for each of the M directions, the matrix calculation represented in Equation (12) is as indicated by an arrow A11.

[0095] That is, when the number of input signals D’.sub.n.sup.m(.omega.) of the time frequency .omega. is K, the vector D’(.omega.) is K.times.1, that is, a matrix of K rows and one column. Furthermore, the matrix Y(x) of the spherical harmonics is L.times.K, and the matrix H(.omega.) is M.times.L. Therefore, in the calculation of Equation (12), the vector P.sub.l(.omega.) is M.times.1.

[0096] Here, if the vector S(.omega.) is obtained first by performing a matrix operation (product-sum operation) of the matrix Y(x) and the vector D’(.omega.) in the online operation, then at the time of calculating the drive signal P.sub.l(g.sub.j, .omega.), only the row corresponding to the direction g.sub.j of the head of the listener needs to be selected from the matrix H(.omega.) as indicated by an arrow A12, which reduces the operation amount. In FIG. 3, a hatched portion in the matrix H(.omega.) represents the row corresponding to the direction g.sub.j, and the product of this row and the vector S(.omega.) is calculated to obtain the desired drive signal P.sub.l(g.sub.j, .omega.) of the left headphone.
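
As a non-limiting illustration, the row selection described above can be sketched in plain Python. The helper names `matvec` and `drive_signal_row`, the toy sizes, and the use of real numbers in place of complex time-frequency values are all assumptions made for this sketch and do not appear in the present description.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def drive_signal_row(H, Y, D, j):
    """Drive signal for head direction index j only.

    S = Y D is computed first (length L); then only row j of H (1 x L)
    is multiplied with S instead of all M rows.
    """
    S = matvec(Y, D)                              # speaker drive signals
    return sum(h * s for h, s in zip(H[j], S))    # one product-sum over L terms


# Toy sizes (hypothetical): M = 3 directions, L = 2 speakers, K = 4 inputs.
H = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]           # M x L
Y = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, -1.0]]   # L x K
D = [1.0, 2.0, 0.5, -1.0]                           # K x 1

P_full = matvec(H, matvec(Y, D))       # all M drive signals, Equation (12)
P_j = drive_signal_row(H, Y, D, j=1)   # only the row for direction index 1
```

In this sketch `P_j` agrees with `P_full[1]`, while the row-selected version performs L multiplications with H instead of M times L.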

[0097] Here, when a matrix H’(.omega.) is defined as represented in following Equation (16), the vector P.sub.l(.omega.) represented in Equation (12) can be expressed by following Equation (17).

[Equation 16]

H’(.omega.)=H(.omega.)Y(x) (16)

[Equation 17]

P.sub.l(.omega.)=H’(.omega.)D’(.omega.) (17)

[0098] In Equation (16), by the spherical harmonic transform using the spherical harmonics, the matrix H(.omega.) including the head-related transfer function, more specifically, the head-related transfer function in the time-frequency domain is converted into the matrix H’(.omega.) including the head-related transfer function in the spherical harmonic domain.

[0099] Therefore, in the calculation of Equation (17), convolution of the speaker drive signal and the head-related transfer function is performed in the spherical harmonic domain. In other words, the product-sum operation of the head-related transfer function and the input signal is performed in the spherical harmonic domain. Note that the matrix H’(.omega.) can be calculated and retained in advance.

[0100] In this case, upon calculating the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone when the head of the listener is directed in the direction g.sub.j, it is only required to perform the calculation of Equation (17) by selecting a row corresponding to the direction g.sub.j of the head of the listener from the matrix H’(.omega.) retained in advance.

[0101] In such a case, the calculation of Equation (17) is a calculation represented in following Equation (18). Thus, the operation amount and the required memory amount can be significantly reduced.

[Equation 18]

$$P_l(g_j, \omega) = \sum_{n=0}^{N} \sum_{m=-n}^{n} H'^{m}_{n}(g_j, \omega)\, D'^{m}_{n}(\omega) \tag{18}$$

[0102] In Equation (18), H’.sub.n.sup.m(g.sub.j, .omega.) represents a head-related transfer function of the spherical harmonic domain to be one element of the matrix H’(.omega.), that is, a component (element) corresponding to the direction g.sub.j of the head in the matrix H’(.omega.). n and m in the head-related transfer function H’.sub.n.sup.m(g.sub.j, .omega.) indicate the degree n and the order m of the spherical harmonics.

[0103] In such an operation represented in Equation (18), the operation amount is reduced as illustrated in FIG. 4. That is, the calculation represented in Equation (12) is a calculation for obtaining a product of the matrix H(.omega.) of M.times.L, the matrix Y(x) of L.times.K, and the vector D’(.omega.) of K.times.1 as indicated by an arrow A21 in FIG. 4.

[0104] Here, since H(.omega.)Y(x) is the matrix H’(.omega.) as defined in Equation (16), the calculation indicated by the arrow A21 eventually becomes as indicated by an arrow A22. In particular, since the calculation of obtaining the matrix H’(.omega.) can be performed offline, that is, in advance, if the matrix H’(.omega.) is obtained and retained in advance, it is possible to reduce the operation amount when obtaining the drive signal of the headphones online by that amount.

[0105] If the matrix H’(.omega.) is obtained in advance in this manner, the calculation indicated by the arrow A22, that is, the above-described calculation of Equation (18) is performed when the drive signal of the headphones is actually obtained.

[0106] That is, a row corresponding to the direction g.sub.j of the head of the listener is selected from the matrix H’(.omega.) as indicated by the arrow A22, and the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone is calculated by a matrix operation of the selected row and the vector D’(.omega.) including the input signal D’.sub.n.sup.m(.omega.). In FIG. 4, a hatched portion in the matrix H’(.omega.) represents a row corresponding to the direction g.sub.j, and an element constituting this row is the head-related transfer function H’.sub.n.sup.m(g.sub.j, .omega.) represented in Equation (18).
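
The split into an offline step (Equation (16)) and an online step (Equation (18)) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the toy data, the 0-based index helper `sh_index`, and the use of real numbers are hypothetical and chosen only for illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    columns = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in A]


def sh_index(n, m):
    """0-based position of degree n, order m (the 1-based form is n^2+n+1+m)."""
    return n * n + n + m


def drive_signal(H_prime, D, j, N):
    """Online step, Equation (18): double sum over degree n and order m."""
    total = 0.0
    for n in range(N + 1):
        for m in range(-n, n + 1):
            q = sh_index(n, m)
            total += H_prime[j][q] * D[q]
    return total


# Toy data (hypothetical): M = 3 directions, L = 2 speakers, N = 1 so K = 4.
H = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]           # M x L
Y = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, -1.0]]   # L x K
D = [1.0, 2.0, 0.5, -1.0]                           # K x 1

H_prime = matmul(H, Y)                     # offline step, Equation (16): M x K
P_l = drive_signal(H_prime, D, j=1, N=1)   # online step for one direction
```

Only the K-term product-sum in `drive_signal` has to run online; the M by K matrix `H_prime` is computed once and retained.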

[0107]

[0108] Incidentally, in the second method described above, while the operation amount and the necessary memory amount can be greatly reduced, it is necessary to retain all rotation directions of the head of the listener, that is, rows corresponding to respective directions g.sub.j on the memory as the matrix H’(.omega.) of the head-related transfer function.

[0109] Therefore, a matrix (row vector) including the head-related transfer function of the spherical harmonic domain for one direction g.sub.j may be set as H.sub.S(.omega.)=H’(g.sub.j), only a row vector H.sub.S(.omega.), which is a row corresponding to one direction g.sub.j of the matrix H’(.omega.), may be retained, and a rotation matrix R’(g.sub.j) for performing rotation corresponding to head rotation of the listener in the spherical harmonic domain may be retained by the number of a plurality of respective directions g.sub.j. Hereinafter, such a method will be referred to as a third method.

[0110] The rotation matrix R’(g.sub.j) in each direction g.sub.j is different from the matrix H’(.omega.) and has no time-frequency dependence. Thus, in the third method, the memory amount can be significantly reduced as compared with a case where the matrix H’(.omega.) has the component of the rotation direction g.sub.j of the head.

[0111] First, a product H’(g.sub.j.sup.-1, .omega.) of a row H(g.sub.j.sup.-1x, .omega.) corresponding to a predetermined direction g.sub.j of the matrix H(.omega.) and a matrix Y(x) of the spherical harmonics is considered as represented in following Equation (19).

[Equation 19]

H’(g.sub.j.sup.-1,.omega.)=H(g.sub.j.sup.-1x,.omega.)Y(x) (19)

[0112] In the second method described above, coordinates of the head-related transfer function to be used with respect to the rotation direction g.sub.j of the head of the listener are rotated from x to g.sub.j.sup.-1x, but the same result can be obtained even if the coordinates of the spherical harmonics are rotated from x to g.sub.jx without changing the coordinates of the position x of the head-related transfer function. That is, following Equation (20) holds.

[Equation 20]

H’(g.sub.j.sup.-1,.omega.)=H(g.sub.j.sup.-1x,.omega.)Y(x)=H(x,.omega.)Y(g.sub.jx) (20)

[0113] Moreover, the matrix Y(g.sub.jx) of the spherical harmonics is a product of the matrix Y(x) and a rotation matrix R’(g.sub.j.sup.-1) and is as represented in following Equation (21). Note that the rotation matrix R’(g.sub.j.sup.-1) is a matrix that rotates coordinates by g.sub.j in the spherical harmonic domain.

[Equation 21]

Y(g.sub.jx)=Y(x)R’(g.sub.j.sup.-1) (21)

[0114] Here, for a set Q represented in following Equation (22), the elements of the rotation matrix R’(g.sub.j) other than those in the (n.sup.2+n+1+k)-th row and the (n.sup.2+n+1+m)-th column, where (n.sup.2+n+1+k), (n.sup.2+n+1+m) ∈ Q, are zero.

[Equation 22]

$$Q = \{\, q \mid n^2 + 1 \le q \le (n+1)^2,\ q, n \in \{0, 1, 2, \ldots\} \,\} \tag{22}$$

[0115] Therefore, spherical harmonics Y.sub.n.sup.m(g.sub.jx), which is an element of the matrix Y(g.sub.jx), can be expressed by following Equation (23) using an element R’.sup.(n).sub.k, m(g.sub.j) of the (n.sup.2+n+1+k) row and the (n.sup.2+n+1+m) column of the rotation matrix R’(g.sub.j).

[Equation 23]

$$Y_n^m(g_j x) = \sum_{k=-n}^{n} Y_n^k(x)\, R'^{(n)}_{k,m}(g_j^{-1}) \tag{23}$$

[0116] Here, the element R’.sup.(n).sub.k, m(g.sub.j) is expressed by following Equation (24).

[Equation 24]

$$R'^{(n)}_{k,m}(g_j) = e^{-im\alpha}\, r^{(n)}_{k,m}(\beta)\, e^{-ik\gamma} \tag{24}$$

[0117] Note that, in Equation (24), i represents the imaginary unit, .alpha., .beta., and .gamma. represent the Euler angles of the rotation, and r.sup.(n).sub.k, m(.beta.) is expressed by following Equation (25).

[Equation 25]

$$r^{(n)}_{k,m}(\beta) = \sqrt{\frac{(n+k)!\,(n-k)!}{(n+m)!\,(n-m)!}} \sum_{\sigma} \binom{n+m}{n-k-\sigma} \binom{n-m}{\sigma} (-1)^{n-k-\sigma} \left(\cos\frac{\beta}{2}\right)^{2\sigma+k+m} \left(\sin\frac{\beta}{2}\right)^{2n-2\sigma-k-m} \tag{25}$$
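
For reference, the element r.sup.(n).sub.k, m(.beta.) of Equation (25) can be evaluated as in the following sketch. The function names `wigner_r` and `wigner_block` are hypothetical. The square-root normalization and the restriction of the summation index to values for which both binomial coefficients are defined are assumptions taken from the standard form of this expression.

```python
from math import comb, cos, factorial, sin, sqrt


def wigner_r(n, k, m, beta):
    """Element r^(n)_{k,m}(beta) in the form of Equation (25)."""
    norm = sqrt(factorial(n + k) * factorial(n - k)
                / (factorial(n + m) * factorial(n - m)))
    c, s = cos(beta / 2.0), sin(beta / 2.0)
    total = 0.0
    # Sum only over sigma for which both binomial coefficients are defined.
    for sigma in range(max(0, -(k + m)), min(n - k, n - m) + 1):
        total += (comb(n + m, n - k - sigma) * comb(n - m, sigma)
                  * (-1) ** (n - k - sigma)
                  * c ** (2 * sigma + k + m)
                  * s ** (2 * n - 2 * sigma - k - m))
    return norm * total


def wigner_block(n, beta):
    """(2n+1) x (2n+1) block for degree n; rows k and columns m run from -n to n."""
    orders = range(-n, n + 1)
    return [[wigner_r(n, k, m, beta) for m in orders] for k in orders]
```

For example, `wigner_block(1, 0.0)` is the 3 by 3 identity, and each block is an orthogonal matrix for any angle, which is a convenient sanity check on an implementation.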

[0118] From the above, a binaural reproduction signal reflecting the rotation of the head of the listener using the rotation matrix R’(g.sub.j.sup.-1), for example, the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone, is obtained by calculating following Equation (26). Furthermore, in a case where the left and right head-related transfer functions may be regarded as symmetric, by performing inversion using a matrix R.sub.ref that horizontally inverts either the matrix D’(.omega.) of the input signal or the row vector H.sub.S(.omega.) of the left head-related transfer function as preprocessing of Equation (26), the right headphone drive signal can be obtained while retaining only the row vector H.sub.S(.omega.) of the left head-related transfer function. However, the following description basically assumes a case where different left and right head-related transfer functions are required.

[Equation 26]

$$\begin{aligned} P_l(g_j, \omega) &= H(g_j^{-1}x, \omega)\, Y(x)\, D'(\omega) \\ &= H(x, \omega)\, Y(x)\, R'(g_j^{-1})\, D'(\omega) \\ &= H_S(\omega)\, R'(g_j^{-1})\, D'(\omega) \end{aligned} \tag{26}$$

[0119] In Equation (26), the drive signal P.sub.l (g.sub.j, .omega.) is obtained by synthesizing the row vector H.sub.S(.omega.), the rotation matrix R’(g.sub.j.sup.-1), and the vector D’(.omega.).

[0120] The above calculation is, for example, the calculation illustrated in FIG. 5. That is, the vector P.sub.l(.omega.) including the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone is obtained by the product of the matrix H(.omega.) of M.times.L, the matrix Y(x) of L.times.K, and the vector D’(.omega.) of K.times.1 as indicated by an arrow A41 in FIG. 5. This matrix operation is as represented in above-described Equation (12).

[0121] When this operation is expressed using the matrix Y(g.sub.jx) of the spherical harmonics prepared for each of the M directions g.sub.j, the operation is as indicated by an arrow A42. That is, the vector P.sub.l(.omega.) including the drive signal P.sub.l (g.sub.j, .omega.) corresponding to each of the M directions g.sub.j is obtained by a product of a predetermined row H(x, .omega.) of the matrix H(.omega.), the matrix Y(g.sub.jx), and the vector D’(.omega.) from the relationship represented in Equation (20).

[0122] Here, the row H(x, .omega.) as a vector is 1.times.L, the matrix Y(g.sub.jx) is L.times.K, and the vector D’(.omega.) is K.times.1. When this is further transformed using the relationships represented in Equations (17) and (21), the result is as indicated by an arrow A43. That is, as represented in Equation (26), the vector P.sub.l(.omega.) is obtained by a product of the row vector H.sub.S(.omega.) of 1.times.K, the rotation matrix R’(g.sub.j.sup.-1) of K.times.K in each of the M directions g.sub.j, and the vector D’(.omega.) of K.times.1.

[0123] Note that, in FIG. 5, hatched portions of the rotation matrix R’(g.sub.j.sup.-1) represent non-zero elements of the rotation matrix R’(g.sub.j.sup.-1).
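
The product of Equation (26) can exploit the block structure of the rotation matrix, as in the following sketch. The function names and the representation of the rotation matrix as a list of per-degree blocks are assumptions made for illustration, and real numbers stand in for complex values.

```python
def apply_block_diagonal(blocks, D):
    """Multiply a block-diagonal K x K matrix by D', using only non-zero elements.

    blocks[n] is the (2n+1) x (2n+1) block for degree n.
    """
    out, start = [], 0
    for block in blocks:
        size = len(block)
        segment = D[start:start + size]
        out.extend(sum(r * d for r, d in zip(row, segment)) for row in block)
        start += size
    return out


def drive_signal_third(H_S, blocks, D):
    """Equation (26): H_S (1 x K) times R' (K x K) times D' (K x 1)."""
    rotated = apply_block_diagonal(blocks, D)
    return sum(h * x for h, x in zip(H_S, rotated))


# Toy example with maximum degree 1 (K = 4); the degree-1 block is a
# hypothetical permutation standing in for a real rotation block.
blocks = [[[1.0]],
          [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]]
D = [1.0, 2.0, 3.0, 4.0]
H_S = [1.0, 0.0, 0.0, 1.0]
P_l = drive_signal_third(H_S, blocks, D)
```

Because elements outside the blocks are never touched, the number of multiplications for the rotation equals the number of non-zero elements rather than K squared.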

[0124] Furthermore, the operation amount and the necessary memory amount in such a third method are as illustrated in FIG. 6.

[0125] That is, as illustrated in FIG. 6, it is assumed that a row vector H.sub.S (.omega.) of 1.times.K is prepared for each time-frequency bin .omega., the rotation matrix R’(g.sub.j.sup.-1) of K.times.K is prepared for the M directions g.sub.j, and the vector D’(.omega.) is K.times.1. Furthermore, it is assumed that the number of time-frequency bins .omega. is W, and a maximum value of the degree n of the spherical harmonics, that is, the maximum degree is J.

[0126] At this time, since the number of non-zero elements of the rotation matrix R’(g.sub.j.sup.-1) is (J+1) (2J+1) (2J+3)/3, a sum calc/W of the number of product-sum operations per each time-frequency bin .omega. in the third method is as represented in following Equation (27).

[Equation 27]

$$\mathrm{calc}/W = \frac{(J+1)(2J+1)(2J+3)}{3} + 2K \tag{27}$$

[0127] Furthermore, for the operation by the third method, it is necessary to retain the row vector H.sub.S(.omega.) of 1.times.K for each time-frequency bin .omega. for the left and right ears, and it is further necessary to retain non-zero elements of the rotation matrix R’(g.sub.j.sup.-1) by the amount of each of the M directions. Therefore, the memory amount needed for the operation by the third method is as represented in following Equation (28).

[Equation 28]

$$\mathrm{memory} = M \times \frac{(J+1)(2J+1)(2J+3)}{3} + 2 \times K \times W \tag{28}$$

[0128] In the third method, by retaining only the non-zero elements of the rotation matrix R’(g.sub.j.sup.-1), the required memory amount can be greatly reduced as compared with the second method.
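
The counts in Equations (27) and (28) can be checked with a short sketch. The function names are hypothetical, and the relation K = (J+1).sup.2 is an assumption (all orders of every degree from 0 to J are used). The non-zero count is the sum of the block sizes (2n+1).sup.2 over the degrees, which equals the closed form in the equations.

```python
def nonzero_count(J):
    """Non-zero elements of the block-diagonal rotation matrix: sum of (2n+1)^2."""
    return sum((2 * n + 1) ** 2 for n in range(J + 1))


def closed_form(J):
    """Closed form (J+1)(2J+1)(2J+3)/3 used in Equations (27) and (28)."""
    return (J + 1) * (2 * J + 1) * (2 * J + 3) // 3


def third_method_cost(J, M, W):
    """Return (product-sum operations per bin, memory amount) for the third method."""
    K = (J + 1) ** 2                          # assumed length of D'
    calc_per_bin = closed_form(J) + 2 * K     # Equation (27)
    memory = M * closed_form(J) + 2 * K * W   # Equation (28)
    return calc_per_bin, memory


calc_per_bin, memory = third_method_cost(J=4, M=1000, W=512)
```

With the hypothetical values J = 4, M = 1000, and W = 512, this gives 215 product-sum operations per bin and a memory amount of 190600 elements.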

[0129]

[0130] Note that, in the third method, it is necessary to retain the rotation matrix R’(g.sub.j.sup.-1) by the amount of rotation of the three axes of the head of the listener, that is, by the amount of each of any M directions g.sub.j. Retaining such a rotation matrix R’(g.sub.j.sup.-1) requires a certain memory amount even though it is smaller than that for retaining the matrix H’(.omega.) having time-frequency dependence.

[0131] Accordingly, the rotation matrix R’(g.sub.j.sup.-1) for performing rotation in the spherical harmonic domain with the head of the listener being a rotation center may be sequentially obtained at the time of operation. Hereinafter, such a method will also be referred to as a fourth method.

[0132] Here, the rotation matrix R’(g) can be expressed by following Equation (29). Furthermore, g in Equation (29) is a rotation matrix, and is expressed by a product of a matrix u(.alpha.), a matrix a(.beta.), and a matrix u(.gamma.) as represented in following Equation (30).

[Equation 29]

R’(g)=R’(u(.alpha.)a(.beta.)u(.gamma.))=R’(u(.alpha.))R’(a(.beta.))R’(u(.gamma.)) (29)

[Equation 30]

g=u(.alpha.)a(.beta.)u(.gamma.) (30)

[0133] Note that in Equation (29), a(.beta.) and u(.alpha.) are rotation matrices that rotate coordinates by an angle .beta. and an angle .alpha. with a coordinate axis of a coordinate system in which the position of the head of the listener is the origin being a rotation axis. Furthermore, u(.gamma.) is a rotation matrix that is different from u(.alpha.) only in the rotation angle and rotates the coordinates by an angle .gamma. with the same coordinate axis being a rotation axis. Note that the rotation angles of the respective matrices u(.alpha.), a(.beta.), and u(.gamma.), that is, the angle .alpha., the angle .beta., and the angle .gamma. are Euler angles.

[0134] For example, it is assumed that there is an orthogonal coordinate system in which the position of the head of the listener is the origin and the x axis, the y axis, and the z axis orthogonal to each other are respective axes. Here, a positive direction of the x axis is a direction directly forward of the listener in a state where the listener faces directly forward, and the z axis is an axis in an up and down direction, that is, a vertical direction, as viewed from the listener facing directly forward. The angle .alpha., the angle .beta., and the angle .gamma. are rotation angles in respective rotation directions with reference to a state in which the listener faces directly forward, that is, the positive direction of the x axis.

[0135] Specifically, a rotation angle of the head when the head is moved in the up and down direction with the y axis being the rotation axis in a state where the listener is looking directly forward is the angle .beta. that is an elevation angle. Moreover, a rotation angle of the head when the head is moved in the horizontal direction as viewed from the listener with the z axis being the rotation axis in a state where the listener is facing directly forward is the angle .alpha. that is a horizontal angle.

[0136] The matrix a(.beta.) is a rotation matrix that rotates the coordinates (coordinate system) by the angle .beta. with the y axis being the rotation axis, and the matrix u(.alpha.) is a rotation matrix that rotates the coordinates (coordinate system) by the angle .alpha. with the z axis being the rotation axis. Specifically, the matrix a(.beta.) and the matrix u(.alpha.) are as represented in following Equations (31) and (32), respectively.

[Equation 31]

$$\left\{ a(\beta) = \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix} \;\middle|\; \beta \in [0, 2\pi] \right\} \tag{31}$$

[Equation 32]

$$\left\{ u(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; \alpha \in [0, 2\pi] \right\} \tag{32}$$

[0137] Therefore, for example, when the matrix a(.beta.) is applied to any position v=(v.sub.x, v.sub.y, v.sub.z).sup.T in the coordinate system with the position of the head of the listener being the origin, rotation with the y axis being the rotation axis can be given to the position v, and a position v.sub.2 after rotation of the position v is expressed by following Equation (33).

[0138] Similarly, when the matrix u(.alpha.) is applied to the position v, rotation with the z axis being the rotation axis can be given to the position v, and a position v.sub.3 after rotation of the position v is expressed by following Equation (34).

[Equation 33]

v.sub.2=a(.beta.)v (33)

[Equation 34]

v.sub.3=u(.alpha.)v (34)
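
The matrices of Equations (31) and (32) and the composition of Equation (30) can be written out directly, as in the following sketch. The helper names `matmul3` and `rotate` are hypothetical, and the test position is chosen only for illustration.

```python
from math import cos, pi, sin


def a(beta):
    """Equation (31): rotation by beta with the y axis as the rotation axis."""
    return [[cos(beta), 0.0, sin(beta)],
            [0.0, 1.0, 0.0],
            [-sin(beta), 0.0, cos(beta)]]


def u(alpha):
    """Equation (32): rotation by alpha with the z axis as the rotation axis."""
    return [[cos(alpha), -sin(alpha), 0.0],
            [sin(alpha), cos(alpha), 0.0],
            [0.0, 0.0, 1.0]]


def matmul3(A, B):
    """Product of two 3 x 3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotate(R, v):
    """Apply a 3 x 3 matrix R to a position v = (v_x, v_y, v_z)."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def g(alpha, beta, gamma):
    """Equation (30): g = u(alpha) a(beta) u(gamma)."""
    return matmul3(matmul3(u(alpha), a(beta)), u(gamma))


v = [1.0, 0.0, 0.0]            # a position on the positive x axis
v2 = rotate(a(pi / 2), v)      # Equation (33)
v3 = rotate(u(pi / 2), v)      # Equation (34)
```

Here `v3` lies on the positive y axis and `v2` on the negative z axis, up to rounding, which follows directly from the first columns of the two matrices.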

[0139] Therefore, the rotation matrix R’(g)=R’(u(.alpha.)a(.beta.)u(.gamma.)) is a rotation matrix that rotates the coordinate system by the angle .alpha. in a horizontal angle direction in the spherical harmonic domain, thereafter rotates the coordinate system after rotation of the angle .alpha. by the angle .beta. in an elevation angle direction as viewed from this coordinate system, and further rotates the coordinate system after rotation of the angle .beta. by the angle .gamma. in the horizontal angle direction as viewed from the coordinate system.

[0140] Furthermore, R’(u(.alpha.)), R’(a(.beta.)), and R’(u(.gamma.)) indicate the rotation matrix R’(g) when the coordinates are rotated by each of the matrix u(.alpha.), the matrix a(.beta.), and the matrix u(.gamma.).

[0141] In other words, the rotation matrix R’(u(.alpha.)) is a rotation matrix that rotates coordinates by the angle .alpha. in the horizontal angle direction in the spherical harmonic domain, and the rotation matrix R’(a(.beta.)) is a rotation matrix that rotates coordinates by the angle .beta. in the elevation angle direction in the spherical harmonic domain. Furthermore, the rotation matrix R’(u(.gamma.)) is a rotation matrix that rotates coordinates in the horizontal angle direction by the angle .gamma. in the spherical harmonic domain.

[0142] Therefore, for example, as indicated by an arrow A51 in FIG. 7, the rotation matrix R’(g)=R’(u(.alpha.)a(.beta.)u(.gamma.)) for rotating the coordinates three times using the angle .alpha., the angle .beta., and the angle .gamma. as rotation angles can be expressed by a product of the three rotation matrices R’(u(.alpha.)), R’(a(.beta.)), and R’(u(.gamma.)).

[0143] In this case, as data for obtaining the rotation matrix R’(g.sub.j.sup.-1), each of the rotation matrix R’(u(.alpha.)), the rotation matrix R’(a(.beta.)), and the rotation matrix R’(u(.gamma.)) for respective values of the rotation angles .alpha., .beta., and .gamma. is only required to be retained as a table in a memory. Furthermore, in a case where the same head-related transfer function may be used for the left and right, the rotation matrix for the opposite ear can be obtained by retaining the row vector H.sub.S(.omega.) for only one ear, also retaining the above-described matrix R.sub.ref for inverting the left and right in advance, and obtaining a product of this matrix and the generated rotation matrix.

[0144] Furthermore, when the vector P.sub.l(.omega.) is actually calculated, one rotation matrix R’(g.sub.j.sup.-1) is calculated by calculating a product of each rotation matrix read from the table. Then, as indicated by an arrow A52, for each time-frequency bin .omega., a product of the row vector H.sub.S(.omega.) of 1.times.K, the rotation matrix R’(g.sub.j.sup.-1) of K.times.K common to all time-frequency bins .omega., and the vector D’(.omega.) of K.times.1 is calculated to obtain the vector P.sub.l(.omega.).

[0145] Here, for example, in a case where the rotation matrix R’(g.sub.j.sup.-1) itself of each rotation angle is retained in the table, when accuracy of the angle .alpha., the angle .beta., and the angle .gamma. of each rotation is one degree (1.degree.), it is necessary to retain 360.sup.3=46656000 rotation matrices R’(g.sub.j.sup.-1).

[0146] On the other hand, in a case where the accuracy of the angle .alpha., the angle .beta., and the angle .gamma. of respective rotations is one degree (1.degree.), and the rotation matrix R’(u(.alpha.)), the rotation matrix R’(a(.beta.)), and the rotation matrix R’(u(.gamma.)) of each rotation angle are retained in the table, it is only necessary to retain 360.times.3=1080 rotation matrices.

[0147] Therefore, while it has been necessary to retain data of the order of O(n.sup.3) when retaining the rotation matrix R’(g.sub.j.sup.-1) itself, it is only necessary to retain data of the order of O(n) when retaining the rotation matrix R’(u(.alpha.)), the rotation matrix R’(a(.beta.)), and the rotation matrix R’(u(.gamma.)), and the memory amount can be greatly reduced.

[0148] Moreover, as indicated by an arrow A51, since the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)) are diagonal matrices, it is only required to retain only diagonal components.

[0149] Furthermore, since both the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)) are rotation matrices that perform rotation in the horizontal angle direction, the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)) can be obtained from the same common table. That is, the table of the rotation matrix R’(u(.alpha.)) and the table of the rotation matrix R’(u(.gamma.)) can be the same.
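
The sharing of one table between the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)) can be sketched as follows. This sketch assumes that the diagonal element for degree n and order m is exp(-i m .alpha.), which is what Equation (24) gives when .beta. and .gamma. are zero and r.sup.(n).sub.k, m(0) is the identity. The function names, the 360-step table, and the index values are hypothetical.

```python
import cmath
from math import pi


def azimuth_diagonal(J, angle):
    """Diagonal components for degrees 0..J, ordered by (n, m); assumed exp(-i m angle)."""
    return [cmath.exp(-1j * m * angle)
            for n in range(J + 1) for m in range(-n, n + 1)]


def build_azimuth_table(J, steps=360):
    """One table of diagonals, usable for both the angle alpha and the angle gamma."""
    return [azimuth_diagonal(J, 2.0 * pi * s / steps) for s in range(steps)]


def apply_diagonal(diagonal, vector):
    """A diagonal matrix acts as an element-wise product (K multiplications)."""
    return [d * x for d, x in zip(diagonal, vector)]


J = 2
table = build_azimuth_table(J)
alpha_index, gamma_index = 30, 45          # both angles look up the same table
vector = [1.0 + 0.0j] * (J + 1) ** 2
after_gamma = apply_diagonal(table[gamma_index], vector)
after_both = apply_diagonal(table[alpha_index], after_gamma)
```

Only the K diagonal components are stored per angle, and applying such a matrix costs K multiplications instead of a full matrix product. The elevation rotation that sits between the two horizontal rotations is omitted here for brevity.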

[0150] Note that, in FIG. 7, a hatched portion of each rotation matrix represents a non-zero element.

[0151] Moreover, with respect to k and m when (n.sup.2+n+1+k) and (n.sup.2+n+1+m) belong to the set Q represented in the above-described Equation (22), elements other than those in the (n.sup.2+n+1+k) row and the (n.sup.2+n+1+m) column among the elements of the rotation matrix R’(a(.beta.)) are zero. Thus, only the non-zero elements need to be retained as the rotation matrix R’(a(.beta.)), and the memory amount can be reduced.

[0152] From the above, the memory amount required to retain data for obtaining the rotation matrix R’(g.sub.j.sup.-1) can be further reduced.

[0153] Specifically, for example, when .PHI., .THETA., and .PSI. rotation matrices R’(u(.alpha.)), R’(a(.beta.)), and R’(u(.gamma.)) are retained, respectively, the number M of rotation directions g.sub.j of the head is M=.PHI..times..THETA..times..PSI..

[0154] In the fourth method, since the rotation matrix R’(a(.beta.)) is retained by the amount for accuracy of the angle .beta., that is, .THETA. rotation matrices, the memory amount necessary for retaining the rotation matrix R’(a(.beta.)) is memory (a)=.THETA..times.(J+1) (2J+1) (2J+3)/3.

[0155] Furthermore, a common table can be used for the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)), and when the accuracy of the angle .alpha. and the accuracy of the angle .gamma. are the same, it is only required to retain the rotation matrices by the amount of the angle .alpha., that is, .PHI., and it is only required to retain diagonal components of these rotation matrices. Therefore, when the length of the vector D’(.omega.) is K, the memory amount necessary for retaining the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)) is memory (b)=.PHI..times.K.

[0156] Moreover, when the number of time-frequency bins .omega. is W, the memory amount required to retain the row vector H.sub.S(.omega.) of 1.times.K for the left and right ears by the amount of each time-frequency bin .omega. is 2.times.K.times.W.

[0157] Therefore, when these are summed, the memory amount required by the fourth method is memory=memory (a)+memory (b)+2KW.

[0158] Such a fourth method can significantly reduce the required memory amount with approximately the same operation amount as that of the third method. The fourth method is particularly effective when, for example, the accuracy of the angle .alpha., the angle .beta., and the angle .gamma. is set to one degree (1.degree.) or the like so that the head tracking function is accurate enough for practical use.
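
The memory amounts of the third and fourth methods can be compared numerically, as in the following sketch. The function names and the values J = 4 and W = 512 are hypothetical, and K = (J+1).sup.2 is an assumption.

```python
def block_nonzeros(J):
    """Non-zero elements of one block-diagonal rotation matrix."""
    return (J + 1) * (2 * J + 1) * (2 * J + 3) // 3


def third_method_memory(J, W, M):
    """Equation (28): M full rotation matrices plus H_S for both ears."""
    K = (J + 1) ** 2
    return M * block_nonzeros(J) + 2 * K * W


def fourth_method_memory(J, W, theta_steps, phi_steps):
    """memory (a) + memory (b) + 2KW with a shared table for alpha and gamma."""
    K = (J + 1) ** 2
    memory_a = theta_steps * block_nonzeros(J)   # tables of R'(a(beta))
    memory_b = phi_steps * K                     # diagonal tables of R'(u(.))
    return memory_a + memory_b + 2 * K * W


steps = 360                                      # one-degree accuracy per angle
M = steps ** 3                                   # directions in the third method
third = third_method_memory(J=4, W=512, M=M)
fourth = fourth_method_memory(J=4, W=512, theta_steps=steps, phi_steps=steps)
```

With these hypothetical values, the third method needs 7698265600 elements while the fourth method needs 94000, which illustrates the reduction from the cube of the number of angle steps to a linear dependence.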

[0159]

[0160] Incidentally, in the fourth method, the number of rotation matrices to be retained can be reduced to 1080 by having rotation with respect to three axes, for example, every one degree, that is, by setting the accuracy of the angle .alpha., the angle .beta., and the angle .gamma. to one degree (1.degree.).

[0161] However, in the fourth method, the operation amount can be reduced only to the order of the cube of the maximum degree J of the spherical harmonics.

[0162] The reason is that the rotation matrix R’(a(.beta.)) for following the rotation of the head of the listener (user) is a block diagonal matrix as illustrated in FIG. 8, for example.

[0163] Note that, in FIG. 8, the horizontal axis represents the components of the columns of the rotation matrix R’(a(.beta.)), and the vertical axis represents the components of the rows of the rotation matrix R’(a(.beta.)). Furthermore, in FIG. 8, shading at respective positions of the rotation matrix R’(a(.beta.)) indicates a level (dB) of the element of the rotation matrix R’(a(.beta.)) corresponding to those positions.

[0164] FIG. 8 illustrates a rotation matrix R’(a(.beta.)) when the rotation angle .beta. is one degree. In this example, when attention is paid to an element having a value of, for example, -400 dB or more in the rotation matrix R’(a(.beta.)), a portion including the element having such a value is a block having a size of (2n+1).times.(2n+1) with respect to the degree n. For example, a square portion indicated by an arrow A71 is a portion of one block of the block diagonal matrix, and a width (thickness) W11 of the block is 2n+1. That is, in the square portion indicated by the arrow A71, (2n+1) elements are arranged in the row direction, and (2n+1) elements are also arranged in the column direction.

[0165] When the rotation matrix R’(a(.beta.)) that is such a block diagonal matrix is used, the operation amount can be reduced to some extent, but if the operation amount can be further reduced, the drive signal can be obtained more quickly and efficiently.

[0166] Accordingly, in the fifth method, attention is paid to characteristics of the rotation matrix with respect to minute rotation, and the operation amount can be reduced to the order of the square with respect to the degree J by following the rotation of the head of the listener (user) by accumulation of the minute rotation.

[0167] Hereinafter, the fifth method will be specifically described.

[0168] Among three axes of rotation of the head of the listener, that is, the rotation matrix R’(u(.alpha.)), the rotation matrix R’(a(.beta.)), and the rotation matrix R’(u(.gamma.)), only the rotation matrix R’(a(.beta.)) is the block diagonal matrix, and the other rotation matrices R’(u(.alpha.)) and R’(u(.gamma.)) are complete diagonal matrices.

[0169] However, depending on how to select the rotation axis, two or more rotation matrices may be block diagonal matrices. In the example of the present description, although a rotation axis in which two or more rotation matrices are block diagonal matrices is not used, the present technology can also be applied to a case where two or more rotation matrices are block diagonal matrices.

[0170] It is assumed that the angle .beta. when the listener is facing the front direction in the up and down direction (vertical direction), that is, the elevation angle direction is zero degrees.

[0171] When the listener moves the head by +1 degree in the upward direction (a positive direction of the z axis) from the state where the angle .beta. is zero degrees, that is, when the listener rotates the head by +1 degree in the positive direction of the z axis with the y axis being the rotation axis, the angle .beta. is one degree.

[0172] As described above, the rotation matrix R’(a(.beta.)) when the angle .beta. is one degree is as illustrated in FIG. 8.

[0173] In the example illustrated in FIG. 8, it can be seen that the rotation matrix R’(a(.beta.)) is a block diagonal matrix, and the portion of each block of the block diagonal matrix is a square having (2n+1) elements on one side for each degree n. At the same time, a rotation matrix R’(g) that is a synthesis of the rotation matrix R’(a(.beta.)), the rotation matrix R’(u(.alpha.)) that is a diagonal matrix, and the rotation matrix R’(u(.gamma.)) that is a diagonal matrix is also a similar block diagonal matrix. Here, since the direction g.sub.j may be either a discrete value or a continuous value, g.sub.j will also be simply referred to as g in the following description.

[0174] Now, when the head-related transfer function of the spherical harmonic domain is rotated by the angle of a direction g by using one block of the rotation matrix R’(g), which is a block diagonal matrix, that is, the block of a certain degree n, the head-related transfer function H’.sub.n.sup.m(g.sup.-1) after the rotation is expressed by following Equation (35).

[Equation 35]

$$H'^{m}_{n}(g^{-1}) = \sum_{k=-n}^{n} H'^{k}_{n}\, R'^{(n)}_{k,m}(g) \tag{35}$$

[0175] Note that in Equation (35), k represents the order before rotation, and m represents the order after rotation. Furthermore, H’.sub.n.sup.k indicates elements of the degree n and an order k in the row vector H.sub.S(.omega.).

[0176] From the calculation of such equation (35), it can be seen that all (2n+1) elements R’.sup.(n).sub.k, m(g) are used to obtain the element of the order m after one rotation.

[0177] However, in the rotation when the angle .beta. is minute, such as when the angle .beta.=1 degree, most of the respective elements of the rotation matrix R’(a(.beta.)) which is a block diagonal matrix have minute values. Therefore, most of the respective elements R’.sup.(n).sub.k, m(g) of the rotation matrix R’(g) also have minute values.

[0178] That is, for example, the rotation matrix R’(a(.beta.)) illustrated in FIG. 9 indicates the rotation matrix R’(a(.beta.)) when the angle .beta. is one degree, which is the same as the rotation matrix R’(a(.beta.)) illustrated in FIG. 8.

[0179] That is, in FIG. 9, the horizontal axis represents the components of the columns of the rotation matrix R’(a(.beta.)), and the vertical axis represents the components of the rows of the rotation matrix R’(a(.beta.)). Furthermore, shading at respective positions of the rotation matrix R’(a(.beta.)) indicates a level (dB) of the element of the rotation matrix R’(a(.beta.)) corresponding to those positions.

[0180] However, while the range of the level of each element of the rotation matrix R’(a(.beta.)) is -400 dB to zero dB in FIG. 8, the range of the level of each element of the rotation matrix R’(a(.beta.)) is limited to -100 dB to zero dB in FIG. 9.

[0181] As in the example illustrated in FIG. 9, when an element having a valid value in the rotation matrix R’(a(.beta.)) is an element having a level of -100 dB to 0 dB, it can be seen that an element having a valid value exists only around the diagonal components.

[0182] Moreover, it can be seen that the number of elements having valid values when one row of the rotation matrix R’(a(.beta.)) is viewed, that is, the number of elements having valid values (hereinafter, also referred to as effective element width) arranged continuously in the horizontal direction in FIG. 9 is almost the same in all degrees n.

[0183] Thus, even as the degree n increases, the number of elements having valid values over all the degrees n is only on the order of the square of J, where J is approximately the maximum value of the degree n.

[0184] Accordingly, if an element of a value within a range of a predetermined level such as an element at a level from -100 dB to zero dB of the rotation matrix R’(a(.beta.)) is set as an effective element, and an operation of rotating the head-related transfer function in the spherical harmonic domain is performed using only the effective element, the operation amount can be reduced. In other words, if the element of the value within the range of the predetermined level of the rotation matrix R’(g) is set as the effective element, and the operation of rotating the head-related transfer function in the spherical harmonic domain is performed using only the effective element, the operation amount can be reduced. An effective element width of the rotation matrix R’(g) is the same as the effective element width of the rotation matrix R’(a(.beta.)).
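The selection of effective elements by level can be illustrated with a minimal Python sketch. It assumes that the level of an element is 20 log10 of its magnitude and that the half width C is the largest distance from the diagonal at which an element still reaches the threshold; the function names and the small test block are hypothetical and are not taken from the present description.

```python
import math

def level_db(x):
    """Level of one matrix element in dB, taken as 20*log10 of its magnitude."""
    mag = abs(x)
    return -math.inf if mag == 0.0 else 20.0 * math.log10(mag)

def effective_half_width(block, floor_db=-100.0):
    """Largest |row - column| among elements whose level is at or above floor_db."""
    c = 0
    for r, row in enumerate(block):
        for col, value in enumerate(row):
            if level_db(value) >= floor_db:
                c = max(c, abs(r - col))
    return c

# Hypothetical 5x5 block whose magnitude drops by 60 dB per step off the diagonal.
block = [[10.0 ** (-3 * abs(r - c)) for c in range(5)] for r in range(5)]
C = effective_half_width(block)   # the effective element width is then 2*C + 1
```

With the -100 dB threshold, only the diagonal and its immediate neighbors of this hypothetical block qualify, so the effective element width is three.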

[0185] For example, in a case where the effective element width is 2C+1, the above-described calculation of Equation (35) is as represented in following Equation (36).

[Equation 36]

$$H'^{m}_{n}(g^{-1}) \approx \sum_{k=\max(-n,\,m-C)}^{\min(n,\,m+C)} H'^{k}_{n}\, R'^{(n)}_{k,m}(g) \tag{36}$$

[0186] Here, in Equation (36), min(a, b) represents a function for selecting the smaller one of a and b. Furthermore, in Equation (36), max(a, b) represents a function for selecting the larger one of a and b.

[0187] In Equation (35), (2n+1) elements R’.sup.(n).sub.k, m(g) in which the order k is from -n to n are used for each degree n, but in the calculation of Equation (36), only (2C+1) elements R’.sup.(n).sub.k, m(g) in which the order k is within a range from m-C to m+C with m being a center are used, and reduction in the operation amount is achieved. Note that, in a case where m+C is larger than n or in a case where m-C is smaller than -n, the range of k is limited up to n or down to -n, respectively, so as not to exceed the range of the matrix. In this manner, the operation amount can be reduced by performing the operation while limiting the order k, that is, by performing the operation only for the element in which the order k is a value within the range determined by C.
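The difference between Equation (35) and Equation (36) can be sketched in Python for one block of degree n. The sketch assumes that the coefficients and the block are stored as plain lists in which list index = order + n; the function names and the sample values are hypothetical.

```python
def rotate_block_full(h, R, n):
    """Equation (35): each output order m sums over all 2n+1 input orders k."""
    size = 2 * n + 1
    return [sum(h[k] * R[k][m] for k in range(size)) for m in range(size)]

def rotate_block_banded(h, R, n, C):
    """Equation (36): only orders k with |k - m| <= C are used, clipped to [-n, n]."""
    out = []
    for m in range(-n, n + 1):
        lo = max(-n, m - C)
        hi = min(n, m + C)
        # list index = order + n
        out.append(sum(h[k + n] * R[k + n][m + n] for k in range(lo, hi + 1)))
    return out

# Hypothetical degree-2 block that is zero outside one element from the diagonal.
n = 2
R = [[1.0 if r == c else 0.1 if abs(r - c) == 1 else 0.0 for c in range(5)]
     for r in range(5)]
h = [1.0, 2.0, 3.0, 4.0, 5.0]
full = rotate_block_full(h, R, n)
banded = rotate_block_banded(h, R, n, C=1)
```

Because this sample block has no non-zero element farther than one position from the diagonal, the banded sum with C=1 reproduces the full sum; for a real rotation matrix the two differ by the discarded small elements.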

[0188] In this case, since the effective element width 2C+1 is the same in all the degrees n, it can be seen that the fifth method is more advantageous in terms of operation as the degree J is larger as compared with the fourth method described above.
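The growth of the operation amount with the degree J can be checked with a short Python sketch that simply counts the multiplications in Equation (35) and Equation (36) for all degrees n from 0 to J. The counts are per frequency bin and per ear, and the function names are hypothetical.

```python
def full_multiplies(J):
    """Multiplications when every block of degree n uses all (2n+1)^2 elements."""
    return sum((2 * n + 1) ** 2 for n in range(J + 1))

def banded_multiplies(J, C):
    """Multiplications when each order m uses only orders k with |k - m| <= C."""
    total = 0
    for n in range(J + 1):
        for m in range(-n, n + 1):
            total += min(n, m + C) - max(-n, m - C) + 1
    return total

# Example: compare the two counts for a hypothetical degree J and half width C.
J, C = 20, 3
ratio = banded_multiplies(J, C) / full_multiplies(J)
```

The full count grows roughly with the cube of J, while the banded count with a fixed C grows roughly with the square of J, so the ratio shrinks as J increases.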

[0189] Note that, in Equation (36), a constant C determined from the effective element width is applied to all the degrees n. However, the C that determines the effective element width 2C+1 is not limited to a constant, and a function C(n) (where C(n)

[0190] Furthermore, an element used for the operation of the rotation matrix R’(a(.beta.)) may be the element itself of the rotation matrix R’(a(.beta.)) or an approximate value of the element of the rotation matrix R’(a(.beta.)).

[0191] That is, more generally speaking, it is assumed that the rotation matrix R’(a(.beta.)) can be expressed as R’(a(.beta.))=A.sub.1+A.sub.2+A.sub.3+ … by a combination of a plurality of matrices. In this case, for an approximate rotation matrix Rs’(a(.beta.)) represented by the sum of some of those matrices that constitute the rotation matrix R’(a(.beta.)), it is only required to perform the operation by using fewer elements than (2n+1).times.(2n+1) in the nth-order block.

[0192] For example, an nth-order block diagonal matrix R’.sup.(n)(a(.beta.)) of the rotation matrix R’(a(.beta.)) can be expressed by following Equation (37).

[Equation 37]

$$R'^{(n)}(a(\beta)) = \exp\left(i\beta V_y^{(n)}\right) = E + i\beta V_y^{(n)} - \frac{\beta^{2}}{2!}\left(V_y^{(n)}\right)^{2} - \frac{i\beta^{3}}{3!}\left(V_y^{(n)}\right)^{3} + \cdots \tag{37}$$

[0193] Here, the matrix V.sub.y.sup.(n) in Equation (37) is expressed as following Equation (38). In a case where it is desired to set a thickness of the approximate rotation matrix Rs’(a(.beta.)) to C using the matrix V.sub.y.sup.(n), it is only required to perform the calculation by limiting the calculation up to the C-th power in the polynomial of the matrix represented in Equation (37).

[Equation 38]

$$V_y^{(n)} = \frac{1}{2i}\begin{pmatrix} 0 & \sqrt{2n} & 0 & \cdots & 0 \\ -\sqrt{2n} & 0 & \sqrt{2(2n-1)} & & \vdots \\ 0 & -\sqrt{(2n-1)\cdot 2} & 0 & \ddots & 0 \\ \vdots & & \ddots & \ddots & \sqrt{2n} \\ 0 & \cdots & 0 & -\sqrt{2n} & 0 \end{pmatrix} \tag{38}$$

[0194] In this manner, in the rotation matrix Rs’(a(.beta.)) used as the rotation matrix R’(a(.beta.)), elements having non-zero values are substantially only diagonal components. Therefore, if a rotation operation of rotating the head-related transfer function using a non-zero element of the rotation matrix R’(g) obtained using the rotation matrix Rs’(a(.beta.)), that is, a matrix operation of the rotation matrix R’(g) and the row vector H.sub.S(.omega.) is performed, consequently, an operation with a limited order of the rotation matrix R’(g) is performed, and the operation amount can be reduced.

[0195] Note that in this case, for example, the rotation matrix R’(u(.alpha.)), the rotation matrix Rs’(a(.beta.)), and the rotation matrix R’(u(.gamma.)) are synthesized to form a rotation matrix R’(g), and a matrix operation with limited orders is performed.
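The effect of limiting the polynomial of Equation (37) to the C-th power can be sketched in Python. The sketch assumes that the matrix of Equation (38) is the usual tridiagonal generator whose off-diagonal entries are the square roots of (n-m)(n+m+1), with orders arranged from -n to n; this form, the function names, and the sample values are assumptions made for illustration. Because the generator is tridiagonal, its p-th power has no non-zero element farther than p positions from the diagonal, so the truncated sum has a thickness of C.

```python
import math

def v_y(n):
    """Assumed generator block V_y^(n): tridiagonal, size 2n+1, orders -n..n."""
    size = 2 * n + 1
    V = [[0j] * size for _ in range(size)]
    for idx in range(size - 1):
        m = idx - n                              # order of row idx
        s = math.sqrt((n - m) * (n + m + 1))     # sqrt(2n), sqrt(2(2n-1)), ...
        V[idx][idx + 1] = s / 2j                 # entry above the diagonal
        V[idx + 1][idx] = -s / 2j                # entry below the diagonal
    return V

def matmul(A, B):
    """Plain square matrix product."""
    size = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def truncated_rotation(n, beta, C):
    """Sum of the Taylor terms (i*beta*V)^p / p! for p = 0..C."""
    size = 2 * n + 1
    V = v_y(n)
    term = [[1 + 0j if r == c else 0j for c in range(size)] for r in range(size)]
    total = [row[:] for row in term]             # starts as the unit matrix E
    for p in range(1, C + 1):
        term = matmul(term, V)
        term = [[x * 1j * beta / p for x in row] for row in term]
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

def half_bandwidth(M):
    """Largest |row - column| at which M has a non-zero element."""
    return max((abs(r - c) for r, row in enumerate(M)
                for c, x in enumerate(row) if x != 0), default=0)

Rs = truncated_rotation(3, math.radians(1.0), 2)   # degree 3, 1 degree, C = 2
```

Here `half_bandwidth(Rs)` equals the truncation power, so an approximate matrix built this way can be applied with the limited sum of Equation (36) without discarding any of its elements.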

[0196] In a case of following the rotation of the head of the listener by the fifth method as described above, for example, it is assumed that the listener has rotated the head by 30 degrees in the upward direction, that is, the elevation angle direction. That is, it is assumed that the elevation angle (angle .beta.) indicating the direction of the head of the listener is 30 degrees.

[0197] In this case, the rotation matrix R’(a(.beta.)) is as illustrated in FIG. 10. Note that in FIG. 10, the horizontal axis represents components of the columns of the rotation matrix R’(a(.beta.)), and the vertical axis represents components of the rows of the rotation matrix R’(a(.beta.)). Furthermore, shading at respective positions of the rotation matrix R’(a(.beta.)) indicates a level (dB) of the element of the rotation matrix R’(a(.beta.)) corresponding to those positions.

[0198] In FIG. 10, as in the case of FIG. 9, the range of the level of each element of the rotation matrix R’(a(.beta.)) is from -100 dB to 0 dB.

[0199] However, in the example illustrated in FIG. 10, as the degree n increases, the effective element width of the block for the degree n increases (enlarges). That is, even if the components of -100 dB or less are discarded, the rotation matrix R’(a(.beta.)) becomes a block diagonal matrix having a large effective element width.

[0200] As described above, with the rotation matrix R’(a(.beta.)), while the effective element width is narrow and the operation amount can be reduced as described with reference to FIG. 9 when the rotation angle .beta. is small, the effective element width increases and the operation amount reduction effect decreases as the rotation angle .beta. increases.

[0201] Furthermore, if this continues, as the rotation of the head with respect to the elevation angle direction of the listener increases, the constant C for determining the effective element width 2C+1 has to be increased.

[0202] In order to follow the rotation of the head up to the rotation angle .beta. in the large elevation angle direction while keeping the operation amount small, it is only required to use accumulation of minute rotations.

[0203] That is, for example, the direction of the head of the listener (user) at a predetermined time is expressed as (.alpha., .beta., .gamma.) using the Euler angle. Here, the angle .alpha., the angle .beta., and the angle .gamma. respectively correspond to the rotation angle .alpha., the rotation angle .beta., and the rotation angle .gamma. described above. Note that here, the direction g, which is the rotation direction of the head of the listener, is represented using the Euler angle, but may be represented by another method such as quaternion, for example. Hereinafter, the description will be continued assuming that the direction g is represented using the Euler angle unless otherwise specified.

[0204] In particular, the angle .alpha. and the angle .gamma. are horizontal angles viewed from the listener, and the angle .beta. is an elevation angle viewed from the listener. Hereinafter, in particular, the angle .beta. at time t is referred to as an angle .beta..sub.t. Similarly, hereinafter, the angle .alpha. and the angle .gamma. at time t are referred to as an angle .alpha..sub.t and an angle .gamma..sub.t, respectively.

[0205] In a case where accumulation of minute rotations is used, it is only required to obtain a difference .DELTA.g.sub.t=g.sub.tg.sub.t-1.sup.-1 between an angle g.sub.t indicating the direction g at time t and an angle g.sub.t-1 at time (t-1) immediately before time t, and to rotate the rotation matrix R’(g.sub.t-1) obtained last time by the amount of the difference .DELTA.g.sub.t, thereby obtaining the rotation matrix R’(g.sub.t). That is, a product of the rotation matrix R’(g.sub.t-1) obtained at the previous time (t-1) and the rotation matrix R’(.DELTA.g.sub.t) corresponding to the difference .DELTA.g.sub.t is only required to be set as the rotation matrix R’(g.sub.t) at time t.

[0206] Thus, it is possible to obtain the rotation matrix R’(g.sub.t) with a smaller operation amount by using the rotation matrix R’(.DELTA.g.sub.t)=R’(u(.DELTA..alpha..sub.t))Rs’(a(.DELTA..beta..sub.t))R’(u(.DELTA..gamma..sub.t)). This rotation matrix is obtained by synthesizing three rotation matrices for the components of the difference .DELTA.g.sub.t, which are minute rotation angles: the rotation matrix Rs’(a(.DELTA..beta..sub.t)) having a small effective element width for the difference .DELTA..beta..sub.t, the rotation matrix R’(u(.DELTA..alpha..sub.t)) which is a diagonal matrix for the difference .DELTA..alpha..sub.t, and the rotation matrix R’(u(.DELTA..gamma..sub.t)) which is a diagonal matrix for the difference .DELTA..gamma..sub.t.

[0207] Note that the difference .DELTA..alpha..sub.t, the difference .DELTA..beta..sub.t, and the difference .DELTA..gamma..sub.t are Euler angles such that .DELTA.g.sub.t=u(.DELTA..alpha..sub.t)a(.DELTA..beta..sub.t)u(.DELTA..gamma..sub.t).
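The idea of accumulating minute rotations can be illustrated with a deliberately simplified Python sketch. It assumes rotation about a single axis only, so that the difference between frames is a plain subtraction of angles, and it uses an ordinary 2x2 plane rotation as a stand-in for the rotation matrix of the spherical harmonic domain; the function names are hypothetical.

```python
import math

def rot2(theta):
    """2x2 plane rotation, used here only as a stand-in for R'(g)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def accumulate(angles):
    """Multiply the per-frame difference rotations, starting from the identity."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    prev = 0.0
    for angle_t in angles:
        delta = angle_t - prev           # minute rotation since the previous frame
        R = matmul2(R, rot2(delta))      # previous matrix times difference matrix
        prev = angle_t
    return R

# The head reaches 30 degrees in 1-degree steps, one step per frame.
angles = [math.radians(d) for d in range(1, 31)]
R_accumulated = accumulate(angles)
R_direct = rot2(math.radians(30.0))
```

Each per-frame factor corresponds to a rotation of only one degree, yet the accumulated product agrees with the matrix for the full 30-degree rotation up to rounding error.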

[0208]

[0209] Here, an audio processing device to which the present technology described above is applied will be described. FIG. 11 is a diagram illustrating a configuration example of one embodiment of the audio processing device to which the present technology is applied.

[0210] An audio processing device 11 illustrated in FIG. 11 is a signal processing device that is built into, for example, headphones or the like, receives an input signal D’.sub.n.sup.m(.omega.) of a spherical harmonic domain that is an acoustic signal of a sound to be reproduced, and outputs drive signals of sounds of two channels in a time domain. Note that, although an example in which the audio processing device 11 is incorporated in the headphones will be described here, the audio processing device 11 may be incorporated in a device other than the headphones, or may be a device separate from the headphones or the like.

[0211] The audio processing device 11 includes a head rotation sensor unit 21, a previous direction retention unit 22, a rotation matrix operation unit 23, a rotation operation unit 24, a rotation coefficient retention unit 25, a head-related transfer function retention unit 26, a head-related transfer function synthesis unit 27, and a time-frequency inverse transform unit 28.

[0212] The head rotation sensor unit 21 includes, for example, an acceleration sensor, an image sensor, or the like attached to the head of the listener (user) as necessary, detects rotation (movement) of the head of the listener, and supplies a detection result to the rotation matrix operation unit 23.

[0213] Note that the listener here is a user wearing headphones, that is, a user who listens to sound reproduced by the headphones on the basis of drive signals of the left and right headphones obtained by the time-frequency inverse transform unit 28.

[0214] In the head rotation sensor unit 21, an angle .alpha..sub.t, an angle .beta..sub.t, and an angle .gamma..sub.t at the current time t are obtained as detection results of the rotation of the head of the listener, that is, the direction in which the head of the listener faces. Hereinafter, the information indicating the direction (rotation) of the head of the listener including the angle .alpha..sub.t, the angle .beta..sub.t, and the angle .gamma..sub.t will also be referred to as head rotation information. The direction at certain time t indicated by the head rotation information is an angle g.sub.t corresponding to the above-described direction g, and is, for example, angle information indicating the direction of the head with reference to the x-axis direction.

[0215] The previous direction retention unit 22 retains the angle at each time supplied from the rotation matrix operation unit 23 as previous direction information, and supplies the previous direction information retained to the rotation matrix operation unit 23 at the next time. Therefore, for example, when the head rotation information at time t is supplied from the head rotation sensor unit 21 to the rotation matrix operation unit 23, the angle g.sub.t-1 at time (t-1) is supplied from the previous direction retention unit 22 to the rotation matrix operation unit 23 as the previous direction information.

[0216] The rotation matrix operation unit 23 retains a table indicating the rotation matrix R’(u(.alpha.)) at each angle .alpha. and a table indicating the rotation matrix R’(a(.beta.)) at each angle .beta.. Note that the table indicating the rotation matrix R’(u(.alpha.)) is also used when the rotation matrix R’(u(.gamma.)) is obtained. That is, the tables of the rotation matrix R’(u(.alpha.)) and the rotation matrix R’(u(.gamma.)) are used in common.

[0217] The rotation matrix operation unit 23 obtains and outputs the rotation matrix R’(u(.DELTA..alpha..sub.t)), a rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)) on the basis of the retained table, the head rotation information supplied from the head rotation sensor unit 21, and the previous direction information supplied from the previous direction retention unit 22. The rotation matrix operation unit 23 supplies the rotation matrix R’(u(.DELTA..alpha..sub.t)), the rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)) to the rotation operation unit 24.

[0218] The rotation matrix R’(.DELTA.g.sub.t) that is a synthesis of the rotation matrix R’(u(.DELTA..alpha..sub.t)), the rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)) is a rotation matrix that performs rotation by an angle (difference .DELTA.g.sub.t) of a difference between the rotation g.sub.t of the head of the listener at time t and the rotation g.sub.t-1 of the head of the listener at time (t-1).

[0219] Note that, for the rotation matrix R’(u(.DELTA..alpha..sub.t)), the rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)), the rotation matrix operation unit 23 may obtain the rotation matrix R’(u(.DELTA..alpha..sub.t)), the rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)) by operation on the basis of the differences .DELTA..alpha..sub.t, .DELTA..beta..sub.t, and .DELTA..gamma..sub.t instead of using a table. Furthermore, the table of the rotation matrix R’(a(.DELTA..beta..sub.t)) may indicate the rotation matrix Rs’(a(.DELTA..beta..sub.t)) that is an approximation of the rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix Rs’(a(.DELTA..beta..sub.t)) may be obtained by operation instead of from the table.

[0220] Furthermore, the rotation matrix operation unit 23 supplies the head rotation information g.sub.t supplied from the head rotation sensor unit 21 to the previous direction retention unit 22 as the previous direction information, and causes the previous direction retention unit 22 to retain the information.

[0221] The rotation operation unit 24 calculates a row vector H’(g.sub.t.sup.-1, .omega.) and supplies the row vector H’(g.sub.t.sup.-1, .omega.) to the rotation coefficient retention unit 25 and the head-related transfer function synthesis unit 27.

[0222] Here, the row vector H’(g.sub.t.sup.-1, .omega.) is a row vector obtained by performing a rotation operation of rotating the head-related transfer function of the spherical harmonic domain, that is, the row vector H.sub.S(.omega.) by the angle g.sub.t on the basis of the rotation matrix R’(g.sub.t) at time t.

[0223] In practice, the rotation operation unit 24 calculates the row vector H’(g.sub.t.sup.-1, .omega.) at time t on the basis of the rotation matrix R’(.DELTA.g.sub.t) supplied from the rotation matrix operation unit 23 and the row vector H’(g.sub.t-1.sup.-1, .omega.) at time (t-1) supplied from the rotation coefficient retention unit 25.

[0224] Such an operation is a rotation operation of further performing rotation by an angle indicated by the difference .DELTA.g.sub.t with respect to an operation result of the rotation operation at time (t-1), that is, the head-related transfer function after rotation obtained by the rotation operation of rotating the row vector H.sub.S (.omega.) by the angle g.sub.t-1.

[0225] Moreover, the rotation operation based on the rotation matrix R’(.DELTA.g.sub.t) is a matrix operation in which a calculation is performed only for an element having the order k within a range determined by a predetermined value C in the rotation matrix R’(.DELTA.g.sub.t), that is, an operation limited by the order k is performed. Therefore, it can be said that the rotation matrix R’(.DELTA.g.sub.t) is a rotation matrix in which only the element having the order k within the range determined by the predetermined value C is an element having a non-zero valid value, that is, limited by the order k.

[0226] Note that at the start of processing, that is, in a state where there is no row vector H’(g.sub.t-1.sup.-1, .omega.), the rotation operation unit 24 calculates the row vector H’(g.sub.t.sup.-1, .omega.) on the basis of the row vector H.sub.S(.omega.) of the head-related transfer function supplied from the head-related transfer function retention unit 26 and the rotation matrix R’(.DELTA.g.sub.t) supplied from the rotation matrix operation unit 23. In this case, since the angle g.sub.t-1 is zero degrees, the rotation matrix R’(.DELTA.g.sub.t) is equivalent to the rotation matrix R’(g.sub.t).

[0227] The rotation coefficient retention unit 25 retains the row vector H’(g.sub.t.sup.-1, .omega.) at time t supplied from the rotation operation unit 24, and supplies the row vector H’(g.sub.t.sup.-1, .omega.) retained at next time (t+1) to the rotation operation unit 24.
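The interplay between the retained previous direction, the retained rotated coefficients, and the per-frame difference can be sketched in Python for the simplest case of horizontal rotation only, where the rotation matrix is diagonal. The sketch assumes that the diagonal element for order m is the phase factor exp(-i m .DELTA..alpha.); this sign convention, the class name, and the dictionary layout keyed by (n, m) are assumptions made for illustration.

```python
import cmath

class AzimuthTracker:
    """Keeps the previous angle and the previously rotated coefficients."""

    def __init__(self, h_s):
        self.h = dict(h_s)        # rotated coefficients, keyed by (degree, order)
        self.prev_alpha = 0.0     # previous direction information

    def update(self, alpha_t):
        """Rotate the retained coefficients by the difference since the last call."""
        delta = alpha_t - self.prev_alpha
        # Diagonal rotation: each coefficient is scaled by its own phase factor
        # (assumed sign convention).
        self.h = {(n, m): value * cmath.exp(-1j * m * delta)
                  for (n, m), value in self.h.items()}
        self.prev_alpha = alpha_t
        return self.h

# Hypothetical coefficients of degrees 0 and 1 for one frequency bin.
tracker = AzimuthTracker({(0, 0): 1 + 0j, (1, -1): 1j, (1, 1): 2 + 0j})
for alpha in (0.1, 0.25, 0.4):    # head angle in radians at successive frames
    tracker.update(alpha)
```

After the three updates the retained coefficients equal the original coefficients rotated once by the final angle, which is the property the incremental operation relies on.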

[0228] The head-related transfer function retention unit 26 retains a predetermined row vector H.sub.S (.omega.) or a row vector H.sub.S (.omega.) supplied from the outside, and supplies the retained row vector H.sub.S(.omega.) to the rotation operation unit 24. Note that the row vector H.sub.S(.omega.) may be prepared for each listener (user), or the row vector H.sub.S(.omega.) common to all listeners or a plurality of listeners constituting one group may be prepared.

[0229] Here, the row vector H’(g.sup.-1, .omega.) is a matrix obtained by rotating the row vector H.sub.S(.omega.) including the head-related transfer function in the spherical harmonic domain by the rotation matrix R’(g.sup.-1), that is, a matrix including the head-related transfer function after rotation. In other words, the row vector H’(g.sup.-1, .omega.) is a matrix (vector) including, as an element, a head-related transfer function rotated by an angle .alpha. in the horizontal direction, an angle .beta. in the elevation angle direction, and an angle .gamma. in the horizontal direction by an angle determined by the direction of the head of the listener in the spherical harmonic domain.

[0230] Note that, here, the example has been described in which the head-related transfer function is rotated by the amount of the difference between rotations at time t and time (t-1) using the row vector H’(g.sub.t-1.sup.-1, .omega.) that is the operation result at time (t-1) in all the directions of the angle .beta., the angle .alpha., and the angle .gamma.. However, it is not limited thereto, and the result of the rotation operation of the head-related transfer function at time (t-1) may be further rotated by the amount of the difference between the angles at time t and time (t-1) with respect to the direction (rotation direction) of at least one of the angle .alpha., the angle .beta. or the angle .gamma..

[0231] The head-related transfer function synthesis unit 27 synthesizes the input signal D’.sub.n.sup.m(.omega.) for each time-frequency bin .omega. that is a sound signal in the spherical harmonic domain supplied from the outside with the row vector H’(g.sub.t.sup.-1, .omega.) supplied from the rotation operation unit 24 to generate drive signals of the left and right headphones.

[0232] That is, the head-related transfer function synthesis unit 27 calculates a drive signal P.sub.l(g, .omega.) and a drive signal P.sub.r(g, .omega.) of the left and right headphones by obtaining a product of the row vector H’(g.sub.t.sup.-1, .omega.) and the matrix D’(.omega.) including the input signal D’.sub.n.sup.m(.omega.), which is a sound signal in the spherical harmonic domain, for each of the left and right headphones, and supplies the drive signal P.sub.l(g, .omega.) and the drive signal P.sub.r(g, .omega.) to the time-frequency inverse transform unit 28.
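The product described above can be sketched in Python for a single time-frequency bin. The sketch assumes that the rotated head-related transfer function coefficients and the input signal coefficients are stored as flat lists in the same (degree, order) sequence; the function names and sample values are hypothetical.

```python
def drive_signal(h_rot, d):
    """Product of the rotated row vector and the input column vector for one bin."""
    if len(h_rot) != len(d):
        raise ValueError("coefficient vectors must have the same length")
    return sum(h * x for h, x in zip(h_rot, d))

def binaural_bin(h_left, h_right, d):
    """Left and right drive signals for one time-frequency bin."""
    return drive_signal(h_left, d), drive_signal(h_right, d)

# Hypothetical two-coefficient example.
h_left = [1 + 0j, 0.5j]
h_right = [1 + 0j, -0.5j]
d = [2 + 0j, 2 + 0j]
p_left, p_right = binaural_bin(h_left, h_right, d)
```

The same input coefficients are used for both ears; only the rotated head-related transfer function row vector differs between left and right.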

[0233] Here, the drive signal P.sub.l(g, .omega.) is a drive signal (binaural signal) of the left headphone in the time-frequency domain, and the drive signal P.sub.r(g, .omega.) is a drive signal (binaural signal) of the right headphone in the time-frequency domain.

[0234] In the head-related transfer function synthesis unit 27, synthesis of the head-related transfer function with respect to the input signal and spherical harmonic inverse transform with respect to the input signal are simultaneously performed.

[0235] The time-frequency inverse transform unit 28 performs the time-frequency inverse transform on the drive signal in the time-frequency domain supplied from the head-related transfer function synthesis unit 27 for each of the left and right headphones to obtain the drive signal p.sub.l(g, t) of the left headphone in the time domain and the drive signal p.sub.r(g, t) of the right headphone in the time domain, and outputs these drive signals to the subsequent stage. In a reproduction device that reproduces sound in two channels or a plurality of channels, such as headphones in a subsequent stage, more specifically, headphones including an earphone or a speaker using a transaural technology, sound is reproduced on the basis of the drive signal output from the time-frequency inverse transform unit 28. Note that in a case where an input signal has not been subjected to the time-frequency conversion, a time-frequency transform unit is provided at an input part of the signal, that is, for example, at a preceding stage of the head-related transfer function synthesis unit 27, or a convolution operation in the time domain is performed by the head-related transfer function synthesis unit 27.

[0236] Here, processing in each unit of the audio processing device 11 will be specifically described.

[0237] For example, the rotation matrix operation unit 23 obtains the head rotation information at time t, that is, a difference .DELTA.g.sub.t=g.sub.tg.sub.t-1.sup.-1 between the angle g.sub.t at time t and the angle g.sub.t-1 at time (t-1). Then, the rotation matrix operation unit 23 obtains the difference .DELTA..beta..sub.t, the difference .DELTA..alpha..sub.t, and the difference .DELTA..gamma..sub.t from the difference .DELTA.g.sub.t, and reads the rotation matrix R’(a(.beta.)) when the angle .beta. is the difference .DELTA..beta..sub.t and the rotation matrix R’(u(.alpha.)) when the angle .alpha. is the difference .DELTA..alpha..sub.t and the difference .DELTA..gamma..sub.t from the retained tables of the rotation matrix R’(a(.beta.)) and the rotation matrix R’(u(.alpha.)) to obtain the rotation matrix R’(a(.DELTA..beta..sub.t)), the rotation matrix R’(u(.DELTA..alpha..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)).

[0238] Moreover, the rotation matrix operation unit 23 performs an operation similar to the above-described Equation (29) to synthesize the rotation matrix R’(u(.DELTA..alpha..sub.t)), the rotation matrix R’(a(.DELTA..beta..sub.t)), and the rotation matrix R’(u(.DELTA..gamma..sub.t)) obtained in this manner to obtain a rotation matrix R’(.DELTA.g.sub.t).

[0239] For example, when the difference .DELTA..beta..sub.t is obtained for each frame of the input signal D’.sub.n.sup.m(.omega.), the difference .DELTA..beta..sub.t is as illustrated in FIG. 12. Note that in FIG. 12, the vertical axis represents the angle .beta. (elevation angle .beta.) at each time, and the horizontal axis represents the time.

[0240] In the example illustrated in FIG. 12, a curve L11 indicates the angle .beta. at each time, and a portion of a region RZ11 in the curve L11 is enlarged as illustrated on a lower side in the drawing.

[0241] Here, a period from time (t-1) to time t is a period of one frame. Thus, the difference between the angle .beta..sub.t, which is the angle .beta. at time t, and the angle .beta..sub.t-1, which is the angle .beta. at time (t-1), is .DELTA..beta..sub.t.

[0242] In the rotation matrix operation unit 23, the rotation matrix R’(.DELTA.g.sub.t) obtained on the basis of the difference .DELTA.g.sub.t is supplied to the rotation operation unit 24, and the angle g.sub.t at time t is supplied to the previous direction retention unit 22 to update the previous direction information. That is, the newly supplied angle g.sub.t at time t is retained as the updated previous direction information.

[0243] The rotation operation unit 24 calculates the row vector H’(g.sub.t.sup.-1, .omega.) at time t on the basis of the rotation matrix R’(.DELTA.g.sub.t) and the row vector H’(g.sub.t-1.sup.-1, .omega.) at time (t-1).

[0244] For example, following Equation (39) holds for any rotation g.sub.1 and rotation g.sub.2.

[Equation 39]

R’(g.sub.1g.sub.2)=R’(g.sub.1)R’(g.sub.2) (39)

[0245] From this, following Equation (40) holds, and it can be seen that the row vector H’(g.sub.t.sup.-1, .omega.) is obtained by obtaining the product of the row vector H’(g.sub.t-1.sup.-1, .omega.) and the rotation matrix R’(.DELTA.g.sub.t).

[Equation 40]

H’(g.sub.t.sup.-1,.omega.)=H’(g.sub.t-1.sup.-1,.omega.)R’(.DELTA.g.sub.t) (40)

[0246] That is, it is assumed that the element of the degree n and the order m of the row vector H’(g.sub.t.sup.-1, .omega.) is H’.sub.n.sup.m(g.sub.t.sup.-1, .omega.), the element of the degree n and the orders k and m of the rotation matrix R’(.DELTA.g.sub.t) is R’.sup.(n).sub.k, m(.DELTA.g.sub.t), and the constant that determines the effective element width in the degree n of the rotation matrix R’(.DELTA.g.sub.t) is C. In this case, following Equation (41) holds. That is, each non-zero element of the row vector H’(g.sub.t.sup.-1, .omega.) can be obtained by the operation of following Equation (41).

[Equation 41]

$$H'^{m}_{n}(g_{t}^{-1}, \omega) = \sum_{k=-n}^{n} H'^{k}_{n}(g_{t-1}^{-1}, \omega)\, R'^{(n)}_{k,m}(\Delta g_{t}) \approx \sum_{k=\max(-n,\,m-C)}^{\min(n,\,m+C)} H'^{k}_{n}(g_{t-1}^{-1}, \omega)\, R'^{(n)}_{k,m}(\Delta g_{t}) \tag{41}$$

[0247] The rotation operation unit 24 obtains the row vector H’(g.sub.t.sup.-1, .omega.) by calculating Equation (41). In an operation of Equation (41), only (2C+1) elements whose order k is within the range from m-C to m+C centered on m are calculated, similarly to Equation (36) described above. However, the range is limited to -n.ltoreq.k.ltoreq.n. That is, the operation is a rotation operation in which the order k is limited, in which the operation is performed only for an element whose order k is a value within a range determined by C, and the operation amount is reduced.

[0248] Note that the rotation matrix operation unit 23 may sequentially obtain the rotation matrix R’(a(.DELTA..beta..sub.t)) by calculation, or may select the rotation matrix R’(a(.DELTA..beta..sub.t)) from one or more candidates prepared in advance.

[0249] Moreover, the method of calculating the rotation matrix R’(a(.DELTA..beta..sub.t)) at each time and the method of selecting the rotation matrix R’(a(.DELTA..beta..sub.t)) from one or more candidates may be combined. In this case, by changing the frequency of using each of these methods, the angle by which the head-related transfer function is rotated may be adjusted so as to follow the actual angle .beta..sub.t of rotation of the head of the listener.

[0250]

[0251] Next, drive signal generation processing performed by the audio processing device 11 will be described with reference to a flowchart of FIG. 13.

[0252] In step S11, the head rotation sensor unit 21 detects rotation of the head of the user who is a listener, and supplies the head rotation information obtained as a detection result thereof to the rotation matrix operation unit 23.

[0253] In step S12, the rotation matrix operation unit 23 obtains a difference .DELTA.g.sub.t between the angle g.sub.t of the head rotation information supplied from the head rotation sensor unit 21 and the angle g.sub.t-1 at time (t-1) retained as the previous direction information in the previous direction retention unit 22.

[0254] Furthermore, when the difference .DELTA.g.sub.t is obtained, the rotation matrix operation unit 23 supplies the angle g.sub.t of the head rotation information obtained in step S11 to the previous direction retention unit 22 to update the previous direction information. The previous direction retention unit 22 updates the previous direction information so that the angle g.sub.t supplied from the rotation matrix operation unit 23 becomes new previous direction information, and retains an update result thereof.

[0255] In step S13, on the basis of the difference .DELTA.g.sub.t obtained in step S12, the rotation matrix operation unit 23 obtains the rotation matrix R’(a(.DELTA..beta..sub.t)) in the elevation angle direction according to the elevation angle difference .DELTA..beta..sub.t contained in the difference .DELTA.g.sub.t. Note that in step S13, the rotation matrix operation unit 23 may instead obtain, as the rotation matrix R’(a(.DELTA..beta..sub.t)), the rotation matrix Rs’(a(.DELTA..beta..sub.t)) according to the difference .DELTA..beta..sub.t, which corresponds to the above-described rotation matrix Rs’(a(.beta.)).

[0256] In step S14, on the basis of the difference .DELTA..alpha..sub.t and the difference .DELTA..gamma..sub.t of the head rotation obtained from the difference .DELTA.g.sub.t obtained in step S12, the rotation matrix operation unit 23 obtains the rotation matrix R’(u(.DELTA..alpha..sub.t)) and the rotation matrix R’(u(.DELTA..gamma..sub.t)) in the horizontal direction according to the differences.

[0257] In step S15, the rotation matrix operation unit 23 synthesizes the rotation matrix R’(a(.DELTA..beta..sub.t)) in the elevation angle direction obtained in step S13 with the rotation matrix R’(u(.DELTA..alpha..sub.t)) and the rotation matrix R’(u(.DELTA..gamma..sub.t)) in the horizontal direction obtained in step S14 to obtain the rotation matrix R’(.DELTA.g.sub.t) that performs rotation by the amount of a difference in the entire rotation of the head, and supplies the rotation matrix R’(.DELTA.g.sub.t) to the rotation operation unit 24.

[0258] In step S16, the rotation operation unit 24 performs a rotation operation on the basis of the rotation matrix R’(.DELTA.g.sub.t) supplied from the rotation matrix operation unit 23 and the row vector H’(g.sub.t-1.sup.-1, .omega.) retained in the rotation coefficient retention unit 25.

[0259] That is, for example, in step S16, the above-described calculation of Equation (41) is performed as the rotation operation on the basis of the effective element width 2C+1 determined by the constant C, and the row vector H’(g.sub.t.sup.-1, .omega.) is calculated.

[0260] The rotation operation unit 24 supplies the obtained row vector H’(g.sub.t.sup.-1, .omega.) to the rotation coefficient retention unit 25 to retain, and also supplies the row vector H’(g.sub.t.sup.-1, .omega.) to the head-related transfer function synthesis unit 27.

[0261] In step S17, the head-related transfer function synthesis unit 27 synthesizes the supplied input signal D’.sub.n.sup.m(.omega.) and the row vector H’(g.sub.t.sup.-1, .omega.) of the head-related transfer function supplied from the rotation operation unit 24 to generate drive signals of the left and right headphones.

[0262] For example, in step S17, a product of the row vector H’(g.sub.t.sup.-1, .omega.) and the matrix D’(.omega.) is obtained for each of the left and right headphones, and the drive signal P.sub.l(g, .omega.) and the drive signal P.sub.r(g, .omega.) of the left and right headphones are calculated. The head-related transfer function synthesis unit 27 supplies the obtained drive signal P.sub.l(g, .omega.) and drive signal P.sub.r(g, .omega.) to the time-frequency inverse transform unit 28.
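
As a non-limiting illustration, the product obtained in step S17 may be sketched as follows. The sketch assumes that, for each frequency bin, the input signal D’(.omega.) is held as a column of K spherical harmonic coefficients, and that the rotated head-related transfer functions for the left and right ears are held as arrays of the same shape. The function name and array shapes are hypothetical.

```python
import numpy as np

def synthesize_drive_signals(H_left, H_right, D):
    """Multiply the rotated HRTF row vector by the input signal, per frequency bin.

    H_left, H_right : shape (num_bins, K), rotated HRTF coefficients for each ear.
    D               : shape (num_bins, K), input signal coefficients D'_n^m(omega).
    Returns P_left, P_right, each of shape (num_bins,).
    """
    # For each bin w: P[w] = sum over k of H[w, k] * D[w, k] (row vector times column vector).
    P_left = np.einsum("wk,wk->w", H_left, D)
    P_right = np.einsum("wk,wk->w", H_right, D)
    return P_left, P_right
```

Because the rotation has already been applied to the head-related transfer function, the input signal itself is not rotated in this sketch.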

[0263] In step S18, the time-frequency inverse transform unit 28 performs the time-frequency inverse transform on the drive signal P.sub.l(g, .omega.) and the drive signal P.sub.r(g, .omega.) supplied from the head-related transfer function synthesis unit 27, and outputs the resulting time-domain drive signal p.sub.l(g, t) and drive signal p.sub.r(g, t) to the subsequent stage. The drive signal generation processing then ends.
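
As a non-limiting illustration, the time-frequency inverse transform of step S18 may be sketched as follows, assuming that the drive signals were obtained with a one-sided discrete Fourier transform of real frames. The type of transform, the function name, and the parameter `frame_len` are assumptions of this sketch.

```python
import numpy as np

def to_time_domain(P, frame_len):
    """Inverse one-sided DFT of a frequency-domain drive signal.

    P         : shape (frame_len // 2 + 1,), drive signal for one ear.
    frame_len : number of time samples in the output frame.
    """
    return np.fft.irfft(P, n=frame_len)

# Round trip on arbitrary test data: forward transform, then inverse transform.
x = np.random.default_rng(0).standard_normal(16)
x_back = to_time_domain(np.fft.rfft(x), frame_len=16)
```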

[0264] As described above, the audio processing device 11 obtains the rotation matrix R’(.DELTA.g.sub.t) on the basis of the difference .DELTA.g.sub.t, and obtains the current row vector H’(g.sub.t.sup.-1, .omega.) on the basis of the rotation matrix R’(.DELTA.g.sub.t) and the previous row vector H’(g.sub.t-1.sup.-1, .omega.).

[0265] By accumulating rotation by the difference .DELTA.g.sub.t, which is a minute rotation angle, to obtain the row vector H’(g.sub.t.sup.-1, .omega.) in this manner, it is possible to reduce the memory amount and the operation amount to be used. Consequently, sound can be reproduced more efficiently. In particular, according to the fifth method described above, it is possible to obtain the drive signals with a memory amount equivalent to that of the fourth method and with a smaller operation amount than that of the fourth method.

Second Embodiment

[0266]

[0267] Incidentally, in the fifth method described above, since the operation is performed using only the elements in the block having the effective element width 2C+1 determined by the constant C, that is, only the effective elements, a non-negligible error occurs in the rotation matrix R’(g.sub.t), that is, in the row vector H’(g.sub.t.sup.-1, .omega.).

[0268] Furthermore, when the operation in which such an error occurs is performed repeatedly, the errors accumulate, and the row vector H’(g.sub.t.sup.-1, .omega.) drifts away from its true value. That is, the error of the row vector H’(g.sub.t.sup.-1, .omega.) increases.

[0269] Accordingly, accumulation of errors may be prevented by performing an operation to obtain the accurate rotation matrix R’(g.sub.t.sup.-1) at a predetermined timing and resetting the values of the rotation matrix R’(g.sub.t.sup.-1), that is, the row vector H’(g.sub.t.sup.-1, .omega.) (hereinafter, also simply referred to as reset). Hereinafter, a method of performing reset at a predetermined timing in the fifth method will also be referred to as a sixth method.

[0270] In the sixth method, an operation with an operation amount on the order of the cube of the degree n is required to obtain the row vector H’(g.sub.t.sup.-1, .omega.) at the time of reset, but the reset is not performed frequently, so that the operation amount can be reduced as a whole.
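
As a non-limiting illustration, the accumulation of errors and the effect of the reset may be examined with the following toy simulation. The matrix built by `generator` is a hypothetical stand-in (a skew-symmetric tridiagonal matrix whose exponential is orthogonal and nearly banded for small angles); it is not the rotation matrix R’ of the specification. All function names, the reset interval, and the numerical values are assumptions of this sketch.

```python
import numpy as np

def generator(n):
    """Skew-symmetric tridiagonal matrix of size (2n+1), used as a toy generator."""
    m = np.arange(-n, n)
    off = 0.5 * np.sqrt(n * (n + 1) - m * (m + 1))
    return np.diag(off, 1) - np.diag(off, -1)

def rotation(n, angle):
    """exp(angle * generator(n)), evaluated through a Hermitian eigendecomposition."""
    w, V = np.linalg.eigh(1j * generator(n))
    return ((V * np.exp(-1j * angle * w)) @ V.conj().T).real

def band_limit(R, C):
    """Zero every element whose indices differ by more than C."""
    k, m = np.indices(R.shape)
    return np.where(np.abs(k - m) <= C, R, 0.0)

def track(h0, n, delta, steps, C, reset_every=None):
    """Accumulate small rotations; optionally recompute exactly every reset_every steps."""
    R_step = band_limit(rotation(n, delta), C)
    h = h0.copy()
    for t in range(1, steps + 1):
        if reset_every and t % reset_every == 0:
            h = h0 @ rotation(n, t * delta)   # reset: exact but costly
        else:
            h = h @ R_step                    # band-limited update (dense here for brevity)
    return h

n, delta, steps, C = 8, 0.01, 230, 1
h0 = np.random.default_rng(0).standard_normal(2 * n + 1)
exact = h0 @ rotation(n, steps * delta)
drift_no_reset = np.linalg.norm(track(h0, n, delta, steps, C) - exact)
drift_with_reset = np.linalg.norm(track(h0, n, delta, steps, C, reset_every=50) - exact)
print(drift_no_reset, drift_with_reset)
```

In this sketch the costly exact operation runs only once every `reset_every` frames, so its cost is spread over the inexpensive band-limited updates performed in between.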

[0271] As described above, in a case where the reset is appropriately performed, the audio processing device 11 is configured as illustrated in FIG. 14. Note that in FIG. 14, parts corresponding to those in the case of FIG. 11 are denoted by the same reference numerals, and the description thereof will be omitted as appropriate.

……
……
……

Article link: https://patent.nweon.com/23256
