Sony Patent | Signal processing device and method, and program

Editor: 映维 | Category: Sony | May 14, 2021

Patent: Signal processing device and method, and program

Publication Number: 20210144504

Publication Date: 20210513

Applicant: Sony

Abstract

The present technology relates to a signal processing device and method, and a program, that make it possible to reproduce sound more efficiently. A signal processing device includes: a rotation operation unit that rotates a head-related transfer function in a spherical harmonic domain by an operation on the basis of a rotation matrix corresponding to rotation of a head of a listener, the operation in which an order of the rotation matrix is limited; and a synthesis unit that synthesizes the head-related transfer function after rotation obtained by the operation and a sound signal of the spherical harmonic domain to generate a headphone drive signal. The present technology is applicable to an audio processor.

Claims

1. A signal processing device comprising: a rotation operation unit that rotates a head-related transfer function in a spherical harmonic domain by an operation on a basis of a rotation matrix corresponding to rotation of a head of a listener, the operation in which an order of the rotation matrix is limited; and a synthesis unit that synthesizes the head-related transfer function after rotation obtained by the operation and a sound signal of the spherical harmonic domain to generate a headphone drive signal.
2. The signal processing device according to claim 1, wherein, for a rotation operation of the head-related transfer function in at least one rotation direction, the rotation operation unit performs the rotation operation at a predetermined time with use of an operation result of the rotation operation in the rotation direction at another time before the predetermined time to determine the head-related transfer function after the rotation at the predetermined time.
3. The signal processing device according to claim 2, wherein the rotation operation unit performs the rotation operation in the rotation direction at the predetermined time on a basis of a rotation matrix corresponding to a difference between a rotation angle in the rotation direction of the head of the listener at the predetermined time and a rotation angle in the rotation direction of the head of the listener at the other time, and the operation result of the rotation operation in the rotation direction at the other time.
4. The signal processing device according to claim 3, wherein the rotation operation unit performs the rotation operation only on an element having the order within a predetermined range as the operation in which the order is limited.
5. The signal processing device according to claim 3, wherein, for an elevation angle direction as the rotation direction, the rotation operation unit performs the rotation operation in the rotation direction at the predetermined time with use of the operation result of the rotation operation in the rotation direction at the other time.
6. The signal processing device according to claim 3, wherein in a case where resetting of the rotation matrix is not performed, the rotation operation unit performs the rotation operation in the rotation direction at the predetermined time with use of the operation result of the rotation operation in the rotation direction at the other time, and in a case where the resetting of the rotation matrix is performed, the rotation operation unit performs the rotation operation in the rotation direction at the predetermined time on a basis of a rotation matrix corresponding to a rotation angle in the rotation direction of the head of the listener at the predetermined time and the head-related transfer function.
7. The signal processing device according to claim 6, wherein the resetting is performed for each degree, each order, or each time frequency.
8. The signal processing device according to claim 6, wherein, in a case where the headphone drive signal is generated for each of a plurality of the listeners, the resetting is performed for each of the listeners.
9. The signal processing device according to claim 6, wherein, in a case where the resetting is performed, the rotation operation unit performs the rotation operation with use of a rotation matrix determined in advance as the rotation matrix corresponding to the rotation angle in the rotation direction of the head of the listener at the predetermined time.
10. The signal processing device according to claim 1, wherein, in a case where a rotation matrix, which is included in the rotation matrix corresponding to the rotation of the head, for performing rotation to a predetermined rotation direction is represented by a sum of a plurality of matrices, the rotation operation unit performs, as the operation in which the order is limited, an operation of rotating the head-related transfer function with use of a sum of some of the plurality of the matrices as the rotation matrix for performing the rotation to the predetermined rotation direction.
11. A signal processing method comprising steps of: rotating a head-related transfer function in a spherical harmonic domain by an operation on a basis of a rotation matrix corresponding to rotation of a head of a listener, the operation in which an order of the rotation matrix is limited; and synthesizing the head-related transfer function after rotation obtained by the operation and a sound signal of the spherical harmonic domain to generate a headphone drive signal.
12. A program causing a computer to execute processing, the processing comprising steps of: rotating a head-related transfer function in a spherical harmonic domain by an operation on a basis of a rotation matrix corresponding to rotation of a head of a listener, the operation in which an order of the rotation matrix is limited; and synthesizing the head-related transfer function after rotation obtained by the operation and a sound signal of the spherical harmonic domain to generate a headphone drive signal.

Description

TECHNICAL FIELD

[0001] The present technology relates to signal processing device and method, and a program, and specifically to signal processing device and method, and a program that make it possible to reproduce sound more efficiently.

BACKGROUND ART

[0002] In recent years, development and spread of systems that record, transmit, and reproduce spatial information from an entire environment have been progressing in the field of sound. For example, in Super Hi-Vision, broadcasting is being planned using three-dimensional 22.2 multichannel sound.

[0003] Further, in the field of virtual reality, systems that also reproduce signals surrounding the entire environment for sound in addition to an image surrounding the entire environment are becoming popular.

[0004] Among them, there is a technique of representing three-dimensional audio information, which is flexibly adaptable to any recording/reproducing system. The technique is called ambisonics, and has been attracting attention. In particular, second or higher order ambisonics is called higher order ambisonics (HOA) (see NPTL 1, for example).

[0005] In three-dimensional multichannel sound, sound information spreads along a spatial axis in addition to a time axis, and in ambisonics, information is held by performing frequency transformation, that is, spherical harmonic function transformation, with respect to the angular directions of three-dimensional polar coordinates. The spherical harmonic function transformation may be regarded as the spatial counterpart of the time-frequency transformation of an audio signal with respect to a time axis.

[0006] Advantages of this method include ability to encode and decode information from any microphone array to any speaker array without limiting the number of microphones or the number of speakers.

[0007] In contrast, impediments to spread of ambisonics include need for a speaker array including a large number of speakers in a reproduction environment, and a narrow range (sweet spot) where it is possible to reproduce sound space.

[0008] For example, a speaker array including more speakers is necessary to increase spatial resolution of sound, but it is impractical to build such a system at home or the like. In addition, in a space such as a movie theater, a region where it is possible to reproduce sound space is narrow, and it is difficult to give desired effects to an entire audience.

CITATION LIST

Non-Patent Literature

[0009] NPTL 1: Jerome Daniel, Rozenn Nicol, Sebastien Moreau, “Further Investigations of High Order Ambisonics and Wavefield Synthesis for Holophonic Sound Imaging,” AES 114th Convention, Amsterdam, Netherlands, 2003.

SUMMARY OF THE INVENTION

Problems to be Solved by the Invention

[0010] It is therefore conceivable to combine ambisonics and binaural reproduction technology. The binaural reproduction technology is generally called virtual auditory display (VAD), and is implemented using a head-related transfer function (HRTF).

[0011] Herein, the head-related transfer function expresses information regarding how sound is transmitted from every direction surrounding a human head to binaural eardrums as a function of frequency and arrival direction.

[0012] In a case where a synthesis obtained by synthesizing a target sound and the head-related transfer function from a certain direction is presented with headphones, a listener perceives the sound as if the sound comes from the direction of the head-related transfer function used, not from the headphones. The VAD is a system that utilizes such a principle.

[0013] In a case where a plurality of virtual speakers are reproduced by using the VAD, it is possible to achieve, by presentation with the headphones, the same effects as those of ambisonics in a speaker array including a plurality of speakers, which is difficult in reality.

[0014] However, such a system is not able to reproduce sound sufficiently efficiently. For example, in a case where ambisonics and binaural reproduction technology are combined, not only does the amount of operations such as the convolution operation of the head-related transfer function increase, but the amount of memory used for the operations and the like also increases.

[0015] The present technology has been made in light of such a situation, and makes it possible to reproduce sound more efficiently.

Means for Solving the Problems

[0016] A signal processing device according to one aspect of the present technology includes: a rotation operation unit that rotates a head-related transfer function in a spherical harmonic domain by an operation on the basis of a rotation matrix corresponding to rotation of a head of a listener, the operation in which an order of the rotation matrix is limited; and a synthesis unit that synthesizes the head-related transfer function after rotation obtained by the operation and a sound signal of the spherical harmonic domain to generate a headphone drive signal.

[0017] A signal processing method or a program according to one aspect of the present technology includes steps of: rotating a head-related transfer function in a spherical harmonic domain by an operation on the basis of a rotation matrix corresponding to rotation of a head of a listener, the operation in which an order of the rotation matrix is limited; and synthesizing the head-related transfer function after rotation obtained by the operation and a sound signal of the spherical harmonic domain to generate a headphone drive signal.

[0018] In one aspect of the technology, the head-related transfer function in the spherical harmonic domain is rotated by the operation in which the order of the rotation matrix is limited on the basis of the rotation matrix corresponding to the rotation of the head of the listener, and the head-related transfer function after the rotation obtained by the operation and the sound signal of the spherical harmonic domain are synthesized to generate the headphone drive signal.

Effects of the Invention

[0019] According to one aspect of the present technology, it is possible to reproduce sound more efficiently.

[0020] It is to be noted that effects of the present technology are not necessarily limited to the effects described here, and may be any of the effects described in the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] FIG. 1 is a diagram describing simulation of stereophony using a head-related transfer function.

[0022] FIG. 2 is a diagram describing calculation of a drive signal in a first technique.

[0023] FIG. 3 is a diagram describing calculation of a drive signal in a case where head tracking is performed.

[0024] FIG. 4 is a diagram describing calculation of a drive signal in a second technique.

[0025] FIG. 5 is a diagram describing calculation of a drive signal in a third technique.

[0026] FIG. 6 is a diagram describing an operation amount and a necessary memory amount.

[0027] FIG. 7 is a diagram describing calculation of a drive signal in a fourth technique.

[0028] FIG. 8 is a diagram describing a rotation matrix.

[0029] FIG. 9 is a diagram describing the rotation matrix.

[0030] FIG. 10 is a diagram describing the rotation matrix.

[0031] FIG. 11 is a diagram illustrating a configuration example of an audio processor.

[0032] FIG. 12 is a diagram describing a difference in an elevation angle direction.

[0033] FIG. 13 is a flow chart describing drive signal generation processing.

[0034] FIG. 14 is a diagram illustrating a configuration example of an audio processor.

[0035] FIG. 15 is a flow chart describing drive signal generation processing.

[0036] FIG. 16 is a diagram illustrating a configuration example of a control system.

[0037] FIG. 17 is a diagram describing resetting and an operation amount.

[0038] FIG. 18 is a diagram describing resetting for each degree.

[0039] FIG. 19 is a diagram describing resetting for each time frequency.

[0040] FIG. 20 is a diagram illustrating a configuration example of a control system.

[0041] FIG. 21 is a diagram illustrating a configuration example of a computer.

MODES FOR CARRYING OUT THE INVENTION

[0042] Some embodiments to which the present technology is applied are described below in detail with reference to the drawings.

First Embodiment

[0043] The present technology achieves a reproduction system that is more efficient in an operation amount and a memory usage amount by determining a head-related transfer function in a spherical harmonic domain corresponding to rotation of a head with use of accumulation of minute rotations and synthesizing, in the spherical harmonic domain, the head-related transfer function and an input signal of sound to be reproduced.

[0044] For example, spherical harmonic function transformation on a function f(.theta., .PHI.) on spherical coordinates is expressed by the following expression (1).

[Math. 1]

$$F_n^m = \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi)\, \overline{Y_n^m(\theta, \phi)}\, \sin\theta \, d\theta \, d\phi \tag{1}$$

[0045] In the expression (1), .theta. and .PHI. respectively represent an elevation angle and a horizontal angle in the spherical coordinates, and Y.sub.n.sup.m(.theta., .PHI.) represents a spherical harmonic function. In addition, the spherical harmonic function Y.sub.n.sup.m(.theta., .PHI.) with “-” at a top thereof represents a complex conjugate of the spherical harmonic function Y.sub.n.sup.m(.theta., .PHI.).

[0046] Herein, the spherical harmonic function Y.sub.n.sup.m(.theta., .PHI.) is expressed by the following expression (2).

[Math. 2]

$$Y_n^m(\theta, \phi) = (-1)^m \sqrt{\frac{(2n+1)\,(n-m)!}{4\pi\,(n+m)!}}\; P_n^m(\cos\theta)\, e^{i m \phi} \tag{2}$$

[0047] In the expression (2), n and m represent a degree and an order of the spherical harmonic function Y.sub.n.sup.m(.theta., .PHI.), where -n.ltoreq.m.ltoreq.n. The order m is also referred to as order or period, and hereinafter, in a case where it is not necessary to particularly distinguish n and m, the degree n and the order m are collectively referred to as degrees.

[0048] In addition, in the expression (2), i represents the imaginary unit, and P.sub.n.sup.m(x) represents an associated Legendre function.

[0049] The associated Legendre function P.sub.n.sup.m(x) is expressed by the following expression (3) or (4) in a case where n.gtoreq.0 and 0.ltoreq.m.ltoreq.n. It is to be noted that the expression (3) is in a case where m=0.

[Math. 3]

$$P_n^0(x) = \frac{1}{2^n\, n!}\, \frac{d^n}{dx^n} (x^2 - 1)^n \tag{3}$$

[Math. 4]

$$P_n^m(x) = (1 - x^2)^{m/2}\, \frac{d^m}{dx^m} P_n^0(x) \tag{4}$$

[0050] In addition, in a case where -n.ltoreq.m.ltoreq.0, the associated Legendre function P.sub.n.sup.m(x) is expressed by the following expression (5).

[Math. 5]

$$P_n^m(x) = (-1)^{-m}\, \frac{(n+m)!}{(n-m)!}\, P_n^{-m}(x) \tag{5}$$
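The expressions (2) to (5) can be evaluated numerically as in the following minimal sketch. It is an illustration and not part of the patent text: the function names `assoc_legendre` and `sph_harm` are hypothetical, and the expression (4) is assumed to denote the usual m-th derivative of P.sub.n.sup.0(x).

```python
import math
import numpy as np
from numpy.polynomial import legendre as npleg


def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x) for -n <= m <= n."""
    x = np.asarray(x, dtype=float)
    if m < 0:
        # Expression (5): negative orders are scaled copies of positive ones.
        k = -m
        scale = (-1.0) ** k * math.factorial(n - k) / math.factorial(n + k)
        return scale * assoc_legendre(n, k, x)
    poly = npleg.Legendre.basis(n)           # P_n^0, equivalent to expression (3)
    if m > 0:
        poly = poly.deriv(m)                 # m-th derivative, expression (4)
    return np.maximum(1.0 - x ** 2, 0.0) ** (m / 2.0) * poly(x)


def sph_harm(n, m, theta, phi):
    """Spherical harmonic Y_n^m(theta, phi) following expression (2).

    theta is the elevation (polar) angle and phi the horizontal angle.
    """
    norm = math.sqrt((2 * n + 1) * math.factorial(n - m)
                     / (4.0 * math.pi * math.factorial(n + m)))
    return ((-1.0) ** m * norm
            * assoc_legendre(n, m, np.cos(theta))
            * np.exp(1j * m * np.asarray(phi)))
```

For example, `sph_harm(0, 0, theta, phi)` is the constant 1/sqrt(4.pi.) for any angles, which is a quick sanity check on the normalization.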

[0051] Further, inverse transformation from a function F.sub.n.sup.m obtained by the spherical harmonic function transformation into the function f(.theta., .PHI.) on the spherical coordinates is as expressed in the following expression (6).

[Math. 6]

$$f(\theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} F_n^m\, Y_n^m(\theta, \phi) \tag{6}$$
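The relationship between the expression (1) and the expression (6) can be checked numerically. The sketch below is illustrative only and rests on several assumptions: the series is truncated at degree n = 1, the four spherical harmonics are written in closed form, the integral is approximated by Gauss-Legendre quadrature in cos .theta. and a uniform grid in .PHI., and all function names are hypothetical.

```python
import numpy as np


def sph_harm_deg1(theta, phi):
    """Y_n^m for (n, m) = (0,0), (1,-1), (1,0), (1,1), stacked on axis 0."""
    s, c = np.sin(theta), np.cos(theta)
    e = np.exp(1j * phi)
    return np.stack([
        np.full(theta.shape, np.sqrt(1.0 / (4.0 * np.pi)), dtype=complex),
        np.sqrt(3.0 / (8.0 * np.pi)) * s / e,
        np.sqrt(3.0 / (4.0 * np.pi)) * c.astype(complex),
        -np.sqrt(3.0 / (8.0 * np.pi)) * s * e,
    ])


def quadrature_grid(n_theta=8, n_phi=16):
    """Angles and weights; the sin(theta) factor is absorbed by x = cos(theta)."""
    x, wx = np.polynomial.legendre.leggauss(n_theta)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    theta_grid, phi_grid = np.meshgrid(np.arccos(x), phi, indexing="ij")
    weights = wx[:, None] * np.full((1, n_phi), 2.0 * np.pi / n_phi)
    return theta_grid, phi_grid, weights


def forward_transform(f, Y, weights):
    """Expression (1) by quadrature: F_n^m = sum of w * f * conj(Y_n^m)."""
    return np.sum(weights * f * np.conj(Y), axis=(1, 2))


def inverse_transform(F, Y):
    """Expression (6), truncated at n = 1."""
    return np.tensordot(F, Y, axes=1)
```

Synthesizing a function from chosen coefficients with `inverse_transform` and passing it through `forward_transform` should return the same coefficients up to rounding error, because the truncated function is band-limited and the quadrature is exact for it.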

[0052] From the above, transformation from an input signal D’.sub.n.sup.m(.omega.) of sound after correction in a radial direction, which is held in the spherical harmonic domain, into a speaker drive signal S(x.sub.i, .omega.) of each of L number of speakers arranged on a spherical surface having a radius R is as expressed in the following expression (7).

[Math. 7]

$$S(x_i, \omega) = \sum_{n=0}^{N} \sum_{m=-n}^{n} {D'}_n^m(\omega)\, Y_n^m(\beta_i, \alpha_i) \tag{7}$$

[0053] It is to be noted that in the expression (7), x.sub.i represents a position of the speaker, and .omega. represents a time frequency of a sound signal. The input signal D’.sub.n.sup.m(.omega.) is a sound signal corresponding to each degree n and each order m of the spherical harmonic function for a predetermined time frequency .omega..

[0054] Further, x.sub.i=(R sin .beta..sub.i cos .alpha..sub.i, R sin .beta..sub.i sin .alpha..sub.i, R cos .beta..sub.i), and i represents a speaker index that specifies the speaker. Herein, i=1, 2, … , L, and .beta..sub.i and .alpha..sub.i respectively represent an elevation angle and a horizontal angle that indicate a position of the i-th speaker.

[0055] Such transformation expressed by the expression (7) is spherical harmonic inverse transformation corresponding to the expression (6). In addition, in a case of determining the speaker drive signal S(x.sub.i, .omega.) by the expression (7), it is necessary for the L number of speakers and a degree N of the spherical harmonic function, that is, a maximum value N of the degree n to satisfy a relationship expressed by the following expression (8). The L number of speakers is the number of reproducing speakers.

[Math. 8]

$$L > (N+1)^2 \tag{8}$$
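The quantity (N+1).sup.2 in the expression (8) is the number of (n, m) pairs with 0.ltoreq.n.ltoreq.N and -n.ltoreq.m.ltoreq.n. The following sketch enumerates these pairs and evaluates the condition; the function names are hypothetical and the ordering of the pairs is an assumption chosen for illustration.

```python
import math


def sh_indices(N):
    """All (n, m) pairs up to maximum degree N, in ascending n then m."""
    return [(n, m) for n in range(N + 1) for m in range(-n, n + 1)]


def min_speakers(N):
    """Smallest integer L satisfying expression (8), L > (N + 1)^2."""
    return (N + 1) ** 2 + 1


def max_degree(L):
    """Largest N for which L speakers satisfy expression (8); -1 if none."""
    return math.isqrt(L - 1) - 1
```

With L = 8 speakers, `max_degree(8)` gives N = 1, since 8 > 4 holds but 8 > 9 does not.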

[0056] Incidentally, a general technique of simulating stereophony at ears by presentation with headphones is, for example, a method using the head-related transfer function as illustrated in FIG. 1.

[0057] In an example illustrated in FIG. 1, an inputted ambisonics signal is decoded to generate a speaker drive signal of each of virtual speakers SP11-1 to SP11-8, which are a plurality of virtual speakers. The signal decoded at this time corresponds to, for example, the input signal D’.sub.n.sup.m(.omega.) described above.

[0058] Herein, the virtual speakers SP11-1 to SP11-8 are virtually arranged in a ring, and the speaker drive signal of each of the virtual speakers is determined by the calculation of the expression (7) described above. It is to be noted that the virtual speakers are also simply referred to as virtual speakers SP11 hereinafter in a case where it is not necessary to particularly distinguish the virtual speakers SP11-1 to SP11-8.

[0059] In a case where the speaker drive signals of the respective virtual speakers SP11 are thus obtained, for each of the virtual speakers SP11, left and right drive signals (binaural signals) of headphones HD11 that actually reproduce sound are generated by a convolution operation using the head-related transfer function. Then, the sum of the respective drive signals of the headphones HD11 obtained for each of the virtual speakers SP11 is a final drive signal.

[0060] It is to be noted that such a technique is described in detail in, for example, “ADVANCED SYSTEM OPTIONS FOR BINAURAL RENDERING OF AMBISONIC FORMAT” (Gerald Enzner et al., ICASSP 2013) and the like.

[0061] The head-related transfer function H(x, .omega.) used to generate the left and right drive signals of the headphones HD11 is obtained by normalizing a transfer characteristic H.sub.1(x, .omega.) from a sound source position x in a state in which a head of a user, who is a listener, exists in free space to positions of eardrums of the user by a transfer characteristic H.sub.0(x, .omega.) from the sound source position x in a state in which the head does not exist to a head center O. That is, the head-related transfer function H(x, .omega.) for the sound source position x is obtained by the following expression (9).

[Math. 9]

$$H(x, \omega) = \frac{H_1(x, \omega)}{H_0(x, \omega)} \tag{9}$$

[0062] Herein, the head-related transfer function H(x, .omega.) is convolved with an arbitrary audio signal, and a thus-obtained result is presented with headphones or the like, which makes it possible to give, to the listener, an illusion as if sound comes from a direction of the convolved head-related transfer function H(x, .omega.), that is, a direction of the sound source position x.
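Convolution with a head-related transfer function in the time domain corresponds to a per-bin product in the time-frequency domain, which is why the later expressions multiply S(x.sub.i, .omega.) by H(x.sub.i, .omega.). The sketch below illustrates this equivalence; the function name `convolve_fft` and the notion of a time-domain impulse response array `hrir` are assumptions introduced for the example.

```python
import numpy as np


def convolve_fft(signal, hrir):
    """Linear convolution computed as a product of spectra.

    signal: 1-D array of audio samples
    hrir:   1-D array, hypothetical time-domain counterpart of H(x, omega)
    """
    n = len(signal) + len(hrir) - 1          # length of the linear convolution
    spectrum = np.fft.rfft(signal, n) * np.fft.rfft(hrir, n)
    return np.fft.irfft(spectrum, n)
```

Zero-padding both inputs to the full output length avoids circular wrap-around, so the result matches `np.convolve(signal, hrir)` up to rounding error.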

[0063] In the example illustrated in FIG. 1, the left and right drive signals of the headphones HD11 are generated with use of such a principle.

[0064] Specifically, the position of each of the virtual speakers SP11 is set as the position x.sub.i, and the speaker drive signals of these virtual speakers SP11 are set as S(x.sub.i, .omega.).

[0065] In addition, the number of the virtual speakers SP11 is set as L (herein, L=8), and the final left and right drive signals of the headphones HD11 are respectively set as P.sub.l and P.sub.r.

[0066] In this case, when the speaker drive signals S(x.sub.i, .omega.) are simulated by presentation with the headphones HD11, it is possible to determine the left and right drive signals P.sub.l and P.sub.r of the headphones HD11 by calculation of the following expression (10).

[Math. 10]

$$P_l = \sum_{i=1}^{L} S(x_i, \omega)\, H_l(x_i, \omega), \qquad P_r = \sum_{i=1}^{L} S(x_i, \omega)\, H_r(x_i, \omega) \tag{10}$$

[0067] It is to be noted that, in the expression (10), H.sub.l(x.sub.i, .omega.) and H.sub.r(x.sub.i, .omega.) represent normalized head-related transfer functions from the position x.sub.i of the virtual speaker SP11 to left and right eardrum positions of the listener, respectively.

[0068] Such an operation makes it possible to finally reproduce the input signal D’.sub.n.sup.m(.omega.) of the spherical harmonic domain by presentation with the headphones. That is, it is possible to achieve the same effects as those of ambisonics by presentation with the headphones.

[0069] It is to be noted that, hereinafter, in a case where it is not necessary to particularly distinguish the drive signal P.sub.l and the drive signal P.sub.r for the time frequency .omega., the drive signal P.sub.l and the drive signal P.sub.r are also simply referred to as drive signals P(.omega.). In addition, in a case where it is not necessary to particularly distinguish the head-related transfer function H.sub.l(x.sub.i, .omega.) and the head-related transfer function H.sub.r(x.sub.i, .omega.), the head-related transfer function H.sub.l(x.sub.i, .omega.) and the head-related transfer function H.sub.r(x.sub.i, .omega.) are also simply referred to as head-related transfer functions H(x.sub.i, .omega.).

[0070] Further, hereinafter, the technique of combining ambisonics and binaural reproduction technology described above is also referred to as first technique.

[0071] In the first technique, for example, an operation illustrated in FIG. 2 is performed to obtain the drive signal P(.omega.) of 1.times.1, that is, one row and one column.

[0072] In FIG. 2, H(.omega.) represents a vector (matrix) of 1.times.L including the L number of head-related transfer functions H(x.sub.i, .omega.). In addition, D’(.omega.) represents a vector including the input signal D’.sub.n.sup.m(.omega.), and the vector D’(.omega.) becomes K.times.1, where the number of input signals D’.sub.n.sup.m(.omega.) of bins of the same time frequency .omega. is K. Further, Y(x) represents a matrix including the spherical harmonic function Y.sub.n.sup.m(.beta..sub.i, .alpha..sub.i) of each degree, and the matrix Y(x) becomes a matrix of L.times.K.

[0073] Accordingly, in the first technique, a matrix (vector) S obtained from a matrix operation of the matrix Y(x) of L.times.K and the vector D’(.omega.) of K.times.1 is determined, and a matrix operation of the matrix S and the vector (matrix) H(.omega.) of 1.times.L is further performed to obtain one drive signal P(.omega.).
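The two matrix operations of the first technique can be sketched as follows for a single ear and a single time-frequency bin. The function name `first_technique` is hypothetical, and any data passed to it in an experiment would be placeholder values rather than measured head-related transfer functions.

```python
import numpy as np


def first_technique(H, Y, D):
    """Drive signal P(omega) for one ear and one time-frequency bin.

    H: (1, L) head-related transfer functions of the L virtual speakers
    Y: (L, K) spherical harmonics evaluated at the speaker positions
    D: (K, 1) input signal D'(omega) in the spherical harmonic domain
    """
    S = Y @ D        # (L, 1) speaker drive signals, expression (7)
    return H @ S     # (1, 1) drive signal, expression (10)
```

The intermediate vector S has one entry per virtual speaker, so this formulation needs L.times.K multiplications for S and a further L multiplications for P.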

[0074] In addition, in a case where the head of the listener wearing the headphones HD11 rotates in a predetermined direction expressed by a rotation matrix g.sub.j (hereinafter also referred to as direction g.sub.j), for example, the drive signal P.sub.l(g.sub.j, .omega.) of a left headphone of the headphones HD11 is as expressed in the following expression (11).

[Math. 11]

$$P_l(g_j, \omega) = \sum_{i=1}^{L} S(x_i, \omega)\, H_l(g_j^{-1} x_i, \omega) \tag{11}$$

[0075] It is to be noted that the rotation matrix g.sub.j is a three-dimensional, i.e., 3.times.3 rotation matrix represented by .PHI., .theta., and .psi. that are rotational angles of Euler angles. In addition, in the expression (11), the drive signal P.sub.l(g.sub.j, .omega.) represents the drive signal P.sub.l described above, and is written as the drive signal P.sub.l(g.sub.j, .omega.) herein to clarify the position, that is, the direction g.sub.j and the time frequency .omega..

[0076] In this case, the rotation direction of the head of the listener, that is, the direction g.sub.j of the head of the listener may be obtained by some sensor, and left and right drive signals of the headphones HD11 may be calculated using the head-related transfer function of a relative direction g.sub.j.sup.-1x.sub.i of each of the virtual speakers SP11 viewed from the head of the listener from among a plurality of head-related transfer functions. Thus, even in a case where sound is reproduced by the headphones HD11, it is possible to fix a sound image position viewed from the listener in space similarly to a case where real speakers are used.
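The lookup of the relative direction g.sub.j.sup.-1x.sub.i can be sketched as below. This is a simplified illustration under stated assumptions: the head rotates only about the vertical axis, the virtual speakers lie on a horizontal ring, head-related transfer functions are assumed to be available for exactly those ring directions, and all function names are hypothetical.

```python
import numpy as np


def yaw_matrix(angle):
    """3x3 rotation about the vertical (z) axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def ring_positions(L):
    """Unit vectors of L virtual speakers equally spaced on a horizontal ring."""
    az = 2.0 * np.pi * np.arange(L) / L
    return np.stack([np.cos(az), np.sin(az), np.zeros(L)], axis=1)


def nearest_measured(g, x):
    """For every speaker i, index of the ring direction closest to g^{-1} x_i."""
    rel = x @ g                       # row i equals g^T x_i; g^{-1} = g^T for rotations
    return np.argmax(rel @ x.T, axis=1)


def drive_signal(S, H_table, idx):
    """Expression (11): sum over i of S(x_i) * H(g^{-1} x_i)."""
    return np.sum(S * H_table[idx])
```

With eight speakers and a head rotation of 45 degrees, `nearest_measured` maps each speaker to the transfer function of its neighbor on the ring, so the sound image stays fixed in space while the head turns.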

[0077] In addition, the convolution of the head-related transfer function, which the first technique performs in the time-frequency domain, may instead be performed in the spherical harmonic domain. Doing so makes it possible to reduce the operation amount and the necessary memory amount as compared with the first technique, and to reproduce sound more efficiently. Such a technique of convolving the head-related transfer function in the spherical harmonic domain is also referred to as second technique, and the second technique is described below.

[0078] For example, in a case where attention is focused on the left headphone, the vector P.sub.l(.omega.) including each of the drive signals P.sub.l(g.sub.j, .omega.) of the left headphone for all rotation directions of the head of the user, who is a listener, is expressed by the following expression (12).

[Math. 12]

$$\begin{aligned} P_l(\omega) &= H(\omega)\, S(\omega) \\ &= H(\omega)\, Y(x)\, D'(\omega) \end{aligned} \tag{12}$$

[0079] It is to be noted that, in the expression (12), S(.omega.) is a vector including the speaker drive signal S(x.sub.i, .omega.), and S(.omega.)=Y(x)D’(.omega.). In addition, in the expression (12), Y(x) represents a matrix including each degree and the spherical harmonic function Y.sub.n.sup.m(x.sub.i) of the position x.sub.i of each of the virtual speakers expressed by the following expression (13). Herein, i=1, 2, … , L, and a maximum value (maximum degree) of the degree n is N.

[0080] D’(.omega.) represents a vector (matrix) including the input signal D’.sub.n.sup.m(.omega.) of sound corresponding to each degree, which is expressed by the following expression (14). Each input signal D’.sub.n.sup.m(.omega.) is a sound signal of the spherical harmonic domain.

[0081] Further, in the expression (12), H(.omega.) represents a matrix, as expressed by the following expression (15), including the head-related transfer function H(g.sub.j.sup.-1x.sub.i, .omega.) of the relative direction g.sub.j.sup.-1x.sub.i of each of the virtual speakers viewed from the head of the listener in a case where the direction of the head of the listener is the direction g.sub.j. In this example, the head-related transfer function H(g.sub.j.sup.-1x.sub.i, .omega.) of each of the virtual speakers is prepared for each of the total M number of directions g.sub.1 to g.sub.M.

[Math. 13]

$$Y(x) = \begin{pmatrix} Y_0^0(x_1) & \cdots & Y_N^N(x_1) \\ \vdots & \ddots & \vdots \\ Y_0^0(x_L) & \cdots & Y_N^N(x_L) \end{pmatrix} \tag{13}$$

[Math. 14]

$$D'(\omega) = \begin{pmatrix} {D'}_0^0(\omega) \\ \vdots \\ {D'}_N^N(\omega) \end{pmatrix} \tag{14}$$

[Math. 15]

$$H(\omega) = \begin{pmatrix} H(g_1^{-1} x_1, \omega) & \cdots & H(g_1^{-1} x_L, \omega) \\ \vdots & \ddots & \vdots \\ H(g_M^{-1} x_1, \omega) & \cdots & H(g_M^{-1} x_L, \omega) \end{pmatrix} \tag{15}$$

[0082] In calculating the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone in a case where the head of the listener is directed in the direction g.sub.j, it is sufficient if a row corresponding to the direction g.sub.j, which is the direction of the head of the listener, that is, a row including the head-related transfer function H(g.sub.j.sup.-1x.sub.i, .omega.) for the direction g.sub.j is selected from the matrix H(.omega.) of the head-related transfer functions to perform calculation of the expression (12).

[0083] In this case, only a necessary row is calculated as illustrated in FIG. 3, for example.

[0084] In this example, the head-related transfer function is prepared for each of the M number of directions; therefore, matrix calculation expressed by the expression (12) is as indicated by an arrow A11.

[0085] That is, in a case where the number of input signals D’.sub.n.sup.m(.omega.) of the time frequency .omega. is K, the vector D’(.omega.) is K.times.1, that is, a matrix of K rows and one column. In addition, the matrix Y(x) of the spherical harmonic function is L.times.K, and the matrix H(.omega.) is M.times.L. Accordingly, in the calculation of the expression (12), the vector P.sub.l(.omega.) is M.times.1.

[0086] Herein, a matrix operation (product-sum operation) of the matrix Y(x) and the vector D’(.omega.) is first performed in an online operation to determine the vector S(.omega.), which makes it possible to select a row corresponding to the direction g.sub.j of the head of the listener in the matrix H(.omega.) as indicated by the arrow A12 and reduce the operation amount at the time of calculation of the drive signal P.sub.l(g.sub.j, .omega.). In FIG. 3, a hatched portion in the matrix H(.omega.) represents the row corresponding to the direction g.sub.j, and an operation of this row and the vector S(.omega.) is performed to calculate the desired drive signal P.sub.l(g.sub.j, .omega.) of the left headphone.
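The saving obtained by selecting a single row of H(.omega.) can be sketched as follows. The function names are hypothetical, the vectors are stored as one-dimensional arrays for brevity, and the multiplication count is a simple tally of scalar products rather than a measured cost.

```python
import numpy as np


def drive_all_directions(H, Y, D):
    """Expression (12) for all M head directions: returns a length-M vector."""
    return H @ (Y @ D)


def drive_one_direction(H, Y, D, j):
    """Only the row of H for the current head direction g_j is used."""
    S = Y @ D            # length-L speaker drive signals, computed once
    return H[j] @ S      # scalar drive signal P_l(g_j, omega)


def multiplications(M, L, K, select_row):
    """Scalar multiplications per time-frequency bin."""
    rows = 1 if select_row else M
    return L * K + rows * L
```

For M = 6, L = 8 and K = 4, the tally drops from 80 to 40 multiplications per bin, while `drive_one_direction` returns the same value as entry j of `drive_all_directions`.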

[0087] Herein, the matrix H’(.omega.) is defined as expressed by the following expression (16), which makes it possible to express, by the following expression (17), the vector P.sub.l(.omega.) expressed by the expression (12).

[Math. 16]

H’(.omega.)=H(.omega.)Y(x) (16)

[Math. 17]

P.sub.l(.omega.)=H’(.omega.)D’(.omega.) (17)

[0088] In the expression (16), the head-related transfer function, more specifically, the matrix H(.omega.) including the head-related transfer function in the time-frequency domain, is transformed by the spherical harmonic function transformation using the spherical harmonic function into the matrix H’(.omega.) including the head-related transfer function in the spherical harmonic domain.

[0089] Accordingly, in calculation of the expression (17), convolution of the speaker drive signal and the head-related transfer function is performed in the spherical harmonic domain. In other words, in the spherical harmonic domain, the product-sum operation of the head-related transfer function and the input signal is performed. It is to be noted that it is possible to calculate and hold the matrix H’(.omega.) in advance.

[0090] In this case, in calculating the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone in a case where the head of the listener is directed in the direction g.sub.j, it is sufficient if only the row corresponding to the direction g.sub.j of the head of the listener is selected from the matrix H’(.omega.) held in advance to calculate the expression (17).

[0091] In such a case, calculation of the expression (17) is calculation expressed by the following expression (18). This makes it possible to greatly reduce the operation amount and the necessary memory amount.

[Math. 18]

$$P_l(g_j, \omega) = \sum_{n=0}^{N} \sum_{m=-n}^{n} H'^{\,m}_{n}(g_j, \omega)\, D'^{\,m}_{n}(\omega) \quad (18)$$

[0092] In the expression (18), H’.sub.n.sup.m(g.sub.j, .omega.) is one element of the matrix H’(.omega.), that is, a head-related transfer function in the spherical harmonic domain, which is a component (element) corresponding to the direction g.sub.j of the head in the matrix H’(.omega.). In the head-related transfer function H’.sub.n.sup.m(g.sub.j, .omega.), n and m represent the degree n and the order m of the spherical harmonic function.
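As a minimal sketch of the second technique, the following Python code precomputes H'(ω) = H(ω)Y(x) offline for a single time frequency bin and then evaluates the expression (18) online for one head direction. The matrix sizes, the toy values (which are not real spherical harmonics or measured head-related transfer functions), and the helper names `matmul` and `drive_signal` are assumptions introduced only for illustration.

```python
def matmul(a, b):
    """Plain nested-list matrix product (a is p x q, b is q x r)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def drive_signal(h_prime, j, d_prime):
    """Expression (18): product-sum of row j of H'(omega) and D'(omega)."""
    return sum(h * d for h, d in zip(h_prime[j], d_prime))


# Toy sizes (assumed): M = 2 head directions, L = 3 speakers, K = 4 coefficients.
H = [[1 + 1j, 0.5, 0.25j],
     [0.5j, 1.0, 0.5]]             # M x L, one time frequency bin
Y = [[1.0, 0.0, 0.5, 0.0],
     [1.0, 0.5, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.5]]         # L x K, toy values only
D = [1.0, 2.0, 0.0, 1j]            # K x 1 input signal D'(omega)

H_prime = matmul(H, Y)             # offline step, expression (16)
p_left = drive_signal(H_prime, 1, D)   # online step, row of the second direction
```

Only K product-sum operations per time frequency bin remain online, because the row selection replaces the full M x L and L x K products.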

[0093] In such an operation expressed by the expression (18), the operation amount is reduced as illustrated in FIG. 4. That is, calculation expressed by the expression (12) is calculation to determine a product of the matrix H(.omega.) of M.times.L, the matrix Y(x) of L.times.K, and the vector D’(.omega.) of K.times.1 as indicated by an arrow A21 in FIG. 4.

[0094] Herein, H(.omega.)Y(x) is the matrix H’(.omega.) as defined in the expression (16); therefore, the calculation indicated by the arrow A21 eventually becomes as indicated by an arrow A22. In particular, it is possible to perform calculation for determining the matrix H’(.omega.) offline, that is, in advance; therefore, determining and holding the matrix H’(.omega.) in advance makes it possible to reduce the operation amount for determining the drive signals of the headphones online by that amount.

[0095] In a case where the matrix H’(.omega.) is thus determined in advance, the calculation indicated by the arrow A22, that is, the calculation of the expression (18) described above is performed to actually determine the drive signals of the headphones.

[0096] That is, as indicated by the arrow A22, the row corresponding to the direction g.sub.j of the head of the listener in the matrix H’(.omega.) is selected, and the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone is calculated by a matrix operation of that selected row and the vector D’(.omega.) including the inputted input signal D’.sub.n.sup.m(.omega.). In FIG. 4, a hatched portion in the matrix H’(.omega.) represents the row corresponding to the direction g.sub.j, and an element included in this row is the head-related transfer function H’.sub.n.sup.m(g.sub.j, .omega.) expressed by the expression (18).

[0097] Incidentally, in the second technique described above, while it is possible to greatly reduce the operation amount and the necessary memory amount, it is necessary to hold all the rotation directions of the head of the listener, that is, the rows corresponding to the respective directions g.sub.j in a memory as the matrix H’(.omega.) of the head-related transfer functions.

[0098] Accordingly, a matrix (vector) including the head-related transfer function of the spherical harmonic domain for one direction g.sub.j may be set as H.sub.s(.omega.)=H’(g.sub.j), and only the matrix H.sub.s(.omega.) including the row corresponding to the one direction g.sub.j of the matrix H’(.omega.) may be held, and a rotation matrix R’(g.sub.j) for performing rotation corresponding to head rotation of the listener in the spherical harmonic domain may be held for each of the plurality of directions g.sub.j. Hereinafter, such a technique is referred to as the third technique.

[0099] The rotation matrix R’(g.sub.j) of each of the directions g.sub.j is different from the matrix H’(.omega.) and has no time frequency dependence. This makes it possible to greatly reduce the memory amount as compared with making the matrix H’(.omega.) hold the component of the direction g.sub.j of rotation of the head.

[0100] First, a product H’(g.sub.j.sup.-1, .omega.) of a row H(g.sub.j.sup.-1x, .omega.) corresponding to a predetermined direction g.sub.j of the matrix H(.omega.) and the matrix Y(x) of the spherical harmonic function is considered as expressed by the following expression (19).

[Math. 19]

H’(g.sub.j.sup.-1,.omega.)=H(g.sub.j.sup.-1x,.omega.)Y(x) (19)

[0101] In the second technique described above, coordinates of the head-related transfer function used are rotated from x to g.sub.j.sup.-1x for the direction g.sub.j of the rotation of the head of the listener. However, the same result is obtainable by rotating coordinates of the spherical harmonic function from x to g.sub.jx without changing the coordinates of the position x of the head-related transfer function. That is, the following expression (20) is established.

[Math. 20]

H’(g.sub.j.sup.-1,.omega.)=H(g.sub.j.sup.-1x,.omega.)Y(x)=H(x,.omega.)Y(g.sub.jx) (20)

[0102] Further, the matrix Y(g.sub.jx) of the spherical harmonic function is the product of the matrix Y(x) and the rotation matrix R’(g.sub.j.sup.-1), and is as expressed by the following expression (21). It is to be noted that the rotation matrix R’(g.sub.j.sup.-1) is a matrix that rotates the coordinates by g.sub.j in the spherical harmonic domain.

[Math. 21]

Y(g.sub.jx)=Y(x)R’(g.sub.j.sup.-1) (21)

[0103] Herein, elements of the rotation matrix R’(g.sub.j) other than those in rows (n.sup.2+n+1+k) and columns (n.sup.2+n+1+m), where (n.sup.2+n+1+k) and (n.sup.2+n+1+m) belong to the set Q expressed by the following expression (22) for the same degree n, are zero.

[Math. 22]

Q={q|n.sup.2+1.ltoreq.q.ltoreq.(n+1).sup.2, q,n.di-elect cons.{0,1,2 … }} (22)
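The index bookkeeping behind the expression (22) can be sketched in Python as follows. The function names `sh_index`, `block_range`, and `may_be_nonzero` are hypothetical helpers, and the 1-based indexing follows the row and column numbers n²+n+1+m used in the text.

```python
def sh_index(n, m):
    """1-based row/column index n^2 + n + 1 + m for degree n and order m."""
    if abs(m) > n:
        raise ValueError("order m must satisfy -n <= m <= n")
    return n * n + n + 1 + m


def block_range(n):
    """Indices q with n^2 + 1 <= q <= (n + 1)^2, i.e. the set Q for degree n."""
    return range(n * n + 1, (n + 1) ** 2 + 1)


def may_be_nonzero(row, col, max_degree):
    """True if (row, col) lies inside one of the diagonal blocks."""
    return any(row in block_range(n) and col in block_range(n)
               for n in range(max_degree + 1))
```

For example, degree n = 1 occupies indices 2 to 4, so an element at row 2, column 4 may be non-zero, whereas an element at row 4, column 5 couples degrees 1 and 2 and is zero.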

[0104] Accordingly, it is possible to express the spherical harmonic function Y.sub.n.sup.m(g.sub.jx), which is an element of the matrix Y(g.sub.jx), by the following expression (23) using the element R’.sup.(n).sub.k,m(g.sub.j) in the (n.sup.2+n+1+k)-th row and the (n.sup.2+n+1+m)-th column of the rotation matrix R’(g.sub.j).

[Math. 23]

$$Y_n^m(g_j x) = \sum_{k=-n}^{n} Y_n^k(x)\, R'^{(n)}_{k,m}(g_j^{-1}) \quad (23)$$

[0105] Herein, the element R’.sup.(n).sub.k,m(g.sub.j) is expressed by the following expression (24).

[Math. 24]

R’.sub.k,m.sup.(n)(g.sub.j)=e.sup.-im.PHI.r.sub.k,m.sup.(n)(.theta.)e.sup.-ik.psi. (24)

[0106] In the expression (24), i represents the imaginary unit, .theta., .PHI., and .psi. represent the Euler angles of the rotation, and r.sup.(n).sub.k,m(.theta.) is expressed by the following expression (25).

[Math. 25]

$$r^{(n)}_{k,m}(\theta) = \sqrt{\frac{(n+k)!\,(n-k)!}{(n+m)!\,(n-m)!}} \sum_{\sigma} \binom{n+m}{n-k-\sigma} \binom{n-m}{\sigma} (-1)^{n-k-\sigma} \left(\cos\frac{\theta}{2}\right)^{2\sigma+k+m} \left(\sin\frac{\theta}{2}\right)^{2n-2\sigma-k-m} \quad (25)$$
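The expressions (24) and (25) can be evaluated directly. The following is a minimal Python sketch, assuming that the sum over σ runs over exactly those values for which both binomial coefficients are non-zero and that the leading factor of the expression (25) is a square root of the factorial ratio; the function names `small_r` and `rotation_element` are hypothetical.

```python
import cmath
import math


def small_r(n, k, m, theta):
    """r^(n)_{k,m}(theta) of expression (25)."""
    scale = math.sqrt(math.factorial(n + k) * math.factorial(n - k)
                      / (math.factorial(n + m) * math.factorial(n - m)))
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    total = 0.0
    # Only sigma values for which both binomial coefficients are non-zero.
    for sigma in range(max(0, -k - m), min(n - m, n - k) + 1):
        total += (math.comb(n + m, n - k - sigma) * math.comb(n - m, sigma)
                  * (-1) ** (n - k - sigma)
                  * c ** (2 * sigma + k + m)
                  * s ** (2 * n - 2 * sigma - k - m))
    return scale * total


def rotation_element(n, k, m, phi, theta, psi):
    """R'^(n)_{k,m}(g) of expression (24) for Euler angles (phi, theta, psi)."""
    return (cmath.exp(-1j * m * phi) * small_r(n, k, m, theta)
            * cmath.exp(-1j * k * psi))
```

For θ = 0 the function `small_r` reduces to the identity (1 for k = m, 0 otherwise), and for n = 1, k = m = 0 it gives cos θ, which is a quick consistency check on the reconstruction.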

[0107] From the above, a binaural reproducing signal reflecting the rotation of the head of the listener, for example, the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone, is obtained by calculating the following expression (26) using the rotation matrix R’(g.sub.j.sup.-1). In addition, in a case where the left and right head-related transfer functions may be regarded as symmetric, inversion using a matrix R.sub.ref that horizontally flips either the vector D’(.omega.) of the input signal or the row vector H.sub.s(.omega.) of the left head-related transfer function is performed as pre-processing of the expression (26), which makes it possible to obtain a right headphone drive signal only by holding the row vector H.sub.s(.omega.) of the left head-related transfer function. Note that a case where different left and right head-related transfer functions are necessary is basically described below.

[Math. 26]

$$P_l(g_j, \omega) = H(g_j^{-1}x, \omega)\, Y(x)\, D'(\omega) = H(x, \omega)\, Y(x)\, R'(g_j^{-1})\, D'(\omega) = H_s(\omega)\, R'(g_j^{-1})\, D'(\omega) \quad (26)$$

[0108] In the expression (26), the drive signal P.sub.l(g.sub.j, .omega.) is determined by synthesizing the row vector H.sub.s(.omega.), the rotation matrix R’(g.sub.j.sup.-1), and the vector D’(.omega.).

[0109] The calculation as described above is, for example, calculation illustrated in FIG. 5. That is, the vector P.sub.l(.omega.) including the drive signal P.sub.l(g.sub.j, .omega.) of the left headphone is obtained by the product of the matrix H(.omega.) of M.times.L, the matrix Y(x) of L.times.K, and the vector D’(.omega.) of K.times.1, as indicated by an arrow A41 in FIG. 5. This matrix operation is as expressed by the expression (12) described above.

[0110] This operation is represented by using the matrix Y(g.sub.jx) of the spherical harmonic function prepared for each of M number of directions g.sub.j, as indicated by an arrow A42. That is, the vector P.sub.l(.omega.) including the drive signal P.sub.l(g.sub.j, .omega.) corresponding to each of the M number of directions g.sub.j is obtained by the product of the predetermined row H(x, .omega.) of the matrix H(.omega.), the matrix Y(g.sub.jx), and the vector D’(.omega.) from a relationship expressed by the expression (20).

[0111] Herein, the row H(x, .omega.), which is a vector, is 1.times.L, the matrix Y(g.sub.jx) is L.times.K, and the vector D’(.omega.) is K.times.1. This is further transformed by using relationships expressed by the expressions (17) and (21), which is as indicated by an arrow A43. That is, as expressed by the expression (26), the vector P.sub.l(.omega.) is obtained by the product of the row vector H.sub.s(.omega.) of 1.times.K, the rotation matrix R’(g.sub.j.sup.-1) of K.times.K of each of the M number of directions g.sub.j, and the vector D’(.omega.) of K.times.1.

[0112] It is to be noted that, in FIG. 5, hatched portions of the rotation matrix R’(g.sub.j.sup.-1) represent non-zero elements of the rotation matrix R’(g.sub.j.sup.-1).

[0113] In addition, FIG. 6 illustrates the operation amount and the necessary memory amount in such a third technique.

[0114] That is, it is assumed that, as illustrated in FIG. 6, the row vector H.sub.s(.omega.) of 1.times.K is prepared for each time frequency bin .omega., the rotation matrix R’(g.sub.j.sup.-1) of K.times.K is prepared for the M number of directions g.sub.j, and the vector D’(.omega.) is K.times.1. In addition, it is assumed that the number of time frequency bins .omega. is W, and the maximum value of the degree n of the spherical harmonic function, that is, the maximum degree is J.

[0115] At this time, the number of non-zero elements of the rotation matrix R’(g.sub.j) is (J+1)(2J+1)(2J+3)/3; therefore, the total calc/W of the number of product-sum operations per time frequency bin .omega. in the third technique is as expressed by the following expression (27).

[Math. 27]

$$\mathrm{calc}/W = \frac{(J+1)(2J+1)(2J+3)}{3} + 2K \quad (27)$$

[0116] In addition, in the operation by the third technique, it is necessary to hold the row vector H.sub.s(.omega.) of 1.times.K for each time frequency bin .omega. for left and right ears, and further, it is necessary to hold non-zero elements of the rotation matrix R’(g.sub.j.sup.-1) for each of the M number of directions. Accordingly, a memory amount “memory” necessary for the operation by the third technique is as expressed by the following expression (28).

[Math. 28]

$$\mathrm{memory} = M \times \frac{(J+1)(2J+1)(2J+3)}{3} + 2 \times K \times W \quad (28)$$

[0117] In the third technique, holding only the non-zero elements of the rotation matrix R’(g.sub.j.sup.-1) makes it possible to greatly reduce the necessary memory amount as compared with the second technique.
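The counts in the expressions (27) and (28) can be checked numerically. In the following Python sketch, the non-zero element count is obtained by summing one (2n+1) x (2n+1) block per degree n, and K = (J+1)² is assumed from the block structure of the K x K rotation matrix; the function names are hypothetical.

```python
def nonzero_elements(max_degree):
    """Non-zero elements of R'(g): one (2n+1) x (2n+1) block per degree n."""
    return sum((2 * n + 1) ** 2 for n in range(max_degree + 1))


def calc_per_bin(max_degree):
    """Expression (27): product-sum operations per time frequency bin."""
    j = max_degree
    k = (j + 1) ** 2  # assumption: K = (J + 1)^2
    return (j + 1) * (2 * j + 1) * (2 * j + 3) // 3 + 2 * k


def memory_third(max_degree, num_directions, num_bins):
    """Expression (28): memory amount of the third technique."""
    j = max_degree
    k = (j + 1) ** 2  # assumption: K = (J + 1)^2
    return (num_directions * (j + 1) * (2 * j + 1) * (2 * j + 3) // 3
            + 2 * k * num_bins)
```

The block-wise sum agrees with the closed form (J+1)(2J+1)(2J+3)/3; for J = 1, for instance, both give 1 + 9 = 10.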

[0118] It is to be noted that, in the third technique, it is necessary to hold the rotation matrices R’(g.sub.j.sup.-1) for rotation of three axes of the head of the listener, that is, for an arbitrary M number of directions g.sub.j. To hold such rotation matrices R’(g.sub.j.sup.-1), a certain memory amount is necessary, though the amount is less than that in a case of holding the matrix H’(.omega.) with time frequency dependence.

[0119] Accordingly, the rotation matrix R’(g.sub.j.sup.-1) for performing rotation about the head of the listener as a rotation center in the spherical harmonic domain may be sequentially determined at the time of an operation. Hereinafter, such a technique is also referred to as the fourth technique.

[0120] Herein, it is possible to express a rotation matrix R’(g) by the following expression (29). In addition, g in the expression (29) is a rotation matrix, and is represented by the product of a matrix u(.PHI.), a matrix a(.theta.), and a matrix u(.psi.) as expressed by the following expression (30).

[Math. 29]

R’(g)=R’(u(.PHI.)a(.theta.)u(.psi.))=R’(u(.PHI.))R’(a(.theta.))R’(u(.psi.)) (29)

[Math. 30]

g=u(.PHI.)a(.theta.)u(.psi.) (30)

[0121] It is to be noted that, in the expression (29), a(.theta.) and u(.PHI.) are rotation matrices that rotate coordinates by an angle .theta. and an angle .PHI. about a coordinate axis as a rotation axis of a coordinate system in which the position of the head of the listener is an origin point. In addition, u(.psi.) is a rotation matrix that is only different in the rotation angle from u(.PHI.) and rotates the coordinates by an angle .psi. about the same coordinate axis as the rotation axis. It is to be noted that rotation angles of the respective matrices u(.PHI.), a(.theta.), and u(.psi.), that is, the angle .PHI., the angle .theta., and the angle .psi. are Euler angles.

[0122] For example, it is assumed that there is an orthogonal coordinate system in which the position of the head of the listener is set as the origin point, and an x axis, a y axis, and a z axis orthogonal to each other are respective axes. Herein, in a state in which the listener is directed to front, a positive direction of the x axis is a direction of the front, and the z axis is an upward-downward direction viewed from the listener directed to the front, that is, an axis in a vertical direction. The angle .PHI., the angle .theta., and the angle .psi. are rotation angles to respective rotation directions relative to the state in which the listener is directed to the front, that is, to the positive direction of the x axis.

[0123] Specifically, the rotation angle of the head in a case where the head moves in the upward-downward direction about the y axis as the rotation axis while the listener is facing the front is the angle .theta. that is an elevation angle. Further, the rotation angle of the head in a case where the head moves in a horizontal direction viewed from the listener about the z axis as the rotation axis while the listener is directed to the front is the angle .PHI. that is a horizontal angle.

[0124] The matrix a(.theta.) is a rotation matrix that rotates the coordinates (coordinate system) by the angle .theta. about the y axis as the rotation axis, and the matrix u(.PHI.) is a rotation matrix that rotates the coordinates (coordinate system) by the angle .PHI. about the z axis as the rotation axis. Specifically, these matrices a(.theta.) and u(.PHI.) are as expressed by the following expressions (31) and (32), respectively.

[Math. 31]

$$\left\{ a(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix} \;\middle|\; \theta \in [0, 2\pi] \right\} \quad (31)$$

[Math. 32]

$$\left\{ u(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; \phi \in [0, 2\pi] \right\} \quad (32)$$

[0125] Accordingly, for example, the matrix a(.theta.) acts on an arbitrary position v=(v.sub.x, v.sub.y, v.sub.z).sup.T in the coordinate system with the position of the head of the listener as the origin point, which makes it possible to give rotation about the y axis as the rotation axis to the position v. A position v.sub.2 after the rotation of the position v is expressed by the following expression (33).

[0126] Similarly, the matrix u(.PHI.) acts on the position v, which makes it possible to give rotation about the z axis as the rotation axis to the position v. A position v.sub.3 after the rotation of the position v is expressed by the following expression (34).

[Math. 33]

v.sub.2=a(.theta.)v (33)

[Math. 34]

v.sub.3=u(.PHI.)v (34)
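The rotations of the expressions (31) to (34) can be sketched in Python as follows. The function names `rot_y`, `rot_z`, and `apply` are hypothetical; `rot_y` corresponds to a(θ) and `rot_z` to u(φ), with angles in radians.

```python
import math


def rot_y(theta):
    """a(theta) of expression (31): rotation about the y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def rot_z(phi):
    """u(phi) of expression (32): rotation about the z axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def apply(matrix, v):
    """Matrix-vector product, as in v2 = a(theta) v and v3 = u(phi) v."""
    return [sum(row[i] * v[i] for i in range(3)) for row in matrix]


v = [1.0, 0.0, 0.0]                      # position on the positive x axis
v2 = apply(rot_y(math.pi / 2), v)        # expression (33)
v3 = apply(rot_z(math.pi / 2), v)        # expression (34)
```

With these matrices, a quarter turn about the z axis moves the point on the positive x axis onto the positive y axis, and a quarter turn about the y axis moves it onto the negative z axis.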

[0127] Accordingly, the rotation matrix R’(g)=R’(u(.PHI.)a(.theta.)u(.psi.)) is a rotation matrix that, in the spherical harmonic domain, rotates the coordinate system by the angle .PHI. in a horizontal angle direction, then rotates, by the angle .theta. in an elevation angle direction viewed from that coordinate system, the coordinate system rotated by the angle .PHI., and further rotates, by the angle .psi. in the horizontal angle direction viewed from that coordinate system, the coordinate system rotated by the angle .theta..

[0128] In addition, R’(u(.PHI.)), R’(a(.theta.)), and R’(u(.psi.)) represent the rotation matrices R’(g) in a case where the coordinates are rotated by rotations by the matrix (u(.PHI.)), the matrix (a(.theta.)), and the matrix (u(.psi.)), respectively.

[0129] In other words, the rotation matrix R’(u(.PHI.)) is a rotation matrix that rotates the coordinates by the angle .PHI. in the horizontal angle direction in the spherical harmonic domain, and the rotation matrix R’(a(.theta.)) is a rotation matrix that rotates the coordinates by the angle .theta. in the elevation angle direction in the spherical harmonic domain. In addition, the rotation matrix R’(u(.psi.)) is a rotation matrix that rotates the coordinates by the angle .psi. in the horizontal angle direction in the spherical harmonic domain.

[0130] Thus, for example, as indicated by an arrow A51 in FIG. 7, it is possible to express the rotation matrix R’(g)=R’(u(.PHI.)a(.theta.)u(.psi.)), which rotates the coordinates three times by the angle .PHI., the angle .theta., and the angle .psi. as rotation angles, by the product of three rotation matrices R’(u(.PHI.)), R’(a(.theta.)), and R’(u(.psi.)).

[0131] In this case, it is sufficient if, as data for obtaining the rotation matrix R’(g.sub.j.sup.-1), the rotation matrix R’(u(.PHI.)), the rotation matrix R’(a(.theta.)), and the rotation matrix R’(u(.psi.)) for the respective values of the rotation angles .PHI., .theta., and .psi. are held in tables in the memory. In addition, in a case where the same head-related transfer function may be used for the left and the right, the row vector H.sub.s(.omega.) is held for only one ear, and the matrix R.sub.ref described above for horizontal inversion is also held in advance, which makes it possible to obtain the rotation matrix for the other ear by determining the product of the matrix R.sub.ref and a generated rotation matrix.

[0132] In addition, in a case where the vector P.sub.l(.omega.) is actually calculated, one rotation matrix R’(g.sub.j.sup.-1) is calculated by calculating the product of respective rotation matrices read out from tables. Then, as indicated by an arrow A52, the product of the matrix H.sub.s(.omega.) of 1.times.K, the rotation matrix R’(g.sub.j.sup.-1) of K.times.K common to all the time frequency bins .omega., and the vector D’(.omega.) of K.times.1 is calculated for each of the time frequency bins .omega. to determine the vector P.sub.l(.omega.).

[0133] Herein, for example, in a case where the rotation matrix R’(g.sub.j.sup.-1) itself of each rotation angle is held in the table, it is necessary to hold 360.sup.3=46656000 rotation matrices R’(g.sub.j.sup.-1), where accuracy of the angle .PHI., the angle .theta., and the angle .psi. of each rotation is one degree (1.degree.).

[0134] In contrast, in a case where the rotation matrix R’(u(.PHI.)), the rotation matrix R’(a(.theta.)), and the rotation matrix R’(u(.psi.)) of each rotation angle are held in tables, it is necessary to hold only 360.times.3=1080 rotation matrices, where accuracy of the angle .PHI., the angle .theta., and the angle .psi. of each rotation is one degree (1.degree.).

[0135] Accordingly, in a case where the rotation matrix R’(g.sub.j.sup.-1) itself is held, it is necessary to hold data of the order of O(n.sup.3). In contrast, in a case where the rotation matrix R’(u(.PHI.)), the rotation matrix R’(a(.theta.)), and the rotation matrix R’(u(.psi.)) are held, only data of the order of O(n) is sufficient, which makes it possible to greatly reduce the memory amount.

[0136] Moreover, as indicated by the arrow A51, the rotation matrix R’(u(.PHI.)) and the rotation matrix R’(u(.psi.)) are diagonal matrices; therefore, it is sufficient if only diagonal components are held.

[0137] In addition, the rotation matrix R’(u(.PHI.)) and the rotation matrix R’(u(.psi.)) are both rotation matrices for performing rotation in the horizontal angle direction, which makes it possible to obtain the rotation matrix R’(u(.PHI.)) and the rotation matrix R’(u(.psi.)) from the same common table. In other words, the table of the rotation matrix R’(u(.PHI.)) and the table of the rotation matrix R’(u(.psi.)) may be the same.

[0138] It is to be noted that, in FIG. 7, hatched portions of the respective rotation matrices represent non-zero elements.

[0139] Further, for k and m in a case where (n.sup.2+n+1+k) and (n.sup.2+n+1+m) belong to the set Q expressed by the expression (22) described above, elements other than elements in rows (n.sup.2+n+1+k) and columns (n.sup.2+n+1+m) of the rotation matrix R’(a(.theta.)) are zero; therefore, it is sufficient if only elements other than zero are held as the rotation matrix R’(a(.theta.)), which makes it possible to further reduce the memory amount.

[0140] From the above, it is possible to further reduce the memory amount necessary to hold data for obtaining the rotation matrix R’(g.sub.j.sup.-1).

[0141] Specifically, for example, in a case where .PHI. number of rotation matrices R’(u(.PHI.)), .THETA. number of rotation matrices R’(a(.theta.)), and .PSI. number of rotation matrices R’(u(.psi.)) are held, the number M of rotation directions g.sub.j of the head becomes M=.PHI..times..THETA..times..PSI..

[0142] In the fourth technique, the rotation matrices R’(a(.theta.)) are held by accuracy of the angle .theta., that is, the .THETA. number of rotation matrices R’(a(.theta.)) are held; therefore, the memory amount necessary to hold the rotation matrices R’(a(.theta.)) is memory(a)=.THETA..times.(J+1)(2J+1)(2J+3)/3.

[0143] In addition, for the rotation matrix R’(u(.PHI.)) and the rotation matrix R’(u(.psi.)), it is possible to use a common table, and in a case where the accuracy of the angle .PHI. and the angle .psi. are the same, it is sufficient if rotation matrices are held only by the accuracy of the angle .PHI., that is, the .PHI. number of rotation matrices are held, and it is sufficient if only the diagonal components of these rotation matrices are held. Accordingly, assuming that a length of the vector D’(.omega.) is K, the memory amount necessary to hold the rotation matrices R’(u(.PHI.)) and the rotation matrices R’(u(.psi.)) is memory(b)=.PHI..times.K.

[0144] Further, assuming that the number of time frequency bins .omega. is W, the memory amount necessary to hold the row vector H.sub.s(.omega.) of 1.times.K by the time frequency bins .omega. for the left and right ears is 2.times.K.times.W.

[0145] Accordingly, as the sum of these memory amounts, the memory amount necessary in the fourth technique is the memory amount memory=memory(a)+memory(b)+2KW.
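To make the comparison concrete, the memory amounts of the third and fourth techniques can be tabulated with a short Python sketch. K = (J+1)² is an assumption taken from the block structure of the rotation matrix, and the function names and the example values (J = 1, W = 4, one-degree accuracy) are hypothetical.

```python
def block_nonzeros(max_degree):
    """(J+1)(2J+1)(2J+3)/3 non-zero elements of a block diagonal rotation matrix."""
    j = max_degree
    return (j + 1) * (2 * j + 1) * (2 * j + 3) // 3


def memory_fourth(max_degree, n_theta, n_phi, num_bins):
    """memory = memory(a) + memory(b) + 2KW for the fourth technique."""
    k = (max_degree + 1) ** 2                  # assumption: K = (J + 1)^2
    memory_a = n_theta * block_nonzeros(max_degree)
    memory_b = n_phi * k                       # diagonal components, shared table
    return memory_a + memory_b + 2 * k * num_bins


def memory_third(max_degree, n_phi, n_theta, n_psi, num_bins):
    """Expression (28) with M = n_phi * n_theta * n_psi directions."""
    k = (max_degree + 1) ** 2                  # assumption: K = (J + 1)^2
    m = n_phi * n_theta * n_psi
    return m * block_nonzeros(max_degree) + 2 * k * num_bins


fourth = memory_fourth(1, 360, 360, 4)
third = memory_third(1, 360, 360, 360, 4)
```

Under these toy values the fourth technique holds 360 x 10 + 360 x 4 + 32 = 5072 values, whereas the third technique holds 46656000 x 10 + 32 values.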

[0146] Such a fourth technique makes it possible to greatly reduce the necessary memory amount while keeping the operation amount substantially the same as that in the third technique. Specifically, the fourth technique exerts more effects, for example, in a case where the accuracy of the angle .PHI., the angle .theta., and the angle .psi. is set to one degree (1.degree.) or the like to withstand practical use in realizing a head tracking function.

[0147] Incidentally, in the fourth technique, it is possible to reduce, to 1080, the number of rotation matrices to be held, for example, by having rotation with respect to three axes at every one degree, that is, by setting the accuracy of the angle .PHI., the angle .theta., and the angle .psi. to one degree (1.degree.).

[0148] However, in the fourth technique, in terms of the operation amount, it is possible to reduce the operation amount only to the order of the cube of the maximum degree J of the degree n of the spherical harmonic function.

[0149] The reason for this is that the rotation matrix R’(a(.theta.)) for tracking rotation of the head of the listener (user) is a block diagonal matrix as illustrated in FIG. 8, for example.

[0150] It is to be noted that, in FIG. 8, a horizontal axis represents components of a column of the rotation matrix R’(a(.theta.)), and a vertical axis represents components of a row of the rotation matrix R’(a(.theta.)). In addition, in FIG. 8, shades of gray at respective positions of the rotation matrix R’(a(.theta.)) indicate levels (dB) of elements corresponding to these positions of the rotation matrix R’(a(.theta.)).

[0151] FIG. 8 illustrates the rotation matrix R’(a(.theta.)) in a case where the rotation angle .theta. is one degree. In this example, in a case where attention is focused on elements having a value of -400 dB or more, for example, in the rotation matrix R’(a(.theta.)), a portion including elements having such a value is a block having a size of (2n+1).times.(2n+1) for the degree n. For example, a square portion indicated by an arrow A71 is a portion of one block of a block diagonal matrix, and a width (thickness) W11 of the block is 2n+1. That is, in the square portion indicated by the arrow A71, (2n+1) elements are arranged in a row direction, and (2n+1) elements are also arranged in a column direction.

[0152] Using the rotation matrix R’(a(.theta.)) that is such a block diagonal matrix makes it possible to reduce the operation amount to some extent, but if it is possible to further reduce the operation amount, it is possible to obtain the drive signal more quickly and efficiently.

[0153] Accordingly, the present technology focuses on characteristics of the rotation matrix for minute rotation, and performing tracking of rotation of the head of the listener (user) by accumulation of the minute rotations makes it possible to reduce the operation amount to the square order of the degree J.
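The idea of tracking head rotation by accumulating minute rotations relies on rotations about one axis composing additively. The following Python sketch illustrates this for one block of R'(a(θ)), under the same assumptions about the reading of the expression (25) as elsewhere (square root of the factorial ratio, σ limited to non-zero binomial terms); the degree n = 2, the one-degree step, and all function names are hypothetical choices for illustration, not the implementation of the present technology.

```python
import math


def small_r(n, k, m, theta):
    """r^(n)_{k,m}(theta) of expression (25)."""
    scale = math.sqrt(math.factorial(n + k) * math.factorial(n - k)
                      / (math.factorial(n + m) * math.factorial(n - m)))
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    total = 0.0
    for sigma in range(max(0, -k - m), min(n - m, n - k) + 1):
        total += (math.comb(n + m, n - k - sigma) * math.comb(n - m, sigma)
                  * (-1) ** (n - k - sigma)
                  * c ** (2 * sigma + k + m)
                  * s ** (2 * n - 2 * sigma - k - m))
    return scale * total


def block(n, theta):
    """(2n+1) x (2n+1) block of R'(a(theta)) for degree n."""
    orders = range(-n, n + 1)
    return [[small_r(n, k, m, theta) for m in orders] for k in orders]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


n = 2
step = block(n, math.radians(1.0))       # minute rotation of one degree
acc = block(n, 0.0)                      # identity (no rotation yet)
for _ in range(30):                      # accumulate thirty minute rotations
    acc = matmul(acc, step)
direct = block(n, math.radians(30.0))    # rotation by thirty degrees at once
max_err = max(abs(acc[i][j] - direct[i][j])
              for i in range(2 * n + 1) for j in range(2 * n + 1))
```

Here the full block is multiplied at every step; the saving in operation amount comes from the fact that most elements of the minute-rotation block are negligibly small.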

[0154] The technique of the present technology (hereinafter also referred to as proposed technique 1) is described in detail below.

[0155] Of rotation of three axes of the head of the listener, that is, the rotation matrix R’(u(.PHI.)), the rotation matrix R’(a(.theta.)), and the rotation matrix R’(u(.psi.)), only the rotation matrix R’(a(.theta.)) is a block diagonal matrix, and the other rotation matrices R’(u(.PHI.)) and R’(u(.psi.)) are fully diagonal matrices.

[0156] However, depending on how a rotation axis is selected, two or more rotation matrices may become block diagonal matrices in some cases. In an example of this specification, a rotation axis that causes two or more rotation matrices to become block diagonal matrices is not used, but the present technique is applicable to a case where two or more rotation matrices are block diagonal matrices.

[0157] It is assumed that the angle .theta. is 0 degrees in a case where the listener is directed to the direction of the front in the upward-downward direction (the vertical direction), that is, in the elevation angle direction.

[0158] The angle .theta. becomes one degree in a case where the listener moves his head from a state in which the angle .theta. is 0 degrees to an upward direction (to a positive direction of the z axis) by +1 degree, i.e., rotates his head about the y axis as the rotation axis to the positive direction of the z axis by +1 degree.

[0159] The rotation matrix R’(a(.theta.)) in such a case where the angle .theta. is one degree is as illustrated in FIG. 8 as described above.

[0160] In the example illustrated in FIG. 8, it can be seen that the rotation matrix R’(a(.theta.)) is a block diagonal matrix, and a portion of each block of the block diagonal matrix is a square including (2n+1) elements on one side for each degree n. At the same time, the rotation matrix R’(g) that is a synthesis of the rotation matrix R’(a(.theta.)), rotation matrix R’(u(.PHI.)), which is a diagonal matrix, and the rotation matrix R’(u(.psi.)), which is a diagonal matrix, is also a similar block diagonal matrix. Herein, the direction g.sub.j may be a discrete value or a continuous value; therefore, g.sub.j is hereinafter simply referred to as g.

[0161] Now, in a case where the head-related transfer function in the spherical harmonic domain is rotated for one block of the rotation matrix R’(g) that is a block diagonal matrix, that is, rotated by the angle of the direction g using the portion of the block of a certain degree n of the rotation matrix R’(g), the head-related transfer function H’.sub.n.sup.m(g.sup.-1) after the rotation becomes as expressed by the following expression (35).

[Math. 35]

$$H'^{\,m}_{n}(g^{-1}) = \sum_{k=-n}^{n} H'^{\,k}_{n}\, R'^{(n)}_{k,m}(g) \quad (35)$$

[0162] In the expression (35), k represents an order before the rotation, and m represents an order after the rotation. In addition, H’.sub.n.sup.k represents elements of the degree n and the order k in the row vector H.sub.s(.omega.).

[0163] It can be seen from such calculation of the expression (35) that all (2n+1) elements R’.sup.(n).sub.k,m(g) are used to determine the element of the order m after one rotation.
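The per-degree rotation of the expression (35) is a row vector multiplied by one block of the rotation matrix. A minimal Python sketch follows; the function names `rotate_degree` and `rotate_all`, the ordering of coefficients (by degree, then by order from -n to n), and the toy blocks are assumptions for illustration and are not real rotation matrices.

```python
def rotate_degree(h_n, block):
    """Expression (35) for one degree n.

    h_n[k + n] holds H'^k_n and block[k + n][m + n] holds R'^(n)_{k,m}(g).
    """
    size = len(h_n)
    return [sum(h_n[k] * block[k][m] for k in range(size))
            for m in range(size)]


def rotate_all(h_s, blocks):
    """Apply expression (35) block by block to the whole row vector H_s."""
    out, start = [], 0
    for block in blocks:
        size = len(block)
        out.extend(rotate_degree(h_s[start:start + size], block))
        start += size
    return out


# Toy data (assumed): degrees 0 and 1, with a permutation-like degree-1 block.
blocks = [[[1]],
          [[0, 1, 0],
           [0, 0, 1],
           [1, 0, 0]]]
rotated = rotate_all([5, 1, 2, 3], blocks)
```

Each output element of degree n consumes all (2n+1) elements of one column of the block, which is why the full rotation costs (2n+1)² product-sum operations per degree.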

[0164] However, in a case where the angle .theta. is minute, such as a case of the angle .theta.=one degree, most of the respective elements of the rotation matrix R’(a(.theta.)) that is a block diagonal matrix have a minute value. Accordingly, most of the elements R’.sup.(n).sub.k,m(g) of the rotation matrix R’(g) have a minute value.

[0165] For example, FIG. 9 illustrates the rotation matrix R’(a(.theta.)) in a case where the angle .theta. is one degree, which is the same rotation matrix as the rotation matrix R’(a(.theta.)) illustrated in FIG. 8.

[0166] In FIG. 9, a horizontal axis represents components of a column of the rotation matrix R’(a(.theta.)), and a vertical axis represents components of a row of the rotation matrix R’(a(.theta.)).

[0167] In addition, shades of gray at respective positions of the rotation matrix R’(a(.theta.)) indicate levels (dB) of elements corresponding to these positions of the rotation matrix R’(a(.theta.)).

[0168] However, in FIG. 8, a range of the level of each element of the rotation matrix R’(a(.theta.)) is from -400 dB to 0 dB, whereas, in FIG. 9, the range of the level of each element of the rotation matrix R’(a(.theta.)) is limited to a range from -100 dB to 0 dB.

[0169] As with an example illustrated in FIG. 9, in a case where an element having an effective value in the rotation matrix R’(a(.theta.)) is an element having a level of -100 dB to 0 dB, it can be seen that the element having the effective value exists only in the vicinity of diagonal components.

[0170] Further, it can be seen that the number of elements having the effective value in one focused row of the rotation matrix R’(a(.theta.)), that is, the number of elements having the effective value (hereinafter also referred to as effective element width) that are continuously disposed side by side in a lateral direction in FIG. 9 is almost the same in all degrees n.

[0171] Accordingly, the number of elements having an effective value in each degree n is only on the square order of J, which is nearly the maximum value of the degree n, even though the degree n increases.

[0172] Therefore, an element having a value within a range of a predetermined level, such as an element having a level of -100 dB to 0 dB of the rotation matrix R’(a(.theta.)), is set as an effective element, and only the effective element is used to perform an operation of rotating the head-related transfer function in the spherical harmonic domain, which makes it possible to reduce the operation amount. In other words, an element having a value within a range of a predetermined level of the rotation matrix R’(g) is set as an effective element, and only the effective element is used to perform the operation of rotating the head-related transfer function in the spherical harmonic domain, which makes it possible to reduce the operation amount. The effective element width of the rotation matrix R’(g) is the same as the effective element width of the rotation matrix R’(a(.theta.)).
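The selection of effective elements by level can be sketched as follows. This is only an illustrative sketch under stated assumptions: the level of an element is taken to be 20 log10 of its magnitude, which is not stated in this passage, and the function names level_db and effective_half_width as well as the toy block are hypothetical. The function returns the smallest half width C such that every element at or above the level floor lies within C positions of the diagonal, which corresponds to an effective element width of 2C+1.

```python
import math


def level_db(value):
    """Level of one matrix element in dB (assumed: 20*log10 of the magnitude)."""
    mag = abs(value)
    return -math.inf if mag == 0 else 20.0 * math.log10(mag)


def effective_half_width(block, floor_db=-100.0):
    """Smallest C such that all elements with level >= floor_db satisfy |k - m| <= C."""
    c = 0
    for k, row in enumerate(block):
        for m, value in enumerate(row):
            if level_db(value) >= floor_db:
                c = max(c, abs(k - m))
    return c


# Toy 5 x 5 block whose level drops by 60 dB per step away from the diagonal
# (made-up values, not the matrix of FIG. 8 or FIG. 9).
toy = [[10.0 ** (-3 * abs(k - m)) for m in range(5)] for k in range(5)]
print(effective_half_width(toy, floor_db=-100.0))
print(effective_half_width(toy, floor_db=-400.0))
```

Lowering the floor widens the band of effective elements, and raising it narrows the band, which is the trade-off between accuracy and operation amount.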

[0173] For example, in a case where the effective element width is 2C+1, calculation of the expression (35) described above is as expressed by the following expression (36).

[Math. 36]

$$H_{n}^{\prime\, m}(g^{-1}) \approx \sum_{k=\max(-n,\, m-C)}^{\min(n,\, m+C)} H_{n}^{\prime\, k}\, R_{k,m}^{\prime\,(n)}(g) \tag{36}$$

[0174] Note that, in the expression (36), min(a, b) represents a function that selects a smaller one of a and b. In the expression (36), max(a, b) represents a function that selects a larger one of a and b.

[0175] In the expression (35), (2n+1) elements R’.sup.(n).sub.k,m(g) of the order k ranging from -n to n are used for each degree n, but at most (2C+1) elements R’.sup.(n).sub.k,m(g) of the order k ranging from m-C to m+C, where m is set as a center, are used in calculation of the expression (36), thereby achieving a reduction in the operation amount. It is to be noted that, in a case where m+C is larger than n and in a case where m-C is smaller than -n, the operation is performed for k only up to n and only down to -n, respectively, so as not to exceed the matrix range. The operation in which the order k is limited is performed in such a manner, that is, the operation is performed only on elements in which the order k has a value within a range determined by C, which makes it possible to reduce the operation amount.
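The limited operation of the expression (36) can be sketched in Python as follows. This is a hedged sketch rather than the implementation of the present technology: the function name rotate_degree_banded, the list layout (coefficients indexed by k+n, matrix block indexed as [k+n][m+n]), and the toy values are assumptions made for illustration.

```python
def rotate_degree_banded(h_n, r_n, c):
    """Approximate rotation of one degree n as in expression (36).

    Only orders k with max(-n, m - c) <= k <= min(n, m + c) are summed.
    h_n[k + n] and r_n[k + n][m + n] follow an assumed list layout.
    """
    n = (len(h_n) - 1) // 2
    rotated = []
    for m in range(-n, n + 1):
        k_lo = max(-n, m - c)   # clip so as not to exceed the matrix range
        k_hi = min(n, m + c)
        acc = 0.0
        for k in range(k_lo, k_hi + 1):
            acc += h_n[k + n] * r_n[k + n][m + n]
        rotated.append(acc)
    return rotated


# Toy degree-2 block that is exactly tridiagonal (made-up values), so that
# c = 1 reproduces the full sum, while c = 0 keeps only the diagonal.
h2 = [1.0, 2.0, 3.0, 4.0, 5.0]
r2 = [[1.0 if k == m else 0.1 if abs(k - m) == 1 else 0.0
       for m in range(5)] for k in range(5)]
print(rotate_degree_banded(h2, r2, c=1))
print(rotate_degree_banded(h2, r2, c=0))
```

For an actual rotation matrix, the elements outside the band are small rather than exactly zero, so the result is an approximation whose error depends on the chosen C.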

[0176] In this case, the effective element width of 2C+1 is the same in all degrees n; therefore, it can be seen that the larger the degree J, the more advantageous the proposed technique 1 is in terms of the operation amount, as compared with the fourth technique described above.
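To make the dependence on the degree concrete, the following sketch counts the multiply-accumulate operations of one rotation when all elements of each block are used, as in the expression (35), and when the order k is limited, as in the expression (36). It is an illustration under assumptions: degrees are taken to run from 0 to a hypothetical max_degree, only the rotation sum itself is counted, and the function names are not from the present disclosure.

```python
def mac_count_full(max_degree):
    """Multiply-accumulates when every (2n+1) x (2n+1) block is used in full."""
    return sum((2 * n + 1) ** 2 for n in range(max_degree + 1))


def mac_count_banded(max_degree, c):
    """Multiply-accumulates when k is limited to max(-n, m-c)..min(n, m+c)."""
    total = 0
    for n in range(max_degree + 1):
        for m in range(-n, n + 1):
            total += min(n, m + c) - max(-n, m - c) + 1
    return total


# Compare the two counts for a few hypothetical maximum degrees with c = 2.
for max_degree in (4, 8, 16, 32):
    full = mac_count_full(max_degree)
    banded = mac_count_banded(max_degree, c=2)
    print(max_degree, full, banded, round(full / banded, 2))
```

Because the full count grows with the cube of the maximum degree while the limited count grows with its square for a fixed C, the ratio printed in the last column increases as the maximum degree increases.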

[0177] It is to be noted that, in the expression (36), a constant C determined from the effective element width is applied to all degrees n. However, C determining the effective element width of 2C+1 is not limited to a constant, and a function C(n) of the degree n (where C(n)<n) may be used as C, or a function C(n, k) of the degree n and the order k may be used as C. Herein, it is sufficient if the function C(n) or the function C(n, k) is a natural number smaller than the degree n. In other words, it is sufficient if the operation is performed with the number of elements even slightly smaller than that in the operation using the elements of an entire block of the rotation matrix R'(a(.theta.)), which is a block diagonal matrix, that is, the rotation matrix R'(g).

[0178] In addition, the element used in the operation of the rotation matrix R’(a(.theta.)) may be an element itself of the rotation matrix R’(a(.theta.)) or may be an approximate value of the element of the rotation matrix R’(a(.theta.)).

[0179] That is, more generally, it is assumed that it is possible to express the rotation matrix R’(a(.theta.)) as R’(a(.theta.))=A1+A2+A3+ … by combining a certain plurality of matrices. In this case, for an approximate rotation matrix Rs’(a(.theta.)) represented by the sum of some extracted ones of matrices included in the rotation matrix R’(a(.theta.)), an operation may be performed using a smaller number of elements than (2n+1).times.(2n+1) elements in each of n-th order blocks.

……
……
……

Link to this article: https://patent.nweon.com/18873
