Kernel Interpolation of Acoustic Transfer Functions with Adaptive Kernel For Directed and Residual Reverberations

Presentation slides for IEEE ICASSP 2023

NII S. Koyama's Lab

May 08, 2023

Transcript

  1. 1884 Kernel interpolation of acoustic transfer functions with adaptive kernel for directed and residual reverberations.

    Juliano G. C. Ribeiro (1), Shoichi Koyama (2), and Hiroshi Saruwatari (1). (1) Graduate School of Information Science and Technology, The University of Tokyo; (2) Digital Content and Media Sciences Research Division, National Institute of Informatics (NII).
  2. Background. Sound waves within a space behave unpredictably. Predicting sound propagation has several practical applications ⇒ acoustic transfer function (ATF).

    Our objective: region-to-region interpolation ⇒ continuous variation within constrained regions.
  3. Background ⇒ Previous region-to-region methods.

    [Samarasinghe+, 2015]: Embed physical properties using spherical wave functions. [Ribeiro+, 2020]: Embed physical properties using a reproducing kernel Hilbert space (RKHS). [Ribeiro+, 2022]: Extend the kernel function to include basic directionality.

    ⇒ The kernel method outperformed wave function expansion. Better performance both frequency-by-frequency and spatially. More broadly applicable: no need for truncation orders, agnostic to the geometry of the array. Can be generalized with plane wave expansion weighting.
  4. Problem statement. Problem: source/receiver pair $\mathbf{r}|\mathbf{s}$ within regions in a space $\Omega \subset \mathbb{R}^3$. Distribute $L$ loudspeakers in the source region $\Omega_S \subset \Omega$ and $M$ microphones in the receiver region $\Omega_R \subset \Omega$. Interpolate the ATF between regions from the $N = LM$ measurements.

    [Figure: source region and receiver region within the space.]
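  5. Problem statement. Basic properties of the ATF.

    Superposition of components: $h(\mathbf{r}|\mathbf{s}, k) = h_D(\mathbf{r}|\mathbf{s}, k) + h_R(\mathbf{r}|\mathbf{s}, k)$.

    Direct component $h_D$: known, equivalent to a point source in the free field.

    $$h_D(\mathbf{r}|\mathbf{s}, k) = G_0(\mathbf{r}|\mathbf{s}, k) = \frac{e^{ik\|\mathbf{r}-\mathbf{s}\|}}{4\pi\|\mathbf{r}-\mathbf{s}\|}$$

    Reverberant component $h_R$: unknown, but satisfies the Helmholtz equation in the position variables.

    $$(\nabla_{\mathbf{r}}^2 + k^2)\,h_R(\mathbf{r}|\mathbf{s}, k) = (\nabla_{\mathbf{s}}^2 + k^2)\,h_R(\mathbf{r}|\mathbf{s}, k) = 0$$

    Reciprocity: $h(\mathbf{r}|\mathbf{s}, k) = h(\mathbf{s}|\mathbf{r}, k)$ for all $\mathbf{r}|\mathbf{s}$.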
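As a minimal sketch of the direct component on this slide, the free-field Green's function can be evaluated and subtracted from a measured ATF value to isolate the reverberant part. The function names, the speed of sound, and the 500 Hz frequency are illustrative assumptions, not values from the slides.

```python
import numpy as np

def free_field_green(r, s, k):
    """G0(r|s, k) = exp(ik ||r - s||) / (4 pi ||r - s||)."""
    r = np.asarray(r, dtype=float)
    s = np.asarray(s, dtype=float)
    dist = np.linalg.norm(r - s, axis=-1)
    return np.exp(1j * k * dist) / (4.0 * np.pi * dist)

def reverberant_part(h, r, s, k):
    """Subtract the known direct component from an ATF value h(r|s, k)."""
    return h - free_field_green(r, s, k)

# Assumed values for illustration only: c = 343 m/s, f = 500 Hz.
c = 343.0
k = 2.0 * np.pi * 500.0 / c
r = np.array([-0.65, -0.80, -0.48])
s = np.array([0.65, 0.80, 0.48])

g_rs = free_field_green(r, s, k)
g_sr = free_field_green(s, r, k)  # reciprocity: same value with r and s swapped
```

Because the direct part depends only on the distance between the two points, swapping source and receiver leaves it unchanged, which is consistent with the reciprocity property stated above.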
  6. Problem statement. Objective: obtain the interpolation function $\hat{h}_R$ from an optimization problem.

    $$\hat{h}_R = \operatorname*{arg\,min}_{g \in \mathcal{H}} \sum_{n=1}^{N} |y_n - g(\mathbf{q}_n)|^2 + \lambda \|g\|_{\mathcal{H}}^2$$

    The feature space $\mathcal{H}$ holds the properties of the reverberant field. If $\mathcal{H}$ is an RKHS, the optimization has a closed-form solution called kernel ridge regression. $\mathcal{H}$ must be defined as an RKHS with a flexible form ⇒ Herglotz wave function generalized model.
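The closed form mentioned on this slide can be sketched as follows: by the representer theorem the minimizer is a weighted sum of kernel functions centered at the measurement points, with coefficients obtained from one linear solve. The kernel used below is only a stand-in (a spherical Bessel kernel on 3-D points), not the ATF kernel of the talk, and all names and numbers are assumptions made for illustration.

```python
import numpy as np

def krr_fit(K, y, lam):
    """Solve (K + lam I) c = y, where K[m, n] = kappa(q_m, q_n)."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)

def krr_predict(K_new, coef):
    """Evaluate sum_n coef[n] * kappa(q, q_n); K_new[i, n] = kappa(q_i, q_n)."""
    return K_new @ coef

def standin_kernel(A, B, k):
    """Stand-in kernel j0(k ||a - b||); np.sinc(x / pi) equals sin(x) / x."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.sinc(k * d / np.pi)

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
Q = rng.uniform(-0.2, 0.2, size=(30, 3))
y = rng.standard_normal(30) + 1j * rng.standard_normal(30)
k, lam = 9.0, 1e-2

K = standin_kernel(Q, Q, k)
coef = krr_fit(K, y, lam)
y_fit = krr_predict(K, coef)  # fitted values at the measurement points
```

The regularization weight `lam` plays the role of $\lambda$ in the objective: larger values shrink the coefficients and smooth the interpolant.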
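  7. Generalized model. Express $h_R$ as a Herglotz wave function to define the RKHS $(\mathcal{H}, \langle\cdot,\cdot\rangle_{\mathcal{H}}, \kappa)$:

    $$h_R(\mathbf{r}|\mathbf{s}) = \mathcal{T}\left(\tilde{h}_R; \mathbf{r}|\mathbf{s}\right) := \int_{\mathbb{S}^2\times\mathbb{S}^2} e^{ik(\hat{\mathbf{r}}\cdot\mathbf{r} + \hat{\mathbf{s}}\cdot\mathbf{s})}\,\tilde{h}_R(\hat{\mathbf{r}},\hat{\mathbf{s}})\,\mathrm{d}\hat{\mathbf{r}}\,\mathrm{d}\hat{\mathbf{s}},$$

    $$\mathcal{H} = \left\{ h_R = \mathcal{T}\left(\tilde{h}_R; \mathbf{r}|\mathbf{s}\right) : \tilde{h}_R \in L^2(w, \mathbb{S}^2\times\mathbb{S}^2),\ \tilde{h}_R(\hat{\mathbf{r}},\hat{\mathbf{s}}) = \tilde{h}_R(\hat{\mathbf{s}},\hat{\mathbf{r}})\ \ \forall \hat{\mathbf{r}},\hat{\mathbf{s}} \in \mathbb{S}^2 \right\},$$

    $$\langle f, g\rangle_{\mathcal{H}} = \int_{\mathbb{S}^2\times\mathbb{S}^2} \frac{\overline{\tilde{f}(\hat{\mathbf{r}},\hat{\mathbf{s}})}\,\tilde{g}(\hat{\mathbf{r}},\hat{\mathbf{s}})}{w(\hat{\mathbf{r}},\hat{\mathbf{s}})}\,\mathrm{d}\hat{\mathbf{r}}\,\mathrm{d}\hat{\mathbf{s}}, \quad \forall f, g \in \mathcal{H},$$

    $$\kappa(\mathbf{r}|\mathbf{s}, \mathbf{r}'|\mathbf{s}') = \mathcal{T}\left( w(\hat{\mathbf{r}},\hat{\mathbf{s}})\,\frac{e^{-ik(\hat{\mathbf{r}}\cdot\mathbf{r}' + \hat{\mathbf{s}}\cdot\mathbf{s}')} + e^{-ik(\hat{\mathbf{r}}\cdot\mathbf{s}' + \hat{\mathbf{s}}\cdot\mathbf{r}')}}{2}; \mathbf{r}|\mathbf{s} \right).$$

    The weight function $w$ can be determined freely while respecting the physical properties of the ATF.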
  8. Previous models ⇒ How was the weight determined in the past?

    [Ribeiro+, 2020]: Uniform weight, $w \equiv 1$. Embedded physical properties. Too inflexible.

    [Ribeiro+, 2022]: Sunken sphere weight. More representative of the ATF model. Uniform gain on the sides isn't very realistic.

    $$w(\hat{\mathbf{r}},\hat{\mathbf{s}}) = \phi(\hat{\mathbf{r}})\,\phi(\hat{\mathbf{s}}), \qquad \phi(\hat{\mathbf{v}}) = \frac{1}{4\pi}\left(1 + \gamma^2 - \frac{\cosh(\zeta\,\hat{\mathbf{v}}\cdot\hat{\mathbf{v}}_0)}{\cosh(\zeta)}\right)$$
  9. Proposed model. Proposed adaptive weight $w$. We propose an adaptive kernel capable of learning the specific properties of the environment from the data.

    The weight function $w$ is divided into two components: $w = w_{\mathrm{dir}} + w_{\mathrm{res}}$. $w_{\mathrm{dir}}$ represents the directed components: plane wave components with high amplitude but sparse angular representation. $w_{\mathrm{res}}$ represents the residual components: plane wave components with lower amplitudes but dense angular representation.

    The adaptive kernel $\kappa$ is the superposition of the component kernels: $\kappa = \kappa_{\mathrm{dir}} + \kappa_{\mathrm{res}}$.
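  10. Proposed model. Directed weight $w_{\mathrm{dir}}$. Separable for simplicity: $w_{\mathrm{dir}}(\hat{\mathbf{r}},\hat{\mathbf{s}}) = \phi_{\mathrm{dir}}(\hat{\mathbf{r}})\,\phi_{\mathrm{dir}}(\hat{\mathbf{s}})$.

    The sound field weight $\phi_{\mathrm{dir}}$ is a convex combination of unimodal weighting functions derived from the von Mises–Fisher distribution.

    $$\phi_{\mathrm{dir}}(\hat{\mathbf{v}}) = \sum_{d=1}^{D} \alpha_d\,\frac{e^{\beta_d \hat{\mathbf{v}}\cdot\hat{\mathbf{v}}_d}}{4\pi C(\beta_d)}, \qquad \|\boldsymbol{\alpha}\|_1 = 1, \qquad C(\beta_d) = \begin{cases} \dfrac{\sinh(\beta_d)}{\beta_d}, & \beta_d \neq 0 \\ 1, & \beta_d = 0 \end{cases}$$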
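A short sketch of the directed weight on this slide: each mixture component is a von Mises–Fisher-shaped lobe normalized by $4\pi C(\beta_d)$, so the mixture should integrate to the sum of the $\alpha_d$ over the sphere. The quadrature rule (Gauss–Legendre in the polar cosine, uniform in azimuth) and the parameter values are assumptions chosen only to check that normalization numerically.

```python
import numpy as np

def vmf_norm(beta):
    """C(beta) = sinh(beta) / beta, with C(0) = 1."""
    beta = np.asarray(beta, dtype=float)
    safe = np.where(beta == 0.0, 1.0, beta)  # avoids 0/0 in the unused branch
    return np.where(beta == 0.0, 1.0, np.sinh(safe) / safe)

def phi_dir(v, alpha, beta, v_d):
    """Mixture weight. v: (..., 3) unit vectors; alpha, beta: (D,); v_d: (D, 3)."""
    dots = v @ v_d.T  # cosine between each direction and each mode, (..., D)
    lobes = np.exp(beta * dots) / (4.0 * np.pi * vmf_norm(beta))
    return np.sum(alpha * lobes, axis=-1)

def sphere_quadrature(n_polar=32, n_azim=64):
    """Quadrature points and weights on the unit sphere (weights sum to 4 pi)."""
    x, wx = np.polynomial.legendre.leggauss(n_polar)  # nodes in cos(theta)
    az = 2.0 * np.pi * np.arange(n_azim) / n_azim
    X, A = np.meshgrid(x, az, indexing="ij")
    sin_t = np.sqrt(1.0 - X**2)
    pts = np.stack([sin_t * np.cos(A), sin_t * np.sin(A), X], axis=-1)
    weights = np.repeat(wx, n_azim) * (2.0 * np.pi / n_azim)
    return pts.reshape(-1, 3), weights

# Hypothetical mixture parameters (D = 3), not taken from the slides.
alpha = np.array([0.5, 0.3, 0.2])
beta = np.array([0.0, 2.0, 5.0])
v_d = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.6, 0.8]])

pts, weights = sphere_quadrature()
total = np.sum(weights * phi_dir(pts, alpha, beta, v_d))  # close to 1
```

A component with $\beta_d = 0$ reduces to the uniform weight $1/4\pi$, while larger $\beta_d$ concentrates the lobe around $\hat{\mathbf{v}}_d$.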
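  11. Proposed model. Directed kernel $\kappa_{\mathrm{dir}}$. The kernel function $\kappa_{\mathrm{dir}}$ has a closed-form solution:

    $$\kappa_{\mathrm{dir}}(\mathbf{r}|\mathbf{s}, \mathbf{r}'|\mathbf{s}'; w_{\mathrm{dir}}) = \frac{1}{2}\left( \kappa_{\phi_{\mathrm{dir}}}(\mathbf{r}, \mathbf{r}'; \phi_{\mathrm{dir}})\,\kappa_{\phi_{\mathrm{dir}}}(\mathbf{s}, \mathbf{s}'; \phi_{\mathrm{dir}}) + \kappa_{\phi_{\mathrm{dir}}}(\mathbf{s}, \mathbf{r}'; \phi_{\mathrm{dir}})\,\kappa_{\phi_{\mathrm{dir}}}(\mathbf{r}, \mathbf{s}'; \phi_{\mathrm{dir}}) \right)$$

    $$\kappa_{\phi_{\mathrm{dir}}}(\mathbf{r}, \mathbf{r}'; \phi_{\mathrm{dir}}) := \sum_{d=1}^{D} \alpha_d\,\frac{j_0\!\left(\eta\left(k(\mathbf{r}-\mathbf{r}') - i\beta_d\hat{\mathbf{v}}_d\right)\right)}{C(\beta_d)}, \qquad \eta(\mathbf{z}) := \sqrt{\mathbf{z}^{\mathsf{T}}\mathbf{z}}, \quad \mathbf{z} \in \mathbb{C}^3.$$

    Strong directionality on a sparse set of directions!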
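The closed form on this slide can be sketched directly, taking $j_0(\eta) = \sin(\eta)/\eta$ with a complex argument and $\eta(\mathbf{z})$ as the unconjugated square root of $\mathbf{z}^{\mathsf{T}}\mathbf{z}$. Since $j_0$ is even, the branch of the complex square root does not matter. The function names, the small-argument series fallback, and the example parameters are assumptions for illustration.

```python
import numpy as np

def j0_complex(eta):
    """Spherical Bessel j0(eta) = sin(eta) / eta for complex eta."""
    eta = np.asarray(eta, dtype=complex)
    small = np.abs(eta) < 1e-4
    safe = np.where(small, 1.0, eta)
    # Use the series 1 - eta^2 / 6 near zero to avoid 0/0.
    return np.where(small, 1.0 - eta**2 / 6.0, np.sin(safe) / safe)

def eta_fn(z):
    """eta(z) = sqrt(z^T z), without complex conjugation."""
    return np.sqrt(np.sum(z * z, axis=-1))

def vmf_norm(beta):
    """C(beta) = sinh(beta) / beta, with C(0) = 1."""
    safe = np.where(beta == 0.0, 1.0, beta)
    return np.where(beta == 0.0, 1.0, np.sinh(safe) / safe)

def kappa_phi_dir(x, x_p, k, alpha, beta, v_d):
    """sum_d alpha_d * j0(eta(k (x - x') - i beta_d v_d)) / C(beta_d)."""
    z = k * (x - x_p)[None, :] - 1j * beta[:, None] * v_d  # (D, 3)
    return np.sum(alpha * j0_complex(eta_fn(z)) / vmf_norm(beta))

def kappa_dir(r, s, r_p, s_p, k, alpha, beta, v_d):
    """Symmetrized product of single-position kernels."""
    f = lambda a, b: kappa_phi_dir(a, b, k, alpha, beta, v_d)
    return 0.5 * (f(r, r_p) * f(s, s_p) + f(s, r_p) * f(r, s_p))

# Hypothetical parameters and positions, not taken from the slides.
alpha = np.array([0.5, 0.3, 0.2])
beta = np.array([0.0, 2.0, 5.0])
v_d = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.6, 0.8]])
k = 9.0
r, s = np.array([0.1, 0.0, 0.05]), np.array([0.6, 0.5, 0.3])
r_p, s_p = np.array([-0.05, 0.1, 0.0]), np.array([0.7, 0.45, 0.25])

value = kappa_dir(r, s, r_p, s_p, k, alpha, beta, v_d)
swapped = kappa_dir(s, r, r_p, s_p, k, alpha, beta, v_d)  # r and s exchanged
```

The symmetrized form makes the kernel invariant to exchanging source and receiver positions, so `value` and `swapped` agree.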
  12. Proposed model. Residual weight $w_{\mathrm{res}}$. Defined as a neural network, in order to freely learn patterns. Neural network parameters: $\boldsymbol{\theta}$. Simple architecture: 2 fully connected hidden layers with 20 neurons each and tanh activation. Capable of learning irregular shapes and patterns.
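A rough sketch of a network with the stated architecture (two fully connected hidden layers of 20 tanh units) is given below as a plain NumPy forward pass. The slide does not specify the input encoding or the output activation, so the 6-dimensional input (the two unit direction vectors concatenated) and the softplus output that keeps the weight positive are assumptions, as are the function names and the initialization.

```python
import numpy as np

def init_params(rng, n_in=6, n_hidden=20):
    """Random initial parameters theta for a 2-hidden-layer network."""
    sizes = [n_in, n_hidden, n_hidden, 1]
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def w_res(r_hat, s_hat, params):
    """Residual weight for direction pairs; r_hat, s_hat: (..., 3)."""
    x = np.concatenate([r_hat, s_hat], axis=-1)  # assumed input encoding
    (W1, b1), (W2, b2), (W3, b3) = params
    h = np.tanh(x @ W1 + b1)
    h = np.tanh(h @ W2 + b2)
    out = (h @ W3 + b3)[..., 0]
    return np.logaddexp(0.0, out)  # softplus output (assumption): keeps w_res > 0

def random_unit_vectors(rng, n):
    v = rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
params = init_params(rng)
n_params = sum(W.size + b.size for W, b in params)
weights = w_res(random_unit_vectors(rng, 5), random_unit_vectors(rng, 5), params)
```

With these assumed sizes the network has only a few hundred parameters, which fits the slide's emphasis on a simple architecture.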
  13. Proposed model. Residual kernel $\kappa_{\mathrm{res}}$. Kernel obtained via integration. No closed form ⇒ the integral operator is approximated numerically. Reciprocity is learned by data augmentation. Since the properties are enforced by the integrator and the training process, $\boldsymbol{\theta}$ learns freely.
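One way to approximate the integral operator numerically is a product quadrature over the two spheres, sketched below for the kernel formula of the generalized model with an arbitrary weight function. The slide does not state which quadrature rule is used, so the Gauss–Legendre and uniform-azimuth rule, the function names, and the test positions are assumptions. A uniform weight normalized to $1/(4\pi)^2$ (also an assumption) is used as a check, because its kernel reduces to products of $j_0$ terms.

```python
import numpy as np

def sphere_quadrature(n_polar=16, n_azim=32):
    """Quadrature points and weights on the unit sphere (weights sum to 4 pi)."""
    x, wx = np.polynomial.legendre.leggauss(n_polar)
    az = 2.0 * np.pi * np.arange(n_azim) / n_azim
    X, A = np.meshgrid(x, az, indexing="ij")
    sin_t = np.sqrt(1.0 - X**2)
    pts = np.stack([sin_t * np.cos(A), sin_t * np.sin(A), X], axis=-1)
    return pts.reshape(-1, 3), np.repeat(wx, n_azim) * (2.0 * np.pi / n_azim)

def kernel_numeric(r, s, r_p, s_p, k, weight_fn, pts, wq):
    """Quadrature approximation of kappa(r|s, r'|s') for a weight w(r_hat, s_hat)."""
    W = weight_fn(pts[:, None, :], pts[None, :, :])  # (Q, Q): rows r_hat, cols s_hat
    def term(a, b):
        # Integrates exp(ik (r_hat . a + s_hat . b)) * w over both spheres.
        ea = wq * np.exp(1j * k * (pts @ a))
        eb = wq * np.exp(1j * k * (pts @ b))
        return ea @ W @ eb
    return 0.5 * (term(r - r_p, s - s_p) + term(r - s_p, s - r_p))

def uniform_weight(r_hat, s_hat):
    """Constant weight 1 / (4 pi)^2 (normalization chosen for the check)."""
    shape = np.broadcast(r_hat[..., 0], s_hat[..., 0]).shape
    return np.full(shape, 1.0 / (4.0 * np.pi) ** 2)

def j0(x):
    return np.sinc(x / np.pi)  # np.sinc(t) = sin(pi t) / (pi t)

# Hypothetical wavenumber and positions, not taken from the slides.
k = 3.0
r, s = np.array([0.1, 0.0, 0.05]), np.array([0.6, 0.5, 0.3])
r_p, s_p = np.array([-0.05, 0.1, 0.0]), np.array([0.7, 0.45, 0.25])

pts, wq = sphere_quadrature()
approx = kernel_numeric(r, s, r_p, s_p, k, uniform_weight, pts, wq)
dist = np.linalg.norm
exact = 0.5 * (j0(k * dist(r - r_p)) * j0(k * dist(s - s_p))
               + j0(k * dist(r - s_p)) * j0(k * dist(s - r_p)))
```

Replacing `uniform_weight` with a learned weight function leaves the integrator unchanged, which is the sense in which the physical structure is enforced by the integration rather than by the network.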
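  14. Proposed model. Parameter optimization. $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, $\boldsymbol{\theta}$ are chosen so as to minimize the leave-one-out cross-validation error.

    $$E_{\mathrm{LOO}}(\boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\theta}) = \frac{1}{N} \sum_{n=1}^{N} \left| \hat{h}_R(\mathbf{q}_n; \breve{\mathbf{Q}}_n, \breve{\mathbf{y}}_n, w) - y_n \right|^2$$

    $\boldsymbol{\beta}$ and $\boldsymbol{\theta}$ are optimized using the gradient descent method, and $\boldsymbol{\alpha}$ using the reduced gradient method with line search. Unlike a deep learning model, this model needs no outside measurements. It learns from the $N$ recorded ATFs alone.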
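The leave-one-out error can be evaluated as sketched below: each measurement is held out in turn, kernel ridge regression is fitted on the rest, and the prediction at the held-out point is compared with its value. The second function uses a standard identity for ridge-type estimators to get the same number from a single fit; the slides do not say whether this shortcut is used. The stand-in kernel, the synthetic data, and all names are assumptions, and the gradient-based optimization over the kernel parameters is not shown.

```python
import numpy as np

def loo_error(K, y, lam):
    """Explicit leave-one-out error: refit without sample n, predict at q_n."""
    N = len(y)
    errs = np.empty(N)
    for n in range(N):
        keep = np.arange(N) != n
        coef = np.linalg.solve(K[np.ix_(keep, keep)] + lam * np.eye(N - 1), y[keep])
        pred = K[n, keep] @ coef
        errs[n] = np.abs(pred - y[n]) ** 2
    return errs.mean()

def loo_error_fast(K, y, lam):
    """Same quantity from one fit, via residual / (1 - H_nn) with H = K (K + lam I)^-1."""
    N = len(y)
    H = K @ np.linalg.inv(K + lam * np.eye(N))
    resid = (y - H @ y) / (1.0 - np.diag(H))
    return np.mean(np.abs(resid) ** 2)

def standin_kernel(A, B, k):
    """Stand-in kernel j0(k ||a - b||), not the ATF kernel of the talk."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.sinc(k * d / np.pi)

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
Q = rng.uniform(-0.2, 0.2, size=(15, 3))
y = rng.standard_normal(15) + 1j * rng.standard_normal(15)
K = standin_kernel(Q, Q, 9.0)

e_slow = loo_error(K, y, 1e-2)
e_fast = loo_error_fast(K, y, 1e-2)
```

Because only the recorded samples enter this criterion, it matches the slide's point that no measurements outside the $N$ recorded ATFs are needed.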
  15. Numerical experiments. Numerical simulations. Experiments performed with the image source method on a shoebox-shaped room with dimensions 3.2 m × 4.0 m × 2.7 m. Reverberation time $T_{60} = 0.45$ s. Radii of $\Omega_R$ and $\Omega_S$ both 0.2 m. Center of $\Omega_R$: $\mathbf{r}_0 = [-0.65, -0.80, -0.48]^{\mathsf{T}}$ m. Center of $\Omega_S$: $\mathbf{s}_0 = [0.65, 0.80, 0.48]^{\mathsf{T}}$ m. $L = M = 41$. Noise was added so that SNR = 20 dB. Compared Uniform, Sunken sphere, Residual only, Directed only, and Proposed.
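  16. Numerical experiments. Evaluation criteria. The method is evaluated at each frequency with the normalized mean square error (NMSE) over a total of 9025 test points.

    $$\mathrm{NMSE}(\hat{h}, h) = 10 \log_{10} \left( \frac{\sum_{n=1}^{N'} \left| \hat{h}(\mathbf{q}'_n) - h(\mathbf{q}'_n) \right|^2}{\sum_{n=1}^{N'} \left| h(\mathbf{q}'_n) \right|^2} \right)$$

    Reconstruction of the ATF on a plane is evaluated with the normalized square error (NSE).

    $$\mathrm{NSE}(\hat{h}, h, \mathbf{r}) = 10 \log_{10} \left( \frac{\left| h(\mathbf{r}|\mathbf{s}_0) - \hat{h}(\mathbf{r}|\mathbf{s}_0) \right|^2}{\left| h(\mathbf{r}|\mathbf{s}_0) \right|^2} \right)$$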
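Both criteria on this slide are straightforward to compute; a minimal sketch follows. The function names and the synthetic test data are assumptions. An estimate that is uniformly 10% too small in amplitude has a relative squared error of 0.01, which corresponds to -20 dB under either metric.

```python
import numpy as np

def nmse_db(h_est, h_true):
    """NMSE in dB over all test points: summed squared error over summed energy."""
    num = np.sum(np.abs(h_est - h_true) ** 2)
    den = np.sum(np.abs(h_true) ** 2)
    return 10.0 * np.log10(num / den)

def nse_db(h_est, h_true):
    """Pointwise NSE in dB, one value per evaluation position."""
    return 10.0 * np.log10(np.abs(h_true - h_est) ** 2 / np.abs(h_true) ** 2)

# Synthetic example: the estimate is the true field scaled by 0.9.
rng = np.random.default_rng(0)
h_true = rng.standard_normal(100) + 1j * rng.standard_normal(100)
h_est = 0.9 * h_true

nmse = nmse_db(h_est, h_true)  # about -20 dB
nse = nse_db(h_est, h_true)    # about -20 dB at every point
```

The NMSE aggregates over all test points at one frequency, while the NSE keeps the spatial dimension so it can be drawn as a map over the evaluation plane.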
  17. Numerical experiments. Colormaps of the real part, comparing the reconstruction of the signal generated by a single source at $\mathbf{s}_0$.

    [Figure panels: (a) Original, (b) Uniform, (c) Sunken sphere, (d) Residual only, (e) Directed only, (f) Proposed.]
  18. Numerical experiments. Normalized square errors, comparing the reconstruction of the signal generated by a single source at $\mathbf{s}_0$.

    [Figure panels: (b) Uniform, (c) Sunken sphere, (d) Residual only, (e) Directed only, (f) Proposed.]
  19. Conclusion. We devised an interpolation function that takes into consideration directed and residual reverberations in order to derive an adaptive kernel.

    By guaranteeing that the general physical properties of the ATF are respected using the Herglotz wave function, the weight function associated with the adaptive kernel is free to learn without further restrictions. The formulated model learns the optimal model parameters using internal data only, with no need for measurements outside the derivation data set.

    The proposed method outperformed the previously established methods on both a frequency-by-frequency and a spatial basis. We also evaluated the proposed method against the directed and residual weight components separately, confirming the advantages of optimizing both together.