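Physics-Informed Machine Learning for Spatial Audio Processing

Shoichi Koyama, National Institute of Informatics / The Graduate University for Advanced Studies (SOKENDAI)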

Spatial audio processing (December 2, 2025)
Analysis and control of acoustic space using multiple microphones
➢ Spatial audio capture and reproduction
– The sound field is captured by multiple mics and reproduced by headphones or loudspeakers
– Head-related transfer function personalization
➢ Room acoustic analysis and control
– Visualization and auralization of spatial sound
– Estimation of room acoustic impulse responses/transfer functions
➢ Source enhancement and separation
– Beamforming techniques require accurate steering vectors (array manifold vectors)
– Source enhancement using wearable devices is more challenging

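What is sound field estimation?
Estimating the sound field inside a target region using multiple microphones.
It is a foundational technology for various spatial audio processing techniques and has a wide range of applications.
[Figure: Microphone]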

Application 1: Binaural reproduction from microphone array signals
Spatial audio capture with a microphone array and its binaural reproduction for VR audio
[Figure: Recording, Reproduction]
➢ Unlike binaural synthesis in a VR space, binaural reproduction of a real environment requires capturing spatial audio with multiple mics
➢ Spatial sound must be estimated over a wide area to achieve a wide listening area, e.g., 6DoF reproduction

Application 2: Upsampling of steering vectors
Spatial interpolation of steering vectors obtained by impulse response measurement
➢ Estimating steering vectors is more critical for source enhancement with wearable multi-mic devices than with simple-shaped arrays
➢ Upsampling techniques for steering vectors will simplify the measurement of steering vectors

Application 3: Spatial active noise control
Noise suppression inside a 3D region using loudspeaker signals
➢ Active noise control (ANC) aims to cancel noise by using loudspeaker signals, but its effect is limited to a local region
➢ Spatial ANC estimates the spatial sound using multiple mics and synthesizes a spatial anti-sound using multiple loudspeakers
[Figure: Quiet zone]

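Sound field estimation
Interior and exterior problems of sound field estimation
[Figure: Interior problem and exterior problem, each showing the microphones and the target region]
This talk focuses on the interior problem.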

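Sound field estimation
Formulation of the sound field estimation problem
Estimate the pressure distribution inside the target region, in the time domain or in the frequency domain, from omnidirectional mics placed at multiple positions
[Figure: Microphone, Target region]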

Sound field estimation
Formulation of the sound field estimation problem
➢ Problem setting as general function interpolation
– The function to be estimated is represented by model parameters
– The objective consists of a loss term and a regularization term
– Observations are samples in space/time/frequency
[Figure: Microphone, Target region]

Sound field estimation
Formulation of the sound field estimation problem
➢ Problem setting as general function interpolation
– The function to be estimated is represented by model parameters
– Example objective: squared error loss with a squared ℓ2-norm penalty
[Figure: Microphone, Target region]
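The objective named on this slide can be written out as below. The slide's own equation did not survive extraction, so every symbol is an assumption introduced here: u(·; θ) is the estimated function with model parameters θ, s_m is the observation at sample point x_m, M is the number of samples, and λ > 0 is the regularization weight.

```latex
\hat{\theta} = \operatorname*{arg\,min}_{\theta}
  \underbrace{\sum_{m=1}^{M} \bigl| s_m - u(x_m; \theta) \bigr|^2}_{\text{squared error loss}}
  + \lambda \underbrace{\| \theta \|_2^2}_{\text{squared } \ell_2\text{-norm penalty}}
```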

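Embedding physical properties into function interpolation methods
Purely data-driven approaches can suffer from overfitting.
Physics-informed machine learning (PIML), which incorporates physical constraints into machine learning techniques, has been attracting attention [Karniadakis+ 2021]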

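Embedding physical properties into function interpolation methods
What physical properties can be embedded?
➢ The function to be estimated should satisfy the governing equations of the sound field
– Wave equation (time domain)
– Helmholtz equation (frequency domain)
This talk introduces techniques for incorporating constraints on the solution space of the governing equations.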
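For reference, the two governing equations take the following standard source-free forms. The notation is assumed here rather than taken from the slide: U is the pressure in the time domain, u its frequency-domain counterpart, c the speed of sound, and k = ω/c the wavenumber.

```latex
\begin{align}
  \nabla^2 U(\mathbf{r}, t) - \frac{1}{c^2} \frac{\partial^2 U(\mathbf{r}, t)}{\partial t^2} &= 0
    && \text{(wave equation)} \\
  \left( \nabla^2 + k^2 \right) u(\mathbf{r}, \omega) &= 0
    && \text{(Helmholtz equation)}
\end{align}
```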

PIML for sound field estimation
PIML for sound field estimation is introduced here based on a recent overview paper.
[Figure: Paper]

Sound field estimation methods before neural networks
➢ There is a long history of methods that use physical constraints explicitly
– Methods based on basis function expansion [Williams+ 1999, Colton+ 2013]
• Plane wave expansion (or Herglotz wave function)
• Spherical wave function expansion
• Equivalent source distribution (or single-layer potential)
– Methods based on infinite-dimensional expansion or kernel regression
• Harmonic analysis of infinite order [Ueno+ 2018]
• Directionally-weighted kernel regression [Ueno+ 2021]
Comprehensive review of conventional sound field estimation:
• Ueno and Koyama, “Sound Field Estimation: Theories and Applications,” Foundations and Trends® in Signal Processing, 2025.

Basis function expansion into element solutions of the governing equations
Representation by a linear combination of a finite number of basis functions
➢ The function is modeled by basis functions and their weights
➢ Basis functions given by element solutions of the wave equation/Helmholtz equation [Williams+ 1999, Colton+ 2013]
– Plane wave expansion (Herglotz wave function)
– Spherical wave function expansion
– Equivalent source distribution (single-layer potential)

Basis function expansion into element solutions of the governing equations
➢ Plane wave expansion (or Herglotz wave function)
[Equation labels: Plane wave, arrival direction]
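To make the plane wave expansion concrete, here is a minimal sketch in plain Python that builds a field from a few plane waves and checks numerically that it satisfies the Helmholtz equation. The sign convention exp(-j k η·r), the frequency, the directions, and the weights are all assumptions chosen for illustration, not values from the slides.

```python
import cmath
import math


def plane_wave_field(r, k, directions, weights):
    """Evaluate u(r) = sum_l w_l * exp(-1j * k * eta_l . r)."""
    total = 0j
    for eta, w in zip(directions, weights):
        phase = k * sum(e * x for e, x in zip(eta, r))
        total += w * cmath.exp(-1j * phase)
    return total


def helmholtz_residual(u, r, k, h=1e-3):
    """Central-difference estimate of (Laplacian + k^2) u at point r."""
    lap = 0j
    for axis in range(3):
        fwd = list(r)
        bwd = list(r)
        fwd[axis] += h
        bwd[axis] -= h
        lap += (u(fwd) + u(bwd) - 2 * u(r)) / h**2
    return lap + k**2 * u(r)


k = 2 * math.pi * 500 / 343.0  # wavenumber at 500 Hz, assuming c = 343 m/s
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.6, 0.0, 0.8)]  # unit vectors
weights = [1.0 + 0.5j, -0.3j, 0.7]  # hypothetical expansion coefficients


def u(r):
    return plane_wave_field(r, k, directions, weights)


point = [0.1, -0.2, 0.05]
residual = abs(helmholtz_residual(u, point, k))
scale = k**2 * abs(u(point))  # size of the individual terms in the residual
print(f"relative Helmholtz residual: {residual / scale:.2e}")
```

The residual is not exactly zero only because the Laplacian is approximated by finite differences; any weighted sum of such plane waves is an exact solution.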

Basis function expansion into element solutions of the governing equations
➢ Spherical wave function expansion
[Equation labels: Expansion center, Spherical Bessel function, Spherical harmonic function]
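One common way to write the expansion whose parts are labeled on this slide is given below. The symbols are assumptions made here: r_o is the expansion center, j_ν the spherical Bessel function of order ν, Y_{ν,μ} the spherical harmonic function, c_{ν,μ} the expansion coefficients, and N the truncation order. Normalization conventions vary between references.

```latex
u(\mathbf{r}) \approx \sum_{\nu=0}^{N} \sum_{\mu=-\nu}^{\nu}
  c_{\nu,\mu} \, j_{\nu}\!\left( k \| \mathbf{r} - \mathbf{r}_{\mathrm{o}} \| \right)
  Y_{\nu,\mu}\!\left( \frac{\mathbf{r} - \mathbf{r}_{\mathrm{o}}}{\| \mathbf{r} - \mathbf{r}_{\mathrm{o}} \|} \right)
```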

Basis function expansion into element solutions of the governing equations
➢ Spherical Bessel function
[Plot: Bessel function of order n = 0, 2, 4, 6 versus x; x-axis ticks from 2 to 14, y-axis from -0.2 to 1]
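The curves in this plot can be regenerated with a few lines of standard-library Python. The sketch below uses the upward recurrence, which is an implementation choice made here for brevity; it is adequate for the orders and range shown but loses accuracy when x is much smaller than n.

```python
import math


def spherical_bessel_j(n, x):
    """Spherical Bessel function of the first kind j_n(x) for x > 0.

    Uses the upward recurrence j_{m+1}(x) = (2m + 1) / x * j_m(x) - j_{m-1}(x),
    starting from the closed forms of j_0 and j_1.
    """
    j_prev = math.sin(x) / x  # j_0
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x  # j_1
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr


# Sample the same orders as in the plot at a few points of its x range.
for n in (0, 2, 4, 6):
    values = [round(spherical_bessel_j(n, x), 4) for x in (2.0, 6.0, 10.0, 14.0)]
    print(f"n={n}: {values}")
```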

Basis function expansion into element solutions of the governing equations
➢ Spherical harmonic function
[Equation label: Associated Legendre function]

Basis function expansion into element solutions of the governing equations
➢ Equivalent source distribution (or single-layer potential)
[Equation label: Point source]
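A discretized form of the equivalent source model is sketched below. The notation is assumed here: point sources sit at positions r_l outside the target region with weights c_l, and G is the free-field Green's function. The sign of the exponent depends on the time convention, which the extracted slide does not show.

```latex
u(\mathbf{r}) \approx \sum_{l=1}^{L} c_l \, G(\mathbf{r} \mid \mathbf{r}_l),
\qquad
G(\mathbf{r} \mid \mathbf{r}_l) =
  \frac{e^{\,\mathrm{j} k \| \mathbf{r} - \mathbf{r}_l \|}}{4 \pi \| \mathbf{r} - \mathbf{r}_l \|}
```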

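Basis function expansion into element solutions of the governing equations
➢ Linear regression using a finite number of basis functions
– Regularized least squares solution for the expansion coefs
– Estimate the function from the obtained coefs
The number of basis functions and the expansion center must be set appropriately.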
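The two steps on this slide can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the method of any cited paper: the basis is a set of 64 randomly oriented plane waves, the "measured" field is synthetic, and the frequency, array geometry, and regularization weight `lam` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 2 * np.pi * 300 / 343.0  # wavenumber at 300 Hz, assuming c = 343 m/s


def unit_vectors(n, rng):
    """Draw n random unit vectors (hypothetical plane wave directions)."""
    v = rng.standard_normal((n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def plane_wave_matrix(points, directions, k):
    """Matrix with entries exp(-1j * k * eta_l . r_m), shape (points, directions)."""
    return np.exp(-1j * k * points @ directions.T)


# Synthetic ground truth: two plane waves standing in for a measured field.
true_dirs = unit_vectors(2, rng)
true_amps = np.array([1.0, 0.5j])


def field(points):
    return plane_wave_matrix(points, true_dirs, k) @ true_amps


mics = rng.uniform(-0.5, 0.5, size=(30, 3))  # mic positions in a 1 m cube
s = field(mics)                               # noiseless observations

basis_dirs = unit_vectors(64, rng)            # finite set of basis functions
Phi = plane_wave_matrix(mics, basis_dirs, k)  # shape (30, 64)
lam = 1e-3                                    # regularization weight

# Step 1: regularized least squares, minimizing ||s - Phi c||^2 + lam ||c||^2.
coefs = np.linalg.solve(Phi.conj().T @ Phi + lam * np.eye(64), Phi.conj().T @ s)

# Step 2: evaluate the estimated function at new positions.
test_pts = rng.uniform(-0.3, 0.3, size=(200, 3))
est = plane_wave_matrix(test_pts, basis_dirs, k) @ coefs
truth = field(test_pts)
nmse_db = 10 * np.log10(np.sum(np.abs(est - truth) ** 2) / np.sum(np.abs(truth) ** 2))
print(f"NMSE at unseen positions: {nmse_db:.1f} dB")
```

Changing the number of basis directions or `lam` shows the sensitivity that the slide's closing remark warns about.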

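Kernel regression with governing-equation constraints
➢ The function is represented by a weighted sum of kernel functions
➢ The kernel function is a similarity function expressed as an inner product on some function space
The feature map can be made infinite-dimensional, and the kernel function can also be designed directly.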

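Kernel regression with governing-equation constraints
➢ In kernel ridge regression, the weights are obtained in closed form using the Gram matrix of the kernel function
➢ Estimate the function from the obtained weights
The function space and the kernel function must be defined appropriately.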
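The standard kernel ridge regression formulas behind this slide are reproduced below. The symbols are assumptions introduced here: κ is the kernel function, r_m the m-th mic position, s the vector of M observations, α the weight vector, and λ > 0 the regularization parameter.

```latex
\begin{align}
  u(\mathbf{r}) &= \sum_{m=1}^{M} \alpha_m \, \kappa(\mathbf{r}, \mathbf{r}_m), \\
  \hat{\boldsymbol{\alpha}} &= \left( \mathbf{K} + \lambda \mathbf{I} \right)^{-1} \mathbf{s},
  \qquad (\mathbf{K})_{m,m'} = \kappa(\mathbf{r}_m, \mathbf{r}_{m'}), \\
  \hat{u}(\mathbf{r}) &= \boldsymbol{\kappa}(\mathbf{r})^{\mathsf{T}} \hat{\boldsymbol{\alpha}},
  \qquad \boldsymbol{\kappa}(\mathbf{r}) =
  \left[ \kappa(\mathbf{r}, \mathbf{r}_1), \ldots, \kappa(\mathbf{r}, \mathbf{r}_M) \right]^{\mathsf{T}}
\end{align}
```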

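Kernel regression with governing-equation constraints
Kernel function for constraining the estimate to the solution space of the Helmholtz equation
➢ The inner product and norm over the function space are defined by plane wave expansion with positive directional weighting [Ueno+ 2021]
The directional weighting function is designed to incorporate prior information on the directivity of the sound field.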

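Kernel regression with governing-equation constraints
Kernel function for constraining the estimate to the solution space of the Helmholtz equation
➢ Kernel function when the directional weighting is defined by using the von Mises–Fisher distribution
➢ Kernel function when no prior information is available, i.e., uniform weight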
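For the uniform-weight case, the kernel reduces to the zeroth-order spherical Bessel function of the distance, j_0(k‖r − r′‖), as in the harmonic analysis of infinite order cited earlier [Ueno+ 2018]. The NumPy sketch below plugs that kernel into kernel ridge regression. The synthetic plane wave, the array geometry, and the value of `lam` are assumptions for illustration only.

```python
import numpy as np


def j0_kernel(a, b, k):
    """Kernel j_0(k * ||r - r'||) between all rows of a and all rows of b."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), hence j_0(z) = np.sinc(z / pi).
    return np.sinc(k * dist / np.pi)


rng = np.random.default_rng(1)
k = 2 * np.pi * 300 / 343.0  # wavenumber at 300 Hz, assuming c = 343 m/s

# Synthetic ground truth: a single plane wave with a hypothetical direction.
eta = np.array([0.0, 0.6, 0.8])


def field(points):
    return np.exp(-1j * k * points @ eta)


mics = rng.uniform(-0.5, 0.5, size=(30, 3))  # mic positions in a 1 m cube
s = field(mics)
lam = 1e-3                                    # regularization parameter

# Kernel ridge regression: alpha = (K + lam I)^{-1} s.
K = j0_kernel(mics, mics, k)
alpha = np.linalg.solve(K + lam * np.eye(len(mics)), s)

# The estimate is a weighted sum of kernel functions centered at the mics,
# each of which satisfies the Helmholtz equation.
test_pts = rng.uniform(-0.3, 0.3, size=(200, 3))
est = j0_kernel(test_pts, mics, k) @ alpha
truth = field(test_pts)
nmse_db = 10 * np.log10(np.sum(np.abs(est - truth) ** 2) / np.sum(np.abs(truth) ** 2))
print(f"NMSE at unseen positions: {nmse_db:.1f} dB")
```

Unlike the finite basis expansion, no expansion center or number of basis functions has to be chosen here; only the kernel and `lam` are design choices.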

Kernel regression with governing-equation constraints
➢ Experimental results using real data from the MeshRIR dataset [Koyama+ 2021]
– Reconstruction of a pulse signal emitted by a single loudspeaker, using 18 mics
[Figure panels: Ground truth; Kernel regression w/ Helmholtz equation (HE) constraint; Kernel regression w/ Gaussian kernel. Black dots indicate mic positions.]

Sound field estimation using neural networks
Why neural networks for sound field estimation?
➢ High representational power
– The solution space in basis expansion and kernel regression is highly constrained
– High adaptability to the target acoustic environment can be expected by using NNs
➢ From snapshot-based to learning-based
– Linear and kernel regressions basically use only a snapshot observation
– Properties of the target acoustic environment can be learned from training data
High estimation accuracy can be expected, for example, when the number of mics is extremely small.

NNs for regression incorporating governing-equation constraints
➢ Regression by NNs
– The target output is discretized
– An NN mapping the input to the discretized output is designed with NN params
– The NN is trained on a dataset of input-output pairs to minimize a loss function

NNs for regression incorporating governing-equation constraints
How can governing-equation constraints be incorporated into NNs?
➢ Estimating expansion coefficients of basis functions by NNs
– Train an NN that estimates the weights of a basis expansion
– A continuous function can be reconstructed from the estimated expansion coefs
– Can be regarded as a physics-constrained neural network (PCNN) [Karakonstantis+ 2023, Lobato+ 2024]
➢ Introducing an (approximate) PDE loss
– PDE loss: a loss function evaluating the deviation from the governing PDEs
– Because the output values are discrete, the PDE loss is computed by finite differences or interpolation
– [Shigemi+ 2022] proposes a physics-informed convolutional neural network (PICNN) using bicubic spline interpolation
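A finite-difference PDE loss for a discretized output can be sketched as below. This illustrates the finite-difference option only, in 2D and with NumPy; it is not the bicubic spline formulation of [Shigemi+ 2022], and the function name `helmholtz_pde_loss`, the grid, and the test fields are assumptions made here.

```python
import numpy as np


def helmholtz_pde_loss(u, k, h):
    """Mean squared residual of (Laplacian + k^2) u on interior grid points.

    u is a 2-D complex array sampled on a regular grid with spacing h;
    the Laplacian is approximated by the five-point stencil.
    """
    center = u[1:-1, 1:-1]
    lap = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2] - 4 * center) / h**2
    residual = lap + k**2 * center
    return np.mean(np.abs(residual) ** 2)


k = 2 * np.pi * 500 / 343.0  # wavenumber at 500 Hz, assuming c = 343 m/s
h = 0.01                     # grid spacing in metres
axis = np.arange(0.0, 1.0, h)
x, y = np.meshgrid(axis, axis, indexing="ij")

# A plane wave satisfies the 2-D Helmholtz equation, so its loss is small.
plane_wave = np.exp(-1j * k * (0.6 * x + 0.8 * y))
# A smooth Gaussian bump is not a solution, so its loss is large.
non_physical = np.exp(-((x - 0.5) ** 2) - (y - 0.5) ** 2).astype(complex)

loss_physical = helmholtz_pde_loss(plane_wave, k, h)
loss_non_physical = helmholtz_pde_loss(non_physical, k, h)
print(f"plane wave: {loss_physical:.2e}, Gaussian bump: {loss_non_physical:.2e}")
```

In training, a term like this would be added to the data loss on the network output; the remaining loss for the plane wave comes from the discretization error of the stencil.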

PINNs based on NNs for implicit representation
➢ Implicit neural representation [Sitzmann+ 2020]
– NNs are used to implicitly represent a continuous function
– An NN mapping an input coordinate to the function value is designed with NN params
– The NN is trained to approximate the target function by using training data

PINNs based on NNs for implicit representation
➢ Physics-informed neural network (PINN) [Raissi+ 2019]
– Implicit neural representation allows constraints on the represented function, including its (partial) derivatives, to be incorporated into the loss function
– The derivatives can be computed using automatic differentiation

PINNs based on NNs for implicit representation
➢ Physics-informed neural network (PINN) [Raissi+ 2019]
– Case of estimating a function that approximately satisfies the Helmholtz equation
– The loss function includes a term evaluating the deviation from the Helmholtz equation
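A typical PINN objective for the Helmholtz case has the following shape. The slide's own equation is lost, so this is a generic form under assumed notation: g(·; θ) is the network, s_m the observation at mic position r_m, r_n are N evaluation points inside the target region, and ε > 0 balances the two terms.

```latex
\mathcal{J}(\theta) =
  \sum_{m=1}^{M} \bigl| s_m - g(\mathbf{r}_m; \theta) \bigr|^2
  + \epsilon \, \frac{1}{N} \sum_{n=1}^{N}
    \Bigl| \left( \nabla^2 + k^2 \right) g(\mathbf{r}_n; \theta) \Bigr|^2
```

The Laplacian of g with respect to the input coordinate is what automatic differentiation provides.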

PINNs based on NNs for implicit representation
➢ Reconstruction of time-domain RIRs using a PINN [Pezzoli+ 2023]
– RIRs measured by a linear array of 100 mics are reconstructed using only 33 channels

Current PIML-based sound field estimation methods
[Figure: methods arranged along two axes, PI-strategy (Constrained vs. Penalized) and Training-strategy (Snapshot-based vs. Learning-based)]
Methods shown: [Ribeiro+ 2024], [Karakonstantis+ 2023], [Olivieri+ 2024], [Shigemi+ 2022], [Lobato+ 2024], [Chen+ 2023], [Ma+ 2024], [Karakonstantis+ 2024], [Masuyama+ 2025]

Physics-Constrained Neural Kernel
Kernel function under the Helmholtz-equation constraint using an NN for implicit representation [Ribeiro+ 2024]
➢ The directional weighting function of the kernel function is adapted to the environment
[Equation labels: Kernel function based on plane wave expansion; Directed component; Residual component]

Physics-Constrained Neural Kernel
Kernel function under the Helmholtz-equation constraint using an NN for implicit representation
➢ Directed component
– Weighted sum of (sparse) von Mises–Fisher distributions to represent direct sound and early reflections
[Equation labels: Sparsity constraint; Normalization constant]

Physics-Constrained Neural Kernel
Kernel function under the Helmholtz-equation constraint using an NN for implicit representation
➢ Residual component
– Implicit neural representation to represent late reverberation
[Equation labels: Computed by numerical integration; Implicit neural representation]

Physics-Constrained Neural Kernel
Kernel function under the Helmholtz-equation constraint using an NN for implicit representation
➢ The kernel function is the sum of the directed and residual kernels
– Hyperparameters are jointly optimized by a steepest descent-based algorithm
– The solution still satisfies the Helmholtz equation
– Inference is a linear operation based on kernel ridge regression
[Equation labels: Directed kernel; Residual kernel]
The estimation can be implemented as FIR filtering in the time domain.

Physics-Constrained Neural Kernel
➢ Numerical experiment: T60: 400 ms, # mics: 41, spherical shell array [Koyama+ 2025]
[Figure: result plots, with panels labeled "Proposed PCNK"]

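Summary
➢ PIML for spatial audio processing
– Overview of PIML for sound field estimation, the foundation of spatial audio processing
– Approaches to embedding physical properties into function interpolation
• Basis function expansion into element solutions of the governing equations
• Kernel regression with governing-equation constraints
• NNs for regression incorporating governing-equation constraints
• PINNs based on NNs for implicit representation
– Current PIML-based sound field estimation
• Physics-Constrained Neural Kernel
Thank you for your attention!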
