in Japan
– Main lab is located in central Tokyo
– Associated with a graduate university called SOKENDAI
[Map labels: Kashiwa Annex, NII, 29 km, Imperial Palace, ICASSP 2028 venue]
August 22, 2025
and Control
Our research topics: Sound field estimation/control and its applications
• VR/AR audio
• Active noise control
• Local-field recording and reproduction
• Signal enhancement
• Visualization/auralization
• Room acoustic analysis
distribution of a continuous physical quantity of sound from discrete sensor observations?
– Fundamental problem, but very important in various applications
– Estimate the pressure distribution from observations at a discrete set of mics in the frequency domain
[Figure labels: Target region, Microphone]
– Basis expansion-based methods [Colton+ 1992]
  • Plane wave expansion (or Herglotz wave function)
  • Spherical wave function expansion
  • Equivalent source distribution (or single-layer potential)
– Infinite-dimensional expansion or kernel regression
  • Harmonic analysis of infinite order [Ueno+ 2018]
  • Directionally-weighted kernel regression [Ueno+ 2021]
Comprehensive review is available at
  • Ueno and Koyama, “Sound Field Estimation: Theories and Applications,” Foundations and Trends® in Signal Processing, 2025.
Kernel regression with constraint of Helmholtz equation
interpolated is represented by a weighted sum of kernel functions
Ø Kernel function to constrain the solution to satisfy the Helmholtz equation
– With directional weighting of von Mises‒Fisher distribution
– With uniform weighting
[Equation annotations: Kernel function, Direction, Sharpness]
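The uniform-weighting case can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the uniform-weighting kernel is the zeroth-order spherical Bessel function j0(k‖r1 − r2‖), and the function names, regularization value, microphone layout, and the plane-wave test field are all hypothetical choices made for the example.

```python
import numpy as np

def helmholtz_kernel(r1, r2, k):
    """Uniform-weighting kernel j0(k * ||r1 - r2||) between all rows of r1 and r2."""
    dist = np.linalg.norm(r1[:, None, :] - r2[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this evaluates sin(k d) / (k d).
    return np.sinc(k * dist / np.pi)

def estimate_field(mic_pos, mic_pressure, eval_pos, k, reg=1e-4):
    """Kernel ridge regression: solve for weights at the mics, evaluate anywhere."""
    gram = helmholtz_kernel(mic_pos, mic_pos, k)
    alpha = np.linalg.solve(gram + reg * np.eye(len(mic_pos)), mic_pressure)
    # The estimate is a weighted sum of kernel functions centered at the mics.
    return helmholtz_kernel(eval_pos, mic_pos, k) @ alpha

# Hypothetical test case: one plane wave at 200 Hz observed by 40 random mics.
rng = np.random.default_rng(0)
k = 2 * np.pi * 200.0 / 343.0          # wavenumber for c = 343 m/s
direction = np.array([1.0, 0.0, 0.0])  # assumed propagation direction

def plane_wave(pos):
    return np.exp(-1j * k * (pos @ direction))

mic_pos = rng.uniform(-0.5, 0.5, size=(40, 3))
eval_pos = rng.uniform(-0.3, 0.3, size=(200, 3))

estimate = estimate_field(mic_pos, plane_wave(mic_pos), eval_pos, k)
truth = plane_wave(eval_pos)
nmse_db = 10 * np.log10(np.sum(np.abs(estimate - truth) ** 2)
                        / np.sum(np.abs(truth) ** 2))
print(f"NMSE: {nmse_db:.1f} dB")
```

Because every kernel function j0(k‖r − r_m‖) is itself a solution of the Helmholtz equation, any weighted sum of them is too, which is how the constraint enters without an explicit penalty term.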
real data from MeshRIR dataset
– Reconstructing pulse signal from single loudspeaker w/ 18 mics
[Figure panels: Ground truth; Kernel regression w/ HE constraint; Kernel regression w/ Gaussian kernel (Black dots indicate mic positions)] [Koyama+ 2021]
Applied to binaural rendering, spatial active noise control, etc. [Ueno+ 2025]
to acoustic environments
  • Estimator is fixed regardless of environment in current methods
  • High representational power of NNs allows adaptation to environment
– Data-driven prior information
  • Data obtained in advance gives rich prior information on environment
  • High accuracy can be maintained even with an extremely small number of mics
• Physics-Constrained Neural Kernel [Ribeiro+ 2024]
• Autoencoder conditioned on source and sensor positions [Koyama+ 2025]
Neural field)
– NN implicitly representing a continuous function
[Diagram labels: Input, Output] [Sitzmann+ 2020]
Loss function: Physical properties are not taken into consideration
– Implicit neural representation incorporating a loss function that evaluates the deviation from the governing PDE (PDE loss)
[Diagram labels: Input, Output] [Raissi+ 2019]
Loss function: deviation from the governing PDE is penalized
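A PDE-penalized loss of this kind can be sketched as a data term plus a Helmholtz residual term. The sketch below is only illustrative: it takes the governing PDE to be the Helmholtz equation (∇² + k²)u = 0, replaces the automatic differentiation a real physics-informed network would use with central finite differences, and stands in for the network with a plain callable. The names `helmholtz_residual` and `pde_penalized_loss`, the step size, and the test fields are hypothetical.

```python
import numpy as np

def helmholtz_residual(field, points, k, h=1e-3):
    """Central-difference estimate of (Laplacian + k^2) u at the given points."""
    center = field(points)
    laplacian = np.zeros(len(points), dtype=complex)
    for axis in range(points.shape[1]):
        step = np.zeros(points.shape[1])
        step[axis] = h
        laplacian += (field(points + step) - 2 * center + field(points - step)) / h**2
    return laplacian + k**2 * center

def pde_penalized_loss(field, mic_pos, mic_pressure, colloc_pos, k, weight=1.0):
    """Data misfit at the mics plus weighted PDE residual at collocation points."""
    data_term = np.mean(np.abs(field(mic_pos) - mic_pressure) ** 2)
    pde_term = np.mean(np.abs(helmholtz_residual(field, colloc_pos, k)) ** 2)
    return data_term + weight * pde_term

rng = np.random.default_rng(0)
k = 2 * np.pi * 200.0 / 343.0
direction = np.array([0.0, 1.0, 0.0])

def make_plane_wave(wavenumber):
    # Candidate "model output": a plane wave with the given wavenumber.
    return lambda pos: np.exp(-1j * wavenumber * (pos @ direction))

mic_pos = rng.uniform(-0.5, 0.5, size=(10, 3))
colloc_pos = rng.uniform(-0.5, 0.5, size=(100, 3))
observed = make_plane_wave(k)(mic_pos)

# A field with the correct wavenumber satisfies the PDE; a mismatched one does not.
loss_physical = pde_penalized_loss(make_plane_wave(k), mic_pos, observed, colloc_pos, k)
loss_unphysical = pde_penalized_loss(make_plane_wave(1.2 * k), mic_pos, observed, colloc_pos, k)
print(loss_physical, loss_unphysical)
```

The penalty only discourages deviation from the PDE. Unlike a kernel built from plane waves, it does not guarantee that the trained function satisfies the equation exactly.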
Implicit neural representation of kernel function with constraint of Helmholtz equation
is adapted to environment
[Figure labels: Microphone, Directed component, Residual component]
Kernel function based on plane wave expansion [Ribeiro+ 2024]
(sparse) von Mises‒Fisher distributions to represent direct sound and early reflections
$$ w_{\mathrm{dir}}(\boldsymbol{\eta}; \boldsymbol{\beta}, \boldsymbol{\gamma}) = \sum_{n=1}^{N} \gamma_n \frac{e^{\beta_n \langle \boldsymbol{\eta}, \boldsymbol{d}_n \rangle}}{C(\beta_n)} \qquad (\|\boldsymbol{\gamma}\|_1 = 1) $$
$$ \kappa_{\mathrm{dir}}(\boldsymbol{r}_1, \boldsymbol{r}_2) = \sum_{n=1}^{N} \gamma_n \frac{j_0\!\left( \sqrt{ (\mathrm{j}\beta_n \boldsymbol{d}_n - k \boldsymbol{r}_{12})^{\mathsf{T}} (\mathrm{j}\beta_n \boldsymbol{d}_n - k \boldsymbol{r}_{12}) } \right)}{C(\beta_n)} $$
Sparsity constraint: $\|\boldsymbol{\gamma}\|_1 = 1$
Normalization const: $C(\beta_n)$
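The closed-form directed kernel can be checked numerically against a weighted plane-wave integral over the unit sphere. The sketch below rests on several assumptions that are not stated on the slide: r12 = r1 − r2, the kernel is the integral of w_dir(η) e^{jk⟨η, r12⟩} over the sphere divided by 4π, and C(β) = sinh(β)/β. The directions, sharpness values, weights, and positions are arbitrary illustrative numbers.

```python
import numpy as np

def sphere_quadrature(n_polar=40, n_azimuth=80):
    """Gauss-Legendre rule in cos(theta) times a uniform rule in azimuth."""
    t, wt = np.polynomial.legendre.leggauss(n_polar)
    phi = 2 * np.pi * np.arange(n_azimuth) / n_azimuth
    T, P = np.meshgrid(t, phi, indexing="ij")
    s = np.sqrt(1 - T**2)
    eta = np.stack([s * np.cos(P), s * np.sin(P), T], axis=-1).reshape(-1, 3)
    weights = np.repeat(wt, n_azimuth) * (2 * np.pi / n_azimuth)
    return eta, weights  # weights sum to 4*pi

def norm_const(beta):
    return np.sinh(beta) / beta  # assumed form of C(beta), requires beta > 0

def w_dir(eta, gamma, beta, d):
    """Weighted sum of von Mises-Fisher-shaped lobes around directions d_n."""
    return np.sum(gamma * np.exp(beta * (eta @ d.T)) / norm_const(beta), axis=1)

def j0(z):
    """Spherical Bessel function of order zero for complex arguments."""
    z = np.asarray(z, dtype=complex)
    small = np.abs(z) < 1e-8
    safe = np.where(small, 1.0, z)
    return np.where(small, 1.0, np.sin(safe) / safe)

def kappa_dir(r1, r2, k, gamma, beta, d):
    """Closed-form directed kernel, with r12 = r1 - r2 (assumed convention)."""
    r12 = r1 - r2
    v = 1j * beta[:, None] * d - k * r12[None, :]
    z = np.sqrt(np.sum(v * v, axis=1))  # v^T v, without complex conjugation
    return np.sum(gamma * j0(z) / norm_const(beta))

k = 2 * np.pi * 200.0 / 343.0
gamma = np.array([0.7, 0.3])             # nonnegative, sums to one
beta = np.array([5.0, 2.0])              # sharpness of each lobe
d = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
d = d / np.linalg.norm(d, axis=1, keepdims=True)
r1 = np.array([0.2, -0.1, 0.3])
r2 = np.array([-0.4, 0.25, 0.0])

eta, weights = sphere_quadrature()
integrand = w_dir(eta, gamma, beta, d) * np.exp(1j * k * (eta @ (r1 - r2)))
kappa_quadrature = np.sum(weights * integrand) / (4 * np.pi)
kappa_closed = kappa_dir(r1, r2, k, gamma, beta, d)
print(kappa_closed, kappa_quadrature)
```

Under these conventions the kernel equals one at r1 = r2 whenever the weights γ_n sum to one, which gives a quick sanity check on the normalization.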
to represent late reverberation
$$ w_{\mathrm{res}}(\boldsymbol{\eta}; \boldsymbol{\theta}) = \mathrm{NN}(\boldsymbol{\eta}; \boldsymbol{\theta}) $$
: Implicit neural representation
$$ \kappa_{\mathrm{res}}(\boldsymbol{r}_1, \boldsymbol{r}_2) = \int_{\mathbb{S}^2} w_{\mathrm{res}}(\boldsymbol{\eta}; \boldsymbol{\theta})\, e^{\mathrm{j}k \langle \boldsymbol{\eta}, \boldsymbol{r}_{12} \rangle}\, \mathrm{d}\boldsymbol{\eta} $$
Computed by numerical integration
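The numerical integration of a neural weight function can be sketched as follows. Everything specific here is a placeholder rather than the architecture of [Ribeiro+ 2024]: a two-layer tanh network with random parameters stands in for NN(η; θ), a softplus output is assumed so that the weight stays nonnegative, r12 = r1 − r2 is assumed, and the quadrature rule is one convenient choice among many.

```python
import numpy as np

def sphere_quadrature(n_polar=20, n_azimuth=40):
    """Gauss-Legendre rule in cos(theta) times a uniform rule in azimuth."""
    t, wt = np.polynomial.legendre.leggauss(n_polar)
    phi = 2 * np.pi * np.arange(n_azimuth) / n_azimuth
    T, P = np.meshgrid(t, phi, indexing="ij")
    s = np.sqrt(1 - T**2)
    eta = np.stack([s * np.cos(P), s * np.sin(P), T], axis=-1).reshape(-1, 3)
    weights = np.repeat(wt, n_azimuth) * (2 * np.pi / n_azimuth)
    return eta, weights

def w_res(eta, theta):
    """Placeholder network: direction eta -> nonnegative weight (softplus output)."""
    W1, b1, W2, b2 = theta
    hidden = np.tanh(eta @ W1 + b1)
    return np.logaddexp(0.0, hidden @ W2 + b2).ravel()

def kappa_res_gram(pos, k, theta):
    """Gram matrix of the residual kernel for all pairs of positions in pos."""
    eta, weights = sphere_quadrature()
    plane_waves = np.exp(1j * k * (pos @ eta.T))   # shape (M, Q)
    density = weights * w_res(eta, theta)          # quadrature weight times w_res
    # Entry (m, n) approximates the integral of w_res * exp(jk <eta, r_m - r_n>).
    return (plane_waves * density) @ plane_waves.conj().T

rng = np.random.default_rng(0)
theta = (rng.normal(size=(3, 16)), rng.normal(size=16),
         rng.normal(size=(16, 1)), rng.normal(size=1))
k = 2 * np.pi * 200.0 / 343.0
pos = rng.uniform(-0.5, 0.5, size=(12, 3))

gram = kappa_res_gram(pos, k, theta)
min_eig = np.linalg.eigvalsh(gram).min()
print(gram.shape, min_eig)
```

With a nonnegative weight, the resulting Gram matrix is Hermitian and positive semidefinite for any θ, so the network can be trained freely while the kernel remains valid and each kernel function remains a superposition of plane waves.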
and residual kernels
$$ \kappa = \kappa_{\mathrm{dir}} + \kappa_{\mathrm{res}} $$
(Directed kernel + Residual kernel)
– Hyperparameters $\boldsymbol{\beta}, \boldsymbol{\gamma}, \boldsymbol{\theta}$ are jointly optimized by a steepest descent-based algorithm
– Solution still satisfies the Helmholtz equation
– Inference by linear operation based on kernel ridge regression
Estimation is still achieved by FIR filter in time domain
Estimating “magnitude” distribution of acoustic transfer function (ATF)
of ATF magnitude from discrete set of measurements of ATF magnitudes, e.g.,
– Estimating the sound field using signals not synchronized
– Estimating the directivity of musical instruments or other vibrating bodies
[Figure: Pressure and Magnitude panels; axis z (m); labels: Microphone, Target region]
Nonlinear autoencoder for ATF magnitude estimation [Koyama+ 2025]
of basis expansion-based method
– ATF at arbitrary source positions, mic positions, and frequencies can be obtained because the network is conditioned on them
– Combining multiple datasets with different measurement setups is possible
– Retraining is unnecessary in inference; therefore, computationally efficient
[Diagram labels: Encoder, Decoder, Prototypes, Latent variables, Average, Log ATF magnitudes; Spatially independent, Spatially dependent, Spatially dependent]
and control
– NNs will provide adaptability to acoustic environment and data-driven prior information
– Physics-Constrained Neural Kernel
  • Implicit neural representation of kernel function
  • Constraint on Helmholtz equation by plane wave expansion-based representation
– Autoencoder conditioned on source and mic positions
  • Nonlinear extension of basis expansion-based method
  • Magnitude field estimation by using a very small number of mics
– Interested?
  • Join our special session “Neural spatial audio processing” at ICASSP 2026!
Thank you for your attention!