Comparison of modal versus delay-and-sum beamforming in the context of data-based binaural synthesis

Comparison of Modal versus Delay-and-Sum Beamforming in the Context of Data-Based Binaural Synthesis
Sascha Spors, Hagen Wierstorf and Matthias Geier
Quality and Usability Lab, Deutsche Telekom Laboratories, Technische Universität Berlin
132nd Convention of the Audio Engineering Society, Budapest, April 2012

Sections: Introduction, Beamforming Techniques, Results, Conclusions

Dynamic Data-based Binaural Synthesis
[Duraiswami et al. 2005, Li et al. 2006, Melchior et al. 2009, ...]
- capture the sound field and decompose it into plane waves
- filter with far-field head-related transfer functions (HRTFs)
- analysis by spherical microphone arrays is subject to practical limitations
- perceptual impact of these limitations has not been fully investigated

[Block diagram: primary source, array of M microphones capturing P(x, ω), plane wave decomposition yielding P̄(φ, θ, ω), and HRTFs H̄_L(φ, θ, γ, δ, ω) and H̄_R(φ, θ, γ, δ, ω); the angles (γ, δ) and (φ, θ) are marked in the x-y coordinate system]

S. Spors, H. Wierstorf and M. Geier, Comparison of data-based binaural synthesis, 132nd AES
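The synthesis step named on this slide (filtering the plane wave decomposition with far-field HRTFs) can be sketched as a weighted sum over directions. This is a minimal illustration, not the authors' implementation: the function name, the array shapes, the optional quadrature weights and the random placeholder "HRTFs" are all assumptions, and the dependence of the HRTFs on (γ, δ) is left out.

```python
import numpy as np

def binaural_synthesis(pwd, hrtf_left, hrtf_right, weights=None):
    """Combine a plane wave decomposition with far-field HRTFs.

    pwd:        complex array (Q, F), plane wave amplitudes for Q directions
                and F frequency bins
    hrtf_left:  complex array (Q, F), left-ear HRTFs for the same directions
    hrtf_right: complex array (Q, F), right-ear HRTFs for the same directions
    weights:    optional quadrature weights of shape (Q,), default all ones
    """
    pwd = np.asarray(pwd)
    if weights is None:
        weights = np.ones(pwd.shape[0])
    w = np.asarray(weights)[:, np.newaxis]
    # Each plane wave is filtered with the HRTF of its direction, then summed.
    left = np.sum(w * pwd * hrtf_left, axis=0)
    right = np.sum(w * pwd * hrtf_right, axis=0)
    return left, right

# Toy data: random placeholders, NOT measured HRTFs.
rng = np.random.default_rng(0)
Q, F = 8, 4
hl = rng.standard_normal((Q, F)) + 1j * rng.standard_normal((Q, F))
hr = rng.standard_normal((Q, F)) + 1j * rng.standard_normal((Q, F))

# A sound field that consists of a single unit-amplitude plane wave
# arriving from direction index 2.
pwd = np.zeros((Q, F), dtype=complex)
pwd[2, :] = 1.0
left, right = binaural_synthesis(pwd, hl, hr)
```

For a single plane wave, the ear signals reduce to the HRTF pair of that direction, which is the behaviour an ideal (unlimited) decomposition would give.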

Previous Studies

Influence of spatial bandlimitation and sampling in modal beamforming
- interaural cross correlation (ICC) [Rafaely et al. 2010]
- localization (ITD/ILD) [Avni et al. 2010]
- listening experiment incl. dual-radius cardioid array [Melchior et al. 2009]
- localization/coloration without sampling [Spors et al. 2012]
- ...

Aims of this paper
- comparison of two beamforming techniques
- separate consideration of spatial bandwidth and sampling
- localization properties estimated by binaural model

Spatial Dimensionality of Sound Fields

Summary of concept from [Kennedy et al. 2007]
A finite number of orthogonal components is sufficient to represent a multipath sound field with bounded error within a bounded source-free region.

Quantitative results for a spherical region of radius R
- spherical harmonics series expansion truncated to order N
- field truncation error ε_N decays exponentially with increasing N
- N increases linearly with frequency f and radius R for fixed ε

Example for human head
- R = 9 cm, full audio bandwidth f = 20 kHz
- scattering around human head neglected
- ε < 67.8% for N > 45 ⇒ M ≥ 2116 sensors required
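The head example can be checked with a few lines of arithmetic. The sketch below assumes the order rule N = ⌈e π R / λ⌉, which is how the Kennedy et al. result is commonly quoted but which is not written on the slide, and a speed of sound of c = 343 m/s. The sensor count uses the fact that an expansion up to order N has (N + 1)² spherical harmonic coefficients. The function names are illustrative.

```python
import math

def required_order(radius_m, freq_hz, c=343.0):
    """Truncation order from the assumed rule N = ceil(e * pi * R / lambda)."""
    wavelength = c / freq_hz
    return math.ceil(math.e * math.pi * radius_m / wavelength)

def min_sensors(order):
    """Number of spherical harmonic coefficients up to the given order."""
    return (order + 1) ** 2

N = required_order(radius_m=0.09, freq_hz=20000.0)
M = min_sensors(N)
print(N, M)
```

The result depends noticeably on the assumed speed of sound, because e π R / λ lies close to an integer boundary for these parameters.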

Principle of Modal Beamforming

[Block diagram: the M microphone signals P(x, ω) pass through a spherical harmonics decomposition up to order N, giving the coefficients P̊_n^m(ω), followed by a plane wave decomposition that yields P̄(φ, θ, ω); both stages together form the modal beamformer]
- (dual) open/rigid spheres with pressure/cardioid microphones
- representation of captured sound field w.r.t. surface spherical harmonics
- plane wave decomposition from spherical harmonics expansion
- issues with numerical conditioning and complexity
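To illustrate what an order-limited plane wave decomposition does, the sketch below evaluates the ideal axisymmetric response to a single unit-amplitude plane wave, d(Θ) = Σ_{n=0}^{N} (2n + 1)/(4π) P_n(cos Θ), where Θ is the angle between the look direction and the plane wave direction. It assumes the idealized case only (spatially continuous, noise-free, radial terms perfectly compensated) and does not model a real array. The function name is illustrative.

```python
import numpy as np
from numpy.polynomial import legendre

def modal_beam_pattern(theta, order):
    """Ideal order-limited plane wave decomposition response.

    theta: angle in radians between look direction and plane wave direction
    order: maximum spherical harmonics order N
    """
    n = np.arange(order + 1)
    coeffs = (2 * n + 1) / (4 * np.pi)
    # legval evaluates sum_n coeffs[n] * P_n(x) for Legendre polynomials P_n.
    return legendre.legval(np.cos(theta), coeffs)

theta = np.linspace(0.0, np.pi, 181)
for N in (3, 5, 10, 23):
    d = modal_beam_pattern(theta, N)
    # Normalize to the on-axis value to compare main lobe widths.
    d_norm_db = 20 * np.log10(np.abs(d / d[0]) + 1e-12)
    first_below_3db = theta[np.argmax(d_norm_db < -3.0)]
    print(N, np.degrees(first_below_3db))
```

The main lobe narrows as N grows, which is the spatial bandwidth limitation discussed for modal beamforming on the following slides.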

Practical Limitations of Modal Beamforming

1. spatial sampling
   - repetitions of spatial spectrum (e.g. Gaussian sampling [Ahrens et al. 2012])
   - typically limitation of spatial bandwidth
2. equipment noise
   - differential operation for low frequencies
   - frequency-dependent white noise gain (WNG)
3. sensor and position mismatch

Example (N = 6, M = 84, uniform sampling) from [Rafaely, Analysis and Design of Spherical Microphone Arrays, 2005]

Principle of Delay-and-Sum Beamforming

[Block diagram: the M microphone signals P(x, ω) are delayed by τ0, τ1, ..., τM and summed to give P̄(φ, θ, ω)]
- arbitrary (free-field) sensor geometries
- delays compensate for the propagation time between microphones
- constructive interference by summation
- numerically efficient and stable
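The delay-and-sum principle can be sketched in the frequency domain, where each delay becomes a phase factor. The example below is an illustration under stated assumptions: omnidirectional sensors in the free field, an e^{jωt} time convention, c = 343 m/s, and a Fibonacci spiral as a stand-in sensor grid (the deck itself uses a Lebedev grid). All function names are hypothetical.

```python
import numpy as np

def fibonacci_sphere(num_points, radius):
    """Roughly uniform points on a sphere (illustrative grid, not Lebedev)."""
    i = np.arange(num_points) + 0.5
    polar = np.arccos(1 - 2 * i / num_points)
    azimuth = np.pi * (1 + 5 ** 0.5) * i
    return radius * np.column_stack((
        np.sin(polar) * np.cos(azimuth),
        np.sin(polar) * np.sin(azimuth),
        np.cos(polar),
    ))

def unit_vector(azimuth, polar):
    """Cartesian unit vector for the given azimuth and polar angle (rad)."""
    return np.array([
        np.sin(polar) * np.cos(azimuth),
        np.sin(polar) * np.sin(azimuth),
        np.cos(polar),
    ])

def plane_wave(positions, direction, k):
    """Pressure of a unit-amplitude plane wave propagating along `direction`."""
    return np.exp(-1j * k * positions @ direction)

def delay_and_sum(pressure, positions, look_direction, k):
    """Phase-align the sensor signals for `look_direction`, then average."""
    steering = np.exp(1j * k * positions @ look_direction)
    return np.mean(steering * pressure)

c, f = 343.0, 1000.0
k = 2 * np.pi * f / c
x = fibonacci_sphere(770, 0.5)
n_pw = unit_vector(0.0, np.pi / 2)

p = plane_wave(x, n_pw, k)
on_axis = delay_and_sum(p, x, n_pw, k)
off_axis = delay_and_sum(p, x, unit_vector(np.pi / 2, np.pi / 2), k)
print(abs(on_axis), abs(off_axis))
```

When the look direction matches the plane wave, all phase terms cancel and the sum is fully constructive. For other directions the phasors only partly align, so the output magnitude is smaller.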

Practical Aspects of Delay-and-Sum Beamforming

1. spatial sampling
   - leads to repetitions in spatial spectrum
   - no limitation of spatial bandwidth
2. equipment noise
   - no differential operation for low frequencies
   - constant white noise gain (WNG)
3. sensor and position mismatch

[Figure, two panels: white noise gain, WNG (dB), and directivity index, DI (dB), versus frequency (Hz), each for delay-and-sum and modal (N = 23) beamforming; R = 0.5 m, N = 23, M = 770] [Rafaely 2005]
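The constant white noise gain of delay-and-sum beamforming follows from the usual definition WNG = |wᴴd|² / (wᴴw). The sketch below assumes omnidirectional free-field sensors, so every entry of the steering vector d has unit magnitude, and uniform weights w = d / M. The random phases only stand in for an arbitrary frequency and look direction, and the function name is illustrative.

```python
import numpy as np

def white_noise_gain_db(weights, steering):
    """White noise gain |w^H d|^2 / (w^H w) in dB."""
    signal_gain = np.abs(np.vdot(weights, steering)) ** 2
    noise_gain = np.real(np.vdot(weights, weights))
    return 10 * np.log10(signal_gain / noise_gain)

M = 770
rng = np.random.default_rng(1)

# Unit-magnitude steering vector: arbitrary phases, as for any frequency
# and look direction with omnidirectional sensors in the free field.
d = np.exp(1j * rng.uniform(0.0, 2 * np.pi, M))
w = d / M  # delay-and-sum weights

wng_db = white_noise_gain_db(w, d)
print(wng_db, 10 * np.log10(M))
```

Under these assumptions the WNG equals 10 log10(M) regardless of the phases, which is why it does not vary with frequency.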

Experimental Conditions

[Block diagram: an incident plane wave (φpw, θpw) is analyzed by four alternatives, namely cont. (spatially continuous) delay-and-sum beamforming, cont. modal beamforming of order N, and delay-and-sum and modal beamforming (spherical harmonics decomposition to P̊_n^m(ω), then plane wave decomposition) from M microphone signals P(x, ω); the resulting P̄(φ, θ, ω) is filtered with the HRTFs H_L(φ, θ, γ, δ, ω) and H_R(φ, θ, γ, δ, ω)]
- unit-amplitude plane wave as incident sound field (φpw = 0°)
- analytic spatially continuous solutions [Rafaely 2005]
- numerical simulation without equipment noise
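For the spatially continuous delay-and-sum case, a closed form is easy to state if one assumes an open sphere of omnidirectional sensors: averaging the phase-aligned plane wave over the sphere gives sin(a)/a with a = 2kR sin(Θ/2), where Θ is the angle between look direction and plane wave direction. This is a sketch of that idealized case, not necessarily the exact expression used for the results in the deck. The function name and c = 343 m/s are assumptions.

```python
import numpy as np

def continuous_das_response(theta, k, radius):
    """Continuous delay-and-sum response of an open sphere to a plane wave.

    theta:  angle in radians between look direction and plane wave direction
    k:      wavenumber in rad/m
    radius: sphere radius in m
    """
    a = 2 * k * radius * np.sin(theta / 2)
    # np.sinc(x) is sin(pi x) / (pi x), so rescale the argument.
    return np.sinc(a / np.pi)

c, R = 343.0, 0.5
theta = np.radians(np.arange(0, 181, 30))
for f in (250.0, 1000.0, 4000.0):
    k = 2 * np.pi * f / c
    print(f, np.round(continuous_das_response(theta, k, R), 3))
```

The main lobe narrows with increasing kR, so the spatial resolution of this beamformer is frequency dependent even without spatial sampling.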

Experimental Parameters

- horizontal-plane HRTFs, 3 m distance [Wierstorf et al. 2011]
- sampling rate fS = 44.1 kHz
- radius of microphone array R = 0.5 m
- M = 770 microphones, maximum order N = 23
- omnidirectional/cardioid microphones for delay-and-sum/modal beamformer
- localization estimated by binaural model [Dietz et al. 2011]

Implementation
- Sound Field Analysis Toolbox (SOFiA) [Bernschütz et al. 2011]
- Auditory Modeling Toolbox (AMT) [Sondergaard et al. 2011]
- delay-and-sum beamforming requires 6 dB per octave high-pass filter
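Two quick checks on these parameters, as a sketch. The first uses the fact that order N involves (N + 1)² spherical harmonic coefficients, so M = 770 sensors exceed the minimum for N = 23. The second uses the common rule of thumb kR ≈ N to estimate up to which frequency order N covers an array of radius R. That rule and c = 343 m/s are assumptions that do not appear on the slide.

```python
import math

def min_sensors(order):
    """Number of spherical harmonic coefficients up to the given order."""
    return (order + 1) ** 2

def order_limit_frequency(order, radius_m, c=343.0):
    """Frequency at which k * R equals the order N (rule of thumb)."""
    return order * c / (2 * math.pi * radius_m)

N, M, R = 23, 770, 0.5
needed = min_sensors(N)
f_limit = order_limit_frequency(N, R)
print(needed, M >= needed)
print(round(f_limit))
```

With these assumptions the order limit lies well below the upper end of the audio band, which is consistent with the deck treating spatial bandwidth as a separate factor from sampling.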

Continuous Plane Wave Decomposition: Frequency Response

[Figure, two panels: Modal Beamforming (N = 23) and Delay-and-sum Beamforming; R = 0.5 m, fS = 44.1 kHz]

Continuous Plane Wave Decomposition: Temporal Response

[Figure, two panels: Modal Beamforming (N = 23) and Delay-and-sum Beamforming; level (dB) over azimuth φ (deg) and time (ms); R = 0.5 m, fS = 44.1 kHz]

Spatially Sampled Plane Wave Decomposition: Frequency Response

[Figure, two panels: Modal Beamforming (N = 23) and Delay-and-sum Beamforming; R = 0.5 m, fS = 44.1 kHz, M = 770, Lebedev grid]

Spatially Sampled Plane Wave Decomposition: Temporal Response

[Figure, two panels: Modal Beamforming (N = 23) and Delay-and-sum Beamforming; level (dB) over azimuth φ (deg) and time (ms); R = 0.5 m, fS = 44.1 kHz, M = 770, Lebedev grid]

Perceived Direction

[Figure, two panels: |Δ azimuth angle| versus azimuth angle, with the JND marked. Modal Beamforming: cont. N = 3, cont. N = 5, cont. N = 10, cont. N = 23, sampled. Delay-and-sum Beamforming: continuous, sampled. R = 0.5 m, fS = 44.1 kHz, M = 770, Lebedev grid]

Weighted Frequency Response

[Figure, two panels: Δ magnitude (dB) versus center frequency (Hz). Modal Beamforming: cont. N = 3, cont. N = 5, cont. N = 10, cont. N = 23, sampled. Delay-and-sum Beamforming: continuous, sampled. R = 0.5 m, fS = 44.1 kHz, M = 770, Lebedev grid]
- comparison of left-ear HRTF/BRTF for source in front
- weighting by auditory filter bank and loudness compression

Results from Informal Listening

Spatially continuous plane wave decomposition
- hardly any artifacts are audible for speech and music
- slight change in coloration for noise bursts
- perceived distance increases with increasing order N in modal beamforming

Spatially sampled plane wave decomposition
- slight localization errors and coloration for modal beamforming (N = 23)
- spatial artifacts for delay-and-sum beamforming

Listening examples can be downloaded from http://audio.qu.tu-berlin.de

Conclusions

- numerical implementation is critical
- modal beamforming seems to provide better results
- spatial artifacts of delay-and-sum beamforming not predicted by model
- final conclusion requires formal listening test incl. equipment noise
- similarities to (perceptual) properties of WFS and NFC-HOA

Future work
- improved delay-and-sum beamforming by selection of microphones
- results for complex sound fields incl. elevated sources

Thanks!
