Comparison of modal versus delay-and-sum beamforming in the context of data-based binaural synthesis

Talk given at the 132nd Convention of the Audio Engineering Society, presenting the paper:
S. Spors, H. Wierstorf, and M. Geier. Comparison of modal versus delay-and-sum beamforming in the context of data-based binaural synthesis. In 132nd AES Convention. Audio Engineering Society (AES), April 2012.

Sascha Spors

May 09, 2012

Transcript

  1. Comparison of Modal versus Delay-and-Sum Beamforming in the Context of Data-Based Binaural Synthesis
    Sascha Spors, Hagen Wierstorf and Matthias Geier
    Quality and Usability Lab, Deutsche Telekom Laboratories, Technische Universität Berlin
    132nd Convention of the Audio Engineering Society, Budapest, April 2012
  2. Dynamic Data-based Binaural Synthesis [Duraiswami et al. 2005, Li et al. 2006, Melchior et al. 2009, ...]
    Deck sections: Introduction, Beamforming Techniques, Results, Conclusions
    Capture the sound field and decompose it into plane waves (plane wave decomposition).
    Filter the plane waves with far-field head-related transfer functions (HRTFs).
    [Figure: a primary source radiates onto an array of M microphones in the x-y plane. The captured pressure P(x, ω) is decomposed into plane waves P̄(φ, θ, ω), which are filtered with the far-field HRTFs H̄_L(φ, θ, γ, δ, ω) and H̄_R(φ, θ, γ, δ, ω). Angle labels: (γ, δ) and (φ, θ).]
    Analysis by spherical microphone arrays is subject to practical limitations.
    The perceptual impact of these limitations has not been fully investigated.
    Slide footer: S. Spors, H. Wierstorf and M. Geier, Comparison of data-based binaural synthesis, 132nd AES
  3. Previous Studies
    Influence of spatial bandlimitation and sampling in modal beamforming:
      interaural cross correlation (ICC) [Rafaely et al. 2010]
      localization (ITD/ILD) [Avni et al. 2010]
      listening experiment incl. dual radius cardioid array [Melchior et al. 2009]
      localization/coloration w/o sampling [Spors et al. 2012]
      ...
    Aims of this paper:
      comparison of two beamforming techniques
      separate consideration of spatial bandwidth and sampling
      localization properties estimated by binaural model
  4. Spatial Dimensionality of Sound Fields
    Summary of the concept from [Kennedy et al. 2007]: a finite number of orthogonal components is sufficient to represent a multipath sound field with bounded error within a bounded source-free region.
    Quantitative results for a spherical region of radius R:
      spherical harmonics series expansion truncated to order N
      field truncation error ε_N decays exponentially with increasing N
      N increases linearly with frequency f and radius R for fixed ε
    Example for a human head (R = 9 cm, full audio bandwidth f = 20 kHz, scattering around the head neglected):
      ε < 67.8% for N > 45 ⇒ M ≥ 2116 sensors required
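The example numbers on this slide can be reproduced with a short calculation. The sketch below assumes the dimensionality rule N = ⌈πeR/λ⌉ attributed to Kennedy et al., a speed of sound of c = 343 m/s (the slide does not state c), and a minimum sensor count of M = (N + 1)² for order N. The function names are illustrative only.

```python
import math


def required_order(radius_m, freq_hz, c=343.0):
    """Spherical harmonics order from the assumed rule N = ceil(pi * e * R / lambda)."""
    wavelength = c / freq_hz  # c = 343 m/s is an assumption, not given on the slide
    return math.ceil(math.pi * math.e * radius_m / wavelength)


def min_sensors(order):
    """Number of spherical harmonics coefficients up to the given order: (N + 1)^2."""
    return (order + 1) ** 2


# Human head example from the slide: R = 9 cm, f = 20 kHz
order = required_order(0.09, 20000.0)
sensors = min_sensors(order)
print(order, sensors)  # prints "45 2116"
```

The result depends on the assumed speed of sound: a slightly lower value of c pushes the order to 46.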
  5. Principle of Modal Beamforming
    [Figure: modal beamforming signal flow. The M microphone signals P(x, ω) pass through a spherical harmonics decomposition up to order N, giving the coefficients P̊_n^m(ω), followed by a plane wave decomposition giving P̄(φ, θ, ω).]
    (dual) open/rigid spheres with pressure/cardioid microphones
    representation of the captured sound field w.r.t. surface spherical harmonics
    plane wave decomposition from the spherical harmonics expansion
    issues with numerical conditioning and complexity
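The effect of truncating the spherical harmonics expansion at order N can be illustrated with the plane wave decomposition of a single unit-amplitude plane wave. This is a minimal sketch that assumes an ideal, noise-free and spatially continuous array, so that the radial terms cancel exactly, and a normalization in which the order-limited response is Σ_{n=0}^{N} (2n + 1)/(4π) · P_n(cos Θ), with Θ the angle between look direction and incidence direction. Other normalizations differ by a constant factor. The function names are illustrative.

```python
import math


def legendre_up_to(order, x):
    """Legendre polynomials P_0(x) .. P_order(x) via the three-term recurrence."""
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


def pwd_response(order, angle_deg):
    """Order-limited plane wave decomposition of a unit plane wave (assumed normalization)."""
    x = math.cos(math.radians(angle_deg))
    p = legendre_up_to(order, x)
    return sum((2 * n + 1) / (4 * math.pi) * p[n] for n in range(order + 1))


# Main-lobe peak grows as (N + 1)^2 / (4 pi); higher orders give narrower beams
for order in (3, 5, 10, 23):
    peak = pwd_response(order, 0.0)
    side = pwd_response(order, 90.0)
    print(order, round(peak, 3), round(abs(side) / peak, 4))
```

In this idealized case the response is independent of frequency, which is why the spatial bandwidth N and the spatial sampling can be studied separately.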
  6. Practical Limitations of Modal Beamforming
    1. spatial sampling
      repetitions of the spatial spectrum (e.g. Gaussian sampling [Ahrens et al. 2012])
      typically handled by limiting the spatial bandwidth
    2. equipment noise
      differential operation for low frequencies
      frequency-dependent white noise gain (WNG)
    3. sensor and position mismatch
    Example (N = 6, M = 84, uniform sampling) from [Rafaely, Analysis and Design of Spherical Microphone Arrays, 2005]
  7. Principle of Delay-and-Sum Beamforming
    [Figure: delay-and-sum beamforming signal flow. The M microphone signals P(x, ω) are delayed by τ_0, τ_1, ..., τ_M and summed, giving P̄(φ, θ, ω).]
    arbitrary (free-field) sensor geometries
    delays compensate for the propagation time between microphones
    constructive interference by summation
    numerically efficient and stable
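The delay-and-sum principle can be sketched in the frequency domain, where each delay becomes a phase factor. The example below is an illustration under stated assumptions, not the implementation used for the talk: it assumes an open (free-field) sphere, c = 343 m/s, a Fibonacci lattice as a stand-in for the actual microphone grid, and a sign convention in which a plane wave arriving from unit direction n produces the pressure exp(i k n·x) at position x. All function names are hypothetical.

```python
import cmath
import math

C = 343.0  # speed of sound in m/s (assumed)


def fibonacci_sphere(num_points, radius):
    """Quasi-uniform points on a sphere; a stand-in for a real array grid."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(num_points):
        z = 1.0 - 2.0 * (i + 0.5) / num_points
        rho = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((radius * rho * math.cos(phi), radius * rho * math.sin(phi), radius * z))
    return points


def horizontal_direction(azimuth_deg):
    """Unit vector in the horizontal plane for a given azimuth."""
    phi = math.radians(azimuth_deg)
    return (math.cos(phi), math.sin(phi), 0.0)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def plane_wave_pressure(mics, direction, freq):
    """Pressure of a unit-amplitude plane wave at each microphone (assumed sign convention)."""
    k = 2.0 * math.pi * freq / C
    return [cmath.exp(1j * k * dot(direction, x)) for x in mics]


def delay_and_sum(pressures, mics, look_direction, freq):
    """Compensate the propagation phase towards the look direction, then average."""
    k = 2.0 * math.pi * freq / C
    total = sum(p * cmath.exp(-1j * k * dot(look_direction, x)) for p, x in zip(pressures, mics))
    return total / len(mics)


mics = fibonacci_sphere(770, 0.5)  # M = 770, R = 0.5 m as in the talk
freq = 1000.0
pressures = plane_wave_pressure(mics, horizontal_direction(0.0), freq)
on_axis = abs(delay_and_sum(pressures, mics, horizontal_direction(0.0), freq))
off_axis = abs(delay_and_sum(pressures, mics, horizontal_direction(90.0), freq))
print(on_axis, off_axis)
```

Steering at the incidence direction aligns all phases, so the average has unit magnitude, while other look directions sum incoherently and are attenuated.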
  8. Practical Aspects of Delay-and-Sum Beamforming
    1. spatial sampling
      leads to repetitions in the spatial spectrum
      no limitation of spatial bandwidth
    2. equipment noise
      no differential operation for low frequencies
      constant white noise gain (WNG)
    3. sensor and position mismatch
    [Figure: white noise gain (WNG, dB) and directivity index (DI, dB) versus frequency (Hz), each comparing delay-and-sum with modal (N = 23) beamforming; R = 0.5 m, N = 23, M = 770 [Rafaely 2005].]
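The constant white noise gain can be checked numerically. The sketch below assumes the common definition WNG = |wᴴd|² / (wᴴw) for weights w and steering vector d, unit-gain free-field microphones, and delay-and-sum weights w = d / M. The steering phases are arbitrary placeholders, since only their unit magnitude matters here.

```python
import cmath
import math


def white_noise_gain_db(weights, steering):
    """WNG = |w^H d|^2 / (w^H w) in dB (assumed definition)."""
    signal = abs(sum(w.conjugate() * d for w, d in zip(weights, steering))) ** 2
    noise = sum(abs(w) ** 2 for w in weights)
    return 10.0 * math.log10(signal / noise)


def delay_and_sum_weights(steering):
    """Pure phase compensation with uniform 1/M amplitude weighting."""
    num_mics = len(steering)
    return [d / num_mics for d in steering]


num_mics = 770
wng_values = []
for freq in (100.0, 1000.0, 10000.0):
    # Placeholder delays: any unit-magnitude steering vector gives the same WNG
    steering = [cmath.exp(2j * math.pi * freq * 1e-5 * i) for i in range(num_mics)]
    wng_values.append(white_noise_gain_db(delay_and_sum_weights(steering), steering))

print([round(v, 2) for v in wng_values])  # each value is 10 * log10(770), about 28.86 dB
```

Under these assumptions the WNG equals 10 log₁₀(M) at every frequency, in contrast to the frequency-dependent WNG of the modal beamformer.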
  9. Experimental Conditions
    [Figure: signal flow of the four conditions (continuous delay-and-sum, continuous modal, sampled delay-and-sum, sampled modal beamforming). A plane wave from (φ_pw, θ_pw) is captured as P(x, ω) by M microphones in the x-y plane, decomposed via spherical harmonics coefficients P̊_n^m(ω) of order N or directly into plane waves P̄(φ, θ, ω), and filtered with H_L(φ, θ, γ, δ, ω) and H_R(φ, θ, γ, δ, ω).]
    unit-amplitude plane wave as incident sound field (φ_pw = 0°)
    analytic spatially continuous solutions [Rafaely 2005]
    numerical simulation w/o equipment noise
  10. Experimental Parameters
    horizontal-plane HRTFs, 3 m distance [Wierstorf et al. 2011]
    sampling rate f_S = 44.1 kHz
    radius of microphone array R = 0.5 m
    M = 770 microphones, maximum order N = 23
    omnidirectional/cardioid microphones for delay-and-sum/modal beamformer
    localization estimated by binaural model [Dietz et al. 2011]
    Implementation:
      Sound Field Analysis Toolbox (SOFiA) [Bernschütz et al. 2011]
      Auditory Modeling Toolbox (AMT) [Søndergaard et al. 2011]
      delay-and-sum beamforming requires a 6 dB per octave high-pass filter
  11. Continuous Plane Wave Decomposition: Frequency Response
    [Figure: frequency response of the plane wave decomposition, panels "Modal Beamforming (N = 23)" and "Delay-and-sum Beamforming"; R = 0.5 m, f_S = 44.1 kHz.]
  12. Continuous Plane Wave Decomposition: Temporal Response
    [Figure: level (dB, −50 to 0) over angle φ (deg, −180 to 180) and time (ms), panels "Modal Beamforming (N = 23)" (−1 to 1 ms) and "Delay-and-sum Beamforming" (−3 to 3 ms); R = 0.5 m, f_S = 44.1 kHz.]
  13. Spatially Sampled Plane Wave Decomposition: Frequency Response
    [Figure: frequency response of the plane wave decomposition, panels "Modal Beamforming (N = 23)" and "Delay-and-sum Beamforming"; R = 0.5 m, f_S = 44.1 kHz, M = 770, Lebedev grid.]
  14. Spatially Sampled Plane Wave Decomposition: Temporal Response
    [Figure: level (dB, −50 to 0) over angle φ (deg, −180 to 180) and time (ms, −3 to 3), panels "Modal Beamforming (N = 23)" and "Delay-and-sum Beamforming"; R = 0.5 m, f_S = 44.1 kHz, M = 770, Lebedev grid.]
  15. Perceived Direction
    [Figure: |Δ azimuth angle| (0° to 20°) versus azimuth angle (−90° to 90°) with the JND marked. Panel "Modal Beamforming": curves for continuous N = 3, 5, 10, 23 and sampled. Panel "Delay-and-sum Beamforming": curves for continuous and sampled. R = 0.5 m, f_S = 44.1 kHz, M = 770, Lebedev grid.]
  16. Weighted Frequency Response
    [Figure: Δ magnitude (dB, −10 to 20) versus center frequency (Hz, 100 to 10000). Panel "Modal Beamforming": curves for continuous N = 3, 5, 10, 23 and sampled. Panel "Delay-and-sum Beamforming": curves for continuous and sampled. R = 0.5 m, f_S = 44.1 kHz, M = 770, Lebedev grid.]
    comparison of left-ear HRTF/BRTF for a source in front
    weighting by auditory filter bank and loudness compression
  17. Results from Informal Listening
    Spatially continuous plane wave decomposition:
      hardly any artifacts are audible for speech and music
      slight change in coloration for noise bursts
      perceived distance increases with increasing order N in modal beamforming
    Spatially sampled plane wave decomposition:
      slight localization errors and coloration for modal beamforming (N = 23)
      spatial artifacts for delay-and-sum beamforming
    Listening examples can be downloaded from http://audio.qu.tu-berlin.de
  18. Conclusions
    numerical implementation is critical
    modal beamforming seems to provide better results
    spatial artifacts of delay-and-sum beamforming are not predicted by the model
    final conclusion requires a formal listening test incl. equipment noise
    similarities to (perceptual) properties of WFS and NFC-HOA
    Future work:
      improved delay-and-sum beamforming by selection of microphones
      results for complex sound fields incl. elevated sources