Predicting speech release from masking through spatial separation in distance

Talk given at Forum Acusticum 2014 in Krakow, Poland.

Alexandre Chabot-Leclerc

September 12, 2014

Transcript

  1. Predicting speech release from masking through spatial separation in distance

    Alexandre Chabot-Leclerc and Torsten Dau, Center for Applied Hearing Research, Technical University of Denmark. September 12, 2014.
  2. On-axis spatial release from masking (Westermann et al., 2012)

    • CRM speech material (Bolia et al., 2000) • Speech masker, speech-modulated SSN • Presented over headphones • Compensation of room coloration • SNR measured at the ears
    [Figure: SRT (dB at the ears, −14 to 0) as a function of masker distance (0.5, 2, 5, 10 m) for dichotic speech, diotic speech, and dichotic noise.]
  3. Types of masking

    Energetic masking: affects audibility. Modulation masking. Informational masking: affects object formation.
  4. Long-term audibility-based binaural models

    Binaural speech intelligibility model (BSIM; Beutelmann et al. (2010) JASA). Lavandier and Culling (2010) (Jelfs et al. (2011)). [Diagram labels: BRIR (left), BRIR (right), band-pass filtering, better ear + binaural advantage, binaural benefit.]
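
The "better ear + binaural advantage" combination named on this slide can be sketched as follows. This is a minimal illustration that assumes the two terms add per frequency band and are then summed with band weights. The function name, its arguments, and the weights are hypothetical and do not reproduce the published BSIM or Jelfs et al. implementations.

```python
def binaural_benefit(snr_left_db, snr_right_db, advantage_db, weights):
    """Weighted sum over bands of better-ear SNR plus binaural advantage (dB).

    All arguments are per-band lists. The additive combination and the
    band weights are assumptions made for illustration only.
    """
    if not (len(snr_left_db) == len(snr_right_db)
            == len(advantage_db) == len(weights)):
        raise ValueError("all per-band lists must have the same length")
    total = 0.0
    for left, right, adv, w in zip(snr_left_db, snr_right_db,
                                   advantage_db, weights):
        better_ear = max(left, right)  # pick the ear with the higher SNR
        total += w * (better_ear + adv)
    return total
```

Because every input here is a long-term quantity, a sketch like this cannot react to changes in the temporal structure of the masker, only to level differences between the ears and across bands.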
  5. Long-term binaural models do not predict SRM

    [Figure: spatial release from masking (dB, −2 to 12) as a function of masker distance (0.5, 2, 5, 10 m). Curves: Data, Data (diotic), Jelfs, BSIM.]
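
Spatial release from masking is conventionally the improvement in SRT relative to a reference condition. The sketch below assumes the reference is the nearest masker distance. The helper name is hypothetical, and the numbers in the usage note are made-up placeholders, not data from the talk.

```python
def spatial_release(srt_by_distance, reference_distance):
    """Return SRM (dB) per masker distance: SRT(reference) - SRT(distance).

    srt_by_distance maps masker distance in metres to SRT in dB.
    A positive SRM means the listener tolerates a lower SNR than in the
    reference condition.
    """
    ref = srt_by_distance[reference_distance]
    return {d: ref - srt for d, srt in srt_by_distance.items()}
```

For example, with placeholder SRTs of −2 dB at 0.5 m and −5 dB at 2 m, `spatial_release({0.5: -2.0, 2.0: -5.0}, 0.5)` gives 0 dB at the reference and 3 dB of release at 2 m.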
  6. The sEPSM: the speech-based envelope power spectrum model (Jørgensen and Dau (2011) JASA)

    [Diagram: Speech + Noise and Noise inputs → audio-domain filtering and Hilbert envelope → temporal modulation filtering → SNRenv = (P_S+N − P_N) / P_N → integration across channels → ideal observer → predicted % correct as a function of input SNR (dB).]
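
The SNRenv stage of the diagram can be sketched in a few lines. The ratio follows the slide, SNRenv = (P_S+N − P_N) / P_N. The `floor` parameter and the root-sum-of-squares combination across channels are assumptions added for illustration, and the function names are hypothetical.

```python
import math


def snr_env(p_mix, p_noise, floor=1e-3):
    """SNRenv for one channel from envelope powers of mixture and noise.

    `floor` is a hypothetical lower limit that keeps the ratio positive
    and finite when the noise envelope power exceeds the mixture's.
    """
    p_noise = max(p_noise, floor)
    p_speech = max(p_mix - p_noise, floor)  # estimate of speech envelope power
    return p_speech / p_noise


def combine_channels(snrs):
    """Combine per-channel SNRenv values (assumed rule: root sum of squares)."""
    return math.sqrt(sum(s * s for s in snrs))
```

The key design point is that the noise-alone envelope power serves both as the term subtracted from the mixture and as the denominator, so a masker with strong envelope fluctuations lowers SNRenv even when its long-term energy is unchanged.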
  7. The sEPSM predicts almost all the monaural SRM

    [Figure: spatial release from masking (dB, −2 to 12) as a function of masker distance (0.5, 2, 5, 10 m). Curves: Data, Data (diotic), Jelfs, BSIM, sEPSM.]
  8. The multi-resolution sEPSM (mr-sEPSM)

    Short-term calculation of the SNR in the modulation domain. Jørgensen and Dau (2013) JASA
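
A short-term version of the envelope-domain SNR can be sketched by computing the ratio in consecutive segments and averaging. This is only an illustration of the idea on the slide: the fixed segment length, the `floor` value, and the plain average across segments are assumptions, and the function names are hypothetical.

```python
def segment_power(x, seg_len):
    """Mean-square power of consecutive non-overlapping segments of x."""
    return [
        sum(v * v for v in x[i:i + seg_len]) / seg_len
        for i in range(0, len(x) - seg_len + 1, seg_len)
    ]


def short_term_snr_env(env_mix, env_noise, seg_len, floor=1e-6):
    """Average of per-segment (P_S+N - P_N) / P_N.

    env_mix and env_noise are modulation-filtered envelopes (lists of
    samples) of the mixture and of the noise alone, for one channel.
    """
    p_mix = segment_power(env_mix, seg_len)
    p_noise = segment_power(env_noise, seg_len)
    ratios = [
        max(pm - pn, floor) / max(pn, floor)
        for pm, pn in zip(p_mix, p_noise)
    ]
    return sum(ratios) / len(ratios)
```

Segments in which the noise envelope power momentarily dips contribute a high ratio, which is how a short-term calculation can reward fluctuating maskers in a way a long-term ratio cannot.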
  9. The mr-sEPSM does not predict improved intelligibility

    [Figure: spatial release from masking (dB, −2 to 12) as a function of masker distance (0.5, 2, 5, 10 m). Curves: Data, Data (diotic), Jelfs, BSIM, sEPSM, mr-sEPSM.]
  10. The mr-sEPSM fails because of the smeared maskers

    [Figure: spatial release from masking (dB) as a function of masker distance (m). Curves: Data, sEPSM (MM), ESII (EM), mr-sEPSM. Annotations: "Informational masking?" and "More steady-state".]
  11. SRM due to separation in distance is dominated by informational masking

    • The dominant factor seems to be release from informational masking, due to easier segregation.
    • Release from long-term modulation masking accounts for a large portion of the SRM.
    • … but is counteracted by increased masking from the maskers becoming more steady-state.
    • Release from long-term energetic masking does not contribute to SRM.
    • Long-term binaural processing does not provide any SRM.
  12. Thank you

    This research was supported in part by: • The Natural Sciences and Engineering Research Council of Canada (NSERC) • Phonak, and • The Technical University of Denmark