Predicting speech release from masking through spatial separation in distance

Talk given at Forum Acusticum 2014 in Krakow, Poland.

Alexandre Chabot-Leclerc

September 12, 2014

Transcript

  1. Predicting speech release from masking through spatial separation in distance

    Alexandre Chabot-Leclerc, Torsten Dau. Center for Applied Hearing Research, Technical University of Denmark. September 12, 2014
  2. On-axis spatial release from masking, Westermann et al. (2012)

    • CRM speech material (Bolia et al., 2000)
    • Speech masker, speech-modulated SSN
    • Presented over headphones
    • Compensation of room coloration
    • SNR measured at the ears
    [Figure: SRT [dB at the ears] versus maskers distance [m] (0.5, 2, 5, 10); conditions: dichotic speech, diotic speech, dichotic noise]
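Spatial release from masking (SRM) is the drop in speech reception threshold (SRT) relative to a reference condition. The following is a minimal Python sketch of that bookkeeping. It assumes the reference is the nearest masker distance (0.5 m), and the SRT values are made-up placeholders, not the data shown in the talk; the function name `spatial_release` is hypothetical.

```python
def spatial_release(srt_by_distance, reference_distance=0.5):
    """Return SRM in dB per masker distance.

    SRM is taken as SRT(reference) - SRT(distance), so a lower (better)
    SRT at a given distance yields a positive release from masking.
    """
    reference_srt = srt_by_distance[reference_distance]
    return {d: reference_srt - srt for d, srt in srt_by_distance.items()}


# Hypothetical SRTs in dB (NOT the measured data), keyed by masker distance in m.
hypothetical_srt = {0.5: -2.0, 2: -6.0, 5: -9.0, 10: -11.0}

srm = spatial_release(hypothetical_srt)
for distance in sorted(srm):
    print(f"maskers at {distance} m: SRM = {srm[distance]:.1f} dB")
```

With this convention the reference distance always has an SRM of 0 dB, and positive values mean intelligibility improved as the maskers moved away.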
  3. Types of masking

    • Energetic masking: affects audibility
    • Modulation masking
    • Informational masking: affects object formation
  4. Long-term audibility-based binaural models

    • Binaural speech intelligibility model (BSIM) (Beutelmann et al. (2010) JASA)
    • Lavandier and Culling (2010) (Jelfs et al. (2011))
    [Diagram labels: BRIR (left), BRIR (right); band-pass filtering; better ear + binaural advantage; binaural benefit]
  5. Long-term binaural models do not predict SRM

    [Figure: spatial release from masking (dB) versus maskers distance [m] (0.5, 2, 5, 10); curves: Data, Data (diotic), Jelfs, BSIM]
  6. The sEPSM: the speech-based envelope power spectrum model, Jørgensen and Dau (2011) JASA

    [Diagram: inputs "Speech + Noise" and "Noise" → audio-domain filtering and Hilbert envelope → temporal modulation filtering → SNRenv → integration across channels → ideal observer → predicted % correct as a function of input SNR [dB]]
    $\mathrm{SNR}_{\mathrm{env}} = \dfrac{P_{S+N} - P_N}{P_N}$
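The SNRenv stage of the sEPSM compares the envelope power of the noisy speech, P_{S+N}, with that of the noise alone, P_N. The sketch below shows that ratio for a few channels in plain Python. Several things are assumptions for illustration only: the envelope powers are hypothetical numbers, the small `floor` that keeps the ratio positive is a placeholder value, and the across-channel integration is assumed to be a root-sum-of-squares. The audio and modulation filterbanks and the ideal observer are left out.

```python
import math


def snr_env(p_mix, p_noise, floor=1e-3):
    """SNRenv = (P_{S+N} - P_N) / P_N for a single channel.

    `floor` (assumed value) prevents a zero or negative result when the
    mixture envelope power does not exceed the noise envelope power.
    """
    p_noise = max(p_noise, floor)
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / p_noise


def integrate_channels(snr_values):
    """Combine per-channel SNRenv values (root-sum-of-squares, assumed)."""
    return math.sqrt(sum(s * s for s in snr_values))


# Hypothetical envelope powers for three channels.
p_mix = [0.50, 0.30, 0.12]    # speech + noise
p_noise = [0.10, 0.10, 0.10]  # noise alone

per_channel = [snr_env(m, n) for m, n in zip(p_mix, p_noise)]
overall = integrate_channels(per_channel)
overall_db = 10 * math.log10(overall)

print("per-channel SNRenv:", [round(s, 3) for s in per_channel])
print(f"combined SNRenv: {overall_db:.2f} dB")
```

In the full model, the combined SNRenv would then be passed to the ideal observer to obtain a predicted percent correct. That mapping has fitted parameters and is not reproduced here.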
  7. The sEPSM predicts almost all the monaural SRM

    [Figure: spatial release from masking (dB) versus maskers distance [m] (0.5, 2, 5, 10); curves: Data, Data (diotic), Jelfs, BSIM, sEPSM]
  8. The multi-resolution sEPSM (mr-sEPSM)

    Short-term calculation of the SNR in the modulation domain. Jørgensen and Dau (2013) JASA
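To illustrate what a short-term calculation changes, the sketch below compares a windowed SNRenv with a long-term one in plain Python. It is a toy under stated assumptions, not the mr-sEPSM itself: the per-window envelope powers are hypothetical, speech and noise envelope powers are assumed to add, the windows have equal length, and the short-term values are assumed to be combined by a plain average.

```python
def snr_env(p_mix, p_noise, floor=1e-3):
    """SNRenv = (P_{S+N} - P_N) / P_N, with an assumed small floor."""
    p_noise = max(p_noise, floor)
    return max(p_mix - p_noise, floor) / p_noise


def mean(values):
    return sum(values) / len(values)


def short_term_snr_env(p_mix_windows, p_noise_windows):
    """Average of SNRenv computed separately in each time window."""
    return mean([snr_env(m, n) for m, n in zip(p_mix_windows, p_noise_windows)])


def long_term_snr_env(p_mix_windows, p_noise_windows):
    """SNRenv computed once from powers averaged over the whole signal."""
    return snr_env(mean(p_mix_windows), mean(p_noise_windows))


speech_power = 0.1  # hypothetical speech envelope power in every window

# Two hypothetical maskers with the same average envelope power (0.225).
fluctuating_noise = [0.4, 0.05, 0.4, 0.05]  # alternating peaks and dips
steady_noise = [0.225, 0.225, 0.225, 0.225]

for name, noise in [("fluctuating", fluctuating_noise), ("steady", steady_noise)]:
    mix = [n + speech_power for n in noise]
    print(name,
          "short-term:", round(short_term_snr_env(mix, noise), 3),
          "long-term:", round(long_term_snr_env(mix, noise), 3))
```

In this toy case the short-term value exceeds the long-term one only for the fluctuating masker, because the windows that fall in masker dips have a high SNRenv. For the steady masker the two calculations coincide.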
  9. The mr-sEPSM does not predict improved intelligibility

    [Figure: spatial release from masking (dB) versus maskers distance [m] (0.5, 2, 5, 10); curves: Data, Data (diotic), Jelfs, BSIM, sEPSM, mr-sEPSM]
  10. The mr-sEPSM fails because of the smeared maskers

    [Figure: spatial release from masking [dB] versus masker distance [m]; curves: Data, sEPSM (MM), ESII (EM), mr-sEPSM; annotations: "Informational masking?", "More steady-state"]
  11. SRM due to separation in distance is dominated by informational masking

    • The dominant factor seems to be release from informational masking, due to easier segregation.
    • Release from long-term modulation masking accounts for a large portion of the SRM,
    • … but is counteracted by increased masking from the maskers becoming more steady-state.
    • Release from long-term energetic masking does not contribute to SRM.
    • Long-term binaural processing does not provide any SRM.
  12. Thank you

    This research was supported in part by:
    • The Natural Sciences and Engineering Research Council of Canada (NSERC)
    • Phonak, and
    • The Technical University of Denmark