Revision of a binaural model predicting speech intelligibility against envelope-modulated noise interferers
Collin and Lavandier (2013) developed a preliminary model that predicts binaural speech intelligibility against non-stationary noise interferers in rooms for normal-hearing listeners. The model has four parameters (frequency resolution, SNR ceiling, and the temporal resolutions of the better-ear listening and binaural unmasking components), whose influence had not been thoroughly tested. The aim of the present work was to conduct a parametric study of these parameters, based on a sensitivity analysis, and to optimize their values using several experiments whose conditions critically test the model.
This study used data from five experiments: four from the literature and one conducted during this work. The data from the literature were taken from the two experiments of Culling and Mansell (2013) and from two of the experiments of Collin and Lavandier (2013, experiments 1 and 4). Culling and Mansell used noise interferers whose envelope was artificially modulated by a square wave at different modulation rates. Their stimuli isolate the two components of spatial unmasking, which makes it possible to test the temporal resolution of each component separately. Collin and Lavandier used noise interferers with speech-modulated envelopes. The first of these experiments tests the model predictions when reverberation fills in the interferer gaps, for different modulation depths. The second tests the better-ear glimpsing component of the model for speech modulations in the interferer. The additional experiment conducted during this study further tested the influence of reverberation by varying the modulation depth of speech-modulated noise interferers, in binaural conditions involving better-ear glimpsing and binaural unmasking both in combination and in isolation.
The sensitivity analysis quantifies the impact of each model parameter on the predictions for the five experiments. It also highlights potential interactions between these parameters. The main criterion used to evaluate model performance was the mean absolute error between data and prediction. The maximum absolute error and the correlation between data and prediction were also considered. The parameter values were revised to describe the different perceptual effects involved across the five experiments. Finally, the revised model will be tested on data not used to define its parameters (Ewert et al., 2017). This data set is of interest because it involves speech-like maskers and maskers based on stationary speech-shaped noise, while isolating the two components of spatial unmasking.
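As an illustration of the three performance criteria named above (mean absolute error, maximum absolute error, and correlation between data and prediction), the following is a minimal Python sketch. It is not the authors' implementation: the function name `evaluate_predictions`, the use of the Pearson correlation coefficient, and the numerical values are assumptions made for the example only.

```python
import math


def evaluate_predictions(data, prediction):
    """Compare measured data with model predictions (hypothetical helper).

    Returns the mean absolute error, the maximum absolute error, and the
    Pearson correlation coefficient (assumed here; the abstract only says
    "correlation").
    """
    if len(data) != len(prediction) or len(data) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")

    n = len(data)
    abs_err = [abs(d - p) for d, p in zip(data, prediction)]

    # Pearson correlation computed from deviations around the means.
    mean_d = sum(data) / n
    mean_p = sum(prediction) / n
    cov = sum((d - mean_d) * (p - mean_p) for d, p in zip(data, prediction))
    var_d = sum((d - mean_d) ** 2 for d in data)
    var_p = sum((p - mean_p) ** 2 for p in prediction)

    return {
        "mean_abs_error": sum(abs_err) / n,
        "max_abs_error": max(abs_err),
        "correlation": cov / math.sqrt(var_d * var_p),
    }


# Made-up values in arbitrary units, one entry per tested condition.
measured = [-6.0, -8.5, -10.0, -12.5]
predicted = [-6.5, -8.0, -10.5, -11.5]

scores = evaluate_predictions(measured, predicted)
print(scores)
```

In a parametric study of this kind, such a function could be called once per candidate set of parameter values, so that the criteria can be compared across the parameter space.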