10th Speech in Noise Workshop, 11-12 January 2018, Glasgow

Periodicity is not a factor in making harmonic complexes less effective maskers of speech than noise

Stuart Rosen(a), Kurt Steinmetzger(b)

David M Perry
Ear Institute, UCL, UK

(a) Presenting
(b) Attending

Harmonic tone complexes, whether dynamic or static, are much less effective maskers than a noise with the same overall spectral envelope, but the reasons for this are not yet clear. One possibility is that the periodicity of the complex allows it to be more effectively segregated or ‘cancelled out’ than an aperiodic noise. Alternatively, the masking of modulations in the speech by modulations in the masker may be important, since the modulation spectra of noises and harmonic complexes are very different.

We compared the relative masking effectiveness of static and dynamic complexes in which the discrete spectral components formed either a harmonic or an inharmonic series. The harmonic dynamic complexes had continuous modulations in fundamental frequency (F0) modelled on genuine F0 contours found in connected speech. Static complexes varied in F0 from trial to trial to match the distribution of F0s in the dynamic ones. In addition, the median F0 of the masker was set to be low (≈ 100 Hz), medium (≈ 150 Hz) or high (≈ 225 Hz) relative to the target sentences, which were spoken by a relatively high-pitched male speaker (F0 ≈ 150 Hz).

Inharmonic complexes were created in two ways: either by shifting all the components in the harmonic series up or down by 25% of the median F0, or by spectrally rotating the harmonic complexes around a centre frequency near 2 kHz. For static contours, these two methods are equivalent for an appropriate choice of parameters. For dynamic contours, the actual shift in component frequencies changes throughout the stimulus, sometimes more and sometimes less than 25%, but the resulting sound is typically inharmonic. Crucially, the modulation spectra of all three variants of the static complexes (harmonic, frequency-shifted and spectrally rotated) are essentially identical, but are very different for the frequency-shifted and spectrally-rotated dynamic complexes.
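The equivalence of the two methods for static contours can be illustrated with a short sketch of the component frequencies alone. This is not the stimulus-generation code used in the study: the function names, the choice of 26 components, and the exact centre frequency of 2006.25 Hz are assumptions made for illustration, picked so that twice the centre frequency equals 26.75 times an F0 of 150 Hz. Rotation is assumed here to mean mirroring each component about the centre frequency. The last function reflects one reading of the dynamic case, in which the offset from the harmonic grid, expressed as a fraction of the instantaneous F0, drifts as F0 changes.

```python
def harmonic_series(f0, n_components):
    """Component frequencies k * f0 for k = 1..n_components (Hz)."""
    return [k * f0 for k in range(1, n_components + 1)]


def frequency_shifted(components, f0_median, fraction):
    """Shift every component by a fixed fraction of the median F0.

    A negative fraction shifts the components down.
    """
    return sorted(f + fraction * f0_median for f in components)


def spectrally_rotated(components, centre_hz):
    """Mirror each component about centre_hz, i.e. f -> 2 * centre_hz - f."""
    return sorted(2 * centre_hz - f for f in components)


def rotation_offset_fraction(f0_inst, centre_hz):
    """Offset of rotated components above the harmonic grid of f0_inst,
    as a fraction of f0_inst (0.75 is the same as 25% below a harmonic)."""
    return ((2 * centre_hz) % f0_inst) / f0_inst


# Illustrative (assumed) parameters, not taken from the study.
F0 = 150.0          # static masker F0 in Hz
N = 26              # number of components kept below 2 * CENTRE
CENTRE = 2006.25    # rotation centre in Hz; 2 * CENTRE = 26.75 * F0

harmonic = harmonic_series(F0, N)
shifted_down = frequency_shifted(harmonic, F0, -0.25)
rotated = spectrally_rotated(harmonic, CENTRE)

# For a static F0 the two inharmonic variants share the same components.
print(shifted_down == rotated)

# For a dynamic contour the offset fraction changes with instantaneous F0.
for f0_inst in (140.0, 150.0, 160.0):
    print(f0_inst, rotation_offset_fraction(f0_inst, CENTRE))
```

With these parameters both inharmonic variants contain components at 112.5, 262.5, ... Hz, which is why the static versions can be matched exactly, whereas a time-varying F0 makes the rotated components drift relative to the harmonic grid.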

Speech Reception Thresholds (SRTs), determined adaptively, revealed three main findings. First, SRTs tended to decrease with increasing masker F0. Second, SRTs were generally worse (higher) for dynamic than for static contours. Third, inharmonic and harmonic complexes led to similar SRTs, except for the rotated dynamic complexes, which were more effective maskers than the other types, especially at the two higher masker F0s. It thus appears that an adequate theory for explaining the difference in masking effectiveness between harmonic complexes and noises must consider the role of modulations.

Funding — This work was supported in part by the Medical Research Council, UK (Grant Number G1001255).

Last modified 2017-11-17 15:56:08