10th Speech in Noise Workshop, 11-12 January 2018, Glasgow

Simulating hearing loss in neural networks: Does pre-training on intact speech boost performance on degraded input?

Robert Grimm(a), Michèle Pettinato
Computational Linguistics & Psycholinguistics Research Center, University of Antwerp, Belgium

(a) Presenting

We present results from machine learning experiments designed to mimic conditions in human listeners with cochlear implants (CIs). Postlingually deaf (PD) individuals, who receive CIs after a period of normal hearing, often perform better on hearing-related tasks than congenitally deaf (CD) individuals, who are born deaf and receive CIs after an early period of auditory deprivation. We consider two possible reasons for this.

(1) CD individuals might perform worse than PD individuals as a result of early auditory deprivation: by the time a CI is implanted, the brain may have lost the plasticity necessary for full recovery.

(2) Alternatively, PD individuals, unlike CD individuals, might learn during their period of normal hearing to differentiate fine-grained speech structure that cannot be acquired from CI-delivered signals. This knowledge of fine-grained structure might then boost hearing performance post-implantation, giving PD individuals an advantage relative to CD individuals.

To evaluate the two possibilities, we train neural networks on (a) normal speech and (b) vocoded speech that is modified to simulate the input received by people with CIs. We then compare the performance of two networks: a CD network, which is trained only on vocoded speech; and a PD network, which is pre-trained on normal speech and then trained on vocoded speech. We find that the PD network retains a sensitivity to fine-grained spectral differences that is absent in the CD network, even though this does not lead to the PD network outperforming the CD network.
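To make the notion of vocoded speech concrete, the following is a minimal sketch of a noise vocoder, a common way of simulating CI input. It is not the processing used in these experiments, which the abstract does not specify. The function name noise_vocode, the channel count, the band edges, the log spacing of the bands, and the envelope smoothing are all assumptions chosen for illustration.

```python
import numpy as np


def noise_vocode(signal, sample_rate, n_channels=8,
                 f_lo=100.0, f_hi=4000.0, env_cutoff_hz=50.0, seed=0):
    """Illustrative noise vocoder (all parameter defaults are assumptions).

    Keeps the slow amplitude envelope in each frequency band and replaces
    the fine spectral structure with band-limited noise.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)

    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    spec_signal = np.fft.rfft(signal)
    spec_noise = np.fft.rfft(noise)

    # Log-spaced band edges: a simplifying assumption, not a claim about
    # any particular implant's frequency allocation.
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)

    # A moving average acts as a crude low-pass filter on the envelope.
    win = min(n, max(1, int(sample_rate / env_cutoff_hz)))
    kernel = np.ones(win) / win

    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spec_signal * mask, n=n)    # band-limited speech
        carrier = np.fft.irfft(spec_noise * mask, n=n)  # band-limited noise
        envelope = np.convolve(np.abs(band), kernel, mode="same")
        out += envelope * carrier

    # Rescale so the output has the same RMS level as the input.
    rms_in = np.sqrt(np.mean(signal ** 2))
    rms_out = np.sqrt(np.mean(out ** 2))
    if rms_out > 0:
        out *= rms_in / rms_out
    return out
```

Under this sketch, the CD network would see only noise_vocode(x, sample_rate) for each training utterance x, while the PD network would first see x itself and only later the vocoded version.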

However, to transition from intact to vocoded speech, the PD network requires only minor adjustments to its internal connectivity, which affords rapid adaptation to vocoded speech. The CD network, in contrast, cannot rely on previous knowledge gained from intact speech and has to start learning from scratch. Thus, if we severely restrict the learning capacity of both networks once they are exposed to vocoded speech, the PD network reaches peak performance in a fraction of the time it takes for learning to plateau in the CD network.

This suggests that PD individuals outperform CD individuals because the manner in which they process intact speech only requires minor modifications in order to generalize to CI-delivered speech. As a result, they can rapidly adapt to CIs. CD individuals, once implanted, need to develop speech processing capacities from scratch; and unless they are implanted within the first months of life, the brain’s reduced plasticity leads to reduced performance relative to PD individuals.

Last modified 2017-11-17 15:56:08