10th Speech in Noise Workshop, 11-12 January 2018, Glasgow

Data-driven discovery of general mechanisms of cortical processing of natural sounds

Moritz J. Boos(a)
University of Oldenburg, Applied Neurocognitive Psychology Lab, Germany

Jochem W. Rieger
Applied Neurocognitive Psychology Lab, Carl-von-Ossietzky University Oldenburg, Germany

Jörg Lücke
Machine Learning Group, Carl-von-Ossietzky University Oldenburg, Germany

(a) Presenting

In humans, brain activity can be predicted from a given stimulus representation by combining fMRI with voxel-wise encoding models. Yet the choice of stimulus representation can limit the interpretability of the encoding model. In the auditory domain, an efficient coding of natural sounds - a sparsity constraint on their representation - accounts for tuning properties of auditory nerve fibers (Lewicki, 2002). We aim to join these two approaches - unsupervised learning and voxel-wise encoding models in fMRI - to find general principles underlying the cortical functional organization of auditory processing. For 15 participants from an open 7-Tesla fMRI dataset (Hanke et al., 2014), we predict the BOLD activity elicited by an auditory movie. We learn a representation of the Mel-frequency spectrogram of the auditory movie using sparse coding with binary latents (BSC) (Lücke and Eggert, 2010; Henniges et al., 2010). We then fit voxel-wise encoding models on this learned representation and decompose the encoding model into latent dimensions using principal component analysis. This reveals three dimensions that generalize across participants. The first dimension is centered on primary auditory cortex and correlates strongly with stimulus energy (r=.78). The other two dimensions are located more laterally, ventrally, and anteriorly along the superior temporal sulcus. While both are sensitive to the varying auditory complexity of the stimulus, the second component relates to the speech signal-to-noise ratio (SNR) in the auditory movie. Using fMRI activity represented in this three-dimensional latent space, we reconstruct ratings of the speech SNR of the stimulus for unseen participants and unseen parts of the movie (r=.8). In conclusion, unsupervised learning combined with a data-driven decomposition of fMRI activity reveals general mechanisms underlying auditory processing in human temporal cortices.
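
A rough sketch of this pipeline, assuming a librosa/scikit-learn workflow: the file names, sampling parameters, and the use of a generic sparse dictionary model in place of the authors' Binary Sparse Coding, as well as ridge regression for the encoding model, are illustrative assumptions rather than the actual implementation.

# Hedged sketch: Mel spectrogram -> sparse latent features -> voxel-wise
# encoding model -> PCA of encoding weights. File names and hyperparameters
# are placeholders, not the authors' settings.
import numpy as np
import librosa
from sklearn.decomposition import MiniBatchDictionaryLearning, PCA
from sklearn.linear_model import RidgeCV

# 1) Mel-frequency spectrogram of the audio-movie soundtrack (assumed parameters).
audio, sr = librosa.load("audio_movie.wav", sr=16000)
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=48, hop_length=160)
X_spec = librosa.power_to_db(mel).T            # time x frequency

# 2) Learn a sparse representation of the spectrogram. The study uses Binary
#    Sparse Coding (Lücke and Eggert, 2010; Henniges et al., 2010); a generic
#    sparse dictionary model stands in for it here.
dico = MiniBatchDictionaryLearning(n_components=60, alpha=1.0, random_state=0)
S = dico.fit_transform(X_spec)                 # time x latent features

# 3) Voxel-wise encoding model: predict BOLD activity from the latent features.
#    `bold` is a (time x voxels) array assumed to be already lagged and
#    resampled to the fMRI acquisition grid (placeholder file).
bold = np.load("bold_timeseries.npy")
n_t = min(len(S), len(bold))
enc = RidgeCV(alphas=np.logspace(-2, 4, 13))
enc.fit(S[:n_t], bold[:n_t])
W = enc.coef_                                  # voxels x latent features

# 4) Decompose the encoding weights into a few latent dimensions that can be
#    compared across participants; each column of `voxel_scores` is a spatial map.
pca = PCA(n_components=3)
voxel_scores = pca.fit_transform(W)            # voxels x 3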

References
Hanke, Michael, Florian J. Baumgartner, Pierre Ibe, Falko R. Kaule, Stefan Pollmann, Oliver Speck, Wolf Zinke, and Jörg Stadler (2014). A high-resolution 7-Tesla fMRI dataset from complex natural stimulation with an audio movie. In: Scientific Data 1.
Henniges, Marc, Gervasio Puertas, Jörg Bornschein, Julian Eggert, and Jörg Lücke (2010). Binary sparse coding. In: Latent Variable Analysis and Signal Separation (LVA/ICA 2010), pp. 450–457.
Lewicki, Michael S. (2002). Efficient coding of natural sounds. In: Nature Neuroscience 5.4, pp. 356–363.
Lücke, Jörg and Julian Eggert (2010). Expectation truncation and the benefits of preselection in training generative models. In: Journal of Machine Learning Research 11, pp. 2855–2900.
