Cross-modal emotion recognition: How similar are patterns between DNNs and human fMRI data?
Proceedings of the ICLR2021 Workshop on How Can Findings About The Brain Improve AI Systems? (Brain2AI@ICLR2021),
Number 18, May 2021
Deep neural networks (DNNs) have reached human-like performance in many
perceptual classification tasks including emotion recognition from human faces
and voices. Can the patterns in layers of DNNs inform us about the processes
in the human brain and vice versa? Here, we set out to address this question
for cross-modal emotion recognition. We obtained functional magnetic resonance
imaging (fMRI) data from 43 human participants presented with 72 audio-visual
stimuli of actors/actresses depicting six different emotions. The same stimuli were
classified with high accuracy by our pre-trained DNNs built according to a cross-channel convolutional architecture. We used supervised learning to classify four
properties of the audio-visual stimuli: the depicted emotion, the identity of the
actors/actresses, their gender, and the spoken sentence. Inspired by recent studies using representational similarity analyses (RSA) for uni-modal stimuli, we
assessed the similarities between the layers of the DNN and the fMRI data. As
hypothesized, we identified gradients in pattern similarities along the different
layers of the auditory, visual, and cross-modal channels of the DNNs. These
gradients in similarities varied in expected ways between the four classification
regimes: Overall, the DNNs relied more on the visual arm. For classifying spoken sentences, the DNN relied more on the auditory arm. Crucially, we found
similarities between the different layers of the DNNs and the fMRI data in searchlight analyses. These pattern similarities varied along the brain regions involved
in processing auditory, visual, and cross-modal stimuli. In sum, our findings highlight that emotion recognition from cross-modal stimuli elicits similar patterns in
DNNs and neural signals. As a next step, we aim to assess how these patterns
differ, which may open avenues for improving DNNs by incorporating patterns
derived from the processing of cross-modal stimuli in the human brain.
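The representational similarity analysis (RSA) described above can be illustrated with a minimal sketch: build a representational dissimilarity matrix (RDM) for a DNN layer and for an fMRI voxel pattern over the same stimuli, then correlate the two RDMs. The function names, array shapes, and data below are hypothetical and for illustration only; they are not taken from the paper's code.

```python
# Minimal RSA sketch (illustrative only): compare a DNN layer's stimulus
# representations with fMRI response patterns for the same stimuli.
# Assumes both are available as (n_stimuli, n_features) numpy arrays;
# names and shapes are hypothetical, not taken from the paper.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix as a condensed vector
    (1 - Pearson correlation between all stimulus pairs)."""
    return pdist(patterns, metric="correlation")

def rsa_similarity(dnn_layer: np.ndarray, fmri_patterns: np.ndarray) -> float:
    """Spearman correlation between two RDMs; higher values indicate more
    similar representational geometry."""
    rho, _ = spearmanr(rdm(dnn_layer), rdm(fmri_patterns))
    return rho

# Hypothetical example: 72 audio-visual stimuli, a DNN layer with 512 units,
# and an fMRI searchlight containing 123 voxels.
rng = np.random.default_rng(0)
layer_activations = rng.normal(size=(72, 512))
voxel_patterns = rng.normal(size=(72, 123))
print(rsa_similarity(layer_activations, voxel_patterns))
```

In a searchlight analysis, this comparison would be repeated for each local sphere of voxels and each DNN layer, yielding the layer-wise similarity gradients reported in the abstract.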
@InProceedings{KRGKBHW21,
  author    = {Korn, Christoph and Redzepovic, Sasa and Gläscher, Jan and Kerzel, Matthias and Barros, Pablo and Heinrich, Stefan and Wermter, Stefan},
  title     = {Cross-modal emotion recognition: How similar are patterns between DNNs and human fMRI data?},
  booktitle = {Proceedings of the ICLR2021 Workshop on How Can Findings About The Brain Improve AI Systems? (Brain2AI@ICLR2021)},
  number    = {18},
  month     = {May},
  year      = {2021},
}