We see what we hear: dissonant music engages early visual processing


Abstract
The neuroscientific examination of music processing in audiovisual contexts offers a valuable framework to assess how auditory information influences the emotional encoding of visual information. Using fMRI during naturalistic film viewing, we investigated the neural mechanisms underlying music’s effect on valence inferences during mental state attribution. Thirty-eight participants watched the same short film accompanied by systematically controlled consonant or dissonant music. Participants were instructed to think about the main character’s intentions. The results revealed that increasing levels of dissonance led to more negatively valenced inferences, demonstrating the profound emotional impact of musical dissonance. Crucially, at the neural level and despite music being the sole manipulation, dissonance evoked a response in the primary visual cortex (V1). Functional and effective connectivity analyses showed stronger coupling between the auditory ventral stream (AVS) and V1 in response to tonal dissonance, and demonstrated the modulation of early visual processing via top-down feedback inputs from the AVS to V1. These V1 signal changes indicate the influence of high-level contextual representations associated with tonal dissonance on early visual cortices, serving to facilitate the emotional interpretation of visual information. The findings substantiate the critical role of audio-visual integration in shaping higher-order functions such as social cognition.

Significance statement
The present study reveals responses in the primary visual cortex modulated by musical information: tonal dissonance recruits early visual processing via feedback interactions from the auditory ventral pathway to the primary visual cortex.
We demonstrate that the auditory “what” ventral stream plays a role in assigning meaning to non-verbal sound cues, such as dissonant music conveying negative emotions, providing an interpretative framework that serves to process the audio-visual experience. Our results highlight the significance of employing systematically controlled music, which can isolate emotional valence from the arousal dimension, to elucidate the brain’s sound-to-meaning interface and its distributive crossmodal effects on early visual encoding during naturalistic film viewing.

Data sharing
All relevant data are available from the figshare database (DOI: 10.6084/m9.figshare.21345240).