Decoding the Cortical Dynamics of Sound-Meaning Mapping.


Comprehending speech involves the rapid and optimally efficient mapping from sound to meaning. Influential cognitive models of spoken word recognition (Marslen-Wilson and Welsh, 1978) propose that the onset of a spoken word initiates a continuous process of activation of the lexical and semantic properties of the word candidates matching the speech input and competition between them, which continues until the point at which the word is differentiated from all other cohort candidates (the uniqueness point, UP). At this point, the word is recognized uniquely and only the target word's semantics are active. Although it is well established that spoken word recognition engages the superior (Rauschecker and Scott, 2009), middle, and inferior (Hickok and Poeppel, 2007) temporal cortices, little is known about the real-time brain activity that underpins the computations and representations that evolve over time during the transformation from speech to meaning. Here, we test for the first time the spatiotemporal dynamics of these processes by collecting MEG data while human participants listened to spoken words. By constructing quantitative models of competition and access to meaning in combination with spatiotemporal searchlight representational similarity analysis (Kriegeskorte et al., 2006) in source space, we were able to test where and when these models produced significant effects. We found early transient effects, ∼400 ms before the UP, of lexical competition in left supramarginal gyrus, left superior temporal gyrus, left middle temporal gyrus (MTG), and left inferior frontal gyrus (IFG) and of semantic competition in MTG, left angular gyrus, and IFG. After the UP, there were no competitive effects, only target-specific semantic effects in angular gyrus and MTG.
SIGNIFICANCE STATEMENT: Understanding spoken words involves complex processes that transform the auditory input into a meaningful interpretation. This effortless transition occurs on millisecond timescales, with remarkable speed and accuracy and without any awareness of the complex computations involved. Here, we reveal the real-time neural dynamics of these processes by collecting data about listeners' brain activity as they hear spoken words. Using novel statistical models of different aspects of the recognition process, we can locate directly which parts of the brain are accessing the stored form and meaning of words and how the competition between different word candidates is resolved neurally in real time. This gives us a uniquely differentiated picture of the neural substrate for the first 500 ms of word recognition.
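
The cohort account summarized in the abstract, in which the candidate set shrinks as the input unfolds until only the target remains at the UP, can be illustrated with a minimal sketch. This is not the study's competition model: the toy lexicon, the function names `cohort_sizes` and `uniqueness_point`, and the use of letters as a stand-in for phonemes are all assumptions made for illustration.

```python
# Illustrative sketch only: a toy cohort over letter strings.
# Letters stand in for phonemes; the lexicon is hypothetical.

def cohort_sizes(word, lexicon):
    """Count the lexicon entries still consistent with each prefix of `word`."""
    sizes = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        sizes.append(sum(1 for w in lexicon if w.startswith(prefix)))
    return sizes


def uniqueness_point(word, lexicon):
    """Return the 1-based position at which `word` diverges from every other
    entry, or None if another entry still matches the whole word."""
    others = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        if not any(w.startswith(prefix) for w in others):
            return i
    return None


toy_lexicon = ["cap", "captain", "captive", "capsule", "cat", "dog"]

print(cohort_sizes("captain", toy_lexicon))      # [5, 5, 4, 2, 1, 1, 1]
print(uniqueness_point("captain", toy_lexicon))  # 5
print(uniqueness_point("cap", toy_lexicon))      # None
```

In this toy case the cohort for "captain" falls from five candidates to one at the fifth segment, which is where the sketch places the UP. An embedded word such as "cap" never becomes unique within its own length, so the function returns None for it.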