Decoding the Illusion of Speech: Scientists Investigate the Brain’s Mechanism for Understanding Sound

by: Chelsey Rodriguez

In 1953, cognitive scientist Colin Cherry puzzled the world with what is known as the “Cocktail Party Effect.” The phenomenon raises the question of how we are able to focus on one particular sound in a noisy setting, such as a single conversation at a loud party. For at least a decade, scientists have speculated that specific neurons tune themselves so that we can focus on specific aspects of sound and ultimately detect speech. However, the mechanism the brain uses to extract meaning from noise is still unknown. A group of researchers at UC Berkeley’s Helen Wills Neuroscience Institute addressed this issue in findings published in Nature Communications in December 2016. Previous research theorized that the brain makes assumptions about auditory information so that it does not have to perform more processing than it needs to. While this may be true, the Theunissen and Knight labs provided evidence in their publication that the human brain also allows new auditory information to influence the way sounds are represented. In other words, not only does the brain use previous experiences with speech to make sense of sound, but it also becomes hypersensitive to new information so that it can readily reshape the auditory information it has already stored.

Previous research has determined that in a specific region of the brain called the auditory cortex, neurons filter out noise from the environment in order to discriminate speech from non-speech. Moreover, several levels of “filters” operate in the brain, and neuroscientists have posed the question, “Do these different filters interact with one another?” The answer matters for understanding how auditory information is processed because it could help explain whether a person’s prior knowledge and experiences influence the information being passed through the brain. In particular, scientists want to know whether a person’s internal state affects their perception of the sounds actually present in the environment.

In this experiment, the researchers performed electrocorticography (ECoG) in epilepsy patients who already had pieces of their skulls removed and electrodes implanted. ECoG uses these electrodes to record electrical activity in the brain. During the experiment, the patients first listened to a garbled sentence. Afterwards, they were asked whether they could understand what was said in the recording, and none of them could. The second sound they heard was a clear version of the same sentence, and this time the patients confirmed that they understood it. Finally, the participants heard the exact same garbled sentence as before. Having heard the clear version, they were now able to understand the previously nonsensical sentence. It was almost as if they could not avoid hearing the clear version. This auditory experiment is very similar to the face and vase illusion. A viewer who first looks at the picture sees either a face or a vase; let’s assume a face in this example. Once the other image, the vase, is pointed out, the viewer will continue to see the vase no matter what. That is exactly what happened in this experiment: after the listeners were introduced to the meaning of the garbled sentence, they continued to extract meaning from it even though the sound itself was still garbled.

While the behavioral experiments were under way, the researchers were running ECoG recordings. During these recordings, Theunissen’s team was able to record from approximately 500,000 neurons in the auditory cortical area of the brain. When the patients heard the garbled sentence for the first time, there was not much neural activity in brain regions related to speech. However, when they heard the garbled sentence a second time (following the clear version of the sentence), a different neural pattern emerged. More specifically, this new neural pattern closely resembled the neural activity recorded while the clear sentence was played.

On the neural level, how can this be? Using a special mapping technique, the group analyzed the data and found that the brain was responding to traits of the clear sentence that were also present in the garbled version. They also found that, in order to decode speech effectively, the auditory cortex splits complex traits of sound across different levels of neurons. This split produces various filters for spectro-temporal modulation, that is, the way a sound’s energy changes across frequency and over time. This “Spectrotemporal Receptive Field is always there, and it’s plastic at the level of the auditory cortex,” said co-author Frédéric Theunissen, a professor at UC Berkeley and a member of the Helen Wills Neuroscience Institute. This ever-present auditory modulation is what allows our brains to change the way they focus on sound, and ultimately to understand something as challenging as the garbled sentence the second time around. Thus, the researchers concluded that neurons in the auditory cortex tune themselves to understand speech-like sounds after being introduced to new information related to language. In doing so, the brain can enhance its understanding of degraded speech signals. The tuning also allows the brain to focus on relevant information and ignore everything else present in a person’s environment.

By understanding how the brain decodes auditory information, researchers and product developers can create speech devices that carry out the same processes. “If we understand not only at the periphery level but also centrally, we can provide information to the brain so that it can reconstruct the signal,” said Theunissen. Such devices could then help patients with speech impediments who need assistance constructing those sound signals. Professor Theunissen believes that “this kind of research could help alleviate issues with deafness by leading to improved designs of hearing aids.” In addition, speech recognition tools such as Siri could improve greatly with the knowledge gained from experiments like those done in the Theunissen lab.