Researchers Gain Insight into the 'Cocktail Party Effect' That Helps Us Focus in Noisy Environments

"Some enchanted evening," Oscar Hammerstein wrote, "you will meet a stranger, across a crowded room." Whether they are falling in love or cutting a business deal, humans have an uncanny ability to zero in on and pay attention to just one talker in a noisy environment. We may remain aware of other sounds, but we have no problem tracking and completely understanding the speaker on whom we’re focusing.

Scientists call this remarkable skill the "cocktail party effect." They have known for some time that an ability to selectively pay attention must play a role in how it works. But understanding the precise mechanisms the brain’s neurons use to do this has been an elusive goal. In a new study published in the journal Neuron, Clark School Associate Professor Jonathan Simon (Biology/ECE), alum Nai Ding (ECE Ph.D. 2012), lead author Elana M. Zion Golumbic of Columbia University and their colleagues from universities and medical centers in New York are unlocking the mechanics, using data recorded directly from the surface of the brain.

In a crowded place, sounds from different talkers enter our ears mixed together, so our brains first must separate them using cues like when and from where the sounds are coming. But we also have the ability to then track a particular voice, which comes to dominate our attention and, later, our memory. One major theory hypothesizes we can do this because our brains are able to lock on to patterns we expect to hear in speech at designated times, such as syllables and phrases in sentences. The theory predicts that in a situation with competing sounds, when we train our focus exclusively on one person, that person’s speech will dominate our brain’s information processing.

Of course, inside our brains, this focusing and processing takes the form of electrical signals racing around a complicated network of neurons in the auditory cortex.

To begin to unlock how the neurons figure things out, the researchers used a brain-signal recording technique called electrocorticography (ECoG). ECoG electrode arrays, placed directly on the surface of the brain, are used in epilepsy surgery. They consist of about 120 electrodes arranged over the brain’s lateral cortex.

With the permission of the surgery patients, researchers gave them a cocktail party-like comprehension task in which they watched a brief, 9-12 second movie of two simultaneous talkers, side by side. A cue in the movie indicated to which talker the person should try to listen. The ECoG recorded what was happening in the patients’ brains as they focused on what one of the talkers was saying.

The researchers learned that low-frequency "phase entrainment" signals and high-frequency "power modulations" worked together in the brain to dynamically track the chosen talker. In and near low-level auditory cortices, attention enhances the tracking of speech we’re paying attention to, while ignored speech is still heard. But in higher-order regions of the cortex, we become more "selective": there is no detectable tracking of ignored speech. This selectivity seems to sharpen as a speaker’s sentence unfolds.

"This new study reaffirms what we’ve already seen using magnetoencephalography (MEG)," said co-author Jonathan Simon, whose joint appointment is in both the Clark School and the College of Computer, Mathematical and Natural Sciences. Simon's lab uses MEG, a common non-invasive neuroimaging method, to record from ordinary individuals instead of neurosurgery patients. "In fact, the methods of neural data analysis developed in my lab for analyzing MEG results proved to be fantastic for analyzing these new recordings taken directly from the brain."

"We’re quite pleased to see both the low frequency and high frequency neural responses working together," said Simon, "since our earlier MEG results were only able to detect the low frequency components." Simon's own MEG research currently is investigating what happens when the brain is no longer able to pick out a talker from a noisy background due to the effects of aging or damaged hearing.

Simon also notes that the new study's results are in good agreement with the auditory theories of another Clark School researcher, Professor Shihab Shamma (ECE/ISR); his former student Mounya Elhilali (ECE Ph.D. 2004, now on faculty at Johns Hopkins University); and their colleagues, who are part of a wide-ranging, collaborative family of neuroscience researchers originating at Maryland.

The cocktail party effect raises broader questions about how people perceptually organize their noisy worlds and track speech in a realistic environment. This research brings us a large step closer to understanding this enormously important human activity.

| Read the article at Neuron | Listen to a story about the study that aired on NPR's All Things Considered |

Published March 7, 2013