Changing sounds are key to understanding speech

June 22, 2010 By Chris Barncard

On the printed page, c*ns*n*nts m*tt*r m*r* th*n v*w*ls.

But for spoken words, it may not be as simple as vowels versus consonants, according to University of Wisconsin–Madison researchers. When you’re listening, consonants and vowels take a back seat to the way your cochlea wiggles.

“Plenty of studies have examined whether consonants or vowels are more important to the listener,” says Christian Stilp, graduate student in psychology at UW–Madison. “Opposite the pattern for reading, researchers often found that consonants were more expendable when understanding speech.”

Bully for vowels? Not quite.

The findings get messy when transitions from vowel to consonant and consonant to vowel are considered. In spoken phrases, speech sounds are not like beads on a string. Their tonal colors intermingle, and the resulting mixture proves just as important to listeners as vowel sounds.

To Stilp and Keith Kluender — UW–Madison psychology professor and co-author with Stilp on a study published in Proceedings of the National Academy of Sciences — it appeared researchers were missing something.

“We decided to try to make it simpler,” Stilp says. “Let’s stop thinking in terms of consonants and vowels altogether. Let’s think about ears instead of language.”

Stilp and Kluender developed a measure of change from one sound in a sentence to the next based on the way they are translated into nerve signals in the cochlea, a small structure inside the ear that changes sound into signals to the brain. Called cochlea-scaled entropy, the metric highlights sounds that matter most to the brain.

“We tried to make this realistic — biologically plausible, like what is actually going on as sound travels up your auditory nerves and into your brain,” Stilp says. “What do neurons like? They register change. They say, ‘Wake me when something new happens.’”
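The paper does not come with code, but the core idea — measure how much the sound changes from one brief cochlea-like spectral slice to the next — can be sketched in a few lines. Everything in the sketch below is an illustrative assumption: the log-spaced band pooling, the 16-millisecond slice length and the Euclidean distance are stand-ins, not the published cochlea-scaled entropy recipe.

```python
# A minimal, hypothetical sketch of a "change between successive cochlea-like
# spectral slices" measure. Filter shapes, slice length, band spacing and the
# Euclidean distance are illustrative choices, not the published CSE method.
import numpy as np
from scipy.signal import stft

def spectral_change(signal, sr, slice_ms=16, n_bands=33):
    """Return one spectral-change value per transition between adjacent slices."""
    nper = int(sr * slice_ms / 1000)
    # Non-overlapping short-time spectra: one column per slice.
    _, _, Z = stft(signal, fs=sr, nperseg=nper, noverlap=0)
    power = np.abs(Z) ** 2
    # Pool FFT bins into log-spaced bands as a crude stand-in for
    # cochlear-style frequency analysis.
    edges = np.unique(np.geomspace(1, power.shape[0] - 1, n_bands + 1).astype(int))
    bands = np.array([power[lo:hi].sum(axis=0)
                      for lo, hi in zip(edges[:-1], edges[1:])])
    bands = np.log(bands + 1e-12)          # rough loudness-like compression
    # "Change" = Euclidean distance between successive band-energy vectors.
    return np.linalg.norm(np.diff(bands, axis=1), axis=0)
```

Run over a sentence, this yields one number per slice transition; larger values mark the moments where the spectrum is changing most, which is the property cochlea-scaled entropy is designed to capture.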

By zeroing in on the very brief changes in amplitude and frequency that make sound-handling neurons fire more often and more readily, Stilp and Kluender marked stretches of test sentences containing small, moderate and large amounts of change. They then cut out the stretches at each level of change, replaced them with noise, and played the altered sentences for listeners in their lab.
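Given a per-slice change score like the one sketched above, that noise-replacement manipulation can be imitated roughly as follows. Again, the details are assumptions made for illustration: the proportion replaced, the use of RMS-matched white noise and the alignment of slices to sample indices are not taken from the paper.

```python
# Rough imitation of the noise-replacement manipulation: overwrite the
# highest-change (or lowest-change) slices with level-matched white noise.
# Proportion, noise type and slice alignment are illustrative assumptions.
import numpy as np  # reuses the conventions of the sketch above

def replace_slices_with_noise(signal, sr, change, slice_ms=16,
                              proportion=0.3, highest=True, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    nper = int(sr * slice_ms / 1000)
    n_replace = int(len(change) * proportion)
    order = np.argsort(change)
    picked = order[-n_replace:] if highest else order[:n_replace]
    out = signal.astype(float).copy()
    for k in picked:
        # change[k] describes the transition into slice k + 1; boundaries
        # here only approximate the STFT framing used in spectral_change.
        start = min((k + 1) * nper, len(out))
        stop = min(start + nper, len(out))
        rms = np.sqrt(np.mean(out[start:stop] ** 2)) if stop > start else 0.0
        out[start:stop] = rng.normal(0.0, rms + 1e-12, stop - start)
    return out
```

Listeners would then hear the altered sentences; the article’s finding is that intelligibility falls fastest when the high-change stretches are the ones swapped out for noise.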

The more change was replaced with noise, the harder it was for people to understand the sentences.

“When we replace intervals of low change with noise, you’re still pretty good at understanding the sentence,” Stilp says. “But intelligibility keeps going down as we replace intervals of higher change with noise.”

Intelligibility dropped by about a third as the sounds marked by the greatest change were removed. Comparing the intervals that cochlea-scaled entropy flagged as important against a simple vowel-consonant split showed the new measure to be the better predictor of how well listeners understood the sentences.

“Vowels were replaced more often, but they weren’t a good predictor,” Stilp says. “It wasn’t a reliable way to predict intelligibility, which CSE did beautifully.”

Most likely to rate as high-change sounds are “low” vowels, sounds like “ah” in “father” or “top” that draw the jaw and tongue downward. Least likely to cause much change are “stop” consonants like “t” and “d” in “today.”

The sounds in between fell along the same scale, Stilp said, tracing out the hierarchy of sounds known to linguists as sonority. Sonority helps to explain how syllables are put together in nearly every language in the world.

“There have been lots of attempts to measure sonority that didn’t come up with a satisfactory answer,” Stilp says. “From the way this turned out, we seemed to measure sonority pretty well. We’re interested to see if linguists and phoneticians find this useful in their studies of language.”