Understanding the Sounds of Human Speech

Student Chris Carignan (right) measures EPG palates to make 3-D computer models of them while fellow student Megan Osfar (left) and professor Ryan Shosted look on.

Linguistics researcher Ryan Shosted has travelled around the world seeking out understudied languages and now, as a Beckman Institute faculty member, focuses on the aerodynamics of speech production.

Beckman Institute researcher Ryan Shosted is an experimental phonologist who studies how phonetic principles shape language. His research involves the aerodynamics of speech, which is the study of the properties of air as it passes through the vocal tract.

“Even though I’m interested in linguistics generally, I’m really focused on the aspects of what a language sounds like,” Shosted said.

Fittingly, Shosted compares the physiology of speech production to the workings of a musical instrument.

“If you think about the vocal tract as a woodwind instrument, there’s air that you blow from the bottom up, there’s this little vibrator that’s the larynx, and you’re able to make a variety of what are essentially musical sounds,” he said. “What’s particularly interesting to me is the variability of how we can use that instrument.

“After years of practice, people who become skilled with a musical instrument are able to produce qualities of sound that novice players cannot,” he added. “Our species has probably been using its vocal tract to produce sound since our beginnings, and similar forms of sound production are present in other species, so we know that sound production in animals is quite ancient. Over tens of thousands of years, humans have become very adept at using the vocal tract for speech communications. I’m particularly fascinated with the possibility that humans can use their vocal tracts in a variety of ways to produce similar acoustic outputs, the way a talented musician can use an instrument.”

Shosted could be described as adventurous, both personally and academically. He has done fieldwork in southern Africa and the Czech Republic at eventful times in both locales, working in research areas that are, he says, “understudied.”

Shosted grew up in Utah and Ohio, earned an undergraduate degree in Ohio and a Ph.D. in linguistics from the University of California, Berkeley, and has lived in South America; in Mozambique, while that country was still recovering from civil war and a natural disaster; and in the Czech Republic, a few years after that country’s Velvet Revolution and the fall of the Berlin Wall.

Shosted went to Mozambique on a Fulbright Scholarship, choosing that southern African country because he had learned its official language, Portuguese, while doing volunteer work in Brazil, and because of the chance to study the intersection of native Bantu languages with Portuguese. His arrival in Mozambique turned out to be more interesting than he expected.

“I was briefly homeless in Mozambique,” he said. “I got there and I didn’t actually have anywhere to live. I was robbed at knifepoint on my first night.”

“I had a more adventurous spirit back then,” he added. “Maybe I still do to some extent.”

These days Shosted spends most of his time as an Assistant Professor in the Department of Linguistics and doing research at his lab in the Foreign Languages Building. He still does fieldwork, but much of it is right here in Champaign County with an immigrant Guatemalan population. When Shosted does travel abroad, it is a little less adventurous than in his student days.

“I’m not done doing fieldwork. In fact, I went to Guatemala a couple of years ago,” he said. Shosted added that while he has done fieldwork in a wide variety of conditions over the years, after finding his planned accommodations in Guatemala to be somewhat substandard, he “decided to spend a little bit more money and find a hotel.”

The ability of humans to vary vocalizations for speech production is what Shosted found interesting in the languages he studied in the Czech Republic and Mozambique, and in Q’anjob’al, the Mayan language he studies in Guatemala and in Champaign.

“What is it exactly inside that woodwind instrument of ours that makes the sounds that we make?” Shosted said. “And can we come up with a good correlation between the sound output and those really fine details that we as humans are so good at manipulating?

“I think that I’ve found a really interesting area to work in, which is this fine-tuning of the vocal tract, comparing the acoustics and articulation and the aerodynamics, and trying to come up with a more holistic model of the vocal tract. My colleagues and I are trying to come up with a better understanding of how humans make language, how humans make the sounds of language in particular.”

Shosted, a member of Beckman’s Artificial Intelligence group, currently focuses his research on speech aerodynamics; his interest in Q’anjob’al also involves educational outreach and preservation efforts. His choice of languages for study reflects his interests. For example, Czech drew his attention because of words that appear to have no vowels.

“There is this great tongue twister in Czech that speakers love to use with foreign learners because it’s a string of four words, no written vowels. It actually means ‘stick your finger through your throat,’” Shosted said. “I became fascinated that there could be a language that was so different from English, in terms of how it sounded, and even though it seemed so challenging to me, it was completely natural to people who speak it natively. I think that really sparked my imagination.”

Shosted’s time in Mozambique was also challenging and revealing to him personally and professionally.

“One of the languages that I worked on, Changana, claimed to have a very large number of consonants, and so I became more interested in the phonetics of how these sounds are produced,” he said. “One of the sounds is this so-called whistled fricative sound. It was quite a challenge to learn that one.”

That sound brought him to an epiphany about his own vocalization.

“I realized that I actually kind of whistle my s’s when I speak and that the same sound is a phoneme of another language. If I whistle my s in ‘seven’ you don’t suddenly think it’s a different word, but in Changana, whistling your ‘s’ is the difference between saying ‘flower’ and ‘flowers’. I reflected on the fact that whistling s is something that happens naturally in the speech of a lot of people.

“If you listen to news broadcasts, a lot of people do, in fact, whistle their s’s. One of the things that really fascinates me is that seemingly minor things happen to the sounds of languages but then it kind of shoots off and becomes stable, becoming a categorical sound of the language that can shape the whole system.”

That sparked research interests that continued after Shosted joined the University of Illinois in 2007. His work has led him to study nasalization, defined as “the production of a sound while the velum is lowered, so that some air escapes through the nose during the production of the sound.”

Shosted uses aerodynamics as a tool for understanding what the vocal tract is doing, employing among other tools a device called a pneumotach, fitted to a mask that goes over the mouth and nose and can register airflow. He said the pneumotach can register finer distinctions between oral and nasal sounds than other methods provide.
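One simple way to picture how separate oral and nasal airflow readings can be compared is to compute, sample by sample, the share of total airflow that leaves through the nose. The sketch below is illustrative only: the function name, the sample values, and the choice of a proportion measure are assumptions made for this example, not a description of the instruments or analysis used in Shosted’s lab.

```python
def nasal_flow_proportion(oral_flow, nasal_flow):
    """Return, per sample, the fraction of total airflow exiting through the nose.

    Both inputs are equal-length sequences of flow readings (hypothetical
    units: litres per second). Magnitudes are used so that the direction of
    flow does not affect the proportion.
    """
    proportions = []
    for oral, nasal in zip(oral_flow, nasal_flow):
        total = abs(oral) + abs(nasal)
        # With no airflow at all, report 0.0 rather than dividing by zero.
        proportions.append(abs(nasal) / total if total > 0 else 0.0)
    return proportions


# Made-up readings, for illustration only: an oral sound gradually
# giving way to a fully nasal one.
oral = [0.30, 0.25, 0.05, 0.00]
nasal = [0.00, 0.05, 0.15, 0.20]
print(nasal_flow_proportion(oral, nasal))
```

In this toy example, a value near 0 would correspond to a fully oral sound and a value near 1 to a fully nasal one, with nasalized vowels falling somewhere in between.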

Shosted said his group is building an aerodynamic model of the vocal tract that would allow better predictions about how much force is needed in the lungs for vocalizations and the kind of airflow that results from that force, as well as how that affects the kind of sounds that are produced. They also use the articulograph in his colleague Chilin Shih’s laboratory at Beckman for studying movements of the mouth and tongue while these sounds are produced. Shosted said the methods work together to give improved insight into speech production.

“The acoustic signal that comes out of nasals is kind of muddy for certain reasons; it almost obscures what’s going on inside the vocal tract,” he said. “So by using the articulograph, coupled with the aerodynamics, we’re able to identify with some certainty when something is nasalized and exactly where the tongue is when it’s nasalized.

“What we’re doing here is another form of imaging that people could do with an MRI or with ultrasound, but there are limitations on those systems as well. So this is one way of getting at it. The work on nasalization and tongue position we’re doing is very comprehensive, covering a variety of languages like French, Hindi, Brazilian Portuguese, and English; it has not been done before at this level of detail.”

Shosted said there are also translational aspects to his work: giving insight into cleft palate speech, re-creating the workings of the vocal tract so that artificial speech can sound more human, and outreach efforts involving Q’anjob’al, including helping local translators, creating an online dictionary, and language preservation projects. His research interests have focused on an inhaling, or ingressive, sound found in that language, which, like his other research lines, is a fairly uncommon and understudied linguistic feature.

“I’ve been interacting with members of the Mayan community here in Champaign-Urbana for a while but we didn’t recognize that quality of the sound until we did the aerodynamics,” Shosted said. “I put the oral mask on one of our speakers and we looked at the trace of the flow and, boom, it went negative. Whereas, for most other sounds in speech, the airflow is positive. If you make a ‘p’, ‘t’, ‘d’, you’re blowing out. Approximately 10 percent of human languages have an ingressive sound. These sounds are relatively common but we are still learning about how they’re produced.

“By some estimates, Q’anjob’al is spoken by less than a hundred thousand individuals,” he added. “To have them here in Central Illinois is serendipity. But I started working with them once I got here and learned, again, that the aerodynamic aspects of the language are some of the more fascinating. It just fit right in beautifully.”
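The negative-going flow trace Shosted describes for the ingressive sound can be illustrated with a small sketch. It assumes a list of oral airflow samples in which positive values mean air flowing out of the mouth and negative values mean air flowing in. The function name, the threshold parameter, and the sample trace are all hypothetical; this is not the analysis used in his lab.

```python
def ingressive_spans(flow, threshold=0.0):
    """Return (start, end) index pairs where flow stays below threshold.

    `end` is exclusive, so a span covers flow[start:end].
    """
    spans = []
    start = None
    for i, value in enumerate(flow):
        if value < threshold and start is None:
            start = i  # airflow has just turned inward
        elif value >= threshold and start is not None:
            spans.append((start, i))  # airflow has turned outward again
            start = None
    if start is not None:
        # The trace ended while airflow was still inward.
        spans.append((start, len(flow)))
    return spans


# Made-up trace, for illustration only: outward flow, a brief
# inward (ingressive) stretch, then outward flow again.
trace = [0.20, 0.10, -0.05, -0.12, -0.03, 0.15, 0.22]
print(ingressive_spans(trace))  # [(2, 5)]
```

A real recording would be noisy, so in practice a threshold slightly below zero or some smoothing would likely be needed before treating a dip as ingressive.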