Mark Hasegawa-Johnson wants to give everyone a voice.
A professor of electrical and computer engineering at the University of Illinois Urbana-Champaign and a researcher at the Beckman Institute for Advanced Science and Technology since 1999, Hasegawa-Johnson has devoted his career to creating new speech technologies and ensuring they are accessible to everyone. His research focuses on automatic speech recognition, and he leads the Speech Accessibility Project, a multi-year initiative to improve voice-recognition technology by collecting audio samples from volunteers with a range of disabilities that affect speech.
In recognition of his expertise and contributions to knowledge-constrained signal generation, Hasegawa-Johnson has been named a fellow by the International Speech Communication Association.
In this Q&A, Hasegawa-Johnson discusses the need for accessibility and inclusion in speech technology research, special challenges he’s worked to address, and how he first became involved with speech communication.
He can be reached at jhasegaw@illinois.edu.
You’ve been researching speech technology for more than 30 years. What first interested you in the field?
The earliest influence that steered me in this direction was my high school English teacher, a published poet who encouraged our interest in poetry. I started writing in his class and continued writing poetry on and off, even taking classes in college. That got me interested in the interaction between sound and meaning in words. There are so many things you can do with sounds: you can use them to support the meaning of your words, or to fight against it. That's what piqued my interest in the study of speech.
How do you identify challenges that might benefit from advances in speech communication technology?
I read a lot in both the technical literature and public press. I also talk to people with disabilities about how they manage their disability day to day. The people with cerebral palsy and Parkinson’s disease that I’ve spoken with are incredibly inventive at taking advantage of technology in ways that people who don’t have that disability would never imagine.
Do you encounter any misconceptions about speech communication technology and what it can or can’t do?
There is a lot of misunderstanding about how speech transcription is affected by data privacy laws; in fact, speech is governed by the same ethical principles and laws that cover every other data stream. Speech technology advances because people are willing to share their speech with companies and universities that can do something with it, and that misunderstanding tends to discourage people from participating.
Speech technology has advanced significantly since the 1980s. Are there any gaps that still exist?
The biggest gap in speech technology is dealing with variation. Speech technology works really well for people whose voices resemble the voices it was trained on. It works less well for people with neuromotor disorders that change their speech patterns, for people speaking English as a second language, or for people with a regional or socioeconomic dialect that's less represented in the samples used to train the technology.
A lot of my research now is trying to better understand how we can compensate for differences in speaking patterns in a way that will enable speech technology to be usable by everyone.
You’ve spoken a lot about the need for improved accessibility and inclusion. What makes that a priority in your research?
A disability is not a physical fact about you. A disability is the interaction between physical differences in the way your body works and the things you’re able to do. And the things that you’re able to do are governed by how buildings are designed and how devices and organizations are created. If those buildings, devices, and organizations knew in advance that somebody with your physical abilities wanted to make use of them, they could create an accommodation that would allow you to access that building or device.
What we would like to do with speech is make those accommodations standard, so that physical differences don’t exclude anyone from using any functionality that’s available.
You joined the university as a faculty member in 1999. What brought you to the Beckman Institute?
I joined Beckman when I first came to the university, because speech is by necessity an interdisciplinary field. Every major advance in speech technology has been a collaboration between computer scientists, electrical engineers, linguists, psychologists, and speech and hearing scientists.
ISCA was created in 1999 as an international organization where linguists, speech scientists, and speech engineers could come together. Beckman is the University of Illinois version of the same concept. At Beckman, we have scientists and engineers working toward common goals in the same building, and that's what the institute is for.
The ISCA Fellows program recognizes exceptional members who have made significant contributions to the field of speech technology. How will this fellowship further advance your research?
Being named a fellow is an endorsement of my name and expertise, which helps build collaborations with other universities and attract graduate students.
My first international paper was actually published at an ISCA conference in 1990. In a way, I've been part of this organization for longer than the organization itself has existed, so it's an honor to be recognized.
You’ve been on the receiving end of many awards and accolades. Which accomplishment makes you the proudest?
My graduate students: I’ve been able to mentor some outstanding students who have done amazing things in the world.