Towards more robust speech interactions for deaf and hard of hearing users
- Raymond Fok,
- Harmanpreet Kaur,
- Skanda Palani,
- Martez E. Mott,
- Walter S. Lasecki
ACM SIGACCESS Conference on Computers and Accessibility (ASSETS)
Published by ACM
Mobile, wearable, and other ubiquitous computing devices increasingly create contexts in which conventional keyboard- and screen-based input is replaced by more natural speech-based interaction. Digital personal assistants use speech to control a wide range of functionality, from environmental controls to information access. However, many deaf and hard-of-hearing users have speech patterns that differ from those of hearing users due to incomplete acoustic feedback from their own voices. Because automatic speech recognition (ASR) systems are largely trained on speech from hearing individuals, speech-controlled technologies are typically inaccessible to deaf users. Prior work has focused on giving deaf users access to aural output via real-time captioning or signing, but little has been done to improve their ability to provide input to these systems' speech-based interfaces. Further, the vocalization patterns of deaf speech often make accurate recognition intractable for both automated systems and human listeners, making traditional approaches to mitigating ASR limitations, such as human captionists, less effective. To bridge this accessibility gap, we investigate the limitations of common speech recognition approaches, both automatic and human-powered, when applied to deaf speech. We then explore the effectiveness of an iterative crowdsourcing workflow and characterize the potential for groups to collectively exceed the performance of individuals. This paper contributes a better understanding of the challenges of deaf speech recognition and provides insights for future system development in this space.