Accessible Video Calling: Enabling Nonvisual Perception of Visual Conversation Cues
- Lei Shi,
- Brianna Tomlinson,
- John Tang,
- Ed Cutrell,
- Daniel McDuff,
- Gina Venolia,
- Paul Johns,
- Kael Rowan
Computer-Supported Cooperative Work (CSCW) 2019 | Published by ACM
Nonvisually Accessible Video Calling (NAVC) is a prototype that detects visual conversation cues in a video call and uses audio cues to convey them to a user who is blind or has low vision. NAVC uses audio cues inspired by movie soundtracks to convey Attention, Agreement, Disagreement, Happiness, Thinking, and Surprise. When designing NAVC, we partnered with people who are blind or have low vision through a user-centered design process that included need-finding interviews and design reviews. To evaluate NAVC, we conducted a user study with 16 participants. The study provided feedback on the NAVC prototype and showed that participants could easily discern some cues, such as Attention and Agreement, but had trouble distinguishing others. The accuracy of the prototype in detecting conversation cues emerged as a key concern, especially in avoiding false positives and in detecting negative emotions, which tend to be masked in social conversations. This research identified challenges and design opportunities in using AI models to enable accessible video calling.
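The abstract describes NAVC's core pipeline at a high level: detect a visual conversation cue in the video stream and render it as a distinct audio cue, while avoiding false positives. The sketch below is a minimal illustration of that mapping, not the authors' implementation; the detector interface, cue event type, sound-file names, and confidence threshold are all assumptions introduced here for illustration.

```python
# Hypothetical sketch of NAVC-style cue sonification (not the authors' implementation).
# Assumes a detector that emits (label, confidence) events; all names are illustrative.
from dataclasses import dataclass
from typing import Optional

# The six conversation cues NAVC conveys, per the abstract, each mapped to a
# hypothetical soundtrack-inspired audio clip.
CUE_SOUNDS = {
    "Attention": "attention.wav",
    "Agreement": "agreement.wav",
    "Disagreement": "disagreement.wav",
    "Happiness": "happiness.wav",
    "Thinking": "thinking.wav",
    "Surprise": "surprise.wav",
}

@dataclass
class CueEvent:
    label: str         # one of the keys in CUE_SOUNDS
    confidence: float  # assumed detector confidence in [0, 1]

def select_audio_cue(event: CueEvent, threshold: float = 0.8) -> Optional[str]:
    """Map a detected visual cue to an audio file, suppressing low-confidence
    detections to reduce false positives (a key concern raised in the study)."""
    if event.label not in CUE_SOUNDS:
        return None
    if event.confidence < threshold:
        return None
    return CUE_SOUNDS[event.label]

# Example: a confident Agreement detection maps to its audio cue,
# while a low-confidence Disagreement detection is suppressed.
print(select_audio_cue(CueEvent("Agreement", 0.92)))     # -> "agreement.wav"
print(select_audio_cue(CueEvent("Disagreement", 0.55)))  # -> None
```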