Spatialized Audio and Hybrid Video Conferencing: Where Should Voices be Positioned for People in the Room and Remote Headset Users?

CHI 2023

Published by ACM

Hybrid video calls include attendees in a conference room with loudspeakers and remote attendees using headsets, each with different options for rendering sound spatially. Two studies explored the listener experience with spatial audio in video calls. The first examined the in-room experience using loudspeakers, comparing spatialization algorithms that spread voices out horizontally. The second compared varying degrees of horizontal separation of binaurally rendered voices for a remote participant using a headset. In-room participants preferred the widest spatialization over monophonic, stereo, and stereo-binary audio on metrics related to intelligibility and helpfulness. Remote participants preferred different widths of the audio stage depending on the number of voices. In both studies, rendering sound spatially improved performance in speech stream identification. The results indicate that spatial audio benefits both in-room and remote attendees in video calls, although in-room attendees accepted a wider audio stage than remote users.