Speech Synthesis for Mixed-Language Navigation Instructions

Khythiraghavi Chandu; Sai Krishna Rallabandi; Sunayana Sitaram; Alan W Black

Interspeech 2017 | August 2017

Published by ISCA

Text-to-Speech (TTS) systems that read navigation instructions are among the most widely used speech interfaces today. Text in the navigation domain may contain named entities, such as location names, that are not in the language in which the TTS database was recorded. Moreover, named entities can be compound words whose individual lexical items belong to different languages. These named entities may be transliterated into the script that the TTS system is trained on, which can result in incorrect pronunciation rules being applied to such words. We describe experiments that extend our previous work on generating code-mixed speech to synthesize navigation instructions with a mixed-lingual TTS system. We conduct subjective listening tests with two sets of users: students who are native speakers of an Indian language and highly proficient in English, and drivers with low English literacy but familiarity with location names. In both sets of users, we find a significant preference for our proposed system over a baseline system that synthesizes instructions in English.