Using Monolingual Speech Recognition for Spoken Term Detection in Code-switched Hindi-English Speech
Code-switching is the alternation of two or more languages within a single utterance or conversation, and it is prevalent in multilingual communities all over the world. Spoken Term Detection (STD) is the task of detecting a given word or phrase in audio, with applications in audio indexing and mining. In this work, we explore STD for code-switched conversational Hindi-English speech. Code-switching poses several challenges for this problem: (1) a lack of training data for building robust code-switched Automatic Speech Recognition (ASR) systems, (2) non-standardized transcription due to borrowing and cross-transcription, and (3) the presence of translated or code-switched variants of the terms. We assume that a code-switched ASR system for Hindi-English does not exist, and use only a monolingual Hindi ASR system to retrieve audio containing Hindi and English keywords. We apply various techniques to normalize the output of the monolingual ASR system. We evaluate these techniques using Term Weighted Value (TWV) and find that phonetic matching of the query and ASR hypotheses at the utterance level is the most promising approach.
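
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how Term Weighted Value is commonly computed, assuming the standard NIST STD 2006 definition: TWV = 1 - average over terms of (P_miss + beta * P_FA), with beta = 999.9 and one non-target trial per second of speech. The abstract does not state the exact settings used in this work, so the weight, the names `TermCounts` and `term_weighted_value`, and the per-term counts are illustrative assumptions rather than the authors' implementation.

```python
from dataclasses import dataclass

# Assumption: NIST STD 2006 weight, derived from C/V = 0.1 and Pr_term = 1e-4.
BETA = 999.9


@dataclass
class TermCounts:
    """Hypothetical per-term tallies produced by some scoring step."""
    n_true: int      # reference occurrences of the term
    n_correct: int   # detections that match a reference occurrence
    n_spurious: int  # detections with no matching reference occurrence


def term_weighted_value(counts, speech_seconds, beta=BETA):
    """Return TWV for a dict mapping term -> TermCounts.

    speech_seconds is the total duration of the evaluated audio; the number
    of non-target trials for a term is taken as speech_seconds - n_true.
    """
    total_cost = 0.0
    n_terms = 0
    for c in counts.values():
        if c.n_true == 0:
            # P_miss is undefined without reference occurrences; skip the term.
            continue
        p_miss = 1.0 - c.n_correct / c.n_true
        p_fa = c.n_spurious / (speech_seconds - c.n_true)
        total_cost += p_miss + beta * p_fa
        n_terms += 1
    if n_terms == 0:
        raise ValueError("no term has reference occurrences")
    return 1.0 - total_cost / n_terms
```

Under this definition a system that finds every occurrence with no false alarms scores 1, and a system that outputs nothing scores 0. Because beta is large, even a few false alarms are penalized heavily, so scores can go negative.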