Language modeling: Attention mechanisms for extending context-awareness of LSTM
- Jurik Juraska
- Sarangarajan Parthasarathy (sarangp)
- William Gale
Language models for ASR are traditionally trained on sentence-level corpora. In this internship, we explore the potential of exploiting context beyond the current sentence for next-word prediction. We show that adding an attention mechanism to the LSTM allows it to model long-range context.
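To make the idea concrete, below is a minimal sketch of an LSTM language model that attends over hidden states carried across sentence boundaries. The framework (PyTorch), the layer sizes, the additive attention form, and the `memory` handling are all illustrative assumptions, not the actual configuration used in the internship.

```python
import torch
import torch.nn as nn


class AttentiveLSTMLM(nn.Module):
    """Sketch: LSTM language model with additive attention over a memory of
    past hidden states, so that context beyond the current sentence can
    influence next-word prediction. Sizes and attention form are assumptions."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Additive (Bahdanau-style) attention over the cross-sentence memory.
        self.attn_query = nn.Linear(hidden_dim, hidden_dim)
        self.attn_key = nn.Linear(hidden_dim, hidden_dim)
        self.attn_score = nn.Linear(hidden_dim, 1)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, tokens, memory, state=None):
        # tokens: (batch, seq_len) word ids of the current sentence
        # memory: (batch, mem_len, hidden_dim) hidden states retained from
        #         previous sentences, carried across sentence boundaries
        emb = self.embed(tokens)
        hidden, state = self.lstm(emb, state)            # (batch, seq, hidden)

        # Score every memory slot against every current position.
        q = self.attn_query(hidden).unsqueeze(2)          # (batch, seq, 1, hid)
        k = self.attn_key(memory).unsqueeze(1)            # (batch, 1, mem, hid)
        scores = self.attn_score(torch.tanh(q + k)).squeeze(-1)  # (batch, seq, mem)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, memory)               # (batch, seq, hidden)

        # Combine the LSTM state with the attended long-range context.
        logits = self.out(torch.cat([hidden, context], dim=-1))
        return logits, state, hidden
```

In use, the hidden states returned for each sentence would be appended to the memory passed in with the next sentence, so the attention window extends across sentence boundaries rather than being limited to the current one.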