Redesigning Neural Architectures for Sequence to Sequence Learning
The encoder-decoder model with soft attention is now the de facto standard for sequence-to-sequence learning, having enjoyed early success in tasks like translation, error correction, and speech recognition. In this talk, I will present a critique of several aspects of this popular model, including its soft attention mechanism, local loss function, and sequential decoding. I will present a new Posterior Attention Network that provides a more transparent joint attention and yields easy gains on several translation and morphological inflection tasks. Next, I will expose a little-known problem of miscalibration in state-of-the-art neural machine translation (NMT) systems. For structured outputs, as in NMT, calibration matters not only for reliable confidence in predictions but also for the proper functioning of beam-search inference. I will discuss reasons for miscalibration and some fixes. Finally, I will summarize recent research efforts towards parallel decoding of long sequences.
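For readers unfamiliar with the mechanism being critiqued, below is a minimal NumPy sketch of standard dot-product soft attention; the function names and toy dimensions are illustrative, not from the talk.

```python
# A minimal sketch of standard soft attention (illustrative names only).
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_attention(decoder_state, encoder_states):
    """Dot-product soft attention.

    decoder_state:  (d,)   current decoder hidden state
    encoder_states: (T, d) one hidden state per source token
    Returns the context vector and the attention distribution.
    """
    scores = encoder_states @ decoder_state   # (T,) alignment scores
    weights = softmax(scores)                 # (T,) attention distribution
    context = weights @ encoder_states        # (d,) expected encoder state
    return context, weights

# Toy usage: 5 source tokens, hidden size 8.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
s = rng.normal(size=(8,))
context, weights = soft_attention(s, H)
print(weights.round(3), weights.sum())  # the weights sum to 1.0
```

The Posterior Attention Network mentioned in the abstract instead treats attention as a latent variable and reasons about its posterior jointly with the output; the sketch above shows only the standard baseline it improves on.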
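The abstract mentions fixes for miscalibration without naming them. One standard post-hoc fix from the calibration literature is temperature scaling, sketched below; it is offered only as background, not as the talk's specific method, and all data and names here are hypothetical.

```python
# A hedged sketch of temperature scaling, a common post-hoc calibration fix
# (background only; the talk's specific fixes may differ).
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def nll(logits, labels, T):
    # Negative log-likelihood of the true labels at temperature T.
    probs = softmax(logits / T)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

def fit_temperature(logits, labels, grid=np.linspace(0.5, 3.0, 26)):
    """Pick the temperature minimizing validation NLL.

    logits: (N, V) pre-softmax scores; labels: (N,) true token ids.
    A single scalar T rescales confidence without changing the argmax.
    """
    return min(grid, key=lambda T: nll(logits, labels, T))

# Toy usage: deliberately overconfident logits yield a fitted T > 1,
# which softens the output distribution.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
logits = rng.normal(size=(1000, 10)) * 4.0      # noisy, peaked scores
logits[np.arange(1000), labels] += 2.0          # weak signal for true class
print("fitted temperature:", fit_temperature(logits, labels))
```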
- Date:
- Speaker:
- Sunita Sarawagi
- Affiliation:
- IIT Bombay