Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

Vision-Language Navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments. We propose a novel Reinforced Cross-Modal Matching (RCM) approach that enforces cross-modal grounding both locally and globally via reinforcement learning (RL), and we further introduce a Self-Supervised Imitation Learning (SIL) method with which the agent explores unseen environments by imitating its own past good decisions.

Date: June 17, 2019

- Qiuyuan Huang
  
  Principal Researcher
- Jianfeng Gao
  
  Distinguished Scientist & Vice President
Research areas
- Artificial intelligence
Papers and publications
- Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
