Research talk: Focal Attention: Towards local-global interactions in vision transformers
At present, deep neural networks have become prevalent for building AI systems for vision, language, and multimodality. However, building efficient, task-oriented models remains a challenging problem for researchers. In these lightning talks, Senior RSDE Baolin Peng and Senior Researcher Jianwei Yang from the Deep Learning Group at Microsoft Research Redmond will discuss end-to-end dialog systems and a new architecture for vision systems, respectively. For dialog systems, an end-to-end learning system is achieved through self-learning from conversations with a human in the loop. For vision systems, a sparse attention mechanism has been developed for the Vision Transformer to cope with high-resolution image inputs in image classification, object detection, and semantic segmentation.
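The core idea of focal attention is that each query token attends to its surroundings at fine granularity and to the rest of the sequence at coarse (pooled) granularity, keeping attention sparse at high resolution. The following is a minimal illustrative NumPy sketch of that idea for a 1D token sequence, not the paper's actual implementation; the `window` and `pool` parameters and the use of keys as values are simplifying assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def focal_attention(x, window=2, pool=4):
    """Toy focal-style attention over a 1D token sequence x of shape (n, d).

    Each query attends to (a) fine-grained tokens inside its local window
    and (b) coarse-grained tokens obtained by average-pooling the whole
    sequence, so the attended set stays small even for long sequences.
    Keys double as values here purely to keep the sketch short.
    """
    n, d = x.shape
    # coarse global tokens: average-pool the sequence in chunks of `pool`
    n_coarse = n // pool
    coarse = x[: n_coarse * pool].reshape(n_coarse, pool, d).mean(axis=1)
    out = np.empty_like(x)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        keys = np.concatenate([x[lo:hi], coarse], axis=0)  # local + global
        attn = softmax(x[i] @ keys.T / np.sqrt(d))         # scaled dot-product
        out[i] = attn @ keys
    return out
```

In a real vision transformer the same idea is applied over 2D feature maps with multiple pooling levels, but the local-plus-pooled-global key set above is the essential mechanism.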
Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit
- Track: Deep Learning & Large-Scale AI
- Date:
- Speakers: Jianwei Yang
- Affiliation: Microsoft Research Redmond
Jianwei Yang, Principal Researcher
Related sessions in this track:
- Research talk: Resource-efficient learning for large pretrained models. Speakers: Subhabrata (Subho) Mukherjee
- Research talk: Prompt tuning: What works and what's next. Speakers: Danqi Chen
- Research talk: NUWA: Neural visual world creation with multimodal pretraining. Speakers: Lei Ji, Chenfei Wu
- Research talk: Towards Self-Learning End-to-end Dialog Systems. Speakers: Baolin Peng
- Research talk: WebQA: Multihop and multimodal. Speakers: Yonatan Bisk
- Research talk: Closing the loop in natural language interfaces to relational databases. Speakers: Dragomir Radev
- Roundtable discussion: Beyond language models: Knowledge, multiple modalities, and more. Speakers: Yonatan Bisk, Daniel McDuff, Dragomir Radev