Research talk: WebQA: Multihop and multimodal
Web search is fundamentally multimodal and multihop. Often, even before asking a question, people go directly to image search to find answers. Further, we rarely find an answer in a single source, opting instead to aggregate information and reason through its implications. Despite the frequency of this everyday occurrence, there is at present no unified question-answering benchmark that, akin to the human experience, requires a single model to answer long-form natural language questions from text and open-ended visual sources. The researchers propose to bridge this gap between the natural language and computer vision communities with WebQA. They show that multihop text queries are difficult for a large-scale transformer model, and that existing multimodal transformers and visual representations do not perform well on open-domain visual queries. Their challenge to the community is to create a unified multimodal reasoning model that transitions and reasons seamlessly regardless of the source modality.
Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit
- Event:
- Microsoft Research Summit 2021
- Track:
- Deep Learning & Large-Scale AI
- Date:
- Speakers:
- Yonatan Bisk
- Affiliation:
- Carnegie Mellon University
Research talk: Resource-efficient learning for large pretrained models
- Speakers: Subhabrata (Subho) Mukherjee

Research talk: Prompt tuning: What works and what's next
- Speakers: Danqi Chen

Research talk: NUWA: Neural visual world creation with multimodal pretraining
- Speakers: Lei Ji, Chenfei Wu

Research talk: Towards Self-Learning End-to-end Dialog Systems
- Speakers: Baolin Peng

Research talk: Closing the loop in natural language interfaces to relational databases
- Speakers: Dragomir Radev

Roundtable discussion: Beyond language models: Knowledge, multiple modalities, and more
- Speakers: Yonatan Bisk, Daniel McDuff, Dragomir Radev