News & features
In the news | Analytics India
Interview with the team behind Microsoft’s µTransfer
Recently, researchers Edward Hu, Greg Yang, and Jianfeng Gao from Microsoft introduced µ-Parametrization, which offers maximal feature learning even in the infinite-width limit.
In the news | The Register
Microsoft, OpenAI method could make training large neural networks cheaper
The cost of tuning hyperparameters with µTransfer was 7% of what it would cost to pre-train GPT-3. Companies scaling up their neural network models could cut expensive training costs by employing a technique developed by researchers at Microsoft and OpenAI.
In the news | TechRadar
Microsoft, OpenAI may have solved a fundamental AI bottleneck
Microsoft and OpenAI have developed a new method for optimizing massive AI models that are too expensive to train multiple times, such as GPT-3. A blog post published by Microsoft Research describes a technique called µ-Parametrization (or µP), which…
| Edward Hu, Greg Yang, and Jianfeng Gao
Great scientific achievements cannot be made by trial and error alone. Every launch in the space program is underpinned by centuries of fundamental research in aerodynamics, propulsion, and celestial bodies. In the same way, when it comes to building large-scale…
| Baolin Peng, Chunyuan Li, Jinchao Li, Lars Liden, and Jianfeng Gao
The increasing use of personal assistants and messaging applications has spurred interest in building task-oriented dialog systems (or task bots) that can communicate with users through natural language to accomplish a wide range of tasks, such as restaurant booking, weather…
| Chunyuan Li, Lei Zhang, and Jianfeng Gao
Humans perceive the world by observing a large number of visual scenes around them and then effectively generalizing, that is, interpreting and identifying scenes they haven't encountered before, without heavily relying on labeled annotations for every single scene. One of the…
| Pengchuan Zhang, Lei Zhang, and Jianfeng Gao
Humans understand the world by perceiving and fusing information from multiple channels, such as images viewed by the eyes, voices heard by the ears, and other forms of sensory input. One of the core aspirations in AI is to develop…
| Pengcheng He, Xiaodong Liu, Jianfeng Gao, and Weizhu Chen
Natural language understanding (NLU) is one of the longest-running goals in AI, and SuperGLUE is currently among the most challenging benchmarks for evaluating NLU models. The benchmark consists of a wide range of NLU tasks, including question answering, natural…
| Kevin Lin, Xiaowei Hu, and Lijuan Wang
Consider for a moment what it takes to visually identify and describe something to another person. Now imagine that the other person can’t see the object or image, so every detail matters. How do you decide what information is important…