News & features
In the news | Microsoft AI Blog for Business & Tech
Microsoft to engage with customers to further develop Turing natural language models
As part of the companywide AI at Scale initiative, Microsoft announced at its Ignite conference that it plans to begin working with select customers to further develop its Turing natural language representation (NLR) models. AI at Scale, which was announced…
In the news | The Batch
Toward 1 Trillion Parameters
An open source library could spawn trillion-parameter neural networks and help small-time developers build big-league models. What’s new: Microsoft upgraded DeepSpeed, a library that accelerates the PyTorch deep learning framework. The revision makes it possible to train models five times…
In the news | Analytics India Magazine
Microsoft Releases Latest Version Of DeepSpeed, Its Python Library For Deep Learning Optimisation
Microsoft recently announced new advancements in DeepSpeed, its popular deep learning optimisation library. The library is an important part of Microsoft's AI at Scale initiative to enable next-generation AI capabilities at scale.
DeepSpeed: Extreme-scale model training for everyone
| DeepSpeed Team, Rangan Majumder, and Junhua Wang
In February, we announced DeepSpeed, an open-source deep learning training optimization library, and ZeRO (Zero Redundancy Optimizer), a novel memory optimization technology in the library, which vastly advances large model training by improving scale, speed, cost, and usability. DeepSpeed has…
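For context on what a "training optimization library" means in practice: DeepSpeed attaches to an existing PyTorch training script through a single initialize call, which returns an engine that takes over the backward pass and optimizer step. Below is a minimal sketch based on DeepSpeed's public API; the model shape, batch size, and hyperparameters are placeholder assumptions, not values from the post.

```python
# Minimal sketch of wrapping a PyTorch model with DeepSpeed.
# Model shape, batch size, and config values are illustrative
# placeholders. Run under the `deepspeed` launcher on a CUDA
# machine, e.g. `deepspeed train.py`.
import torch
import deepspeed

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
)

ds_config = {
    "train_batch_size": 32,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that owns the
# optimizer step, gradient handling, and distributed setup.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for step in range(10):
    # Synthetic data standing in for a real data loader.
    batch = torch.randn(32, 1024, device=engine.device)
    target = torch.randn(32, 1024, device=engine.device)
    loss = torch.nn.functional.mse_loss(engine(batch), target)
    engine.backward(loss)  # replaces loss.backward()
    engine.step()          # replaces optimizer.step() + zero_grad()
```

Because the engine owns these steps, memory optimizations such as ZeRO can be switched on purely through the config, without touching the training loop.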
In the news | VentureBeat
Microsoft’s updated DeepSpeed can train trillion-parameter AI models with fewer GPUs
Microsoft today released an updated version of its DeepSpeed library that introduces a new approach to training AI models containing trillions of parameters, the variables internal to the model that inform its predictions. The company claims the technique, dubbed 3D…
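The "3D" refers to combining three partitioning axes: data parallelism (splitting the batch across replicas, with ZeRO sharding the replicated state), pipeline parallelism (splitting the stack of layers into stages), and tensor/model parallelism (splitting individual layers). The sketch below illustrates only the pipeline axis using DeepSpeed's PipelineModule; the layer sizes, stage count, and config values are placeholder assumptions, not the configuration behind the trillion-parameter claim.

```python
# Sketch: the pipeline axis of 3D parallelism with DeepSpeed.
# Layer sizes, stage count, and config values are illustrative.
# Requires a multi-GPU launch, e.g. `deepspeed --num_gpus=2 pipe.py`.
import torch
import deepspeed
from deepspeed.pipe import PipelineModule

# A toy stack of layers; PipelineModule splits this list across
# num_stages pipeline stages, one group of layers per stage.
layers = [torch.nn.Linear(1024, 1024) for _ in range(8)]
model = PipelineModule(
    layers=layers,
    num_stages=2,
    loss_fn=torch.nn.functional.mse_loss,
)

ds_config = {
    "train_batch_size": 64,
    "train_micro_batch_size_per_gpu": 8,  # micro-batches keep both stages busy
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    # Pipeline parallelism composes with ZeRO stage 1 data parallelism;
    # tensor (model) parallelism inside layers would be the third axis.
    "zero_optimization": {"stage": 1},
}

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=[p for p in model.parameters() if p.requires_grad],
    config=ds_config,
)

def data_iter():
    # Placeholder iterator yielding (input, label) pairs.
    while True:
        yield torch.randn(8, 1024), torch.randn(8, 1024)

# train_batch schedules micro-batches through the pipeline stages,
# overlapping forward and backward work across GPUs.
loss = engine.train_batch(data_iter=data_iter())
```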
XGLUE: Expanding cross-lingual understanding and generation with tasks from real-world scenarios
| Nan Duan, Yaobo Liang, and Daniel Campos
What we can teach a model to do with natural language is dictated by the availability of data. Currently, we have a lot of labeled data for very few languages, making it difficult to train models to accomplish question answering…
ZeRO-2 & DeepSpeed: Shattering barriers of deep learning speed & scale
| DeepSpeed Team, Rangan Majumder, and Junhua Wang
In February, we announced DeepSpeed, an open-source deep learning training optimization library, and ZeRO (Zero Redundancy Optimizer), a novel memory optimization technology in the library, which vastly advances large model training by improving scale, speed, cost, and usability. DeepSpeed has…
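Concretely, ZeRO-2 ("stage 2") removes redundancy by partitioning optimizer states and gradients across data-parallel GPUs rather than replicating them on every one. In DeepSpeed this is selected through the zero_optimization block of the config; a hedged sketch follows, with the surrounding values chosen purely for illustration.

```python
# Sketch: a DeepSpeed config enabling ZeRO stage 2.
# Surrounding values are placeholders, not tuned recommendations.
ds_config = {
    "train_batch_size": 64,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,                    # partition optimizer states + gradients
        "overlap_comm": True,          # overlap gradient reduction with backprop
        "contiguous_gradients": True,  # reduce memory fragmentation
    },
}
```

Passed to deepspeed.initialize as in the sketch above, this config shards the Adam moments and gradients across workers, so per-GPU memory falls as the data-parallel degree grows.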
In the news | The AI Blog
Microsoft announces new supercomputer, lays out vision for future AI work
Microsoft has built one of the top five publicly disclosed supercomputers in the world, making new infrastructure available in Azure to train extremely large artificial intelligence models, the company is announcing at its Build developers conference.
Objects are the secret key to revealing the world between vision and language
| Chunyuan Li, Lei Zhang, and Jianfeng Gao
Humans perceive the world through many channels, such as images viewed by the eyes or voices heard by the ears. Though any individual channel might be incomplete or noisy, humans can naturally align and fuse the information collected from multiple…