Within the media and entertainment industry, gaming reaches three billion people worldwide, drives multi-billion-dollar revenue streams, and is on track to become the primary source of entertainment across the world. Those in the industry also realize that advances in the game technology ecosystem provide core new functionality that applies to a broad range of challenges and opportunities in media and entertainment. Gaming has driven critical scientific advances in modelling and simulation for rendering interactive 3D worlds, in developing AI agents that power intelligent characters, and in creating new human-computer interface techniques for real-time engagement.
Gaming technologies intersect deeply with AI, and simulations are a powerful, cost-effective way to generate data for training and testing AI techniques and models. We focus on research in this space because we recognize that these technologies are broadly relevant and applicable, empowering competitive innovation for our customers and partners in many industries.
A particularly intriguing opportunity lies in techniques that use physics simulation and AI as complementary capabilities. Our researchers working on Project Paidia are using reinforcement learning to train AI agents faster-than-real-time within physically simulated game worlds. In another effort, Project Triton seeks to enable detailed and automated immersive audio using specialized wave physics algorithms running in the cloud, with ongoing research on neural net acceleration.
The escalating expectations of viewers, users, and players for immersive and realistic experiences are driving massive increases in the complexity and cost of content creation. Character animation, voice-over, prop placement, world building, audio effects, and AI navigation setup are all examples of interdependent, iterative production processes that are growing more rapidly than human-driven production teams can scale. Our researchers and engineers see an opportunity for scalable immersive content production to empower creators across games, film, and TV with powerful virtual content generation and rendering tools. These tools will leverage research into new physics+AI techniques designed for the cloud, yielding instant WYSIWYG feedback and differentiated immersive experiences.
We have a range of research efforts to pursue this vision:
AI for game testing aims to use AI agents to test game environments extensively via parallel cloud compute, with the potential to drastically reduce game testing (QA) times and the time to find bugs, which together form a significant portion of a typical game’s development cost today.
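To make the idea concrete, here is a minimal sketch (not the project's actual system) of AI-driven QA: many lightweight random-policy agents explore a toy level in parallel worker processes and report any state that violates an invariant. The grid world, the deliberately planted "teleport bug," and all function names are hypothetical illustrations.

```python
import random
from concurrent.futures import ProcessPoolExecutor

GRID = 10
TELEPORT_BUG = (3, 7)   # planted level-design bug: this tile warps the player off-map

def step(pos, action):
    """Apply a movement action; the teleport tile contains a deliberate bug."""
    dx, dy = [(0, 1), (0, -1), (1, 0), (-1, 0)][action]
    x = min(max(pos[0] + dx, 0), GRID - 1)
    y = min(max(pos[1] + dy, 0), GRID - 1)
    if (x, y) == TELEPORT_BUG:
        return (x + GRID, y)        # bug: player ends up outside the level
    return (x, y)

def in_bounds(pos):
    """The invariant a QA agent checks after every step."""
    return 0 <= pos[0] < GRID and 0 <= pos[1] < GRID

def run_agent(seed, steps=300):
    """One random-policy test agent; returns positions that broke the invariant."""
    rng = random.Random(seed)
    pos, bugs = (0, 0), []
    for _ in range(steps):
        pos = step(pos, rng.randrange(4))
        if not in_bounds(pos):
            bugs.append(pos)
            pos = (0, 0)            # respawn and keep testing
    return bugs

def parallel_test(n_agents=32):
    """Fan agents out across worker processes, as a cloud fleet would."""
    with ProcessPoolExecutor() as pool:
        reports = pool.map(run_agent, range(n_agents))
    return [bug for report in reports for bug in report]
```

The point of the sketch is the economics: each agent is cheap, agents are embarrassingly parallel, and bug discovery rate scales with the number of machines rented rather than the number of human testers hired.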
Project Mishtu unlocks last-mile media content delivery for low-connectivity regions in emerging markets. In a recent successful pilot, a major media platform in India uploaded its content library to Azure, which was processed and downloaded to secure hubs that can provide content through local retailers to consumers lacking sufficient internet connectivity.
Project Paidia: reinforcement learning for game AI involves a collaboration with multiple Xbox game studios to examine modern reinforcement learning methods for building AI agents that can compete and collaborate intuitively with humans, potentially providing a major leap in the realism of game experiences. This would be a notable advance over today’s non-player characters, most of which rely on rudimentary AI built from hand-coded state machines.
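The contrast with hand-coded state machines is that a reinforcement learning agent acquires its behaviour from reward alone. As a toy illustration (in no way Paidia's actual method, which targets far richer game worlds), the tabular Q-learning sketch below learns to walk a six-state corridor to a goal; the environment, hyperparameters, and names are all hypothetical.

```python
import random

N_STATES, GOAL = 6, 5

def step(state, action):
    """Corridor environment; action 0 = left, 1 = right, reward 1 at the goal."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action else -1)))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def train(episodes=400, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning: no behaviour is hand-coded, only the reward."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES - 1)          # random start for state coverage
        for _ in range(100):                     # cap episode length
            explore = rng.random() < eps
            a = rng.randrange(2) if explore else (1 if Q[s][1] >= Q[s][0] else 0)
            s2, r, done = step(s, a)
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])  # temporal-difference update
            s = s2
            if done:
                break
    return Q
```

After training, the greedy policy (pick the higher-valued action in each state) walks straight to the goal; replacing the corridor with a simulated game world and the table with a neural network is, loosely, the jump from this sketch toward agents that play alongside humans.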
Project Triton: immersive sound propagation simulates how audio propagates through virtual environments, reducing production costs by replacing manual markup while giving gamers richer detail through immersive audio cues. It has shipped in multiple flagship games (Gears of War, Sea of Thieves, Call of Duty) and is in active use by the HoloLens team. The technology is evolving to scale to massive and dynamic worlds.
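Triton's solvers work on full 3D game geometry in the cloud; as a drastically simplified stand-in, the sketch below runs a one-dimensional finite-difference time-domain (FDTD) wave simulation, the same family of wave-physics algorithm. The grid size, impulse source, and boundary treatment are illustrative assumptions only.

```python
# Toy 1D FDTD wave solver: an impulse (think "gunshot") propagates outward
# at a finite speed, reflecting off rigid boundaries -- the wave behaviour
# that precomputed markup-free audio cues are derived from.
N, C = 200, 0.5              # grid cells; Courant number (must be <= 1 for stability)

def simulate(steps, source_cell=50):
    curr = [0.0] * N
    curr[source_cell] = 1.0  # initial pressure impulse at the source
    prev = curr[:]           # zero initial velocity
    for _ in range(steps):
        nxt = [0.0] * N      # cells 0 and N-1 stay 0: rigid boundaries
        for i in range(1, N - 1):
            # leapfrog update of the discrete wave equation
            nxt[i] = (2 * curr[i] - prev[i]
                      + C * C * (curr[i + 1] - 2 * curr[i] + curr[i - 1]))
        prev, curr = curr, nxt
    return curr
```

Because information moves at most one cell per time step, sound arrives at distant listeners with a physically correct delay and attenuation; in 3D this is what yields occlusion, reverb, and portaling effects automatically, without an audio designer marking up the level by hand.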
Synthetic data generation for training speech AI explores synthetic data for acoustics. Cloud wave simulation was used to generate numerous acoustic responses in synthetic conference rooms, data that would be very expensive to capture in the real world. These responses were used to augment the training dataset for the background noise suppression feature recently shipped in Microsoft Teams.
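The augmentation step itself is simple once the simulated responses exist: convolving a dry recording with a room impulse response makes the recording sound as if it were captured in that room. The sketch below illustrates the idea with a toy exponential-decay response in place of a real wave-simulated one; all names and parameters here are hypothetical.

```python
import math
import random

def synthetic_rir(length=64, decay=0.05, seed=0):
    """Toy room impulse response: exponentially decaying noise tail.
    (In the research described above, responses come from cloud wave simulation.)"""
    rng = random.Random(seed)
    return [math.exp(-decay * n) * rng.uniform(-1, 1) for n in range(length)]

def convolve(signal, rir):
    """Apply the room's acoustics to a dry signal (direct convolution)."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out

def augment(clean_clips, rirs):
    """Cross every clean clip with every simulated room -> a larger, more
    acoustically diverse training set at near-zero recording cost."""
    return [convolve(clip, rir) for clip in clean_clips for rir in rirs]
```

One recorded clip crossed with a thousand simulated rooms yields a thousand training examples, which is the economic argument for simulation over physically measuring responses room by room.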
Watch For: live video analytics is a scalable media-analysis platform built on Azure that operates production services for Bing, MSN, Mixer, and Xbox, analyzing large volumes of images, videos, and livestreams reaching tens of millions of users (for example: Mixer HypeZones, Bing’s Live Stream Search, MSN Esports Hub).
Read more in the Research Collection – Shall we play a game?