First TextWorld Problems—Microsoft Research Montreal’s latest AI competition is really cooking


By , Program Manager , Principal Researcher

TextWorld at NeurIPS 2018

This week, Microsoft Research threw down the gauntlet with the launch of a competition challenging researchers around the world to develop AI agents that can solve text-based games. Conceived by the Machine Reading Comprehension team at Microsoft Research Montreal, the competition, First TextWorld Problems: A Reinforcement and Language Learning Challenge, runs from December 8, 2018 through May 31, 2019.

First TextWorld Problems is built on the TextWorld framework. TextWorld was released to the public in July 2018 at aka.ms/textworld. TextWorld is an extensible, sandbox learning environment for reinforcement learning in text-based games. Beyond game simulation, it has the capacity to generate games stochastically from a user-specified distribution. Such a distribution of games opens new possibilities for the study of generalization and continual or meta-learning in a reinforcement learning setting, by enabling researchers to train and test agents on distinct but related games. TextWorld’s generator gives fine control over game parameters like the size of the game world, the branching factor and length of quests, the density of rewards, and the stochasticity of transitions. Game vocabulary can also be controlled; this directly affects the action and observation spaces. Researchers can also use TextWorld to handcraft games that test for specific knowledge and skills.
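To make the idea of training and testing on distinct but related games concrete, here is a minimal sketch in plain Python. It does not use TextWorld itself: the `GameSpec` class, its field names, and the helper functions are hypothetical stand-ins for whatever options the real generator exposes. The sketch only shows the general pattern of drawing game parameters from a user-specified distribution and holding some of the resulting games out for evaluation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GameSpec:
    """Hypothetical game parameters; not TextWorld's actual option names."""
    world_size: int    # number of rooms in the game world
    quest_length: int  # number of commands needed to finish the quest
    seed: int          # per-game seed so each game can be regenerated


def sample_specs(n, rng, world_sizes=(3, 6, 9), max_quest_length=8):
    """Draw n distinct game specs from a simple user-specified distribution."""
    specs, seen = [], set()
    while len(specs) < n:
        spec = GameSpec(
            world_size=rng.choice(world_sizes),
            quest_length=rng.randint(1, max_quest_length),
            seed=rng.randrange(10**6),
        )
        if spec not in seen:  # skip exact duplicates
            seen.add(spec)
            specs.append(spec)
    return specs


def split_train_test(specs, test_fraction=0.2):
    """Hold out the first part of the list as unseen test games."""
    n_test = max(1, int(len(specs) * test_fraction))
    return specs[n_test:], specs[:n_test]


rng = random.Random(0)  # fixed seed so the split is reproducible
specs = sample_specs(50, rng)
train_specs, test_specs = split_train_test(specs)
```

Because every game comes from the same distribution but no spec appears in both lists, an agent evaluated on `test_specs` has to generalize rather than replay a memorized solution.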


The theme for First TextWorld Problems is gathering ingredients to cook a recipe. Agents must determine the necessary ingredients from a recipe book, explore the house to gather ingredients, and return to the kitchen to cook up a delicious meal. Additionally, agents will need to use tools like knives and frying pans. Locked doors and other obstacles along the way must be overcome. The necessary ingredients and their locations change from game to game, as does the layout of the house itself; agents cannot simply memorize a procedure in order to succeed.

Hang on … did someone change the floorplan in this house? Example house layouts generated by TextWorld.


While a simple cooking task may seem quotidian by human standards, it is still very difficult for AI. Observations and actions are all text-based (see the example below), so a successful agent must learn to understand and manipulate its environment through language, as well as to ground its language in the environmental dynamics. It must also deal with classic, open reinforcement learning problems like partial observability and sparse rewards.

An example of a text-based cooking game whipped up in the TextWorld framework kitchen.
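The text-only interaction described above can be illustrated with a toy example. The following sketch is a hypothetical two-room game written from scratch, not a game generated by TextWorld, and the `ToyCookingGame` class and its commands are assumptions made for illustration. It shows the three difficulties named in the text: observations and actions are plain strings, the agent only sees the room it is in, and reward arrives only when the meal is cooked.

```python
class ToyCookingGame:
    """A hypothetical two-room text game; not generated by TextWorld."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Start a new episode and return the first text observation."""
        self.room = "kitchen"
        self.inventory = set()
        self.done = False
        return self._describe()

    def _describe(self):
        # Partial observability: only the current room is described.
        if self.room == "kitchen":
            return "You are in the kitchen. There is a stove here. A door leads east."
        carrot = "" if "carrot" in self.inventory else " A carrot lies on a shelf."
        return "You are in the pantry." + carrot + " A door leads west."

    def step(self, command):
        """Apply a text command and return (observation, reward, done)."""
        reward = 0
        if command == "go east" and self.room == "kitchen":
            self.room = "pantry"
            obs = self._describe()
        elif command == "go west" and self.room == "pantry":
            self.room = "kitchen"
            obs = self._describe()
        elif (command == "take carrot" and self.room == "pantry"
              and "carrot" not in self.inventory):
            self.inventory.add("carrot")
            obs = "You take the carrot."
        elif (command == "cook carrot" and self.room == "kitchen"
              and "carrot" in self.inventory):
            # Sparse reward: the only non-zero reward comes at the very end.
            reward, self.done = 1, True
            obs = "You cook the carrot. The meal is ready!"
        else:
            obs = "Nothing happens."
        return obs, reward, self.done


game = ToyCookingGame()
observation = game.reset()
total_reward = 0
for command in ["go east", "take carrot", "go west", "cook carrot"]:
    observation, reward, done = game.step(command)
    total_reward += reward
```

Here the winning command sequence is hard-coded. A learning agent would have to discover it from the text alone, and in the competition the layout and ingredients change from game to game.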

We hope this competition fosters research into generalization across tasks, meta-learning, zero-shot language understanding, common-sense reasoning, efficient exploration, and effective handling of combinatorial action spaces. The winning team will be awarded a prize of $2,000 USD and an exclusive one-hour discussion session with a Microsoft Research researcher, and will be featured in a Microsoft Research blog post and in an accompanying article in the Microsoft Research Newsletter. (Some restrictions apply; please check the competition rules and regulations for details.)

Did we pique your interest? We encourage everyone to put their reinforcement learning prowess, and culinary talents, to the test in First TextWorld Problems. Go to aka.ms/textworld-challenge and sign up today!
