TextWorld: A learning environment for training reinforcement learning agents, inspired by text-based games

Published

By , Program Manager , Principal Researcher , Editorial

Today, fresh out of the Microsoft Research Montreal lab, comes an open-source project called TextWorld. TextWorld is an extensible Python framework for generating text-based games. Reinforcement learning researchers can use TextWorld to train and test AI agents in skills such as language understanding, affordance extraction, memory and planning, exploration and more. Researchers can study these in the context of generalization and transfer learning. TextWorld further runs existing text-based games, like the legendary Zork, for evaluating how well AI agents perform in complex, human-designed settings.

Figure 1 – Enter the world of TextWorld. Get the code at aka.ms/textworld.

Text-based games – also known as interactive fiction or adventure games – are games in which the play environment and the player’s interactions with it are represented solely or primarily via text. As players move through the game world, they observe textual descriptions of their surroundings (typically divided into discrete ‘rooms’), what objects are nearby, and any other pertinent information. Players issue text commands to an interpreter to manipulate objects, other characters in the game, or themselves. After each command, the game usually provides some feedback to inform players how that command altered the game environment, if at all. A typical text-based game poses a series of puzzles to solve, treasures to collect, and locations to reach. Goals and waypoints may be specified explicitly or may have to be inferred from cues.
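To make the observe-command-feedback cycle concrete, here is a minimal sketch of a toy text game in Python. It is not TextWorld code: the room names, the `ToyGame` class, and the three supported commands (`look`, `go`, `take`) are all made up for illustration.

```python
# Toy illustration only; none of these names come from TextWorld.
ROOMS = {
    "kitchen": {
        "description": "You are in a kitchen. A door leads east.",
        "exits": {"east": "pantry"},
        "items": ["key"],
    },
    "pantry": {
        "description": "You are in a pantry. A door leads west.",
        "exits": {"west": "kitchen"},
        "items": [],
    },
}


class ToyGame:
    """A tiny interpreter: text command in, text feedback out."""

    def __init__(self):
        self.location = "kitchen"
        self.inventory = []
        # Copy the item lists so every game starts from the same world.
        self.items = {name: list(room["items"]) for name, room in ROOMS.items()}

    def look(self):
        """Describe the current room and any objects in it."""
        text = ROOMS[self.location]["description"]
        items = self.items[self.location]
        if items:
            text += " You see: " + ", ".join(items) + "."
        return text

    def step(self, command):
        """Apply one text command and return the game's feedback."""
        words = command.lower().split()
        if words == ["look"]:
            return self.look()
        if len(words) == 2 and words[0] == "go":
            target = ROOMS[self.location]["exits"].get(words[1])
            if target is None:
                return "You can't go that way."
            self.location = target
            return self.look()
        if len(words) == 2 and words[0] == "take":
            if words[1] in self.items[self.location]:
                self.items[self.location].remove(words[1])
                self.inventory.append(words[1])
                return f"You take the {words[1]}."
            return f"There is no {words[1]} here."
        # Anything outside the small command set is rejected.
        return "I don't understand that."
```

A player or agent would call `game.step("take key")` and then `game.step("go east")`, reading the returned text after each command to decide what to try next.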

Figure 2 – An example game from TextWorld with a house-based theme.

Text-based games couple the freedom to explore a defined space with the restrictions of a parser and game world designed to respond positively to a relatively small set of textual commands. An agent that can competently navigate a text-based game must not only generate coherent textual commands but also generate the right commands in the right order, with few or no mistakes in between. Text-based games encourage experimentation, and successful playthroughs involve multiple game losses and in-game “deaths.” Close observation and creative interpretation of the text the game provides, along with a generous supply of common sense, are also integral to winning text-based games. The relatively simple obstacles present in a TextWorld game serve as an introduction to the basic real-life challenges posed by text-based games. In TextWorld, an agent needs to learn how to observe, experiment, fail and learn from failure.

TextWorld has two main components: a game generator and a game engine. The game generator converts high-level game specifications, such as number of rooms, number of objects, game length, and winning conditions, into executable game source code in the Inform 7 language. The game engine is a simple inference machine that ensures that each step of the generated game is valid by using simple algorithms such as one-step forward and backward chaining.
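As a rough illustration of what one-step forward chaining can look like, the sketch below represents a game state as a set of facts and each action as a rule with preconditions and effects. This is not TextWorld's actual engine: the `Rule` type, the fact strings, and the helper functions are hypothetical, and only the forward direction is shown.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """A hypothetical action rule: facts required, facts added, facts removed."""
    name: str
    preconditions: FrozenSet[str]
    added: FrozenSet[str]
    removed: FrozenSet[str]


def applicable_rules(state: FrozenSet[str], rules: List[Rule]) -> List[Rule]:
    """One forward-chaining step: keep rules whose preconditions all hold."""
    return [rule for rule in rules if rule.preconditions <= state]


def apply_rule(state: FrozenSet[str], rule: Rule) -> FrozenSet[str]:
    """Return the successor state, refusing rules that are not applicable."""
    if not rule.preconditions <= state:
        raise ValueError(f"'{rule.name}' is not valid in this state")
    return (state - rule.removed) | rule.added


def is_valid_walkthrough(state: FrozenSet[str], rules: List[Rule],
                         commands: List[str]) -> bool:
    """Check that every command in a sequence is valid when it is issued."""
    for command in commands:
        matches = [r for r in applicable_rules(state, rules) if r.name == command]
        if not matches:
            return False
        state = apply_rule(state, matches[0])
    return True


# Made-up example world: a key in the kitchen and a locked door.
RULES = [
    Rule("take key",
         frozenset({"at(player, kitchen)", "at(key, kitchen)"}),
         frozenset({"in(key, inventory)"}),
         frozenset({"at(key, kitchen)"})),
    Rule("unlock door",
         frozenset({"in(key, inventory)", "locked(door)"}),
         frozenset({"unlocked(door)"}),
         frozenset({"locked(door)"})),
]

INITIAL = frozenset({"at(player, kitchen)", "at(key, kitchen)", "locked(door)"})
```

Under these assumptions, `is_valid_walkthrough(INITIAL, RULES, ["take key", "unlock door"])` succeeds, while trying `"unlock door"` first fails because the key is not yet in the inventory. Backward chaining would run the same rules in the other direction, starting from a goal fact and asking which rules could have produced it.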

Figure 3 – An overview of the TextWorld architecture.

“One reason I’m excited about TextWorld is the way it combines reinforcement learning with natural language,” said Geoff Gordon, Principal Research Manager at Microsoft Research Montreal. “These two technologies are both really important, but they don’t fit together that well yet. TextWorld will push researchers to make them work in combination.” Gordon pointed out that reinforcement learning has had a number of high-profile successes recently (like Go or Ms. Pac-Man), but in all of these cases the agent has fairly simple observations and actions (for example, screen images and joystick positions in Ms. Pac-Man). In TextWorld, the agent has to both read and produce natural language, which has an entirely different and, in many cases, more complicated structure.

“I’m excited to see how researchers deal with this added complexity,” said Gordon.

Microsoft Research Montreal specializes in state-of-the-art research in machine reading comprehension, dialogue, reinforcement learning, and FATE (Fairness, Accountability, Transparency, and Ethics in AI). The lab was founded in 2015 as Maluuba and acquired by Microsoft in 2017. For more information, check out Microsoft Research Montreal.

This release of TextWorld is a beta and we are encouraging as much feedback as possible on the framework from fellow researchers across the world. You can send your feedback and questions to [email protected]. Also, for more information and to get the code, check out TextWorld, and our related publications TextWorld: A Learning Environment for Text-based Games and Counting to Explore and Generalize in Text-based Games. Thank you!
