MindAgent: Emergent Gaming Interaction
- Ran Gong,
- Qiuyuan Huang,
- Xiaojian Ma,
- Hoi Vo,
- Zane Durante,
- Yusuke Noda,
- Zilong Zheng,
- Song-Chun Zhu,
- Demetri Terzopoulos,
- Fei-Fei Li,
- Jianfeng Gao
Proceedings of NAACL 2024
The model and infrastructure were developed in collaboration with Microsoft Gaming US.
Large Language Models (LLMs) can perform complex scheduling in a multi-agent system and can coordinate agents to complete sophisticated tasks that require extensive collaboration. However, despite the introduction of numerous gaming frameworks, the community lacks adequate benchmarks for building general multi-agent collaboration infrastructure that encompasses both LLM and human-NPC communication. In this work, we propose a novel infrastructure, MindAgent, for evaluating emergent planning and coordination capabilities in gaming interaction. In particular, our infrastructure leverages an existing gaming framework to (i) require understanding of the coordinator for a multi-agent system, (ii) collaborate with human players via un-finetuned proper instructions, and (iii) establish in-context learning on few-shot prompts with feedback. Furthermore, we introduce CuisineWorld, a new gaming scenario and associated benchmark that measures multi-agent collaboration efficiency and supervises multiple agents playing the game simultaneously. We conduct comprehensive evaluations with CoS, a new automatic metric for quantifying collaboration efficiency. Finally, our infrastructure can be deployed in real-world gaming scenarios: a customized VR version of CuisineWorld and the broader, existing "Minecraft" gaming domain. We hope our findings on LLMs and our new infrastructure for general-purpose scheduling and coordination shed light on how such skills can be acquired by learning from large text corpora.
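A minimal sketch of the dispatcher-style loop described above, in which an LLM coordinator is prompted with game rules, few-shot demonstrations, and feedback from earlier turns, and emits one action per agent. All names here (`query_llm`, the prompt layout, the action grammar) are illustrative assumptions, not the released MindAgent API:

```python
def query_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (e.g., a chat-completion request)."""
    return "agent_0: goto(storage); agent_1: cook(stove_0)"  # dummy dispatch


def dispatch_step(rules: str, few_shot: str, state: str, history: list[str]) -> str:
    """One coordinator turn: build the few-shot prompt and dispatch actions."""
    prompt = "\n\n".join([rules, few_shot, *history,
                          f"Current state: {state}",
                          "Assign exactly one action to every agent:"])
    plan = query_llm(prompt)
    # The game engine would execute `plan` and report errors or progress; that
    # feedback is appended so the next prompt can correct earlier mistakes,
    # giving the in-context-learning-with-feedback loop the abstract mentions.
    history.append(f"Dispatch: {plan}")
    return plan
```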
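The CoS metric is described as an automatic measure of collaboration efficiency; a plausible reading is an average of per-interval task-completion rates. The sketch below implements that reading; the function name and exact normalization are our assumptions rather than the paper's reference implementation:

```python
def collaboration_score(completed: list[int], issued: list[int]) -> float:
    """Average task-completion rate across task-issue intervals (CoS-style).

    completed[i] and issued[i] are task counts for the i-th interval.
    """
    rates = [c / t for c, t in zip(completed, issued) if t > 0]
    return sum(rates) / len(rates) if rates else 0.0


# Example: three intervals where the agents completed 4/5, 3/6, and 2/4 tasks.
print(collaboration_score([4, 3, 2], [5, 6, 4]))  # 0.6
```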