Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate the Future
- Ritchie Lee,
- David Wolpert,
- Scott Backhaus,
- Russell Bent,
- Brendan Tracey
Chapter 4 in Decision Making and Imperfection
Published by Springer | 2013, Vol. 474
ISBN: 978-3-642-36406-8
This chapter introduces a novel framework for modeling interacting humans in a multi-stage game. This "iterated semi network-form game" framework has the following desirable characteristics: (1) bounded rational players, (2) strategic players (i.e., players account for one another's reward functions when predicting one another's behavior), and (3) computational tractability even on real-world systems. We achieve these benefits by combining concepts from game theory and reinforcement learning. To be precise, we extend the bounded rational "level-K reasoning" model to apply to games over multiple stages. Our extension allows the decomposition of the overall modeling problem into a series of smaller ones, each of which can be solved by standard reinforcement learning algorithms. We call this hybrid approach "level-K reinforcement learning". We investigate these ideas in a cyber battle scenario over a smart power grid and discuss the relationship between the behavior predicted by our model and what one might expect of real human defenders and attackers.
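Purely to illustrate the decomposition described in the abstract, the minimal sketch below chains level-K best responses via tabular Q-learning on a toy two-player matrix game. Every name and number in it (SimpleGame, level0_policy, train_level_k, the payoff table, the learning parameters) is a hypothetical stand-in rather than the chapter's construction, which studies a much richer smart-grid cyber-battle scenario: level 0 acts non-strategically, and each level k is trained against the frozen level-(k-1) policy, so each level is one standard reinforcement learning problem.

```python
# Hypothetical sketch of level-K reinforcement learning on a tiny two-player
# one-shot matrix game. All names and payoffs are illustrative assumptions,
# not taken from the chapter.
import random
from collections import defaultdict

class SimpleGame:
    """Two-player game with a small joint-action payoff table."""
    ACTIONS = [0, 1]
    # PAYOFFS[(a0, a1)] = (reward to player 0, reward to player 1)
    PAYOFFS = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}

    def step(self, a0, a1):
        return self.PAYOFFS[(a0, a1)]

def level0_policy(_state=None):
    """Level-0 players are non-strategic; here they act uniformly at random."""
    return random.choice(SimpleGame.ACTIONS)

def train_level_k(game, opponent_policy, episodes=5000, eps=0.1, alpha=0.1):
    """Learn a best response to a frozen level-(k-1) opponent with Q-learning."""
    q = defaultdict(float)
    for _ in range(episodes):
        a_opp = opponent_policy()
        # epsilon-greedy action selection for the learner (player 0)
        if random.random() < eps:
            a = random.choice(game.ACTIONS)
        else:
            a = max(game.ACTIONS, key=lambda x: q[x])
        r, _ = game.step(a, a_opp)
        q[a] += alpha * (r - q[a])  # one-shot interaction, so no bootstrap term
    best = max(game.ACTIONS, key=lambda x: q[x])
    return lambda _state=None: best  # frozen level-k policy

game = SimpleGame()
policy = level0_policy
for k in range(1, 3):               # build level-1 and level-2 policies in turn
    policy = train_level_k(game, policy)
print("level-2 action:", policy())
```

In the chapter's multi-stage setting, the same outer loop would presumably run over full game trajectories rather than a one-shot payoff table, with each player's level-k policy trained by a standard reinforcement learning algorithm while all other players are held fixed at level k-1.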
© Springer-Verlag Berlin Heidelberg 2013