Scalable Methods and Expressive Models for Planning Under Uncertainty
The ability to plan in the presence of uncertainty about the effects of one's own actions and about events in the environment is a core skill of a truly intelligent agent. This type of sequential decision-making has been modeled by Markov Decision Processes (MDPs), a framework known since at least the 1950s [45, 3]. The importance of MDPs is not merely philosophical: they have been applied to several impactful real-world scenarios, from inventory management to military operations planning [80, 1]. Nonetheless, the adoption of MDPs in practice is greatly hampered by two factors. First, modern algorithms for solving them are still not scalable enough to handle many realistically sized problems. Second, the MDP classes we know how to solve tend to be restrictive, often failing to model significant aspects of the planning task at hand. As a result, many probabilistic scenarios fall outside the scope of these classes.
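For concreteness, it may help to recall the standard formalism; the notation below follows a common textbook convention and is not necessarily the one adopted in later chapters. An MDP is a tuple of a state set, an action set, a transition function, and a reward function, and solving it means finding a policy that maximizes expected cumulative reward, for instance under the discounted criterion:
\[
M = \langle S, A, \mathcal{T}, \mathcal{R} \rangle,
\qquad
\mathcal{T}(s, a, s') = P\bigl(s_{t+1} = s' \mid s_t = s,\ a_t = a\bigr),
\]
\[
\pi^{*} = \operatorname*{arg\,max}_{\pi}\ \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, \mathcal{R}\bigl(s_t, \pi(s_t)\bigr)\right],
\qquad 0 \le \gamma < 1,
\]
where $S$ is the set of states, $A$ the set of actions, $\mathcal{T}$ the transition probability function, $\mathcal{R}$ the reward function, and $\gamma$ a discount factor.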
The research presented in this dissertation addresses both of these challenges. Its first contribution is a family of highly scalable approximation algorithms for existing MDP classes that combine two major planning paradigms: dimensionality reduction and deterministic relaxation. These approaches automatically extract human-understandable causal structure from an MDP and use this structure to compute a good MDP policy efficiently. Besides enabling us to handle larger planning scenarios, they bring us closer to the ideal of AI: building agents that autonomously recognize the features important for solving a problem. While these techniques apply only to goal-oriented scenarios, this dissertation also introduces approximation algorithms for reward-oriented settings.