Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty

INFORMS International Conference on Service Science

This paper formulates partially observable Markov decision processes in which the state-transition probabilities and measurement-outcome probabilities are characterized by unknown parameters. It proposes an information-theoretic solution method that adaptively manages the resulting exploitation-exploration trade-off. Numerical experiments on response-guided dosing in healthcare are presented.
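
To make the problem setting concrete, the following is a minimal sketch of a Bayesian belief update when both the transition and measurement probabilities depend on an unknown parameter. It is not the paper's algorithm: the two-state toy model, the finite candidate set `THETAS`, and the functions `transition`, `observation`, `update_belief`, and `parameter_posterior` are all hypothetical names and assumptions introduced here for illustration. The sketch only shows how a joint belief over the hidden state and the parameter could be filtered after one action and one measurement; it does not implement the information-theoretic policy selection.

```python
from typing import Dict, Tuple

# Hypothetical toy model (assumed for illustration, not taken from the paper):
# two hidden states and a finite set of candidate parameter values.
STATES = [0, 1]
THETAS = [0.2, 0.8]

Belief = Dict[Tuple[int, float], float]  # joint probability of (state, theta)


def transition(s_next: int, s: int, a: int, theta: float) -> float:
    """P(s_next | s, a; theta). Action 0 keeps the state; action 1 moves
    to state 1 with probability theta, regardless of the current state."""
    if a == 0:
        return 1.0 if s_next == s else 0.0
    return theta if s_next == 1 else 1.0 - theta


def observation(y: int, s_next: int, theta: float) -> float:
    """P(y | s_next; theta). The measurement equals the true state with
    an accuracy that also depends on theta (an assumed functional form)."""
    accuracy = 0.6 + 0.3 * theta
    return accuracy if y == s_next else 1.0 - accuracy


def update_belief(belief: Belief, a: int, y: int) -> Belief:
    """One Bayes-filter step on the joint (state, theta) belief:
    predict with the transition model, then weight by the likelihood of y."""
    unnormalized: Belief = {}
    for s_next in STATES:
        for theta in THETAS:
            predicted = sum(
                transition(s_next, s, a, theta) * belief[(s, theta)]
                for s in STATES
            )
            unnormalized[(s_next, theta)] = observation(y, s_next, theta) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {key: value / total for key, value in unnormalized.items()}


def parameter_posterior(belief: Belief) -> Dict[float, float]:
    """Marginalize the hidden state out of the joint belief."""
    return {theta: sum(belief[(s, theta)] for s in STATES) for theta in THETAS}


# Start in state 0 with a uniform prior over the candidate parameters.
prior: Belief = {(s, th): (0.5 if s == 0 else 0.0) for s in STATES for th in THETAS}
posterior = update_belief(prior, a=1, y=1)
theta_posterior = parameter_posterior(posterior)
```

In this toy setup, taking action 1 and observing `y = 1` shifts the parameter posterior toward the larger candidate value. An exploration-aware method would weigh how much such a measurement is expected to reduce parameter uncertainty against the immediate cost of the action that produces it.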