Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty
- Peeyush Kumar
- Archis Ghate
INFORMS International Conference on Service Science
This paper formulates partially observable Markov decision processes (POMDPs) in which the state-transition probabilities and measurement-outcome probabilities are characterized by unknown parameters. It proposes an information-theoretic solution method that adaptively manages the resulting exploitation-exploration trade-off. Numerical experiments on response-guided dosing in healthcare are presented.
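To make the exploitation-exploration trade-off concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a finite set of candidate parameter values, a single decision step with no hidden state, and an information-ratio rule in the style of information-directed sampling (squared expected regret divided by expected information gain). All names (`posterior`, `info_gain`, `expected_regret`, `select_action`) and the toy numbers are hypothetical and chosen only for illustration.

```python
import math

def posterior(prior, likelihood_col):
    """Bayes rule over a finite set of candidate parameters (assumed setting)."""
    unnorm = [p * l for p, l in zip(prior, likelihood_col)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def info_gain(prior, lik):
    """Mutual information between parameter and observation.

    lik[k][o] = P(observation o | parameter k) for one fixed action.
    """
    gain = entropy(prior)
    for o in range(len(lik[0])):
        col = [lik[k][o] for k in range(len(prior))]
        p_o = sum(p * l for p, l in zip(prior, col))
        if p_o > 0:
            gain -= p_o * entropy(posterior(prior, col))
    return gain

def expected_regret(prior, rewards, a):
    """rewards[a][k] = expected reward of action a under parameter k."""
    return sum(
        p * (max(r[k] for r in rewards) - rewards[a][k])
        for k, p in enumerate(prior)
    )

def select_action(prior, rewards, liks, eps=1e-12):
    """Pick the action with the smallest regret^2 / information-gain ratio."""
    ratios = [
        expected_regret(prior, rewards, a) ** 2 / (info_gain(prior, liks[a]) + eps)
        for a in range(len(rewards))
    ]
    return min(range(len(ratios)), key=ratios.__getitem__)

# Toy example (made-up numbers): two candidate parameters, two actions.
prior = [0.5, 0.5]
rewards = [[0.5, 0.5],   # action 0: moderate reward under either parameter
           [1.0, 0.0]]   # action 1: good under parameter 0, poor under parameter 1
liks = [[[0.9, 0.1], [0.1, 0.9]],   # action 0: observation is informative
        [[0.5, 0.5], [0.5, 0.5]]]   # action 1: observation is uninformative
chosen = select_action(prior, rewards, liks)
```

In this toy setup both actions have the same expected regret, so the rule favors the action whose observation reveals more about the unknown parameter. A full POMDP treatment would also maintain a belief over the hidden state, which this sketch deliberately omits.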