Online POMDP Planning

Importance Sampling for Online Planning Under Uncertainty

Abstract

The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, including, e.g., real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling to DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves DESPOT’s performance when there are critical, but rare events, which are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks. We also present a general method for learning the importance sampling distribution.

Video

References

  • Y.F. Luo, H.Y. Bai, D. Hsu, and W.S. Lee. Importance sampling for online planning under uncertainty. In Algorithmic Foundations of Robotics XII – Proc. Int. Workshop on the Algorithmic Foundations of Robotics (WAFR). 2016.
    BibTeX PDF