Hindsight Trust Region Policy Optimization
H. Zhang, S. Bai, X. Lan, D. Hsu, and N. Zheng. Hindsight trust region policy optimization. In Proc. Int. Jnt. Conf. on Artificial Intelligence, 2021.