Online planning algorithms for POMDPs. Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP exactly is often intractable for all but small problems. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during execution. Online algorithms generally perform a lookahead search from the current belief to find the best action to execute at each time step. Our objectives are to survey the existing online POMDP methods, analyze their properties, and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
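The lookahead search described in the abstract can be sketched as a depth-limited expectimax over the belief tree. The two-state POMDP below (the `T`, `Z`, and `R` matrices) and the `plan`/`belief_update` functions are a hypothetical illustration under assumed dynamics, not code from the survey or any specific surveyed algorithm:

```python
# Hypothetical two-state, two-action, two-observation POMDP, given as
# transition (T), observation (Z), and reward (R) matrices. The numbers
# are illustrative only; they are not taken from the survey.
S, A, O = 2, 2, 2
T = [[[0.9, 0.1], [0.1, 0.9]],      # T[a][s][s']: transition probabilities
     [[0.5, 0.5], [0.5, 0.5]]]
Z = [[[0.85, 0.15], [0.15, 0.85]],  # Z[a][s'][o]: observation probabilities
     [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, -1.0], [-0.1, -0.1]]     # R[a][s]: immediate rewards
GAMMA = 0.95                        # discount factor

def belief_update(b, a, o):
    """Bayes filter: b'(s') is proportional to Z[a][s'][o] * sum_s T[a][s][s'] b(s)."""
    nb = [Z[a][sp][o] * sum(T[a][s][sp] * b[s] for s in range(S))
          for sp in range(S)]
    norm = sum(nb)
    return [x / norm for x in nb] if norm > 0 else list(b)

def plan(b, depth):
    """Depth-limited expectimax over the belief tree.

    Returns (value estimate, best action) for belief b; at depth 0 the
    value is approximated by 0 (a real solver would plug in a bound here).
    """
    if depth == 0:
        return 0.0, None
    best_v, best_a = float("-inf"), None
    for a in range(A):
        # Expected immediate reward of action a under belief b.
        v = sum(R[a][s] * b[s] for s in range(S))
        for o in range(O):
            # Probability of observing o after taking a from belief b.
            po = sum(Z[a][sp][o] * T[a][s][sp] * b[s]
                     for s in range(S) for sp in range(S))
            if po > 1e-12:
                v += GAMMA * po * plan(belief_update(b, a, o), depth - 1)[0]
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a
```

At each step the agent would execute `plan(b, depth)[1]`, receive an observation `o`, and update `b` with `belief_update(b, a, o)`. The online heuristic search methods covered by the survey replace this exhaustive expansion with bounded, best-first exploration of the belief tree.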

References in zbMATH (referenced in 34 articles, 1 standard article)

Showing results 1 to 20 of 34.
Sorted by year (citations)


  1. Pajarinen, Joni; Thai, Hong Linh; Akrour, Riad; Peters, Jan; Neumann, Gerhard: Compatible natural gradient policy search (2019)
  2. Powell, Warren B.: A unified framework for stochastic optimization (2019)
  3. Pajarinen, Joni; Kyrki, Ville: Robotic manipulation of multiple objects as a POMDP (2017)
  4. Zhang, Zongzhang; Fu, Qiming; Zhang, Xiaofang; Liu, Quan: Reasoning and predicting POMDP planning complexity via covering numbers (2016)
  5. Çelik, Melih; Ergun, Özlem; Keskinocak, Pınar: The post-disaster debris clearance problem under incomplete information (2015)
  6. Lauri, Mikko; Ritala, Risto: Planning for multiple measurement channels in a continuous-state POMDP (2013)
  7. Bai, Haoyu; Hsu, David; Lee, Wee Sun; Ngo, Vien A.: Monte Carlo value iteration for continuous-state POMDPs (2011)
  8. Golovin, D.; Krause, A.: Adaptive submodularity: theory and applications in active learning and stochastic optimization (2011)
  9. He, R.; Brunskill, E.; Roy, N.: Efficient planning under uncertainty with macro-actions (2011)
  10. Veness, J.; Ng, K. S.; Hutter, M.; Uther, W.; Silver, D.: A Monte-Carlo AIXI approximation (2011)
  11. Wolf, Travis B.; Kochenderfer, Mykel J.: Aircraft collision avoidance using Monte Carlo real-time belief space search (2011)
  12. Aras, R.; Dutech, A.: An investigation into mathematical programming for finite horizon decentralized POMDPs (2010)
  13. Brunskill, Emma; Kaelbling, Leslie Pack; Lozano-Pérez, Tomás; Roy, Nicholas: Planning in partially-observable switching-mode continuous domains (2010)
  14. Chatterjee, Krishnendu; Doyen, Laurent; Henzinger, Thomas A.: Qualitative analysis of partially-observable Markov decision processes (2010)
  15. Goulionis, J. E.; Stengos, D. I.; Tzavelas, G.: Stationary policies with Markov partition property (2010)
  16. Goulionis, John E.; Stengos, D. J.; Tzavelas, G.: Planning in uncertain multiagent settings for the healthcare management of Parkinson’s disease (2010)
  17. Wierstra, Daan; Förster, Alexander; Peters, Jan; Schmidhuber, Jürgen: Recurrent policy gradients (2010)
  18. Bernstein, D. S.; Amato, C.; Hansen, E. A.; Zilberstein, S.: Policy iteration for decentralized control of Markov decision processes (2009)
  19. Chong, Edwin K. P.; Kreucher, Christopher M.; Hero, Alfred O. III: Partially observable Markov decision process approximations for adaptive sensing (2009)
  20. Doshi, P.; Gmytrasiewicz, P. J.: Monte Carlo sampling methods for approximating interactive POMDPs (2009)


Further publications can be found at: http://www.pomdp.org/pomdp/papers/index.shtml