ApproxRL: A Matlab Toolbox for Approximate RL and DP
This toolbox contains Matlab implementations of a number of approximate reinforcement learning (RL) and dynamic programming (DP) algorithms. Notably, it contains the algorithms used in the numerical examples from the book: L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators, CRC Press, Automation and Control Engineering Series, April 2010, 280 pages, ISBN 978-1439821084.
References in zbMATH (referenced in 10 articles)
- Geramifard, Alborz; Dann, Christoph; Klein, Robert H.; Dabney, William; How, Jonathan P.: RLPy: a value-function-based reinforcement learning framework for education and research (2015)
- Vamvoudakis, Kyriakos G.: Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems (2015)
- Gaggero, Mauro; Gnecco, Giorgio; Sanguineti, Marcello: Approximate dynamic programming for stochastic $N$-stage optimization with application to optimal consumption under uncertainty (2014)
- Laber, Eric B.; Lizotte, Daniel J.; Qian, Min; Pelham, William E.; Murphy, Susan A.: Dynamic treatment regimes: technical challenges and applications (2014)
- Xu, Xin; Zuo, Lei; Huang, Zhenhua: Reinforcement learning algorithms with function approximation: recent advances and applications (2014)
- Fonteneau, Raphael; Murphy, Susan A.; Wehenkel, Louis; Ernst, Damien: Batch mode reinforcement learning based on the synthesis of artificial trajectories (2013)
- Jiang, Zhong-Ping; Jiang, Yu: Robust adaptive dynamic programming for linear and nonlinear systems: an overview (2013)
- Peters, Markus; Ketter, Wolfgang; Saar-Tsechansky, Maytal; Collins, John: A reinforcement learning approach to autonomous decision-making in smart electricity markets (2013)
- Beck, C.L.; Srikant, R.: Error bounds for constant step-size $Q$-learning (2012)
- Xu, Hao; Jagannathan, S.; Lewis, F.L.: Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses (2012)