Data efficiency is a core requirement for artificial intelligence. We investigate agent architectures and methods that implement reinforcement learning from scratch based on the principle of 'Collect and Infer', and we study their application to challenging control and robotic domains. 'Collect and Infer' focuses on two important aspects of agent learning:
Collecting the 'right data' through advanced exploration methods, such as 'Learning by Playing' (Riedmiller et al., ICML 2018)
Effectively inferring knowledge from a database of collected transition data, for example with the 'MPO' algorithm (Maximum a Posteriori Policy Optimisation, Abdolmaleki et al., 2018) or 'ABM', a highly effective batch RL method (Siegel et al., ICLR 2020)
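The separation between the two aspects can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the five-state chain environment, the uniformly random exploration policy, and the tabular batch Q-iteration are stand-ins, not the SAC-X, MPO, or ABM methods cited above. The point is only that the inference step reads from a fixed database of transitions and never interacts with the environment.

```python
import random

# Hypothetical toy task: a chain of states 0..4 with reward only at the right end.
N_STATES = 5
ACTIONS = (-1, +1)   # move left / move right
GAMMA = 0.9


def step(state, action):
    """Toy deterministic chain dynamics (assumed for illustration)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def collect(num_episodes, episode_len, rng):
    """Collect: gather transitions with a uniformly random exploration policy."""
    data = []
    for _ in range(num_episodes):
        state = 0
        for _ in range(episode_len):
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            data.append((state, action, reward, next_state))
            state = next_state
    return data


def infer(data, sweeps=200):
    """Infer: batch Q-iteration over the stored transitions, with no new interaction."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s, a, r, s2 in data:
            q[(s, a)] = r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)}


rng = random.Random(0)
dataset = collect(num_episodes=50, episode_len=20, rng=rng)
q_values = infer(dataset)
policy = greedy_policy(q_values)
```

Because `infer` only consumes `dataset`, the collection strategy can be swapped out without touching the learning step, which is the design choice this sketch is meant to highlight. Provided the random exploration covers every state-action pair, the greedy policy moves right in every state.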
This video is part of a Reinforcement Learning Lecture Series from the University of Alberta, in which I introduce the 'Collect and Infer' perspective on Reinforcement Learning. 'Collect and Infer' is a design principle for data-efficient reinforcement learning agents.
Video showing experiments from the 'Learning by Playing' paper, which introduced Scheduled Auxiliary Control (SAC-X) (Riedmiller et al., ICML 2018)
Learning to play Ball-in-a-Cup from scratch from raw pixels using Scheduled Auxiliary Control (SAC-X) (Schwab et al., RSS 2019)
Our 'Learning by Playing' paper explained in the 'Two Minute Papers' series [read the original paper]
Slides from a keynote talk at EWRL 2018 introducing the 'Collect and Infer' perspective on Reinforcement Learning
Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller: Maximum a Posteriori Policy Optimisation. CoRR abs/1806.06920 (2018)
Martin Riedmiller, Roland Hafner, Thomas Lampe, Michael Neunert, Jonas Degrave, Tom Van de Wiele, Vlad Mnih, Nicolas Heess, Jost Tobias Springenberg: Learning by Playing - Solving Sparse Reward Tasks from Scratch. ICML 2018: 4341-4350
Devin Schwab, Jost Tobias Springenberg, Murilo F. Martins, Thomas Lampe, Michael Neunert, Abbas Abdolmaleki, Tim Hertweck, Roland Hafner, Francesco Nori, Martin A. Riedmiller: Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup. CoRR abs/1902.04706 (2019)