Deconstructing Model-Based Visual Reinforcement Learning

  • Babaeizadeh, Mohammad*; Saffar, Mohammad; Hafner, Danijar; Erhan, Dumitru; Kannan, Harini; Finn, Chelsea; Levine, Sergey
  • Accepted abstract
  • Poster session from 15:00 to 16:00 EAT and from 20:45 to 21:45 EAT


Model-based reinforcement learning (RL) methods have shown strong sample efficiency and performance across a variety of tasks, including when faced with high-dimensional visual observations. These methods learn to predict the environment dynamics and expected reward from interaction and use this predictive model to plan and perform the task. Despite sharing this common framework, model-based RL methods vary in their fundamental design choices, and it is not clear how these design decisions affect performance. We study the existing recipes for successful visual model-based methods, controlling for design choices such as the model architecture and the representation space, and test each recipe on multiple environments. We find that predicting future observations leads to significant generalization benefits compared to only predicting rewards. We also empirically find that observation prediction accuracy, somewhat surprisingly, correlates more strongly with downstream task performance than reward prediction accuracy. Together, these findings suggest a concrete mechanism underlying the widely observed sample efficiency gains from model-based RL.
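
As a rough illustration of the "predict, then plan" loop described above, the following is a minimal sketch of planning with a predictive model by random shooting. It is not the authors' method: the planner, the function names (plan_random_shooting, toy_dynamics, toy_reward), and the toy one-dimensional environment are all assumptions made for illustration, and the two toy functions stand in for models that would normally be learned from interaction.

```python
import numpy as np


def plan_random_shooting(state, dynamics_fn, reward_fn, horizon=10,
                         num_candidates=256, action_dim=1, rng=None):
    """Return the first action of the best-scoring random action sequence.

    dynamics_fn(states, actions) -> predicted next states (batched)
    reward_fn(states)            -> predicted reward for each state (batched)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate action sequences uniformly in [-1, 1].
    actions = rng.uniform(-1.0, 1.0,
                          size=(num_candidates, horizon, action_dim))
    # Roll all candidates forward through the model in parallel.
    states = np.repeat(np.asarray(state, dtype=float)[None, :],
                       num_candidates, axis=0)
    returns = np.zeros(num_candidates)
    for t in range(horizon):
        states = dynamics_fn(states, actions[:, t])
        returns += reward_fn(states)
    # Execute only the first action of the best sequence, then replan.
    return actions[np.argmax(returns), 0]


# Hypothetical stand-ins for learned models: a point on a line that moves
# by 0.1 * action and is rewarded for being close to position 1.0.
def toy_dynamics(states, actions):
    return states + 0.1 * actions


def toy_reward(states):
    return -np.abs(states[:, 0] - 1.0)


action = plan_random_shooting(np.zeros(1), toy_dynamics, toy_reward,
                              rng=np.random.default_rng(0))
```

In this sketch a reward-only model would supply reward_fn directly from past observations and actions, whereas an observation-predicting model also supplies dynamics_fn; the comparison in the abstract concerns that distinction, not this particular planner.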
