Exploration in Approximate Hyper-State Space

  • Zintgraf, Luisa M*; Feng, Leo; Igl, Maximilian; Hartikainen, Kristian; Hofmann, Katja; Whiteson, Shimon
  • Accepted abstract
  • [PDF] [Slides] [Join poster session]
    Poster session from 15:00 to 16:00 EAT and from 20:45 to 21:45 EAT

Abstract

Bayes-optimal agents are those that optimally trade off exploration and exploitation under task uncertainty, i.e., maximise online return while learning. Although computing such policies is intractable for most problems, recent advances in meta-learning and approximate variational inference make it possible to learn approximately Bayes-optimal behaviour for tasks from a given prior distribution. In this paper, we address the problem of exploration during meta-learning, i.e., gathering the data required for an agent to learn how to learn in an initially unknown task. Our approach uses reward bonuses that incentivise the agent to explore in hyper-state space, i.e., the joint state and belief space. On a sparse-reward HalfCheetahDir task we show that our method can learn adaptation strategies where existing meta-learning methods fail.
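To make the reward-bonus idea concrete, here is a minimal sketch of one way such a bonus could be computed; it is an illustration, not the authors' implementation. It assumes an RND-style novelty signal over hyper-states, i.e., the prediction error of a trained network against a fixed random target network, evaluated on the concatenated state and belief. The class name `HyperStateNoveltyBonus`, the network sizes, and the weighting coefficient `beta` are all hypothetical.

```python
import torch
import torch.nn as nn


class HyperStateNoveltyBonus(nn.Module):
    """RND-style exploration bonus over hyper-states (state + belief).

    A fixed, randomly initialised target network maps hyper-states to
    embeddings; a predictor network is trained to match it. Prediction
    error is large for rarely visited hyper-states and shrinks as they
    are visited more often, so the bonus decays with familiarity.
    """

    def __init__(self, state_dim, belief_dim, embed_dim=64, lr=1e-3):
        super().__init__()
        in_dim = state_dim + belief_dim  # hyper-state = (state, belief)
        self.target = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim)
        )
        self.predictor = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim)
        )
        for p in self.target.parameters():  # target network stays fixed
            p.requires_grad_(False)
        self.opt = torch.optim.Adam(self.predictor.parameters(), lr=lr)

    def bonus(self, state, belief):
        """Per-sample novelty bonus for a batch of hyper-states."""
        hyper_state = torch.cat([state, belief], dim=-1)
        with torch.no_grad():
            err = (self.predictor(hyper_state)
                   - self.target(hyper_state)).pow(2).mean(-1)
        return err

    def update(self, state, belief):
        """Train the predictor on visited hyper-states."""
        hyper_state = torch.cat([state, belief], dim=-1)
        loss = (self.predictor(hyper_state)
                - self.target(hyper_state)).pow(2).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()
```

During meta-training, the agent would then be trained on an augmented reward of the form `reward + beta * bonus(state, belief)`, where `belief` is the (approximate) task posterior produced by the meta-learner, so that novelty is measured jointly over environment states and task beliefs rather than states alone.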
