COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration

Loic Matthey, Nicholas Watters, Matko Bosnjak, Christopher P Burgess, Alexander Lerchner

July 2019

Agent solving a clustering task after 100 steps

Abstract

Data efficiency and robustness to task-irrelevant perturbations are long-standing challenges for deep reinforcement learning algorithms. Here we introduce a modular approach to addressing these challenges in a continuous control environment, without using hand-crafted or supervised information. Our Curious Object-Based seaRch Agent (COBRA) uses task-free intrinsically motivated exploration and unsupervised learning to build object-based models of its environment and action space. Subsequently, it can learn a variety of tasks through model-based search in very few steps and excel on structured hold-out tests of policy robustness.
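To make the phrase "model-based search" concrete, the following is a minimal sketch of one simple form it could take: greedy one-step search over randomly sampled candidate actions. It is an illustration only and is not taken from the COBRA implementation. The functions `toy_transition_model` and `toy_reward_model` are hypothetical stand-ins for learned components, and the 2D point state, the goal position, and the action range are all assumptions chosen to keep the example self-contained.

```python
import random


def toy_transition_model(state, action):
    """Hypothetical stand-in for a learned transition model.

    Assumption: the state is a 2D point and the action displaces it.
    """
    return (state[0] + action[0], state[1] + action[1])


def toy_reward_model(state, goal=(0.5, 0.5)):
    """Hypothetical stand-in for a learned reward predictor.

    Assumption: reward is the negative squared distance to a fixed goal.
    """
    return -((state[0] - goal[0]) ** 2 + (state[1] - goal[1]) ** 2)


def greedy_search(state, transition_model, reward_model, rng, num_candidates=64):
    """Sample candidate actions, score each one by the predicted reward of
    the predicted next state, and return the best-scoring action."""
    best_action, best_reward = None, float("-inf")
    for _ in range(num_candidates):
        action = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
        predicted_reward = reward_model(transition_model(state, action))
        if predicted_reward > best_reward:
            best_action, best_reward = action, predicted_reward
    return best_action


rng = random.Random(0)  # fixed seed so the run is reproducible
initial_state = (0.0, 0.0)
final_state = initial_state
for _ in range(20):
    action = greedy_search(final_state, toy_transition_model, toy_reward_model, rng)
    # In this toy setting the "real" environment is the same as the model.
    final_state = toy_transition_model(final_state, action)
```

In this toy setting the search never needs a learned policy: all task-specific behavior comes from the reward predictor, so swapping in a different reward model changes the behavior without changing the transition model.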

Type

Preprint

Publication

arXiv

This project also led to the open-sourcing of the Spriteworld environment/rendering framework, a flexible and configurable Python-based learning environment.

Loic Matthey

Staff Research Scientist in Machine Learning