TempoRL: Learning When to Act

Publication: Contribution in book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

  • Biedenkapp, André
  • Rajan, Raghu
  • Hutter, Frank
  • Lindauer, Marius

External organizations

  • Albert-Ludwigs-Universität Freiburg
  • Bosch Center for Artificial Intelligence (BCAI)

Details

Original language: English
Title of host publication: Proceedings of the international conference on machine learning (ICML)
Number of pages: 18
Publication status: Published electronically (e-pub ahead of print) - 2021
Event: 38th International Conference on Machine Learning (ICML 2021) - Virtual
Duration: 18 July 2021 - 24 July 2021

Abstract

Reinforcement learning is a powerful approach to learn behaviour through interactions with an environment. However, behaviours are usually learned in a purely reactive fashion, where an appropriate action is selected based on an observation. In this form, it is challenging to learn when it is necessary to execute new decisions. This makes learning inefficient, especially in environments that need various degrees of fine and coarse control. To address this, we propose a proactive setting in which the agent not only selects an action in a state but also for how long to commit to that action. Our TempoRL approach introduces skip connections between states and learns a skip-policy for repeating the same action along these skips. We demonstrate the effectiveness of TempoRL on a variety of traditional and deep RL environments, showing that our approach is capable of learning successful policies up to an order of magnitude faster than vanilla Q-learning.
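To make the idea concrete: instead of a single policy that picks a new action at every step, a TempoRL-style agent picks an action and a skip length, repeats the action for that many steps, and learns values for both choices. The following is a minimal tabular sketch of this decomposition, not the authors' implementation (see https://arxiv.org/abs/2106.05262 for the actual algorithm); the toy corridor environment, the hyperparameters, and names such as skip_q are invented for this illustration.

import numpy as np

N_STATES, N_ACTIONS, MAX_SKIP = 20, 2, 8    # corridor length, {left, right}, max repetition
GAMMA, ALPHA, EPS = 0.99, 0.5, 0.1

# Two value functions: one for *which* action, one for *how long* to commit to it.
q = np.zeros((N_STATES, N_ACTIONS))                   # behaviour values Q(s, a)
skip_q = np.zeros((N_STATES, N_ACTIONS, MAX_SKIP))    # skip values Q(s, a, j)

def step(s, a):
    """Toy corridor: move left (a=0) or right (a=1); reward 1 only at the right end."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

rng = np.random.default_rng(0)
for episode in range(500):
    s, done = 0, False
    while not done:
        # Proactive decision: choose an action, then how long to commit to it.
        a = rng.integers(N_ACTIONS) if rng.random() < EPS else int(q[s].argmax())
        j = rng.integers(MAX_SKIP) if rng.random() < EPS else int(skip_q[s, a].argmax())

        # Execute the action j+1 times, recording every transition along the skip.
        s0, transitions, ret = s, [], 0.0
        for k in range(j + 1):
            s2, r, done = step(s, a)
            transitions.append((s, a, r, s2, done))
            ret += (GAMMA ** k) * r
            s = s2
            if done:
                break

        # 1-step updates of the flat Q-function for every transition along the skip.
        for (si, ai, ri, si2, di) in transitions:
            target = ri + (0.0 if di else GAMMA * q[si2].max())
            q[si, ai] += ALPHA * (target - q[si, ai])

        # n-step update of the skip value for the skip length actually executed.
        n = len(transitions)
        boot = 0.0 if done else (GAMMA ** n) * q[s].max()
        skip_q[s0, a, n - 1] += ALPHA * (ret + boot - skip_q[s0, a, n - 1])

Even in this sketch, the skip policy lets the agent cross the long uniform stretch of the corridor with a handful of decisions rather than one per step, which is the source of the speed-up the abstract describes; the paper applies the same decomposition in deep RL settings as well.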

Cite

TempoRL: Learning When to Act. / Biedenkapp, André; Rajan, Raghu; Hutter, Frank et al.
Proceedings of the international conference on machine learning (ICML). 2021.


Biedenkapp, A, Rajan, R, Hutter, F & Lindauer, M 2021, TempoRL: Learning When to Act. in Proceedings of the international conference on machine learning (ICML). 38th International Conference on Machine Learning, 18 July 2021. <https://arxiv.org/abs/2106.05262>
Biedenkapp, A., Rajan, R., Hutter, F., & Lindauer, M. (2021). TempoRL: Learning When to Act. In Proceedings of the international conference on machine learning (ICML). Advance online publication. https://arxiv.org/abs/2106.05262
Biedenkapp A, Rajan R, Hutter F, Lindauer M. TempoRL: Learning When to Act. In: Proceedings of the international conference on machine learning (ICML). 2021. Epub 2021.
Biedenkapp, André ; Rajan, Raghu ; Hutter, Frank et al. / TempoRL: Learning When to Act. Proceedings of the international conference on machine learning (ICML). 2021.
BibTeX
@inproceedings{c3aaa1dd9f2b41649307e304d2fbef88,
title = "TempoRL: Learning When to Act",
abstract = "Reinforcement learning is a powerful approach to learn behaviour through interactions with an environment. However, behaviours are usually learned in a purely reactive fashion, where an appropriate action is selected based on an observation. In this form, it is challenging to learn when it is necessary to execute new decisions. This makes learning inefficient, especially in environments that need various degrees of fine and coarse control. To address this, we propose a proactive setting in which the agent not only selects an action in a state but also for how long to commit to that action. Our TempoRL approach introduces skip connections between states and learns a skip-policy for repeating the same action along these skips. We demonstrate the effectiveness of TempoRL on a variety of traditional and deep RL environments, showing that our approach is capable of learning successful policies up to an order of magnitude faster than vanilla Q-learning.",
keywords = "cs.LG",
author = "Andr{\'e} Biedenkapp and Raghu Rajan and Frank Hutter and Marius Lindauer",
note = "Accepted at ICML'21; 38th International Conference on Machine Learning; Conference date: 18-07-2021 through 24-07-2021",
year = "2021",
language = "English",
booktitle = "Proceedings of the international conference on machine learning (ICML)",

}

RIS

TY - GEN

T1 - TempoRL: Learning When to Act

AU - Biedenkapp, André

AU - Rajan, Raghu

AU - Hutter, Frank

AU - Lindauer, Marius

N1 - Accepted at ICML'21

PY - 2021

Y1 - 2021

N2 - Reinforcement learning is a powerful approach to learn behaviour through interactions with an environment. However, behaviours are usually learned in a purely reactive fashion, where an appropriate action is selected based on an observation. In this form, it is challenging to learn when it is necessary to execute new decisions. This makes learning inefficient, especially in environments that need various degrees of fine and coarse control. To address this, we propose a proactive setting in which the agent not only selects an action in a state but also for how long to commit to that action. Our TempoRL approach introduces skip connections between states and learns a skip-policy for repeating the same action along these skips. We demonstrate the effectiveness of TempoRL on a variety of traditional and deep RL environments, showing that our approach is capable of learning successful policies up to an order of magnitude faster than vanilla Q-learning.

AB - Reinforcement learning is a powerful approach to learn behaviour through interactions with an environment. However, behaviours are usually learned in a purely reactive fashion, where an appropriate action is selected based on an observation. In this form, it is challenging to learn when it is necessary to execute new decisions. This makes learning inefficient, especially in environments that need various degrees of fine and coarse control. To address this, we propose a proactive setting in which the agent not only selects an action in a state but also for how long to commit to that action. Our TempoRL approach introduces skip connections between states and learns a skip-policy for repeating the same action along these skips. We demonstrate the effectiveness of TempoRL on a variety of traditional and deep RL environments, showing that our approach is capable of learning successful policies up to an order of magnitude faster than vanilla Q-learning.

KW - cs.LG

M3 - Conference contribution

BT - Proceedings of the international conference on machine learning (ICML)

T2 - 38th International Conference on Machine Learning

Y2 - 18 July 2021 through 24 July 2021

ER -
