Details
Original language | English |
---|---|
Title of host publication | Proceedings of the international conference on machine learning (ICML) |
Publisher | ML Research Press |
Number of pages | 14 |
ISBN (Print) | 978-171384506-5 |
Publication status | Published - 18 Jul 2021 |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
ISSN (Print) | 2640-3498 |
Abstract
Reinforcement learning (RL) has made a lot of advances for solving a single problem in a given environment; but learning policies that generalize to unseen variations of a problem remains challenging. To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). Based on self-paced learning, SPaCE automatically generates instance curricula online with little computational overhead. To this end, SPaCE leverages information contained in state values during training to accelerate and improve training performance as well as generalization capabilities to new instances from the same problem domain. Nevertheless, SPaCE is independent of the problem domain at hand and can be applied on top of any RL agent with state-value function approximation. We demonstrate SPaCE's ability to speed up learning of different value-based RL agents on two environments, showing better generalization capabilities and up to 10x faster learning compared to naive approaches such as round robin or SPDRL, as the closest state-of-the-art approach.
Cite this
- Standard
- Harvard
- APA
- Vancouver
- BibTeX
- RIS
Eimer, T., Biedenkapp, A., Hutter, F., & Lindauer, M. (2021). Self-Paced Context Evaluation for Contextual Reinforcement Learning. In Proceedings of the international conference on machine learning (ICML). ML Research Press. (Proceedings of Machine Learning Research).
Publication: Contribution to book/report/collected edition/conference proceedings › Conference paper › Research › Peer-reviewed
TY - GEN
T1 - Self-Paced Context Evaluation for Contextual Reinforcement Learning
AU - Eimer, Theresa
AU - Biedenkapp, André
AU - Hutter, Frank
AU - Lindauer, Marius
PY - 2021/7/18
Y1 - 2021/7/18
N2 - Reinforcement learning (RL) has made a lot of advances for solving a single problem in a given environment; but learning policies that generalize to unseen variations of a problem remains challenging. To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). Based on self-paced learning, SPaCE automatically generates instance curricula online with little computational overhead. To this end, SPaCE leverages information contained in state values during training to accelerate and improve training performance as well as generalization capabilities to new instances from the same problem domain. Nevertheless, SPaCE is independent of the problem domain at hand and can be applied on top of any RL agent with state-value function approximation. We demonstrate SPaCE's ability to speed up learning of different value-based RL agents on two environments, showing better generalization capabilities and up to 10x faster learning compared to naive approaches such as round robin or SPDRL, as the closest state-of-the-art approach.
AB - Reinforcement learning (RL) has made a lot of advances for solving a single problem in a given environment; but learning policies that generalize to unseen variations of a problem remains challenging. To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). Based on self-paced learning, SPaCE automatically generates instance curricula online with little computational overhead. To this end, SPaCE leverages information contained in state values during training to accelerate and improve training performance as well as generalization capabilities to new instances from the same problem domain. Nevertheless, SPaCE is independent of the problem domain at hand and can be applied on top of any RL agent with state-value function approximation. We demonstrate SPaCE's ability to speed up learning of different value-based RL agents on two environments, showing better generalization capabilities and up to 10x faster learning compared to naive approaches such as round robin or SPDRL, as the closest state-of-the-art approach.
KW - cs.LG
UR - http://www.scopus.com/inward/record.url?scp=85161344151&partnerID=8YFLogxK
M3 - Conference contribution
SN - 978-171384506-5
T3 - Proceedings of Machine Learning Research
BT - Proceedings of the international conference on machine learning (ICML)
PB - ML Research Press
ER -