Details
Originalsprache | Englisch |
---|---|
Seitenumfang | 14 |
Publikationsstatus | Elektronisch veröffentlicht (E-Pub) - 11 Juni 2021 |
Abstract
Zitieren
- Standard
- Harvard
- Apa
- Vancouver
- BibTex
- RIS
2021.
Publikation: Arbeitspapier/Preprint › Preprint
}
TY - UNPB
T1 - Automatic Risk Adaptation in Distributional Reinforcement Learning
AU - Schubert, Frederik
AU - Eimer, Theresa
AU - Rosenhahn, Bodo
AU - Lindauer, Marius
PY - 2021/6/11
Y1 - 2021/6/11
N2 - The use of Reinforcement Learning (RL) agents in practical applications requires the consideration of suboptimal outcomes, depending on the familiarity of the agent with its environment. This is especially important in safety-critical environments, where errors can lead to high costs or damage. In distributional RL, the risk-sensitivity can be controlled via different distortion measures of the estimated return distribution. However, these distortion functions require an estimate of the risk level, which is difficult to obtain and depends on the current state. In this work, we demonstrate the suboptimality of a static risk level estimation and propose a method to dynamically select risk levels at each environment step. Our method ARA (Automatic Risk Adaptation) estimates the appropriate risk level in both known and unknown environments using a Random Network Distillation error. We show reduced failure rates by up to a factor of 7 and improved generalization performance by up to 14% compared to both risk-aware and risk-agnostic agents in several locomotion environments.
AB - The use of Reinforcement Learning (RL) agents in practical applications requires the consideration of suboptimal outcomes, depending on the familiarity of the agent with its environment. This is especially important in safety-critical environments, where errors can lead to high costs or damage. In distributional RL, the risk-sensitivity can be controlled via different distortion measures of the estimated return distribution. However, these distortion functions require an estimate of the risk level, which is difficult to obtain and depends on the current state. In this work, we demonstrate the suboptimality of a static risk level estimation and propose a method to dynamically select risk levels at each environment step. Our method ARA (Automatic Risk Adaptation) estimates the appropriate risk level in both known and unknown environments using a Random Network Distillation error. We show reduced failure rates by up to a factor of 7 and improved generalization performance by up to 14% compared to both risk-aware and risk-agnostic agents in several locomotion environments.
KW - cs.LG
M3 - Preprint
BT - Automatic Risk Adaptation in Distributional Reinforcement Learning
ER -