Details
| Original language | English |
|---|---|
| Publication status | Published - 2021 |
| Event | Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021, Virtual, Online, United Kingdom (UK). Duration: 3 May 2021 → 4 May 2021 |
Conference
| Conference | Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021 |
|---|---|
| Country/Territory | United Kingdom (UK) |
| City | Virtual, Online |
| Period | 3 May 2021 → 4 May 2021 |
Abstract
The application of reinforcement learning (RL) algorithms is often hindered by the combinatorial explosion of the state space. Previous works have leveraged abstractions that condense large state spaces to find tractable solutions; however, they assumed that the abstractions are provided by a domain expert. In this work we propose a new approach to automatically construct Abstract Markov Decision Processes (AMDPs) for potential-based reward shaping to improve the sample efficiency of RL algorithms. Our approach to constructing abstract states is inspired by graph representation learning methods and effectively encodes the topological and reward structure of the ground-level MDP. We perform large-scale quantitative experiments on the Flag Collection domain and show improvements of up to 6.5 times in sample efficiency and up to 3 times in run time over the baseline approach. Moreover, our qualitative analyses of the generated AMDPs demonstrate the capability of our approach to preserve the topological and reward structure of the ground-level MDP.
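The abstract's core mechanism, potential-based reward shaping, augments the environment reward with F(s, s') = γΦ(s') − Φ(s), which provably preserves the optimal policy (Ng et al., 1999). The sketch below illustrates the idea in tabular Q-learning on a toy chain MDP; it is not the paper's method — there, the potential Φ would come from the value function of the automatically constructed AMDP, whereas here `phi` is a hand-coded stand-in and the MDP (`step`, `N_STATES`, `GOAL`) is purely illustrative.

```python
import numpy as np

# Toy chain MDP: states 0..9, action 0 moves left, action 1 moves right,
# reward 1.0 only on reaching the goal state.
N_STATES, GOAL = 10, 9
GAMMA, ALPHA = 0.99, 0.5

def phi(s):
    # Hypothetical potential: increases toward the goal. In the paper's
    # setting this role is played by the value function of the AMDP.
    return s / (N_STATES - 1)

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == GOAL else 0.0
    return s2, r, s2 == GOAL

def train(episodes=200, shaped=True, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, 2))
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            a = int(rng.integers(2)) if rng.random() < 0.1 else int(Q[s].argmax())
            s2, r, done = step(s, a)
            if shaped:
                # Shaping term F(s, s') = gamma * phi(s') - phi(s);
                # adding it leaves the optimal policy unchanged.
                r += GAMMA * phi(s2) - phi(s)
            target = r + (0.0 if done else GAMMA * Q[s2].max())
            Q[s, a] += ALPHA * (target - Q[s, a])
            s = s2
    return Q

Q = train()
greedy = [int(Q[s].argmax()) for s in range(N_STATES - 1)]
print(greedy)  # greedy policy: move right toward the goal in every state
```

Because the shaped reward is positive for every rightward step, the agent receives a dense learning signal long before it first reaches the sparse goal reward, which is the sample-efficiency effect the abstract quantifies on the much larger Flag Collection domain.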
Keywords
- Abstract MDP, Graph Representations, Reinforcement Learning, State Representations
ASJC Scopus subject areas
- Computer Science(all)
- Artificial Intelligence
- Software
Cite this
Xue, Y., Kudenko, D., & Khosla, M. (2021). Graph Learning based Generation of Abstractions for Reinforcement Learning. Paper presented at Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021, Virtual, Online, United Kingdom (UK).
Research output: Contribution to conference › Paper › Research › peer review
TY - CONF
T1 - Graph Learning based Generation of Abstractions for Reinforcement Learning
AU - Xue, Yuan
AU - Kudenko, Daniel
AU - Khosla, Megha
PY - 2021
Y1 - 2021
N2 - The application of reinforcement learning (RL) algorithms is often hindered by the combinatorial explosion of the state space. Previous works have leveraged abstractions that condense large state spaces to find tractable solutions; however, they assumed that the abstractions are provided by a domain expert. In this work we propose a new approach to automatically construct Abstract Markov Decision Processes (AMDPs) for potential-based reward shaping to improve the sample efficiency of RL algorithms. Our approach to constructing abstract states is inspired by graph representation learning methods and effectively encodes the topological and reward structure of the ground-level MDP. We perform large-scale quantitative experiments on the Flag Collection domain and show improvements of up to 6.5 times in sample efficiency and up to 3 times in run time over the baseline approach. Moreover, our qualitative analyses of the generated AMDPs demonstrate the capability of our approach to preserve the topological and reward structure of the ground-level MDP.
AB - The application of reinforcement learning (RL) algorithms is often hindered by the combinatorial explosion of the state space. Previous works have leveraged abstractions that condense large state spaces to find tractable solutions; however, they assumed that the abstractions are provided by a domain expert. In this work we propose a new approach to automatically construct Abstract Markov Decision Processes (AMDPs) for potential-based reward shaping to improve the sample efficiency of RL algorithms. Our approach to constructing abstract states is inspired by graph representation learning methods and effectively encodes the topological and reward structure of the ground-level MDP. We perform large-scale quantitative experiments on the Flag Collection domain and show improvements of up to 6.5 times in sample efficiency and up to 3 times in run time over the baseline approach. Moreover, our qualitative analyses of the generated AMDPs demonstrate the capability of our approach to preserve the topological and reward structure of the ground-level MDP.
KW - Abstract MDP
KW - Graph Representations
KW - Reinforcement Learning
KW - State Representations
UR - http://www.scopus.com/inward/record.url?scp=85173570150&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85173570150
T2 - Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021
Y2 - 3 May 2021 through 4 May 2021
ER -