Details
| Original language | English |
| --- | --- |
| Number of pages | 8 |
| Publication status | Published - 2021 |
| Event | Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021 - Virtual, Online, United Kingdom (UK) |
| Duration | 3 May 2021 → 4 May 2021 |
Conference
| Conference | Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021 |
| --- | --- |
| Country/Territory | United Kingdom (UK) |
| City | Virtual, Online |
| Period | 3 May 2021 → 4 May 2021 |
Abstract
Potential-Based Reward Shaping has proven to be an effective method for improving the learning rate of Reinforcement Learning algorithms, especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The AMDP is typically supplied by a domain expert. In this paper we introduce a novel method that fully automates the construction and solution of an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
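The shaping scheme the abstract describes is the standard potential-based formulation, in which the reward seen by the learner becomes r + γ·Φ(s') - Φ(s) for discount factor γ and potential function Φ, a transformation known to leave the optimal policy unchanged. Below is a minimal sketch of that pipeline with Φ taken from the value function of a solved AMDP, assuming a small tabular abstract model; the names (`solve_amdp`, `make_potential`, `shaped_reward`, `to_abstract`) are illustrative assumptions, not the paper's API, and the paper's automated AMDP construction is not reproduced here.

```python
import numpy as np

# Sketch only: potential-based reward shaping with the potential taken
# from the value function of a solved abstract MDP. All names here are
# illustrative assumptions, not the paper's actual API.

GAMMA = 0.99

def solve_amdp(P, R, gamma=GAMMA, sweeps=500):
    """Value iteration on a tabular abstract MDP.

    P[a, z, z_next]: transition probabilities between abstract states.
    R[a, z]: reward for abstract action a in abstract state z.
    Returns the optimal abstract value function V[z].
    """
    V = np.zeros(R.shape[1])
    for _ in range(sweeps):
        # Q[a, z] = R[a, z] + gamma * sum_{z'} P[a, z, z'] * V[z']
        V = (R + gamma * (P @ V)).max(axis=0)
    return V

def make_potential(V_abstract, to_abstract):
    """Phi(s) = value of the abstract state that ground state s maps to."""
    return lambda s: V_abstract[to_abstract(s)]

def shaped_reward(r, s, s_next, phi, gamma=GAMMA, done=False):
    """Shaped reward r + gamma * Phi(s') - Phi(s); the potential of a
    terminal successor is conventionally taken to be zero."""
    next_phi = 0.0 if done else phi(s_next)
    return r + gamma * next_phi - phi(s)

# Toy usage: 2 abstract actions, 3 abstract states, and integer ground
# states mapped down by a stand-in abstraction (s % 3).
P = np.full((2, 3, 3), 1.0 / 3.0)
R = np.array([[0.0, 0.0, 1.0],
              [0.0, 0.5, 0.0]])
phi = make_potential(solve_amdp(P, R), to_abstract=lambda s: s % 3)
print(shaped_reward(0.0, s=4, s_next=5, phi=phi))
```

In a DQN training loop the shaped reward would replace the raw environment reward in the stored transitions, which is how sample-efficiency gains of the kind the abstract reports are typically realised.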
Keywords
- Abstraction, Reinforcement Learning, Reward Shaping
ASJC Scopus subject areas
- Computer Science (all)
- Artificial Intelligence
- Software
Cite this
Burden, J., Siahroudi, S. K., & Kudenko, D. (2021). Latent Property State Abstraction For Reinforcement Learning. Paper presented at Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021, Virtual, Online, United Kingdom (UK).
Research output: Contribution to conference › Paper › Research › peer-review
TY - CONF
T1 - Latent Property State Abstraction For Reinforcement Learning
AU - Burden, John
AU - Siahroudi, Sajjad Kamali
AU - Kudenko, Daniel
PY - 2021
Y1 - 2021
N2 - Potential-Based Reward Shaping has proven to be an effective method for improving the learning rate of Reinforcement Learning algorithms, especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The AMDP is typically supplied by a domain expert. In this paper we introduce a novel method that fully automates the construction and solution of an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
AB - Potential-Based Reward Shaping has proven to be an effective method for improving the learning rate of Reinforcement Learning algorithms, especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The AMDP is typically supplied by a domain expert. In this paper we introduce a novel method that fully automates the construction and solution of an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
KW - Abstraction
KW - Reinforcement Learning
KW - Reward Shaping
UR - http://www.scopus.com/inward/record.url?scp=85134046472&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85134046472
T2 - Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021
Y2 - 3 May 2021 through 4 May 2021
ER -