Details
Original language | English
---|---
Number of pages | 8
Publication status | Published - 2021
Event | Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021 - Virtual, Online, United Kingdom; Duration: 3 May 2021 → 4 May 2021
Conference
Conference | Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021
---|---
Country/Territory | United Kingdom
City | Virtual, Online
Period | 3 May 2021 → 4 May 2021
Abstract
Potential Based Reward Shaping has proven itself to be an effective method for improving the learning rate for Reinforcement Learning algorithms - especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The provenance of the AMDP is often a domain expert. In this paper we introduce a novel method for the full automation of creating and solving an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
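The abstract refers to potential-based reward shaping (PBRS), where a shaping term derived from a potential function Φ is added to the environment reward. A minimal sketch of the general technique follows; it is not the paper's AMDP-based construction, and the potential `phi` here is a hand-picked toy (negative Manhattan distance to a goal cell in a small grid world) standing in for the value function of a solved abstract MDP:

```python
GAMMA = 0.99
GOAL = (3, 3)  # assumed goal cell of a toy grid world

def phi(state):
    """Toy potential: negative Manhattan distance to the goal cell."""
    x, y = state
    return -(abs(x - GOAL[0]) + abs(y - GOAL[1]))

def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """PBRS: add F(s, s') = gamma * phi(s') - phi(s) to the env reward.

    Shaping of this potential-based form is known to preserve the set of
    optimal policies of the original MDP.
    """
    return reward + gamma * phi(next_state) - phi(state)

# A transition toward the goal earns a positive shaping bonus,
# a transition away from it earns a penalty.
bonus_toward = shaped_reward(0.0, (0, 0), (1, 0))
bonus_away = shaped_reward(0.0, (1, 0), (0, 0))
```

In the paper's setting, `phi` would instead come from solving an automatically constructed AMDP; the shaping term is then added to the reward seen by DQN during training.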
ASJC Scopus subject areas
- Computer Science (all)
- Artificial Intelligence
- Computer Science (all)
- Software
Cite this
2021. Paper presented at Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021, Virtual, Online, United Kingdom.
Publication: Conference contribution › Paper › Research › Peer-review
TY - CONF
T1 - Latent Property State Abstraction For Reinforcement learning
AU - Burden, John
AU - Siahroudi, Sajjad Kamali
AU - Kudenko, Daniel
PY - 2021
Y1 - 2021
N2 - Potential Based Reward Shaping has proven itself to be an effective method for improving the learning rate for Reinforcement Learning algorithms - especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The provenance of the AMDP is often a domain expert. In this paper we introduce a novel method for the full automation of creating and solving an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
AB - Potential Based Reward Shaping has proven itself to be an effective method for improving the learning rate for Reinforcement Learning algorithms - especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The provenance of the AMDP is often a domain expert. In this paper we introduce a novel method for the full automation of creating and solving an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
KW - Abstraction
KW - Reinforcement Learning
KW - Reward Shaping
UR - http://www.scopus.com/inward/record.url?scp=85134046472&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85134046472
T2 - Adaptive and Learning Agents Workshop, ALA 2021 at AAMAS 2021
Y2 - 3 May 2021 through 4 May 2021
ER -