Details
Original language | English |
---|---|
Title of host publication | Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS) |
Editors | Susanne Biundo, Minh Do, Robert Goldman, Michael Katz, Qiang Yang, Hankz Hankui Zhuo |
Pages | 597-605 |
Number of pages | 9 |
ISBN (electronic) | 9781713832317 |
Publication status | Published - 5 Dec 2021 |
Event | 31st International Conference on Automated Planning and Scheduling - Guangzhou, China Duration: 2 Aug 2021 → 13 Aug 2021 |
Publication series
Name | Proceedings International Conference on Automated Planning and Scheduling, ICAPS |
---|---|
Volume | 2021-August |
ISSN (Print) | 2334-0835 |
ISSN (electronic) | 2334-0843 |
Abstract
ASJC Scopus subject areas
- Decision Sciences (all)
- Information Systems and Management
- Computer Science (all)
- Artificial Intelligence
- Computer Science (all)
- Computer Science Applications
Cite this
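Learning Heuristic Selection with Dynamic Algorithm Configuration. / Speck, David; Biedenkapp, André; Hutter, Frank; Mattmüller, Robert; Lindauer, Marius. Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). ed. / Susanne Biundo; Minh Do; Robert Goldman; Michael Katz; Qiang Yang; Hankz Hankui Zhuo. 2021. p. 597-605 (Proceedings International Conference on Automated Planning and Scheduling, ICAPS; Vol. 2021-August).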
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Learning Heuristic Selection with Dynamic Algorithm Configuration
AU - Speck, David
AU - Biedenkapp, André
AU - Hutter, Frank
AU - Mattmüller, Robert
AU - Lindauer, Marius
N1 - Publisher Copyright: Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2021/12/5
Y1 - 2021/12/5
N2 - A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.
AB - A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.
KW - cs.AI
KW - cs.LG
UR - http://www.scopus.com/inward/record.url?scp=85107633383&partnerID=8YFLogxK
U2 - 10.1609/icaps.v31i1.16008
DO - 10.1609/icaps.v31i1.16008
M3 - Conference contribution
T3 - Proceedings International Conference on Automated Planning and Scheduling, ICAPS
SP - 597
EP - 605
BT - Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)
A2 - Biundo, Susanne
A2 - Do, Minh
A2 - Goldman, Robert
A2 - Katz, Michael
A2 - Yang, Qiang
A2 - Zhuo, Hankz Hankui
T2 - 31st International Conference on Automated Planning and Scheduling
Y2 - 2 August 2021 through 13 August 2021
ER -