Learning Heuristic Selection with Dynamic Algorithm Configuration

David Speck; André Biedenkapp; Frank Hutter; Robert Mattmüller; Marius Lindauer

doi:10.1609/icaps.v31i1.16008

Details

Original language	English
Title of host publication	Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)
Editors	Susanne Biundo, Minh Do, Robert Goldman, Michael Katz, Qiang Yang, Hankz Hankui Zhuo
Pages	597-605
Number of pages	9
ISBN (electronic)	9781713832317
Publication status	Published - 5 Dec. 2021
Event	31st International Conference on Automated Planning and Scheduling - Guangzhou, China, Duration: 2 Aug. 2021 → 13 Aug. 2021

Publication series

Name	Proceedings International Conference on Automated Planning and Scheduling, ICAPS
Volume	2021-August
ISSN (print)	2334-0835
ISSN (electronic)	2334-0843

Abstract

A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.
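
The difference between a fixed heuristic, alternation, and a per-step choice that looks at the search state can be illustrated with a small sketch. It is not the authors' implementation: the function gbfs_multi, the policy signature (step, open_lists), the toy grid, and the hand-written lowest_h rule are all assumptions made for illustration. In particular, lowest_h only stands in for the learned policy the abstract describes; no reinforcement learning is performed here.

```python
import heapq
from itertools import count


def gbfs_multi(start, is_goal, successors, heuristics, policy):
    """Greedy best-first search with one open list per heuristic.

    policy(step, open_lists) returns the index of the heuristic whose
    open list supplies the next state to expand (hypothetical interface).
    Returns (path, expansions); path is None if no goal is reachable.
    """
    tie = count()  # tie-breaker so heap entries never compare states
    open_lists = [[] for _ in heuristics]
    parent = {start: None}
    closed = set()

    def push(state):
        # Every generated state is inserted into every open list.
        for h, ol in zip(heuristics, open_lists):
            heapq.heappush(ol, (h(state), next(tie), state))

    push(start)
    expansions = 0
    step = 0
    while any(open_lists):
        i = policy(step, open_lists)
        step += 1
        if not open_lists[i]:
            # Fall back to any non-empty list so the loop always progresses.
            i = next(j for j, ol in enumerate(open_lists) if ol)
        _, _, state = heapq.heappop(open_lists[i])
        if state in closed:
            continue  # already expanded via another open list
        closed.add(state)
        expansions += 1
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1], expansions
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                push(nxt)
    return None, expansions


# Toy domain (assumption): walk on a 4x4 grid from (0, 0) to (3, 3).
SIZE, GOAL = 4, (3, 3)


def successors(s):
    x, y = s
    moves = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in moves if 0 <= a < SIZE and 0 <= b < SIZE]


def is_goal(s):
    return s == GOAL


# Two deliberately partial heuristics: each sees only one coordinate.
HEURISTICS = [lambda s: abs(GOAL[0] - s[0]), lambda s: abs(GOAL[1] - s[1])]


def static(step, open_lists):
    return 0  # always the first heuristic (what algorithm selection fixes)


def alternate(step, open_lists):
    return step % len(open_lists)  # round-robin over heuristics


def lowest_h(step, open_lists):
    # Hand-written stand-in for a learned policy: inspect the open lists
    # and pick the one whose best entry currently has the lowest h-value.
    best = lambda j: open_lists[j][0][0] if open_lists[j] else float("inf")
    return min(range(len(open_lists)), key=best)


POLICIES = {"static": static, "alternate": alternate, "lowest_h": lowest_h}

if __name__ == "__main__":
    for name, policy in POLICIES.items():
        path, n = gbfs_multi((0, 0), is_goal, successors, HEURISTICS, policy)
        print(name, "plan length:", len(path) - 1, "expansions:", n)
```

The static and alternate policies ignore their open_lists argument, which mirrors the abstract's point that algorithm selection and alternation do not use the internal search dynamics. A policy such as lowest_h does read that state, and a learned policy would replace the hand-written rule with a function trained on such features.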

ASJC Scopus subject areas

Decision Sciences (all)
Information Systems and Management
Computer Science (all)
Artificial Intelligence
Computer Science (all)
Computer Science Applications

Cite this

Learning Heuristic Selection with Dynamic Algorithm Configuration. / Speck, David; Biedenkapp, André; Hutter, Frank et al.
Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). ed. / Susanne Biundo; Minh Do; Robert Goldman; Michael Katz; Qiang Yang; Hankz Hankui Zhuo. 2021. p. 597-605 (Proceedings International Conference on Automated Planning and Scheduling, ICAPS; Vol. 2021-August).

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Speck, D, Biedenkapp, A, Hutter, F, Mattmüller, R & Lindauer, M 2021, Learning Heuristic Selection with Dynamic Algorithm Configuration. in S Biundo, M Do, R Goldman, M Katz, Q Yang & HH Zhuo (eds), Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). Proceedings International Conference on Automated Planning and Scheduling, ICAPS, vol. 2021-August, pp. 597-605, 31st International Conference on Automated Planning and Scheduling, Guangzhou, China, 2 Aug. 2021. https://doi.org/10.1609/icaps.v31i1.16008

Speck, D., Biedenkapp, A., Hutter, F., Mattmüller, R., & Lindauer, M. (2021). Learning Heuristic Selection with Dynamic Algorithm Configuration. In S. Biundo, M. Do, R. Goldman, M. Katz, Q. Yang, & H. H. Zhuo (Eds.), Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS) (pp. 597-605). (Proceedings International Conference on Automated Planning and Scheduling, ICAPS; Vol. 2021-August). https://doi.org/10.1609/icaps.v31i1.16008

Speck D, Biedenkapp A, Hutter F, Mattmüller R, Lindauer M. Learning Heuristic Selection with Dynamic Algorithm Configuration. In Biundo S, Do M, Goldman R, Katz M, Yang Q, Zhuo HH, editors, Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2021. p. 597-605. (Proceedings International Conference on Automated Planning and Scheduling, ICAPS). doi: 10.1609/icaps.v31i1.16008

Speck, David ; Biedenkapp, André ; Hutter, Frank et al. / Learning Heuristic Selection with Dynamic Algorithm Configuration. Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). ed. / Susanne Biundo ; Minh Do ; Robert Goldman ; Michael Katz ; Qiang Yang ; Hankz Hankui Zhuo. 2021. p. 597-605 (Proceedings International Conference on Automated Planning and Scheduling, ICAPS).

@inproceedings{6e0709733a724548be0b4ed0ea7d5aa2,
  title = "Learning Heuristic Selection with Dynamic Algorithm Configuration",
  abstract = "A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.",
  keywords = "cs.AI, cs.LG",
  author = "David Speck and Andr{\'e} Biedenkapp and Frank Hutter and Robert Mattm{\"u}ller and Marius Lindauer",
  note = "Publisher Copyright: Copyright {\textcopyright} 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 31st International Conference on Automated Planning and Scheduling, ICAPS 2021 ; Conference date: 02-08-2021 Through 13-08-2021",
  year = "2021",
  month = dec,
  day = "5",
  doi = "10.1609/icaps.v31i1.16008",
  language = "English",
  series = "Proceedings International Conference on Automated Planning and Scheduling, ICAPS",
  pages = "597--605",
  editor = "Susanne Biundo and Minh Do and Robert Goldman and Michael Katz and Qiang Yang and Zhuo, {Hankz Hankui}",
  booktitle = "Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)",
}

TY - GEN
T1 - Learning Heuristic Selection with Dynamic Algorithm Configuration
AU - Speck, David
AU - Biedenkapp, André
AU - Hutter, Frank
AU - Mattmüller, Robert
AU - Lindauer, Marius
PY - 2021/12/5
Y1 - 2021/12/5
AB - A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.
KW - cs.AI
KW - cs.LG
UR - http://www.scopus.com/inward/record.url?scp=85107633383&partnerID=8YFLogxK
U2 - 10.1609/icaps.v31i1.16008
DO - 10.1609/icaps.v31i1.16008
M3 - Conference contribution
T3 - Proceedings International Conference on Automated Planning and Scheduling, ICAPS
SP - 597
EP - 605
BT - Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)
A2 - Biundo, Susanne
A2 - Do, Minh
A2 - Goldman, Robert
A2 - Katz, Michael
A2 - Yang, Qiang
A2 - Zhuo, Hankz Hankui
T2 - 31st International Conference on Automated Planning and Scheduling
Y2 - 2 August 2021 through 13 August 2021
ER -


By the same authors

AMLTK: A Modular AutoML Toolkit in Python

AutoML in Heavily Constrained Applications

Verfahren zum Trainieren eines Algorithmus des maschinellen Lernens durch ein bestärkendes Lernverfahren

MO-SMAC: Multi-objective Sequential Model-based Algorithm Configuration

How Green is AutoML for Tabular Data?
