Details
Original language | English |
---|---|
Title of host publication | ECAI 2020 - 24th European Conference on Artificial Intelligence |
Editors | Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senen Barro, Alberto Bugarin, Jerome Lang |
Pages | 427-434 |
Number of pages | 8 |
ISBN (electronic) | 9781643681009 |
Publication status | Published - 24 Aug. 2020 |
Event | 24th European Conference on Artificial Intelligence - ECAI 2020 - Santiago de Compostela, Spain. Duration: 29 Aug. 2020 → 8 Sept. 2020. http://ecai2020.eu/ |
Publication series
Name | Frontiers in Artificial Intelligence and Applications |
---|---|
Volume | 325 |
ISSN (print) | 0922-6389 |
ISSN (electronic) | 1879-8314 |
Abstract
The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning, and AI in general depends on parameter tuning. Automated methods have been proposed to relieve users of the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential in adjusting an algorithm's parameters online, since different parameter values can be optimal at different stages of the algorithm. Prior work showed that reinforcement learning (RL) is an effective approach for learning policies that adjust algorithm parameters online in a data-driven way. We extend that approach by formulating the resulting dynamic algorithm configuration as a contextual Markov decision process (MDP), such that RL learns a policy not only for a single instance, but across a whole set of instances. To lay the foundation for studying dynamic algorithm configuration with RL in a controlled setting, we propose white-box benchmarks covering major aspects that make dynamic algorithm configuration a hard problem in practice, and we study the performance of various types of configuration strategies on them. On these white-box benchmarks, we show that (i) RL is a robust candidate for learning configuration policies, outperforming standard parameter optimization approaches such as classical algorithm configuration; (ii) based on function approximation, RL agents can learn to generalize to new types of instances; and (iii) self-paced learning can substantially improve performance by automatically selecting a useful sequence of training instances.
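To make the contextual-MDP formulation concrete: a context (the problem instance) indexes a family of MDPs that share state and action spaces but differ in dynamics and reward, and the configuration policy conditions on both the algorithm's internal state and the instance context. The sketch below is a hypothetical illustration in the spirit of the paper's white-box benchmarks, not the authors' benchmark code; the environment name `SigmoidDACEnv` and all constants are assumptions. Each episode draws an instance whose sigmoid-shaped target makes the optimal parameter value change over time, which is exactly the setting where dynamic configuration can beat any single static setting.

```python
import math
import random


class SigmoidDACEnv:
    """Toy white-box DAC benchmark (hypothetical sketch, not the authors' code).

    Each episode is one problem instance, described by a context
    (shift, slope) that parameterizes a sigmoid. At every step the policy
    picks a discrete parameter value; reward is higher the closer that value
    is to the instance's sigmoid at the current step, so the optimal value
    changes over time.
    """

    def __init__(self, n_steps=10, n_actions=2, seed=0):
        self.n_steps = n_steps
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def reset(self):
        # Sample an instance: the "context" of the contextual MDP.
        self.shift = self.rng.uniform(0.0, self.n_steps)
        self.slope = self.rng.choice([-2.0, 2.0])
        self.t = 0
        return self._obs()

    def _obs(self):
        # The agent observes the remaining budget plus the instance context.
        return (self.n_steps - self.t, self.shift, self.slope)

    def step(self, action):
        # Instance-dependent target value at the current time step.
        target = 1.0 / (1.0 + math.exp(-self.slope * (self.t - self.shift)))
        # Reward: closeness of the chosen (normalized) action to the target.
        reward = 1.0 - abs(action / (self.n_actions - 1) - target)
        self.t += 1
        return self._obs(), reward, self.t >= self.n_steps


# Random-policy rollout across a set of instances (one episode per instance).
env = SigmoidDACEnv(seed=42)
returns = []
for _ in range(5):
    obs, done, total = env.reset(), False, 0.0
    while not done:
        action = random.randrange(env.n_actions)  # stand-in for an RL policy
        obs, reward, done = env.step(action)
        total += reward
    returns.append(round(total, 2))
print(returns)
```

Swapping the random action for a learned policy (e.g., Q-learning with function approximation over the observation) reproduces, in miniature, the kind of experiment the abstract describes: training across sampled instances and evaluating generalization on held-out contexts.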
ASJC Scopus subject areas
- Computer Science (general)
- Artificial Intelligence
Cite
ECAI 2020 - 24th European Conference on Artificial Intelligence. ed. / Giuseppe De Giacomo; Alejandro Catala; Bistra Dilkina; Michela Milano; Senen Barro; Alberto Bugarin; Jerome Lang. 2020. pp. 427-434 (Frontiers in Artificial Intelligence and Applications; Vol. 325).
Publication: Contribution to book/report/anthology/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Dynamic Algorithm Configuration: Foundation of a New Meta-Algorithmic Framework
AU - Biedenkapp, André
AU - Bozkurt, H. Furkan
AU - Eimer, Theresa
AU - Hutter, Frank
AU - Lindauer, Marius Thomas
N1 - Funding information: The authors acknowledge funding by the Robert Bosch GmbH, support by the state of Baden-Württemberg through bwHPC and the German Research Foundation through INST 39/963-1 FUGG.
PY - 2020/8/24
Y1 - 2020/8/24
N2 - The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning, and AI in general depends on parameter tuning. Automated methods have been proposed to relieve users of the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential in adjusting an algorithm's parameters online, since different parameter values can be optimal at different stages of the algorithm. Prior work showed that reinforcement learning (RL) is an effective approach for learning policies that adjust algorithm parameters online in a data-driven way. We extend that approach by formulating the resulting dynamic algorithm configuration as a contextual Markov decision process (MDP), such that RL learns a policy not only for a single instance, but across a whole set of instances. To lay the foundation for studying dynamic algorithm configuration with RL in a controlled setting, we propose white-box benchmarks covering major aspects that make dynamic algorithm configuration a hard problem in practice, and we study the performance of various types of configuration strategies on them. On these white-box benchmarks, we show that (i) RL is a robust candidate for learning configuration policies, outperforming standard parameter optimization approaches such as classical algorithm configuration; (ii) based on function approximation, RL agents can learn to generalize to new types of instances; and (iii) self-paced learning can substantially improve performance by automatically selecting a useful sequence of training instances.
AB - The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning, and AI in general depends on parameter tuning. Automated methods have been proposed to relieve users of the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential in adjusting an algorithm's parameters online, since different parameter values can be optimal at different stages of the algorithm. Prior work showed that reinforcement learning (RL) is an effective approach for learning policies that adjust algorithm parameters online in a data-driven way. We extend that approach by formulating the resulting dynamic algorithm configuration as a contextual Markov decision process (MDP), such that RL learns a policy not only for a single instance, but across a whole set of instances. To lay the foundation for studying dynamic algorithm configuration with RL in a controlled setting, we propose white-box benchmarks covering major aspects that make dynamic algorithm configuration a hard problem in practice, and we study the performance of various types of configuration strategies on them. On these white-box benchmarks, we show that (i) RL is a robust candidate for learning configuration policies, outperforming standard parameter optimization approaches such as classical algorithm configuration; (ii) based on function approximation, RL agents can learn to generalize to new types of instances; and (iii) self-paced learning can substantially improve performance by automatically selecting a useful sequence of training instances.
UR - http://www.scopus.com/inward/record.url?scp=85091309770&partnerID=8YFLogxK
U2 - 10.3233/FAIA200122
DO - 10.3233/FAIA200122
M3 - Conference contribution
SN - 978-1-64368-100-9
T3 - Frontiers in Artificial Intelligence and Applications
SP - 427
EP - 434
BT - ECAI 2020 - 24th European Conference on Artificial Intelligence
A2 - De Giacomo, Giuseppe
A2 - Catala, Alejandro
A2 - Dilkina, Bistra
A2 - Milano, Michela
A2 - Barro, Senen
A2 - Bugarin, Alberto
A2 - Lang, Jerome
T2 - 24th European Conference on Artificial Intelligence - ECAI 2020
Y2 - 29 August 2020 through 8 September 2020
ER -