Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Bernd Bischl
  • Martin Binder
  • Michel Lang
  • Tobias Pielok
  • Jakob Richter
  • Stefan Coors
  • Janek Thomas
  • Theresa Ullmann
  • Marc Becker
  • Anne-Laure Boulesteix
  • Difan Deng
  • Marius Lindauer

External Research Organisations

  • Ludwig-Maximilians-Universität München (LMU)
  • TU Dortmund University

Details

Original language: English
Article number: e1484
Number of pages: 70
Journal: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Volume: 13
Issue number: 2
Publication status: Published - 10 Mar 2023

Abstract

Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.
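The abstract's core recipe — propose a hyperparameter configuration, estimate its generalization error by resampling, keep the best — can be sketched in a few lines. Below is a minimal illustration (not taken from the paper) of random search with cross-validated error estimation, written in Python with scikit-learn; the SVM learner and the search-space bounds are assumptions chosen for the example, not the paper's recommended search spaces.

# Minimal random-search HPO sketch (illustrative; the learner and
# search space are assumptions, not the paper's recommendations).
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Each randomly drawn configuration is scored by 5-fold cross-validation,
# i.e., the resampling-based error estimation the abstract refers to.
search = RandomizedSearchCV(
    estimator=SVC(),
    param_distributions={
        "C": loguniform(1e-3, 1e3),      # regularization strength, log scale
        "gamma": loguniform(1e-4, 1e1),  # RBF kernel width, log scale
    },
    n_iter=50,        # HPO budget: number of sampled configurations
    cv=5,             # resampling strategy: 5-fold cross-validation
    random_state=0,   # for reproducibility
)
search.fit(X, y)
print(search.best_params_, search.best_score_)

Methods such as Bayesian optimization or Hyperband, also reviewed in the paper, keep this evaluate-by-resampling loop but replace the uniform random proposal step with model-guided or multi-fidelity sampling.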

Keywords

    stat.ML, cs.LG, automl, tuning, model selection, hyperparameter optimization, machine learning

Cite this

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges. / Bischl, Bernd; Binder, Martin; Lang, Michel et al.
In: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Vol. 13, No. 2, e1484, 10.03.2023.

Research output: Contribution to journal › Article › Research › peer review

Bischl, B, Binder, M, Lang, M, Pielok, T, Richter, J, Coors, S, Thomas, J, Ullmann, T, Becker, M, Boulesteix, A-L, Deng, D & Lindauer, M 2023, 'Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges', Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 13, no. 2, e1484. https://doi.org/10.1002/widm.1484
Bischl, B., Binder, M., Lang, M., Pielok, T., Richter, J., Coors, S., Thomas, J., Ullmann, T., Becker, M., Boulesteix, A.-L., Deng, D., & Lindauer, M. (2023). Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 13(2), Article e1484. https://doi.org/10.1002/widm.1484
Bischl B, Binder M, Lang M, Pielok T, Richter J, Coors S et al. Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. 2023 Mar 10;13(2):e1484. doi: 10.1002/widm.1484
Bischl, Bernd ; Binder, Martin ; Lang, Michel et al. / Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges. In: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. 2023 ; Vol. 13, No. 2.
@article{f13650d499b34566b15b0aca4578f8eb,
title = "Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges",
abstract = "Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.",
keywords = "stat.ML, cs.LG, automl, tuning, model selection, hyperparameter optimization, machine learning",
author = "Bernd Bischl and Martin Binder and Michel Lang and Tobias Pielok and Jakob Richter and Stefan Coors and Janek Thomas and Theresa Ullmann and Marc Becker and Anne-Laure Boulesteix and Difan Deng and Marius Lindauer",
note = "Funding Information: Bavarian Ministry for Economic Affairs, Infrastructure, Transport and Technology, Grant/Award Number: BAYERN DIGITAL II; Bundesministerium f{\"u}r Bildung und Forschung, Grant/Award Number: 01IS18036A; Deutsche Forschungsgemeinschaft (Collaborative Research Center), Grant/Award Number: SFB 876-A3; Federal Statistical Office of Germany; Research Center “Trustworthy Data Science and Security”. The authors of this work take full responsibility for its content. This work was supported by the Federal Statistical Office of Germany; the Deutsche Forschungsgemeinschaft (DFG) within the Collaborative Research Center SFB 876, A3; the Research Center “Trustworthy Data Science and Security”, one of the Research Alliance centers within the UA Ruhr (https://uaruhr.de); the German Federal Ministry of Education and Research (BMBF) under Grant No. 01IS18036A; and the Bavarian Ministry for Economic Affairs, Infrastructure, Transport and Technology through the Center for Analytics-Data-Applications (ADA-Center) within the framework of “BAYERN DIGITAL II.”",
year = "2023",
month = mar,
day = "10",
doi = "10.1002/widm.1484",
language = "English",
volume = "13",
number = "2",
journal = "Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery",
issn = "1942-4787",
}


TY - JOUR

T1 - Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

AU - Bischl, Bernd

AU - Binder, Martin

AU - Lang, Michel

AU - Pielok, Tobias

AU - Richter, Jakob

AU - Coors, Stefan

AU - Thomas, Janek

AU - Ullmann, Theresa

AU - Becker, Marc

AU - Boulesteix, Anne-Laure

AU - Deng, Difan

AU - Lindauer, Marius

N1 - Funding Information: Bavarian Ministry for Economic Affairs, Infrastructure, Transport and Technology, Grant/Award Number: BAYERN DIGITAL II; Bundesministerium für Bildung und Forschung, Grant/Award Number: 01IS18036A; Deutsche Forschungsgemeinschaft (Collaborative Research Center), Grant/Award Number: SFB 876-A3; Federal Statistical Office of Germany; Research Center “Trustworthy Data Science and Security”. The authors of this work take full responsibility for its content. This work was supported by the Federal Statistical Office of Germany; the Deutsche Forschungsgemeinschaft (DFG) within the Collaborative Research Center SFB 876, A3; the Research Center “Trustworthy Data Science and Security”, one of the Research Alliance centers within the UA Ruhr (https://uaruhr.de); the German Federal Ministry of Education and Research (BMBF) under Grant No. 01IS18036A; and the Bavarian Ministry for Economic Affairs, Infrastructure, Transport and Technology through the Center for Analytics-Data-Applications (ADA-Center) within the framework of “BAYERN DIGITAL II.”

PY - 2023/3/10

Y1 - 2023/3/10

N2 - Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.

AB - Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.

KW - stat.ML

KW - cs.LG

KW - automl

KW - tuning

KW - model selection

KW - hyperparameter optimization

KW - machine learning

UR - http://www.scopus.com/inward/record.url?scp=85146325560&partnerID=8YFLogxK

U2 - 10.1002/widm.1484

DO - 10.1002/widm.1484

M3 - Article

VL - 13

JO - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

JF - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

SN - 1942-4787

IS - 2

M1 - e1484

ER -
