Details
Original language | English |
---|---|
Title of host publication | Genetic and Evolutionary Computation Conference (GECCO) |
Publication status | E-pub ahead of print - 2024 |
Abstract
Dynamic Algorithm Configuration (DAC) addresses the challenge of dynamically setting hyperparameters of an algorithm for a diverse set of instances rather than focusing solely on individual tasks. Agents trained with Deep Reinforcement Learning (RL) offer a pathway to solve such settings. However, the limited generalization performance of these agents has significantly hindered the application in DAC. Our hypothesis is that a potential bias in the training instances limits generalization capabilities. We take a step towards mitigating this by selecting a representative subset of training instances to overcome overrepresentation and then retraining the agent on this subset to improve its generalization performance. For constructing the meta-features for the subset selection, we particularly account for the dynamic nature of the RL agent by computing time series features on trajectories of actions and rewards generated by the agent’s interaction with the environment. Through empirical evaluations on the Sigmoid and CMA-ES benchmarks from the standard benchmark library for DAC, called DACBench, we discuss the potentials of our selection technique compared to training on the entire instance set. Our results highlight the efficacy of instance selection in refining DAC policies for diverse instance spaces.
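The pipeline outlined in the abstract (roll out the agent per instance, compute time series features on the action and reward trajectories, pick a representative subset) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the features or the selection method, so this example uses mean, standard deviation, and linear trend as features and a greedy farthest-point rule as a stand-in selector. The function names and the toy rollouts are hypothetical and are not taken from the paper or from DACBench.

```python
import math
from statistics import mean, pstdev


def trajectory_features(actions, rewards):
    """Summarize one rollout as a small vector of time series features.

    Assumption: mean, standard deviation, and linear trend stand in for
    the (unspecified) time series features used in the paper.
    """
    def slope(series):
        # Least-squares slope of the series against its time index.
        n = len(series)
        t_mean = (n - 1) / 2
        s_mean = mean(series)
        num = sum((t - t_mean) * (s - s_mean) for t, s in enumerate(series))
        den = sum((t - t_mean) ** 2 for t in range(n))
        return num / den if den else 0.0

    return [
        mean(actions), pstdev(actions), slope(actions),
        mean(rewards), pstdev(rewards), slope(rewards),
    ]


def standardize(rows):
    """Z-score each feature column so no single feature dominates distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # constant columns keep scale 1
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


def select_representatives(features, k):
    """Greedy farthest-point selection (a stand-in, not the paper's method).

    Near-duplicate instances end up close together in feature space, so
    picking mutually distant points thins out overrepresented regions.
    """
    centroid = [mean(c) for c in zip(*features)]
    # Start from the instance closest to the centroid of all instances.
    chosen = [min(range(len(features)),
                  key=lambda i: math.dist(features[i], centroid))]
    while len(chosen) < min(k, len(features)):
        # Add the instance farthest from everything chosen so far.
        nxt = max(
            (i for i in range(len(features)) if i not in chosen),
            key=lambda i: min(math.dist(features[i], features[j]) for j in chosen),
        )
        chosen.append(nxt)
    return sorted(chosen)


# Hypothetical rollouts: one (actions, rewards) pair per training instance.
rollouts = [
    ([0, 0, 1, 1], [0.1, 0.2, 0.3, 0.4]),
    ([0, 0, 1, 1], [0.1, 0.2, 0.3, 0.5]),
    ([1, 1, 1, 1], [0.9, 0.9, 0.9, 0.9]),
    ([0, 1, 0, 1], [0.5, 0.1, 0.5, 0.1]),
]
features = standardize([trajectory_features(a, r) for a, r in rollouts])
subset = select_representatives(features, k=3)
print(subset)  # indices of the instances kept for retraining
```

In a real setup the rollouts would come from evaluating a trained agent on each training instance, and the agent would then be retrained on the selected indices only.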
Cite this
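Benjamins, C., Cenikj, G., Nikolikj, A., Mohan, A., Eftimov, T., & Lindauer, M. Instance Selection for Dynamic Algorithm Configuration with Reinforcement Learning: Improving Generalization. In Genetic and Evolutionary Computation Conference (GECCO). 2024.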
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Instance Selection for Dynamic Algorithm Configuration with Reinforcement Learning: Improving Generalization
AU - Benjamins, Carolin
AU - Cenikj, Gjorgjina
AU - Nikolikj, Ana
AU - Mohan, Aditya
AU - Eftimov, Tome
AU - Lindauer, Marius
PY - 2024
Y1 - 2024
N2 - Dynamic Algorithm Configuration (DAC) addresses the challenge of dynamically setting hyperparameters of an algorithm for a diverse set of instances rather than focusing solely on individual tasks. Agents trained with Deep Reinforcement Learning (RL) offer a pathway to solve such settings. However, the limited generalization performance of these agents has significantly hindered the application in DAC. Our hypothesis is that a potential bias in the training instances limits generalization capabilities. We take a step towards mitigating this by selecting a representative subset of training instances to overcome overrepresentation and then retraining the agent on this subset to improve its generalization performance. For constructing the meta-features for the subset selection, we particularly account for the dynamic nature of the RL agent by computing time series features on trajectories of actions and rewards generated by the agent’s interaction with the environment. Through empirical evaluations on the Sigmoid and CMA-ES benchmarks from the standard benchmark library for DAC, called DACBench, we discuss the potentials of our selection technique compared to training on the entire instance set. Our results highlight the efficacy of instance selection in refining DAC policies for diverse instance spaces.
M3 - Conference contribution
BT - Genetic and Evolutionary Computation Conference (GECCO)
ER -