Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 6th Annual Learning for Dynamics & Control Conference |
Pages | 312-323 |
Number of pages | 12 |
Publication status | Published - 14 July 2024 |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Volume | 242 |
ISSN (electronic) | 2640-3498 |
Abstract
In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
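For intuition only, the sketch below illustrates the general idea of off-policy, data-based Q-learning via policy iteration on a toy state-feedback LQR problem. It is not the authors' output-feedback formulation (which uses only input-output data and no state measurements); the system matrices, data length, and initial gain are invented for illustration.

```python
import numpy as np

# Toy discrete-time LTI system (invented for illustration, not from the paper)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)            # state weighting
Rc = np.array([[1.0]])    # input weighting
n, m = 2, 1

rng = np.random.default_rng(0)

# Offline data collection with a persistently exciting (random) behavior input
T = 200
X = np.zeros((n, T + 1))
U = rng.standard_normal((m, T))
for k in range(T):
    X[:, k + 1] = A @ X[:, k] + B @ U[:, k]

def quad_features(x, u):
    """Features of z = [x; u] such that z' H z = theta' * quad_features(x, u)
    for a symmetric H parameterized by its upper triangle theta."""
    z = np.concatenate([x, u])
    M = np.outer(z, z)
    i, j = np.triu_indices(z.size)
    scale = np.where(i == j, 1.0, 2.0)   # off-diagonal terms appear twice in z'Hz
    return M[i, j] * scale

# Off-policy Q-learning via policy iteration: the same offline data is reused
# in every iteration while the target policy u = K x is improved.
K = np.zeros((m, n))      # initial stabilizing gain (A itself is stable here)
for _ in range(20):
    Phi, y = [], []
    for k in range(T):
        x, u, xn = X[:, k], U[:, k], X[:, k + 1]
        un = K @ xn       # target-policy action at the successor state
        # Bellman equation for the Q-function: Q(x,u) - Q(x', Kx') = x'Qc x + u'Rc u
        Phi.append(quad_features(x, u) - quad_features(xn, un))
        y.append(x @ Qc @ x + u @ Rc @ u)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    # Reassemble the symmetric Q-function matrix H from its upper triangle
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    H = H + H.T - np.diag(np.diag(H))

    # Policy improvement: u = -H_uu^{-1} H_ux x
    K = -np.linalg.solve(H[n:, n:], H[n:, :n])

print("Learned feedback gain K:", K)  # approaches the LQR gain for (A, B, Qc, Rc)
```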
ASJC Scopus subject areas
- Computer Science (all): Artificial Intelligence
- Computer Science (all): Software
- Engineering (all): Control and Systems Engineering
- Mathematics (all): Statistics and Probability
Cite
Alsalti, M. S. A., Lopez Mejia, V. G., & Müller, M. (2024). An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. Proceedings of the 6th Annual Learning for Dynamics & Control Conference, pp. 312-323 (Proceedings of Machine Learning Research; Vol. 242).
Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research
TY - GEN
T1 - An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
AU - Alsalti, Mohammad Salahaldeen Ahmad
AU - Lopez Mejia, Victor Gabriel
AU - Müller, Matthias
PY - 2024/7/14
Y1 - 2024/7/14
N2 - In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
AB - In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
KW - Proceedings of the 6th Annual Learning for Dynamics & Control Conference
KW - Q-learning
KW - optimal output regulation
KW - Data-based control
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85203683766&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2312.03451
DO - 10.48550/arXiv.2312.03451
M3 - Conference contribution
T3 - Proceedings of Machine Learning Research
SP - 312
EP - 323
BT - Proceedings of the 6th Annual Learning for Dynamics & Control Conference
ER -