An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research

Authors: Mohammad Salahaldeen Ahmad Alsalti, Victor Gabriel Lopez Mejia, Matthias Müller

Details

Original language: English
Title of host publication: Proceedings of the 6th Annual Learning for Dynamics & Control Conference
Pages: 312-323
Number of pages: 12
Publication status: Published - 14 July 2024

Publication series

Name: Proceedings of Machine Learning Research
Volume: 242
ISSN (electronic): 2640-3498

Abstract

In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm relies only on persistently exciting input-output data measured offline. No model knowledge or state measurements are needed, and the obtained optimal policy uses only past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms from the literature.
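The off-policy, least-squares flavor of Q-learning described in the abstract can be illustrated in a deliberately simplified setting. The sketch below uses full state samples and a toy double-integrator system, whereas the paper's algorithm works with past input-output data only; everything here (system matrices, gains, function names) is illustrative, not taken from the paper. The idea it conveys: collect a single batch of data under an arbitrary exciting input, then repeatedly fit a quadratic Q-function for the current policy by least squares and improve the policy greedily.

```python
import numpy as np

def svec(z):
    """Quadratic basis: monomials z_i*z_j (i<=j), off-diagonals doubled,
    so that z^T H z = svec(z)^T theta for the matching parameter vector."""
    n = len(z)
    return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                     for i in range(n) for j in range(i, n)])

def smat(theta, n):
    """Rebuild the symmetric matrix H from its parameter vector theta."""
    H = np.zeros((n, n))
    k = 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[k]
            k += 1
    return H

def q_learning_lqr(X, U, Xn, Qc, R, K0, iters=15):
    """Off-policy least-squares Q-learning (policy iteration) for discrete-time
    LQR from one batch of samples (x_k, u_k, x_{k+1}) collected under an
    arbitrary exciting input. K0 must be a stabilizing initial gain."""
    nx = X.shape[0]
    nz = nx + U.shape[0]
    K = K0.copy()
    for _ in range(iters):
        # Policy evaluation: fit Q_K(x,u) = c(x,u) + Q_K(x', -K x') by least squares
        Phi, c = [], []
        for k in range(X.shape[1]):
            x, u, xn = X[:, k], U[:, k], Xn[:, k]
            z = np.concatenate([x, u])
            zn = np.concatenate([xn, -K @ xn])  # next input under current policy
            Phi.append(svec(z) - svec(zn))
            c.append(x @ Qc @ x + u @ R @ u)
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)
        H = smat(theta, nz)
        # Policy improvement: minimize Q over u  =>  u = -Huu^{-1} Hux x
        K = np.linalg.solve(H[nx:, nx:], H[nx:, :nx])
    return K, H

# Toy double-integrator example (illustrative, not from the paper)
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc, R = np.eye(2), np.array([[0.1]])

# Off-policy data: one trajectory driven by a random (exciting) input
N = 200
X, U, Xn = np.zeros((2, N)), rng.uniform(-1.0, 1.0, (1, N)), np.zeros((2, N))
x = np.array([1.0, -1.0])
for k in range(N):
    X[:, k] = x
    x = A @ x + B @ U[:, k]
    Xn[:, k] = x

K, H = q_learning_lqr(X, U, Xn, Qc, R, K0=np.array([[0.5, 1.0]]))

# Model-based reference via the Riccati recursion, for comparison only
P = np.eye(2)
for _ in range(2000):
    P = Qc + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
Kstar = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

Note that the data-generating input is unrelated to the policies being evaluated, which is what makes the scheme off-policy: the same batch is reused at every iteration, so no new experiments are needed as the policy improves.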

ASJC Scopus subject areas

Cite

An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. / Alsalti, Mohammad Salahaldeen Ahmad; Lopez Mejia, Victor Gabriel; Müller, Matthias.
Proceedings of the 6th Annual Learning for Dynamics & Control Conference. 2024. pp. 312-323 (Proceedings of Machine Learning Research; Vol. 242).

Alsalti, MSA, Lopez Mejia, VG & Müller, M 2024, An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. in Proceedings of the 6th Annual Learning for Dynamics & Control Conference. Proceedings of Machine Learning Research, vol. 242, pp. 312-323. https://doi.org/10.48550/arXiv.2312.03451
Alsalti, M. S. A., Lopez Mejia, V. G., & Müller, M. (2024). An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. In Proceedings of the 6th Annual Learning for Dynamics & Control Conference (pp. 312-323). (Proceedings of Machine Learning Research; Vol. 242). https://doi.org/10.48550/arXiv.2312.03451
Alsalti MSA, Lopez Mejia VG, Müller M. An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. In Proceedings of the 6th Annual Learning for Dynamics & Control Conference. 2024. p. 312-323. (Proceedings of Machine Learning Research). doi: 10.48550/arXiv.2312.03451
Alsalti, Mohammad Salahaldeen Ahmad ; Lopez Mejia, Victor Gabriel ; Müller, Matthias. / An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. Proceedings of the 6th Annual Learning for Dynamics & Control Conference. 2024. pp. 312-323 (Proceedings of Machine Learning Research).
@inproceedings{147aafd0d0f14e2488caf532a8a50e4b,
title = "An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems",
abstract = "In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.",
keywords = "Proceedings of the 6th Annual Learning for Dynamics & Control Conference, Q-learning, optimal output regulation, Data-based control, reinforcement learning",
author = "Alsalti, {Mohammad Salahaldeen Ahmad} and {Lopez Mejia}, Victor Gabriel and Matthias M{\"u}ller",
year = "2024",
month = jul,
day = "14",
doi = "10.48550/arXiv.2312.03451",
language = "English",
series = "Proceedings of Machine Learning Research",
pages = "312--323",
booktitle = "Proceedings of the 6th Annual Learning for Dynamics & Control Conference",

}

TY - GEN

T1 - An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems

AU - Alsalti, Mohammad Salahaldeen Ahmad

AU - Lopez Mejia, Victor Gabriel

AU - Müller, Matthias

PY - 2024/7/14

Y1 - 2024/7/14

N2 - In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.

AB - In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.

KW - Proceedings of the 6th Annual Learning for Dynamics & Control Conference

KW - Q-learning

KW - optimal output regulation

KW - Data-based control

KW - reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85203683766&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2312.03451

DO - 10.48550/arXiv.2312.03451

M3 - Conference contribution

T3 - Proceedings of Machine Learning Research

SP - 312

EP - 323

BT - Proceedings of the 6th Annual Learning for Dynamics & Control Conference

ER -
