Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 6th Annual Learning for Dynamics & Control Conference |
Pages | 312-323 |
Number of pages | 12 |
Publication status | Published - 14 July 2024 |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Volume | 242 |
ISSN (electronic) | 2640-3498 |
Abstract
In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
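For intuition only, the sketch below illustrates the general idea of off-policy, data-based Q-learning via policy iteration on a toy state-feedback LQR problem. It is not the authors' output-feedback formulation (which uses only input-output data and no state measurements); the system matrices, data length, and initial gain are invented for illustration.

```python
import numpy as np

# Toy discrete-time LTI system (invented for illustration, not from the paper)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)            # state weighting
Rc = np.array([[1.0]])    # input weighting
n, m = 2, 1

rng = np.random.default_rng(0)

# Offline data collection with a persistently exciting (random) behavior input
T = 200
X = np.zeros((n, T + 1))
U = rng.standard_normal((m, T))
for k in range(T):
    X[:, k + 1] = A @ X[:, k] + B @ U[:, k]

def quad_features(x, u):
    """Features of z = [x; u] such that z' H z = theta' * quad_features(x, u)
    for a symmetric H parameterized by its upper triangle theta."""
    z = np.concatenate([x, u])
    M = np.outer(z, z)
    i, j = np.triu_indices(z.size)
    scale = np.where(i == j, 1.0, 2.0)   # off-diagonal terms appear twice in z'Hz
    return M[i, j] * scale

# Off-policy Q-learning via policy iteration: the same offline data is reused
# in every iteration while the target policy u = K x is improved.
K = np.zeros((m, n))      # initial stabilizing gain (A itself is stable here)
for _ in range(20):
    Phi, y = [], []
    for k in range(T):
        x, u, xn = X[:, k], U[:, k], X[:, k + 1]
        un = K @ xn       # target-policy action at the successor state
        # Bellman equation for the Q-function: Q(x,u) - Q(x', Kx') = x'Qc x + u'Rc u
        Phi.append(quad_features(x, u) - quad_features(xn, un))
        y.append(x @ Qc @ x + u @ Rc @ u)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    # Reassemble the symmetric Q-function matrix H from its upper triangle
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    H = H + H.T - np.diag(np.diag(H))

    # Policy improvement: u = -H_uu^{-1} H_ux x
    K = -np.linalg.solve(H[n:, n:], H[n:, :n])

print("Learned feedback gain K:", K)  # approaches the LQR gain for (A, B, Qc, Rc)
```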
ASJC Scopus subject areas
- Computer Science (all): Artificial Intelligence
- Computer Science (all): Software
- Engineering (all): Control and Systems Engineering
- Mathematics (all): Statistics and Probability
Cite
Alsalti, M. S. A., Lopez Mejia, V. G., & Müller, M. (2024). An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. Proceedings of the 6th Annual Learning for Dynamics & Control Conference, pp. 312-323 (Proceedings of Machine Learning Research; Vol. 242).
Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research
TY - GEN
T1 - An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
AU - Alsalti, Mohammad Salahaldeen Ahmad
AU - Lopez Mejia, Victor Gabriel
AU - Müller, Matthias
PY - 2024/7/14
Y1 - 2024/7/14
N2 - In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
AB - In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm only relies on using persistently exciting input-output data, measured offline. No model knowledge or state measurements are needed and the obtained optimal policy only uses past input-output information. Moreover, our formulation of the proposed algorithm renders it computationally efficient. We provide conditions that guarantee the convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared to existing algorithms in the literature.
KW - Proceedings of the 6th Annual Learning for Dynamics & Control Conference
KW - Q-learning
KW - optimal output regulation
KW - Data-based control
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85203683766&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2312.03451
DO - 10.48550/arXiv.2312.03451
M3 - Conference contribution
T3 - Proceedings of Machine Learning Research
SP - 312
EP - 323
BT - Proceedings of the 6th Annual Learning for Dynamics & Control Conference
ER -