Details
| Original language | English |
| --- | --- |
| Pages (from-to) | 3553-3567 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 34 |
| Issue number | 7 |
| Publication status | Published - 18 Oct 2021 |
Abstract
This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the H∞ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm formulation for completely unknown systems. Conditions for the existence of the optimal OPFB solution are given under the premise that the disturbance attenuation condition is satisfied. The convergence of the proposed Q-learning methods, and the difference and equivalence between the two algorithms, are rigorously proven. Moreover, considering the effects of the probing noise required for persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and thus avoids biased solutions. Simulation results are presented to verify the effectiveness of the proposed approaches.
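For context, the sketch below gives the standard zero-sum game formulation that H∞ static OPFB problems for linear DT systems are typically cast as; it is illustrative only, and the symbols A, B, D, C, Q_y, R, γ, and K are assumptions for this sketch, not notation taken from the article.

```latex
% Illustrative sketch only: a standard zero-sum game formulation of the
% H-infinity static OPFB problem for linear DT systems. The symbols
% A, B, D, C, Q_y, R, gamma, K are assumed for illustration and are not
% taken from the article itself.
\begin{align*}
  x_{k+1} &= A x_k + B u_k + D w_k, \qquad y_k = C x_k,\\
  u_k     &= -K y_k \quad \text{(static output feedback)},\\
  V(x_k)  &= \sum_{i=k}^{\infty}\bigl( y_i^{\top} Q_y\, y_i + u_i^{\top} R\, u_i
             - \gamma^{2} w_i^{\top} w_i \bigr),\\
  \mathcal{Q}(x_k,u_k,w_k) &= y_k^{\top} Q_y\, y_k + u_k^{\top} R\, u_k
             - \gamma^{2} w_k^{\top} w_k + V(x_{k+1}).
\end{align*}
```

In this standard setting the controller u_k minimizes and the disturbance w_k maximizes the value, with γ the prescribed disturbance attenuation level; Q-learning estimates the Q-function from measured data, so the system matrices (A, B, D) need not be known.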
Keywords
- H∞ control, off-policy Q-learning, Q-learning, static output feedback (OPFB), zero-sum game
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Computer Science Applications
- Computer Networks and Communications
- Artificial Intelligence
Cite this
In: IEEE Transactions on Neural Networks and Learning Systems, Vol. 34, No. 7, 18.10.2021, p. 3553-3567.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning
AU - Zhang, Li
AU - Fan, Jialu
AU - Xue, Wenqian
AU - Lopez, Victor G.
AU - Li, Jinna
AU - Chai, Tianyou
AU - Lewis, Frank L.
N1 - Funding Information: This work was supported in part by the NSFC under Grant 61991400, Grant 61991404, Grant 61533015, and Grant 62073158; in part by the 2020 Science and Technology Major Project of Liaoning Province under Grant 2020JH1/10100008; and in part by the Liaoning Revitalization Talents Program under Grant XLYC2007135.
PY - 2021/10/18
Y1 - 2021/10/18
N2 - This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the H∞ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm formulation for completely unknown systems. Conditions for the existence of the optimal OPFB solution are given under the premise that the disturbance attenuation condition is satisfied. The convergence of the proposed Q-learning methods, and the difference and equivalence between the two algorithms, are rigorously proven. Moreover, considering the effects of the probing noise required for persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and thus avoids biased solutions. Simulation results are presented to verify the effectiveness of the proposed approaches.
AB - This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the H∞ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm formulation for completely unknown systems. Conditions for the existence of the optimal OPFB solution are given under the premise that the disturbance attenuation condition is satisfied. The convergence of the proposed Q-learning methods, and the difference and equivalence between the two algorithms, are rigorously proven. Moreover, considering the effects of the probing noise required for persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and thus avoids biased solutions. Simulation results are presented to verify the effectiveness of the proposed approaches.
KW - H∞ control
KW - off-policy Q-learning
KW - Q-learning
KW - static output feedback (OPFB)
KW - zero-sum game
UR - http://www.scopus.com/inward/record.url?scp=85164272276&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2021.3112457
DO - 10.1109/TNNLS.2021.3112457
M3 - Article
C2 - 34662280
AN - SCOPUS:85164272276
VL - 34
SP - 3553
EP - 3567
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
SN - 2162-237X
IS - 7
ER -