Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 13th International Conference on Agents and Artificial Intelligence |
Subtitle of host publication | Volume 2: ICAART |
Editors | Ana Paula Rocha, Luc Steels, Jaap van den Herik |
Pages | 237-245 |
Number of pages | 9 |
ISBN (electronic) | 9789897584848 |
Publication status | Published - 2021 |
Event | 13th International Conference on Agents and Artificial Intelligence, ICAART 2021 - Virtual, Online, Austria. Duration: 4 Feb 2021 → 6 Feb 2021 |
Publication series
Name | ICAART |
---|---|
ISSN (electronic) | 2184-433X |
Abstract
In multi-agent reinforcement learning, several agents converge together towards optimal policies that solve complex decision-making problems. This convergence process is inherently stochastic, meaning that its use in safety-critical domains can be problematic. To address this issue, we introduce a new approach that combines multi-agent reinforcement learning with a formal verification technique termed quantitative verification. Our assured multi-agent reinforcement learning approach constrains agent behaviours in ways that ensure the satisfaction of requirements associated with the safety, reliability, and other non-functional aspects of the decision-making problem being solved. The approach comprises three stages. First, it models the problem as an abstract Markov decision process, allowing quantitative verification to be applied. Next, this abstract model is used to synthesise a policy which satisfies safety, reliability, and performance constraints. Finally, the synthesised policy is used to constrain agent behaviour within the low-level problem with a greatly lowered risk of constraint violations. We demonstrate our approach using a safety-critical multi-agent patrolling problem.
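The abstract outlines a three-stage pipeline: build an abstract Markov decision process, synthesise an abstract policy that satisfies safety, reliability, and performance constraints, and then use that policy to constrain low-level agent behaviour. The following is a minimal, self-contained Python sketch of that pipeline on a toy patrolling-style problem. The region layout, transition probabilities, constraint threshold, the brute-force policy search (standing in for a quantitative-verification/model-checking step), and the action-masking stage are all illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of the three-stage idea from the abstract (assumed details,
# not the paper's actual method or tooling).
import itertools
import random

# --- Stage 1: abstract MDP over patrol regions 0..3; region 3 is hazardous ----
REGIONS = [0, 1, 2, 3]
HAZARD = 3
ACTIONS = ["stay", "move_cw", "move_ccw"]

def abstract_transition(region, action):
    """Return {next_region: probability} under the abstract model (assumed numbers)."""
    if action == "stay":
        return {region: 1.0}
    step = 1 if action == "move_cw" else -1
    target = (region + step) % len(REGIONS)
    overshoot = (region + 2 * step) % len(REGIONS)  # small chance of overshooting
    return {target: 0.9, overshoot: 0.1}

# --- Stage 2: synthesise an abstract policy meeting a safety constraint -------
def hazard_probability(policy, start=0, horizon=10):
    """Probability of entering the hazardous region within `horizon` steps."""
    dist = {start: 1.0}
    p_unsafe = 0.0
    for _ in range(horizon):
        new_dist = {}
        for region, p in dist.items():
            for nxt, q in abstract_transition(region, policy[region]).items():
                if nxt == HAZARD:
                    p_unsafe += p * q          # hazard treated as absorbing
                else:
                    new_dist[nxt] = new_dist.get(nxt, 0.0) + p * q
        dist = new_dist
    return p_unsafe

def synthesise_policy(max_hazard_prob=0.05):
    """Brute-force search over deterministic abstract policies; a real pipeline
    would use quantitative verification of the abstract model instead."""
    best = None
    for choice in itertools.product(ACTIONS, repeat=len(REGIONS)):
        policy = dict(zip(REGIONS, choice))
        if hazard_probability(policy) <= max_hazard_prob:
            coverage = sum(1 for a in choice if a != "stay")  # crude performance proxy
            if best is None or coverage > best[0]:
                best = (coverage, policy)
    return best[1] if best else None

# --- Stage 3: constrain low-level action selection with the verified policy ---
def allowed_actions(region, abstract_policy):
    """Only permit low-level actions consistent with the verified abstract action."""
    return [abstract_policy[region]]

if __name__ == "__main__":
    policy = synthesise_policy()
    print("verified abstract policy:", policy)
    region = 0
    for _ in range(5):
        action = random.choice(allowed_actions(region, policy))
        outcomes = abstract_transition(region, action)
        region = random.choices(list(outcomes), weights=list(outcomes.values()))[0]
        print("took", action, "-> region", region)
```

In the paper's setting, the second stage is performed by quantitative verification of the abstract model rather than the exhaustive enumeration used here, and the third stage constrains learning agents in the full low-level problem rather than a scripted loop.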
Keywords
- Assurance
- Multi-agent reinforcement learning
- Multi-agent system
- Quantitative verification
- Reinforcement learning
ASJC Scopus subject areas
- Computer Science (all)
- Artificial Intelligence
- Software
Cite this
Riley, J., Calinescu, R., Paterson, C., Kudenko, D., & Banks, A. (2021). Reinforcement Learning with Quantitative Verification for Assured Multi-Agent Policies. In A. P. Rocha, L. Steels, & J. van den Herik (Eds.), Proceedings of the 13th International Conference on Agents and Artificial Intelligence: Volume 2: ICAART (pp. 237-245). (ICAART). https://doi.org/10.5220/0010258102370245
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Reinforcement Learning with Quantitative Verification for Assured Multi-Agent Policies
AU - Riley, Joshua
AU - Calinescu, Radu
AU - Paterson, Colin
AU - Kudenko, Daniel
AU - Banks, Alec
N1 - Funding Information: This paper presents research sponsored by the UK MOD. The information contained in it should not be interpreted as representing the views of the UK MOD, nor should it be assumed it reflects any current or future UK MOD policy.
PY - 2021
Y1 - 2021
N2 - In multi-agent reinforcement learning, several agents converge together towards optimal policies that solve complex decision-making problems. This convergence process is inherently stochastic, meaning that its use in safety-critical domains can be problematic. To address this issue, we introduce a new approach that combines multi-agent reinforcement learning with a formal verification technique termed quantitative verification. Our assured multi-agent reinforcement learning approach constrains agent behaviours in ways that ensure the satisfaction of requirements associated with the safety, reliability, and other non-functional aspects of the decision-making problem being solved. The approach comprises three stages. First, it models the problem as an abstract Markov decision process, allowing quantitative verification to be applied. Next, this abstract model is used to synthesise a policy which satisfies safety, reliability, and performance constraints. Finally, the synthesised policy is used to constrain agent behaviour within the low-level problem with a greatly lowered risk of constraint violations. We demonstrate our approach using a safety-critical multi-agent patrolling problem.
AB - In multi-agent reinforcement learning, several agents converge together towards optimal policies that solve complex decision-making problems. This convergence process is inherently stochastic, meaning that its use in safety-critical domains can be problematic. To address this issue, we introduce a new approach that combines multi-agent reinforcement learning with a formal verification technique termed quantitative verification. Our assured multi-agent reinforcement learning approach constrains agent behaviours in ways that ensure the satisfaction of requirements associated with the safety, reliability, and other non-functional aspects of the decision-making problem being solved. The approach comprises three stages. First, it models the problem as an abstract Markov decision process, allowing quantitative verification to be applied. Next, this abstract model is used to synthesise a policy which satisfies safety, reliability, and performance constraints. Finally, the synthesised policy is used to constrain agent behaviour within the low-level problem with a greatly lowered risk of constraint violations. We demonstrate our approach using a safety-critical multi-agent patrolling problem.
KW - Assurance
KW - Multi-agent reinforcement learning
KW - Multi-agent system
KW - Quantitative verification
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85103859598&partnerID=8YFLogxK
U2 - 10.5220/0010258102370245
DO - 10.5220/0010258102370245
M3 - Conference contribution
AN - SCOPUS:85103859598
T3 - ICAART
SP - 237
EP - 245
BT - Proceedings of the 13th International Conference on Agents and Artificial Intelligence
A2 - Rocha, Ana Paula
A2 - Steels, Luc
A2 - van den Herik, Jaap
T2 - 13th International Conference on Agents and Artificial Intelligence, ICAART 2021
Y2 - 4 February 2021 through 6 February 2021
ER -