Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Linjie Xu
  • Zichuan Liu
  • Alexander Dockhorn
  • Diego Perez-Liebana
  • Jinyu Wang
  • Lei Song
  • Jiang Bian

Research Organisations

External Research Organisations

  • Queen Mary University of London
  • Nanjing University
  • Microsoft Corporation

Details

Original language: English
Title of host publication: Proceedings of the 2024 IEEE Conference on Games, CoG 2024
Publisher: IEEE Computer Society
ISBN (electronic): 979-8-3503-5067-8
ISBN (print): 979-8-3503-5068-5
Publication status: Published - 5 Aug 2024
Event: 6th Annual IEEE Conference on Games, CoG 2024 - Milan, Italy
Duration: 5 Aug 2024 - 8 Aug 2024

Publication series

Name: IEEE Conference on Computational Intelligence and Games, CIG
ISSN (print): 2325-4270
ISSN (electronic): 2325-4289

Abstract

One of the notorious issues for Reinforcement Learning (RL) is poor sample efficiency. Compared to single agent RL, the sample efficiency for Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. Although much effort has been devoted to developing new methods and enhancing sample efficiency, we look at the widely used episodic training mechanism. In each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of the gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency for MARL algorithms. The codes to reimplement the results presented in this paper are open-sourced at https://github.com/egg-west/rr-for-MARL.
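
The intervention described in the abstract is easy to state in code: instead of taking a single gradient step after each collected episode, the learner takes several, reusing data already sitting in the replay buffer. The sketch below is a minimal, illustrative Python toy written for this record, not the authors' released implementation (that is at the GitHub link above); every name in it (ToyEnv, collect_episode, gradient_step, replay_ratio) is an assumption made for illustration, and in the paper's SMAC experiments the same loop structure would wrap the evaluated MARL learners rather than a tabular update.

import random
from collections import deque

random.seed(0)


class ToyEnv:
    """1-D chain environment standing in for a real task (e.g. a SMAC map)."""

    def __init__(self, length=10):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: 0 = move left, 1 = move right
        self.pos = max(0, min(self.length - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.length - 1
        return self.pos, (1.0 if done else 0.0), done


def collect_episode(env, q, eps=0.2, max_steps=50):
    """Collect one episode of transitions (the 'tens of frames' per training step)."""
    s, traj = env.reset(), []
    for _ in range(max_steps):
        a = random.randint(0, 1) if random.random() < eps else max((0, 1), key=lambda x: q[(s, x)])
        s2, r, done = env.step(a)
        traj.append((s, a, r, s2, done))
        s = s2
        if done:
            break
    return traj


def gradient_step(q, batch, lr=0.1, gamma=0.99):
    """One tabular TD update, standing in for a gradient step of the MARL learner."""
    for s, a, r, s2, done in batch:
        target = r if done else r + gamma * max(q[(s2, 0)], q[(s2, 1)])
        q[(s, a)] += lr * (target - q[(s, a)])


def train(replay_ratio=1, episodes=200, batch_size=32):
    env = ToyEnv()
    q = {(s, a): 0.0 for s in range(10) for a in (0, 1)}
    buffer = deque(maxlen=5000)
    for _ in range(episodes):
        buffer.extend(collect_episode(env, q))
        # The key change: perform `replay_ratio` update steps per collected
        # episode instead of the conventional single step.
        for _ in range(replay_ratio):
            if len(buffer) >= batch_size:
                gradient_step(q, random.sample(list(buffer), batch_size))
    return q


if __name__ == "__main__":
    train(replay_ratio=1)  # conventional episodic training: 1 update per episode
    train(replay_ratio=4)  # higher replay ratio: 4 updates per episode

Running train(replay_ratio=1) against train(replay_ratio=4) mirrors the paper's comparison in miniature: the same number of environment interactions, but four times as many update steps on the replayed data.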

Keywords

    Multi-Agent Reinforcement Learning, Reinforcement Learning, Sample efficiency, Starcraft II

ASJC Scopus subject areas

Cite this

Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning. / Xu, Linjie; Liu, Zichuan; Dockhorn, Alexander et al.
Proceedings of the 2024 IEEE Conference on Games, CoG 2024. IEEE Computer Society, 2024. (IEEE Conference on Computational Intelligence and Games, CIG).

Research output: Chapter in book/report/conference proceedingConference contributionResearchpeer review

Xu, L, Liu, Z, Dockhorn, A, Perez-Liebana, D, Wang, J, Song, L & Bian, J 2024, Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning. in Proceedings of the 2024 IEEE Conference on Games, CoG 2024. IEEE Conference on Computational Intelligence and Games, CIG, IEEE Computer Society, 6th Annual IEEE Conference on Games, CoG 2024, Milan, Italy, 5 Aug 2024. https://doi.org/10.48550/arXiv.2404.09715, https://doi.org/10.1109/CoG60054.2024.10645658
Xu, L., Liu, Z., Dockhorn, A., Perez-Liebana, D., Wang, J., Song, L., & Bian, J. (2024). Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning. In Proceedings of the 2024 IEEE Conference on Games, CoG 2024 (IEEE Conference on Computational Intelligence and Games, CIG). IEEE Computer Society. https://doi.org/10.48550/arXiv.2404.09715, https://doi.org/10.1109/CoG60054.2024.10645658
Xu L, Liu Z, Dockhorn A, Perez-Liebana D, Wang J, Song L et al. Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning. In Proceedings of the 2024 IEEE Conference on Games, CoG 2024. IEEE Computer Society. 2024. (IEEE Conference on Computational Intelligence and Games, CIG). doi: 10.48550/arXiv.2404.09715, 10.1109/CoG60054.2024.10645658
Xu, Linjie ; Liu, Zichuan ; Dockhorn, Alexander et al. / Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning. Proceedings of the 2024 IEEE Conference on Games, CoG 2024. IEEE Computer Society, 2024. (IEEE Conference on Computational Intelligence and Games, CIG).
@inproceedings{8dee49296a804a38ad3697efe4e17ea6,
title = "Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning",
abstract = "One of the notorious issues for Reinforcement Learning (RL) is poor sample efficiency. Compared to single agent RL, the sample efficiency for Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. Although much effort has been devoted to developing new methods and enhancing sample efficiency, we look at the widely used episodic training mechanism. In each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of the gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency for MARL algorithms. The codes to reimplement the results presented in this paper are open-sourced at https://github.com/egg-west/rr-for-MARL.",
keywords = "Multi-Agent Reinforcement Learning, Reinforcement Learning, Sample efficiency, Starcraft II",
author = "Linjie Xu and Zichuan Liu and Alexander Dockhorn and Diego Perez-Liebana and Jinyu Wang and Lei Song and Jiang Bian",
note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 6th Annual IEEE Conference on Games, CoG 2024 ; Conference date: 05-08-2024 Through 08-08-2024",
year = "2024",
month = aug,
day = "5",
doi = "10.48550/arXiv.2404.09715",
language = "English",
isbn = "979-8-3503-5068-5",
series = "IEEE Conference on Computatonal Intelligence and Games, CIG",
publisher = "IEEE Computer Society",
booktitle = "Proceedings of the 2024 IEEE Conference on Games, CoG 2024",
address = "United States",

}


TY - GEN

T1 - Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning

AU - Xu, Linjie

AU - Liu, Zichuan

AU - Dockhorn, Alexander

AU - Perez-Liebana, Diego

AU - Wang, Jinyu

AU - Song, Lei

AU - Bian, Jiang

N1 - Publisher Copyright: © 2024 IEEE.

PY - 2024/8/5

Y1 - 2024/8/5

N2 - One of the notorious issues for Reinforcement Learning (RL) is poor sample efficiency. Compared to single agent RL, the sample efficiency for Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. Although much effort has been devoted to developing new methods and enhancing sample efficiency, we look at the widely used episodic training mechanism. In each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of the gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency for MARL algorithms. The codes to reimplement the results presented in this paper are open-sourced at https://github.com/egg-west/rr-for-MARL.

AB - One of the notorious issues for Reinforcement Learning (RL) is poor sample efficiency. Compared to single agent RL, the sample efficiency for Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. Although much effort has been devoted to developing new methods and enhancing sample efficiency, we look at the widely used episodic training mechanism. In each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of the gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency for MARL algorithms. The codes to reimplement the results presented in this paper are open-sourced at https://github.com/egg-west/rr-for-MARL.

KW - Multi-Agent Reinforcement Learning

KW - Reinforcement Learning

KW - Sample efficiency

KW - Starcraft II

UR - http://www.scopus.com/inward/record.url?scp=85203526586&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2404.09715

DO - 10.48550/arXiv.2404.09715

M3 - Conference contribution

AN - SCOPUS:85203526586

SN - 979-8-3503-5068-5

T3 - IEEE Conference on Computational Intelligence and Games, CIG

BT - Proceedings of the 2024 IEEE Conference on Games, CoG 2024

PB - IEEE Computer Society

T2 - 6th Annual IEEE Conference on Games, CoG 2024

Y2 - 5 August 2024 through 8 August 2024

ER -
